This fridayFagPacket was first published as a blog
Numbers that could have been done on the back of one and should probably come with a similar health warning…
Inspired by @giulio_mattioli’s recent paper on the car dependence of dog ownership, we thought we’d take a look at cats and residential energy demand. Why? Well, people like to keep their cats warm but, more importantly, they also cut big holes in doors and/or windows to let the cats in and out. Hardly a thermally sealed envelope!
For now we’re using:
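# data.table supplies setkey() and :=, ggplot2 supplies the unqualified aes(), geom_*() and labs() calls used below
library(data.table)
library(ggplot2)
gas_dt <- data.table::fread("~/Dropbox/data/beis/subnationalGas/lsoaDom/LSOA_GAS_2019.csv.gz")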
gas_dt[, lsoa11cd := `Lower Layer Super Output Area (LSOA) Code`]
gas_dt[, mean_gas_kWh := `Mean consumption (kWh per meter)`]
gas_dt[, total_gas_kWh := `Consumption (kWh)`]
gas_dt[, nGasMeters := `Number of consuming meters`]
elec_dt <- data.table::fread("~/Dropbox/data/beis/subnationalElec/lsoaDom/LSOA_ELEC_2019.csv.gz")
elec_dt[, lsoa11cd := `Lower Layer Super Output Area (LSOA) Code`]
elec_dt[, mean_elec_kWh := `Mean domestic electricity consumption
(kWh per meter)`]
elec_dt[, total_elec_kWh := `Total domestic electricity consumption (kWh)`]
elec_dt[, nElecMeters := `Total number of domestic electricity meters`]
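setkey(gas_dt, lsoa11cd)
setkey(elec_dt, lsoa11cd)
# lsoa_DT is not created in the code shown here: it needs to be loaded already,
# with pcd_sector, ladnm, ladnmw and nPostcodes for each lsoa11cd
setkey(lsoa_DT, lsoa11cd)
# keyed joins: attach gas to electricity by LSOA, then attach both to lsoa_DT
merged_lsoa_DT <- gas_dt[, .(lsoa11cd, mean_gas_kWh, total_gas_kWh, nGasMeters)][
  elec_dt[, .(lsoa11cd, mean_elec_kWh, total_elec_kWh, nElecMeters)]][
  lsoa_DT]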
# check for LSOAs which did not map to a postcode sector (nothing is removed here)
message("How many LSOAs do not map to a postcode sector?")
## How many LSOAs do not map to a postcode sector?
nrow(merged_lsoa_DT[is.na(pcd_sector)])
## [1] 0
head(merged_lsoa_DT[is.na(pcd_sector)])
## Empty data.table (0 rows and 11 cols): lsoa11cd,mean_gas_kWh,total_gas_kWh,nGasMeters,mean_elec_kWh,total_elec_kWh...
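As a rough illustration of what the chained join is doing, here is a minimal sketch on invented toy tables. The names `gas_toy`, `elec_toy` and `lsoa_toy` are hypothetical stand-ins for the real data, and `on =` is used instead of keys simply to keep the example self-contained.

```r
library(data.table)

# Toy stand-ins for the gas, electricity and LSOA lookup tables (values invented)
gas_toy  <- data.table(lsoa11cd = c("E01", "E02"),
                       mean_gas_kWh = c(12000, 15000))
elec_toy <- data.table(lsoa11cd = c("E01", "E02", "E03"),
                       mean_elec_kWh = c(3000, 3500, 4000))
lsoa_toy <- data.table(lsoa11cd = c("E01", "E02", "E03"),
                       pcd_sector = c("AB1", "AB1", "AB2"))

# X[Y] keeps every row of Y and looks up the matching rows of X,
# so the result has one row per LSOA in the final lookup table
merged_toy <- gas_toy[elec_toy, on = "lsoa11cd"][lsoa_toy, on = "lsoa11cd"]

# E03 has no gas record, so it is kept with NA in mean_gas_kWh
print(merged_toy)
```

If that reading is right, LSOAs without gas data stay in the merged table with missing gas values, which is why the later sums and means use `na.rm = TRUE`.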
# !is.na(pcd_sector)
postcode_sector_energy <- merged_lsoa_DT[, .(nLSOAs = .N,
                                             nPostcodes = sum(nPostcodes),
                                             mean_gas_kWh = mean(mean_gas_kWh, na.rm = TRUE),
                                             total_gas_kWh = sum(total_gas_kWh, na.rm = TRUE),
                                             mean_elec_kWh = mean(mean_elec_kWh, na.rm = TRUE),
                                             total_elec_kWh = sum(total_elec_kWh, na.rm = TRUE),
                                             nGasMeters = sum(nGasMeters, na.rm = TRUE),
                                             nElecMeters = sum(nElecMeters, na.rm = TRUE)),
                                         keyby = .(pcd_sector, ladnm, ladnmw)]
head(postcode_sector_energy)
## pcd_sector ladnm ladnmw nLSOAs nPostcodes mean_gas_kWh total_gas_kWh mean_elec_kWh total_elec_kWh nGasMeters
## 1: AB1 Aberdeen City 124 2499 15201.63 677121191 3255.818 181062343 45748
## 2: AB1 Aberdeenshire 16 156 15115.72 63539463 4241.188 28441637 4402
## 3: AB10 Aberdeen City 41 892 14526.84 242259559 3065.551 66022955 16706
## 4: AB11 1 1 NaN 0 NaN 0 0
## 5: AB11 Aberdeen City 36 895 11787.76 167210162 2642.843 51471062 14243
## 6: AB12 Aberdeen City 23 763 13431.97 105518093 3260.805 29649555 7790
## nElecMeters
## 1: 58381
## 2: 6663
## 3: 22302
## 4: 0
## 5: 19659
## 6: 9096
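One thing to keep in mind about the aggregation above: `mean(mean_gas_kWh)` is an unweighted mean of LSOA means, so a small LSOA counts as much as a large one. A minimal sketch of the difference, using invented numbers and hypothetical column names `unweighted_mean_kWh` and `weighted_mean_kWh`, might look like this:

```r
library(data.table)

# Two invented LSOAs in one postcode area with very different meter counts
toy <- data.table(
  pcd_sector    = c("AB1", "AB1"),
  mean_gas_kWh  = c(10000, 20000),
  total_gas_kWh = c(10000 * 900, 20000 * 100),
  nGasMeters    = c(900, 100)
)

toy_summary <- toy[, .(
  # mean of the LSOA means: each LSOA counts equally
  unweighted_mean_kWh = mean(mean_gas_kWh, na.rm = TRUE),
  # total consumption over total meters: each meter counts equally
  weighted_mean_kWh   = sum(total_gas_kWh, na.rm = TRUE) / sum(nGasMeters, na.rm = TRUE)
), keyby = pcd_sector]

print(toy_summary)
```

Here the unweighted figure is 15,000 kWh while the per-meter figure is 11,000 kWh. Whether that matters for the real data depends on how much meter counts vary between LSOAs, which we have not checked.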
# cats
cats_DT <- data.table::fread("~/Dropbox/data/UK_Animal and Plant Health Agency/APHA0372-Cat_Density_Postcode_District.csv")
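# note: despite its name, pcd_sector holds postcode districts here (e.g. AB10), so it matches PostcodeDistrict
cats_DT[, pcd_sector := PostcodeDistrict]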
setkey(cats_DT, pcd_sector)
setkey(postcode_sector_energy, pcd_sector)
pc_district <- cats_DT[postcode_sector_energy]
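Note that the energy table has one row per district and local authority (AB1 appears twice in the output above), so this join repeats a district's cat estimate on every one of its rows. A hedged sketch of one way to avoid that, collapsing to one row per district before joining, is below. The toy tables and the names `energy_toy`, `cats_toy`, `direct` and `collapsed` are invented for illustration.

```r
library(data.table)

# Invented example: district AB1 is split across two local authorities
energy_toy <- data.table(
  pcd_sector  = c("AB1", "AB1", "AB2"),
  ladnm       = c("Aberdeen City", "Aberdeenshire", "Aberdeen City"),
  nElecMeters = c(800, 200, 500)
)
cats_toy <- data.table(pcd_sector = c("AB1", "AB2"),
                       EstimatedCatPopulation = c(300, 100))

# Joining directly repeats the AB1 cat count on both local-authority rows
direct <- cats_toy[energy_toy, on = "pcd_sector"]

# Collapsing to one row per district first keeps cats and meters on the same footing
district_toy <- energy_toy[, .(nElecMeters = sum(nElecMeters)), keyby = pcd_sector]
collapsed <- cats_toy[district_toy, on = "pcd_sector"]
collapsed[, mean_Cats := EstimatedCatPopulation / nElecMeters]

print(direct)
print(collapsed)
```

In the direct join the 300 cats of AB1 are divided by 800 and by 200 meters in turn. In the collapsed version they are divided once by all 1,000.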
We could also use @SERL_UK’s smart meter gas/elec data, dwelling characteristics and pet ownership (but no species detail :-)
Well, in some places there seem to be a lot of estimated cats…
(We calculated mean cats per household by dividing the estimated cat population by the number of electricity meters, which is probably a reasonable proxy for the number of households.)
pc_district[, mean_Cats := EstimatedCatPopulation/nElecMeters]
head(pc_district[, .(PostcodeDistrict, EstimatedCatPopulation, mean_Cats, nPostcodes, nElecMeters)][order(-mean_Cats)])
## PostcodeDistrict EstimatedCatPopulation mean_Cats nPostcodes nElecMeters
## 1: AB11 2072.99 Inf 1 0
## 2: AB23 1094.73 Inf 4 0
## 3: AB25 851.00 Inf 5 0
## 4: AB31 3961.30 Inf 4 0
## 5: AB39 4902.58 Inf 1 0
## 6: AB41 7457.56 Inf 1 0
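The `Inf` values at the top of that table come from rows with zero electricity meters, so they say nothing about cats. A minimal sketch of one way to handle them, on invented numbers, is to treat the ratio as missing when there are no meters and rank only the finite values:

```r
library(data.table)

# Invented rows: one has cats but no electricity meters
toy <- data.table(
  PostcodeDistrict       = c("AB11", "AB11", "AB12"),
  EstimatedCatPopulation = c(2000, 2000, 1800),
  nElecMeters            = c(0, 20000, 9000)
)

# Treat zero-meter rows as missing rather than letting the division return Inf
toy[, mean_Cats := fifelse(nElecMeters > 0,
                           EstimatedCatPopulation / nElecMeters,
                           NA_real_)]

# Rank only the rows where the ratio is a usable number
ranked <- toy[is.finite(mean_Cats)][order(-mean_Cats)]
print(ranked)
```

This is only a sketch: dropping the rows hides the problem rather than explaining why some rows have no meters in the first place.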
LL23 is on the south east corner of the Snowdonia National Park… while EH25 is on the outskirts of Edinburgh.
Is there a correlation between estimated total cats and the number of dwellings (electricity meters)?
ggplot2::ggplot(pc_district, aes(x = nElecMeters, y = EstimatedCatPopulation)) +
  geom_point() +
  geom_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 563 rows containing non-finite values (stat_smooth).
## Warning: Removed 563 rows containing missing values (geom_point).
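The plot hints at an answer but does not put a number on it. One way to do so, sketched here on invented values rather than the real `pc_district` table, is `cor()` with `use = "complete.obs"` so that rows with missing values are dropped in the same way the plot drops them:

```r
library(data.table)

# Invented values with some missing entries, mimicking unmatched districts
toy <- data.table(
  nElecMeters            = c(1000, 2000, 3000, 4000, NA, 6000),
  EstimatedCatPopulation = c(300, 500, 1000, 1100, 1500, NA)
)

# Pearson measures linear association, Spearman measures rank (monotonic) association
r_pearson  <- cor(toy$nElecMeters, toy$EstimatedCatPopulation,
                  use = "complete.obs")
r_spearman <- cor(toy$nElecMeters, toy$EstimatedCatPopulation,
                  use = "complete.obs", method = "spearman")

print(c(pearson = r_pearson, spearman = r_spearman))
```

A strong correlation between two totals would mostly reflect district size (more homes, more cats), so it probably says little about cat ownership as such.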
Is there a correlation between estimated cat ownership and total gas use?
ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_gas_kWh)) +
  geom_point()
## Warning: Removed 563 rows containing missing values (geom_point).
Or mean gas use and mean cats?
ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_gas_kWh)) +
  geom_point()
## Warning: Removed 1756 rows containing missing values (geom_point).
Or total electricity use and cats?
ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_elec_kWh)) +
  geom_point()
## Warning: Removed 563 rows containing missing values (geom_point).
Or mean elec use and mean cats?
ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_elec_kWh)) +
  geom_point()
## Warning: Removed 1423 rows containing missing values (geom_point).
Or total energy use and total cats?
pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_energy_kWh)) +
  geom_point()
## Warning: Removed 563 rows containing missing values (geom_point).
Well, there may be something in there? Let’s try a boxplot by cat deciles (Figure 4.1).
pc_district[, cat_decile := dplyr::ntile(EstimatedCatPopulation, 10)]
#head(pc_district[is.na(cat_decile)])
ggplot2::ggplot(pc_district[!is.na(cat_decile)], aes(x = as.factor(cat_decile), y = total_energy_kWh/1000000)) +
  geom_boxplot() +
  labs(x = "Cat ownership deciles",
       y = "Total domestic electricity & gas GWh")
Figure 4.1: Cat ownership deciles and total annual residential electricity & gas use
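For anyone wanting the numbers behind a boxplot like this, here is a minimal sketch of the decile step on invented data. `dplyr::ntile()` ranks the values and splits them into equal-sized groups, leaving missing values as `NA`. The toy table and the `decile_summary` name are hypothetical.

```r
library(data.table)

# Invented districts: 20 with a cat estimate, 1 without
toy <- data.table(
  EstimatedCatPopulation = c(seq(100, 2000, by = 100), NA),
  total_energy_kWh       = c(seq(1e6, 20e6, by = 1e6), 5e6)
)

# ntile() ranks the non-missing values and cuts them into 10 equal-sized groups
toy[, cat_decile := dplyr::ntile(EstimatedCatPopulation, 10)]

# Median energy use (GWh) per decile, ignoring districts without a cat estimate
decile_summary <- toy[!is.na(cat_decile),
                      .(nDistricts = .N,
                        median_energy_GWh = median(total_energy_kWh) / 1e6),
                      keyby = cat_decile]
print(decile_summary)
```

Because the deciles are built from total estimated cats per district, they largely track district size, so a rising pattern across deciles would be expected even if cats had nothing to do with it.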
Well…
pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
pc_district[, mean_energy_kWh := total_energy_kWh/nElecMeters]
ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_energy_kWh)) +
  geom_point()
## Warning: Removed 1423 rows containing missing values (geom_point).