This fridayFagPacket was first published at…
Numbers that could have been done on the back of one and should probably come with a similar health warning…
Find out more.
Inspired by @giulio_mattioli’s recent paper on the car dependence of dog ownership, we thought we’d take a look at cats and residential energy demand. Why? Well, people like to keep their cats warm but, more importantly, they also cut big holes in doors and/or windows to let the cats in and out. Hardly a thermally sealed envelope…
We could also use @SERL_UK’s smart meter gas/elec data, dwelling characteristics and pet ownership (but no species detail :-)
So for now we’re using:
# cats
cats_DT <- data.table::fread(paste0(dp, "UK_Animal and Plant Health Agency/APHA0372-Cat_Density_Postcode_District.csv"))
cats_DT[, pcd_district := PostcodeDistrict]
setkey(cats_DT, pcd_district)
nrow(cats_DT)
## [1] 2830
setkey(pc_district_energy_dt, pcd_district)
nrow(pc_district_energy_dt)
## [1] 3185
pc_district <- pc_district_energy_dt[cats_DT] # keeps every postcode district where we have cat data
# this may include districts where we have no energy data (their energy columns will be NA)
nrow(pc_district)
## [1] 2987
nrow(pc_district[!is.na(GOR10NM)])
## [1] 2936
# there are postcode districts with no electricity meters - for now we'll remove them
# pending further investigation
t <- pc_district[!is.na(GOR10NM), .(nPostcodeDistricts = .N,
sumCats = sum(EstimatedCatPopulation)), keyby=.(GOR10NM)]
t[, catsPerDistrict := sumCats/nPostcodeDistricts]
makeFlexTable(t, cap = "Regions covered")
| GOR10NM | nPostcodeDistricts | sumCats | catsPerDistrict |
|---|---|---|---|
| (pseudo) Scotland | 419 | 997,963.9 | 2,381.8 |
| (pseudo) Wales | 199 | 728,109.6 | 3,658.8 |
| East Midlands | 163 | 819,002.6 | 5,024.6 |
| East of England | 262 | 1,138,362.7 | 4,344.9 |
| London | 295 | 602,199.5 | 2,041.4 |
| North East | 130 | 390,408.6 | 3,003.1 |
| North West | 288 | 1,032,288.2 | 3,584.3 |
| South East | 412 | 1,804,589.1 | 4,380.1 |
| South West | 288 | 1,488,017.7 | 5,166.7 |
| West Midlands | 243 | 989,675.6 | 4,072.7 |
| Yorkshire and The Humber | 237 | 936,527.3 | 3,951.6 |
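The row counts from the join above (2,830 cat rows in, 2,987 rows out, 2,936 with a region) can look odd at first. Here is a minimal sketch of why, assuming the usual data.table `X[Y]` behaviour: every row of `Y` is kept, rows of `Y` with no match in `X` get `NA`s, and a key that appears more than once in `X` is repeated. The toy tables and values below are hypothetical stand-ins, not the real data, and duplicate district keys are only one possible explanation for the extra rows.

```r
library(data.table)

# Hypothetical stand-ins for pc_district_energy_dt and cats_DT
energy <- data.table(pcd_district = c("AA1", "AA1", "AA2"),
                     GOR10NM = c("North", "South", "North"),
                     nElecMeters = c(100, 50, 200))
cats <- data.table(pcd_district = c("AA1", "AA2", "AA3"),
                   EstimatedCatPopulation = c(30, 40, 5))
setkey(energy, pcd_district)
setkey(cats, pcd_district)

# X[Y]: one output row per match, plus an NA-filled row for each unmatched row of Y
joined <- energy[cats]
nrow(joined)

# Cat districts with no energy data (AA3 here)
unmatched <- joined[is.na(GOR10NM)]

# Districts that appear more than once in the energy table (AA1 here)
dup_keys <- energy[, .N, by = pcd_district][N > 1]
```

Running the last two checks on the real tables would show whether the extra rows come from repeated districts, and which cat districts lack energy data.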
Well, in some places there seem to be a lot of estimated cats…
(We calculated mean cats per household by dividing the estimated cat population by the number of electricity meters, which is probably a reasonable proxy for the number of households.)
pc_district[, mean_Cats := EstimatedCatPopulation/nElecMeters]
head(pc_district[, .(PostcodeDistrict, EstimatedCatPopulation, mean_Cats, nPostcodes, nElecMeters)][order(-mean_Cats)])
## PostcodeDistrict EstimatedCatPopulation mean_Cats nPostcodes nElecMeters
## 1: SA63 1981.13 13.854056 7 143
## 2: LL23 8233.15 10.582455 21 778
## 3: LL66 104.70 9.518182 1 11
## 4: BS28 4034.58 8.945854 31 451
## 5: LA17 1341.85 8.886424 2 151
## 6: EH25 7489.82 8.884721 17 843
SA63 is in south west Wales while LL23 is on the edge of the Snowdonia National Park…
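Ratios like these are sensitive to small denominators: LL66 has only 11 meters. Here is a minimal sketch of one way to flag this, rebuilding the six rows printed above and applying a minimum meter count. The threshold of 500 meters and the name `min_meters` are arbitrary assumptions for illustration, not part of the original analysis.

```r
library(data.table)

# The six rows printed above, retyped by hand
top <- data.table(
  PostcodeDistrict = c("SA63", "LL23", "LL66", "BS28", "LA17", "EH25"),
  EstimatedCatPopulation = c(1981.13, 8233.15, 104.70, 4034.58, 1341.85, 7489.82),
  nElecMeters = c(143, 778, 11, 451, 151, 843)
)
top[, mean_Cats := EstimatedCatPopulation / nElecMeters]

# Hypothetical cut-off: ignore districts with very few meters
min_meters <- 500
robust <- top[nElecMeters >= min_meters][order(-mean_Cats)]
robust
```

With this cut-off only LL23 and EH25 remain, which suggests several of the highest ratios rest on very few meters.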
Is there a correlation between estimated total cats and the number of dwellings (electricity meters)?
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = nElecMeters , y = EstimatedCatPopulation,
colour = GOR10NM)) +
geom_point() +
geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 92 rows containing non-finite values (stat_smooth).
## Warning: Removed 92 rows containing missing values (geom_point).
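The plot answers the question visually. A minimal sketch of how one might also put a number on it, using base R's `cor()` with `use = "complete.obs"` so that rows with missing values are dropped, as ggplot2 does. The data here are simulated (hypothetical), so the coefficients say nothing about real cats. Spearman's rank correlation is an assumption chosen because district sizes are likely to be skewed.

```r
library(data.table)

set.seed(42)
# Simulated stand-in for pc_district: two regions, 50 districts each
toy <- data.table(GOR10NM = rep(c("A", "B"), each = 50),
                  nElecMeters = round(runif(100, 100, 20000)))
toy[, EstimatedCatPopulation := 0.3 * nElecMeters + rnorm(100, sd = 500)]
toy[sample(.N, 5), EstimatedCatPopulation := NA_real_]  # mimic missing values

# Overall rank correlation, ignoring incomplete rows
overall <- toy[, cor(nElecMeters, EstimatedCatPopulation,
                     use = "complete.obs", method = "spearman")]

# The same statistic per region, with the number of usable rows
by_region <- toy[, .(rho = cor(nElecMeters, EstimatedCatPopulation,
                               use = "complete.obs", method = "spearman"),
                     n = sum(!is.na(EstimatedCatPopulation))),
                 keyby = GOR10NM]
by_region
```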
Is there a correlation between estimated cat ownership and total gas use?
ggplot2::ggplot(pc_district[!is.na(GOR10NM)],
aes(x = EstimatedCatPopulation, y = total_gas_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 291 rows containing non-finite values (stat_smooth).
## Warning: Removed 291 rows containing missing values (geom_point).
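The ggplot2 warnings above report dropped rows: 92 in the electricity-based plot and 291 in the gas-based one. A minimal sketch of how one might count the missing values behind such warnings before plotting. The toy table is hypothetical; only the column names follow the original.

```r
library(data.table)

# Hypothetical stand-in with some missing gas and cat values
toy <- data.table(EstimatedCatPopulation = c(10, 20, NA, 40, 50),
                  total_gas_kWh = c(1e6, NA, 3e6, NA, 5e6),
                  total_elec_kWh = c(5e5, 6e5, 7e5, 8e5, 9e5))

# Missing values per column
na_counts <- toy[, lapply(.SD, function(x) sum(is.na(x)))]

# Rows a cats-vs-gas plot would drop: either variable missing
n_dropped_gas <- toy[, sum(!complete.cases(EstimatedCatPopulation, total_gas_kWh))]

# Rows a cats-vs-electricity plot would drop
n_dropped_elec <- toy[, sum(!complete.cases(EstimatedCatPopulation, total_elec_kWh))]
```

A larger count for gas than for electricity would be consistent with some districts having no gas data, though that is an assumption to check against the real table.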
Or mean gas use and mean cats?
pc_district[, mean_gas_kWh := total_gas_kWh/nGasMeters]
ggplot2::ggplot(pc_district[!is.na(GOR10NM)],
aes(x = mean_Cats, y = mean_gas_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 291 rows containing non-finite values (stat_smooth).
## Warning: Removed 291 rows containing missing values (geom_point).
Or total electricity use and cats?
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = EstimatedCatPopulation, y = total_elec_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 92 rows containing non-finite values (stat_smooth).
## Warning: Removed 92 rows containing missing values (geom_point).
Or mean elec use and mean cats?
pc_district[, mean_elec_kWh := total_elec_kWh/nElecMeters]
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = mean_Cats, y = mean_elec_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 92 rows containing non-finite values (stat_smooth).
## Warning: Removed 92 rows containing missing values (geom_point).
Or total energy use and total cats?
pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = EstimatedCatPopulation, y = total_energy_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 291 rows containing non-finite values (stat_smooth).
## Warning: Removed 291 rows containing missing values (geom_point).
Let’s try a boxplot by cat deciles… Figure 4.1 suggests that median total energy use is higher in postcode districts with larger estimated cat populations.
pc_district[, cat_decile := dplyr::ntile(EstimatedCatPopulation, 10)]
#head(pc_district[is.na(cat_decile)])
ggplot2::ggplot(pc_district[!is.na(cat_decile) & !is.na(GOR10NM)], aes(x = as.factor(cat_decile), y = total_energy_kWh/1000000)) +
geom_boxplot() +
facet_wrap(. ~ GOR10NM) +
labs(x = "Cat ownership deciles",
y = "Total domestic electricity & gas GWh",
caption = "Postcode districts (Data: BEIS & Animal and Plant Health Agency, 2015)")
## Warning: Removed 291 rows containing non-finite values (stat_boxplot).
Figure 4.1: Cat ownership deciles and total annual residential electricity & gas use
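The medians read off the boxplot can also be tabulated directly. Here is a minimal sketch, assuming the same `dplyr::ntile()` decile assignment as above. The data are simulated (hypothetical) and the column names `nDistricts` and `median_GWh` are made up for illustration.

```r
library(data.table)

set.seed(1)
# Simulated stand-in: 200 districts whose energy use rises with cat numbers
toy <- data.table(EstimatedCatPopulation = runif(200, 100, 10000))
toy[, total_energy_kWh := 5000 * EstimatedCatPopulation + rnorm(200, sd = 1e6)]

# ntile() ranks the values and splits them into 10 groups of (near) equal size
toy[, cat_decile := dplyr::ntile(EstimatedCatPopulation, 10)]

# Median total energy (GWh) and number of districts per decile
medians <- toy[, .(nDistricts = .N,
                   median_GWh = median(total_energy_kWh) / 1e6),
               keyby = cat_decile]
medians
```

Because the deciles are built from total cats and the y-axis is total energy, both grow with the size of a district, so a rising median here may partly reflect district size.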
Well…
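# total_energy_kWh was already calculated above; repeated here so this step runs on its own
pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
# mean energy per household, using electricity meters as the household proxy
pc_district[, mean_energy_kWh := total_energy_kWh/nElecMeters]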
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = mean_Cats, y = mean_energy_kWh)) +
geom_point()
## Warning: Removed 291 rows containing missing values (geom_point).
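The mean above divides combined gas and electricity use by the number of electricity meters. A minimal sketch of how that differs from adding the per-meter gas and electricity means when a district has fewer gas meters than electricity meters. The toy values and the column names `per_elec_meter` and `sum_of_means` are hypothetical.

```r
library(data.table)

# Hypothetical districts: XX1 has half as many gas meters as electricity meters
toy <- data.table(PostcodeDistrict = c("XX1", "XX2"),
                  total_gas_kWh = c(1e6, 2e6),
                  total_elec_kWh = c(4e5, 8e5),
                  nGasMeters = c(50, 200),
                  nElecMeters = c(100, 200))

# Approach used above: all energy spread over electricity meters
toy[, per_elec_meter := (total_gas_kWh + total_elec_kWh) / nElecMeters]

# Alternative: mean gas per gas meter plus mean electricity per electricity meter
toy[, sum_of_means := total_gas_kWh / nGasMeters + total_elec_kWh / nElecMeters]
toy[, .(PostcodeDistrict, per_elec_meter, sum_of_means)]
```

The two measures agree for XX2, where the meter counts match, and diverge for XX1. Which one suits the cat question depends on whether the interest is in an average dwelling or in an average gas-heated dwelling.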