This fridayFagPacket was first published at…
Numbers that could have been done on the back of one and should probably come with a similar health warning…
Find out more.
Inspired by @giulio_mattioli’s recent paper on the car dependence of dog ownership, we thought we’d take a look at cats and residential energy demand. Why? Well, people like to keep their cats warm but, more importantly, they also cut big holes in doors and/or windows to let the cats in and out. Hardly a thermally sealed envelope…
We could also use @SERL_UK’s smart meter gas/elec data, dwelling characteristics and pet ownership (but no species detail :-)
So for now we’re using:
# cats
cats_DT <- data.table::fread(paste0(dp, "UK_Animal and Plant Health Agency/APHA0372-Cat_Density_Postcode_District.csv"))
cats_DT[, pcd_district := PostcodeDistrict]
setkey(cats_DT, pcd_district)
nrow(cats_DT)
## [1] 2830
setkey(pc_district_energy_dt, pcd_district)
nrow(pc_district_energy_dt)
## [1] 3185
pc_district <- pc_district_energy_dt[cats_DT] # keeps only postcode districts where we have cat data
# this may include areas where we have no energy data
pc_district[, mean_Cats := EstimatedCatPopulation/nElecMeters]
nrow(pc_district)
## [1] 2987
nrow(pc_district[!is.na(GOR10NM)])
## [1] 2936
# there are postcode districts with no electricity meters - for now we'll remove them
# pending further investigation
t <- pc_district[!is.na(GOR10NM), .(nPostcodeDistricts = .N,
sumCats = sum(EstimatedCatPopulation)), keyby=.(GOR10NM)]
t[, catsPerDistrict := sumCats/nPostcodeDistricts]
makeFlexTable(t, cap = "Regions covered")
GOR10NM | nPostcodeDistricts | sumCats | catsPerDistrict |
(pseudo) Scotland | 419 | 997,963.9 | 2,381.8 |
(pseudo) Wales | 199 | 728,109.6 | 3,658.8 |
East Midlands | 163 | 819,002.6 | 5,024.6 |
East of England | 262 | 1,138,362.7 | 4,344.9 |
London | 295 | 602,199.5 | 2,041.4 |
North East | 130 | 390,408.6 | 3,003.1 |
North West | 288 | 1,032,288.2 | 3,584.3 |
South East | 412 | 1,804,589.1 | 4,380.1 |
South West | 288 | 1,488,017.7 | 5,166.7 |
West Midlands | 243 | 989,675.6 | 4,072.7 |
Yorkshire and The Humber | 237 | 936,527.3 | 3,951.6 |
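The join above returns more rows (2,987) than the cat table holds (2,830). With data.table's `X[Y]` syntax that can only happen when some postcode districts appear more than once in the energy table, although that has not been checked here. A minimal sketch with made-up data shows the behaviour; the toy tables and their values are hypothetical.

```r
library(data.table)

# Hypothetical toy tables, not the real BEIS / APHA data
energy <- data.table(pcd_district = c("AB1", "AB1", "CD2"),
                     nElecMeters = c(100, 50, 200))
cats <- data.table(pcd_district = c("AB1", "CD2", "EF3"),
                   EstimatedCatPopulation = c(30, 80, 10))
setkey(energy, pcd_district)
setkey(cats, pcd_district)

# X[Y]: every row of cats is kept; all matching energy rows are attached
joined <- energy[cats]

nrow(joined)                            # AB1 matches twice, EF3 has no match
joined[is.na(nElecMeters)]              # cat data but no energy data
energy[, .N, by = pcd_district][N > 1]  # districts duplicated in the energy table
```

Running the last line against `pc_district_energy_dt` would show which districts, if any, are repeated.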
Well, in some places there seem to be a lot of estimated cats per household…
(We calculated mean cats per household by dividing the estimated cat population by the number of electricity meters, which is probably a reasonable proxy for the number of households.)
t <- head(pc_district[, .(PostcodeDistrict, EstimatedCatPopulation, mean_Cats, nPostcodes, nElecMeters)][order(-mean_Cats)],10)
makeFlexTable(t, cap = "Top 10 postcode districts by number of cats per 'household'")
PostcodeDistrict | EstimatedCatPopulation | mean_Cats | nPostcodes | nElecMeters |
SA63 | 1,981.1 | 13.9 | 7 | 143 |
LL23 | 8,233.1 | 10.6 | 21 | 778 |
LL66 | 104.7 | 9.5 | 1 | 11 |
BS28 | 4,034.6 | 8.9 | 31 | 451 |
LA17 | 1,341.8 | 8.9 | 2 | 151 |
EH25 | 7,489.8 | 8.9 | 17 | 843 |
CA9 | 1,819.9 | 8.2 | 23 | 223 |
SA66 | 3,298.3 | 7.9 | 14 | 419 |
LL24 | 1,388.6 | 7.0 | 18 | 197 |
LA20 | 1,598.1 | 6.6 | 7 | 241 |
SA63 is in south west Wales while LL23 is on the edge of the Snowdonia National Park…
Do these places have some largish catteries but few houses? 8,233 is a lot of estimated cats.
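One way to start probing that question is to look at how few meters sit behind the largest ratios, since a small denominator inflates cats per meter. The sketch below reuses three rows from the table above; the 500-meter cut-off is an arbitrary assumption for illustration only.

```r
library(data.table)

# Three rows copied from the top-10 table above
top <- data.table(PostcodeDistrict = c("SA63", "LL23", "LL66"),
                  EstimatedCatPopulation = c(1981.1, 8233.1, 104.7),
                  nElecMeters = c(143, 778, 11))

# Same calculation as mean_Cats: estimated cats per electricity meter
top[, mean_Cats := EstimatedCatPopulation / nElecMeters]

# Flag districts with few meters (the 500 threshold is an assumption)
top[, few_meters := nElecMeters < 500]
top
```

LL66, with only 11 meters, is the kind of district where the ratio says more about the denominator than about cats.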
Is there a correlation between estimated total cats and the number of dwellings (electricity meters)?
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = nElecMeters , y = EstimatedCatPopulation,
colour = GOR10NM)) +
geom_point() +
geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 92 rows containing non-finite values (stat_smooth).
## Warning: Removed 92 rows containing missing values (geom_point).
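The scatter plot shows the relationship visually; a correlation coefficient would put a number on it. A minimal sketch with made-up values, assuming that pairs with a missing value should simply be dropped, as the plot does:

```r
# Hypothetical values standing in for nElecMeters and EstimatedCatPopulation
meters <- c(100, 250, 400, NA, 900, 1200)
cats   <- c(40, 90, 150, 60, 310, 380)

# use = "complete.obs" drops pairs with a missing value;
# Spearman is rank-based, so it is less sensitive to outlying districts
pearson  <- cor(meters, cats, use = "complete.obs", method = "pearson")
spearman <- cor(meters, cats, use = "complete.obs", method = "spearman")
c(pearson = pearson, spearman = spearman)
```

On the real data the same two calls could be run on `pc_district$nElecMeters` and `pc_district$EstimatedCatPopulation`.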
Is there a correlation between estimated cat ownership and total gas use?
ggplot2::ggplot(pc_district[!is.na(GOR10NM)],
aes(x = EstimatedCatPopulation, y = total_gas_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 291 rows containing non-finite values (stat_smooth).
## Warning: Removed 291 rows containing missing values (geom_point).
Or mean gas use and mean cats?
ggplot2::ggplot(pc_district[!is.na(GOR10NM)],
aes(x = mean_Cats, y = mean_gas_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 291 rows containing non-finite values (stat_smooth).
## Warning: Removed 291 rows containing missing values (geom_point).
Or total electricity use and cats?
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = EstimatedCatPopulation, y = total_elec_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 92 rows containing non-finite values (stat_smooth).
## Warning: Removed 92 rows containing missing values (geom_point).
Or mean elec use and mean cats?
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = mean_Cats, y = mean_elec_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 92 rows containing non-finite values (stat_smooth).
## Warning: Removed 92 rows containing missing values (geom_point).
Or total energy use and total cats?
pc_district[, total_gas_kWh := ifelse(is.na(total_gas_kWh), 0, total_gas_kWh)]
pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
pc_district[, mean_energy_kWh := total_energy_kWh/nElecMeters]
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = EstimatedCatPopulation, y = total_energy_kWh, colour = GOR10NM)) +
geom_smooth() +
geom_point()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 92 rows containing non-finite values (stat_smooth).
## Warning: Removed 92 rows containing missing values (geom_point).
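The `ifelse()` step above matters because in R adding `NA` to a number gives `NA`, so districts with no recorded gas use would otherwise drop out of the total. A small sketch of the same replacement on made-up numbers follows. Treating missing gas as zero is the assumption made in the code above, and it may not hold if the gas figure is simply unrecorded rather than truly zero.

```r
library(data.table)

# Hypothetical districts; the second has no gas figure
dt <- data.table(total_elec_kWh = c(1000, 2000),
                 total_gas_kWh = c(3000, NA))

# Without the replacement, the missing value propagates into the sum
dt[, naive_total := total_gas_kWh + total_elec_kWh]

# Same approach as above: treat missing gas as zero, then add
dt[, total_gas_kWh := ifelse(is.na(total_gas_kWh), 0, total_gas_kWh)]
dt[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
dt
```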
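Let’s try a boxplot by cat deciles… Figure 4.1 suggests that median total energy use is higher in postcode districts in the higher cat deciles. Note that the deciles are based on the estimated total cat population, not on cats per household.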
pc_district[, cat_decile := dplyr::ntile(EstimatedCatPopulation, 10)]
#head(pc_district[is.na(cat_decile)])
ggplot2::ggplot(pc_district[!is.na(cat_decile) & !is.na(GOR10NM)], aes(x = as.factor(cat_decile), y = total_energy_kWh/1000000)) +
geom_boxplot() +
facet_wrap(. ~ GOR10NM) +
labs(x = "Cat ownership deciles",
y = "Total domestic electricity & gas GWh",
caption = "Postcode districts (Data: BEIS & Animal and Plant Health Agency, 2015)")
## Warning: Removed 92 rows containing non-finite values (stat_boxplot).
Figure 4.1: Cat ownership deciles and total annual residential electricity & gas use
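For readers unfamiliar with `dplyr::ntile()`: it ranks the values and splits them into roughly equal-sized groups, so decile 10 holds the districts with the most estimated cats. A minimal sketch on made-up numbers:

```r
# Twenty hypothetical cat population estimates
cats <- c(5, 12, 18, 25, 33, 41, 56, 62, 70, 88,
          95, 110, 130, 150, 175, 200, 240, 300, 420, 800)

# Assign each value to one of 10 rank-based groups
decile <- dplyr::ntile(cats, 10)

table(decile)              # number of values in each decile
tapply(cats, decile, max)  # largest value in each decile
```

Because the groups are rank-based, a single very large district (800 here) lands in the top decile without stretching the other groups.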
Well…
pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
pc_district[, mean_energy_kWh := total_energy_kWh/nElecMeters]
ggplot2::ggplot(pc_district[!is.na(GOR10NM)], aes(x = mean_Cats, y = mean_energy_kWh)) +
geom_point()
## Warning: Removed 92 rows containing missing values (geom_point).