This fridayFagPacket was first published at…


1 fridayFagPackets

Numbers that could have been done on the back of one and should probably come with a similar health warning…

Find out more.

2 It’s the cats, stupid

Inspired by @giulio_mattioli’s recent paper on the car dependence of dog ownership, we thought we’d take a look at cats and residential energy demand. Why? Well, people like to keep their cats warm but, more importantly, they also cut big holes in doors and/or windows to let the cats in and out. Hardly a thermally sealed envelope!

3 What’s the data?

For now we’re using:

  • postcode district level estimates of cat ownership in the UK in 2015. Does such a thing exist? YEAH! “This dataset gives the mean estimate for population for each district, and was generated as part of the delivery of commissioned research. The data contained within this dataset are modelled figures, based on national estimates for pet population, and available information on Veterinary activity across GB. The data are accurate as of 01/01/2015. The data provided are summarised to the postcode district level. Further information on this research is available in a research publication by James Aegerter, David Fouracre & Graham C. Smith, discussing the structure and density of pet cat and dog populations across Great Britain.”
  • LSOA level data on domestic gas and electricity ‘consumption’ for 2015, aggregated to postcode sectors
  • Indices of Deprivation 2019 for England
gas_dt <- data.table::fread(paste0(dp, "/beis/subnationalGas/lsoaDom/LSOA_GAS_2015.csv.gz"))
gas_dt[, lsoa11cd := `Lower Layer Super Output Area (LSOA) Code`]
gas_dt[, mean_gas_kWh := `Mean consumption (kWh per meter)`]
gas_dt[, total_gas_kWh := `Consumption (kWh)`]
gas_dt[, nGasMeters := `Number of consuming meters`]

elec_dt <- data.table::fread(paste0(dp, "/beis/subnationalElec/lsoaDom/LSOA_ELEC_2015.csv.gz"))
elec_dt[, lsoa11cd := `Lower Layer Super Output Area (LSOA) Code`]
elec_dt[, mean_elec_kWh := `Mean domestic electricity consumption 
(kWh per meter)`]
elec_dt[, total_elec_kWh := `Total domestic electricity consumption (kWh)`]
elec_dt[, nElecMeters := `Total number of domestic electricity meters`]

setkey(gas_dt, lsoa11cd)
setkey(elec_dt, lsoa11cd)
setkey(lsoa_DT, lsoa11cd)

merged_lsoa_DT <- gas_dt[, .(lsoa11cd, mean_gas_kWh, total_gas_kWh, nGasMeters)][elec_dt[, .(lsoa11cd,mean_elec_kWh,total_elec_kWh,nElecMeters)]][lsoa_DT]

# check for LSOAs which did not map to a postcode sector (none are removed here)
message("How many LSOAs do not map to a postcode sector?")
## How many LSOAs do not map to a postcode sector?
nrow(merged_lsoa_DT[is.na(pcd_sector)])
## [1] 0
head(merged_lsoa_DT[is.na(pcd_sector)])
## Empty data.table (0 rows and 11 cols): lsoa11cd,mean_gas_kWh,total_gas_kWh,nGasMeters,mean_elec_kWh,total_elec_kWh...
# !is.na(pcd_sector)
postcode_sector_energy <- merged_lsoa_DT[, .(nLSOAs = .N,
                                             nPostcodes = sum(nPostcodes),
                                             mean_gas_kWh = mean(mean_gas_kWh, na.rm = TRUE),
                                             total_gas_kWh = sum(total_gas_kWh, na.rm = TRUE),
                                             mean_elec_kWh = mean(mean_elec_kWh, na.rm = TRUE),
                                             total_elec_kWh = sum(total_elec_kWh, na.rm = TRUE),
                                             nGasMeters = sum(nGasMeters, na.rm = TRUE),
                                             nElecMeters = sum(nElecMeters, na.rm = TRUE)), keyby = .(pcd_sector, ladnm, ladnmw)]
#head(postcode_sector_energy)

# cats
cats_DT <- data.table::fread(paste0(dp, "UK_Animal and Plant Health Agency/APHA0372-Cat_Density_Postcode_District.csv"))
cats_DT[, pcd_sector := PostcodeDistrict]

setkey(cats_DT, pcd_sector)
setkey(postcode_sector_energy, pcd_sector)

pc_district <- cats_DT[postcode_sector_energy]
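
The aggregation above takes a simple mean of the LSOA means, so a small LSOA counts as much as a large one. A minimal sketch of a meter-weighted alternative follows, using made-up values and hypothetical names (`toy_lsoa`, `toy_sector`); it is an illustration, not the method used for the figures in this packet.

```r
library(data.table)

# Toy LSOA-level data (hypothetical values, not the BEIS figures)
toy_lsoa <- data.table(
  pcd_sector   = c("AA1 1", "AA1 1", "BB2 2"),
  mean_gas_kWh = c(10000, 20000, 15000),
  nGasMeters   = c(100, 300, 200)
)

toy_sector <- toy_lsoa[, .(
  # unweighted mean of LSOA means, as in the aggregation above
  mean_gas_unweighted = mean(mean_gas_kWh, na.rm = TRUE),
  # meter-weighted mean: each LSOA counts in proportion to its meters
  mean_gas_weighted = weighted.mean(mean_gas_kWh, w = nGasMeters, na.rm = TRUE)
), keyby = pcd_sector]

print(toy_sector)
```

In the toy data the two LSOAs in the first sector differ a lot in size, so the two means diverge; where LSOAs are similar in size the difference would be small.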

We could also use @SERL_UK’s smart meter gas/elec data, dwelling characteristics and pet ownership (but no species detail :-)

4 What do we find?

Well, in some places there seem to be a lot of estimated cats…

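(We calculated mean cats per household by dividing the estimated cat population by the number of electricity meters, which is probably a reasonable proxy for households. Where a district has no recorded electricity meters the result is Inf, as in the top rows of the output below.)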

pc_district[, mean_Cats := EstimatedCatPopulation/nElecMeters]
head(pc_district[, .(PostcodeDistrict, EstimatedCatPopulation, mean_Cats, nPostcodes, nElecMeters)][order(-mean_Cats)])
##    PostcodeDistrict EstimatedCatPopulation mean_Cats nPostcodes nElecMeters
## 1:             AB11                2072.99       Inf          1           0
## 2:             AB23                1094.73       Inf          4           0
## 3:             AB25                 851.00       Inf          5           0
## 4:             AB31                3961.30       Inf          4           0
## 5:             AB39                4902.58       Inf          1           0
## 6:             AB41                7457.56       Inf          1           0

LL23 is on the south east corner of the Snowdonia National Park… while EH25 is on the outskirts of Edinburgh.
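
Dividing by a meter count of zero gives `Inf`, which then sorts to the top of any ranking. One way to guard against that, sketched here on toy data with hypothetical district codes and values, is to set the ratio to `NA` wherever there are no meters and drop those rows before ordering.

```r
library(data.table)

# Toy district-level data (hypothetical codes and values)
toy_district <- data.table(
  PostcodeDistrict       = c("XX1", "XX2", "XX3"),
  EstimatedCatPopulation = c(500, 1200, 300),
  nElecMeters            = c(1000, 0, 400)
)

# Only divide where there is at least one meter; otherwise record NA
toy_district[, mean_Cats := fifelse(nElecMeters > 0,
                                    EstimatedCatPopulation / nElecMeters,
                                    NA_real_)]

# Rank districts, ignoring those with no usable denominator
ranked <- toy_district[!is.na(mean_Cats)][order(-mean_Cats)]
head(ranked)
```

Whether zero-meter districts reflect genuinely missing energy data or a mismatch between geographies is a separate question that this guard does not answer.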

4.1 More dwellings, more cats?

Is there a correlation between estimated total cats and the number of dwellings (electricity meters)?

ggplot2::ggplot(pc_district, aes(x = nElecMeters , y = EstimatedCatPopulation)) +
  geom_point() +
  geom_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 563 rows containing non-finite values (stat_smooth).
## Warning: Removed 563 rows containing missing values (geom_point).
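
Rows with missing values can arise from the join itself: in data.table, `X[Y]` keeps every row of `Y`, so any energy row without a matching cat estimate gets `NA` in the cat columns. The sketch below, on toy tables with hypothetical names and keys, shows one way to count and list such rows; it does not establish that this is the source of the rows dropped above.

```r
library(data.table)

# Toy tables (hypothetical keys and values)
toy_cats <- data.table(pcd_sector = c("XX1", "XX2"),
                       EstimatedCatPopulation = c(500, 1200))
toy_energy <- data.table(pcd_sector = c("XX1", "XX2", "XX3 4"),
                         nElecMeters = c(1000, 2500, 400))

setkey(toy_cats, pcd_sector)
setkey(toy_energy, pcd_sector)

# X[Y] keeps every row of Y (toy_energy); unmatched rows get NA cat columns
toy_joined <- toy_cats[toy_energy]

# Count energy rows without a cat estimate
n_unmatched <- toy_joined[is.na(EstimatedCatPopulation), .N]
print(n_unmatched)

# List keys present in the energy table but absent from the cat table
unmatched_keys <- toy_energy[!toy_cats, pcd_sector]
print(unmatched_keys)
```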

Is there a correlation between estimated cat ownership and total gas use?

ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_gas_kWh)) +
  geom_point()
## Warning: Removed 563 rows containing missing values (geom_point).
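
A scatter plot answers the correlation question by eye; a coefficient puts a number on it. The sketch below uses base R's `cor()` on toy values (hypothetical, not the real data), with `use = "complete.obs"` so rows with a missing value in either variable are dropped, much as ggplot2 drops them when plotting.

```r
library(data.table)

# Toy data with some missing values (hypothetical)
toy <- data.table(
  EstimatedCatPopulation = c(500, 1200, 300, NA, 2000, 800),
  total_gas_kWh          = c(4e6, 9e6, 2.5e6, 5e6, NA, 7e6)
)

# Pearson measures linear association
pearson <- toy[, cor(EstimatedCatPopulation, total_gas_kWh,
                     use = "complete.obs")]

# Spearman is rank-based, so less sensitive to a few very large districts
spearman <- toy[, cor(EstimatedCatPopulation, total_gas_kWh,
                      use = "complete.obs", method = "spearman")]

print(c(pearson = pearson, spearman = spearman))
```

Totals of cats and totals of energy are both likely to grow with the number of dwellings, so a high coefficient on totals may say more about district size than about cats.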

Or mean gas use and mean cats?

ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_gas_kWh)) +
  geom_point()
## Warning: Removed 1770 rows containing missing values (geom_point).

Or total electricity use and cats?

ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_elec_kWh)) +
  geom_point()
## Warning: Removed 563 rows containing missing values (geom_point).

Or mean elec use and mean cats?

ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_elec_kWh)) +
  geom_point()
## Warning: Removed 1423 rows containing missing values (geom_point).

Or total energy use and total cats?

pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]

ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_energy_kWh)) +
  geom_point()
## Warning: Removed 563 rows containing missing values (geom_point).

Well, there may be something in there? Let’s try a boxplot by cat deciles (Figure 4.1)…

pc_district[, cat_decile := dplyr::ntile(EstimatedCatPopulation, 10)]
#head(pc_district[is.na(cat_decile)])
ggplot2::ggplot(pc_district[!is.na(cat_decile)], aes(x = as.factor(cat_decile), y = total_energy_kWh/1000000)) +
  geom_boxplot() +
  labs(x = "Cat ownership deciles",
       y = "Total domestic electricity & gas GWh")

Figure 4.1: Cat ownership deciles and total annual residential electricity & gas use
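
The boxplot can be backed up with a small table of the same deciles. The sketch below builds rank-based deciles with data.table's `frank()` as a stand-in for `dplyr::ntile()` and reports the median energy use per decile in GWh. The data and the names `toy` and `decile_summary` are hypothetical, and the toy has no missing values, which the real data would need handled first.

```r
library(data.table)

# Toy data: 20 districts with hypothetical values
toy <- data.table(
  EstimatedCatPopulation = seq(100, 2000, by = 100),
  total_energy_kWh       = seq(1e6, 20e6, by = 1e6)
)

# Equal-sized rank-based groups (10 groups of 2 districts in this toy)
toy[, cat_decile := ceiling(10 * frank(EstimatedCatPopulation,
                                       ties.method = "first") / .N)]

# Median total energy per decile, converted from kWh to GWh
decile_summary <- toy[, .(nDistricts = .N,
                          median_energy_GWh = median(total_energy_kWh) / 1e6),
                      keyby = cat_decile]
print(decile_summary)
```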

Well…

pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
pc_district[, mean_energy_kWh := total_energy_kWh/nElecMeters]

ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_energy_kWh)) +
  geom_point()
## Warning: Removed 1423 rows containing missing values (geom_point).

5 R packages used

  • bookdown (Xie 2016a)
  • data.table (Dowle et al. 2015)
  • ggplot2 (Wickham 2009)
  • knitr (Xie 2016b)
  • rmarkdown (Allaire et al. 2018)

References

Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, and Winston Chang. 2018. Rmarkdown: Dynamic Documents for R. https://CRAN.R-project.org/package=rmarkdown.
Dowle, M, A Srinivasan, T Short, S Lianoglou with contributions from R Saporta, and E Antonyan. 2015. Data.table: Extension of Data.frame. https://CRAN.R-project.org/package=data.table.
Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.
Xie, Yihui. 2016a. Bookdown: Authoring Books and Technical Documents with R Markdown. Boca Raton, Florida: Chapman and Hall/CRC. https://github.com/rstudio/bookdown.
———. 2016b. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://CRAN.R-project.org/package=knitr.