---
params:
  subtitle: ""
  title: ""
  authors: ""
title: '`r params$title`'
subtitle: '`r params$subtitle`'
author: '`r params$authors`'
date: 'Last run at: `r Sys.time()`'
output:
  bookdown::html_document2:
    self_contained: true
    fig_caption: yes
    code_folding: hide
    number_sections: yes
    toc: yes
    toc_depth: 2
    toc_float: TRUE
  bookdown::pdf_document2:
    fig_caption: yes
    number_sections: yes
  bookdown::word_document2:
    fig_caption: yes
    number_sections: yes
    toc: yes
    toc_depth: 2
    fig_width: 5
bibliography: '`r path.expand("~/bibliography.bib")`'
---

<hr>

>This fridayFagPacket was first published as a [blog](https://dataknut.wordpress.com/2020/10/16/retrofit-or-bust/)

<hr>

# fridayFagPackets

Numbers that could have been done on the back of one and should probably come with a similar health warning...

>Find out [more](https://dataknut.github.io/fridayFagPackets/).

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(data.table)
library(ggplot2)
```

# It's the cats, stupid
Inspired by \@giulio_mattioli's [recent paper on the car dependence of dog ownership](https://twitter.com/giulio_mattioli/status/1466361022747455492) we thought we'd take a look at [cats](https://twitter.com/giulio_mattioli/status/1466710752606179331) and residential energy demand. Why? Well, people like to keep their cats warm but, more importantly, they also cut big holes in doors and/or windows to let the cats in and out. Hardly a thermally sealed envelope!

# What's the data?

For now we're using:

 * postcode district level estimates of the cat population in Great Britain. Does such a thing exist? [YEAH](https://data.gov.uk/dataset/febd29ff-7e7d-4f82-9908-031f7f0e0860/cat-population-per-postcode-district)! "_This dataset gives the mean estimate for population for each district, and was generated as part of the delivery of commissioned research. The data contained within this dataset are modelled figures, based on national estimates for pet population, and available information on Veterinary activity across GB. The data are accurate as of 01/01/2015. The data provided are summarised to the postcode district level. Further information on this research is available in a research publication by James Aegerter, David Fouracre & Graham C. Smith, discussing the structure and density of pet cat and dog populations across Great Britain._"
 * LSOA/SOA level data on [gas](https://www.gov.uk/government/collections/sub-national-gas-consumption-data) and [electricity](https://www.gov.uk/government/collections/sub-national-electricity-consumption-data) 'consumption', aggregated to postcode sectors
 * [Indices of Deprivation 2019](https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019) for England

```{r loadData}

gas_dt <- data.table::fread("~/Dropbox/data/beis/subnationalGas/lsoaDom/LSOA_GAS_2019.csv.gz")
gas_dt[, lsoa11cd := `Lower Layer Super Output Area (LSOA) Code`]
gas_dt[, mean_gas_kWh := `Mean consumption (kWh per meter)`]
gas_dt[, total_gas_kWh := `Consumption (kWh)`]
gas_dt[, nGasMeters := `Number of consuming meters`]

elec_dt <- data.table::fread("~/Dropbox/data/beis/subnationalElec/lsoaDom/LSOA_ELEC_2019.csv.gz")
elec_dt[, lsoa11cd := `Lower Layer Super Output Area (LSOA) Code`]
elec_dt[, mean_elec_kWh := `Mean domestic electricity consumption 
(kWh per meter)`]
elec_dt[, total_elec_kWh := `Total domestic electricity consumption (kWh)`]
elec_dt[, nElecMeters := `Total number of domestic electricity meters`]

setkey(gas_dt, lsoa11cd)
setkey(elec_dt, lsoa11cd)
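# NB: lsoa_DT is not created in this chunk. It needs to be loaded beforehand as an
# LSOA lookup which includes lsoa11cd, pcd_sector, ladnm, ladnmw and nPostcodes
setkey(lsoa_DT, lsoa11cd)

# join gas to electricity, then join both to the LSOA lookup (all lookup rows are kept)
merged_lsoa_DT <- gas_dt[, .(lsoa11cd, mean_gas_kWh, total_gas_kWh, nGasMeters)][
  elec_dt[, .(lsoa11cd, mean_elec_kWh, total_elec_kWh, nElecMeters)]][
  lsoa_DT]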

# check for LSOAs which did not map to a postcode sector
message("How many LSOAs do not map to a postcode sector?")
nrow(merged_lsoa_DT[is.na(pcd_sector)])
head(merged_lsoa_DT[is.na(pcd_sector)])

# aggregate to postcode sector
# NB: LSOAs with no pcd_sector are not filtered out here (no !is.na(pcd_sector) filter is applied)
postcode_sector_energy <- merged_lsoa_DT[, .(nLSOAs = .N,
                                             nPostcodes = sum(nPostcodes),
                                             mean_gas_kWh = mean(mean_gas_kWh, na.rm = TRUE), # unweighted mean of the LSOA means
                                             total_gas_kWh = sum(total_gas_kWh, na.rm = TRUE),
                                             mean_elec_kWh = mean(mean_elec_kWh, na.rm = TRUE), # unweighted mean of the LSOA means
                                             total_elec_kWh = sum(total_elec_kWh, na.rm = TRUE),
                                             nGasMeters = sum(nGasMeters, na.rm = TRUE),
                                             nElecMeters = sum(nElecMeters, na.rm = TRUE)), keyby = .(pcd_sector, ladnm, ladnmw)]
head(postcode_sector_energy)

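# cats
cats_DT <- data.table::fread("~/Dropbox/data/UK_Animal and Plant Health Agency/APHA0372-Cat_Density_Postcode_District.csv")
# the cat data is at postcode district level: copy the district code into
# pcd_sector so that it can be used as the join key
cats_DT[, pcd_sector := PostcodeDistrict]

setkey(cats_DT, pcd_sector)
setkey(postcode_sector_energy, pcd_sector)

# keep all energy rows, adding the cat estimates where the codes match
pc_district <- cats_DT[postcode_sector_energy]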

```
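
The join above relies on `pcd_sector` in the energy table holding codes that match `PostcodeDistrict` in the cat data. If a lookup held true postcode sectors instead (e.g. "LL23 7"), one option would be to roll the energy data up to districts before joining. The following is only a minimal sketch with made-up numbers: `sector_energy`, `toy_cats`, `pcd_district` and the assumed "district, space, digit" sector format are illustrative and are not part of the analysis above.

```r
library(data.table)

# Toy energy table keyed on postcode *sector* (hypothetical values)
sector_energy <- data.table(
  pcd_sector = c("LL23 7", "LL23 9", "EH25 9", "SO17 1"),
  total_gas_kWh = c(100, 200, 300, 400),
  nElecMeters = c(10, 20, 30, 40)
)

# Assumption: a sector is the district, a space, then the sector digit,
# so the district is everything before the first space
sector_energy[, pcd_district := sub(" .*$", "", pcd_sector)]

# Roll the totals up to district level
district_energy <- sector_energy[, .(
  total_gas_kWh = sum(total_gas_kWh, na.rm = TRUE),
  nElecMeters = sum(nElecMeters, na.rm = TRUE)
), keyby = pcd_district]

# Toy cat table (hypothetical values, column names as used in the chunk above)
toy_cats <- data.table(
  PostcodeDistrict = c("EH25", "LL23", "SO17"),
  EstimatedCatPopulation = c(300, 900, 1200)
)

# Keep every cat district and attach the district-level energy totals
toy_joined <- merge(toy_cats, district_energy,
                    by.x = "PostcodeDistrict", by.y = "pcd_district",
                    all.x = TRUE)
print(toy_joined)
```

Checking `nrow(pc_district[is.na(EstimatedCatPopulation)])` after the real join would show how many energy rows found no matching cat estimate.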

We could also use \@SERL_UK's [smart meter gas/elec data](https://twitter.com/dataknut/status/1466712963222540289?s=20), dwelling characteristics and pet ownership (but no species detail) :-)

# What do we find?

Well, in some places there seem to be a lot of estimated cats...

(We calculated mean cats per household by dividing the estimated cat population by the number of electricity meters, which is probably a reasonable proxy for the number of households.)

```{r maxCats}
pc_district[, mean_Cats := EstimatedCatPopulation/nElecMeters]
head(pc_district[, .(PostcodeDistrict, EstimatedCatPopulation, mean_Cats, nPostcodes, nElecMeters)][order(-mean_Cats)])
```
LL23 is on the south east corner of the [Snowdonia National Park...](https://www.google.co.uk/maps/place/Bala+LL23/@52.8953768,-3.775299,11z/data=!3m1!4b1!4m5!3m4!1s0x4865404ae1208f67:0x65a437b997c0dfb2!8m2!3d52.8825403!4d-3.6497989) while EH25 is on the outskirts of [Edinburgh](https://www.google.co.uk/maps/place/EH25/@55.8518992,-3.2076308,13z/data=!4m5!3m4!1s0x4887bf6548dd78d7:0xd6f980c5a3b93592!8m2!3d55.8560564!4d-3.1733124).
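
One thing to watch with this ratio: `nElecMeters` is a sum with `na.rm = TRUE`, so it may be zero where no meter counts were available, and dividing by zero would push that area to the top of the list with an infinite `mean_Cats`. A minimal sketch of one way to guard against that, using a made-up table (`toy` and its district codes are hypothetical):

```r
library(data.table)

# Toy stand-in for pc_district (hypothetical districts and values)
toy <- data.table(
  PostcodeDistrict = c("AA1", "AA2", "AA3"),
  EstimatedCatPopulation = c(500, 120, NA),
  nElecMeters = c(1000, 0, 250)
)

# Only divide where there is at least one meter; otherwise record NA
toy[, mean_Cats := fifelse(nElecMeters > 0,
                           EstimatedCatPopulation / nElecMeters,
                           NA_real_)]

print(toy[order(-mean_Cats)])
```

Whether any such zero-meter areas actually occur in the real data would need checking before deciding the guard is needed.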

## More dwellings, more cats?

Is there a correlation between estimated total cats and the number of dwellings (electricity meters)?

```{r testTotalElecMeters}
ggplot2::ggplot(pc_district, aes(x = nElecMeters , y = EstimatedCatPopulation)) +
  geom_point() +
  geom_smooth()
```
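
A scatter plot shows the shape of a relationship; a correlation coefficient puts a number on it. The sketch below uses base R's `cor()` on a small made-up data frame (`toy` and its values are hypothetical, standing in for `pc_district`), and `use = "complete.obs"` to drop rows with a missing value.

```r
# Toy data standing in for pc_district (hypothetical values)
toy <- data.frame(
  nElecMeters = c(1000, 2500, 4000, 5500, 7000, NA),
  EstimatedCatPopulation = c(300, 700, 1300, 1500, 2200, 900)
)

# Pearson correlation (linear association), ignoring incomplete rows
r_pearson <- cor(toy$nElecMeters, toy$EstimatedCatPopulation,
                 method = "pearson", use = "complete.obs")

# Spearman rank correlation (monotonic association), ignoring incomplete rows
r_spearman <- cor(toy$nElecMeters, toy$EstimatedCatPopulation,
                  method = "spearman", use = "complete.obs")

print(c(pearson = r_pearson, spearman = r_spearman))
```

The same two calls could be pointed at any of the pairs of columns plotted in this section.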

Is there a correlation between estimated cat ownership and total gas use?

```{r testTotalGas}
ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_gas_kWh)) +
  geom_point()
```

Or mean gas use and mean cats?

```{r testMeanGas}
ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_gas_kWh)) +
  geom_point()
```
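
Note that `mean_gas_kWh` here is the plain mean of the LSOA-level means, so an LSOA with 100 meters counts as much as one with 900. A meter-weighted mean (total consumption divided by total meters) can give a different answer. A minimal sketch with made-up numbers, where `toy_lsoa`, the sector code and the two result columns are all hypothetical:

```r
library(data.table)

# Two toy LSOAs in one (hypothetical) postcode sector
toy_lsoa <- data.table(
  pcd_sector = c("AA1 1", "AA1 1"),
  mean_gas_kWh = c(10000, 20000),
  nGasMeters = c(900, 100)
)
toy_lsoa[, total_gas_kWh := mean_gas_kWh * nGasMeters]

toy_sector <- toy_lsoa[, .(
  # plain mean of the LSOA means, as in the loadData chunk
  unweighted_mean_kWh = mean(mean_gas_kWh, na.rm = TRUE),
  # total consumption divided by total meters
  weighted_mean_kWh = sum(total_gas_kWh, na.rm = TRUE) / sum(nGasMeters, na.rm = TRUE)
), keyby = pcd_sector]

print(toy_sector)
```

In this toy case the unweighted mean is 15,000 kWh while the meter-weighted mean is 11,000 kWh. With the real data the equivalent would be `total_gas_kWh / nGasMeters` on the aggregated table.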

Or total electricity use and cats?

```{r testTotalElec}
ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_elec_kWh)) +
  geom_point()
```

Or mean elec use and mean cats?

```{r testMeanElec}
ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_elec_kWh)) +
  geom_point()
```

Or total energy use and total cats?

```{r testTotalEnergy}
pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]

ggplot2::ggplot(pc_district, aes(x = EstimatedCatPopulation, y = total_energy_kWh)) +
  geom_point()
```

Well, there may be something in there? Let's try a boxplot by cat deciles (Figure \@ref(fig:catDeciles))...

```{r catDeciles, fig.cap = "Cat ownership deciles and total annual residential electricity & gas use"}
pc_district[, cat_decile := dplyr::ntile(EstimatedCatPopulation, 10)]
#head(pc_district[is.na(cat_decile)])
ggplot2::ggplot(pc_district[!is.na(cat_decile)], aes(x = as.factor(cat_decile), y = total_energy_kWh/1000000)) +
  geom_boxplot() +
  labs(x = "Cat ownership deciles",
       y = "Total domestic electricity & gas GWh")
```
Well...
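
To see what the boxplot is summarising, the deciles can also be tabulated. The sketch below uses `dplyr::ntile()` as in the chunk above, but on a small made-up table (`toy` and `decile_summary` are hypothetical names and the values are invented): it ranks the areas by estimated cat population, splits them into ten roughly equal-sized groups and reports the median energy use in each.

```r
library(data.table)

# Toy stand-in for pc_district: 20 areas with made-up values
toy <- data.table(
  EstimatedCatPopulation = seq(100, 2000, by = 100),
  total_energy_kWh = seq(1e6, 20e6, by = 1e6)
)

# Decile 1 holds the areas with the fewest estimated cats, decile 10 the most
toy[, cat_decile := dplyr::ntile(EstimatedCatPopulation, 10)]

# Number of areas and median energy use (in GWh) per decile
decile_summary <- toy[, .(
  nAreas = .N,
  median_energy_GWh = median(total_energy_kWh) / 1e6
), keyby = cat_decile]

print(decile_summary)
```

Because these are deciles of total cats, they mostly sort areas by size, which is worth keeping in mind when reading the boxplot.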

```{r testMeanEnergy}
pc_district[, total_energy_kWh := total_gas_kWh + total_elec_kWh]
pc_district[, mean_energy_kWh := total_energy_kWh/nElecMeters]

ggplot2::ggplot(pc_district, aes(x = mean_Cats, y = mean_energy_kWh)) +
  geom_point()
```

# R packages used

 * bookdown [@bookdown]
 * data.table [@data.table]
 * ggplot2 [@ggplot2]
 * knitr [@knitr]
 * rmarkdown [@rmarkdown]
 
# References