@dataknut
This work is (c) the author(s).
This work is licensed under a Creative Commons Attribution 4.0 International License unless otherwise marked.
For the avoidance of doubt and explanation of terms please refer to the full license notice and legal code.
If you wish to use any of the material from this paper please cite as:
This work is (c) 2018 the authors.
This report uses circuit level extracts for ‘Heat Pumps’ from the NZ GREEN Grid Household Electricity Demand Data (https://dx.doi.org/10.5255/UKDA-SN-853334 (Anderson et al. 2018)). These have been extracted using the code found in https://github.com/CfSOtago/GREENGridData/blob/master/examples/code/extractCleanGridSpy1minCircuit.R
This work was supported by:
This report contains the analysis for a paper of the same name. The text is stored elsewhere for ease of editing.
season | mean of household mean W | sd of household mean W |
---|---|---|
Spring | 58.80597 | 113.53102 |
Summer | 35.13947 | 83.90258 |
Autumn | 68.37439 | 147.37279 |
Winter | 162.66915 | 325.51171 |
Observations are summarised to mean W per household during 16:00 - 20:00 on weekdays for year = 2015.
Figure 4.1 shows the initial p = 0.01 plot.
Figure 4.1: Power analysis results (p = 0.01, power = 0.8)
Effect size at n = 1000: 28.37.
Figure 4.2 shows the plot for all results.
Figure 4.2: Power analysis results (power = 0.8)
Full table of results:
sampleN | p = 0.01 | p = 0.05 | p = 0.1 | p = 0.2 |
---|---|---|---|---|
50 | 128.57 | 100.21 | 85.33 | 67.49 |
100 | 90.27 | 70.61 | 60.21 | 47.68 |
150 | 73.53 | 57.58 | 49.13 | 38.92 |
200 | 63.61 | 49.84 | 42.53 | 33.70 |
250 | 56.86 | 44.56 | 38.03 | 30.14 |
300 | 51.88 | 40.67 | 34.71 | 27.51 |
350 | 48.01 | 37.65 | 32.14 | 25.47 |
400 | 44.90 | 35.21 | 30.06 | 23.82 |
450 | 42.33 | 33.20 | 28.34 | 22.46 |
500 | 40.15 | 31.49 | 26.88 | 21.31 |
550 | 38.27 | 30.02 | 25.63 | 20.31 |
600 | 36.64 | 28.74 | 24.54 | 19.45 |
650 | 35.20 | 27.61 | 23.57 | 18.69 |
700 | 33.92 | 26.61 | 22.72 | 18.01 |
750 | 32.77 | 25.71 | 21.95 | 17.40 |
800 | 31.72 | 24.89 | 21.25 | 16.84 |
850 | 30.77 | 24.14 | 20.61 | 16.34 |
900 | 29.91 | 23.46 | 20.03 | 15.88 |
950 | 29.11 | 22.84 | 19.50 | 15.46 |
1000 | 28.37 | 22.26 | 19.00 | 15.06 |
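As a rough cross-check, the effect sizes in the table above can be approximated with base R's `stats::power.t.test()`. This is only a sketch under assumptions that are not stated in the text: a one-sided, two-sample t test, the Winter mean (162.67 W) and standard deviation (325.51 W) from the seasonal table as inputs, and effect sizes expressed as a percentage of that mean. The helper name is illustrative, and this is not necessarily the calculation used to produce the table.

```r
# Assumed inputs: Winter mean and sd (W) from the seasonal summary table
winterMean <- 162.66915
winterSd <- 325.51171

# Hypothetical helper: smallest detectable difference, as a % of the mean,
# for n households per group (one-sided, two-sample t test assumed)
detectableEffectPct <- function(n, sigLevel, power = 0.8) {
  res <- stats::power.t.test(n = n, sd = winterSd, sig.level = sigLevel,
                             power = power, type = "two.sample",
                             alternative = "one.sided")
  100 * res$delta / winterMean
}

# Effect size (%) detectable with 1000 households per group at p = 0.01
round(detectableEffectPct(n = 1000, sigLevel = 0.01), 2)
```

Under these assumptions the result for n = 1000 and p = 0.01 should be close to the 28.37 reported above. Smaller samples give larger detectable effects.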
This calculation does not require a sample. As a relatively simple example, suppose we were interested in the adoption of heat pumps in two equal-sized samples. Suppose we thought that uptake might be 40% in one sample (say, home owners) and 25% in the other (rental properties) (ref BRANZ 2015). What sample size would we need to conclude a significant difference with power = 0.8 and at various p values?
pwr::pwr.2p.test() (ref pwr) can give us the answer…
n | sig.level | power | props |
---|---|---|---|
224.94 | 0.01 | 0.8 | p1 = 0.4 p2 = 0.25 |
151.17 | 0.05 | 0.8 | p1 = 0.4 p2 = 0.25 |
119.07 | 0.10 | 0.8 | p1 = 0.4 p2 = 0.25 |
86.73 | 0.20 | 0.8 | p1 = 0.4 p2 = 0.25 |
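A minimal sketch of how the table above could be reproduced, assuming the `pwr` package is installed and that a two-sided test was used (the default in `pwr.2p.test()`). `ES.h()` computes the arcsine-transformed effect size for the two assumed proportions. The variable names are illustrative.

```r
library(pwr)

# Arcsine-transformed effect size for the assumed proportions
h <- ES.h(p1 = 0.40, p2 = 0.25)

# Significance levels used in the table above
sigLevels <- c(0.01, 0.05, 0.10, 0.20)

# Solve for n (per group) at power = 0.8 for each significance level
nPerGroup <- sapply(sigLevels, function(a) {
  pwr.2p.test(h = h, sig.level = a, power = 0.8)$n
})

data.frame(sig.level = sigLevels, n = round(nPerGroup, 2))
```

Changing `p1` and `p2` in the call to `ES.h()` repeats the calculation for other assumed proportions.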
We can repeat this for other values of p1 and p2. For example, suppose both were much smaller (e.g. 10% and 15%)… Clearly we need much larger samples.
n | sig.level | power | props |
---|---|---|---|
1012.35 | 0.01 | 0.8 | p1 = 0.1 p2 = 0.15 |
680.35 | 0.05 | 0.8 | p1 = 0.1 p2 = 0.15 |
535.89 | 0.10 | 0.8 | p1 = 0.1 p2 = 0.15 |
390.31 | 0.20 | 0.8 | p1 = 0.1 p2 = 0.15 |
The above used an arcsine transform.
As a double check, we can use the following equation to assess the margin of error:
\[me = \pm z \sqrt{\frac{p(1-p)}{n-1}}\]
If:
then the margin of error = +/- 0.078 (7.8%). So we could quote the Heat Pump uptake for owner-occupiers as 40% (+/- 7.8% [or 32.2 - 47.8] with p = 0.05).
This may be far too wide an error margin for our purposes, so we might instead have recruited 500 per sample. The margin of error would then be +/- 0.043 (4.3%), and we could quote the Heat Pump uptake for owner-occupiers as 40% (+/- 4.3% [or 35.7 - 44.3] with p = 0.05).
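The margin of error calculation can be sketched in base R as follows. The inputs are assumptions, chosen because they are consistent with the figures quoted above: p = 0.4 (the assumed owner-occupier uptake), z = 1.96 (the two-sided 95% normal quantile) and n = 151 or n = 500 per sample. The function name is illustrative.

```r
# Margin of error for a proportion: z * sqrt(p * (1 - p) / (n - 1))
marginOfError <- function(p, n, z = qnorm(0.975)) {
  z * sqrt(p * (1 - p) / (n - 1))
}

p <- 0.4 # assumed uptake for owner-occupiers

moe151 <- marginOfError(p, n = 151) # assumed n, from the p = 0.05 row
moe500 <- marginOfError(p, n = 500) # the larger sample discussed above

round(c(n151 = moe151, n500 = moe500), 3)
```

Because n sits under a square root, roughly tripling the sample from 151 to 500 only narrows the margin of error by a little under half.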
group | mean W | sd W | n households |
---|---|---|---|
Control | 162.66915 | 325.51171 | 28 |
Intervention 1 | 35.13947 | 83.90258 | 22 |
Intervention 2 | 58.80597 | 113.53102 | 26 |
Intervention 3 | 68.37439 | 147.37279 | 29 |
T test Group 1
Control mean | Group 1 mean | Mean difference | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|
162.6691 | 35.13947 | -127.5297 | -1.990661 | 0.0552626 | -258.11 | 3.050644 |
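The statistic and confidence interval in the table above are consistent with a Welch (unequal variance) two-sample t test. The sketch below recomputes them from the values in the group summary table only. It assumes a Welch test and uses an illustrative helper name, so it shows the arithmetic rather than the original analysis code, which would have been run on the household-level data.

```r
# Welch two-sample t test from summary statistics (hypothetical helper)
welchFromSummary <- function(m1, s1, n1, m2, s2, n2, confLevel = 0.95) {
  v1 <- s1^2 / n1 # squared standard error of mean, group 1
  v2 <- s2^2 / n2 # squared standard error of mean, group 2
  se <- sqrt(v1 + v2)
  diff <- m2 - m1
  tStat <- diff / se
  # Welch-Satterthwaite degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  tCrit <- qt(1 - (1 - confLevel) / 2, df)
  list(diff = diff, statistic = tStat, df = df,
       p.value = 2 * pt(-abs(tStat), df),
       conf.low = diff - tCrit * se, conf.high = diff + tCrit * se)
}

# Control vs Intervention 1, values taken from the group summary table
res <- welchFromSummary(m1 = 162.66915, s1 = 325.51171, n1 = 28,
                        m2 = 35.13947, s2 = 83.90258, n2 = 22)
round(unlist(res), 3)
```

The large standard deviation of the control group dominates the standard error, which is why even a difference of this size gives a confidence interval that spans zero with so few households.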
The results show that the mean power demand for the control group was 162.67 W and for Intervention 1 was 35.14 W. This is a (very) large difference in the means of 127.53 W. The results of the t test are:
T test Group 2
Control mean | Group 2 mean | Mean difference | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|
162.6691 | 58.80597 | -103.8632 | -1.587604 | 0.1216582 | -236.8285 | 29.10212 |
Now:
Detecting Intervention Group 2’s effect size of 63.85% would have required control and trial groups of 47 households each.
group | mean W | sd W | n households |
---|---|---|---|
Control | 169.60064 | 328.56355 | 1140 |
Intervention 1 | 34.50149 | 82.94015 | 907 |
Intervention 2 | 59.84020 | 112.74650 | 1008 |
Intervention 3 | 73.24102 | 148.25869 | 1145 |
Figure 5.1: Mean W demand per group for large sample (Error bars = 95% confidence intervals for the sample mean)
re-run T tests Group 2
Control mean | Group 2 mean | Mean difference | statistic | p.value | conf.low | conf.high |
---|---|---|---|---|---|---|
169.6006 | 59.8402 | -109.7604 | -10.59573 | < 0.001 | -130.0807 | -89.44015 |
In this case:
Analysis completed in 51.37 seconds ( 0.86 minutes) using knitr in RStudio with R version 3.5.1 (2018-07-02) running on x86_64-apple-darwin15.6.0.
R packages used:
devtools::install_github("dataknut/dkUtils")
Session info:
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.20 pwr_1.2-2 forcats_0.3.0
## [4] broom_0.5.0 lubridate_1.7.4 readr_1.1.1
## [7] ggplot2_3.1.0 dplyr_0.7.7 data.table_1.11.8
## [10] dkUtils_0.0.0.9000
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.19 highr_0.7 pillar_1.3.0
## [4] compiler_3.5.1 plyr_1.8.4 bindr_0.1.1
## [7] tools_3.5.1 digest_0.6.18 lattice_0.20-35
## [10] nlme_3.1-137 evaluate_0.12 tibble_1.4.2
## [13] gtable_0.2.0 pkgconfig_2.0.2 rlang_0.3.0.1
## [16] cli_1.0.1 yaml_2.2.0 xfun_0.4
## [19] bindrcpp_0.2.2 withr_2.1.2 stringr_1.3.1
## [22] hms_0.4.2 rprojroot_1.3-2 grid_3.5.1
## [25] tidyselect_0.2.5 glue_1.3.0 R6_2.3.0
## [28] fansi_0.4.0 rmarkdown_1.10 bookdown_0.7
## [31] reshape2_1.4.3 weGotThePower_0.1 tidyr_0.8.1
## [34] purrr_0.2.5 magrittr_1.5 backports_1.1.2
## [37] scales_1.0.0 htmltools_0.3.6 assertthat_0.2.0
## [40] colorspace_1.3-2 labeling_0.3 utf8_1.1.4
## [43] stringi_1.2.4 lazyeval_0.2.1 munsell_0.5.0
## [46] crayon_1.3.4
Anderson, Ben, David Eyers, Rebecca Ford, Diana Giraldo Ocampo, Rana Peniamina, Janet Stephenson, Kiti Suomalainen, Lara Wilcocks, and Michael Jack. 2018. “New Zealand GREEN Grid Household Electricity Demand Study 2014-2018,” September. doi:10.5255/UKDA-SN-853334.
Champely, Stephane. 2018. Pwr: Basic Functions for Power Analysis. https://CRAN.R-project.org/package=pwr.
Csárdi, Gábor, and Rich FitzJohn. 2016. Progress: Terminal Progress Bars. https://CRAN.R-project.org/package=progress.
Dowle, M, A Srinivasan, T Short, S Lianoglou with contributions from R Saporta, and E Antonyan. 2015. Data.table: Extension of Data.frame. https://CRAN.R-project.org/package=data.table.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.
R Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.
Wickham, Hadley, and Romain Francois. 2016. Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.
Wickham, Hadley, Jim Hester, and Romain Francois. 2016. Readr: Read Tabular Data. https://CRAN.R-project.org/package=readr.
Xie, Yihui. 2016. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://CRAN.R-project.org/package=knitr.