1 About

1.1 License

This work is (c) the author(s).

This work is licensed under a Creative Commons Attribution 4.0 International License unless otherwise marked.

For the avoidance of doubt and explanation of terms please refer to the full license notice and legal code.

1.2 Citation

If you wish to use any of the material from this paper please cite as:

  • Ben Anderson, Tom Rushby, Abubakr Bahaj and Patrick James. (2019) Statistical Power, Statistical Significance, Study Design and Decision Making (A tale of three countries…), Southampton: University of Southampton.

This work is (c) 2019 the authors.

1.3 History

Code & report history:

1.4 Data

This report uses:

  • Irish CER Smart meter electricity consumption (kWh) data for the pre-trial periods of:
    • October 2009 (Autumn)
    • December 2009 (Winter)
  • UK SAVE household efficiency/demand response interventions trial ‘smart meter’ electricity consumption (kWh) data for the pre-trial period of:
    • January 2017 (Winter)
  • NZ GREEN Grid Household electricity demand (kW) data (Anderson et al. 2018) for the period of:
    • June - July 2015 (Winter)
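
The CER and SAVE data are half-hourly energy readings (kWh), whereas the NZ GREEN Grid data are power readings (the sumW column in Table 6.4, which appears to be in W at one-minute resolution). As a minimal sketch of how such readings could be put on a half-hourly kWh basis, the base R code below averages power within each half hour and multiplies by 0.5 hours. The readings, the halfHour column and the conversion step itself are illustrative assumptions; this report does not state whether or how the authors converted the data.

```r
# Hypothetical one-minute power readings (W) for one household:
# 30 minutes at 600 W followed by 30 minutes at 1200 W.
readings <- data.frame(
  r_dateTime = seq(as.POSIXct("2015-06-01 00:00:00", tz = "UTC"),
                   by = "1 min", length.out = 60),
  sumW = rep(c(600, 1200), each = 30)
)

# Label each reading with the start of its half hour (1800 seconds).
readings$halfHour <- as.POSIXct(
  floor(as.numeric(readings$r_dateTime) / 1800) * 1800,
  origin = "1970-01-01", tz = "UTC"
)

# Mean power (W) per half hour, then energy: kWh = (W / 1000) * 0.5 h.
halfHourly <- aggregate(sumW ~ halfHour, data = readings, FUN = mean)
halfHourly$kWh <- halfHourly$sumW / 1000 * 0.5

print(halfHourly)
```

Averaging before converting means a half hour with a few missing minutes still yields an estimate, although how to treat such gaps is a separate design decision.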

1.5 Acknowledgements

This work was supported by:

2 Introduction

This report contains the analysis for a paper of the same name. The text is stored elsewhere for ease of editing.

3 Data

4 Scenarios

  • Statistical power (P) = 0.8, significance threshold p < 0.05 and an effect size of 6%
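
To illustrate how a scenario of this kind translates into a required sample size, the sketch below uses base R's power.t.test(). This is swapped in for the pwr package listed in the R environment section so that the example runs without extra installs. The mean and standard deviation are the observation-level CER October 2009 values from Table 6.1 and serve purely as placeholders: the analysis in the paper may well use household-level means and a different test, so the output should not be read as a result of this report. The sketch also assumes that "effect size of 6%" means a 6% difference in mean kWh between two equal-sized groups.

```r
# Placeholder inputs (assumptions, not the values used in the paper):
mean_kwh <- 0.4921  # observation-level mean kWh, Table 6.1
sd_kwh   <- 0.66    # observation-level sd of kWh, skim summary of the same data

effect_pct <- 0.06                   # 6% effect size from the scenario
delta      <- effect_pct * mean_kwh  # absolute difference in means to detect

# Solve for n per group in a two-sided, two-sample t-test.
res <- power.t.test(
  delta       = delta,
  sd          = sd_kwh,
  sig.level   = 0.05,
  power       = 0.8,
  type        = "two.sample",
  alternative = "two.sided"
)

n_per_group <- ceiling(res$n)
print(n_per_group)
```

Because the standardised effect (delta / sd) is small with these placeholder inputs, the required sample runs to several thousand per group; a smaller standard deviation or a larger effect would reduce it sharply.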

5 Compare power

6 Statistical Annex

6.1 CER Data

Data as loaded and processed but before any filtering or exclusions…

October:

Table 6.1: Summary of CER October 2009 data
ID            AllocCode         kWh              r_dateTime                   peakLabel
Min. :1002    Length:6105125    Min. : 0.0000    Min. :2009-10-01 00:00:00    Length:6105125
1st Qu.:2603  Class :character  1st Qu.: 0.1180  1st Qu.:2009-10-08 12:30:00  Class :character
Median :4242  Mode :character   Median : 0.2480  Median :2009-10-16 01:00:00  Mode :character
Mean :4226    NA                Mean : 0.4921    Mean :2009-10-16 00:39:59    NA
3rd Qu.:5845  NA                3rd Qu.: 0.5770  3rd Qu.:2009-10-23 13:30:00  NA
Max. :7443    NA                Max. :12.9700    Max. :2009-10-31 00:00:00    NA
## Skim summary statistics
##  n obs: 6105125 
##  n variables: 5 
## 
## ── Variable type:character ────────────────────────────────────────────────────────────────────────────────────────
##   variable missing complete       n min max empty n_unique
##  AllocCode       0  6105125 6105125  11  11     0        1
##  peakLabel       0  6105125 6105125   5   7     0        2
## 
## ── Variable type:integer ──────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete       n    mean      sd   p0  p25  p50  p75
##        ID       0  6105125 6105125 4225.57 1860.84 1002 2603 4242 5845
##  p100     hist
##  7443 ▇▇▇▇▇▇▇▇
## 
## ── Variable type:numeric ──────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete       n mean   sd p0  p25  p50  p75  p100
##       kWh       0  6105125 6105125 0.49 0.66  0 0.12 0.25 0.58 12.97
##      hist
##  ▇▁▁▁▁▁▁▁
## 
## ── Variable type:POSIXct ──────────────────────────────────────────────────────────────────────────────────────────
##    variable missing complete       n        min        max     median
##  r_dateTime       0  6105125 6105125 2009-10-01 2009-10-31 2009-10-16
##  n_unique
##      1441

December:

Table 6.2: Summary of CER December 2009 data
ID            AllocCode         kWh              r_dateTime                   peakLabel
Min. :1002    Length:6088225    Min. : 0.0000    Min. :2009-12-01 00:00:00    Length:6088225
1st Qu.:2603  Class :character  1st Qu.: 0.1290  1st Qu.:2009-12-08 12:00:00  Class :character
Median :4242  Mode :character   Median : 0.3110  Median :2009-12-16 00:00:00  Mode :character
Mean :4226    NA                Mean : 0.6218    Mean :2009-12-16 00:00:00    NA
3rd Qu.:5845  NA                3rd Qu.: 0.7820  3rd Qu.:2009-12-23 12:00:00  NA
Max. :7443    NA                Max. :13.6780    Max. :2009-12-31 00:00:00    NA
## Skim summary statistics
##  n obs: 6088225 
##  n variables: 5 
## 
## ── Variable type:character ────────────────────────────────────────────────────────────────────────────────────────
##   variable missing complete       n min max empty n_unique
##  AllocCode       0  6088225 6088225  11  11     0        1
##  peakLabel       0  6088225 6088225   5   7     0        2
## 
## ── Variable type:integer ──────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete       n    mean      sd   p0  p25  p50  p75
##        ID       0  6088225 6088225 4225.57 1860.84 1002 2603 4242 5845
##  p100     hist
##  7443 ▇▇▇▇▇▇▇▇
## 
## ── Variable type:numeric ──────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete       n mean   sd p0  p25  p50  p75  p100
##       kWh       0  6088225 6088225 0.62 0.81  0 0.13 0.31 0.78 13.68
##      hist
##  ▇▁▁▁▁▁▁▁
## 
## ── Variable type:POSIXct ──────────────────────────────────────────────────────────────────────────────────────────
##    variable missing complete       n        min        max     median
##  r_dateTime       0  6088225 6088225 2009-12-01 2009-12-31 2009-12-16
##  n_unique
##      1441

6.2 SAVE Data

Table 6.3: Summary of SAVE January 2017 data
bmg_id             trialGroupNavetas  peakLabel         r_dateTime                   kWh
Min. :956600012    Length:4821343     Length:4821343    Min. :2017-01-01 00:00:00    Min. : 0.0000
1st Qu.:956621842  Class :character   Class :character  1st Qu.:2017-01-08 15:00:00  1st Qu.: 0.0610
Median :956634684  Mode :character    Mode :character   Median :2017-01-15 19:30:00  Median : 0.1270
Mean :956636290    NA                 NA                Mean :2017-01-15 17:17:38    Mean : 0.2678
3rd Qu.:956649817  NA                 NA                3rd Qu.:2017-01-22 21:00:00  3rd Qu.: 0.2970
Max. :956663251    NA                 NA                Max. :2017-01-29 23:30:00    Max. :11.7810
NA                 NA                 NA                NA                           NA’s :557
## Skim summary statistics
##  n obs: 4821343 
##  n variables: 5 
## 
## ── Variable type:character ────────────────────────────────────────────────────────────────────────────────────────
##           variable missing complete       n min max empty n_unique
##          peakLabel       0  4821343 4821343   5   7     0        2
##  trialGroupNavetas       0  4821343 4821343   9   9     0        4
## 
## ── Variable type:integer ──────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete       n    mean       sd      p0     p25
##    bmg_id       0  4821343 4821343 9.6e+08 16775.45 9.6e+08 9.6e+08
##      p50     p75    p100     hist
##  9.6e+08 9.6e+08 9.6e+08 ▁▅▆▆▅▅▃▇
## 
## ── Variable type:numeric ──────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete       n mean   sd p0   p25  p50 p75  p100
##       kWh     557  4820786 4821343 0.27 0.39  0 0.061 0.13 0.3 11.78
##      hist
##  ▇▁▁▁▁▁▁▁
## 
## ── Variable type:POSIXct ──────────────────────────────────────────────────────────────────────────────────────────
##    variable missing complete       n        min        max     median
##  r_dateTime       0  4821343 4821343 2017-01-01 2017-01-29 2017-01-15
##  n_unique
##      1392

6.3 NZ Green Grid Data

Table 6.4: Summary of NZ Green Grid data
time_nz           r_dateTime                   linkID            sumW             peakLabel
Length:15163214   Min. :2015-03-31 11:00:00    Length:15163214   Min. :-1084.7    Length:15163214
Class :character  1st Qu.:2015-06-23 20:38:00  Class :character  1st Qu.: 211.2   Class :character
Mode :character   Median :2015-09-17 15:41:30  Mode :character   Median : 425.9   Mode :character
NA                Mean :2015-09-22 04:22:52    NA                Mean : 942.3     NA
NA                3rd Qu.:2015-12-19 12:37:00  NA                3rd Qu.: 1172.6  NA
NA                Max. :2016-03-31 10:59:00    NA                Max. :13999.8    NA
## Skim summary statistics
##  n obs: 15163214 
##  n variables: 5 
## 
## ── Variable type:character ────────────────────────────────────────────────────────────────────────────────────────
##   variable missing complete        n min max empty n_unique
##     linkID       0 15163214 15163214   5   6     0       33
##  peakLabel       0 15163214 15163214   5   7     0        2
##    time_nz       0 15163214 15163214  20  20     0   526980
## 
## ── Variable type:numeric ──────────────────────────────────────────────────────────────────────────────────────────
##  variable missing complete        n  mean      sd       p0    p25    p50
##      sumW       0 15163214 15163214 942.3 1182.83 -1084.72 211.25 425.93
##      p75     p100     hist
##  1172.62 13999.81 ▇▂▁▁▁▁▁▁
## 
## ── Variable type:POSIXct ──────────────────────────────────────────────────────────────────────────────────────────
##    variable missing complete        n        min        max     median
##  r_dateTime       0 15163214 15163214 2015-03-31 2016-03-31 2015-09-17
##  n_unique
##    526980
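
Each data set in this annex is described twice: once in the format produced by base R's summary() and once in the format produced by skimr's skim(). The sketch below shows how a pair of summaries like this might be generated. The small data frame is entirely hypothetical (it borrows only the column names of Table 6.1), and the skim() call is guarded so the example still runs where skimr is not installed.

```r
# Hypothetical half-hourly readings; values are made up for illustration.
dt <- data.frame(
  ID = rep(c(1002L, 1003L), each = 4),
  kWh = c(0.12, 0.25, 0.58, 0.40, 0.10, 0.31, 0.78, NA),
  r_dateTime = rep(
    seq(as.POSIXct("2009-10-01 00:00:00", tz = "UTC"),
        by = "30 min", length.out = 4),
    times = 2
  )
)

# Base R summary: min, quartiles, mean, max and an NA count per column.
s <- summary(dt)
print(s)

# Explicit per-column count of missing values.
n_missing <- colSums(is.na(dt))
print(n_missing)

# Optional richer summary, only if skimr is available.
if (requireNamespace("skimr", quietly = TRUE)) {
  print(skimr::skim(dt))
}
```

Checking the missing-value counts at this stage, before any filtering or exclusions, makes it easier to see what later processing steps remove.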

7 Runtime

Analysis completed in 84.81 seconds (1.41 minutes) using knitr in RStudio with R version 3.5.2 (2018-12-20) running on x86_64-apple-darwin15.6.0.

8 R environment

R packages used:

  • base R - for the basics (R Core Team 2016)
  • data.table - for fast (big) data handling (Dowle et al. 2015)
  • ggplot2 - for slick graphics (Wickham 2009)
  • knitr - to create this document & neat tables (Xie 2016)
  • lubridate - date manipulation (Grolemund and Wickham 2011)
  • pwr - non-base power analysis (Champely 2018)
  • skimr - for data skimming (Arino de la Rubia et al. 2017)

and

  • dkUtils - for local dataknut utilities :-) Install with devtools::install_github("dataknut/dkUtils")

Session info:

## R version 3.5.2 (2018-12-20)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Matrix products: default
## BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] kableExtra_1.0.1   pwr_1.2-2          forcats_0.4.0     
##  [4] broom_0.5.1        ggplot2_3.1.0      dkUtils_0.0.0.9000
##  [7] tidyselect_0.2.5   lubridate_1.7.4    here_0.1          
## [10] data.table_1.12.0  drake_7.1.0       
## 
## loaded via a namespace (and not attached):
##  [1] storr_1.2.1       xfun_0.5          purrr_0.3.1      
##  [4] lattice_0.20-38   colorspace_1.4-0  generics_0.0.2   
##  [7] viridisLite_0.3.0 htmltools_0.3.6   yaml_2.2.0       
## [10] utf8_1.1.4        rlang_0.3.1       pillar_1.3.1     
## [13] glue_1.3.0        withr_2.1.2       plyr_1.8.4       
## [16] stringr_1.4.0     munsell_0.5.0     gtable_0.2.0     
## [19] rvest_0.3.2       visNetwork_2.0.5  htmlwidgets_1.3  
## [22] evaluate_0.13     knitr_1.22        fansi_0.4.0      
## [25] highr_0.7         Rcpp_1.0.0        readr_1.3.1      
## [28] backports_1.1.3   scales_1.0.0      webshot_0.5.1    
## [31] jsonlite_1.6      hms_0.4.2         digest_0.6.18    
## [34] stringi_1.3.1     bookdown_0.9      dplyr_0.8.0.1    
## [37] rprojroot_1.3-2   grid_3.5.2        cli_1.0.1        
## [40] tools_3.5.2       magrittr_1.5      base64url_1.4    
## [43] lazyeval_0.2.1    tibble_2.0.1      crayon_1.3.4     
## [46] tidyr_0.8.3       pkgconfig_2.0.2   xml2_1.2.0       
## [49] skimr_1.0.5       httr_1.4.0        assertthat_0.2.0 
## [52] rmarkdown_1.11    rstudioapi_0.9.0  R6_2.4.0         
## [55] igraph_1.2.4      nlme_3.1-137      compiler_3.5.2

References

Anderson, Ben, David Eyers, Rebecca Ford, Diana Giraldo Ocampo, Rana Peniamina, Janet Stephenson, Kiti Suomalainen, Lara Wilcocks, and Michael Jack. 2018. “New Zealand GREEN Grid Household Electricity Demand Study 2014-2018,” September. doi:10.5255/UKDA-SN-853334.

Arino de la Rubia, Eduardo, Hao Zhu, Shannon Ellis, Elin Waring, and Michael Quinn. 2017. Skimr: Skimr. https://github.com/ropenscilabs/skimr.

Champely, Stephane. 2018. Pwr: Basic Functions for Power Analysis. https://CRAN.R-project.org/package=pwr.

Dowle, M, A Srinivasan, T Short, S Lianoglou with contributions from R Saporta, and E Antonyan. 2015. Data.table: Extension of Data.frame. https://CRAN.R-project.org/package=data.table.

Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.

R Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.

Xie, Yihui. 2016. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://CRAN.R-project.org/package=knitr.