Commit ee5a243d authored by Ben Anderson

updated readme re PV installs

parent ed17a8a5
@@ -20,21 +20,21 @@ See license file for details.
Notes (mostly to self)
----------------------
* gas kWh are weather-corrected within the 10 DNO distribution zones before delivery to DECC
* The End User License file (EULF) dataset is a sample of just over 4 million households
* EULF is a semi-random sample of the 8m records which have an Energy Performance Certificate.
* It includes only those with valid values on key variables (Property Age, Property Type, Floor Area Band and Energy Efficiency Band) and (especially) valid observations for electricity in 2012.
* Records were selected based on the frequency of household type in the dataset relative to the total dwelling stock so that uncommon property types (e.g. older detached properties) are over-represented and common types (e.g. flats where turnover is high) are under-represented. The supplied weight corrects for this for descriptive analysis.
* Implications for sample bias unclear - there may be other systematic biases not captured by the weight?
* UPRN = unique property reference = linkage mechanism across EPCs, gas/electricity data and EST data on energy efficiency installations (uses AddressBase)
* hoping to add PV etc installations soon
* PV installs added for 2015 report - see https://www.gov.uk/government/statistics/national-energy-efficiency-data-framework-need-report-summary-of-analysis-2015
* Bias caused by linkage failure is unknown, although the DECC NEED Data Framework report from 2013 suggests match rates of 94%-100% (https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/209264/Annex_B_-_Quality_Assurance.pdf)
* Both gas and electricity consumption are rounded and the rounding range ('to nearest n') increases through the distributions (see https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/315189/need_dataset_look_ups.xlsx). The reasons for this are explained in the consultation response at https://www.gov.uk/government/consultations/national-energy-efficiency-data-framework-making-data-available
* the Gcons*valid variable codes:
* G = Gas consumption invalid, greater than 50,000
* L = Gas consumption invalid, less than 100
* M = Gas consumption data is missing in source data
* 0 = Property does not have a gas connection
* V = Valid gas consumption (between 100 and 50,000 inclusive)
* NB - there are valid gas readings of '0', which presumably were between 100 and 249 before rounding (first gas 'heap' = 'nearest 500')
* the Econs*valid variable codes:
* G = Electricity consumption invalid, greater than 25,000 (DECC lookup table says 50,000)
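
The gas validity codes listed above can be sketched as a small classifier. This is only an illustration of the thresholds given in these notes, not DECC's actual procedure: the function name, the `has_gas_connection` flag and the order in which the checks are applied are all assumptions.

```python
from typing import Optional

GAS_MIN = 100      # lower bound for a valid reading (from the notes above)
GAS_MAX = 50_000   # upper bound for a valid reading (from the notes above)


def gas_valid_code(kwh: Optional[float], has_gas_connection: bool = True) -> str:
    """Return a Gcons*valid-style code for an unrounded annual gas reading.

    Hypothetical helper: the order of the checks is an assumption.
    """
    if not has_gas_connection:
        return "0"  # property does not have a gas connection
    if kwh is None:
        return "M"  # gas consumption data is missing in source data
    if kwh > GAS_MAX:
        return "G"  # invalid, greater than 50,000
    if kwh < GAS_MIN:
        return "L"  # invalid, less than 100
    return "V"      # valid, between 100 and 50,000 inclusive


print(gas_valid_code(12_000))       # V
print(gas_valid_code(60_000))       # G
print(gas_valid_code(None, False))  # 0
```

The electricity codes would presumably follow the same pattern with a different upper bound, but since the notes and the DECC lookup table disagree on that bound (25,000 vs 50,000) it is left out here.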
@@ -48,4 +48,3 @@ Notes to DECC (!)
* can the consumption rounding be constant through the distributions?
* check coding of Gcons ref 0 values for 'valid' cases?
* distinguish between electric & 'other' heating in 'main heating fuel'?
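
To illustrate the query about 0 values in 'valid' cases: if the lowest gas band is rounded to the nearest 500, a reading that passes the validity check (100 or more) but is below 250 ends up as 0. The sketch below assumes round-half-up and a step of 500 for the first band; the real band edges are in the DECC lookup file and are not reproduced here, and `round_to_nearest` is a hypothetical helper.

```python
def round_to_nearest(kwh: float, step: int) -> int:
    """Round kwh to the nearest multiple of step (half rounds up; an assumption)."""
    return int((kwh + step / 2) // step) * step


FIRST_BAND_STEP = 500  # 'nearest 500' for the first gas band, per the notes

for reading in (100, 249, 250, 740):
    print(reading, "->", round_to_nearest(reading, FIRST_BAND_STEP))
# 100 -> 0
# 249 -> 0
# 250 -> 500
# 740 -> 500
```

Under these assumptions a valid reading of 100 to 249 is published as 0, which would explain 'V' cases with zero consumption.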