Commit 0f4889dc authored by Ben Anderson

updated readme following DECC NEED user event

parent 613fccc9
......@@ -3,10 +3,11 @@ DECC-git NEED
Extract & analyse data from the anonymised & released versions of DECC's NEED dataset.
Original 'End User License' version of the data available from: UK DATA ARCHIVE: Study Number 7518 - National Energy Efficiency Data-Framework, 2014
Original 'End User License' version of the data:
* available from: UK DATA ARCHIVE: Study Number 7518 - National Energy Efficiency Data-Framework, 2014
http://discover.ukdataservice.ac.uk/catalogue/?sn=7518
For full detailed documentation see https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/332169/need_anonymised_dataset_accompanying_documentation.pdf
* Detailed documentation: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/332169/need_anonymised_dataset_accompanying_documentation.pdf
* Full coding details of variables at: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/315189/need_dataset_look_ups.xlsx
Notes (mostly to self):
* gas kwh are weather corrected within the 10 DNO distribution zones before delivery to DECC
......@@ -15,15 +16,21 @@ Notes (mostly to self):
* It includes only those with valid values on key variables (Property Age, Property Type, Floor Area Band and Energy Efficiency Band) and (especially) valid observations for electricity in 2012.
* Records were selected based on the frequency of household type in the dataset relative to the total dwelling stock so that uncommon property types (e.g. older detached properties) are over-represented and common types (e.g. flats where turnover is high) are under-represented. The supplied weight corrects for this for descriptive analysis.
* Implications for sample bias unclear - there may be other systematic biases not captured by the weight?
* UPRN = unique property reference = linkage mechanism (uses AddressBase)
* Bias caused by linkage failure is unknown, although the DECC NEED Data Framework report from 2011 suggests match rates of 94%-100% (https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/209264/Annex_B_-_Quality_Assurance.pdf)
* UPRN = unique property reference = linkage mechanism across EPCs, gas/electricity data and EST data on energy efficiency installations (uses AddressBase)
* hoping to add PV etc installations soon
* Bias caused by linkage failure is unknown, although the DECC NEED Data Framework report from 2013 suggests match rates of 94%-100% (https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/209264/Annex_B_-_Quality_Assurance.pdf)
* Both gas and electricity consumption are rounded and the rounding range ('to nearest n') increases through the distributions (see https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/315189/need_dataset_look_ups.xlsx)
* the E/Gcons*valid variable codes:
* 0 = off gas/elec
* V = valid reading (gas range 100 - 50,000; electricity range = 100 - 25,000)
* L = Gas consumption invalid, less than 100
* M = Gas consumption data is missing in source data
* G = Gas consumption invalid, greater than 50,000
* NB - there are valid gas readings of '0' which presumably were > 100 but < 249 (first gas 'heap' = 'nearest 500')
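A minimal sketch of how the validity codes listed above might be used to avoid mixing invalid or missing readings into a summary statistic. It uses only the Python standard library. The column names `Gcons2012` and `Gcons2012Valid` and the example rows are assumptions for illustration, not taken from the released data; check the look-up spreadsheet for the actual variable names.

```python
import math

# Per the notes above, only "V" marks a usable reading;
# 0 (off gas/elec), L, M and G are all treated as unusable here.
VALID_FLAG = "V"


def clean_consumption(value, flag):
    """Return the kWh value as a float if flagged valid, otherwise NaN."""
    if str(flag).strip().upper() == VALID_FLAG:
        return float(value)
    return math.nan


def mean_valid(rows, value_key, flag_key):
    """Unweighted mean of valid readings only."""
    cleaned = [clean_consumption(r[value_key], r[flag_key]) for r in rows]
    kept = [v for v in cleaned if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


# Hypothetical rows; the column names are assumed, not confirmed.
rows = [
    {"Gcons2012": 12000, "Gcons2012Valid": "V"},
    {"Gcons2012": 8000, "Gcons2012Valid": "V"},
    {"Gcons2012": 60000, "Gcons2012Valid": "G"},  # invalid, > 50,000
    {"Gcons2012": 50, "Gcons2012Valid": "L"},     # invalid, < 100
    {"Gcons2012": None, "Gcons2012Valid": "M"},   # missing in source
    {"Gcons2012": None, "Gcons2012Valid": "0"},   # off gas
]

print(mean_valid(rows, "Gcons2012", "Gcons2012Valid"))
```

This sketch ignores the supplied weight, so it illustrates the recoding step only and is not a population estimate.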
Issues:
* the E/Gcons*valid variable has some undefined labels (L,M,G):
* 0 = off gas/elec (documented)
* V = valid reading (documented: gas range 0 - 50,000; electricity range = 100 - 25,000)
* L = large? (> 50k or 25k depending?)
* M = missing?
* G = ?
* ideally DECC should set missing to -99 to aid re-coding and avoid unpleasant surprises in naive analysis!
Notes to DECC (!)
* ideally could set missing to -99 to aid re-coding and avoid unpleasant surprises in naive analysis?
* can the consumption rounding be constant through the distributions?
* check coding of Gcons ref 0 values for 'valid' cases?
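The last check above could be sketched roughly as follows: list the records that are flagged valid but carry a gas consumption of 0, so they can be inspected before analysis. The column names `Gcons2012` and `Gcons2012Valid` and the example rows are hypothetical.

```python
def valid_zero_rows(rows, value_key, flag_key):
    """Return rows flagged valid ('V') whose recorded consumption is exactly 0."""
    return [
        r for r in rows
        if str(r[flag_key]).strip().upper() == "V" and r[value_key] == 0
    ]


# Hypothetical rows; the column names are assumed, not confirmed.
sample = [
    {"Gcons2012": 0, "Gcons2012Valid": "V"},      # valid flag but zero kWh
    {"Gcons2012": 15000, "Gcons2012Valid": "V"},  # ordinary valid reading
    {"Gcons2012": 0, "Gcons2012Valid": "M"},      # zero, but flagged missing
]

suspect = valid_zero_rows(sample, "Gcons2012", "Gcons2012Valid")
print(len(suspect), "valid-flagged rows with zero gas consumption")
```

Whether such rows should be kept, dropped or recoded is a judgement call that depends on how the rounding was applied.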
YMMV
\ No newline at end of file