From ee5a243da7babcc8e3fc184193718cc90ab8d62a Mon Sep 17 00:00:00 2001
From: Ben Anderson <b.anderson@soton.ac.uk>
Date: Mon, 29 Jun 2015 12:45:52 +0100
Subject: [PATCH] updated readme re PV installs

---
 NEED/README.md | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/NEED/README.md b/NEED/README.md
index 0ea5bbf..0f1e2fd 100644
--- a/NEED/README.md
+++ b/NEED/README.md
@@ -20,21 +20,21 @@ See license file for details.
 Notes (mostly to self)
 ----------------------
 * gas kwh are weather corrected within the 10 DNO distribution zones before delivery to DECC
-* The End User License file (EULF) dataset is a sample of just over 4 million households
-* EULF is a semi-random sample of the 8m records which have an Energy Performance Certificate.
- * It includes only those with valid values on key variables (Property Age, Property Type, Floor Area Band and Energy Efficiency Band) and (especially) valid observations for electricity in 2012.
- * Records were selected based on the frequency of household type in the dataset relative to the total dwelling stock so that uncommon property types (e.g. older detached properties) are over-represented and common types (e.g. flats where turnover is high) are under-represented. The supplied weight corrects for this for descriptive analysis.
+* The End User License file (EULF) dataset is a sample of just over 4 million households
+* EULF is a semi-random sample of the 8m records which have an Energy Performance Certificate.
+ * It includes only those with valid values on key variables (Property Age, Property Type, Floor Area Band and Energy Efficiency Band) and (especially) valid observations for electricity in 2012.
+ * Records were selected based on the frequency of household type in the dataset relative to the total dwelling stock so that uncommon property types (e.g. older detached properties) are over-represented and common types (e.g. flats where turnover is high) are under-represented. The supplied weight corrects for this for descriptive analysis.
 * Implications for sample bias unclear - there may be other systematic biases not captured by the weight?
 * UPRN = unique property reference = linkage mechanism across EPCs, gas/electricity data and EST data on energy efficiency installations (uses AddressBase)
- * hoping to add PV etc installations soon
+ * PV installs added for 2015 report - see https://www.gov.uk/government/statistics/national-energy-efficiency-data-framework-need-report-summary-of-analysis-2015
 * Bias caused by linkage failure is unknown although the DECC NEED Data Framework report from 2013 suggest match rates of 94%-100% (https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/209264/Annex_B_-_Quality_Assurance.pdf)
-* Both gas and electricity consumption are rounded and the rounding range ('to nearest n') increases through the distributions (see https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/315189/need_dataset_look_ups.xlsx). The reasons for this are explained in the consultation response at https://www.gov.uk/government/consultations/national-energy-efficiency-data-framework-making-data-available
+* Both gas and electricity consumption are rounded and the rounding range ('to nearest n') increases through the distributions (see https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/315189/need_dataset_look_ups.xlsx). The reasons for this are explained in the consultation response at https://www.gov.uk/government/consultations/national-energy-efficiency-data-framework-making-data-available
 * the Gcons*valid variable codes:
 * G = Gas consumption invalid, greater than 50,000
 * L = Gas consumption invalid, less than 100
 * M = Gas consumption data is missing in source data
 * 0 = Property does not have a gas connection
- * V = Valid gas consumption (between 100 and 50,000 inclusive)
+ * V = Valid gas consumption (between 100 and 50,000 inclusive)
 * NB - there are valid gas readings of '0' which presumably were > 100 but < 249 (first gas 'heap' = 'nearest 500')
 * the Econs*valid variable codes:
 * G Electricity consumption invalid, greater than 25,000 (DECC lookup table says 50,000)
@@ -48,4 +48,3 @@ Notes to DECC (!)
 * can the consumption rounding be constant through the distributions?
 * check coding of Gcons ref 0 values for 'valid' cases?
 * distinguish between electric & 'other' heating in 'main heating fuel'?
-
-- 
GitLab