diff --git a/NEED/README.md b/NEED/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1072ebe5f9fb713d303504eb1985110604dd4fca
--- /dev/null
+++ b/NEED/README.md
@@ -0,0 +1,14 @@
+DECC-git NEED
+============
+
+Extract & analyse data from the public versions of DECC's NEED dataset
+
+Original data available from: UK DATA ARCHIVE: Study Number 7518 - National Energy Efficiency Data-Framework, 2014
+http://discover.ukdataservice.ac.uk/catalogue/?sn=7518
+
+* Notes:
+* This dataset is a sample of just over 4 million households that have had an Energy Performance Certificate (EPC), drawn from the full NEED 'all dwellings' dataset
+* Is this everyone who has had an EPC, or a random sample of those who have had an EPC?
+* Sample bias is unknown - which kinds of dwellings have an EPC?
+* Gcons<year>valid variable has undefined labels: G, L, M = ? Presumably 0 = off gas & V = valid?
+* Ideally DECC should set missing values to -99 to aid re-coding and avoid unpleasant surprises in naive analysis!
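The last two notes suggest a defensive re-coding step before any analysis. The following is a minimal sketch of that idea, not the repository's actual code: it assumes the README's guess that `V` means a valid reading, treats every other flag (`G`, `L`, `M`, `0`) as missing because their meaning is undocumented, and uses `-99` as the sentinel. The sample rows, the `HH_ID` column, the exact column casing and the `recode_gas` helper are all hypothetical.

```python
import csv
import io

# Hypothetical extract: column names follow the README's Gcons<year>valid
# pattern, but the rows and values are made up for illustration.
SAMPLE = """HH_ID,Gcons2012,Gcons2012valid
1,15000,V
2,0,0
3,250,L
4,99000,G
5,,M
"""

MISSING = -99  # sentinel suggested in the README notes


def recode_gas(row, year):
    """Return gas consumption for `year`, or MISSING unless the flag is 'V'."""
    flag = row[f"Gcons{year}valid"].strip()
    value = row[f"Gcons{year}"].strip()
    # Assumption: only 'V' marks a usable reading. G, L, M and 0 are
    # treated as missing because their labels are undefined.
    if flag != "V" or value == "":
        return MISSING
    return int(value)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
recoded = [recode_gas(r, 2012) for r in rows]

# Exclude the sentinel explicitly so it cannot leak into summary statistics.
valid = [v for v in recoded if v != MISSING]
mean_valid = sum(valid) / len(valid) if valid else None
```

Keeping the sentinel far outside the plausible range means a forgotten filter shows up as an obviously wrong mean rather than a subtly biased one, which is the "unpleasant surprise" the note warns about.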