Commit 36611cf4 authored 9 years ago by Ben Anderson
added valid data checks
parent ee5a243d
NEED/analyse-NEED-EULF-2014-electricity-consumption.do
87 additions, 33 deletions
  *******************************************
  * Script to:
  * - analyse DECC's EULF 2014 NEED data to examine distributions etc
  * Original data available from: UK DATA ARCHIVE: Study Number 7518 - National Energy Efficiency Data-Framework, 2014
  * http://discover.ukdataservice.ac.uk/catalogue/?sn=7518
...
@@ -9,16 +9,16 @@
  * The script requires the following to have been run first:
  * https://github.com/dataknut/DECC-data/blob/master/NEED/process-NEED-EULF-2014.do
  /*
  Copyright (C) 2014 University of Southampton
  Author: Ben Anderson (b.anderson@soton.ac.uk, @dataknut, https://github.com/dataknut)
  [Energy & Climate Change, Faculty of Engineering & Environment, University of Southampton]
  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 2 of the License
  (http://choosealicense.com/licenses/gpl-2.0/), or (at your option) any later version.
  This program is distributed in the hope that it will be useful,
...
@@ -39,8 +39,8 @@ set more off
  * written for Mac OSX - remember to change filesystem delimiter for other platforms
  global home "/Users/ben/Documents"
- local dpath "$home/Work/Data/Social Science Datatsets/DECC/NEED/End User Licence File 2014/processed"
- local rpath "$home/Work/Papers and Conferences/RSS-2015/results"
+ global dpath "$home/Work/Data/Social Science Datatsets/DECC/NEED/End User Licence File 2014/processed"
+ global rpath "$home/Work/Papers and Conferences/RSS-2015/results"
  local version "v1"
  * set sample
...
@@ -59,28 +59,80 @@ lab def GconsValidr 1 "(V)alid" 2 "(O)ff-gas" 3 "(L)Gas < 100" 4 "(G) Gas > 50,0
  * NB DECC look up table says max elec = 50,000
  lab def EconsValidr 1 "(V)alid" 2 "not set" 3 "(L)Elec < 100" 4 "(G) Elec > 25,000" 5 "M(issing in source)"
  * also be aware that the consumption is rounded in buckets:
  /*
  GconsYEAR . Missing, off gas or invalid consumption 100 - 7,999 Gas consumption kWh rounded to nearest 500 kWh 8,000 - 15,999 Gas consumption kWh rounded to nearest 100 kWh 16,000 - 24,999 Gas consumption kWh rounded to nearest 500 kWh 25,000 - 34,999 Gas consumption kWh rounded to nearest 1,000 kWh 35,000 - 50,000 Gas consumption kWh rounded to nearest 5,000 kWh
  EconsYEAR . Missing or invalid consumption 100 - 9,999 Electricity consumption kWh rounded to nearest 50 kWh 10,000 - 11,999 Electricity consumption kWh rounded to nearest 100 kWh 12,000 - 14,999 Electricity consumption kWh rounded to nearest 500 kWh 15,000 - 19,999 Electricity consumption kWh rounded to nearest 1,000 kWh 20,000 - 25,000 Electricity consumption kWh rounded to nearest 5,000 kWh
  set more off
  */
- log using "`rpath'/analyse-NEED-EULF-2014-electricity-consumption-`version'.smcl", replace
+ log using "$rpath/analyse-NEED-EULF-2014-electricity-consumption-`version'.smcl", replace
  if `do_desc' {
      di "************************"
      di "* Using `sample'% sample"
-     use "`dpath'/need_eul_may2014_consumptionfile_long_`sample'pc.dta", clear
+     * load the yearly consumption data
+     use "$dpath/need_eul_may2014_consumptionfile_long_`sample'pc.dta", clear
+     * merge in the xwave file (fixed data - we assume!)
+     merge m:1 HH_ID using "$dpath/need_eul_may2014_xwavefile_100pc.dta"
      * set as panel in case it wasn't
-     xtset HH_ID year
+     * fix format of year so xtset doesn't break
+     format year %ty
+     xtset HH_ID year, delta(1 year)
      * examine panel status
      xtdescribe
* distributions for valid obs
* set up
* Gcons
      local vars "Econs Gcons"
+     foreach v of local vars {
+         di "***************"
+         di "* Testing `v' for `sample'% sample"
+         di "* check the panel transitions for each valid"
+         gen `v'Validr = 1 if `v'Valid == "V"
+         replace `v'Validr = 2 if `v'Valid == "O" // off gas (from EPC) only relevant for gas
+         replace `v'Validr = 3 if `v'Valid == "L"
+         replace `v'Validr = 4 if `v'Valid == "G"
+         replace `v'Validr = 5 if `v'Valid == "M"
+         lab var `v'Validr "Recoded `v'Valid"
+         lab val `v'Validr `v'Validr
+         * set up consumption deciles
+         levelsof (year), local(levels)
+         foreach l of local levels {
+             di "* Calculating consumption deciles for `v' for `l'"
+             * creates missing for other years have to do this as egen does not allow by
+             egen `v'_dec_`l' = cut(`v') if year == `l', group(10)
+         }
+         * now combine them - set missing option otherwise it counts a row where all are missing as 0
+         egen `v'_dec = rowtotal(`v'_dec_*), missing
+         * remove temporary ones
+         drop `v'_dec_*
+         * check
+         tab `v'_dec year
+     }
+     stop
+     * flag dwellings which are off gas for electricity
+     * NB - in this dataset we don't know if they use electricity as main heat (could be oil)
+     gen ba_off_gas = 0
+     replace ba_off_gas = 1 if GconsValidr == 2
+     lab def ba_off_gas 0 "On gas (GconsValid!=O)" 1 "Off gas (GconsValid=O, from EPC)"
+     lab val ba_off_gas ba_off_gas
+     * check
+     tabstat Gcons Econs, by(ba_off_gas)
+     di "* MAIN_HEAT_FUEL - Description of main heating fuel (gas or other). EPC - but NB could be 'other' but still be 'on gas'"
+     tab ba_off_gas MAIN_HEAT_FUEL, mi // suggests EPC says 'off gas' (via GconsValid) but main heat fuel still says 'gas'?
+     table year MAIN_HEAT_FUEL, by(ba_off_gas)
+     * roughly constant rate throughout years
+     table year MAIN_HEAT_FUEL, by(ba_off_gas) c(mean Gcons n Gcons)
+     * but off gas have no gas readings as you'd expect (DECC filter)
      foreach v of local vars {
          di "***************"
          di "* Testing `v' for `sample'% sample"
...
@@ -91,30 +143,32 @@ if `do_desc' {
          * 100 < gcons < 250 so included but rounded to nearest 500 = 0
          * elec always rounded to nearest 50 so min should always be 100
          tabstat `v', by(`v'Valid) s(n mean semean min max)
          * by year
          di "* check `v' for 0s (`s'% sample)"
          table `v' year if `v' < 1000
          table `v'Valid year, c(count `v' min `v' mean `v' max `v')
          if `do_graphs' {
-             histogram `v' if `v'Valid == "V", by(year) name(histo_`s'pc_`v')
-             graph export "`rpath'/NEED-EULF-2014-`s'pc-histo_`v'_by_year_valid.png", replace
-             graph box `v' if `v'Valid == "V", over(year) name(box_`s'pc_`v')
-             graph export "`rpath'/NEED-EULF-2014-`s'pc-box_`v'_over_year_valid.png", replace
+             di "* Running graphs - do not keep in memory, just save out"
+             di "* Running graphs: histo"
+             histogram `v' if `v'Valid == "V", by(year)
+             graph export "$rpath/graphs/NEED-EULF-2014-`s'pc-histo_`v'_by_year_valid.png", replace
+             di "* Running graphs: boxes"
+             graph box `v' if `v'Valid == "V", over(year)
+             graph export "$rpath/graphs/NEED-EULF-2014-`s'pc-box_`v'_over_year_valid.png", replace
+             graph box `v' if `v'Valid == "V", over(year) by(FLOOR_AREA_BAND)
+             graph export "$rpath/graphs/NEED-EULF-2014-`s'pc-box_`v'_yr_floor_valid.png", replace
+             graph box `v' if `v'Valid == "V", over(year) by(EE_BAND)
+             graph export "$rpath/graphs/NEED-EULF-2014-`s'pc-box_`v'_yr_ee_valid.png", replace
          }
-         di "* check the panel transitions for each valid"
-         gen `v'Validr = 1 if `v'Valid == "V"
-         replace `v'Validr = 2 if `v'Valid == "O"
-         replace `v'Validr = 3 if `v'Valid == "L"
-         replace `v'Validr = 4 if `v'Valid == "G"
-         replace `v'Validr = 5 if `v'Valid == "M"
-         lab var `v'Validr "Recoded `v'Valid"
-         lab val `v'Validr `v'Validr
-         * di "Check transitions (`v'Validr)"
+         di "* Check transitions (`v'Validr)"
          xttrans `v'Validr, freq
      }
  }
...
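The rounding bands quoted in the script's comment block suggest one further validity check that this commit does not include: whether each electricity reading sits on the rounding step published for its band. A minimal sketch in Stata, assuming the long-format consumption file is already in memory with the `Econs` and `year` variables the script uses. `Econs_step` and `Econs_round_ok` are hypothetical names introduced here for illustration, and the sketch assumes the bands apply to the published (already rounded) values.

```stata
* Hypothetical check, not part of the commit:
* assign the rounding step the DECC look-up table gives for each Econs band
gen int Econs_step = .
replace Econs_step = 50   if inrange(Econs, 100, 9999)
replace Econs_step = 100  if inrange(Econs, 10000, 11999)
replace Econs_step = 500  if inrange(Econs, 12000, 14999)
replace Econs_step = 1000 if inrange(Econs, 15000, 19999)
replace Econs_step = 5000 if inrange(Econs, 20000, 25000)

* 1 = value is a multiple of its band's step, 0 = it is not,
* missing = Econs is missing or outside the published bands
gen byte Econs_round_ok = mod(Econs, Econs_step) == 0 if !missing(Econs_step)

* tabulate by year, keeping missing values visible
tab Econs_round_ok year, mi
```

Readings flagged 0 would be worth cross-tabulating against `EconsValid` before drawing conclusions, since the look-up table may describe rounding that was applied before the validity flags were set.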