Commit 7baa595d authored by Ben Anderson

updated notes

parent caf99c71
********************************************************************* /*********************************************************************
* Extracting variables from EFS 2001/2 onwards to produce a 'harmonised' microdata time series
* mainly for use as input to a number of spatial microsimulation models to produce
* small area estimates of consumption. The variables names:
@@ -6,32 +6,13 @@
*
* UK Expenditure Food Survey (as was) now Living Costs & Food Survey available from:
* - http://discover.ukdataservice.ac.uk/series/?sn=2000028
*********************************************************************
/************************
Copyright (c) 2002-2012 University of Essex, (c) 2012-2015 University of Southampton
Author: Ben Anderson (b.anderson@soton.ac.uk, @dataknut, https://github.com/dataknut)
[Energy & Climate Change, Faculty of Engineering & Environment, University of Southampton]
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License
(http://choosealicense.com/licenses/gpl-2.0/), or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
#YMMV - http://en.wiktionary.org/wiki/YMMV
************************/
***********************
* Processes the EFS/LCFS into a set of identical files and then merges them
* NB it treats the Living Costs and Food Survey 2008 onwards as simply another EFS (which it is) * Treats the Living Costs and Food Survey 2008 onwards as simply another EFS (which it is)
* From 2006 the EFS is collected on a calendar year basis.
* This means that 2005-6_Q4 and 2006_Q1 are exactly the same cases
* This script removes 2006_Q1 to avoid duplication
* History
* 15/8/2012 moved creation of ba_quarter & income equivalisation, quartiles/deciles to per-year processing
@@ -42,24 +23,41 @@ GNU General Public License for more details.
* 29/4/2015 - changed to only produce basic file on the assumption that this will be linked back to source expenditure data for specific analyses
* 2/5/2015 - this does not really work that well as (for example) variable names went to capital letters in 2010 (why why why)
* so a mechanism has been included to allow the extraction of bespoke variables sets once this script has set all variable names to lower case
*
* Copyright (c) 2002-2012 University of Essex, (c) 2012-2015 University of Southampton
*
* Author: Ben Anderson (b.anderson@soton.ac.uk, @dataknut, https://github.com/dataknut)
* [Energy & Climate Change, Faculty of Engineering & Environment, University of Southampton]
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License
* (http://choosealicense.com/licenses/gpl-2.0/), or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* #YMMV - http://en.wiktionary.org/wiki/YMMV
**********************************************************************/
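* Illustration of the lower-casing step mentioned in the 2/5/2015 history note.
* A minimal sketch, assuming Stata 12 or later (rename group syntax). It uses the
* auto.dta example dataset shipped with Stata rather than the EFS files, and the
* upper-cased names are made up for the demonstration. It is not necessarily how
* this script performs the step.

```stata
* load a small example dataset (assumption: stands in for a yearly EFS file)
sysuse auto, clear
* mimic the post-2010 capitalised variable names
rename make MAKE
rename price Price
* set every variable name in memory to lower case
rename *, lower
* the same keep list now works whatever the original case was
describe, simple
```

* Once names are lower case, one keep list can be reused across all survey years.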
* NB - the script assumes a certain folder structure for the source EFS data like so: * run without waiting for user
* `efsd'/<year>/stata/<datafile>.dta
* and also that this folder exists:
* `efsd'/processed/
***********************
set more off
* set these to what works for you
local place = "/Users/ben/Documents/Work"
local efsd = "`place'/Data/Social Science Datatsets/Expenditure and Food Survey"
local logd = "`efsd'/log_files"
local outd = "`efsd'/processed"
************************
* NB - the script assumes a certain folder structure for the source EFS/LCFS data like so:
* `efsd'/<year>/stata/<datafile>.dta
* You may have to rename some of the downloaded & unzipped UKDA data folders to make this work
* The script also assumes that this folder exists for the final results:
* `efsd'/processed/
***********************
* NB: from 2006 the EFS is collected on a calendar year basis. local extract_years "2001-2012" // just a name for the FINAL extracted file
* This means that 2005-6_Q4 and 2006_Q1 are exactly the same cases
* This script removes 2006_Q1 later on to avoid duplication
local years "2001-2012" // just a label
* To save time you can leave out years you have already processed
* just paste the ones you want into the do_years local variable below
@@ -68,9 +66,9 @@ local do_years = "2001-2002 2002-2003 2003-2004 2004-2005 2005-2006 2006 2007 20
* set to 1 to refresh each yearly extract you listed in do_years & append the files
* set to 0 to just append previously extracted files
local do_extracts 0 local do_extracts 1
************ ************************
* Set the variables to be extracted here
* there is error checking below to make sure that they exist in each year or to skip if not
* put case in each keep var if otherwise empty
@@ -92,10 +90,9 @@ local rawhh_keepvars = "flydes*" // flights - NB see 2012 fix below in rawhh sec
* -> rawper file
local rawper_keepvars = "case"
************************
************ ************************
************
* set logging
capture log close
log using "`logd'/ONS-UK-EFS-time-series-extract-$S_DATE.smcl", replace
@@ -406,7 +403,7 @@ if `do_extracts' {
gen ba_adults = g018 if g018 > 0
* ignore hhs with no adults (how many are there?)
/*
1st adult = .67
spouse = .33
other 2nd adult = .33
3rd adult = .33
subsequent adults = .33
children aged < 14 = .2
children aged 14+ = .33
*/
* catch hh with no children
replace ba_under14 = 0 if ba_under14 == .
replace ba_over14 = 0 if ba_over14 == .
gen oecd_equivbhcwt = 0.67 if ba_adults >= 1
replace oecd_equivbhcwt = oecd_equivbhcwt + ((ba_adults-1) * 0.33) + (ba_under14 * 0.2) + (ba_over14 * 0.33)
di "Calculating equiv income (OECD) and quartiles/deciles" di "*-> Calculating equiv income (OECD) and quartiles/deciles"
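* Worked example of the equivalence weights above. A standalone sketch with three
* made-up households, meant to be run in a fresh session rather than pasted into
* this script (it clears the data in memory). The formula lines are copied from
* the code above; only the input data are invented.

```stata
clear
* three hypothetical households: adults, children under 14, children 14+
input ba_adults ba_under14 ba_over14
1 0 0
2 1 1
3 . .
end
* catch hh with no children recorded
replace ba_under14 = 0 if ba_under14 == .
replace ba_over14 = 0 if ba_over14 == .
* first adult counts 0.67, each further adult 0.33
gen oecd_equivbhcwt = 0.67 if ba_adults >= 1
replace oecd_equivbhcwt = oecd_equivbhcwt + ((ba_adults-1) * 0.33) + (ba_under14 * 0.2) + (ba_over14 * 0.33)
list
```

* By the weights in the comment, the single adult scores 0.67, the couple with one
* child under 14 and one aged 14+ scores 0.67 + 0.33 + 0.2 + 0.33 = 1.53, and the
* three-adult household scores 0.67 + 2 * 0.33 = 1.33.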
* p344, p389 & p396 changed to *p after 2006 and top coded (!)
if `y' > 2005 {
@@ -529,7 +526,7 @@ aorder
order case* survey*
save "`outd'/EFS-`years'-extract-reduced-BA.dta", replace save "`outd'/EFS-`extract_years'-extract-BA.dta", replace
di "************************************************************************"
if `do_extracts' {
......