diff --git a/MTUS-W6-adult-survey-data-processing.Rmd b/MTUS-W6-adult-survey-data-processing.Rmd
index 3aef45bc2abfd7633108f586f419d3e4cf582291..d3e1749f3e5f19c31023bbda52390d056fc3731b 100644
--- a/MTUS-W6-adult-survey-data-processing.Rmd
+++ b/MTUS-W6-adult-survey-data-processing.Rmd
@@ -1,7 +1,7 @@
 ---
 title: "MTUS World 6 Survey Data Processing"
-author: "Ben Anderson"
-date: "(b.anderson@soton.ac.uk, @dataknut)"
+author: "Ben Anderson (b.anderson@soton.ac.uk/@dataknut)"
+date: 'Last run at: `r Sys.time()`'
 output:
   pdf_document:
     toc: yes
@@ -15,6 +15,7 @@ output:
   word_document:
     toc: yes
     toc_depth: '3'
+bibliography: ~/bibliography.bib
 ---
 
 ```{r setupKnitr, include=FALSE}
@@ -26,20 +27,29 @@ knitr::opts_chunk$set(fig_caption = TRUE)
 knitr::opts_chunk$set(tidy = TRUE)
 ```
 
-# About this document
+# Introduction
+Purpose:
+
+* To process the MTUS World 6 Survey data
+
+A processed & gzipped .csv file containing data for just the UK is saved.
+
+Data:
 
-Document last refreshed: `r Sys.time()`
+* MTUS [World 6](http://www.timeuse.org/mtus)
 
-This document was created using [knitr](https://cran.r-project.org/package=knitr) in [RStudio](http://www.rstudio.com). Knitr supports the embedding of R statistical analysis code within markdown text documents allowing them to be updated and re-run. Things to note:
+Things that are NOT fixed here:
 
-* Depending on the options set Knitr will display warnings (but not errors) from R. The warnings may or may not be important to the interpretation of the results;
-* Knitr is very clever but it does not always support pretty tables.
+ * diary day of the week which is not correct in 1984 - this is fixed in the episodes data processing
 
 This work was funded by RCUK through the End User Energy Demand Centres Programme via the "DEMAND: Dynamics of Energy, Mobility and Demand" Centre:
 
 * http://www.demand.ac.uk
 * http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K011723/1
 
+Code:
+ * https://github.com/dataknut/MTUS
+
 `License:`
 
 `The R code embedded in this document is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License (http://choosealicense.com/licenses/gpl-2.0/), or (at your option) any later version.`
@@ -48,9 +58,18 @@ This work was funded by RCUK through the End User Energy Demand Centres Programm
 
 >YMMV - http://en.wiktionary.org/wiki/YMMV
 
-***
+# Set up R environment
 
-````{r houseKeeping}
+Key packages used:
+
+ * base R - for the basics [@baseR]
+ * foreign - for loading SPSS data [@foreign]
+ * data.table - for fast (big) data handling [@data.table]
+ * knitr - to create this document [@knitr]
+ * dplyr & dtplyr - for data manipulation [@dplyr][@dtplyr]
+ * car - regression diagnostics [@fox_car]
+
+```{r houseKeeping}
 # Clear out all old objects etc ----
 # to avoid confusion
 rm(list = ls())
@@ -59,15 +78,13 @@ rm(list = ls())
 starttime <- Sys.time()
 
 # Load required packages ----
-packs <- c("ggplot2", # slick & easy graphs
-           "foreign", # loading SPSS/STATA
+packs <- c("foreign", # loading SPSS/STATA
+           "dplyr", # data manipulation
            "data.table", # fast data manipulation
-           "gmodels", # for table proportions
-           "knitr", # for kable
+           "dtplyr", # data table & dplyr code
            "car", # regression diagnostics
-           "dplyr" # data transformation
+           "knitr" # for kable
 )
-# xtable - not useful: outputs latex only ----
 
 # do this to install them if needed
 # install.packages(x)
@@ -84,22 +101,8 @@ mtusProcPath <- "~/Data/MTUS/World_6/processed/" # where to put the processed .c
 
 surveyFile <- "MTUS-adult-aggregate.sav"
 sfile <- paste0(mtusPath, surveyFile)
-````
-
-# Introduction
-Purpose:
-
-* To process the MTUS World 6 Survey data (`r surveyFile`)
-
-A processed & gzipped .csv file containing data for just the UK is saved.
-
-Data:
-
-* MTUS [World 6](http://www.timeuse.org/mtus)
-
-Things that are NOT fixed here:
+```
 
- * diary day of the week which is not correct in 1984 - this is fixed in the episodes data processing
 
 # Load original survey data
 Loading `r sfile`.
@@ -129,11 +132,11 @@ We now delete the non-UK data leaving us with `r format(nrow(MTUSW6UKsurvey_DT),
 
 ```{r processSurveyData}
 print("-> Create uniq id for diaries (for matching) and persons")
-# Create unique ids ----
+# Create unique ids ----
 
 # diarypid
-MTUSW6UKsurvey_DT$ba_diarypid <-
-  group_indices(MTUSW6UKsurvey_DT, survey,
+MTUSW6UKsurvey_DT$ba_diarypid <- group_indices(MTUSW6UKsurvey_DT,
+                survey,
                 swave,
                 msamp,
                 hldid,
@@ -142,8 +145,7 @@ MTUSW6UKsurvey_DT$ba_diarypid <-
 )
 
 # pid
-MTUSW6UKsurvey_DT$ba_pid <-
-  group_indices(MTUSW6UKsurvey_DT, survey,
+MTUSW6UKsurvey_DT$ba_pid <- group_indices(MTUSW6UKsurvey_DT, survey,
                 swave,
                 msamp,
                 hldid,
@@ -500,5 +502,4 @@ On the basis of these results we seem justified in assuming that we can pool 198
 
 __Meta:__ Analysis completed in: `r round(Sys.time() - starttime, 3)` seconds using [knitr](https://cran.r-project.org/package=knitr) in [RStudio](http://www.rstudio.com).
 
-***
-__Footnotes:__
+# References
diff --git a/MTUS-W6-adult-survey-data-processing.pdf b/MTUS-W6-adult-survey-data-processing.pdf
index bc47c5568d36d8940f01099ed9d9502e7b3c8ad3..d5385dc47d6642641a312e5e650a8a3d9ffbbd1d 100644
Binary files a/MTUS-W6-adult-survey-data-processing.pdf and b/MTUS-W6-adult-survey-data-processing.pdf differ