Commit ae94182b authored by Ben Anderson

added code to compare models

parent bd8a4743
Showing
with 8169 additions and 0 deletions
---
title: "Testing SAVE SDRC v2.1 Data"
subtitle: ""
author: "Ben Anderson (University of Otago)"
date: 'Last run at: `r Sys.time()`'
output:
  bookdown::html_document2:
    code_folding: hide
    fig_caption: yes
    number_sections: yes
    self_contained: no
    toc: yes
    toc_depth: 3
    toc_float: yes
bibliography: '`r path.expand("~/bibliography.bib")`'
---
```{r setup, include=FALSE}
# change default
knitr::opts_chunk$set(echo = TRUE)
# Log compile time:
startTime <- proc.time()
library(spatialec)
myPackages <- c(
"data.table", # data munching
"foreign", # loading STATA
"here", # easy path management - https://speakerdeck.com/jennybc/zen-and-the-art-of-workflow-maintenance?slide=49
"lubridate" # date time functions
)
spatialec::loadLibraries(myPackages)
projRoot <- here::here()
iPath <- "/Users/ben/University of Southampton/SAVE - Documents/WP2-Customer-Model/SDRC-v2.1/sim/data/input/"
oPath <- "/Users/ben/University of Southampton/SAVE - Documents/WP2-Customer-Model/SDRC-v2.1/sim/data/output/"
tuPath <- "/Users/ben/Data/UK_TU_2000/processed/"
```
# Report Purpose
Test SAVE SDRC v2.1 data (input & output)
# Input
## ONS TU 2000 data files
Check diary:
```{r checkTUdiary}
onsTU2000f <- paste0(tuPath,"diary_data_8_long_v1.0.dta")
df <- foreign::read.dta(onsTU2000f)
dt <- data.table::as.data.table(df)
message("N individuals: ", uniqueN(dt$serial))
message("Months covered: ")
dt[, month := lubridate::month(s_datetime)]
dt[, .(nRespondents = uniqueN(serial)), keyby = .(month)]
```
Check constraints file:
```{r checkTUcons}
onsTU2000f <- paste0(tuPath,"ONS-2000-TU-simfile_hh_constraints_v3.2.dta")
df <- foreign::read.dta(onsTU2000f)
dt <- data.table::as.data.table(df)
print(table(dt$c_nearners, dt$region, useNA = "always"))
message("N households: ", uniqueN(dt$hhidn))
```
## Sim input files
```{r simInput}
ifiles <- c("ONS-TU-wtdh_ug-2000-v3m1-siminput.txt",
"ONS_TU_wtdh_ug_2000_weekdays_winter_v3m1_siminput.txt",
"ONS_TU_wtdh_ug_2000_weekends_winter_v3m1_siminput.txt",
"ONS_TU_wtdh_ug_2000_weekdays_summer_v3m1_siminput.txt",
"ONS_TU_wtdh_ug_2000_weekends_summer_v3m1_siminput.txt")
for(f in ifiles){
message("Testing: ", f)
dt <- data.table::fread(paste0(iPath, f))
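# bare expressions are not auto-printed inside a for loop, so report them explicitly
message("Column names: ", paste(names(dt), collapse = ", "))
message("N unique hhid_numeric: ", uniqueN(dt$hhid_numeric))
message("N unique regions: ", uniqueN(dt$region))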
message("Regional totals: ", f)
t <- addmargins(table(dt$wmm_const_gender,dt$region))
print(t)
}
```
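Note that R only auto-prints the value of a bare expression at top level; inside a `for` loop, calls such as `names(dt)` or `uniqueN(dt$region)` are evaluated and then discarded. A minimal sketch of the difference, using a hypothetical list of toy tables (`toy`, `fileA`, `fileB` are made up here) rather than the real sim input files:

```r
library(data.table)

# Hypothetical toy tables standing in for the sim input files
toy <- list(
  fileA = data.table(region = c(1, 1, 2)),
  fileB = data.table(region = c(3, 4, 5))
)

# Bare expression inside the loop: evaluated, but nothing is shown
silent <- capture.output(
  for (nm in names(toy)) uniqueN(toy[[nm]]$region)
)

# Explicit print(): one line of output per table
shown <- capture.output(
  for (nm in names(toy)) print(uniqueN(toy[[nm]]$region))
)

length(silent)  # number of output lines captured from the bare expression
shown           # output lines captured from the print() version
```

Wrapping the call in `print()` or `message()` is the usual way to get per-file output from a loop in a knitted report.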
# Output
Just survey data as output:
```{r simOutput}
ofiles <- c("ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_simoutput.txt")
for(f in ofiles){
message("Testing: ", f)
dt <- data.table::fread(paste0(oPath, f))
message("N zones: ", uniqueN(dt$V1)) # <- actually zonecode
message("Mean n hhs: ", mean(dt$Num_hh))
message("Min n hhs: ", min(dt$Num_hh))
message("Max n hhs: ", max(dt$Num_hh))
}
```
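The `V1` column used above arises because `fread()` assigns a default name to the first column when the header row has one fewer name than the data has fields. A minimal sketch of renaming it, assuming (as the code comment suggests) that the unnamed column is the zone code; the toy text, its values and the `other` column are invented for illustration:

```r
library(data.table)

# Toy file body: the header has 2 names, each data row has 3 fields
txt <- paste(
  "Num_hh,other",
  "Z001,389.88,1",
  "Z002,712.62,2",
  "Z003,1044.51,3",
  sep = "\n"
)

# fread() is expected to warn about the short header and label the
# unnamed first column V1; the warning is suppressed in this sketch only
dt <- suppressWarnings(fread(text = txt))

# Give the index column a meaningful name if the default was added
if ("V1" %in% names(dt)) {
  setnames(dt, "V1", "zonecode")
}

uniqueN(dt$zonecode)  # number of distinct zones in the toy data
```

Renaming once after loading keeps later code readable, although fixing the command that wrote the file, as the `fread()` warning suggests, would remove the need for it.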
With TU data results:
```{r simOutputTU}
# with TU data & weights
ofiles <- c("ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_winter_simoutput-simoutput-v3m1.csv.gz")
for(f in ofiles){
message("Testing: ", f)
dt <- data.table::fread(paste0(oPath, f))
message("N zones: ", uniqueN(dt$zonecode))
message("N regions:")
print(table(dt$region))
message("N households: ", uniqueN(dt$hh_id))
message("Mean weight: ", mean(dt$weight))
message("Min weight: ", min(dt$weight))
message("Max weight: ", max(dt$weight))
}
```
Weights files:
```{r simOutputWts}
# weights
wfiles <- c("ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_winter_simoutput_weights.dta",
"ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_summer_simoutput_weights.dta")
for(f in wfiles){
message("Testing: ", f)
wf <- paste0(oPath,f)
df <- foreign::read.dta(wf)
dt <- data.table::as.data.table(df)
message("N households: ", uniqueN(dt$hh_id))
message("Mean weight: ", mean(dt$weight))
message("Min weight: ", min(dt$weight))
message("Max weight: ", max(dt$weight))
}
```
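The household count and weight summaries are repeated in several chunks above. One possible way to factor them out is sketched below; `summariseWeights` is a hypothetical helper, not part of `spatialec` or `data.table`, and it assumes the `hh_id` and `weight` column names used in the chunks above. The toy values are illustrative only.

```r
library(data.table)

# Hypothetical helper: one-row summary of a weights table.
# Assumes columns named hh_id and weight, as in the chunks above.
summariseWeights <- function(dt) {
  dt[, .(
    nHouseholds = uniqueN(hh_id),
    meanWeight  = mean(weight),
    minWeight   = min(weight),
    maxWeight   = max(weight)
  )]
}

# Toy data for illustration only (not SAVE values)
toy <- data.table(
  hh_id  = c(1, 1, 2, 3),
  weight = c(0.5, 1.5, 2, 4)
)

res <- summariseWeights(toy)
print(res)
```

Returning a one-row `data.table` rather than calling `message()` makes it straightforward to stack the results for several files with `rbindlist()` and compare them side by side.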
# References
\ No newline at end of file
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<meta name="author" content="Ben Anderson (University of Otago)" />
<title>Testing SAVE SDRC v2.1 Data</title>
<script src="testSAVEData_files/jquery-1.11.3/jquery.min.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="testSAVEData_files/bootstrap-3.3.5/css/bootstrap.min.css" rel="stylesheet" />
<script src="testSAVEData_files/bootstrap-3.3.5/js/bootstrap.min.js"></script>
<script src="testSAVEData_files/bootstrap-3.3.5/shim/html5shiv.min.js"></script>
<script src="testSAVEData_files/bootstrap-3.3.5/shim/respond.min.js"></script>
<script src="testSAVEData_files/jqueryui-1.11.4/jquery-ui.min.js"></script>
<link href="testSAVEData_files/tocify-1.9.1/jquery.tocify.css" rel="stylesheet" />
<script src="testSAVEData_files/tocify-1.9.1/jquery.tocify.js"></script>
<script src="testSAVEData_files/navigation-1.1/tabsets.js"></script>
<script src="testSAVEData_files/navigation-1.1/codefolding.js"></script>
<link href="testSAVEData_files/highlightjs-9.12.0/default.css" rel="stylesheet" />
<script src="testSAVEData_files/highlightjs-9.12.0/highlight.js"></script>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
pre:not([class]) {
background-color: white;
}
</style>
<script type="text/javascript">
if (window.hljs) {
hljs.configure({languages: []});
hljs.initHighlightingOnLoad();
if (document.readyState && document.readyState === "complete") {
window.setTimeout(function() { hljs.initHighlighting(); }, 0);
}
}
</script>
<style type="text/css">
h1 {
font-size: 34px;
}
h1.title {
font-size: 38px;
}
h2 {
font-size: 30px;
}
h3 {
font-size: 24px;
}
h4 {
font-size: 18px;
}
h5 {
font-size: 16px;
}
h6 {
font-size: 12px;
}
.table th:not([align]) {
text-align: left;
}
</style>
<style type = "text/css">
.main-container {
max-width: 940px;
margin-left: auto;
margin-right: auto;
}
code {
color: inherit;
background-color: rgba(0, 0, 0, 0.04);
}
img {
max-width:100%;
height: auto;
}
.tabbed-pane {
padding-top: 12px;
}
.html-widget {
margin-bottom: 20px;
}
button.code-folding-btn:focus {
outline: none;
}
summary {
display: list-item;
}
</style>
<!-- tabsets -->
<style type="text/css">
.tabset-dropdown > .nav-tabs {
display: inline-table;
max-height: 500px;
min-height: 44px;
overflow-y: auto;
background: white;
border: 1px solid #ddd;
border-radius: 4px;
}
.tabset-dropdown > .nav-tabs > li.active:before {
content: "";
font-family: 'Glyphicons Halflings';
display: inline-block;
padding: 10px;
border-right: 1px solid #ddd;
}
.tabset-dropdown > .nav-tabs.nav-tabs-open > li.active:before {
content: "&#xe258;";
border: none;
}
.tabset-dropdown > .nav-tabs.nav-tabs-open:before {
content: "";
font-family: 'Glyphicons Halflings';
display: inline-block;
padding: 10px;
border-right: 1px solid #ddd;
}
.tabset-dropdown > .nav-tabs > li.active {
display: block;
}
.tabset-dropdown > .nav-tabs > li > a,
.tabset-dropdown > .nav-tabs > li > a:focus,
.tabset-dropdown > .nav-tabs > li > a:hover {
border: none;
display: inline-block;
border-radius: 4px;
}
.tabset-dropdown > .nav-tabs.nav-tabs-open > li {
display: block;
float: none;
}
.tabset-dropdown > .nav-tabs > li {
display: none;
}
</style>
<script>
$(document).ready(function () {
window.buildTabsets("TOC");
});
$(document).ready(function () {
$('.tabset-dropdown > .nav-tabs > li').click(function () {
$(this).parent().toggleClass('nav-tabs-open')
});
});
</script>
<!-- code folding -->
<style type="text/css">
.code-folding-btn { margin-bottom: 4px; }
</style>
<script>
$(document).ready(function () {
window.initializeCodeFolding("hide" === "show");
});
</script>
<script>
$(document).ready(function () {
// move toc-ignore selectors from section div to header
$('div.section.toc-ignore')
.removeClass('toc-ignore')
.children('h1,h2,h3,h4,h5').addClass('toc-ignore');
// establish options
var options = {
selectors: "h1,h2,h3",
theme: "bootstrap3",
context: '.toc-content',
hashGenerator: function (text) {
return text.replace(/[.\\/?&!#<>]/g, '').replace(/\s/g, '_').toLowerCase();
},
ignoreSelector: ".toc-ignore",
scrollTo: 0
};
options.showAndHide = true;
options.smoothScroll = true;
// tocify
var toc = $("#TOC").tocify(options).data("toc-tocify");
});
</script>
<style type="text/css">
#TOC {
margin: 25px 0px 20px 0px;
}
@media (max-width: 768px) {
#TOC {
position: relative;
width: 100%;
}
}
.toc-content {
padding-left: 30px;
padding-right: 40px;
}
div.main-container {
max-width: 1200px;
}
div.tocify {
width: 20%;
max-width: 260px;
max-height: 85%;
}
@media (min-width: 768px) and (max-width: 991px) {
div.tocify {
width: 25%;
}
}
@media (max-width: 767px) {
div.tocify {
width: 100%;
max-width: none;
}
}
.tocify ul, .tocify li {
line-height: 20px;
}
.tocify-subheader .tocify-item {
font-size: 0.90em;
}
.tocify .list-group-item {
border-radius: 0px;
}
</style>
</head>
<body>
<div class="container-fluid main-container">
<!-- setup 3col/9col grid for toc_float and main content -->
<div class="row-fluid">
<div class="col-xs-12 col-sm-4 col-md-3">
<div id="TOC" class="tocify">
</div>
</div>
<div class="toc-content col-xs-12 col-sm-8 col-md-9">
<div class="fluid-row" id="header">
<div class="btn-group pull-right">
<button type="button" class="btn btn-default btn-xs dropdown-toggle" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"><span>Code</span> <span class="caret"></span></button>
<ul class="dropdown-menu" style="min-width: 50px;">
<li><a id="rmd-show-all-code" href="#">Show All Code</a></li>
<li><a id="rmd-hide-all-code" href="#">Hide All Code</a></li>
</ul>
</div>
<h1 class="title toc-ignore">Testing SAVE SDRC v2.1 Data</h1>
<h4 class="author">Ben Anderson (University of Otago)</h4>
<h4 class="date">Last run at: 2019-06-18 08:52:43</h4>
</div>
<div id="report-purpose" class="section level1">
<h1><span class="header-section-number">1</span> Report Purpose</h1>
<p>Test SAVE SDRC v2.1 data (input &amp; output)</p>
</div>
<div id="input" class="section level1">
<h1><span class="header-section-number">2</span> Input</h1>
<div id="ons-tu-2000-data-files" class="section level2">
<h2><span class="header-section-number">2.1</span> ONS TU 2000 data files</h2>
<p>Check diary:</p>
<pre class="r"><code>onsTU2000f &lt;- paste0(tuPath,&quot;diary_data_8_long_v1.0.dta&quot;)
df &lt;- foreign::read.dta(onsTU2000f)
dt &lt;- data.table::as.data.table(df)
message(&quot;N individuals: &quot;, uniqueN(dt$serial))</code></pre>
<pre><code>## N individuals: 20981</code></pre>
<pre class="r"><code>message(&quot;Months covered: &quot;)</code></pre>
<pre><code>## Months covered:</code></pre>
<pre class="r"><code>dt[, month := lubridate::month(s_datetime)]
dt[, .(nRespondents = uniqueN(serial)), keyby = .(month)]</code></pre>
<pre><code>## month nRespondents
## 1: 1 1482
## 2: 2 1847
## 3: 3 1351
## 4: 4 1572
## 5: 5 2143
## 6: 6 1855
## 7: 7 2516
## 8: 8 2507
## 9: 9 2143
## 10: 10 1483
## 11: 11 1628
## 12: 12 1033</code></pre>
<p>Check constraints file:</p>
<pre class="r"><code>onsTU2000f &lt;- paste0(tuPath,&quot;ONS-2000-TU-simfile_hh_constraints_v3.2.dta&quot;)
df &lt;- foreign::read.dta(onsTU2000f)
dt &lt;- data.table::as.data.table(df)
print(table(dt$c_nearners, dt$region, useNA = &quot;always&quot;))</code></pre>
<pre><code>##
## NE NW Y&amp;H EM WM EoE L SE SW Wa Sc NI &lt;NA&gt;
## 0 110 221 185 158 169 174 141 216 186 126 226 38 3
## 1 88 187 171 176 133 167 197 212 174 95 155 30 6
## 2 54 161 123 150 109 169 94 220 165 65 150 38 2
## 3 7 40 49 27 27 37 23 53 34 15 34 9 3
## &lt;NA&gt; 38 105 71 78 58 44 115 91 63 58 53 31 0</code></pre>
<pre class="r"><code>message(&quot;N households: &quot;, uniqueN(dt$hhidn))</code></pre>
<pre><code>## N households: 6407</code></pre>
</div>
<div id="sim-input-files" class="section level2">
<h2><span class="header-section-number">2.2</span> Sim input files</h2>
<pre class="r"><code>ifiles &lt;- c(&quot;ONS-TU-wtdh_ug-2000-v3m1-siminput.txt&quot;,
&quot;ONS_TU_wtdh_ug_2000_weekdays_winter_v3m1_siminput.txt&quot;,
&quot;ONS_TU_wtdh_ug_2000_weekends_winter_v3m1_siminput.txt&quot;,
&quot;ONS_TU_wtdh_ug_2000_weekdays_summer_v3m1_siminput.txt&quot;,
&quot;ONS_TU_wtdh_ug_2000_weekends_summer_v3m1_siminput.txt&quot;)
for(f in ifiles){
message(&quot;Testing: &quot;, f)
dt &lt;- data.table::fread(paste0(iPath, f))
names(dt)
uniqueN(dt$hhid_numeric)
uniqueN(dt$region)
message(&quot;Regional totals: &quot;, f)
t &lt;- addmargins(table(dt$wmm_const_gender,dt$region))
print(t)
}</code></pre>
<pre><code>## Testing: ONS-TU-wtdh_ug-2000-v3m1-siminput.txt</code></pre>
<pre><code>## Regional totals: ONS-TU-wtdh_ug-2000-v3m1-siminput.txt</code></pre>
<pre><code>##
## 1 2 3 4 5 6 7 8 9 10 11 12 Sum
## 0 34 66 66 65 56 45 59 106 103 41 81 31 753
## 1 24 43 52 36 36 26 46 57 61 14 64 13 472
## Sum 58 109 118 101 92 71 105 163 164 55 145 44 1225</code></pre>
<pre><code>## Testing: ONS_TU_wtdh_ug_2000_weekdays_winter_v3m1_siminput.txt</code></pre>
<pre><code>## Regional totals: ONS_TU_wtdh_ug_2000_weekdays_winter_v3m1_siminput.txt</code></pre>
<pre><code>##
## 1 2 3 4 5 6 7 8 9 10 11 12 Sum
## 0 34 66 66 65 56 45 59 106 103 41 81 31 753
## 1 24 43 52 36 36 26 46 57 61 14 64 13 472
## Sum 58 109 118 101 92 71 105 163 164 55 145 44 1225</code></pre>
<pre><code>## Testing: ONS_TU_wtdh_ug_2000_weekends_winter_v3m1_siminput.txt</code></pre>
<pre><code>## Regional totals: ONS_TU_wtdh_ug_2000_weekends_winter_v3m1_siminput.txt</code></pre>
<pre><code>##
## 1 2 3 4 5 6 7 8 9 10 11 12 Sum
## 0 34 66 64 66 56 44 55 103 103 42 82 32 747
## 1 24 42 51 37 35 25 44 60 61 14 65 13 471
## Sum 58 108 115 103 91 69 99 163 164 56 147 45 1218</code></pre>
<pre><code>## Testing: ONS_TU_wtdh_ug_2000_weekdays_summer_v3m1_siminput.txt</code></pre>
<pre><code>## Regional totals: ONS_TU_wtdh_ug_2000_weekdays_summer_v3m1_siminput.txt</code></pre>
<pre><code>##
## 1 2 3 4 5 6 7 8 9 10 11 12 Sum
## 0 27 122 86 112 48 117 62 140 78 38 114 4 948
## 1 42 59 45 51 30 51 50 60 42 29 55 2 516
## Sum 69 181 131 163 78 168 112 200 120 67 169 6 1464</code></pre>
<pre><code>## Testing: ONS_TU_wtdh_ug_2000_weekends_summer_v3m1_siminput.txt</code></pre>
<pre><code>## Regional totals: ONS_TU_wtdh_ug_2000_weekends_summer_v3m1_siminput.txt</code></pre>
<pre><code>##
## 1 2 3 4 5 6 7 8 9 10 11 12 Sum
## 0 27 118 87 112 50 118 60 141 82 37 113 4 949
## 1 43 57 43 53 33 51 50 57 42 30 57 2 518
## Sum 70 175 130 165 83 169 110 198 124 67 170 6 1467</code></pre>
</div>
</div>
<div id="output" class="section level1">
<h1><span class="header-section-number">3</span> Output</h1>
<p>Just survey data as output:</p>
<pre class="r"><code>ofiles &lt;- c(&quot;ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_simoutput.txt&quot;)
for(f in ofiles){
message(&quot;Testing: &quot;, f)
dt &lt;- data.table::fread(paste0(oPath, f))
message(&quot;N zones: &quot;, uniqueN(dt$V1)) # &lt;- actually zonecode
message(&quot;Mean n hhs: &quot;, mean(dt$Num_hh))
message(&quot;Min n hhs: &quot;, min(dt$Num_hh))
message(&quot;Max n hhs: &quot;, max(dt$Num_hh))
}</code></pre>
<pre><code>## Testing: ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_simoutput.txt</code></pre>
<pre><code>## Warning in data.table::fread(paste0(oPath, f)): Detected 54 column names
## but the data has 55 columns (i.e. invalid file). Added 1 extra default
## column name for the first column which is guessed to be row names or an
## index. Use setnames() afterwards if this guess is not correct, or fix the
## file write command that created the file to create a valid file.</code></pre>
<pre><code>## N zones: 146</code></pre>
<pre><code>## Mean n hhs: 712.616095890411</code></pre>
<pre><code>## Min n hhs: 389.88</code></pre>
<pre><code>## Max n hhs: 1044.51</code></pre>
<p>With TU data results:</p>
<pre class="r"><code># with TU data &amp; weights
ofiles &lt;- c(&quot;ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_winter_simoutput-simoutput-v3m1.csv.gz&quot;)
for(f in ofiles){
message(&quot;Testing: &quot;, f)
dt &lt;- data.table::fread(paste0(oPath, f))
message(&quot;N zones: &quot;, uniqueN(dt$zonecode)) # &lt;- actually zonecode
message(&quot;N regions:&quot;)
print(table(dt$region))
message(&quot;N households: &quot;, uniqueN(dt$hh_id))
message(&quot;Mean weight: &quot;, mean(dt$weight))
message(&quot;Min weight: &quot;, min(dt$weight))
message(&quot;Max weight: &quot;, max(dt$weight))
}</code></pre>
<pre><code>## Testing: ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_winter_simoutput-simoutput-v3m1.csv.gz</code></pre>
<pre><code>## N zones: 146</code></pre>
<pre><code>## N regions:</code></pre>
<pre><code>##
## SE
## 788208</code></pre>
<pre><code>## N households: 162</code></pre>
<pre><code>## Mean weight: 5.35157420376347</code></pre>
<pre><code>## Min weight: 0.01</code></pre>
<pre><code>## Max weight: 196.9</code></pre>
<p>Weights files:</p>
<pre class="r"><code># weights
wfiles &lt;- c(&quot;ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_winter_simoutput_weights.dta&quot;,
&quot;ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_summer_simoutput_weights.dta&quot;)
for(f in wfiles){
message(&quot;Testing: &quot;, f)
wf &lt;- paste0(oPath,f)
df &lt;- foreign::read.dta(wf)
dt &lt;- data.table::as.data.table(df)
message(&quot;N households: &quot;, uniqueN(dt$hh_id))
message(&quot;Mean weight: &quot;, mean(dt$weight))
message(&quot;Min weight: &quot;, min(dt$weight))
message(&quot;Max weight: &quot;, max(dt$weight))
}</code></pre>
<pre><code>## Testing: ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_winter_simoutput_weights.dta</code></pre>
<pre><code>## N households: 162</code></pre>
<pre><code>## Mean weight: 5.35157420795866</code></pre>
<pre><code>## Min weight: 0.00999999977648258</code></pre>
<pre><code>## Max weight: 196.899993896484</code></pre>
<pre><code>## Testing: ONS-TU-2000-Census-2001-wtdh_ug-Southampton-v3m1_weekdays_summer_simoutput_weights.dta</code></pre>
<pre><code>## N households: 197</code></pre>
<pre><code>## Mean weight: 4.56054300435325</code></pre>
<pre><code>## Min weight: 0.00999999977648258</code></pre>
<pre><code>## Max weight: 180.869995117188</code></pre>
</div>
<div id="references" class="section level1">
<h1><span class="header-section-number">4</span> References</h1>
</div>
</div>
</div>
</div>
<script>
// add bootstrap table styles to pandoc tables
function bootstrapStylePandocTables() {
$('tr.header').parent('thead').parent('table').addClass('table table-condensed');
}
$(document).ready(function () {
bootstrapStylePandocTables();
});
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>