Commit bce9ea01 authored by Ben Anderson

rejigged season order; amended final t tests for large sample for comparability; run to html

parent 301c0016
@@ -9,10 +9,6 @@ author: '`r paste0(params$author)` (Contact: b.anderson@soton.ac.uk, `@dataknut`
date: 'Last run at: `r Sys.time()`'
always_allow_html: yes
output:
bookdown::word_document2:
fig_caption: yes
toc: yes
toc_depth: 2
bookdown::html_document2:
code_folding: hide
fig_caption: yes
@@ -21,6 +17,10 @@ output:
toc: yes
toc_depth: 2
toc_float: yes
bookdown::word_document2:
fig_caption: yes
toc: yes
toc_depth: 2
bookdown::pdf_document2:
fig_caption: yes
keep_tex: yes
@@ -394,9 +394,10 @@ This may be far too wide an error margin for our purposes so we may instead have
```{r smallNTable}
# mimic an intervention using seasons
testDT <- testDT[season == "Winter", group := "Control"]
testDT <- testDT[season == "Summer", group := "Intervention 1"]
testDT <- testDT[season == "Spring", group := "Intervention 2"]
testDT <- testDT[season == "Spring", group := "Intervention 1"]
testDT <- testDT[season == "Summer", group := "Intervention 2"]
testDT <- testDT[season == "Autumn", group := "Intervention 3"]
t <- testDT[, .("mean W" = mean(meanW),
@@ -445,10 +446,10 @@ tTest <- t.test(testDT[group == "Intervention 1"]$meanW, testDT[group == "Contro
tTestTidy <- broom::tidy(tTest)
tTestTidy$`Control mean` <- tTestTidy$estimate2
tTestTidy$`Group 1 mean` <- tTestTidy$estimate1
tTestTidy$`Intervention 1 mean` <- tTestTidy$estimate1
tTestTidy$`Mean difference` <- tTestTidy$estimate
knitr::kable(tTestTidy[c("Control mean", "Group 1 mean", "Mean difference",
knitr::kable(tTestTidy[c("Control mean", "Intervention 1 mean", "Mean difference",
"statistic", "p.value", "conf.low", "conf.high")], caption = "T test results (Group 1 vs Control)")
controlW <- tTest$estimate[[2]]
@@ -474,10 +475,10 @@ tTest <- t.test(testDT[group == "Intervention 2"]$meanW, testDT[group == "Contro
tTestTidy <- broom::tidy(tTest)
tTestTidy$`Control mean` <- tTestTidy$estimate2
tTestTidy$`Group 2 mean` <- tTestTidy$estimate1
tTestTidy$`Intervention 2 mean` <- tTestTidy$estimate1
tTestTidy$`Mean difference` <- tTestTidy$estimate
knitr::kable(tTestTidy[c("Control mean", "Group 2 mean", "Mean difference",
knitr::kable(tTestTidy[c("Control mean", "Intervention 2 mean", "Mean difference",
"statistic", "p.value", "conf.low", "conf.high")], caption = "T test results (Group 2 vs Control)")
controlW <- tTest$estimate[[2]]
@@ -537,23 +538,23 @@ myCaption <- paste0("Hypothetical large n sample",
makeMeanCIPlot(plotDT)
```
re-run T tests Group 1
Re-run T tests: Control vs Intervention 1
```{r largeNtTest1}
```{r largeNtTestControl-1}
# now compare winter & spring for a smaller effect
tTest <- t.test(largeTestDT[group == "Intervention 2"]$meanW, largeTestDT[group == "Control"]$meanW)
tTest <- t.test(largeTestDT[group == "Control"]$meanW, largeTestDT[group == "Intervention 1"]$meanW)
tTestTidy <- broom::tidy(tTest)
tTestTidy$`Control mean` <- tTestTidy$estimate2
tTestTidy$`Group 1 mean` <- tTestTidy$estimate1
tTestTidy$`Control mean` <- tTestTidy$estimate1
tTestTidy$`Intervention 1 mean` <- tTestTidy$estimate2
tTestTidy$`Mean difference` <- tTestTidy$estimate
knitr::kable(tTestTidy[c("Control mean", "Group 1 mean", "Mean difference",
"statistic", "p.value", "conf.low", "conf.high")], caption = "T test results (Group 1 vs Control)")
knitr::kable(tTestTidy[c("Control mean", "Intervention 1 mean", "Mean difference",
"statistic", "p.value", "conf.low", "conf.high")], caption = "T test results (Intervention 2 vs Control)")
controlW <- tTest$estimate[[2]]
intW <- tTest$estimate[[1]]
controlW <- tTest$estimate[[1]]
intW <- tTest$estimate[[2]]
cil <- tTest$conf.int[[1]]
ciu <- tTest$conf.int[[2]]
```
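Note that the argument order passed to `t.test()` sets the sign of the reported difference: `broom::tidy()` returns `estimate = estimate1 - estimate2`, i.e. the mean of the first sample minus the mean of the second, so passing the control group first (as the amended code above does) reports the demand reduction as a positive number. A minimal sketch with made-up data (not the trial data) to illustrate the flip:

```r
library(data.table)
library(broom)

set.seed(42)
# toy data only - not the trial data used above
toyDT <- data.table(group = rep(c("Control", "Intervention 1"), each = 50),
                    meanW = c(rnorm(50, mean = 160, sd = 40),
                              rnorm(50, mean = 60, sd = 20)))

# Intervention first: the mean difference is reported as a negative number
broom::tidy(t.test(toyDT[group == "Intervention 1"]$meanW,
                   toyDT[group == "Control"]$meanW))$estimate

# Control first: the same difference is reported as a positive number
broom::tidy(t.test(toyDT[group == "Control"]$meanW,
                   toyDT[group == "Intervention 1"]$meanW))$estimate
```

Keeping the same argument order in every comparison is what makes the "Mean difference" columns in the tables directly comparable.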
@@ -564,6 +565,34 @@ In this case:
* 95% confidence interval for the test = `r round(cil,2)` to `r round(ciu,2)` representing _much less_ uncertainty/variation;
* p value of `r round(tTest$p.value,4)` representing a _very low_ risk of a false positive result as it passes all conventional thresholds.
Re-run T tests: Control vs Intervention 2
```{r largeNtTestControl-2}
# now compare winter & spring for a smaller effect
tTest <- t.test(largeTestDT[group == "Control"]$meanW, largeTestDT[group == "Intervention 2"]$meanW)
tTestTidy <- broom::tidy(tTest)
tTestTidy$`Control mean` <- tTestTidy$estimate1
tTestTidy$`Intervention 2 mean` <- tTestTidy$estimate2
tTestTidy$`Mean difference` <- tTestTidy$estimate
knitr::kable(tTestTidy[c("Control mean", "Intervention 2 mean", "Mean difference",
"statistic", "p.value", "conf.low", "conf.high")], caption = "T test results (Intervention 2 vs Control)")
controlW <- tTest$estimate[[1]]
intW <- tTest$estimate[[2]]
cil <- tTest$conf.int[[1]]
ciu <- tTest$conf.int[[2]]
```
In this case:
* effect size = `r round(controlW - intW,2)`W or `r round(100 * (1-(intW/controlW)),2)`% representing a still _reasonable bang for buck_ for whatever caused the difference;
* 95% confidence interval for the test = `r round(cil,2)` to `r round(ciu,2)` representing _much less_ uncertainty/variation;
* p value of `r round(tTest$p.value,4)` representing a _very low_ risk of a false positive result as it passes all conventional thresholds.
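The _much less_ uncertainty noted above follows directly from the standard error of the difference in means shrinking with sample size: Welch's test (the `t.test()` default) uses se = sqrt(s1^2/n1 + s2^2/n2), and the groups here are roughly 40 times larger than in the small n example, so the interval should be around sqrt(40) ≈ 6 times narrower. A quick illustration with illustrative (not trial) standard deviations and group sizes:

```r
# Illustration only: Welch standard error of the difference in means,
# se = sqrt(s1^2/n1 + s2^2/n2), with made-up SDs and group sizes
seDiff <- function(sd1, sd2, n1, n2) sqrt(sd1^2 / n1 + sd2^2 / n2)

seDiff(320, 85, 30, 30)      # small groups -> large se -> wide confidence interval
seDiff(320, 85, 1200, 1200)  # ~40x larger groups -> se shrinks by ~sqrt(40)
```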
# Summary and recommendations
@@ -239,7 +239,7 @@ div.tocify {
<h1 class="title toc-ignore">Statistical Power, Statistical Significance, Study Design and Decision Making: A Worked Example</h1>
<h3 class="subtitle"><em>Sizing Demand Response Trials in New Zealand</em></h3>
<h4 class="author"><em>Ben Anderson and Tom Rushby (Contact: <a href="mailto:b.anderson@soton.ac.uk">b.anderson@soton.ac.uk</a>, <code>@dataknut</code>)</em></h4>
<h4 class="date"><em>Last run at: 2018-11-13 09:57:46</em></h4>
<h4 class="date"><em>Last run at: 2018-11-13 13:49:39</em></h4>
</div>
@@ -626,16 +626,16 @@ Figure 4.2: Power analysis results (power = 0.8)
</tr>
<tr class="even">
<td align="left">Intervention 1</td>
<td align="right">35.13947</td>
<td align="right">83.90258</td>
<td align="right">22</td>
</tr>
<tr class="odd">
<td align="left">Intervention 2</td>
<td align="right">58.80597</td>
<td align="right">113.53102</td>
<td align="right">26</td>
</tr>
<tr class="odd">
<td align="left">Intervention 2</td>
<td align="right">35.13947</td>
<td align="right">83.90258</td>
<td align="right">22</td>
</tr>
<tr class="even">
<td align="left">Intervention 3</td>
<td align="right">68.37439</td>
@@ -651,7 +651,7 @@ Figure 4.2: Power analysis results (power = 0.8)
<thead>
<tr class="header">
<th align="right">Control mean</th>
<th align="right">Group 1 mean</th>
<th align="right">Intervention 1 mean</th>
<th align="right">Mean difference</th>
<th align="right">statistic</th>
<th align="right">p.value</th>
@@ -662,20 +662,20 @@ Figure 4.2: Power analysis results (power = 0.8)
<tbody>
<tr class="odd">
<td align="right">162.6691</td>
<td align="right">35.13947</td>
<td align="right">-127.5297</td>
<td align="right">-1.990661</td>
<td align="right">0.0552626</td>
<td align="right">-258.11</td>
<td align="right">3.050644</td>
<td align="right">58.80597</td>
<td align="right">-103.8632</td>
<td align="right">-1.587604</td>
<td align="right">0.1216582</td>
<td align="right">-236.8285</td>
<td align="right">29.10212</td>
</tr>
</tbody>
</table>
<p>The results show that the mean power demand for the control group was 162.67W and for Intervention 1 was 35.14W. This is a (very) large difference in the mean of 127.53. The results of the t test are:</p>
<p>The results show that the mean power demand for the control group was 162.67W and for Intervention 1 was 58.81W. This is a (very) large difference in the mean of 103.86. The results of the t test are:</p>
<ul>
<li>effect size = 128W or 78% representing a <em>substantial bang for buck</em> for whatever caused the difference;</li>
<li>95% confidence interval for the test = -258.11 to 3.05 representing <em>considerable</em> uncertainty/variation;</li>
<li>p value of 0.055 representing a <em>relatively low</em> risk of a false positive result but which (just) fails the conventional p &lt; 0.05 threshold.</li>
<li>effect size = 104W or 64% representing a <em>substantial bang for buck</em> for whatever caused the difference;</li>
<li>95% confidence interval for the test = -236.83 to 29.1 representing <em>considerable</em> uncertainty/variation;</li>
<li>p value of 0.122 representing a <em>relatively high</em> risk of a false positive result which fails both the conventional p &lt; 0.05 threshold and the less conservative p &lt; 0.1.</li>
</ul>
<p>T test Group 2</p>
<table>
@@ -683,7 +683,7 @@ Figure 4.2: Power analysis results (power = 0.8)
<thead>
<tr class="header">
<th align="right">Control mean</th>
<th align="right">Group 2 mean</th>
<th align="right">Intervention 2 mean</th>
<th align="right">Mean difference</th>
<th align="right">statistic</th>
<th align="right">p.value</th>
@@ -694,22 +694,22 @@ Figure 4.2: Power analysis results (power = 0.8)
<tbody>
<tr class="odd">
<td align="right">162.6691</td>
<td align="right">58.80597</td>
<td align="right">-103.8632</td>
<td align="right">-1.587604</td>
<td align="right">0.1216582</td>
<td align="right">-236.8285</td>
<td align="right">29.10212</td>
<td align="right">35.13947</td>
<td align="right">-127.5297</td>
<td align="right">-1.990661</td>
<td align="right">0.0552626</td>
<td align="right">-258.11</td>
<td align="right">3.050644</td>
</tr>
</tbody>
</table>
<p>Now:</p>
<ul>
<li>effect size = 104W or 63.85% representing a still <em>reasonable bang for buck</em> for whatever caused the difference;</li>
<li>95% confidence interval for the test = -236.83 to 29.1 representing <em>even greater</em> uncertainty/variation;</li>
<li>p value of 0.122 representing a <em>higher</em> risk of a false positive result which fails the conventional p &lt; 0.05 threshold and also the less conservative p &lt; 0.1.</li>
<li>effect size = 128W or 78.4% representing an even more <em>substantial bang for buck</em> for whatever caused the difference;</li>
<li>95% confidence interval for the test = -258.11 to 3.05 representing <em>similarly large</em> uncertainty/variation;</li>
<li>p value of 0.055 representing a <em>relatively low</em> risk of a false positive result which (just) fails the conventional p &lt; 0.05 threshold but passes the less conservative p &lt; 0.1.</li>
</ul>
<p>To detect Intervention Group 2’s effect size of 63.85% would have required control and trial group sizes of 47 respectively.</p>
<p>To detect Intervention Group 2’s effect size of 78.4% would have required control and trial groups of 31 each.</p>
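How such a required group size can be produced is sketched below, assuming the `pwr` package (this excerpt does not show which tool the report actually used) and using illustrative means and a pooled standard deviation rather than the exact values behind the 31-per-group figure:

```r
# Sketch only: reproducing a required-sample-size calculation of this kind.
# Assumes the pwr package; the means and SD below are illustrative,
# not the exact values behind the figure quoted above.
library(pwr)

controlMean <- 162.7   # W, roughly the control mean reported above
intMean     <- 35.1    # W, roughly the Intervention 2 mean reported above
pooledSD    <- 160     # W, illustrative pooled standard deviation

d <- (controlMean - intMean) / pooledSD   # Cohen's d (standardised effect size)

# per-group n for 80% power at the conventional 5% significance level
pwr.t.test(d = d, power = 0.8, sig.level = 0.05,
           type = "two.sample", alternative = "two.sided")
```

`pwr.t.test()` solves for whichever of `n`, `d`, `sig.level` and `power` is left NULL, so the same call can also be used to check the power of a fixed design.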
</div>
<div id="getting-it-right" class="section level2">
<h2><span class="header-section-number">5.2</span> Getting it ‘right’</h2>
@@ -726,27 +726,27 @@ Figure 4.2: Power analysis results (power = 0.8)
<tbody>
<tr class="odd">
<td align="left">Control</td>
<td align="right">169.60064</td>
<td align="right">328.56355</td>
<td align="right">1140</td>
<td align="right">160.44582</td>
<td align="right">317.03541</td>
<td align="right">1128</td>
</tr>
<tr class="even">
<td align="left">Intervention 1</td>
<td align="right">34.50149</td>
<td align="right">82.94015</td>
<td align="right">907</td>
<td align="right">58.51839</td>
<td align="right">109.58111</td>
<td align="right">984</td>
</tr>
<tr class="odd">
<td align="left">Intervention 2</td>
<td align="right">59.84020</td>
<td align="right">112.74650</td>
<td align="right">1008</td>
<td align="right">36.35177</td>
<td align="right">83.36952</td>
<td align="right">903</td>
</tr>
<tr class="even">
<td align="left">Intervention 3</td>
<td align="right">73.24102</td>
<td align="right">148.25869</td>
<td align="right">1145</td>
<td align="right">70.69426</td>
<td align="right">147.43129</td>
<td align="right">1185</td>
</tr>
</tbody>
</table>
@@ -756,13 +756,45 @@ Figure 4.2: Power analysis results (power = 0.8)
Figure 5.1: Mean W demand per group for large sample (Error bars = 95% confidence intervals for the sample mean)
</p>
</div>
<p>re-run T tests Group 1</p>
<p>Re-run T tests: Control vs Intervention 1</p>
<table>
<caption><span id="tab:largeNtTestControl-1">Table 5.5: </span>T test results (Intervention 2 vs Control)</caption>
<thead>
<tr class="header">
<th align="right">Control mean</th>
<th align="right">Intervention 1 mean</th>
<th align="right">Mean difference</th>
<th align="right">statistic</th>
<th align="right">p.value</th>
<th align="right">conf.low</th>
<th align="right">conf.high</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="right">160.4458</td>
<td align="right">58.51839</td>
<td align="right">101.9274</td>
<td align="right">10.12667</td>
<td align="right">0</td>
<td align="right">82.18316</td>
<td align="right">121.6717</td>
</tr>
</tbody>
</table>
<p>In this case:</p>
<ul>
<li>effect size = 101.93W or 63.53% representing a still <em>reasonable bang for buck</em> for whatever caused the difference;</li>
<li>95% confidence interval for the test = 82.18 to 121.67 representing <em>much less</em> uncertainty/variation;</li>
<li>p value of 0 representing a <em>very low</em> risk of a false positive result as it passes all conventional thresholds.</li>
</ul>
<p>Re-run T tests: Control vs Intervention 2</p>
<table>
<caption><span id="tab:largeNtTest1">Table 5.5: </span>T test results (Group 1 vs Control)</caption>
<caption><span id="tab:largeNtTestControl-2">Table 5.6: </span>T test results (Intervention 2 vs Control)</caption>
<thead>
<tr class="header">
<th align="right">Control mean</th>
<th align="right">Group 1 mean</th>
<th align="right">Intervention 2 mean</th>
<th align="right">Mean difference</th>
<th align="right">statistic</th>
<th align="right">p.value</th>
@@ -772,20 +804,20 @@ Figure 5.1: Mean W demand per group for large sample (Error bars = 95% confidenc
</thead>
<tbody>
<tr class="odd">
<td align="right">169.6006</td>
<td align="right">59.8402</td>
<td align="right">-109.7604</td>
<td align="right">-10.59573</td>
<td align="right">160.4458</td>
<td align="right">36.35177</td>
<td align="right">124.0941</td>
<td align="right">12.61266</td>
<td align="right">0</td>
<td align="right">-130.0807</td>
<td align="right">-89.44015</td>
<td align="right">104.7925</td>
<td align="right">143.3956</td>
</tr>
</tbody>
</table>
<p>In this case:</p>
<ul>
<li>effect size = 109.7604326W or 64.72% representing a still <em>reasonable bang for buck</em> for whatever caused the difference;</li>
<li>95% confidence interval for the test = -130.08 to -89.44 representing <em>much less</em> uncertainty/variation;</li>
<li>effect size = 124.09W or 77.34% representing a still <em>reasonable bang for buck</em> for whatever caused the difference;</li>
<li>95% confidence interval for the test = 104.79 to 143.4 representing <em>much less</em> uncertainty/variation;</li>
<li>p value of 0 representing a <em>very low</em> risk of a false positive result as it passes all conventional thresholds.</li>
</ul>
</div>
@@ -807,7 +839,7 @@ Figure 5.1: Mean W demand per group for large sample (Error bars = 95% confidenc
</div>
<div id="runtime" class="section level1">
<h1><span class="header-section-number">8</span> Runtime</h1>
<p>Analysis completed in 51.37 seconds ( 0.86 minutes) using <a href="https://cran.r-project.org/package=knitr">knitr</a> in <a href="http://www.rstudio.com">RStudio</a> with R version 3.5.1 (2018-07-02) running on x86_64-apple-darwin15.6.0.</p>
<p>Analysis completed in 46.02 seconds ( 0.77 minutes) using <a href="https://cran.r-project.org/package=knitr">knitr</a> in <a href="http://www.rstudio.com">RStudio</a> with R version 3.5.1 (2018-07-02) running on x86_64-apple-darwin15.6.0.</p>
</div>
<div id="r-environment" class="section level1">
<h1><span class="header-section-number">9</span> R environment</h1>