Commit 40131726 authored by B.Anderson

latest html run

parent cd3ccc9d
@@ -181,7 +181,7 @@ summary {
<h1 class="title toc-ignore">Testing electricity substation/feeder data</h1>
<h3 class="subtitle">Outliers and missing data...</h3>
<h4 class="author">Ben Anderson &amp; Ellis Ridett</h4>
-<h4 class="date">Last run at: 2020-07-09 00:56:06</h4>
+<h4 class="date">Last run at: 2020-07-09 09:48:01</h4>
 
</div>
 
@@ -214,7 +214,7 @@ dataCleaning::loadLibraries(rmdLibs)</code></pre>
<h1>Intro</h1>
<p>We have some electricity substation feeder data that has been cleaned to give mean kW per 15 minutes.</p>
<p>There seem to be some NA kW values and a lot of missing time stamps. We want to select the 'best' (i.e. most complete) days within a day-of-the-week/season/year sampling frame. If we can't do that we may have to resort to seasonal mean kW profiles by hour &amp; day of the week...</p>
-<p>Code used to generate this report: <a href="https://git.soton.ac.uk/ba1e12/spatialec/-/blob/master/isleOfWight/cleaningFeederData.Rmd" class="uri">https://git.soton.ac.uk/ba1e12/spatialec/-/blob/master/isleOfWight/cleaningFeederData.Rmd</a></p>
+<p>The code used to generate this report is in: <a href="https://git.soton.ac.uk/ba1e12/dataCleaning/Rmd/" class="uri">https://git.soton.ac.uk/ba1e12/dataCleaning/Rmd/</a></p>
</div>
<div id="data-prep" class="section level1">
<h1>Data prep</h1>
@@ -226,11 +226,11 @@ dataCleaning::loadLibraries(rmdLibs)</code></pre>
uniqDataDT &lt;- drake::readd(uniqData) # readd the drake object
 
kableExtra::kable(head(origDataDT), digits = 2,
-caption = &quot;Counts per feeder (long table)&quot;) %&gt;%
+caption = &quot;First 6 rows of data&quot;) %&gt;%
kable_styling()</code></pre>
<table class="table" style="margin-left: auto; margin-right: auto;">
<caption>
-Counts per feeder (long table)
+First 6 rows of data
</caption>
<thead>
<tr>
@@ -487,14 +487,16 @@ Winter
 
message(&quot;Unique data nrows: &quot;, tidyNum(nrow(uniqDataDT)))
 
-message(&quot;So we have &quot;, tidyNum(nrow(origDataDT) - nrow(uniqDataDT)), &quot; duplicates...&quot;)
+nDups &lt;- tidyNum(nrow(origDataDT) - nrow(uniqDataDT))
+message(&quot;So we have &quot;, nDups, &quot; duplicates...&quot;) # nDups is already formatted by tidyNum
 
pc &lt;- 100*((nrow(origDataDT) - nrow(uniqDataDT))/nrow(origDataDT))
message(&quot;That's &quot;, round(pc,2), &quot;%&quot;)
 
feederDT &lt;- uniqDataDT[!is.na(rDateTime)] # use dt with no duplicates
origDataDT &lt;- NULL # save memory</code></pre>
-<p>There were duplicates - that's 0.38 % of the observations loaded.</p>
+<p>There were 83,606 duplicates - that's ~ 0.38 % of the observations loaded.</p>
<p>So we remove the duplicates...</p>
</div>
</div>
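The duplicate count and percentage reported in this hunk can be reproduced on toy data. This is a minimal sketch, not the report's own code: the `feeder` column, the toy values and the names `origDT`, `uniqDT`, `nDupsRaw` and `pcDups` are assumptions for illustration, and base `format()` stands in for `tidyNum()`, whose implementation is not shown here.

```r
library(data.table)

# Toy stand-in for the feeder data (hypothetical values; 'feeder' is an assumed column)
origDT <- data.table(
  feeder = c("A", "A", "A", "B", "B"),
  rDateTime = as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 00:00:00",
                           "2020-01-01 00:15:00", "2020-01-01 00:00:00",
                           "2020-01-01 00:15:00"), tz = "UTC"),
  kW = c(1.2, 1.2, 1.5, 0.8, 0.9)
)

# Drop rows that are duplicated across all columns
uniqDT <- unique(origDT)

# Keep the count numeric so it can be reused in the percentage
nDupsRaw <- nrow(origDT) - nrow(uniqDT)
pcDups <- 100 * nDupsRaw / nrow(origDT)

# Format only at the point of printing
message("So we have ", format(nDupsRaw, big.mark = ","), " duplicates...")
message("That's ", round(pcDups, 2), "%")
```

Keeping the raw count numeric and formatting it only when printing avoids passing an already formatted string through a number formatter a second time.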
@@ -1500,7 +1502,7 @@ Fri
</div>
<div id="runtime" class="section level1">
<h1>Runtime</h1>
-<p>Analysis completed in 196.02 seconds ( 3.27 minutes) using <a href="https://cran.r-project.org/package=knitr">knitr</a> in <a href="http://www.rstudio.com">RStudio</a> with R version 3.6.0 (2019-04-26) running on x86_64-redhat-linux-gnu.</p>
+<p>Analysis completed in 218.48 seconds ( 3.64 minutes) using <a href="https://cran.r-project.org/package=knitr">knitr</a> in <a href="http://www.rstudio.com">RStudio</a> with R version 3.6.0 (2019-04-26) running on x86_64-redhat-linux-gnu.</p>
</div>
<div id="r-environment" class="section level1">
<h1>R environment</h1>
@@ -1539,8 +1541,8 @@ Fri
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
-## [1] kableExtra_1.1.0 skimr_2.1.2 ggplot2_3.3.2 hms_0.5.3
-## [5] lubridate_1.7.9 here_0.1 drake_7.12.4 data.table_1.12.0
+## [1] kableExtra_1.1.0 drake_7.12.4 skimr_2.1.2 ggplot2_3.3.2
+## [5] hms_0.5.3 lubridate_1.7.9 here_0.1 data.table_1.12.0
## [9] dataCleaning_0.1.0
##
## loaded via a namespace (and not attached):