merge a few edits

Merged Ben Anderson requested to merge ba1e12/datacleaning:master into master
1 file changed, +8 -5
  • e8f5e9bc
    pdf/latex fix: · e8f5e9bc
    B.Anderson authored
    > install.packages(c('tinytex', 'rmarkdown'))
    > tinytex::install_tinytex()
    
    #rtfm
@@ -314,14 +314,19 @@ So, there are `r n` days with 100% data...
If we plot the mean then we will see which days get closest to having a full dataset.
```{r bestDaysMean, fig.width=8}
ggplot2::ggplot(aggDT, aes(x = rDate, colour = season, y = meanOK)) + geom_point()
ggplot2::ggplot(aggDT, aes(x = rDate, colour = season, y = meanOK)) +
geom_point()
```
Re-plot by the % of expected if we assume we _should_ have 25 feeders * 24 hours * 4 per hour (will be the same shape):
Re-plot by the % of expected if we assume we _should_ have n feeders * 24 hours * 4 per hour (will be the same shape). This also tells us that there is some reason why we get fluctuations in the number of data points per hour after 2003.
For fun we then print 4 tables of the 'best' days per season.
```{r bestDaysProp, fig.width=8}
ggplot2::ggplot(aggDT, aes(x = rDate, colour = season, y = 100*propExpected)) + geom_point() +
ggplot2::ggplot(aggDT, aes(x = rDate, colour = season,
y = 100*propExpected)) +
geom_point() +
labs(y = "%")
aggDT[, rDoW := lubridate::wday(rDate, label = TRUE)]
@@ -346,8 +351,6 @@ kableExtra::kable(h, caption = "Best Winter days overall",
kable_styling()
```
This also tells us that there is some reason why we get fluctuations in the number of data points per hour after 2003.
# Summary
So there are no days with 100% data. We need a different approach.
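The "% of expected" calculation described in the diff (n feeders * 24 hours * 4 per hour) can be sketched as below. This is only an illustration under stated assumptions: `nFeeders` uses the 25 from the earlier wording as an example value, and `toyDT`, `nObs` and `pctExpected` are hypothetical stand-ins with made-up counts, not the real `aggDT`.

```r
library(data.table)

# Assumed for illustration: 25 feeders, as in the earlier wording of the text
nFeeders <- 25L
# Expected observations per day: n feeders * 24 hours * 4 readings per hour
obsPerDay <- nFeeders * 24L * 4L

# Hypothetical stand-in for aggDT with made-up daily observation counts
toyDT <- data.table(
  rDate = as.Date("2003-01-01") + 0:2,
  nObs  = c(2400L, 1800L, 1200L)
)

# Proportion of the expected count that was actually observed on each day
toyDT[, propExpected := nObs / obsPerDay]
# Same value as a percentage, matching the 100*propExpected used in the plot
toyDT[, pctExpected := 100 * propExpected]

print(toyDT)
```

Because the denominator is a constant, the percentage is just a rescaling of the daily count, which is why the re-plotted chart has the same shape.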