diff --git a/Rmd/cleaningFeederData.Rmd b/Rmd/cleaningFeederData.Rmd
index 032a46503dfc3cffc86b096f41b1198d70b97694..4c352200e16a6432fb58fa7ce2fd86fca2b395ea 100644
--- a/Rmd/cleaningFeederData.Rmd
+++ b/Rmd/cleaningFeederData.Rmd
@@ -314,14 +314,19 @@ So, there are `r n` days with 100% data...
 
 If we plot the mean then we will see which days get closest to having a full dataset.
 
 ```{r bestDaysMean, fig.width=8}
-ggplot2::ggplot(aggDT, aes(x = rDate, colour = season, y = meanOK)) + geom_point()
+ggplot2::ggplot(aggDT, aes(x = rDate, colour = season, y = meanOK)) +
+  geom_point()
 ```
 
-Re-plot by the % of expected if we assume we _should_ have 25 feeders * 24 hours * 4 per hour (will be the same shape):
+Re-plot by the % of expected if we assume we _should_ have n feeders * 24 hours * 4 per hour (will be the same shape). This also tells us that there is some reason why we get fluctations in the number of data points per hour after 2003.
+
+For fun we then print 4 tables of the 'best' days per season.
 
 ```{r bestDaysProp, fig.width=8}
-ggplot2::ggplot(aggDT, aes(x = rDate, colour = season, y = 100*propExpected)) + geom_point() +
+ggplot2::ggplot(aggDT, aes(x = rDate, colour = season,
+                           y = 100*propExpected)) +
+  geom_point() +
   labs(y = "%")
 
 aggDT[, rDoW := lubridate::wday(rDate, lab = TRUE)]
@@ -346,8 +351,6 @@ kableExtra::kable(h, caption = "Best Winter days overall",
   kable_styling()
 ```
 
-This also tells us that there is some reason why we get fluctations in the number of data points per hour after 2003.
-
 # Summary
 
 So there are no days with 100% data. We need a different approach.