Commit 076902b2 authored by B.Anderson

added drake functions; added basicReport.Rmd; run to test. works

parent 3bc2196f
*
!/.gitignore
---
params:
  subtitle: ""
  title: ""
  authors: ""
title: '`r params$title`'
subtitle: '`r params$subtitle`'
author: '`r params$authors`'
date: 'Last run at: `r Sys.time()`'
output:
  bookdown::html_document2:
    fig_caption: yes
    code_folding: hide
    number_sections: yes
    toc: yes
    toc_depth: 2
    toc_float: TRUE
  bookdown::pdf_document2:
    fig_caption: yes
    number_sections: yes
  bookdown::word_document2:
    fig_caption: yes
    number_sections: yes
    toc: yes
    toc_depth: 2
    fig_width: 5
always_allow_html: yes
bibliography: '`r path.expand("~/bibliography.bib")`'
---
```{r knitrSetup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE) # by default turn off code echo
```
# A basic .Rmd template to illustrate our workflow
## Data
The following uses `skimr::skim()` to describe the data. Remember we created the skim output object in the R script. We just report it here.
```{r reportSkim}
# print the skimr object: drake::readd() retrieves the skimTable target
# from the drake cache
drake::readd(skimTable)
```
There's quite a lot of data...
## Plot
Figure \@ref(fig:allDataPlot) plots every data point in the data (!). Remember we created the plot object in the R script. We just print it here.
```{r allDataPlot, fig.cap="Half-hourly generation (GW)"}
# render the plot: drake::readd() retrieves the gWPlot target
# from the drake cache
drake::readd(gWPlot)
```
# Runtime
```{r check runtime, include=FALSE}
t <- proc.time() - startTime
elapsed <- t[[3]]
```
Report generated in `r round(elapsed,2)` seconds (`r round(elapsed/60,2)` minutes) using [knitr](https://cran.r-project.org/package=knitr) in [RStudio](http://www.rstudio.com) with `r R.version.string` running on `r R.version$platform`.
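
The timing chunk above relies on `startTime` having been set by the calling R script before `rmarkdown::render()` runs. The following is a minimal sketch of the same calculation, assuming you might also want to knit this file on its own. The `exists()` guard and the `Sys.sleep()` stand-in are additions for illustration, not part of the original workflow.

```r
# Fall back to "now" if the calling script has not defined startTime
# (assumption: only needed when knitting the .Rmd on its own)
if (!exists("startTime")) {
  startTime <- proc.time()
}
Sys.sleep(0.2)                 # stand-in for the real work
t <- proc.time() - startTime   # named vector including user, system and elapsed time
elapsed <- t[["elapsed"]]      # same value as t[[3]]
message("Elapsed: ", round(elapsed, 2), " s (", round(elapsed / 60, 2), " min)")
```

Selecting the element by name rather than by position makes it explicit that the wall-clock time is being reported.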
# R environment
## R packages used
* base R [@baseR]
* bookdown [@bookdown]
* data.table [@data.table]
* drake [@drake]
* ggplot2 [@ggplot2]
* here [@here]
* knitr [@knitr]
* lubridate [@lubridate]
* rmarkdown [@rmarkdown; @rmarkdownBook]
## Session info
```{r sessionInfo, echo=FALSE}
sessionInfo()
```
# References
# basic drake makefile
library(woRkflow) # remember to build it first :-)
woRkflow::setup() # set stuff up
\ No newline at end of file
# basic drake makefile
# Libraries ----
library(woRkflow) # remember to build it first :-)
reqLibs <- c("data.table", # data munching
             "drake", # what's done stays done
             "here", # here
             "lubridate", # dates and times
             "ggplot2", # plots
             "skimr" # for skim
)
# load them
woRkflow::loadLibraries(reqLibs)
# Parameters ----
# Some data to play with:
# https://data.nationalgrideso.com/carbon-intensity1/historic-generation-mix/r/historic_gb_generation_mix
urlToGet <- "http://data.nationalgrideso.com/backend/dataset/88313ae5-94e4-4ddc-a790-593554d8c6b9/resource/7b41ea4d-cada-491e-8ad6-7b62f6a63193/download/df_fuel_ckan.csv"
update <- "now" # edit this in any way (at all) to get drake to re-load the data from the url
rmdFile <- "basicReport" # <- name of the .Rmd file to run at the end
title <- "UK Electricity Generation"
subtitle <- "UK ESO grid data"
authors <- "Ben Anderson"
# Functions ----
# for use in drake
getData <- function(f, update){
  # gets the data
  # 'update' is not used in the body: it exists only so that drake sees a
  # changed dependency (and re-runs this target) when its value is edited
  dt <- data.table::fread(f)
  return(dt)
}
makeGWPlot <- function(dt){
  # expects the eso data as a data.table
  # draws a plot
  # note: := adds the two new columns to dt by reference
  dt[, rDateTime := lubridate::ymd_hms(DATETIME)]
  dt[, weekDay := lubridate::wday(rDateTime, label = TRUE)]
  # draw a megaplot for illustrative purposes
  p <- ggplot2::ggplot(dt, ggplot2::aes(x = rDateTime,
                                        y = GENERATION/1000,
                                        colour = weekDay)) +
    ggplot2::geom_point() +
    ggplot2::theme(legend.position = "bottom") +
    ggplot2::labs(x = "Time",
                  y = "Generation (GW - mean per halfhour?)",
                  caption = "Source: UK Grid ESO (http://data.nationalgrideso.com)")
  return(p)
}
makeReport <- function(f){
  # default = html
  # title, subtitle and authors are the globals set in the Parameters section
  rmarkdown::render(input = paste0(here::here("analysis", f), ".Rmd"), # we love here::here()
                    params = list(title = title,
                                  subtitle = subtitle,
                                  authors = authors),
                    output_file = paste0(here::here("docs", f), ".html")
  )
}
# Set up ----
woRkflow::setup() # set stuff up
startTime <- proc.time()
# Set the drake plan ----
# Clearly this will fail if you do not have internet access...
plan <- drake::drake_plan(
  esoData = getData(urlToGet, update), # returns data as data.table. If you edit update in any way it will reload - drake is watching you!
  skimTable = skimr::skim(esoData), # create a data description table
  gWPlot = makeGWPlot(esoData) # make a plot
)
# Run drake plan ----
plan # print the plan to check it
drake::make(plan) # run the plan, re-loading data if needed
# Run the report ----
# run the report - don't do this inside the drake plan as
# drake can't track the .rmd file if it is not explicitly named
makeReport(rmdFile)
# Finish off ----
t <- proc.time() - startTime # how long did it take?
elapsed <- t[[3]]
print("Done")
print(paste0("Completed in ", round(elapsed/60,3), " minutes using ",
R.version.string, " running on ", R.version$platform))
\ No newline at end of file
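
The script renders the report outside the plan because drake cannot track an .Rmd file that is not explicitly named. Below is a hedged sketch of the alternative, using drake's `knitr_in()` and `file_out()` markers with literal file names. It assumes, for simplicity, that the .Rmd sits in the project root rather than in `analysis/`. The target name `report` and the object `reportPlanSketch` are illustrative and not part of this commit.

```r
# Illustrative only: name the .Rmd and its output explicitly inside the plan
# so that drake can treat them as a tracked input file and output file.
# Assumption: basicReport.Rmd is in the project root, so plain file names work.
reportPlanSketch <- drake::drake_plan(
  report = rmarkdown::render(
    input = knitr_in("basicReport.Rmd"),
    output_file = file_out("basicReport.html"),
    params = list(title = title, subtitle = subtitle, authors = authors),
    quiet = TRUE
  )
)
reportPlanSketch # inspect the one-row plan; nothing is rendered until make() is called
```

Building the plan only records the command. Whether this suits the workflow depends on how the paths and the `startTime` object used by the report are handled, so treat it as a starting point rather than a drop-in replacement for `makeReport()`.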