diff --git a/template.md b/template.md
index 46c699248473339bfb88a310e5c5099856d0dac0..be18a8422e41e8944b75e2f6a8ea9a83b2003343 100644
--- a/template.md
+++ b/template.md
@@ -9,15 +9,17 @@ Things you should touch:
 | Item | Description |
 | --- | --- |
 | **.gitignore** | A place to tell git what _not_ to synchronise e.g. `.csv` or [weird OS files](https://gist.github.com/adamgit/3786883)|
-| **env.R** | Where we store all the parameters that might be re-used across our repo. Such as colour defaults, data paths etc. We avoid using a project/repo level .Rprofile because it can lead to [a **lot** of confusion](https://support.rstudio.com/hc/en-us/articles/360047157094-Managing-R-with-Rprofile-Renviron-Rprofile-site-Renviron-site-rsession-conf-and-repos-conf). |
-| **DESCRIPTION** | But only if you use this as a tmeplate for your own repo - it is a special file for packages |
-| **LICENSE** | Edit to suit your needs |
-| **R/** | Where we store functions that get built |
 | **analysis/** | Where we store .Rmd files and the .R scripts that call them (usually using a `drake` plan) |
+| **DESCRIPTION** | But only if you use this as a tmeplate for your own repo - it is a special file for packages |
 | **docs/** | Where we put output generated by the .R/.Rmd code. This is helpful if you are using [github/lab pages](https://guides.github.com/features/pages/). Unfortunately the University of Southampton gitlab service does not currently support this. |
+| **env.R** | Where we store all the parameters that might be re-used across our repo. Such as colour defaults, data paths etc. We avoid using a project/repo level .Rprofile because it can lead to [a **lot** of confusion](https://support.rstudio.com/hc/en-us/articles/360047157094-Managing-R-with-Rprofile-Renviron-Rprofile-site-Renviron-site-rsession-conf-and-repos-conf). |
+| **LICENSE** | Edit to suit your needs |
 | **notData/** | Where we do not store data. R packages expect certain kinds of data in their 'data/' folders. Do not put your data in it. |
+| **R/** | Where we store functions that get built |
+
+> We recommend **not** putting your data in your repo at all.
 
-In fact we recommend **not** putting your data in your repo at all. Yes, this breaks true reproducability but there are reasons:
+Yes, this breaks true reproducability but there are reasons:
 * we often use data that is commercial or sensitive or personal (under GDPR) - so we cannot risk that leaking out
 * we often use _very large_ datasets which most git/hub/lab services sensibly reject
 * we often pull real time data on the fly from elsewhere so storage makes no sense