| **.gitignore** | A place to tell git what _not_ to synchronise, e.g. `.csv` files or [weird OS files](https://gist.github.com/adamgit/3786883) |
| **analysis/** | Where we store .Rmd files and the .R scripts that call them (usually using a `drake` plan) |
| **DESCRIPTION** | Only relevant if you use this as a template for your own repo. It is a special file for R packages |
| **docs/** | Where we put output generated by the .R/.Rmd code. This is helpful if you are using [github/lab pages](https://guides.github.com/features/pages/). Unfortunately the University of Southampton gitlab service does not currently support this. |
| **env.R** | Where we store all the parameters that might be re-used across our repo, such as colour defaults, data paths etc. We avoid using a project/repo level .Rprofile because it can lead to [a **lot** of confusion](https://support.rstudio.com/hc/en-us/articles/360047157094-Managing-R-with-Rprofile-Renviron-Rprofile-site-Renviron-site-rsession-conf-and-repos-conf). |
| **LICENSE** | Edit to suit your needs |
| **notData/** | Where we do not store data. R packages expect certain kinds of data in their 'data/' folders. Do not put your data in it. |
| **R/** | Where we store the functions that get built into the package |
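
As a rough illustration of the `env.R` idea, a minimal sketch might look like the following. The list name `repoParams` and every entry in it are assumptions made for illustration, not the contents of the actual file.

```r
# env.R (illustrative sketch only; all names and values are hypothetical)
repoParams <- list()

# Data lives outside the repo, so scripts only ever refer to this path
repoParams$dataPath <- path.expand("~/data/myProject")

# Colour defaults re-used across plots
repoParams$colours <- c(primary = "#1b9e77", secondary = "#d95f02")

# Where generated output goes (assumes the working directory is the repo root)
repoParams$docsPath <- file.path(getwd(), "docs")
```

Assuming the working directory is the repo root, a script in `analysis/` could then start with `source("env.R")` and use `repoParams$dataPath` instead of hard-coding paths.
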

We recommend **not** putting your data in your repo at all. Yes, this breaks true reproducibility, but there are reasons:
* we often use data that is commercial or sensitive or personal (under GDPR) - so we cannot risk that leaking out
* we often use _very large_ datasets which most git/hub/lab services sensibly reject
* we often pull real time data on the fly from elsewhere so storage makes no sense
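
One way to act on these reasons is to keep the data path outside the repo and fail loudly if it ever points inside it. The sketch below is only an illustration: the helper names `isInsideRepo()` and `loadExternalData()` are hypothetical and not part of this template, and it assumes the data is a `.csv` file.

```r
# Hypothetical helper: TRUE if `path` is the repo root or sits inside it
isInsideRepo <- function(path, repoRoot = getwd()) {
  path <- normalizePath(path, winslash = "/", mustWork = FALSE)
  repoRoot <- normalizePath(repoRoot, winslash = "/", mustWork = FALSE)
  # Trailing slashes stop "repo2" being mistaken for a child of "repo"
  startsWith(paste0(path, "/"), paste0(repoRoot, "/"))
}

# Hypothetical loader: refuses to read data that is stored inside the repo
loadExternalData <- function(fileName, dataPath, repoRoot = getwd()) {
  if (isInsideRepo(dataPath, repoRoot)) {
    stop("dataPath is inside the repo: keep the data somewhere else")
  }
  read.csv(file.path(dataPath, fileName))
}
```

A call such as `loadExternalData("myData.csv", dataPath = "~/data/myProject")` would then read the file only when that folder is outside the repo. The file and folder names here are placeholders.
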