Commit f60b0116 authored by Ben Anderson

Update README.md

parent e3c3fd8a
# dataCleaning
A place to store hints, tips and examples for data cleaning. We use a lot of very dirty data, which often has outliers and missing observations. Since most of this data is large-scale 'sensor' data with time stamps, we make a lot of use of these R packages:
* [data.table](https://rdatatable.gitlab.io/data.table/)
* [lubridate](https://lubridate.tidyverse.org/)
* [hms](https://hms.tidyverse.org/)
* [ggplot2](https://ggplot2.tidyverse.org/)'s [geom_tile()](https://ggplot2.tidyverse.org/reference/geom_tile.html) with the date on the x axis, time of day on the y axis and 'fill' set to the sensor value that _should_ be there. This shows up non-random (and random) data holes like [these](https://git.soton.ac.uk/SERG/datacleaning/-/blob/master/rmd/cleaningFeederData_files/figure-latex/missingVis-1.pdf) very nicely.
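
As a rough illustration of how these packages fit together, here is a minimal sketch of the tile plot described above. It is not taken from this repo: the data is synthetic, and the names `dt`, `dateTime`, `value`, `obsDate` and `obsTime` are made up for the example. It assumes half-hourly readings in UTC.

```r
library(data.table)
library(lubridate)
library(hms)
library(ggplot2)

# Hypothetical data: one week of half-hourly sensor readings (synthetic)
set.seed(42)
dt <- data.table(
  dateTime = seq(
    from = ymd_hms("2020-01-01 00:00:00", tz = "UTC"),
    by = "30 min",
    length.out = 48 * 7
  )
)
dt[, value := rnorm(.N, mean = 100, sd = 10)]

# Create a non-random hole (one afternoon) and a few random missing observations
dt[as_date(dateTime) == as_date("2020-01-03") & hour(dateTime) %in% 12:17,
   value := NA_real_]
dt[sample(nrow(dt), 10), value := NA_real_]

# Split the time stamp into date (x axis) and time of day (y axis)
dt[, obsDate := as_date(dateTime)]
dt[, obsTime := as_hms(dateTime)]

# Count missing observations per day as a quick numeric check
missingByDate <- dt[, .(nMissing = sum(is.na(value))), by = obsDate]

# Tile plot: missing values are drawn in red so the holes stand out
p <- ggplot(dt, aes(x = obsDate, y = obsTime, fill = value)) +
  geom_tile() +
  scale_fill_viridis_c(na.value = "red") +
  labs(x = "Date", y = "Time of day", fill = "Sensor value")
print(p)
```

In this sketch the afternoon of 2020-01-03 should appear as a solid red block (a non-random hole), while the randomly removed readings show up as scattered red tiles. Real data would replace the synthetic `dt`, and the `na.value` colour is just one way to make the gaps visible.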
This repo is an R package. This means:
......