Commit f60b0116 authored by Ben Anderson

Update README.md

parent e3c3fd8a
# dataCleaning
A place to store hints, tips and examples for data cleaning. We use a lot of very dirty data which often has outliers and missing observations. Since most of this data is large-scale 'sensor' data with time stamps, we make a lot of use of these R packages:
* [data.table](https://rdatatable.gitlab.io/data.table/)
* [lubridate](https://lubridate.tidyverse.org/)
* [hms](https://hms.tidyverse.org/)
* [ggplot2](https://ggplot2.tidyverse.org/)'s [geom_tile()](https://ggplot2.tidyverse.org/reference/geom_tile.html) with the date on the x axis, time of day on the y axis and 'fill' set to the sensor value that _should_ be there. This shows up non-random (and random) data holes like [these](https://git.soton.ac.uk/SERG/datacleaning/-/blob/master/rmd/cleaningFeederData_files/figure-latex/missingVis-1.pdf) very nicely.
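
As a rough illustration of how these packages fit together, here is a minimal sketch of the tile plot described above. It is not taken from this repo: the data is synthetic, and the column names (`dateTime`, `value`, `obsDate`, `obsTime`), the half-hourly interval and the 12 hour gap are all assumptions made up for the example.

```r
library(data.table)
library(lubridate)
library(hms)
library(ggplot2)

# Hypothetical half-hourly sensor readings covering one week (synthetic data)
set.seed(42)
dt <- data.table(
  dateTime = seq(
    from = ymd_hms("2020-01-01 00:00:00", tz = "UTC"),
    by = "30 min",
    length.out = 48 * 7
  )
)
dt[, value := rnorm(.N, mean = 100, sd = 10)]

# Blank out a 12 hour block to mimic a non-random data hole
holeStart <- ymd_hms("2020-01-03 06:00:00", tz = "UTC")
holeEnd <- ymd_hms("2020-01-03 18:00:00", tz = "UTC")
dt[dateTime >= holeStart & dateTime < holeEnd, value := NA_real_]

# Split the time stamp into a date (x axis) and a time of day (y axis)
dt[, obsDate := as_date(dateTime)]
dt[, obsTime := as_hms(dateTime)]

# One tile per observation, filled by the sensor value
p <- ggplot(dt, aes(x = obsDate, y = obsTime, fill = value)) +
  geom_tile() +
  labs(x = "Date", y = "Time of day", fill = "Sensor value")

print(p)
```

In this sketch the missing block is stored as `NA`, so it is drawn in ggplot2's default `na.value` colour. If the rows were absent from the data altogether, the hole would instead show as empty plot background. Either way the gap stands out as a contiguous patch on one date.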
This repo is an R package. This means:
......