Commit 6f217823 authored by Ben Anderson

Added further data cleaning, which slightly adjusts the initial plots and results. Results based on the random re-sample will no longer match the paper, due to random processes.
parent 03d64ee0
Ben Anderson,ben,ou029107.otago.ac.nz,15.11.2018 14:52,file:///Users/ben/Library/Application%20Support/LibreOffice/4;
\ No newline at end of file
Ben Anderson,ben,octomac.local,28.11.2018 17:27,file:///Users/ben/Library/Application%20Support/LibreOffice/4;
\ No newline at end of file
@@ -160,8 +160,21 @@ if(file.exists(heatPumpData)){
stop()
}
```
```{r rawTable}
gst <- summary(gsDT)
knitr::kable(caption = "Summary of loaded grid spy data", gst)
```
Notice that there are negawatts! Remove rf_46 and all negative values as per https://cfsotago.github.io/GREENGridData/gridSpy1mOutliersReport_v1.0.html
```{r cleanGSdata}
# remove rf_46 and all negative values as per https://cfsotago.github.io/GREENGridData/gridSpy1mOutliersReport_v1.0.html
gsDT <- gsDT[linkID != "rf_46" & powerW >= 0]
gsDT <- gsDT[, month := lubridate::month(r_dateTime)]
gsDT <- gsDT[, year := lubridate::year(r_dateTime)]
@@ -174,6 +187,14 @@ gsDT <- gsDT[tmpM >= 9 & tmpM <= 11, season := "Spring"]
# re-order to make sense
gsDT <- gsDT[, season := factor(season, levels = c("Spring", "Summer", "Autumn", "Winter"))]
knitr::kable(caption = "Summary of cleaned grid spy data", summary(gsDT))
nHH <- uniqueN(gsDT$linkID)
```
Number of households in cleaned heatpump data: `r nHH`
```{r load household data}
# Load GREEN Grid household data
if(file.exists(ggHHData)){
message("Loading: ", ggHHData )
@@ -183,20 +204,6 @@ if(file.exists(ggHHData)){
stop()
}
knitr::kable(caption = "Summary of grid spy data", gst)
# there are negawatts!
gsDT <- gsDT[, negW := "PosW"]
gsDT <- gsDT[ powerW < 0, negW := "NegaW"]
t <- table(gsDT$linkID,gsDT$negW)
t
prop.table(t)
```
```{r dataPrep}
hhDT <- hhDT[Q57 == 1, nPeople := "1"]
hhDT <- hhDT[Q57 == 2, nPeople := "2"]
hhDT <- hhDT[Q57 == 3, nPeople := "3"]
@@ -209,7 +216,7 @@ testDT <- gsDT[lubridate::hour(r_dateTime) > 15 & # 16:00 ->
lubridate::hour(r_dateTime) < 20 & # <- 20:00
lubridate::wday(r_dateTime) != 6 & # not Saturday
lubridate::wday(r_dateTime) != 7 & # not Sunday
- year == 2015 & negW == "PosW", # remove the negawatts (see https://github.com/CfSOtago/GREENGridData/issues/6)
+ year == 2015,
.(meanW = mean(powerW, na.rm = TRUE),
sdW = sd(powerW, na.rm = TRUE)), keyby = .(season, linkID)]
setkey(testDT, linkID)