re-arranged plots so clear which model is being used; saves out plots without...

re-arranged plots so clear which model is being used; saves out plots without model labels for papers etc

re-arranged plots so clear which model is being used; saves out plots without...
19362e34 · Ben Anderson · 026da5f0 · 19362e34
Commit 19362e34 authored 4 years ago by Ben Anderson
--- a/impetusModel/computeHistoricalMonthly.Rmd
+++ b/impetusModel/computeHistoricalMonthly.Rmd
-This section takes the two (or more) different baseline household consumption datasets and uses them to construct a synthetic household level dataset of estimated micro-component values for each month and year from `r min(metDataDT$year)` to `r max(metDataDT$year)` based on the known metoffice data imported earlier.
+This section takes the two (or more) different baseline household consumption datasets and uses them to construct a synthetic household level dataset of estimated micro-component values for each month and year from `r min(metDataDT$year)` to `r max(metDataDT$year)` based on the known metoffice climate data imported earlier.

 The approach used is common to both models v1 and v2 and requires input data with the following variables:

@@ -466,13 +466,37 @@ myPlot <- ggplot(dt, aes(x = factor(currMon), group = metered)
       caption = myCaption)

 myPlot
+
+# refined plot just for model v2
+plotDT <- dt[model == "v2_0"]
+myPlot <- ggplot(plotDT, aes(x = factor(currMon), linetype = metered, 
+                         group = metered)
+                 ) +
+  #geom_boxplot()  +
+  geom_line(aes(y = Basin, colour = "Basin")) +
+  geom_line(aes(y = Bath, colour = "Bath")) +
+  geom_line(aes(y = KitchenSink, colour = "Kitchen Sink")) +
+  geom_line(aes(y = Dishwasher, colour = "Dishwasher")) +
+  geom_line(aes(y = External, colour = "External")) +
+  geom_line(aes(y = Shower, colour = "Shower")) +
+  geom_line(aes(y = WC, colour = "WC")) +
+  geom_line(aes(y = WashingMachine, colour = "Washing Machine")) +
+  theme(legend.title = element_blank()) +
+  theme(legend.position = "bottom") +
+  labs(y = "Mean litres/day",
+       x = "Month",
+       title = myTitle,
+       caption = myCaption)
+
+myPlot
+
 ```

 As we can see most consumption patterns follow the expected cycles with the exception of Dishwashers in metered households which are shown as constant. This is because none of Parker's monthly, sunshine or mean temperature regression coefficients for Dishwashers were statistically significant.

-### Yearly patterns
+### Yearly patterns

-As a final test we calculate the litres/day/household by year and comapre this with the per capita consumption. As we can see the baseline estimation model has produced plausible estimates of both daily household and daily per capita consumption over time.
+As a final test we calculate the litres/day/household by year and compare this with the per capita consumption. As we can see the baseline estimation model has produced plausible estimates of both daily household and daily per capita consumption over time.

 ```{r testYearlyLitresPerDay}
 t <- hhFinalDataComboExpandedDT[,
@@ -556,375 +580,14 @@ myPlot
 myTitle <- "All uses: mean daily total"
 myPlot <- ba_IMPETUSmakeYearlyPlot(hhFinalDataComboExpandedDT, "sumDaily.baseline.madj")
 myPlot
-```
-
-> Do these look mostly sensible? How could we validate them?
-
-## Add baseline WE uptake
-
-This has to be done after the expansion to months as it is a monthly uptake model. how this works:
-
- * Add baseline WE uptake rates of dual flush WC & low flow shower head uptake and adjust l/day/hh for WCs and showers accordingly
-
-### Backwards estimation of water efficiency uptake
-The first step is to allocate households to dual-flush/no dual-flush status and low-flow/no low-flow shower heads for all years. The logic applied is as follows:
-
-  * dual-flush/no dual-flush WC:
-    * linearly interpolate rate for a given year and month using the 2% yearly adoption rate
-    * apply these rates randomly to households. We acknowledge that those with meters may be more likely to also have low-flush WCs but we do not model this association at this time.
-  * low-flow/normal flow shower head:
-    * as for WC but stratified by top & bottom 50% shower litres/day within household size
-
-> somehow this process needs to take account of the 'existing' uptake levels in 2011 in model v2 (based on SPRG survey data)
-
-```{r Backcast WE uptake rates}
-# start with a collapsed year/month spine
-
-hhFinalDataComboExpandedDT <- hhFinalDataComboExpandedDT[, obsDate := ymd(paste0(currYear, "-", currMon, "-15"))]  # make it the middle
-
-yearMonthSpineDT <- hhFinalDataComboExpandedDT[, .(nObs = .N), by = .(currYear, currMon, obsDate)]
-
-
-minYear <- min(yearMonthSpineDT$currYear)
-maxYear <- max(yearMonthSpineDT$currYear)
-
-# reduce occupancy levels to 1->5, 6+ for this purpose
-hhFinalDataComboExpandedDT <- hhFinalDataComboExpandedDT[, occRed := ifelse(occupancy >= 6,
-                                                                                  6,
-                                                                                  occupancy)
-                                                               ]
-
-hhFinalDataComboExpandedDT$loWC <- 0
-hhFinalDataComboExpandedDT$loSh <- 0
-
-minOcc <- min(hhFinalDataComboExpandedDT$occRed)
-maxOcc <- max(hhFinalDataComboExpandedDT$occRed)
-
-wcCollectorDT <- NULL
-shCollectorDT <- NULL
-
-
-models <- c("v1_3","v2_0") # by model (as usage distributions vary)
-for(mod in models){
-  countYear = 0 # reset for the start of each model
-  for(y in maxYear:minYear){
-    print(paste0("Updating water efficiency rates for model: ",mod," in ",y))
-    for(m in 12:1){ # run backwards so we can work backwards from the 2011 rates
-      yearMonthSpineDT <- yearMonthSpineDT[currYear == y & currMon == m, 
-                                           dfWCRate := estdfRate - (countYear * (dfWCAdopt/12))]
-      yearMonthSpineDT <- yearMonthSpineDT[currYear == y & currMon == m, 
-                                           lfShowerRate := estlfRate - (countYear * (lfShowerAdopt/12))]
-      countYear <- countYear + 1
-      
-      monthlyWCRate <- yearMonthSpineDT[currYear == y & currMon == m, dfWCRate]
-      monthlyShRate <- yearMonthSpineDT[currYear == y & currMon == m, lfShowerRate]
-      
-      tempDT <- hhFinalDataComboExpandedDT[currYear == y & currMon == m & model == mod,]
-      if(nrow(tempDT) > 0){ # check we have household data for this combination of month, year & model
-        #print(paste0("Model: ", mod , " Year: ", y , " Month: ", m , " N rows: ", nrow(tempDT)))
-        # for WCs
-        #print(paste0("monthlyWCRate: ", monthlyWCRate))
-        # sample households within model/year/month according to the monthly rate
-        # use sample_n as it randomly selects n, use floor to force rounding so n = an integer
-        dualFlushDT <- as.data.table(sample_n(tempDT[, 
-                                                     .(currYear,currMon,obsDate, hhid, model)], 
-                                              floor(nrow(tempDT)*monthlyWCRate)
-        )
-        )
-        #print(paste0("WC: Selected ", nrow(dualFlushDT), " using monthlyWCRate = ", round(monthlyWCRate,3)))
-        dualFlushDT <- dualFlushDT[, dualFlushWC := "Dual flush"]
-        
-        wcCollectorDT <- rbind(wcCollectorDT, dualFlushDT) # add to the data collector
-        
-        # same for showers
-        showerCut <- tempDT[, median(Shower.baseline.madj, na.rm = TRUE)]
-        # sample households within model/year/month according to the monthly rate
-        # use sample_n as it randomly selects n, use floor to force rounding so n = an integer
-        lowFlowShDT <- as.data.table(sample_n(tempDT[, 
-                                                     .(currYear,currMon,obsDate, hhid, model)], 
-                                              floor(nrow(tempDT)*monthlyShRate)
-        )
-        )
-        #print(paste0("Shower: Selected ", nrow(lowFlowShDT), " using monthlyShRate = ", round(monthlyShRate,3)))
-        lowFlowShDT <- lowFlowShDT[, loFlowShower := "Low flow"]
-        shCollectorDT <- rbind(shCollectorDT, lowFlowShDT) # add to the data collector
-      }
-    }
-  }
-}
-
-wcCollectorDT <- as.data.table(wcCollectorDT)
-setkey(wcCollectorDT, currYear,currMon,hhid,model,obsDate)
-setkey(hhFinalDataComboExpandedDT, currYear,currMon,hhid,model,obsDate)
-# merge the dualFlushWC indicator back on to main data file
-hhFinalDataComboExpandedDT <- merge(hhFinalDataComboExpandedDT, wcCollectorDT, all.x = TRUE) # keep all records
-# set NA to 0 (not low flush/flow)
-hhFinalDataComboExpandedDT <- hhFinalDataComboExpandedDT[,dualFlushWC := ifelse(is.na(dualFlushWC), "Single flush", dualFlushWC)]
-          
-# repeat for showers
-shCollectorDT <- as.data.table(shCollectorDT)
-setkey(shCollectorDT, currYear,currMon,hhid,model,obsDate)
-setkey(hhFinalDataComboExpandedDT, currYear,currMon,hhid,model,obsDate)
-# merge the low flow shower indicator back on to main data file
-hhFinalDataComboExpandedDT <- merge(hhFinalDataComboExpandedDT, shCollectorDT, all.x = TRUE) # keep all records
-hhFinalDataComboExpandedDT <- hhFinalDataComboExpandedDT[, 
-                                                                 loFlowShower := ifelse(is.na(loFlowShower), "Normal flow", loFlowShower)]
-
-#summary(yearMonthSpineDT)
-```
-
-```{r check uptake rates}
-# check imputed update rates
-ggplot(yearMonthSpineDT, aes(x=obsDate)) +
-  geom_line(aes(y = 100*dfWCRate, colour = "Dual flush WC")) +
-  geom_line(aes(y = 100*lfShowerRate, colour = "Low flow shower")) +
-  theme(legend.title = element_blank()) +
-  labs(title = "Modelled uptake rates",
-       y = "%",
-       x = "Date") 
-
-# check imputed uptake
-dt <- hhFinalDataComboExpandedDT[, .(nHHs = .N,
-                                         pcHH = 100*(.N/1800)
-                                         ),
-                                     by=.(obsDate, model, dualFlushWC)]
-ggplot(dt, aes(x=obsDate, colour = factor(dualFlushWC))) +
-  geom_line(aes(y = pcHH)) +
-  theme(legend.title = element_blank()) +
-  facet_grid(model ~ .) +
-  labs(title = "Dual Flush WCs: Modelled household uptake",
-       y = "%",
-       x = "Date") 
-
-# check imputed uptake
-dt <- hhFinalDataComboExpandedDT[, .(nHHs = .N,
-                                         pcHH = 100*(.N/1800)
-                                         ), 
-                                     by=.(obsDate, model, loFlowShower)]
-
-ggplot(dt, aes(x=obsDate, colour = factor(loFlowShower))) +
-  geom_line(aes(y = pcHH)) +
-  theme(legend.title = element_blank()) +
-  facet_grid(model ~ .) +
-  labs(title = "Low flow showers: Modelled household uptake",
-       y = "%",
-       x = "Date") 
-```

-Next we use the water use reduction values given at the start of the section to update the l/day/hh for those who have the dual flush WCs or low flow showers.
-
-```{r update baseline water consumption}
-# wc ----
-hhFinalDataComboExpandedDT <- hhFinalDataComboExpandedDT[, WC.baseline.madj.we := ifelse(dualFlushWC == "Dual flush", WC.baseline.madj * dfWCReduction, WC.baseline.madj)
-                                                               ]
-
-# shower ----
-hhFinalDataComboExpandedDT <- hhFinalDataComboExpandedDT[, Shower.baseline.madj.we := ifelse(loFlowShower == "Low flow", Shower.baseline.madj * lfShowerReduction, Shower.baseline.madj)
-                                                               ]
-
-# update total ----
-hhFinalDataComboExpandedDT <- hhFinalDataComboExpandedDT[, sumDaily.baseline.madj.we := Basin.baseline.madj + Bath.baseline.madj + Dishwasher.baseline.madj + External.baseline.madj + KitchenSink.baseline.madj + Shower.baseline.madj.we + WC.baseline.madj.we + WashingMachine.baseline.madj]
+# refine the last one for just Model 2
+plotDT <- hhFinalDataComboExpandedDT[model == "v2_0"]
+m2Plot <- ba_IMPETUSmakeYearlyPlot(plotDT, "sumDaily.baseline.madj")
+m2Plot
 ```

-Now redraw the charts of WC & shower use over time following these adjustments.
-
-```{r plot we adjusted shower and WC use}
-plotDT <- hhFinalDataComboExpandedDT[, .(meanSh = mean(Shower.baseline.madj, na.rm = TRUE),
-                                            meanShWE = mean(Shower.baseline.madj.we, na.rm = TRUE),
-                                            meanWC = mean(WC.baseline.madj, na.rm = TRUE),
-                                            meanWCWE = mean(WC.baseline.madj.we, na.rm = TRUE)
-                                            ),
-                                        by = .(obsDate, model)]
-
-ggplot(plotDT[model == "v1_3"], aes(x = obsDate,)) +
-  geom_line(aes(y = meanSh, colour = "Shower")) + 
-  geom_line(aes(y = meanShWE, colour = "Shower with modeled low flow shower uptake")) +
-  facet_grid(model ~ .) +
-  theme(legend.title = element_blank()) +
-  theme(legend.position = "bottom") +
-  labs(title = "Low flow showers: impact of modeled uptake (model 1.3)",
-       y = "Mean l/hh/day",
-       x = "Date")
-
-ggplot(plotDT[model == "v2_0"], aes(x = obsDate,)) +
-  geom_line(aes(y = meanSh, colour = "Shower")) + 
-  geom_line(aes(y = meanShWE, colour = "Shower with modeled low flow shower uptake")) +
-  facet_grid(model ~ .) +
-  theme(legend.title = element_blank()) +
-  theme(legend.position = "bottom") +
-  labs(title = "Low flow showers: impact of modeled uptake (model 2.0)",
-       y = "Mean l/hh/day",
-       x = "Date")
-
-ggplot(plotDT[model == "v1_3"], aes(x = obsDate,)) +
-  geom_line(aes(y = meanWC, colour = "WC")) + 
-  geom_line(aes(y = meanWCWE, colour = "WC with modeled dual flush uptake")) +
-  facet_grid(model ~ .) +
-  theme(legend.title = element_blank()) +
-  theme(legend.position = "bottom") +
-  labs(title = "Dual flush WCs: impact of modeled uptake (model 1.3)",
-       y = "Mean l/hh/day",
-       x = "Date")
-
-ggplot(plotDT[model == "v2_0"], aes(x = obsDate,)) +
-  geom_line(aes(y = meanWC, colour = "WC")) + 
-  geom_line(aes(y = meanWCWE, colour = "WC with modeled dual flush uptake")) +
-  facet_grid(model ~ .) +
-  theme(legend.title = element_blank()) +
-  theme(legend.position = "bottom") +
-  labs(title = "Dual flush WCs: impact of modeled uptake (model 2.0)",
-       y = "Mean l/hh/day",
-       x = "Date")
-```
-
-Well that seems to have an effect!
-
-Now re-draw final charts for papers - one chart per model.
-
-```{r final model v1_3 2012 by month}
-myTitle <- "All uses (2012 only)"
-myCaption <- "IMPETUS model: synthetic households (n = 1800 per month)\nModel v1\nBaseline water efficiency uptake"
-
-dt <- hhFinalDataComboExpandedDT[currYear == 2012 & model == "v1_3", .(Basin = mean(Basin.baseline.madj),
-                                 Bath = mean(Bath.baseline.madj),
-                                 KitchenSink = mean(KitchenSink.baseline.madj),
-                                 Dishwasher = mean(Dishwasher.baseline.madj),
-                                 External = mean(External.baseline.madj),
-                                 Shower = mean(Shower.baseline.madj.we),
-                                 WC = mean(WC.baseline.madj.we),
-                                 WashingMachine = mean(WashingMachine.baseline.madj)
-                                 ), by = .(metered, currMon)]
-# recast dt to make plotting easier
-plotDT <- melt(dt, id.vars = c("currMon", "metered"))
-plotDT <- plotDT[, Usage := variable]
-
-myPlot <- ggplot(plotDT, aes(x = factor(currMon), y = value, group = Usage)
-                 ) +
-  geom_line(aes(colour = Usage)) +
-  facet_grid(. ~ metered, scales = "free") +
-  theme(legend.title = element_blank()) +
-  theme(legend.position = "bottom") +
-  labs(y = "Mean litres/day",
-       x = "Month",
-       title = myTitle,
-       caption = myCaption)
-
-myPlot
-
-# version with linetype for use in bw fig
-myPlot <- ggplot(plotDT, aes(x = factor(currMon), y = value, group = Usage)
-                 ) +
-  geom_line(aes(linetype = Usage)) +
-  facet_grid(. ~ metered, scales = "free") +
-  theme(legend.title = element_blank()) +
-  theme(legend.position = "bottom") +
-  labs(y = "Mean litres/day",
-       x = "Month",
-       caption = myCaption)
-
-myPlot
-
-# Grey scale version if required
-myPlot <- myPlot + theme_bw()
-
-# Figure for IWA Bath final paper (http://ws.iwaponline.com/content/early/2018/02/13/ws.2018.035)
-ggsave("Fig3_Final_model_v1_3_2012_by_month.pdf", plot = myPlot, dpi = 400)
-
-```
-
-```{r final model v1_3 2012 all dates}
-myTitle <- "All uses (all years)"
-myCaption <- "IMPETUS model: synthetic households (n = 1800 per month)\nModel v1\nBaseline water efficiency uptake"
-
-dt <- hhFinalDataComboExpandedDT[model == "v1_3", .(Basin = mean(Basin.baseline.madj),
-                                 Bath = mean(Bath.baseline.madj),
-                                 KitchenSink = mean(KitchenSink.baseline.madj),
-                                 Dishwasher = mean(Dishwasher.baseline.madj),
-                                 External = mean(External.baseline.madj),
-                                 Shower = mean(Shower.baseline.madj.we),
-                                 WC = mean(WC.baseline.madj.we),
-                                 WashingMachine = mean(WashingMachine.baseline.madj)
-                                 ), by = .(metered, obsDate)]
-
-myPlot <- ggplot(dt, aes(x = obsDate, group = metered)
-                 ) +
-  #geom_boxplot()  +
-  geom_line(aes(y = Basin, colour = "Basin")) +
-  geom_line(aes(y = Bath, colour = "Bath")) +
-  geom_line(aes(y = KitchenSink, colour = "Kitchen Sink")) +
-  geom_line(aes(y = Dishwasher, colour = "Dishwasher")) +
-  geom_line(aes(y = External, colour = "External")) +
-  geom_line(aes(y = Shower, colour = "Shower")) +
-  geom_line(aes(y = WC, colour = "WC")) +
-  geom_line(aes(y = WashingMachine, colour = "Washing Machine")) +
-  facet_grid(. ~ metered, scales = "free") +
-  theme(legend.title = element_blank()) +
-  theme(legend.position = "bottom") +
-  labs(y = "Mean litres/day",
-       x = "Month",
-       title = myTitle,
-       caption = myCaption)
-
-myPlot
-```
-
-```{r test model v1 total}
-myTitle <- "Total: mean daily total"
-myCaption <- "Baseline with water efficiency"
-myPlot <- ba_IMPETUSmakeYearMonthPlot(hhFinalDataComboExpandedDT[model == "v1_3"], "sumDaily.baseline.madj.we")
-myPlot
-```
-
-```{r test model v1 by component}
-myTitle <- "External: mean daily total"
-myCaption <- "Baseline with water efficiency"
-myPlot <- ba_IMPETUSmakeYearMonthPlot(hhFinalDataComboExpandedDT[model == "v1_3"], "External.baseline.madj")
-myPlot
-
-myTitle <- "Shower: mean daily total"
-myCaption <- "Baseline with water efficiency"
-myPlot <- ba_IMPETUSmakeYearMonthPlot(hhFinalDataComboExpandedDT[model == "v1_3"], "Shower.baseline.madj.we")
-myPlot
-
-myTitle <- "WC: mean daily total"
-myCaption <- "Baseline with water efficiency"
-myPlot <- ba_IMPETUSmakeYearMonthPlot(hhFinalDataComboExpandedDT[model == "v1_3"], "WC.baseline.madj.we")
-myPlot
-
-myTitle <- "Dishwasher: mean daily total"
-myCaption <- "Baseline with water efficiency"
-myPlot <- ba_IMPETUSmakeYearMonthPlot(hhFinalDataComboExpandedDT[model == "v1_3"], "Dishwasher.baseline.madj")
-myPlot
-
-myTitle <- "Bath: mean daily total"
-myCaption <- "Baseline with water efficiency"
-myPlot <- ba_IMPETUSmakeYearMonthPlot(hhFinalDataComboExpandedDT[model == "v1_3"], "Bath.baseline.madj")
-myPlot
-
-myTitle <- "Washing machine: mean daily total"
-myCaption <- "Baseline with water efficiency"
-myPlot <- ba_IMPETUSmakeYearMonthPlot(hhFinalDataComboExpandedDT[model == "v1_3"], "WashingMachine.baseline.madj")
-myPlot
-
-myTitle <- "Kitchen sink: mean daily total"
-myCaption <- "Baseline with water efficiency"
-myPlot <- ba_IMPETUSmakeYearMonthPlot(hhFinalDataComboExpandedDT[model == "v1_3"], "KitchenSink.baseline.madj")
-myPlot
-
-```
-
-
-```{r final model v2_0}
-```
-
-## Extract baseline hot water volumes (for BECC 2017 paper)
-
-
-Use: hhFinalDataComboExpandedDT
-
-```{r summary of final baseline data}
-summary(hhFinalDataComboExpandedDT)
-```
+> Do these look mostly sensible? How could we validate them?


 ```{r endComputeHistoricalMonthly}