diff --git a/MTUS-W6-adult-survey-data-processing.Rmd b/MTUS-W6-adult-survey-data-processing.Rmd
index 5baeac804ef89e48cbfc0d30412fcc598f0dbbd8..4b4ae191d53c5c692b13186e1212f94e0360bf35 100644
--- a/MTUS-W6-adult-survey-data-processing.Rmd
+++ b/MTUS-W6-adult-survey-data-processing.Rmd
@@ -289,6 +289,8 @@ print("-> Creating new ba_survey variable to pool 1983/7")
 
 # Save out processed file
 
+This includes duplicate records where the respondent completed more than one diary-day. As an indicator, there are `r uniqueN(MTUSW6UKsurvey_DT$ba_pid)` unique respondents but `r uniqueN(MTUSW6UKsurvey_DT$ba_diarypid)` records. Most importantly it means that `propwt` is not necessarily constant within ba_pid as it is a per-diary-day individual level weight.
+
 ```{r saveSurveyFile}
 # Keep the survey vars we need ----
 print("-> Keeping core survey variables")
@@ -346,7 +348,7 @@ kable(caption = "Number of diary days by survey",
 
 As we can see 1974-1987 were full week diaries. 2001 was a two day diary and 1995/2005 were one-day dairies.
 
-From this point on in this section we use only unique individual records. Note that results do not necessarily match the number of cases recorded in the [MTUS user guide](http://www.timeuse.org/MTUS-User-Guide.html) as the user guide includes all cases (i.e. both adults and children).
+From this point on in this section we use only unique individual records to avoid duplicates where more than 1 diary day was completed. Note that results do not necessarily match the number of cases recorded in the [MTUS user guide](http://www.timeuse.org/MTUS-User-Guide.html) as the user guide includes all cases (i.e. both adults and children).
 
 ```{r createUniqueAdults}
 setkey(MTUSW6UKsurveyCore_DT, ba_pid)
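
Review note: the text added in the first hunk says that `propwt` may vary within `ba_pid` because it is a per-diary-day weight. The sketch below shows one way that claim could be checked before collapsing to unique individuals. It runs on a small invented table, `toy_DT`, not on `MTUSW6UKsurvey_DT`; only the column names `ba_pid`, `ba_diarypid` and `propwt` come from the diff, and all values and the other object names are hypothetical.

```r
library(data.table)

# Toy stand-in for the survey table: all values are invented for illustration
toy_DT <- data.table(
  ba_pid      = c("p1", "p1", "p2", "p3", "p3"),
  ba_diarypid = c("p1_1", "p1_2", "p2_1", "p3_1", "p3_2"),
  propwt      = c(0.9, 1.1, 1.0, 1.2, 1.2)
)

# Same comparison as the inline R in the added paragraph:
# unique respondents versus unique diary-day records
n_respondents <- uniqueN(toy_DT$ba_pid)
n_records     <- uniqueN(toy_DT$ba_diarypid)

# For each respondent, count diary days and distinct weight values
wt_check <- toy_DT[, .(n_days = .N, n_weights = uniqueN(propwt)), by = ba_pid]

# Respondents whose weight differs across their diary days
varying_pids <- wt_check[n_weights > 1, ba_pid]

# One row per respondent (keeps the first diary day for each ba_pid),
# so the retained propwt is that first day's weight only
unique_DT <- unique(toy_DT, by = "ba_pid")
```

If `varying_pids` is non-empty on the real data, the `propwt` value carried into the unique-individual table depends on which diary-day row is kept, which may be worth stating alongside the `createUniqueAdults` chunk.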