Commit 60387128 authored by Ben Anderson

updated package names & general tidying; introduced proportions stub

parent 25ca79ab
File added
^.*\.Rproj$
^\.Rproj\.user$
Package: packageTemplate
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R:
    person(given = "First",
           family = "Last",
           role = c("aut", "cre"),
           email = "first.last@example.com")
Description: What the package does (one paragraph).
License: What license it uses
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.0
# Generated by roxygen2: do not edit by hand
export(estimateEffectSizes)
import(data.table)
import(pwr)
import(reshape2)
#--- Sample power related functions ---#
#' Estimate detectable effect sizes using statistical power analysis
#'
#' \code{estimateEffectSizes} calculates detectable effect sizes for a given set of p values and sample sizes.
#'
#' Returns a long-form data.table of detectable effect sizes (as a % of the mean) for each sample size. Calculates these for p = 0.01, 0.05, 0.1 & 0.2. Pick out the ones you want.
#'
#' @param mean the estimated mean value to use
#' @param sd the estimated standard deviation to use
#' @param samples a list of sample sizes to iterate over
#' @param power power value to use
#'
#' @author Ben Anderson, \email{b.anderson@@soton.ac.uk}
#' @export
#' @import data.table
#' @import reshape2
#' @family Power functions
estimateEffectSizes <- function(mean,sd,samples,power){
  # obtain effect sizes using supplied mean & sd
  sigs <- c(0.01,0.05,0.1,0.2) # force these, can always remove later
  nSigs <- length(sigs)
  nSamps <- length(samples)
  # initialise power results array
  resultsArray <- array(numeric(nSamps*nSigs),
                        dim=c(nSamps,nSigs)
  )
  # loop over significance values
  for (p in seq_len(nSigs)){
    for (s in seq_len(nSamps)){ # loop over the sample sizes
      result <- power.t.test( # pwr.t.test?
        n = samples[s],
        delta = NULL,
        sd = sd,
        sig.level = sigs[p],
        power = power,
        alternative = c("one.sided")
      )
      resultsArray[s,p] <- result$delta/mean # report effect size against sample size
    }
  }
  dt <- data.table::as.data.table(resultsArray) # convert to dt for tidying
  dt <- dt[,
           .(
             sampleN = samples, # use the function argument, not the global testSamples
             "p = 0.01" = 100*V1, # "Detectable % effect (p = 0.01)"
             "p = 0.05" = 100*V2, # "Detectable % effect (p = 0.05)"
             "p = 0.1" = 100*V3, # "Detectable % effect (p = 0.1)"
             "p = 0.2" = 100*V4 # "Detectable % effect (p = 0.2)"
           )
           ]
  longDT <- data.table::as.data.table(reshape2::melt(dt, id=c("sampleN")))
  longDT <- data.table::setnames(longDT, "value", "effectSize")
  longDT <- data.table::setnames(longDT, "variable", "pValue")
  return(longDT) # return the tidied & long form dt
}
#' Estimate proportion margins of error using statistical power analysis
#'
#' \code{estimateProportions} calculates required sample sizes for a given set of p values and samples.
#'
#' Returns a data.table of effect sizes for a given sample size. Calculates these for p = 0.01, 0.05, 0.1 & 0.2. Pick out the ones you want.
#'
#' @param mean the estimated mean value to use
#' @param sd the estimated standard deviation to use
#' @param samples a list of sample sizes to iterate over
#' @param power power value to use
#'
#' @author Ben Anderson, \email{b.anderson@@soton.ac.uk}
#' @export
#' @import data.table
#' @import reshape2
#' @import pwr
#' @family Power functions
# NOTE: stub; the body is still the effect size code copied from estimateEffectSizes
estimateProportions <- function(mean,sd,samples,power){
  # obtain effect sizes using supplied mean & sd
  sigs <- c(0.01,0.05,0.1,0.2) # force these, can always remove later
  nSigs <- length(sigs)
  nSamps <- length(samples)
  # initialise power results array
  resultsArray <- array(numeric(nSamps*nSigs),
                        dim=c(nSamps,nSigs)
  )
  # loop over significance values
  for (p in seq_len(nSigs)){
    for (s in seq_len(nSamps)){ # loop over the sample sizes
      # pwr.t.test?
      result <- power.t.test(
        n = samples[s],
        delta = NULL,
        sd = sd,
        sig.level = sigs[p],
        power = power,
        alternative = c("one.sided")
      )
      resultsArray[s,p] <- result$delta/mean # report effect size against sample size
    }
  }
  dt <- data.table::as.data.table(resultsArray) # convert to dt for tidying
  dt <- dt[,
           .(
             sampleN = samples, # use the function argument, not the global testSamples
             "p = 0.01" = 100*V1, # "Detectable % effect (p = 0.01)"
             "p = 0.05" = 100*V2, # "Detectable % effect (p = 0.05)"
             "p = 0.1" = 100*V3, # "Detectable % effect (p = 0.1)"
             "p = 0.2" = 100*V4 # "Detectable % effect (p = 0.2)"
           )
           ]
  longDT <- data.table::as.data.table(reshape2::melt(dt, id=c("sampleN")))
  longDT <- data.table::setnames(longDT, "value", "effectSize")
  longDT <- data.table::setnames(longDT, "variable", "pValue")
  return(longDT) # return the tidied & long form dt
}
# Analysis for a short paper on statistical power & statistical significance
And the various confusions that arise...
\ No newline at end of file
And the various confusions that arise...
Structured as an [R package](http://r-pkgs.had.co.nz/) for re-usability.
\ No newline at end of file
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/power.R
\name{estimateEffectSizes}
\alias{estimateEffectSizes}
\title{Estimate detectable effect sizes using statistical power analysis}
\usage{
estimateEffectSizes(mean, sd, samples, power)
estimateEffectSizes(mean, sd, samples, power)
}
\arguments{
\item{mean}{the estimated mean value to use}
\item{sd}{the estimated standard deviation to use}
\item{samples}{a list of sample sizes to iterate over}
\item{power}{power value to use}
\item{mean}{the estimated mean value to use}
\item{sd}{the estimated standard deviation to use}
\item{samples}{a list of sample sizes to iterate over}
\item{power}{power value to use}
}
\description{
\code{estimateEffectSizes} calculates detectable effect sizes for a given set of p values and sample sizes.
\code{estimateProportions} calculates required sample sizes for a given set of p values and samples.
}
\details{
Returns a data.table of effect sizes for a given sample size. Calculates these for p = 0.01, 0.05, 0.1 & 0.2. Pick out the ones you want.
Returns a data.table of effect sizes for a given sample size. Calculates these for p = 0.01, 0.05, 0.1 & 0.2. Pick out the ones you want.
}
\author{
Ben Anderson, \email{b.anderson@soton.ac.uk}
Ben Anderson, \email{b.anderson@soton.ac.uk}
}
\concept{Power functions}
......@@ -47,9 +47,8 @@ rmdLibs <- c("data.table", # data munching
"ggplot2", # for fancy graphs
"readr", # writing to files
"lubridate", # for today
"SAVEr", # power stats functions
"GREENGridData",
"broom", # tidy test results
"dkUtils", # local utilities
"knitr" # for kable
)
# load them
......@@ -78,7 +77,7 @@ heatPumpData <- paste0(myParams$dPath, "Heat Pump_2015-04-01_2016-03-31_observat
myParams$GGDataDOI <- "https://dx.doi.org/10.5255/UKDA-SN-853334"
plotCaption <- paste0("Source: ", myParams$GGDataDOI)
myParams$ccBY <- "includes/licenseCCBY.Rmd"
myParams$license <- "includes/licenseCCBY.Rmd"
myParams$support <- "includes/supportGeneric.Rmd"
myParams$pubLoc <- "Southampton: University of Southampton"
......@@ -101,7 +100,7 @@ cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2"
## License
```{r ccby license, child=myParams$ccBY}
```{r license, child=myParams$license}
```
## Citation
......@@ -114,9 +113,9 @@ This work is (c) `r lubridate::year(today())` the authors.
## History
Code history is generally tracked via the paper [repo](https://github.com/dataknut/powerSignificanceDesignAndDecisionMaking):
Code & report history:
* [Paper history](https://github.com/dataknut/powerSignificanceDesignAndDecisionMaking/commits/master)
* [Paper history](https://github.com/dataknut/weGotThePower/commits/master)
## Data:
......@@ -137,6 +136,7 @@ This report contains the analysis for a paper of the same name. The text is stor
# Sample design: statistical power
## Means
```{r loadGgData, include =FALSE}
dt <- data.table::as.data.table(readr::read_csv(heatPumpData, progress = FALSE))
......@@ -144,7 +144,7 @@ dt <- data.table::as.data.table(readr::read_csv(heatPumpData, progress = FALSE))
dt <- dt[, month := lubridate::month(r_dateTime)]
dt <- dt[, year := lubridate::year(r_dateTime)]
dt <- GREENGridData::addNZSeason(dt, r_dateTime)
dt <- dkUtils::addNZSeason(dt, r_dateTime)
```
......@@ -161,7 +161,8 @@ testSD <- mean(testDT[season == "Winter"]$meanW)
testSamples <- seq(50,3000,50)
testPower <- 0.8
powerRes80DT <- SAVEr::estimateEffectSizes(testMean,testSD,testSamples,testPower) # auto-produces range of p values
# use pac
powerRes80DT <- estimateEffectSizes(testMean,testSD,testSamples,testPower) # auto-produces range of p values
```
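As a rough cross-check on a single cell of the table that `estimateEffectSizes` returns, the same quantity can be computed directly with base R's `stats::power.t.test`. This is only a sketch: the mean and standard deviation below are made-up placeholders, not values from the heat pump data, and note that `n` in `power.t.test` is the number of observations per group.

```r
# Hypothetical inputs for illustration only (not taken from the data)
exampleMean <- 500   # assumed mean power demand (W)
exampleSD   <- 400   # assumed standard deviation (W)

# Solve for the detectable difference (delta) with n = 1000 per group,
# p = 0.05 and power = 0.8, using a one-sided two-sample t test
res <- stats::power.t.test(n = 1000, delta = NULL, sd = exampleSD,
                           sig.level = 0.05, power = 0.8,
                           alternative = "one.sided")

# Express delta as a percentage of the mean, as estimateEffectSizes does
detectablePct <- 100 * res$delta / exampleMean
detectablePct
```

Larger samples shrink the detectable percentage roughly in proportion to one over the square root of `n`, which is why the curve flattens at higher sample sizes.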
Figure \@ref(fig:ggHPSampleSizeFig80) shows the initial p = 0.05 plot.
......@@ -286,6 +287,13 @@ knitr::kable(caption = "Full results table (part)",
```
## Proportions
Does not require a sample.
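
One reading of the note above is that the margin of error for a proportion depends only on the assumed proportion and the sample size, because the variance of a proportion is p(1 - p), so no pilot data are needed to supply a standard deviation. A minimal sketch under that assumption, using the normal approximation; the function name `proportionMoE` and its arguments are hypothetical and are not part of the package stub:

```r
# Hypothetical helper: margin of error for a proportion (normal approximation)
proportionMoE <- function(p, samples, sigs = c(0.01, 0.05, 0.1, 0.2)) {
  # one row per combination of sample size and significance level
  grid <- expand.grid(sampleN = samples, sig = sigs)
  z <- stats::qnorm(1 - grid$sig / 2)  # two-sided critical value
  grid$moe <- z * sqrt(p * (1 - p) / grid$sampleN)
  grid
}

# p = 0.5 gives the widest (most conservative) margin of error
moe <- proportionMoE(p = 0.5, samples = c(100, 1000))
moe
```

Multiplying `moe` by 100 gives the margin in percentage points, which would match the % scale used for the means above.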
# Testing for differences: effect sizes, confidence intervals and p values
## Getting it 'wrong'
......