========================================================
Hi everyone, I’m trying out R knitr for my blog posts now; let me know what you think!
Recently, I was looking for sample size calculations in R and found that R really doesn’t have good built in funtions for sample size and power. For example, for finding the sample size necessary to find a difference in two proportions, one can use the power.prop.test() function and get the following output:
power.prop.test(n=NULL, .10, .25, .05, .90, alternative="two.sided")
##
## Two-sample comparison of proportions power calculation
##
## n = 132.7557
## p1 = 0.1
## p2 = 0.25
## sig.level = 0.05
## power = 0.9
## alternative = two.sided
##
## NOTE: n is number in *each* group
But there are very few options in this function; for example, there is no option for a continuity correction for the normal approximation to binary data, you can’t compare a one sample of a null vs alternative value instead of two sample proportion, and you can’t change the ratio of the size of the two samples. Even more importantly, there’s no way to find effective sample size when data is clustered. There are other functions in R but I found them complicated and not user friendly. The sampsi and sampclus functions in Stata are intuitive and useful, and have all of those important options. I decided to build my own R functions that had the same capabilities, and I’ve posted them here for others to use and comment on. Eventually I would like to make an R package that includes all of these functions for ease of use.
The first function, sampsi.prop(), calculates sample size for one and two sample tests of proportions. The required n for both are as follows: * Two-sample test of equality of proportions, with continuity correction:
\[n_1 = \frac{n'}{4}\left(1+\left\{1+\frac{2(r+1)}{n'r|p_1-p_2|}\right\}^{1/2}\right)^2\]
and \(n_2 = rn_1\)
\[n' = \frac{\left(z_{1-\alpha/2}\left\{(r+1)\bar{p}\bar{q}\right\}^{1/2}+z_{1-\beta}(rp_1q_1+p_2q_2)^{1/2}\right)^2}{r(p_1-p_2)^2}\]
and \(\bar{p} = (p_1+rp_2)/(r+1)\) and \(\bar{q}=1-\bar{p}\)
so that \(n_1=n'\) and \(n_2=rn_1\)
- One-sample test of proportions where null is \(p=p_0\) and alternative is \(p=p_A\):
\[n = \left(\frac{z_{1-\alpha/2}\left\{p_0(1-p_0)\right\}^{1/2} + z_{1-\beta}\left\{p_A(1-p_A)\right\}^{1/2}}{p_A-p_0}\right)^2\]
The function sampsi.prop() takes arguments p1 and p2, which are either the null and alternative proportions, respectively, for the one-sample test, or two proportions for the two-sample test. These arguments are required for the function to run. The rest of the arguments have preset default values so do not need to be indicated, and include the ratio of smaller group to larger group (default=1), power (default of \(\beta=.80\)), significance level (default of \(\alpha=.05\)), a logical argument for whether to include the continuity correction (default is true), whether to perform a two-sided test or one-sided (default is two), and whether to perform the two-sample or one-sample test (defaut is two-sample test). The output is a list object.
sampsi.prop<-function(p1, p2, ratio=1, power=.90, alpha=.05, cont.corr=TRUE, two.sided=TRUE, one.sample=FALSE){
effect.size<-abs(p2-p1)
avg.p<-(p1+ratio*p2)/(ratio+1)
sd=ifelse(one.sample==FALSE, sqrt(ratio*p1*(1-p1)+p2*(1-p2)), sqrt(p2*(1-p2)))
z.pow<-qt(1-power, df=Inf, lower.tail=FALSE)
z.alph<-ifelse(two.sided==TRUE, qt(alpha/2, df=Inf, lower.tail=FALSE), qt(alpha, df=Inf, lower.tail=FALSE))
ct<-(z.pow+z.alph)
n1<-(z.alph*sqrt((ratio+1)*avg.p*(1-avg.p))+z.pow*sd)^2/(effect.size^2*ratio)
n1.cont<-ifelse(cont.corr==FALSE, n1, (n1/4)*(1+sqrt(1+(2*(ratio+1))/(n1*ratio*effect.size)))^2)
n<-(((z.alph*sqrt(p1*(1-p1)))+z.pow*sd)/effect.size)^2
if(one.sample==FALSE){
col1<-c("alpha", "power", "p1", "p2", "effect size", "n2/n1", "n1", "n2")
col2<-c(alpha, power, p1, p2, effect.size, ratio, ceiling(n1.cont), ceiling(n1.cont*ratio))
}
else{
col1<-c("alpha", "power", "p", "alternative", "n")
col2<-c(alpha, power, p1, p2, ceiling(n))
}
ret<-as.data.frame(cbind(col1, col2))
ret$col2<-as.numeric(as.character(ret$col2))
colnames(ret)<-c("Assumptions", "Value")
description<-paste(ifelse(one.sample==FALSE, "Two-sample", "One-sample"), ifelse(two.sided==TRUE, "two-sided", "one-sided"), "test of proportions", ifelse(cont.corr==FALSE, "without", "with"), "continuity correction")
retlist<-list(description, ret)
return(retlist)
}
Now we can do the following sample size calculations that match up perfectly to the corresponding commands in Stata:
sampsi.prop(.10, .25)
## [[1]]
## [1] "Two-sample two-sided test of proportions with continuity correction"
##
## [[2]]
## Assumptions Value
## 1 alpha 0.05
## 2 power 0.90
## 3 p1 0.10
## 4 p2 0.25
## 5 effect size 0.15
## 6 n2/n1 1.00
## 7 n1 146.00
## 8 n2 146.00
Notice that the results from the calculation above are corrected for continuity, so do not match up with the power.prop.test output from above. To get those results, specify cont.corr=FALSE like so:
sampsi.prop(.10, .25, cont.corr=FALSE)
## [[1]]
## [1] "Two-sample two-sided test of proportions without continuity correction"
##
## [[2]]
## Assumptions Value
## 1 alpha 0.05
## 2 power 0.90
## 3 p1 0.10
## 4 p2 0.25
## 5 effect size 0.15
## 6 n2/n1 1.00
## 7 n1 133.00
## 8 n2 133.00
Here are a couple more examples with some of the other defaults changed:
sampsi.prop(.5, .55, power=.8, one.sample=TRUE)
## [[1]]
## [1] "One-sample two-sided test of proportions with continuity correction"
##
## [[2]]
## Assumptions Value
## 1 alpha 0.05
## 2 power 0.80
## 3 p 0.50
## 4 alternative 0.55
## 5 n 783.00
sampsi.prop(.5, .55, ratio=2, two.sided=FALSE)
## [[1]]
## [1] "Two-sample one-sided test of proportions with continuity correction"
##
## [[2]]
## Assumptions Value
## 1 alpha 0.05
## 2 power 0.90
## 3 p1 0.50
## 4 p2 0.55
## 5 effect size 0.05
## 6 n2/n1 2.00
## 7 n1 1310.00
## 8 n2 2619.00
Next we can calculate sample size for a comparison of means. The required n for the one sample and two sample tests are as follows:
One sample test of mean where null hypothesis is \(\mu=\mu_0\) and the alternative is \(\mu=\mu_A\): \[n = \left\{\frac{(z_{1-\alpha/2} + z_{1-\beta})\sigma}{\mu_A-\mu_0}\right\}^2\]
Two-sample test of equality of means:
\[n_1 = \frac{(\sigma_1^2+\sigma_2^2/r)(z_{1-\alpha/2}+z_{1-\beta})^2}{(\mu_1-\mu_2)^2}\] \[n_2=rn_1\]
The sampsi.means() function again works the same way as in Stata where you must input as arguments either a null mean and alternative mean for a one-sample test or two means for a two-sample test. At least one standard deviation of the mean is also required. It is possible to have different standard deviations for the two-sample test, but if only the first standard deviation value is assigned and a two-sample test is designated, then the function assumes both standard deviations to be the same. The other arguments and their defaults are similar to the sampsi.prop() function from above.
sampsi.means<-function(m1, m2, sd1, sd2=NA, ratio=1, power=.90, alpha=.05, two.sided=TRUE, one.sample=FALSE){
effect.size<-abs(m2-m1)
sd2<-ifelse(!is.na(sd2), sd2, sd1)
z.pow<-qt(1-power, df=Inf, lower.tail=FALSE)
z.alph<-ifelse(two.sided==TRUE, qt(alpha/2, df=Inf, lower.tail=FALSE), qt(alpha, df=Inf, lower.tail=FALSE))
ct<-(z.pow+z.alph)
n1<-(sd1^2+(sd2^2)/ratio)*(ct)^2/(effect.size^2)
n<-(ct*sd1/effect.size)^2
if(one.sample==FALSE){
col1<-c("alpha", "power", "m1", "m2", "sd1", "sd2", "effect size", "n2/n1", "n1", "n2")
col2<-c(alpha, power, m1, m2, sd1, sd2, effect.size, ratio, ceiling(n1), ceiling(n1*ratio))
}
else{
col1<-c("alpha", "power", "null", "alternative", "n")
col2<-c(alpha, power, m1, m2, ceiling(n))
}
ret<-as.data.frame(cbind(col1, col2))
ret$col2<-as.numeric(as.character(ret$col2))
colnames(ret)<-c("Assumptions", "Value")
description<-paste(ifelse(one.sample==FALSE, "Two-sample", "One-sample"), ifelse(two.sided==TRUE, "two-sided", "one-sided"), "test of means")
retlist<-list(description, ret)
return(retlist)
}
And here are the examples:
sampsi.means(0, 10, sd1=15, power=.8)
## [[1]]
## [1] "Two-sample two-sided test of means"
##
## [[2]]
## Assumptions Value
## 1 alpha 0.05
## 2 power 0.80
## 3 m1 0.00
## 4 m2 10.00
## 5 sd1 15.00
## 6 sd2 15.00
## 7 effect size 10.00
## 8 n2/n1 1.00
## 9 n1 36.00
## 10 n2 36.00
sampsi.means(10, 30, sd1=15, sd2=20, alpha=.1, ratio=1)
## [[1]]
## [1] "Two-sample two-sided test of means"
##
## [[2]]
## Assumptions Value
## 1 alpha 0.1
## 2 power 0.9
## 3 m1 10.0
## 4 m2 30.0
## 5 sd1 15.0
## 6 sd2 20.0
## 7 effect size 20.0
## 8 n2/n1 1.0
## 9 n1 14.0
## 10 n2 14.0
Finally, we often work with clustered data and would like to calculate sample sizes for clustered observations. Here I created a samp.clus() function that works the same way as the sampclus function in Stata; that is, it takes a sampsi.prop() or sampsi.means() object, along with an interclass correlation coefficient (\(\rho\)) and either the number of observations per cluster or the number of clusters, and calculates the corresponding effective sample size.
samp.clus<-function(sampsi.object, rho, num.clus=NA, obs.clus=NA){
if(is.na(num.clus)&is.na(obs.clus)) print("Either num.clus or obs.clus must be identified")
else{
so<-sampsi.object[[2]]
n1<-as.numeric(so[so$Assumptions=="n1",2])
n2<-as.numeric(so[so$Assumptions=="n2",2])
if(!is.na(obs.clus)){
deff<-1+(obs.clus-1)*rho
n1.clus<-n1*deff
n2.clus<-n2*deff
num.clus<-ceiling((n1.clus+n2.clus)/obs.clus)
}
else if(!is.na(num.clus)){
tot<-(n1*(1-rho)+n2*(1-rho))/(1-(n1*rho/num.clus) - (n2*rho/num.clus))
if(tot<=0) stop("Number of clusters is too small")
else{
obs.clus<-ceiling(tot/num.clus)
deff<-1+(obs.clus-1)*rho
n1.clus<-n1*deff
n2.clus<-n2*deff
}
}
col1<-c("n1 uncorrected", "n2 uncorrected", "ICC", "Avg obs/cluster", "Min num clusters", "n1 corrected", "n2 corrected")
col2<-c(n1, n2, rho, obs.clus, num.clus, ceiling(n1.clus), ceiling(n2.clus))
ret<-as.data.frame(cbind(col1, col2))
colnames(ret)<-c("Assumptions", "Value")
return(ret)
}
}
Here are a couple of examples of how it works (the same function samp.clus() works for both a sampsi.prop object or a sampsi.means object:
ss<-sampsi.prop(.10, .25, power=.8, two.sided=FALSE)
samp.clus(ss, rho=.05, obs.clus=15)
## Assumptions Value
## 1 n1 uncorrected 92
## 2 n2 uncorrected 92
## 3 ICC 0.05
## 4 Avg obs/cluster 15
## 5 Min num clusters 21
## 6 n1 corrected 157
## 7 n2 corrected 157
samp.clus(ss, rho=.05, num.clus=150)
## Assumptions Value
## 1 n1 uncorrected 92
## 2 n2 uncorrected 92
## 3 ICC 0.05
## 4 Avg obs/cluster 2
## 5 Min num clusters 150
## 6 n1 corrected 97
## 7 n2 corrected 97
ss2<-sampsi.means(10, 15, sd1=15, power=.8)
samp.clus(ss2, rho=.05, obs.clus=15)
## Assumptions Value
## 1 n1 uncorrected 142
## 2 n2 uncorrected 142
## 3 ICC 0.05
## 4 Avg obs/cluster 15
## 5 Min num clusters 33
## 6 n1 corrected 242
## 7 n2 corrected 242
samp.clus(ss2, rho=.05, num.clus=5)
## Error in samp.clus(ss2, rho = 0.05, num.clus = 5): Number of clusters is too small
##Note: the above won't work because not enough clusters given as input; this error message will occur to warn you. You must increase the number of clusters to avoid the error.
Finally, I add one functionality that Stata doesn’t have with their basic sample size calculations, which is graphing sample size as a function of power. Here, I created two functions for these purposes, for either proprotions or means. The input is the starting point and ending point for the power, as well as the same inputs and defaults from the corresponding sampsi.prop or sampsi.means functions:
graph.power.prop<-function(from.power, to.power,p1, p2, ratio=1, power=.90, alpha=.05, cont.corr=TRUE, two.sided=TRUE, one.sample=FALSE){
seq.p<-seq(from.power, to.power, by=.01)
n<-rep(NA, length(seq.p))
for(i in 1:length(seq.p)){
ob<-sampsi.prop(p1=p1, p2=p2, power=seq.p[i], alpha=alpha, ratio=ratio, cont.corr=cont.corr, two.sided=two.sided, one.sample=one.sample)[[2]]
n[i]<-as.numeric(ob[7,2])
}
plot(n, seq.p, ylab="Power", xlab="n (in smaller arm)", type="l", main=paste("Power graph for p1=", p1, "and p2=", p2))
}
graph.power.means<-function(from.power, to.power, m1, m2, sd1, sd2=NA, ratio=1, alpha=.05, cont.corr=TRUE, two.sided=TRUE, one.sample=FALSE){
seq.p<-seq(from.power, to.power, by=.01)
n<-rep(NA, length(seq.p))
for(i in 1:length(seq.p)){
ob<-sampsi.means(m1=m1, m2=m2, sd1=sd1, sd2=sd2, power=seq.p[i], alpha=alpha, ratio=ratio,, one.sample=one.sample, two.sided=two.sided)[[2]]
n[i]<-as.numeric(ob[9,2])
}
plot(n, seq.p, ylab="Power", xlab="n (in smaller arm)", type="l", main=paste("Power graph for m1=", m1, "and m2=", m2))
}
And this is what it looks like. This example will graph power as a function of sample size. We restrict the graph from \(\beta=0.6\) to \(\beta=1\).
graph.power.prop(.6, 1, p1=.20, p2=.35)
Slawa, Super nice article! Thanks for sharing your work. I noticed that this article is not appearing properly in r-bloggers.com (at least in Chrome). This may be a result of knitr?
ReplyDeleteThanks! Yeah I saw that too and was disappointed. Not sure how to fix this. I'll have to play around with it.
DeleteLooks great, but Blogger cut off your x-axis label. I was curious, so I stepped through and pasted together a barebones .Rmd file of my own -- basically with all your function definitions and minus all the LaTeX -- so I could replicate the final call to graph.power.prop().
ReplyDeleteSo, how did you make knitr work with Blogger anyway? Did you just paste the source of the output of Knit HTML into the post editor?
Thanks. That's strange. It looks ok for me. The label is just "n (in smaller arm)" - this is not showing up when you look at it?
DeleteRight now, I save the rmd file as an html into a public dropbox folder and then use an iframe script in blogger to reference that file. Somebody else was doing this on another blog, but if you have a better idea, I'm all for it. It did not show up great on the blogger website this way.
Nevermind, you're right. Must be my 12in screen. I only see the tick marks on the x axis, none of the numbers or the axis label. But on R-bloggers in Feedly the picture looks as intended.
DeleteI have no better ideas about using knitr for blogging; I only ever use it for pdf output. But I like it a lot and I'd blog with it if it were easy. There used to be an RWordPress package, but the .tar.gz source file is no longer available and there never was a binary, it seems.
This comment has been removed by the author.
ReplyDeleteStata 13 just added a great deal under this heading. The entire documentation is accessible to all via http://www.stata.com/bookstore/power-sample-size-reference-manual/
ReplyDeleteThanks for your comment- yes I'm excited to try out the new stuff in Stata 13. Lots of new capabilities, especially for graphing sample size and power.
DeleteHave you put these into an R package? I think that would be great.
ReplyDeleteThanks! I haven't yet. I meant to but time got away from me.
DeleteSlawa, great work! Hurry to create an R package with all your functions, you deserve all the credits for this. Greetings from Brazil!
ReplyDelete