R for Public Health: Sample size calculations equivalent to Stata functions

========================================================

Hi everyone, I’m trying out R knitr for my blog posts now; let me know what you think!

Recently, I was looking for sample size calculations in R and found that R really doesn’t have good built in funtions for sample size and power. For example, for finding the sample size necessary to find a difference in two proportions, one can use the power.prop.test() function and get the following output:

power.prop.test(n=NULL, .10, .25, .05, .90, alternative="two.sided")

## 
##      Two-sample comparison of proportions power calculation 
## 
##               n = 132.7557
##              p1 = 0.1
##              p2 = 0.25
##       sig.level = 0.05
##           power = 0.9
##     alternative = two.sided
## 
## NOTE: n is number in *each* group

But there are very few options in this function; for example, there is no option for a continuity correction for the normal approximation to binary data, you can’t compare a one sample of a null vs alternative value instead of two sample proportion, and you can’t change the ratio of the size of the two samples. Even more importantly, there’s no way to find effective sample size when data is clustered. There are other functions in R but I found them complicated and not user friendly. The sampsi and sampclus functions in Stata are intuitive and useful, and have all of those important options. I decided to build my own R functions that had the same capabilities, and I’ve posted them here for others to use and comment on. Eventually I would like to make an R package that includes all of these functions for ease of use.

The first function, sampsi.prop(), calculates sample size for one and two sample tests of proportions. The required n for both are as follows: * Two-sample test of equality of proportions, with continuity correction:

\[n_1 = \frac{n'}{4}\left(1+\left\{1+\frac{2(r+1)}{n'r|p_1-p_2|}\right\}^{1/2}\right)^2\]

and \(n_2 = rn_1\)

\[n' = \frac{\left(z_{1-\alpha/2}\left\{(r+1)\bar{p}\bar{q}\right\}^{1/2}+z_{1-\beta}(rp_1q_1+p_2q_2)^{1/2}\right)^2}{r(p_1-p_2)^2}\]

and \(\bar{p} = (p_1+rp_2)/(r+1)\) and \(\bar{q}=1-\bar{p}\)

so that \(n_1=n'\) and \(n_2=rn_1\)

One-sample test of proportions where null is \(p=p_0\) and alternative is \(p=p_A\):

\[n = \left(\frac{z_{1-\alpha/2}\left\{p_0(1-p_0)\right\}^{1/2} + z_{1-\beta}\left\{p_A(1-p_A)\right\}^{1/2}}{p_A-p_0}\right)^2\]

The function sampsi.prop() takes arguments p1 and p2, which are either the null and alternative proportions, respectively, for the one-sample test, or two proportions for the two-sample test. These arguments are required for the function to run. The rest of the arguments have preset default values so do not need to be indicated, and include the ratio of smaller group to larger group (default=1), power (default of \(\beta=.80\)), significance level (default of \(\alpha=.05\)), a logical argument for whether to include the continuity correction (default is true), whether to perform a two-sided test or one-sided (default is two), and whether to perform the two-sample or one-sample test (defaut is two-sample test). The output is a list object.

sampsi.prop<-function(p1, p2, ratio=1, power=.90, alpha=.05, cont.corr=TRUE, two.sided=TRUE, one.sample=FALSE){
  
  effect.size<-abs(p2-p1)
  avg.p<-(p1+ratio*p2)/(ratio+1)
  sd=ifelse(one.sample==FALSE, sqrt(ratio*p1*(1-p1)+p2*(1-p2)), sqrt(p2*(1-p2)))
  
  z.pow<-qt(1-power, df=Inf, lower.tail=FALSE)
  z.alph<-ifelse(two.sided==TRUE, qt(alpha/2, df=Inf, lower.tail=FALSE), qt(alpha, df=Inf, lower.tail=FALSE))
  ct<-(z.pow+z.alph)
  
  n1<-(z.alph*sqrt((ratio+1)*avg.p*(1-avg.p))+z.pow*sd)^2/(effect.size^2*ratio)
  n1.cont<-ifelse(cont.corr==FALSE, n1, (n1/4)*(1+sqrt(1+(2*(ratio+1))/(n1*ratio*effect.size)))^2)

  n<-(((z.alph*sqrt(p1*(1-p1)))+z.pow*sd)/effect.size)^2
  
  if(one.sample==FALSE){
    col1<-c("alpha", "power", "p1", "p2", "effect size", "n2/n1", "n1", "n2")
    col2<-c(alpha,  power, p1, p2, effect.size, ratio, ceiling(n1.cont), ceiling(n1.cont*ratio))
  }
  else{
    col1<-c("alpha", "power", "p", "alternative", "n")
    col2<-c(alpha, power, p1, p2, ceiling(n))
  }
  ret<-as.data.frame(cbind(col1, col2))
  ret$col2<-as.numeric(as.character(ret$col2))
  colnames(ret)<-c("Assumptions", "Value")
  
  description<-paste(ifelse(one.sample==FALSE, "Two-sample", "One-sample"), ifelse(two.sided==TRUE, "two-sided", "one-sided"), "test of proportions", ifelse(cont.corr==FALSE, "without", "with"), "continuity correction")
  
  retlist<-list(description, ret)
  
  return(retlist)
}

Now we can do the following sample size calculations that match up perfectly to the corresponding commands in Stata:

sampsi.prop(.10, .25)

## [[1]]
## [1] "Two-sample two-sided test of proportions with continuity correction"
## 
## [[2]]
##   Assumptions  Value
## 1       alpha   0.05
## 2       power   0.90
## 3          p1   0.10
## 4          p2   0.25
## 5 effect size   0.15
## 6       n2/n1   1.00
## 7          n1 146.00
## 8          n2 146.00

Notice that the results from the calculation above are corrected for continuity, so do not match up with the power.prop.test output from above. To get those results, specify cont.corr=FALSE like so:

sampsi.prop(.10, .25,  cont.corr=FALSE)

## [[1]]
## [1] "Two-sample two-sided test of proportions without continuity correction"
## 
## [[2]]
##   Assumptions  Value
## 1       alpha   0.05
## 2       power   0.90
## 3          p1   0.10
## 4          p2   0.25
## 5 effect size   0.15
## 6       n2/n1   1.00
## 7          n1 133.00
## 8          n2 133.00

Here are a couple more examples with some of the other defaults changed:

sampsi.prop(.5, .55, power=.8, one.sample=TRUE)

## [[1]]
## [1] "One-sample two-sided test of proportions with continuity correction"
## 
## [[2]]
##   Assumptions  Value
## 1       alpha   0.05
## 2       power   0.80
## 3           p   0.50
## 4 alternative   0.55
## 5           n 783.00

sampsi.prop(.5, .55, ratio=2, two.sided=FALSE)

## [[1]]
## [1] "Two-sample one-sided test of proportions with continuity correction"
## 
## [[2]]
##   Assumptions   Value
## 1       alpha    0.05
## 2       power    0.90
## 3          p1    0.50
## 4          p2    0.55
## 5 effect size    0.05
## 6       n2/n1    2.00
## 7          n1 1310.00
## 8          n2 2619.00

Next we can calculate sample size for a comparison of means. The required n for the one sample and two sample tests are as follows:

One sample test of mean where null hypothesis is \(\mu=\mu_0\) and the alternative is \(\mu=\mu_A\): \[n = \left\{\frac{(z_{1-\alpha/2} + z_{1-\beta})\sigma}{\mu_A-\mu_0}\right\}^2\]
Two-sample test of equality of means:

\[n_1 = \frac{(\sigma_1^2+\sigma_2^2/r)(z_{1-\alpha/2}+z_{1-\beta})^2}{(\mu_1-\mu_2)^2}\] \[n_2=rn_1\]

The sampsi.means() function again works the same way as in Stata where you must input as arguments either a null mean and alternative mean for a one-sample test or two means for a two-sample test. At least one standard deviation of the mean is also required. It is possible to have different standard deviations for the two-sample test, but if only the first standard deviation value is assigned and a two-sample test is designated, then the function assumes both standard deviations to be the same. The other arguments and their defaults are similar to the sampsi.prop() function from above.

sampsi.means<-function(m1, m2, sd1, sd2=NA, ratio=1, power=.90, alpha=.05, two.sided=TRUE, one.sample=FALSE){
  
  effect.size<-abs(m2-m1)
  sd2<-ifelse(!is.na(sd2), sd2, sd1)
  
  z.pow<-qt(1-power, df=Inf, lower.tail=FALSE)
  z.alph<-ifelse(two.sided==TRUE, qt(alpha/2, df=Inf, lower.tail=FALSE), qt(alpha, df=Inf, lower.tail=FALSE))
  ct<-(z.pow+z.alph)
  
  n1<-(sd1^2+(sd2^2)/ratio)*(ct)^2/(effect.size^2)
  n<-(ct*sd1/effect.size)^2
  
  if(one.sample==FALSE){
    col1<-c("alpha", "power", "m1", "m2", "sd1", "sd2", "effect size", "n2/n1", "n1", "n2")
    col2<-c(alpha,  power, m1, m2, sd1, sd2, effect.size, ratio, ceiling(n1), ceiling(n1*ratio))
  }
  else{
    col1<-c("alpha", "power", "null", "alternative", "n")
    col2<-c(alpha, power, m1, m2, ceiling(n))
  }
  ret<-as.data.frame(cbind(col1, col2))
  ret$col2<-as.numeric(as.character(ret$col2))
  colnames(ret)<-c("Assumptions", "Value")
  
  description<-paste(ifelse(one.sample==FALSE, "Two-sample", "One-sample"), ifelse(two.sided==TRUE, "two-sided", "one-sided"), "test of means")
  
  retlist<-list(description, ret)
  
  return(retlist)
}

And here are the examples:

sampsi.means(0, 10, sd1=15, power=.8)

## [[1]]
## [1] "Two-sample two-sided test of means"
## 
## [[2]]
##    Assumptions Value
## 1        alpha  0.05
## 2        power  0.80
## 3           m1  0.00
## 4           m2 10.00
## 5          sd1 15.00
## 6          sd2 15.00
## 7  effect size 10.00
## 8        n2/n1  1.00
## 9           n1 36.00
## 10          n2 36.00

sampsi.means(10, 30, sd1=15, sd2=20, alpha=.1, ratio=1)

## [[1]]
## [1] "Two-sample two-sided test of means"
## 
## [[2]]
##    Assumptions Value
## 1        alpha   0.1
## 2        power   0.9
## 3           m1  10.0
## 4           m2  30.0
## 5          sd1  15.0
## 6          sd2  20.0
## 7  effect size  20.0
## 8        n2/n1   1.0
## 9           n1  14.0
## 10          n2  14.0

Finally, we often work with clustered data and would like to calculate sample sizes for clustered observations. Here I created a samp.clus() function that works the same way as the sampclus function in Stata; that is, it takes a sampsi.prop() or sampsi.means() object, along with an interclass correlation coefficient (\(\rho\)) and either the number of observations per cluster or the number of clusters, and calculates the corresponding effective sample size.

samp.clus<-function(sampsi.object, rho, num.clus=NA, obs.clus=NA){
  
  if(is.na(num.clus)&is.na(obs.clus)) print("Either num.clus or obs.clus must be identified")
  else{
    
    so<-sampsi.object[[2]]
    n1<-as.numeric(so[so$Assumptions=="n1",2])
    n2<-as.numeric(so[so$Assumptions=="n2",2])
    
    if(!is.na(obs.clus)){
      deff<-1+(obs.clus-1)*rho
      n1.clus<-n1*deff
      n2.clus<-n2*deff
      num.clus<-ceiling((n1.clus+n2.clus)/obs.clus)
    }
    else if(!is.na(num.clus)){
      
      tot<-(n1*(1-rho)+n2*(1-rho))/(1-(n1*rho/num.clus) - (n2*rho/num.clus))
      if(tot<=0) stop("Number of clusters is too small")
      else{
        obs.clus<-ceiling(tot/num.clus)
        deff<-1+(obs.clus-1)*rho
        n1.clus<-n1*deff
        n2.clus<-n2*deff
      }
    }
    
    col1<-c("n1 uncorrected", "n2 uncorrected", "ICC", "Avg obs/cluster", "Min num clusters", "n1 corrected", "n2 corrected")
    col2<-c(n1, n2, rho, obs.clus, num.clus, ceiling(n1.clus), ceiling(n2.clus))
    ret<-as.data.frame(cbind(col1, col2))
    colnames(ret)<-c("Assumptions", "Value")
    return(ret)
  }
}

Here are a couple of examples of how it works (the same function samp.clus() works for both a sampsi.prop object or a sampsi.means object:

ss<-sampsi.prop(.10, .25, power=.8, two.sided=FALSE)
samp.clus(ss, rho=.05, obs.clus=15)

##        Assumptions Value
## 1   n1 uncorrected    92
## 2   n2 uncorrected    92
## 3              ICC  0.05
## 4  Avg obs/cluster    15
## 5 Min num clusters    21
## 6     n1 corrected   157
## 7     n2 corrected   157

samp.clus(ss, rho=.05, num.clus=150)

##        Assumptions Value
## 1   n1 uncorrected    92
## 2   n2 uncorrected    92
## 3              ICC  0.05
## 4  Avg obs/cluster     2
## 5 Min num clusters   150
## 6     n1 corrected    97
## 7     n2 corrected    97

ss2<-sampsi.means(10, 15, sd1=15, power=.8)
samp.clus(ss2, rho=.05, obs.clus=15)

##        Assumptions Value
## 1   n1 uncorrected   142
## 2   n2 uncorrected   142
## 3              ICC  0.05
## 4  Avg obs/cluster    15
## 5 Min num clusters    33
## 6     n1 corrected   242
## 7     n2 corrected   242

samp.clus(ss2, rho=.05, num.clus=5)

## Error in samp.clus(ss2, rho = 0.05, num.clus = 5): Number of clusters is too small

##Note: the above won't work because not enough clusters given as input; this error message will occur to warn you. You must increase the number of clusters to avoid the error.

Finally, I add one functionality that Stata doesn’t have with their basic sample size calculations, which is graphing sample size as a function of power. Here, I created two functions for these purposes, for either proprotions or means. The input is the starting point and ending point for the power, as well as the same inputs and defaults from the corresponding sampsi.prop or sampsi.means functions:

graph.power.prop<-function(from.power, to.power,p1, p2, ratio=1, power=.90, alpha=.05, cont.corr=TRUE, two.sided=TRUE, one.sample=FALSE){
  
  seq.p<-seq(from.power, to.power, by=.01)
  n<-rep(NA, length(seq.p))

  for(i in 1:length(seq.p)){
    ob<-sampsi.prop(p1=p1, p2=p2, power=seq.p[i], alpha=alpha, ratio=ratio, cont.corr=cont.corr, two.sided=two.sided, one.sample=one.sample)[[2]]
    n[i]<-as.numeric(ob[7,2])
  }
  plot(n, seq.p, ylab="Power", xlab="n (in smaller arm)", type="l",  main=paste("Power graph for p1=", p1, "and p2=", p2))
}


graph.power.means<-function(from.power, to.power, m1, m2, sd1, sd2=NA, ratio=1, alpha=.05, cont.corr=TRUE, two.sided=TRUE, one.sample=FALSE){
  
  seq.p<-seq(from.power, to.power, by=.01)
  n<-rep(NA, length(seq.p))
  for(i in 1:length(seq.p)){
    ob<-sampsi.means(m1=m1, m2=m2, sd1=sd1, sd2=sd2, power=seq.p[i], alpha=alpha, ratio=ratio,, one.sample=one.sample, two.sided=two.sided)[[2]]
    n[i]<-as.numeric(ob[9,2])
  }
  plot(n, seq.p, ylab="Power", xlab="n (in smaller arm)", type="l",  main=paste("Power graph for m1=", m1, "and m2=", m2))
}

And this is what it looks like. This example will graph power as a function of sample size. We restrict the graph from \(\beta=0.6\) to \(\beta=1\).

graph.power.prop(.6, 1, p1=.20, p2=.35)

11 comments:

CraigJune 26, 2013 at 11:15 AM
Slawa, Super nice article! Thanks for sharing your work. I noticed that this article is not appearing properly in r-bloggers.com (at least in Chrome). This may be a result of knitr?
Gabi HuiberJune 26, 2013 at 12:10 PM
Looks great, but Blogger cut off your x-axis label. I was curious, so I stepped through and pasted together a barebones .Rmd file of my own -- basically with all your function definitions and minus all the LaTeX -- so I could replicate the final call to graph.power.prop().

So, how did you make knitr work with Blogger anyway? Did you just paste the source of the output of Knit HTML into the post editor?
Nick CoxJuly 6, 2013 at 4:37 AM
This comment has been removed by the author.
Nick CoxJuly 6, 2013 at 4:58 AM
Stata 13 just added a great deal under this heading. The entire documentation is accessible to all via http://www.stata.com/bookstore/power-sample-size-reference-manual/
UnknownFebruary 11, 2016 at 12:09 PM
Have you put these into an R package? I think that would be great.
UnknownNovember 13, 2016 at 5:56 PM
Slawa, great work! Hurry to create an R package with all your functions, you deserve all the credits for this. Greetings from Brazil!

Note: Only a member of this blog may post a comment.

Tuesday, June 25, 2013

Sample size calculations equivalent to Stata functions

11 comments: