Supplementary Exercise 8.103 of IPS7e
-------------------------------------

Sample size calculation to obtain a margin of error of 0.10.

Assume X ~ B(n,p1) and Y ~ B(n,p2), independent, with the same sample size n in each group. The classical, approximate 95% CI for the difference p1-p2 is

   Dp +- 1.96*SE_Dp

where Dp = p1_hat - p2_hat, and

   SE_Dp = sqrt(p1_hat*(1-p1_hat)/n + p2_hat*(1-p2_hat)/n).

The SE_Dp (and therefore also the margin of error) is largest when both p1_hat and p2_hat are equal to 0.5. We therefore get a conservative estimate of the sample size if we set both p1_hat and p2_hat to 0.5.

To get a margin of error of at most 0.10, we need to solve

   0.1 >= 1.96*sqrt(0.5*0.5/n + 0.5*0.5/n) = 1.96/sqrt(2n) ~= 2/sqrt(2n)

or

   2n >= (2/0.1)^2 = 400

or

   n >= 200   (or n >= 193 if working with 1.96 rather than the rounded-off value of 2).

We need at least 200 subjects in each group to get a margin of error of at most 0.1. This estimate is conservative (too large) if the proportions in the two populations are far from 0.5.

From this calculation, we deduce that the general formula for a conservative sample size calculation, at a desired margin of error of m and a critical value zstar, for comparison of two independent proportions, is

   n >= (zstar/m)^2 /2   (subjects per group).

This formula is not in the VHM 801 textbooks.

The expression for n is twice the corresponding conservative expression for a single sample, n >= (zstar/m)^2 /4. One valid approach to this (conservative) sample size calculation is therefore to double the value for n from a one-sample calculation (which can be done in Minitab). It is a conservative approach because in practice one would rarely expect both probabilities to be very close to 0.5.
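
The calculation above can be checked numerically. The following is a minimal Python sketch, not part of the textbook or of Minitab; the function names (conservative_n_two_proportions, conservative_n_one_proportion, margin_of_error) are made up for this illustration. It assumes equal group sizes and the worst case p1_hat = p2_hat = 0.5 used above.

```python
import math


def conservative_n_two_proportions(m, zstar=1.96):
    """Smallest per-group n with zstar*sqrt(0.5/n) <= m (worst case p = 0.5).

    Hypothetical helper implementing n >= (zstar/m)^2 / 2.
    """
    return math.ceil((zstar / m) ** 2 / 2)


def conservative_n_one_proportion(m, zstar=1.96):
    """Smallest n with zstar*sqrt(0.25/n) <= m, i.e. n >= (zstar/m)^2 / 4."""
    return math.ceil((zstar / m) ** 2 / 4)


def margin_of_error(n, p1=0.5, p2=0.5, zstar=1.96):
    """Margin of error zstar*SE_Dp for two groups of size n each."""
    return zstar * math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)


print(conservative_n_two_proportions(0.1))             # 193 (using zstar = 1.96)
print(conservative_n_two_proportions(0.1, zstar=2.0))  # 200 (using the rounded value 2)
print(margin_of_error(193) <= 0.1)                     # True
print(margin_of_error(192) <= 0.1)                     # False
```

Note that doubling an already rounded-up one-sample n can exceed the direct two-sample result by one subject per group (here conservative_n_one_proportion(0.1) gives 97, and 2*97 = 194 rather than 193), which is harmless for a conservative calculation.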