Extra exercise 13
-----------------

Dogs trained to identify smells from breast cancer patients completed
125 trials. In each trial, the dog had to choose between 5 breath samples, 
of which one originated from a breast cancer patient and the four others
were control samples. The dogs correctly identified the breast cancer
sample in X=110 trials out of the total n=125. We assume a binomial setting
and hence that X follows B(125,p) where p is the probability of correct
identification. The observed proportion is p_hat = X/n = 110/125 = 0.88.

For a 95% confidence interval we have the choice between our 3 approaches. As 
n*(1-p_hat) = 15 (the number of negatives), we are exactly on the cut-off for 
meeting the condition for use of the normal approximation. The best CI is 
obtained by the plus four method. We compute all 3 intervals for
comparison purposes:

  sample proportion: p_hat = X/n = 0.88, 
  standard error of p_hat: sqrt(p_hat*(1-p_hat)/n) = 0.0291
  classical approximate 95% CI for p: 0.88 +- 1.96*0.0291 = (0.823,0.937)  

  plus four adjusted sample proportion: p_tilde = (X+2)/(n+4) = 112/129 = 0.868, 
  standard error of p_tilde: sqrt(p_tilde*(1-p_tilde)/(n+4)) = 0.0298
  plus four approximate 95% CI for p: 0.868 +- 1.96*0.0298 = (0.810,0.927)  

  "exact" binomial 95% CI (software): (0.810,0.931)

The plus four CI and the exact CI are very similar (the latter is a bit
wider), whereas the normal approximation CI is shifted upwards, with both
limits higher, and is symmetric around 0.88. As noted above, the normal
approximation CI is not very good in this case.
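
The three intervals can be reproduced without Minitab. The following is a
minimal Python sketch, not part of the original solution. It assumes that the
"exact" interval reported by the software is the Clopper-Pearson interval, and
it finds the limits by simple bisection; the function names are made up for
this illustration.

```python
import math

def wald_ci(x, n, z=1.96):
    """Classical normal approximation CI: p_hat +- z * SE(p_hat)."""
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    return (p - z * se, p + z * se)

def plus_four_ci(x, n, z=1.96):
    """Plus four CI: add 2 successes and 2 failures, then proceed as above."""
    return wald_ci(x + 2, n + 4, z)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_increasing(g, iters=60):
    """Root of g on (0, 1), where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(x, n, conf=0.95):
    """Clopper-Pearson interval (assumed to be what the software reports)."""
    a = (1 - conf) / 2
    # Lower limit: the p for which P(X >= x) equals a.
    lower = 0.0 if x == 0 else bisect_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - a)
    # Upper limit: the p for which P(X <= x) equals a.
    upper = 1.0 if x == n else bisect_increasing(
        lambda p: a - binom_cdf(x, n, p))
    return (lower, upper)

ci_classical = wald_ci(110, 125)
ci_plus4 = plus_four_ci(110, 125)
ci_exact = exact_ci(110, 125)
for name, ci in [("classical", ci_classical), ("plus four", ci_plus4),
                 ("exact", ci_exact)]:
    print(f"{name:10s} ({ci[0]:.3f}, {ci[1]:.3f})")
```

The limits can be compared with the Minitab listing at the end of this
solution.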


Questions on testing specific values:
-------------------------------------
If the dogs were purely "guessing" (randomly deciding on one of the 5
samples), we would have p=1/5=0.2. It is therefore of interest to test
this hypothesis with a one-sided alternative:
  H0: p=0.2 and Ha: p>0.2
A one-sided alternative is chosen because the interest is whether the dogs
do better than guessing (it is hard to imagine they could do worse; this is
similar to the duo-trio testing example in Lecture 6).

An exact test of the hypothesis has a very small P-value: P<0.0005 as
reported by Minitab, or P<0.0000005 as reported by Stata.
An approximate test using the normal approximation (conditions ok: 125*0.2=25>10) 
is based on
  z=(0.88-0.2)/sqrt(0.2*0.8/125)=19.01,
and has a very small P-value as well. There is no doubt we must reject
H0 and conclude that the dogs do better than guessing. This conclusion
was probably pretty obvious from the observed proportion being so much
larger than 0.2 anyway.
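
As an illustration, both tests of H0: p=0.2 can be computed directly. This is
a small Python sketch, not taken from the Minitab or Stata output; the
variable names are chosen for this example only.

```python
import math

n, x_obs, p0 = 125, 110, 0.2
p_hat = x_obs / n

# z statistic, using the standard error under H0
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# one-sided P-value from the normal approximation: P(Z >= z)
p_normal = 0.5 * math.erfc(z / math.sqrt(2))

# exact one-sided P-value: P(X >= x_obs) for X ~ B(n, p0)
p_exact = sum(math.comb(n, k) * p0**k * (1 - p0)**(n - k)
              for k in range(x_obs, n + 1))

print(f"z = {z:.2f}")
print(f"normal approximation P-value = {p_normal:.3g}")
print(f"exact P-value                = {p_exact:.3g}")
```

Both P-values are far below any bound the software prints, which is why the
listings show them as 0.000.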

The paper by McCulloch et al (2006) reports that dogs achieved a detection rate
of 99% for breath samples from lung cancer patients. We are interested in 
assessing whether the dogs do equally well with breast cancer and lung
cancer samples. This is really a two-sample situation, but in the absence of
actual data for the lung cancer samples (and thus assuming that the rate of 0.99
was obtained with very small standard error) we will compare the breast
cancer data to a rate of 0.99:
  H0: p=0.99 and Ha: p<>0.99
A two-sided alternative is chosen because we are interested in any
differences between the rates for the two types of samples. Recall that
we are not allowed to first observe that the breast cancer rate is lower
than 0.99 and then choose the alternative Ha: p<0.99 for this reason. The
hypotheses must always be chosen independently of the data.

In this case, the z-test based on the normal approximation is not a
valid option, because n*(1-p0) = 125*(1-0.99) = 1.25 < 10. Therefore we
have to use an exact test based on the binomial distribution. So far all
our exact tests in the binomial distribution have been with a one-sided
alternative or for the hypothesis H0: p=0.5 where the reference distribution, 
i.e. the distribution under the null hypothesis - B(n,0.5), was symmetrical. 
It is less clear how to compute P-values for a two-sided alternative when the 
reference distribution is asymmetrical, and statistical software packages use
different formulae. 

Let us denote our observed count of 110 by Xobs: Xobs=110. The simplest
formula for computing the P-value against a two-sided alternative is to
double the one-sided P-value, here (with X ~ B(125,0.99))
  P = 2*min(P(X<=Xobs),P(X>=Xobs)) = 2*P(X<=110) < 2*0.00000005 = 0.0000001
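
The doubling rule is easy to check numerically. Below is a minimal Python
sketch of the rule as stated above; the helper name binom_pmf is made up for
this example.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, x_obs, p0 = 125, 110, 0.99

lower_tail = sum(binom_pmf(k, n, p0) for k in range(0, x_obs + 1))  # P(X <= Xobs)
upper_tail = sum(binom_pmf(k, n, p0) for k in range(x_obs, n + 1))  # P(X >= Xobs)

# double the smaller tail, capped at 1
p_doubled = min(1.0, 2 * min(lower_tail, upper_tail))

print(f"P(X <= {x_obs}) = {lower_tail:.3g}")
print(f"P(X >= {x_obs}) = {upper_tail:.3g}")
print(f"doubled P-value = {p_doubled:.3g}")
```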

As the reference is asymmetrical, it is however better to compute the
probabilities in the two tails directly. Stata and Minitab do this in
slightly different ways, but in this particular example there is no contribution
from the upper tail by either approach. The method in Minitab, which is the
easier to understand but the slightly worse of the two, computes the
expected number of positives under H0, here 125*0.99=123.75, and
determines the point from which to include probabilities in the upper
tail by reflecting the observed count around it: 123.75 + (123.75-Xobs) = 137.5.
As 137.5>125 there is no contribution from the upper tail. Stata does
not determine the point in the upper tail by symmetry but by when the
individual probabilities are lower than P(X=Xobs). Also here there is
no contribution from the upper tail. Hence 
  P = P(X<=Xobs) < 0.00000005.
We conclude that there is strong evidence that the dogs are better at
sniffing out lung cancer patients than breast cancer patients. As noted
above, the proper statistical analysis would be from a two-sample test 
if we had the data for the lung cancer samples.
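
The two ways of collecting tail probabilities can be sketched in Python as
follows. This follows only the verbal description given above; it is not the
actual code used by Minitab or Stata, and the function names are invented for
this illustration.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def p_by_symmetry(x_obs, n, p0):
    """Include every count at least as far from n*p0 as the observed count."""
    mean = n * p0
    dist = abs(x_obs - mean)
    return sum(binom_pmf(k, n, p0) for k in range(n + 1)
               if abs(k - mean) >= dist - 1e-9)

def p_by_small_probabilities(x_obs, n, p0):
    """Include every count whose probability does not exceed P(X = x_obs)."""
    # the factor (1 + 1e-7) only guards against rounding error in ties
    cutoff = binom_pmf(x_obs, n, p0) * (1 + 1e-7)
    return sum(binom_pmf(k, n, p0) for k in range(n + 1)
               if binom_pmf(k, n, p0) <= cutoff)

n, x_obs, p0 = 125, 110, 0.99
lower_tail = sum(binom_pmf(k, n, p0) for k in range(x_obs + 1))  # P(X <= Xobs)
p_sym = p_by_symmetry(x_obs, n, p0)
p_small = p_by_small_probabilities(x_obs, n, p0)

print(f"P(X <= {x_obs})           = {lower_tail:.3g}")
print(f"symmetry rule            = {p_sym:.3g}")
print(f"small-probabilities rule = {p_small:.3g}")
```

In this example both rules return the lower tail probability alone, in
agreement with the statement that the upper tail does not contribute.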
---

Minitab commands and listing:

MTB > POne 125 110;
SUBC>   Confidence 95.0;
SUBC>   Alternative 0.
Test and CI for One Proportion 

Sample    X    N  Sample p         95% CI
1       110  125  0.880000  (0.809811, 0.931257)

MTB > POne 125 110;
SUBC>   Confidence 95.0;
SUBC>   Alternative 0;
SUBC>   UseZ.
Test and CI for One Proportion 

Sample    X    N  Sample p         95% CI
1       110  125  0.880000  (0.823033, 0.936967)

Using the normal approximation.

MTB > POne 129 112;
SUBC>   Confidence 95.0;
SUBC>   Alternative 0;
SUBC>   UseZ.
Test and CI for One Proportion 

Sample    X    N  Sample p         95% CI
1       112  129  0.868217  (0.809846, 0.926588)

Using the normal approximation.

---

Minitab commands for extra test questions:

MTB > POne 125 110;
SUBC>   Test .2;
SUBC>   Confidence 95.0;
SUBC>   Alternative 1.
Test and CI for One Proportion 

Test of p = 0.2 vs p > 0.2
                            95% Lower    Exact
Sample    X    N  Sample p      Bound  P-Value
1       110  125  0.880000   0.821249    0.000

MTB > POne 125 110;
SUBC>   Test .2;
SUBC>   Confidence 95.0;
SUBC>   Alternative 1;
SUBC>   UseZ.
Test and CI for One Proportion 

Test of p = 0.2 vs p > 0.2
                            95% Lower
Sample    X    N  Sample p      Bound  Z-Value  P-Value
1       110  125  0.880000   0.832192    19.01    0.000

Using the normal approximation.

MTB > CDF -19.01;
SUBC>   Normal 0.0 1.0.
 
Cumulative Distribution Function 

Normal with mean = 0 and standard deviation = 1

     x  P( X <= x )
-19.01    0.0000000

MTB > CDF 110;
SUBC>   Binomial 125 .99.
Cumulative Distribution Function 

Binomial with n = 125 and p = 0.99

  x  P( X <= x )
110    0.0000000

MTB > POne 125 110;
SUBC>   Test .99;
SUBC>   Confidence 95.0;
SUBC>   Alternative 0.
Test and CI for One Proportion 

Test of p = 0.99 vs p not = 0.99
                                                    Exact
Sample    X    N  Sample p         95% CI         P-Value
1       110  125  0.880000  (0.809811, 0.931257)    0.000