Assignment III for Biostats Course VHM 801 at AVC - Winter semester 2018
The assignment is worth 10% of the final course mark. Please be aware that by handing
in the home assignment you implicitly acknowledge that you have read and accepted
the instructions for home assignments as described
on the VHM 801 homepage.
The data for the assignment are from a study utilizing a registry of twins born
in a certain geographical region in a certain time period. The registry
contained 1030 pairs of female twins that were eligible by the inclusion criteria
of the study. First, each twin pair had been classified as either
monozygotic or dizygotic from background information such as photographs, data on physical
similarity and frequency of confusion as children, and in some instances also blood samples.
Next, interviews were conducted (separately) with each of the 2060 women,
and they were all classified (no/yes) as meeting the criteria for alcoholism according to
three different definitions: narrow (alcoholism with dependence), intermediate
(alcoholism without dependence) and broad (problem drinking). The alcoholism classifications
are only available in summarized form, as the number of twin pairs with 0, 1 and 2 twins
classified as positive for alcoholism according to each of the three criteria/definitions. The table below
gives these counts for the monozygotic and dizygotic twin pairs.
| Twins positive in pair | Monozygotic: Narrow | Monozygotic: Intermediate | Monozygotic: Broad | Dizygotic: Narrow | Dizygotic: Intermediate | Dizygotic: Broad |
|---|---|---|---|---|---|---|
| neither positive (0) | 537 | 510 | 443 | 377 | 361 | 301 |
| one positive (1) | 45 | 65 | 102 | 59 | 68 | 113 |
| both positive (2) | 8 | 15 | 45 | 4 | 11 | 26 |
| Total | 590 | 590 | 590 | 440 | 440 | 440 |
For example, there were 537 monozygotic twin pairs with neither of the
twins classified as alcoholic according to the narrow criteria, out of
the 590 monozygotic twin pairs; similarly, there were 65 monozygotic twin pairs with
one of the two twins classified as alcoholic according to the
intermediate criteria.
The counts in the table are available as a data set in Minitab format
and as a comma-separated file, for import into Stata and other statistical software.
The home assignment has four questions (a)-(d) which should all be answered. In general,
the assumptions of every
statistical procedure used should be stated (formally or informally) and checked
(where possible), and every statistical analysis should
be summarized in a conclusion.
(a) Using the narrow criteria of alcoholism, estimate the probability that, in a
randomly chosen monozygotic twin pair with at least one alcoholic twin,
both twins in the pair are alcoholics. Supplement the estimate with a
95% confidence interval. Repeat for the intermediate and broad
criteria (still for monozygotic twins); how do these probabilities change when the criteria are changed?
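One possible way to set up this calculation is sketched below in Python (the course itself uses Minitab or Stata). The estimate is the share of pairs with both twins positive among the pairs with at least one positive twin. The Wilson score interval is an assumption of this sketch, since the assignment does not prescribe a particular interval method, and the names `MZ_COUNTS`, `wilson_interval` and `pairwise_estimate` are illustrative only.

```python
import math

# Monozygotic counts from the table: pairs with (0, 1, 2) positive twins.
MZ_COUNTS = {
    "narrow": (537, 45, 8),
    "intermediate": (510, 65, 15),
    "broad": (443, 102, 45),
}


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (assumed method)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def pairwise_estimate(counts):
    """Share of pairs with both twins positive among pairs with >= 1 positive."""
    _, one, both = counts
    affected = one + both  # pairs with at least one positive twin
    return both / affected, wilson_interval(both, affected)


for criterion, counts in MZ_COUNTS.items():
    est, (low, high) = pairwise_estimate(counts)
    print(f"{criterion}: {est:.3f} (95% CI {low:.3f} to {high:.3f})")
```

With few affected pairs under the narrow criteria (8 of 53), an exact binomial interval would be a reasonable alternative to compare against.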
(b) Assume that you have a monozygotic twin who has been classified as an alcoholic
according to the narrow criteria. Estimate the probability that you would also be
classified as an alcoholic using the narrow criteria. In other words, our interest is
in the probability that a twin has the condition if her (his) co-twin has the condition.
Explain your calculation, and why this probability is not the
same as in (a). Repeat the calculation for the other criteria for alcoholism (for monozygotic twins),
and compare the results. As our focus is on the estimates, there is no
need to supplement with confidence intervals.
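A minimal sketch of one reading of this question, in Python: the unit is now the individual twin rather than the pair. Every pair with two positive twins contributes two twins who have a positive co-twin and are positive themselves, while every pair with one positive twin contributes one twin who has a positive co-twin but is negative herself. Both this interpretation and the function name `cotwin_estimate` are assumptions of the sketch, not something the assignment states.

```python
# Monozygotic counts from the table: pairs with (0, 1, 2) positive twins.
MZ_COUNTS = {
    "narrow": (537, 45, 8),
    "intermediate": (510, 65, 15),
    "broad": (443, 102, 45),
}


def cotwin_estimate(counts):
    """Estimated P(twin is positive | her co-twin is positive), counting twins."""
    _, one, both = counts
    positive_given_positive_cotwin = 2 * both    # both twins of each concordant pair
    twins_with_positive_cotwin = 2 * both + one  # plus the negative twin of each discordant pair
    return positive_given_positive_cotwin / twins_with_positive_cotwin


for criterion, counts in MZ_COUNTS.items():
    print(f"{criterion}: {cotwin_estimate(counts):.3f}")
```

Because concordant pairs are counted twice, this estimate is always at least as large as the pair-based proportion from (a) computed on the same counts.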
(c) Under each criterion of alcoholism, compare the results for monozygotic and dizygotic
twin pairs to statistically assess whether monozygotic and dizygotic twin pairs are equally likely to
contain 0, 1 and 2 alcoholics. What do your results tell you about a possible
genetic factor behind alcoholism? Do the results and conclusions depend
on the chosen criteria for alcoholism?
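As a hedged illustration, the comparison can be framed as a Pearson chi-square test on a 3 x 2 table (0, 1, 2 positive twins by zygosity), which has (3 - 1)(2 - 1) = 2 degrees of freedom. Choosing the Pearson test is an assumption of this sketch; the names `COUNTS` and `chi_square_homogeneity` are illustrative. The sketch relies on the fact that the upper tail of a chi-square distribution with 2 degrees of freedom is exp(-x/2), and it reports the smallest expected count so that the usual large-sample assumption can be checked.

```python
import math

# Counts from the table: pairs with (0, 1, 2) positive twins, by zygosity.
COUNTS = {
    "narrow": {"MZ": (537, 45, 8), "DZ": (377, 59, 4)},
    "intermediate": {"MZ": (510, 65, 15), "DZ": (361, 68, 11)},
    "broad": {"MZ": (443, 102, 45), "DZ": (301, 113, 26)},
}


def chi_square_homogeneity(mz, dz):
    """Pearson chi-square for a 3 x 2 table; returns (statistic, p-value, min expected)."""
    col_totals = (sum(mz), sum(dz))
    grand_total = sum(col_totals)
    statistic = 0.0
    min_expected = float("inf")
    for row in zip(mz, dz):  # one row per number of positive twins
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            min_expected = min(min_expected, expected)
            statistic += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2)
    return statistic, p_value, min_expected


for criterion, table in COUNTS.items():
    stat, p, min_e = chi_square_homogeneity(table["MZ"], table["DZ"])
    print(f"{criterion}: chi2 = {stat:.2f}, p = {p:.4f}, min expected = {min_e:.2f}")
```

The smallest expected count under the narrow criteria is only slightly above 5, so an exact test might be worth considering there.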
(d) In addition to comparing monozygotic and dizygotic twin pairs, it is
also of interest to assess directly whether the data show
evidence of a dependence between alcoholism in the two twins in a
pair. You are required to compute a chi-square test for the null
hypothesis of independence by the following steps. For this part, use only the data for
dizygotic twins and the broad criteria.
- Estimate the probability that a randomly chosen person among all the twin
pairs is an alcoholic.
- Using the estimated probability and the assumption of independence between twins,
estimate the probabilities that in a randomly chosen twin pair 0, 1,
and 2 of the twins are alcoholics, respectively.
- From these three estimated probabilities, compute the expected
number of twin pairs with 0, 1, and 2 alcoholics (under the null
hypothesis of independence).
- Compute the chi-square statistic by adding up terms of the familiar form
"(observed-expected) squared divided by expected" for the three categories of twin
pairs.
- Assess the significance of the test statistic computed this way by comparing it with a chi-square
distribution with one degree of freedom.
- Draw conclusions about the null hypothesis.
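The steps above can be sketched in Python as follows, using only the dizygotic counts under the broad criteria from the table. This is an illustration of the arithmetic, not a prescribed implementation; the variable names are made up for the sketch. The p-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math

# Dizygotic twin pairs, broad criteria: pairs with (0, 1, 2) positive twins.
observed = (301, 113, 26)

n_pairs = sum(observed)
n_persons = 2 * n_pairs
n_alcoholics = observed[1] + 2 * observed[2]

# Step 1: probability that a randomly chosen person is an alcoholic.
p = n_alcoholics / n_persons

# Step 2: probabilities of 0, 1 and 2 alcoholics in a pair under independence.
probabilities = ((1 - p) ** 2, 2 * p * (1 - p), p**2)

# Step 3: expected numbers of pairs under the null hypothesis.
expected = [n_pairs * q for q in probabilities]

# Step 4: chi-square statistic, summing (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Step 5: upper-tail probability in a chi-square distribution with 1 df.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"p = {p:.4f}")
print("expected:", [round(e, 2) for e in expected])
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.5f}")
```

One degree of freedom results from three categories, minus one for the fixed total and minus one for the estimated probability p.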
Henrik Stryhn
(hstryhn@upei.ca) 2018-03-10