Assignment II for Biostats Course VHM 801 at AVC - Fall semester 2024
The assignment is worth either 10% or 15% of the final course mark. Questions 1-4 constitute an
assignment for 10%, whereas Questions 1-6 constitute an assignment for
15%. Home assignment III will be for either 15% or 10% depending on
whether you chose the 10% or 15% version for this assignment,
respectively. You need to indicate
clearly which version of the assignment you are answering.
Please be aware that by handing
in the home assignment you implicitly acknowledge that you have read and accepted
the instructions for home assignments as described
on the VHM 801 homepage.
The assignment is a continuation of the first home
assignment on blood cholesterol measurements for a subset of participants
in the Framingham heart study. For the first two questions, we will
continue to work with the data provided for the first home assignment (Minitab and .csv formats), whereas the last
four questions will use a different version of the data (described below).
-
Question 3 of the first home assignment included
informal comparisons between the cholesterol values in the Framingham dataset for
both women and men in the age group 35-44 years with the respective mean
values reported in a national survey for a comparable time period. The task now is to carry
out statistical analyses (separately, for both men and women) to investigate
whether the Framingham data in this age group seem to differ in their
means from the corresponding values in the national survey. State your statistical
models/assumptions explicitly and discuss critically to what extent
these seem to be met. Based on this discussion, assess whether the
statistical results should be considered as exact or approximate. Give
95% confidence intervals for the means (for men and women) of the
Framingham population and interpret these carefully. Additionally,
carry out statistical tests of relevant hypotheses to investigate the question of interest. Draw
conclusions from the statistical analysis, and
indicate how confident you are in your conclusions (e.g., weak or strong
confidence).
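
As a rough illustration of the computations behind a one-sample comparison with a reference mean, the sketch below uses made-up numbers, not the Framingham data. The function name `one_sample_t`, the sample values and the reference mean of 215 are all hypothetical, and the critical value `t_crit` is assumed to be looked up in a t table or statistical software. It is one possible approach under a normality assumption, not a prescribed method.

```python
import math
import statistics


def one_sample_t(values, reference_mean, t_crit):
    """Return the sample mean, its standard error, the one-sample t
    statistic against reference_mean, and a confidence interval.

    t_crit is the two-sided critical value of the t distribution with
    len(values) - 1 degrees of freedom (supplied by the caller).
    """
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    t_stat = (mean - reference_mean) / se
    ci = (mean - t_crit * se, mean + t_crit * se)
    return mean, se, t_stat, ci


# Made-up values for illustration only (hypothetical sample of 5 subjects).
sample = [210, 225, 198, 240, 232]

# 2.776 is the 97.5% quantile of the t distribution with 4 degrees of freedom.
mean, se, t_stat, ci = one_sample_t(sample, reference_mean=215, t_crit=2.776)
print(f"mean={mean:.1f}  se={se:.2f}  t={t_stat:.2f}  "
      f"95% CI=({ci[0]:.1f}, {ci[1]:.1f})")
```

Whether the resulting interval and test are exact or only approximate depends on how well the assumed model fits the data, which the code itself cannot tell you.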
-
Continuing with the data of the previous question, carry out a statistical analysis
to compare the mean cholesterol levels for men and women in the 35-44 years age group
in the Framingham data. Here too, include the statistical
model/assumptions with the specific discussions outlined in the previous
question, a 95% confidence interval and a statistical test relevant to the question
studied, and draw conclusions.
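
A comparison of two independent group means could be sketched as follows. The sketch uses Welch's unequal-variance form as one possible choice; a pooled-variance analysis may be equally or more appropriate, depending on which assumptions you find reasonable. The function name `welch_t` and all data values are hypothetical.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return the difference in means (a minus b), its standard error,
    the Welch t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard errors of the two group means.
    va = statistics.variance(group_a) / n_a
    vb = statistics.variance(group_b) / n_b
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    se = math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return diff, se, diff / se, df


# Made-up values for illustration only, not the Framingham data.
men = [210, 225, 198, 240, 232]
women = [200, 215, 190, 228, 207]

diff, se, t_stat, df = welch_t(men, women)
print(f"difference={diff:.1f}  se={se:.2f}  t={t_stat:.2f}  df={df:.1f}")
```

A confidence interval for the difference is then `diff` plus or minus the appropriate t quantile (for `df` degrees of freedom) times `se`.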
The next four questions will use a version of the data extracted from the
Framingham heart study that includes cholesterol values measured for
the same subjects bi-yearly (i.e., every two years) over a 10-year
period (Minitab and .csv formats).
The values included in the first dataset were those obtained at
the first measurement. In the expanded dataset, these values are included
in the variable chol0;
the values at subsequent years (2,4,6,8,10) are denoted
chol2,...,chol10. Only subjects with a complete series of
measurements are included, reducing the size of the dataset to 133
subjects. The other variables are unchanged from the first dataset.
-
Discuss in general terms the implications (factual and/or potential) of restricting an analysis to the subset of
the original data consisting of all complete series over 10 years. You
could for example address the reference population for the analysis and potential
biases or confounding/lurking variables. For this question you are not
expected to carry out any major analyses, although calculations supporting your
arguments are allowed.
-
Using data across all ages but separately for women and men, carry out statistical
analyses to compare the cholesterol values for year 10
to baseline (i.e., the values at year 0) in order to investigate
whether the cholesterol values show a change in their mean over these 10 years (as the study participants grew
older). Here too, include in each analysis the statistical model/assumptions and discussions along the lines
of the previous questions as well as a relevant confidence interval and test. Draw
conclusions and, if you're not
continuing to Questions 5-6, summarize your findings from Questions 1-4 in a brief statement about
the results of your analyses and what they tell us about the cholesterol levels among individuals
in the Framingham population.
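
Because year 0 and year 10 values come from the same subjects, one natural option is an analysis of the within-subject differences. The sketch below illustrates this paired approach with made-up numbers; the variable names `chol0` and `chol10` follow the dataset, while the function name `paired_t`, the values and the choice of method are assumptions for illustration only.

```python
import math
import statistics


def paired_t(baseline, follow_up, t_crit):
    """Paired analysis: one-sample t statistic and confidence interval
    for the mean of the within-subject differences (follow_up - baseline).

    t_crit is the two-sided critical value of the t distribution with
    n - 1 degrees of freedom, supplied by the caller.
    """
    if len(baseline) != len(follow_up):
        raise ValueError("paired data must have equal lengths")
    diffs = [after - before for before, after in zip(baseline, follow_up)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    t_stat = mean_diff / se
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    return mean_diff, t_stat, ci


# Made-up values for five hypothetical subjects, not the Framingham data.
chol0 = [210, 225, 198, 240, 232]
chol10 = [222, 231, 210, 238, 249]

# 2.776 is the 97.5% quantile of the t distribution with 4 degrees of freedom.
mean_diff, t_stat, ci = paired_t(chol0, chol10, t_crit=2.776)
print(f"mean change={mean_diff:.1f}  t={t_stat:.2f}  "
      f"95% CI=({ci[0]:.1f}, {ci[1]:.1f})")
```

The model assumptions here concern the differences rather than the two sets of measurements individually.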
The final two questions are for the 15% version of the home assignment only.
Unless you have indicated otherwise, if you submit
answers to any of these questions your assignment will count for 15% of the final course mark.
-
Continuing from Question 4, still using data across all ages and analysing women and
men separately, carry out statistical
analyses to compare the cholesterol values
for years 2,...,10 to baseline (i.e., the values at year 0) in order to investigate
whether the cholesterol values show a change in their mean over these years.
You are expected to analyse each of the years 2,...,10 separately (i.e., not include them all in a
combined analysis); you don't need to repeat the analyses for year 10 from Question 4 here.
Determine, if possible, when such a change can first be
established. Here too, include the statistical model/assumptions and discussions along the lines
of the previous questions; however, confine the discussions to the most important points.
Summarize your findings across the different analyses to answer the main
questions posed here.
-
Irrespective of your results in the previous question, carry out
additional statistical analyses to compare the changes at years 2,...,10 relative to baseline between women and men.
Hence, the objective of these analyses is to investigate whether the
cholesterol levels change relative to baseline in the same way for women
and men, or whether for example stronger changes can be determined in one of the
gender groups. Here too, analyse the years 2,...,10 separately, and
include the statistical model/assumptions and discussions along the lines
of the previous questions, confining yourself to the most important points.
Summarize your findings from Questions 1-6 in a brief statement about
the results of your analyses and what they tell us about the cholesterol levels among individuals
in the Framingham population.
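
One way to think about comparing changes between two groups is to compute each subject's change from baseline and then compare the mean changes between the groups, one year at a time. The sketch below does this with made-up numbers for years 2 and 4 only. The column names `chol0`, `chol2` and `chol4` follow the dataset, whereas the helper names `change_scores` and `welch_t`, the data layout and the use of Welch's unequal-variance statistic are assumptions made for illustration.

```python
import math
import statistics


def change_scores(rows, year):
    """Per-subject change from baseline (year 0) to the given year."""
    return [row[f"chol{year}"] - row["chol0"] for row in rows]


def welch_t(group_a, group_b):
    """Difference in means (a minus b), its standard error, the Welch t
    statistic and the Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(group_a) / len(group_a)
    vb = statistics.variance(group_b) / len(group_b)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    se = math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(group_a) - 1)
                           + vb ** 2 / (len(group_b) - 1))
    return diff, se, diff / se, df


# Made-up records for illustration only, not the Framingham data.
women = [
    {"chol0": 200, "chol2": 206, "chol4": 215},
    {"chol0": 215, "chol2": 214, "chol4": 226},
    {"chol0": 190, "chol2": 199, "chol4": 204},
    {"chol0": 228, "chol2": 230, "chol4": 241},
]
men = [
    {"chol0": 210, "chol2": 212, "chol4": 214},
    {"chol0": 225, "chol2": 229, "chol4": 228},
    {"chol0": 198, "chol2": 197, "chol4": 205},
    {"chol0": 240, "chol2": 243, "chol4": 246},
]

# Analyse each year separately, as a separate two-group comparison.
results = {}
for year in (2, 4):
    results[year] = welch_t(change_scores(women, year),
                            change_scores(men, year))
    diff, se, t_stat, df = results[year]
    print(f"year {year}: difference in mean change={diff:.2f}  "
          f"se={se:.2f}  t={t_stat:.2f}  df={df:.1f}")
```
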
Henrik Stryhn
(hstryhn@upei.ca) 2024-10-09