Supplementary Exercises 7.102, 7.103 and 7.104 of IPS7e
-------------------------------------------------------

Data: 2 samples of changes (improvements, differences after-before) in spatial-temporal reasoning test scores for 34 children attending six months of piano lessons and 44 children in a control group. Note that we already analyzed the piano group in Exercises 7.58 and 7.59.

Model: the 2 samples are independent and each a simple random sample (i.i.d. sample) from a distribution with unknown mean and standard deviation (mu1 and sigma1 for the piano lesson group, mu2 and sigma2 for the control group).

(a) Minitab commands:

MTB > WOpen "H:\VHM\VHM801\Datasets\Minitab\Chapter 7\ex07_102.mtw".
Retrieving worksheet from file: ‘H:\VHM\VHM801\Datasets\Minitab\Chapter 7\ex07_102.mtw’
Worksheet was saved on 02/11/2014

MTB > name c4 'change'
MTB > Stem-and-Leaf 'change';
SUBC>   By 'g'.

Stem-and-Leaf Display: change

Stem-and-leaf of change  g = 0  N = 34
Leaf Unit = 0.10

  1   -3  0
  3   -2  00
  4   -1  0
  5   -0  0
  6    0  0
  7    1  0
 10    2  000
 15    3  00000
 (7)   4  0000000
 12    5  00
 10    6  000
  7    7  00000
  2    8
  2    9  00

Stem-and-leaf of change  g = 1  N = 44
Leaf Unit = 0.10

  1   -6  0
  1   -5
  2   -4  0
  5   -3  000
  7   -2  00
 14   -1  0000000
 19   -0  00000
 (6)   0  000000
 19    1  000000
 13    2  0000000
  6    3  0
  5    4  000
  2    5  0
  1    6
  1    7  0

MTB > Describe 'change';
SUBC>   By 'group';
SUBC>   Mean;
SUBC>   SEMean;
SUBC>   StDeviation;
SUBC>   QOne;
SUBC>   Median;
SUBC>   QThree;
SUBC>   Minimum;
SUBC>   Maximum;
SUBC>   Skewness;
SUBC>   Kurtosis;
SUBC>   N.

Descriptive Statistics: change

Variable  group     N   Mean  SE Mean  StDev  Minimum      Q1  Median     Q3  Maximum
change    control  44  0.386    0.365  2.423   -6.000  -1.000   0.000  2.000    7.000
          piano    34  3.618    0.524  3.055   -3.000   2.000   4.000  6.000    9.000

Variable  group    Skewness  Kurtosis
change    control      0.12      1.04
          piano       -0.36     -0.28

MTB > Dotplot ( 'change' ) * 'group'.

Dotplot of change vs group

MTB > PPlot 'change';
SUBC>   Normal;
SUBC>   Symbol;
SUBC>   FitD;
SUBC>   Grid 2;
SUBC>   Grid 1;
SUBC>   MGrid 1;
SUBC>   Panel 'group'.
Probability Plot of change

The P-value of the Anderson-Darling test of normality is 0.066 (group=control)
The P-value of the Anderson-Darling test of normality is 0.227 (group=piano)

MTB > GSummary 'change';
SUBC>   By 'group'.

Results for group = control

Summary Report for change (group = control)

Results for group = piano

Summary Report for change (group = piano)

Comments for 7.102 (a) and (b)
------------------------------
The distributions are displayed by stemplots and dotplots. Note that the stemplot in Minitab artificially splits the observations with a value of zero across 2 stems, -0 and 0 (this is clearly not desirable). The table of descriptive statistics contains the mean, standard deviation and standard error of the mean, as requested.

Both distributions look reasonably symmetric and bell-shaped. The normal plots and normality tests show no reason to reject a normal distribution for the piano group. The distribution for the control group is somewhat too peaked for a normal distribution (kurtosis=1.04), and the P-value of the normality test is as low as 0.066. Among the different normality tests, the A-D test is the only one showing something near significance for the control group. It seems reasonable to maintain the normal distribution assumption even in view of a possible mild violation.

Comments for 7.102 (c) and 7.103
--------------------------------
The interest is in comparing the changes in score between the piano and control groups. Even if the primary interest is in an improvement of the scores in the piano group over the control group, there seems to be no a priori reason to consider only a difference in that direction. Therefore, our hypotheses are two-sided:

H0: mu1=mu2
Ha: mu1<>mu2

Since both distributions look reasonably normal, we may assume normal distributions and base the inference (confidence interval and test) on the two-sample t procedures.

MTB > TwoT 'change' 'group';
SUBC>   Confidence 95.0;
SUBC>   Test 0.0;
SUBC>   Alternative .
Two-Sample T-Test and CI: change, group

Two-sample T for change

group     N  Mean  StDev  SE Mean
control  44  0.39   2.42     0.37
piano    34  3.62   3.06     0.52

Difference = mu (control) - mu (piano)
Estimate for difference:  -3.231
95% CI for difference:  (-4.508, -1.954)
T-Test of difference = 0 (vs not =): T-Value = -5.06  P-Value = 0.000  DF = 61

Comments:
---------
The t-test (without assuming the same standard deviation in the two groups) gives t = -5.06 with approximate DF=61, which is highly significant (the P-value is reported as 0.000, i.e. below 0.0005). Similar results are obtained with other variants of the test: the conservative DF, or assuming equal variances (the two sample standard deviations are not too far apart). There is clear evidence of a difference between the two groups: the piano lesson group improved more than the control group. We also note that the evidence against H0 is so strong that any deviations from the normal distribution are without practical importance. The 95% confidence interval puts the additional improvement in the piano group at about 2 to 4.5 units (of test scores).

Minitab technical note: The difference between the two group means is computed as control minus piano, and therefore shows as negative. If we are interested in the difference piano minus control, we can use all the above results and simply switch the signs. Alternatively, we can make Minitab compute the difference in the preferred direction by changing the labels of the groups so that the piano group becomes the first one (alphabetically, it is the second one). The variable g in the Minitab worksheet would do this (g=0 for piano, g=1 for control). We could also unstack the data into two columns and then enter the columns in the desired order.

Comments for Exercise 7.104
---------------------------
The advantage of including a control group is that any improvement in the scores due to aging (or perhaps other types of confounding) is taken into account. The control group data show that such an improvement is at most minor.
The primary advantage of carrying out a significance test over using a confidence interval is that it gives a P-value, which is a more informative measure of the evidence against the null hypothesis than mere significance at the 5% level. On the other hand, the confidence interval is useful, indeed indispensable, for quantifying the likely magnitude of the effect; recall that statistical significance is not the same as biological significance.
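
Numerical check of the two-sample t results: the T-value, DF and confidence interval reported by Minitab above can be reproduced approximately from the summary statistics alone. The following is a minimal Python sketch, not part of the original Minitab session. The function names are hypothetical, the inputs are the rounded values from the Describe output, and the critical value 2.000 is the usual table value for 60 DF, used here as an assumption in place of the exact t quantile. The pooled and conservative-DF variants mentioned in the comments are included for comparison.

```python
import math


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic from summary statistics.

    Returns (difference, standard error, t, Welch-Satterthwaite DF).
    """
    v1 = sd1 ** 2 / n1  # squared standard error of the mean, group 1
    v2 = sd2 ** 2 / n2  # squared standard error of the mean, group 2
    diff = mean1 - mean2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, se, diff / se, df


def pooled_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming equal variances in both groups."""
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    diff = mean1 - mean2
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return diff, se, diff / se, n1 + n2 - 2


# (mean, StDev, N) as printed in the Describe output; control first,
# matching Minitab's "mu (control) - mu (piano)".
control = (0.386, 2.423, 44)
piano = (3.618, 3.055, 34)

diff, se, t_welch, df_welch = welch_from_summary(*control, *piano)
_, _, t_pooled, df_pooled = pooled_from_summary(*control, *piano)
df_conservative = min(control[2], piano[2]) - 1  # smaller n minus 1

# Assumption: 95% two-sided table value for 60 DF, standing in for the
# exact quantile at the Welch DF.
T_CRIT = 2.000
ci = (diff - T_CRIT * se, diff + T_CRIT * se)

print(f"difference = {diff:.3f}, SE = {se:.4f}")
print(f"Welch:  t = {t_welch:.2f}, DF = {df_welch:.2f}")
print(f"Pooled: t = {t_pooled:.2f}, DF = {df_pooled}")
print(f"Conservative DF = {df_conservative}")
print(f"Approximate 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Minitab truncates the Welch DF to an integer, so the unrounded value computed here should be compared with DF = 61 after truncation. Small discrepancies in the third decimal of the difference and the interval limits are expected, because the inputs are rounded. Swapping the two argument groups gives the difference piano minus control, with the signs reversed.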