Solution file for additional exercise 10.4
------------------------------------------

(see exercise 10.1 for discussion of model, design and notation)

MTB > WOpen "h:\vhm\vhm802\data_csv\hs10_4.csv";
SUBC>   FType;
SUBC>     CSV;
SUBC>   DecSep;
SUBC>     Period;
SUBC>   Field;
SUBC>     Comma;
SUBC>   TDelimiter;
SUBC>     DoubleQuote.
Retrieving worksheet from file: 'h:\vhm\vhm802\data_csv\hs10_4.csv'
Worksheet was saved on 03/04/2011

MTB > Name c5 "SRES1" c6 "TRES1"
MTB > GLM 'pH' = strain litter( strain);
SUBC>   Random 'litter';
SUBC>   Brief 2 ;
SUBC>   EMS;
SUBC>   Means strain;
SUBC>   SResiduals 'SRES1';
SUBC>   TResiduals 'TRES1';
SUBC>   GFourpack;
SUBC>   RType 2 .

General Linear Model: pH versus strain, litter

Factor          Type    Levels  Values
strain          fixed        2  pHH, pHL
litter(strain)  random      14  1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7

Analysis of Variance for pH, using Adjusted SS for Tests

Source          DF    Seq SS    Adj SS    Adj MS     F      P
strain           1  0.006645  0.006645  0.006645  1.26  0.283
litter(strain)  12  0.063093  0.063093  0.005258  2.22  0.028
Error           42  0.099375  0.099375  0.002366
Total           55  0.169113

S = 0.0486423   R-Sq = 41.24%   R-Sq(adj) = 23.05%

Unusual Observations for pH

Obs       pH      Fit   SE Fit  Residual  St Resid
  2  7.39000  7.47500  0.02432  -0.08500     -2.02 R
 31  7.63000  7.53250  0.02432   0.09750      2.31 R
 46  7.55000  7.44250  0.02432   0.10750      2.55 R

R denotes an observation with a large standardized residual.

Expected Mean Squares, using Adjusted SS

                   Expected Mean Square
   Source          for Each Term
1  strain          (3) + 4.0000 (2) + Q[1]
2  litter(strain)  (3) + 4.0000 (2)
3  Error           (3)

Variance Components, using Adjusted SS

                Estimated
Source              Value
litter(strain)    0.00072
Error             0.00237

Least Squares Means for pH

strain   Mean
pHH     7.477
pHL     7.455

Residual Plots for pH

MTB > NormTest 'SRES1'.

Probability Plot of SRES1

The P-value of the Anderson-Darling test of normality is 0.429.
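As a cross-check on the output above, the F-statistics and the variance components can be reproduced by hand from the mean squares and the expected mean squares table. The sketch below does this in Python rather than Minitab. The variable names are illustrative and not part of the original session, and the litter size of 4 is read off the coefficient 4.0000 in the EMS table.

```python
# Mean squares copied from the "Adj MS" column of the ANOVA table.
ms_strain = 0.006645
ms_litter = 0.005258
ms_error = 0.002366

# Coefficient of term (2) in the expected mean squares table.
mice_per_litter = 4

# EMS(strain) = EMS(litter(strain)) + Q[1], so strain is tested
# against litter(strain), not against Error.
f_strain = ms_strain / ms_litter

# EMS(litter(strain)) = (3) + 4 (2), so litter is tested against Error.
f_litter = ms_litter / ms_error

# Method-of-moments estimates: solve the EMS equations for (3) and (2).
var_error = ms_error
var_litter = (ms_litter - ms_error) / mice_per_litter

print(f"F(strain) = {f_strain:.2f}")
print(f"F(litter) = {f_litter:.2f}")
print(f"sigma^2 (error)     = {var_error:.5f}")
print(f"sigma^2_B (litters) = {var_litter:.5f}")
```

The printed values should agree with the Minitab output up to rounding of the tabulated mean squares.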
Comments and answers to questions:
----------------------------------

The residuals of the error terms look okay, and the most extreme residual
has a corresponding deletion residual of 2.74, which is no cause for
concern in a dataset of this size. We discuss the residuals of the litter
random effects below.

The mean strain levels are given above as least squares means. Their
standard errors can be computed by the usual formula, except that MS(Litt)
is used instead of MSE, because strain is tested against litter(strain).
With 28 observations per strain:

   SE = sqrt(MS(Litt)/28) = sqrt(0.005258/28) = 0.014

The estimated variance components are listed above as well:

   sigma^2   (error):   0.00237
   sigma^2_B (litters): 0.00072

There appears to be much more variation among mice within a litter than
among litters: the error variance is roughly three times the litter
variance.

The ANOVA table shows weak effects of both litters and strains. In
particular, the F-statistic for testing no strain effect is only 1.26,
with a P-value of 0.28. Therefore, the breeding does not really seem to
have been successful: the pHH strain does have a higher mean level than
the pHL strain, but the difference could very well be caused by random
fluctuations.

Next we give commands to compute residuals at the litter level, that is,
estimated random litter effects. Essentially, we aggregate the data within
litters and analyze the litter means.

MTB > Name c7 "ByVar1" c8 "ByVar2" c9 "Mean1"
MTB > Statistics 'pH';
SUBC>   By 'strain' 'litter';
SUBC>   GValues 'ByVar1'-'ByVar2';
SUBC>   Mean 'Mean1'.
MTB > Name c10 "SRES2"
MTB > GLM 'Mean1' = ByVar1;
SUBC>   Brief 2 ;
SUBC>   Means ByVar1;
SUBC>   SResiduals 'SRES2';
SUBC>   GFourpack;
SUBC>   RType 2 .

General Linear Model: Mean1 versus ByVar1

Factor  Type   Levels  Values
ByVar1  fixed       2  pHH, pHL

Analysis of Variance for Mean1, using Adjusted SS for Tests

Source  DF    Seq SS    Adj SS    Adj MS     F      P
ByVar1   1  0.001661  0.001661  0.001661  1.26  0.283
Error   12  0.015773  0.015773  0.001314
Total   13  0.017434

S = 0.0362551   R-Sq = 9.53%   R-Sq(adj) = 1.99%

Least Squares Means for Mean1

ByVar1   Mean  SE Mean
pHH     7.477  0.01370
pHL     7.455  0.01370

Residual Plots for Mean1

MTB > NormTest 'SRES2'.

Probability Plot of SRES2

The P-value of the Anderson-Darling test of normality is 0.485.

Comments:
---------

The ANOVA table gives the same F-test for strains as the full analysis
above (and this should be so). Also, the standard errors of the strain
least squares means are exactly the ones calculated above.

The residuals in this analysis show no particular problems - neither
different variation for the two strains, nor departures from normality.
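
The agreement between the two analyses can also be checked numerically. The following Python sketch (not part of the Minitab session; all variable names are illustrative) assumes the balanced design implied by the output, 7 litters per strain with 4 mice each, and compares the quantities from the full nested analysis with those from the litter-means analysis.

```python
import math

# Assumed balanced design: 2 strains x 7 litters x 4 mice = 56 observations.
mice_per_litter = 4
litters_per_strain = 7

# Full nested analysis (Adj MS column of the first ANOVA table).
ms_strain_full = 0.006645
ms_litter_full = 0.005258

# Litter-means analysis (Adj MS column of the second ANOVA table).
ms_strain_means = 0.001661
ms_error_means = 0.001314

# Averaging 4 mice per litter divides every mean square by 4,
# so the ratio (the F-statistic for strain) is unchanged.
f_full = ms_strain_full / ms_litter_full
f_means = ms_strain_means / ms_error_means

# Standard error of a strain mean, computed both ways.
se_full = math.sqrt(ms_litter_full / (mice_per_litter * litters_per_strain))
se_means = math.sqrt(ms_error_means / litters_per_strain)

print(f"F, full analysis:    {f_full:.2f}")
print(f"F, litter means:     {f_means:.2f}")
print(f"SE, full analysis:   {se_full:.4f}")
print(f"SE, litter means:    {se_means:.4f}")
```

Small differences beyond the printed digits come only from the rounding of the tabulated mean squares.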