Solution file for additional exercise 9.1 ------------------------------------------ Data: measurements of peak expiratory flow (PEF) of 13 children 8 hours after treatment with two different inhalation medicines. Notation: y_i = PEF for treatment episode i, i=1,...,26, or y_ijk = PEF for child j after treatment i in period j, i=F,S; j=1,...,13; k=1,2. (a) The design is a AB/BA cross-over design with 7 children for treatment sequence FS and 6 children for sequence SF. If no carry-over effect is assumed, the statistical model should include only treatment, period and subject effects: y_i = mu + alpha_tx(i) + beta_period(i) + gamma_child(i) + eps_i, or y_ijk = mu + alpha_i + beta_k + gamma_j + eps_ijk, depending on the chosen notation. The errors are assumed independent and distributed as a N(0,sigma^2) distribution. MTB > WOpen "h:\vhm\vhm802\data_csv\hs09_1.csv"; SUBC> FType; SUBC> CSV; SUBC> DecSep; SUBC> Period; SUBC> Field; SUBC> Comma; SUBC> TDelimiter; SUBC> DoubleQuote. Retrieving worksheet from file: 'h:\vhm\vhm802\data_csv\hs09_1.csv' Worksheet was saved on 03/03/2011 MTB > Name c6 "SRES1" c7 "TRES1" MTB > GLM 'pef' = id period tx; SUBC> Brief 2 ; SUBC> Means period tx; SUBC> SResiduals 'SRES1'; SUBC> TResiduals 'TRES1'; SUBC> GFourpack; SUBC> RType 2 . General Linear Model: pef versus id, period, tx Factor Type Levels Values id fixed 13 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14 period fixed 2 1, 2 tx fixed 2 F, S Analysis of Variance for pef, using Adjusted SS for Tests Source DF Seq SS Adj SS Adj MS F P id 12 115213.5 115213.5 9601.1 12.79 0.000 period 1 984.6 1632.1 1632.1 2.17 0.168 tx 1 14035.9 14035.9 14035.9 18.70 0.001 Error 11 8254.5 8254.5 750.4 Total 25 138488.5 S = 27.3935 R-Sq = 94.04% R-Sq(adj) = 86.45% Least Squares Means for pef period Mean SE Mean 1 310.5 7.609 2 326.4 7.609 tx F 341.8 7.609 S 295.2 7.609 Residual Plots for pef Comments: --------- The normal plot looks fine, but the residual plot shows some tendency of ``inverse cone''-shape (decreasing variance with increasing values). This is essentially due to one pair of large residuals (note that all residuals come in pairs with positive and negative signs but the same value) at the lower end of the predicted values. No residuals are suspiciously large. A Box-Cox analysis in Stata gives an optimal power of 1.82 and clear evidence in favour of a transformation; in Minitab we get a 95% CI of (1.195,*) where the * means that the interval has no finite upper bound. However, the tests of homoscedasticity (in Stata) are non-significant. As power transformations with powers >1 are somewhat unusual and its justification in the present case is not intuitively obvious, we decide to work with the untransformed data. The ANOVA table shows a clear subject effect, no period effect and a clearly significant treatment effect (P<0.0005). The (least squares) means show that the new drug, formoterol, on the average increases the pef by 46 units. The Minitab commands to analyse the period differences 1-2 are given below. MTB > Unstack ('id' 'order' 'pef'); SUBC> Subscripts 'period'; SUBC> NewWS "period"; SUBC> VarNames. MTB > Worksheet "period" MTB > Name C7 'diffper' MTB > Let 'diffper' = 'pef_1'-'pef_2' MTB > TwoT 'diffper' 'order_1'; SUBC> Pooled. Two-Sample T-Test and CI: diffper, order_1 Two-sample T for diffper SE order_1 N Mean StDev Mean FS 7 30.7 33.0 12 SF 6 -62.5 44.7 18 Difference = mu (FS) - mu (SF) Estimate for difference: 93.2143 95% CI for difference: (45.7762, 140.6524) T-Test of difference = 0 (vs not =): T-Value = 4.32 P-Value = 0.001 DF = 11 Both use Pooled StDev = 38.7403 Comments: --------- The two-sample t-test with equal variances does indeed reproduce the strong significance from above; note that t^2=4.32^2=18.7=F. (b) The model with subject random effects can be run in Minitab as long as the interaction period*tx is not included. MTB > GLM 'pef' = id period tx; SUBC> Random 'id'; SUBC> Brief 2 ; SUBC> EMS. General Linear Model: pef versus id, period, tx Factor Type Levels Values id random 13 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14 period fixed 2 1, 2 tx fixed 2 F, S Analysis of Variance for pef, using Adjusted SS for Tests Source DF Seq SS Adj SS Adj MS F P id 12 115213.5 115213.5 9601.1 12.79 0.000 period 1 984.6 1632.1 1632.1 2.17 0.168 tx 1 14035.9 14035.9 14035.9 18.70 0.001 Error 11 8254.5 8254.5 750.4 Total 25 138488.5 S = 27.3935 R-Sq = 94.04% R-Sq(adj) = 86.45% Expected Mean Squares, using Adjusted SS Expected Mean Square for Each Source Term 1 id (4) + 2.0000 (1) 2 period (4) + Q[2] 3 tx (4) + Q[3] 4 Error (4) Error Terms for Tests, using Adjusted SS Synthesis of Error Source Error DF Error MS MS 1 id 11.00 750.4 (4) 2 period 11.00 750.4 (4) 3 tx 11.00 750.4 (4) Variance Components, using Adjusted SS Estimated Source Value id 4425.4 Error 750.4 Comments: --------- The ANOVA table is unchanged from the fixed effects model. The estimated variance component for subjects is very large: 4425 compared to 750 for the within-subject variation. This shows that the cross-over design is much more efficient than an ordinary completely randomised design would be. Stata commands and output for part (b): . mixed pef i.Tx i.period || id:, reml Performing EM optimization: Performing gradient-based optimization: Iteration 0: log restricted-likelihood = -127.56466 Iteration 1: log restricted-likelihood = -127.56466 Computing standard errors: Mixed-effects REML regression Number of obs = 26 Group variable: id Number of groups = 13 Obs per group: min = 2 avg = 2.0 max = 2 Wald chi2(2) = 20.02 Log restricted-likelihood = -127.56466 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ pef | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- Tx | S | -46.60714 10.77656 -4.32 0.000 -67.72881 -25.48548 2.period | 15.89286 10.77656 1.47 0.140 -5.228807 37.01452 _cons | 333.8187 20.56391 16.23 0.000 293.5142 374.1232 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ id: Identity | var(_cons) | 4425.36 1966.341 1852.38 10572.24 -----------------------------+------------------------------------------------ var(Residual) | 750.4055 319.9739 325.3438 1730.81 ------------------------------------------------------------------------------ LR test vs. linear regression: chibar2(01) = 14.67 Prob >= chibar2 = 0.0001 . * added analysis with tx by period interaction . mixed pef i.Tx##i.period || id:, reml Performing EM optimization: Performing gradient-based optimization: Iteration 0: log restricted-likelihood = -122.26323 Iteration 1: log restricted-likelihood = -122.26323 Computing standard errors: Mixed-effects REML regression Number of obs = 26 Group variable: id Number of groups = 13 Obs per group: min = 2 avg = 2.0 max = 2 Wald chi2(3) = 20.05 Log restricted-likelihood = -122.26323 Prob > chi2 = 0.0002 ------------------------------------------------------------------------------ pef | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- Tx | S | -53.80952 41.62196 -1.29 0.196 -135.3871 27.76802 2.period | 8.690476 41.62196 0.21 0.835 -72.88707 90.26802 | Tx#period | S#2 | 14.40476 80.40531 0.18 0.858 -143.1867 171.9963 | _cons | 337.1429 28.27655 11.92 0.000 281.7218 392.5639 ------------------------------------------------------------------------------ ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ id: Identity | var(_cons) | 4846.539 2232.299 1965.039 11953.42 -----------------------------+------------------------------------------------ var(Residual) | 750.4055 319.9739 325.3438 1730.81 ------------------------------------------------------------------------------ LR test vs. linear regression: chibar2(01) = 15.24 Prob >= chibar2 = 0.0000 Comments: --------- The random effects model with no interaction gives similar but not identical results to the ANOVA-based analysis. The variance components are identical (as is true for all "nice" designs, when the likelihood-based analysis uses REML estimation). The tests for the fixed effects give slightly more significant P-values, due to the use of a standard normal reference distribution instead of the (exact) t-distribution. The treatment by period interaction shows absolutely no significance. Note however the changed results for the two main effects; these are no longer useful (the model should be refitted without the interaction before assessing the main effects). A treatment by period interaction would mean that the difference between treatments depends on the period. Such an effect would most likely be a carry-over effect (although there could also be genuinely different treatment effects in periods). For example, treatment A differs in periods 1 and 2 by being either the first treatment applied (AB) and by being the follow-up treatment after treatment B was applied (BA). In the simple AB/BA-design we cannot from the data alone distinguish between carry-over effects and a genuine treatment by period interaction. With a totally non-significant interaction, we conclude that no carry-over effects seem to be present in these data.