Solution file for Exercise 13.5 (GO)
------------------------------------

Data: measurements of responses of 16 disk drives produced with 4 different
substrates (A-D), on 4 different days, by 4 different machines and 4 different
operators.

Notation:
y_i    = response (in microvolts times 10^-2) for i'th disk drive,
         i=1,...,16, or
y_ijkl = response (in microvolts times 10^-2) for drive produced with
         substrate i, by operator j, on machine k and on day l,
         i=A,B,C,D; j=1,2,3,4; k=1,2,3,4; l=1,2,3,4.

The design is a 4x4 Graeco-Latin square because the symbols for both
substrates and days occur once in every row and column, and in addition every
(substrate, day) combination occurs exactly once. Machine, operator and day
may be considered blocking factors, so the design accounts for three blocking
factors simultaneously. Note that the sequential and partial (adjusted) sums
of squares in the ANOVA table below are identical; this is a result of the
orthogonality of all factors (allowing them to be assessed independently).

The statistical model is
y_i = mu + alpha_substrate(i) + beta_operator(i) + gamma_machine(i)
         + delta_day(i) + epsilon_i, or
y_ijkl = mu + alpha_i + beta_j + gamma_k + delta_l + epsilon_ijkl,
depending on the chosen notation.

MTB > WOpen "C:\DATA.AVC\teaching\vhm802\08P\ch13ex5.csv";
SUBC> FType;
SUBC> CSV;
SUBC> DecSep;
SUBC> Period;
SUBC> Field;
SUBC> Comma;
SUBC> TDelimiter;
SUBC> DoubleQuote.
Retrieving worksheet from file: 'C:\DATA.AVC\teaching\vhm802\08P\ch13ex5.csv'
Worksheet was saved on 04/02/2012

MTB > GLM 'y' = operator machine 'day_txt' 'tx_txt';
SUBC> Brief 2 ;
SUBC> Means operator machine 'day_txt' 'tx_txt';
SUBC> GFourpack;
SUBC> RType 2 .

General Linear Model: y versus operator, machine, day_txt, tx_txt

Factor    Type   Levels  Values
operator  fixed       4  1, 2, 3, 4
machine   fixed       4  1, 2, 3, 4
day_txt   fixed       4  alpha, beta, delta, gamma
tx_txt    fixed       4  A, B, C, D

Analysis of Variance for y, using Adjusted SS for Tests

Source    DF   Seq SS   Adj SS  Adj MS     F      P
operator   3   14.000   14.000   4.667  0.65  0.633
machine    3   21.500   21.500   7.167  1.00  0.500
day_txt    3    3.500    3.500   1.167  0.16  0.915
tx_txt     3   61.500   61.500  20.500  2.86  0.206
Error      3   21.500   21.500   7.167
Total     15  122.000

S = 2.67706   R-Sq = 82.38%   R-Sq(adj) = 11.89%

Least Squares Means for y

operator   Mean  SE Mean
1         5.500    1.339
2         7.500    1.339
3         5.000    1.339
4         6.000    1.339
machine
1         7.250    1.339
2         4.500    1.339
3         7.000    1.339
4         5.250    1.339
day_txt
alpha     6.000    1.339
beta      6.250    1.339
delta     5.250    1.339
gamma     6.500    1.339
tx_txt
A         5.750    1.339
B         5.750    1.339
C         9.000    1.339
D         3.500    1.339

Residual Plots for y

Comments:
---------
The analysis shows no significant effect of treatments, nor of any of the
blocking factors. With only 3 degrees of freedom for error, the residuals are
not of any use for model checking (they take only four different values). Note
that the power of the F-tests is also rather low with so few degrees of
freedom for error, and we should perhaps not dismiss the treatment effects as
entirely uninteresting. The means show the biggest difference between C and D,
the two types of glass substrates. It is not clear whether that would be an
expected or unexpected result.

One question is whether to refit the model without the non-significant factors
(excluding the treatment, of course, which is the factor of primary interest).
Usually this is not worthwhile because the partial and sequential sums of
squares coincide (so that the sums of squares of the remaining factors would
be unchanged). However, with only 3 degrees of freedom for error, one
potential and important advantage could be to increase the degrees of freedom
for error (pooling).
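
The pooling arithmetic described above can be sketched by hand from the ANOVA table. The following Python snippet is only an illustration, not part of the Minitab session: the dictionary `anova` and the helper `f_ratio` are names introduced here, and the sums of squares and degrees of freedom are copied from the table. It computes F-statistics only; P-values would require an F distribution and are not attempted.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table above.
anova = {
    "operator": (14.0, 3),
    "machine": (21.5, 3),
    "day_txt": (3.5, 3),
    "tx_txt": (61.5, 3),
    "Error": (21.5, 3),
}


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return (error mean square, F statistic) for one effect."""
    ms_error = ss_error / df_error
    return ms_error, (ss_effect / df_effect) / ms_error


ss_tx, df_tx = anova["tx_txt"]

# Full Graeco-Latin model: error term exactly as reported.
ss_err, df_err = anova["Error"]
ms_full, f_full = f_ratio(ss_tx, df_tx, ss_err, df_err)

# Pooled model: fold all three blocking factors into the error term.
blocks = ("operator", "machine", "day_txt")
ss_pool = ss_err + sum(anova[b][0] for b in blocks)
df_pool = df_err + sum(anova[b][1] for b in blocks)
ms_pool, f_pool = f_ratio(ss_tx, df_tx, ss_pool, df_pool)

print(f"full:   MSE = {ms_full:.3f} on {df_err} df, F = {f_full:.2f}")
print(f"pooled: MSE = {ms_pool:.3f} on {df_pool} df, F = {f_pool:.2f}")
```

Because the factors are orthogonal, the treatment sum of squares is the same in both fits; only the error term in the denominator changes.
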
Because the non-significant effects here are quite small, pooling will also
decrease the estimated error variance, which in turn gives larger F-statistics
for the remaining factors. We explore the effects of pooling all blocking
effects into error, thereby effectively ignoring the statistical design
completely.

MTB > GLM 'y' = 'tx_txt';
SUBC> Brief 2 ;
SUBC> Means 'tx_txt';
SUBC> GFourpack;
SUBC> RType 2 ;
SUBC> Pairwise 'tx_txt';
SUBC> Bonferroni;
SUBC> NoCI.

General Linear Model: y versus tx_txt

Factor  Type   Levels  Values
tx_txt  fixed       4  A, B, C, D

Analysis of Variance for y, using Adjusted SS for Tests

Source  DF   Seq SS  Adj SS  Adj MS     F      P
tx_txt   3   61.500  61.500  20.500  4.07  0.033
Error   12   60.500  60.500   5.042
Total   15  122.000

S = 2.24537   R-Sq = 50.41%   R-Sq(adj) = 38.01%

Least Squares Means for y

tx_txt   Mean  SE Mean
A       5.750    1.123
B       5.750    1.123
C       9.000    1.123
D       3.500    1.123

Grouping Information Using Bonferroni Method and 95.0% Confidence

tx_txt  N  Mean  Grouping
C       4   9.0  A
B       4   5.8  A B
A       4   5.8  A B
D       4   3.5    B

Means that do not share a letter are significantly different.

Bonferroni Simultaneous Tests
Response Variable y
All Pairwise Comparisons among Levels of tx_txt

tx_txt = A  subtracted from:

        Difference       SE of           Adjusted
tx_txt    of Means  Difference  T-Value   P-Value
B           -0.000       1.588   -0.000    1.0000
C            3.250       1.588    2.047    0.3793
D           -2.250       1.588   -1.417    1.0000

tx_txt = B  subtracted from:

        Difference       SE of           Adjusted
tx_txt    of Means  Difference  T-Value   P-Value
C            3.250       1.588    2.047    0.3793
D           -2.250       1.588   -1.417    1.0000

tx_txt = C  subtracted from:

        Difference       SE of           Adjusted
tx_txt    of Means  Difference  T-Value   P-Value
D           -5.500       1.588   -3.464    0.0281

Residual Plots for y

Comments:
---------
The change in results is remarkable: the treatment factor is now significant
(P = 0.033). The only significant pairwise comparison between treatments is
between C and D (the two glass substrates). The residual plots look good.
It is essentially impossible to determine from the data whether the analysis without the blocking factors is valid. The best conclusion is perhaps that the study shows an indication of treatment differences and suggests further investigation of treatment effects, preferably in an experiment with greater power than the 4x4 Graeco-Latin design.
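
As a check on the pairwise comparisons in the pooled analysis, the standard errors and T-values of the Bonferroni table can be reproduced from the pooled error mean square and the treatment means. The snippet below is a minimal sketch using only the Python standard library; the variable names are introduced here for illustration, and the inputs are copied from the Minitab output. The adjusted P-values are not recomputed, since that would need a t distribution with 12 degrees of freedom.

```python
import math
from itertools import combinations

# Values copied from the pooled one-way output (tx_txt only).
ms_error = 60.5 / 12   # pooled error mean square (Adj MS = 5.042)
n_per_group = 4        # disk drives per substrate
means = {"A": 5.75, "B": 5.75, "C": 9.00, "D": 3.50}

# Standard error of a single treatment mean and of a difference of two means.
se_mean = math.sqrt(ms_error / n_per_group)
se_diff = math.sqrt(2 * ms_error / n_per_group)

# Same orientation as the Minitab table: the earlier level is
# subtracted from the later one (e.g. D - C).
t_values = {}
for first, second in combinations(sorted(means), 2):
    t_values[(first, second)] = (means[second] - means[first]) / se_diff

print(f"SE mean = {se_mean:.3f}, SE difference = {se_diff:.3f}")
for (first, second), t in t_values.items():
    print(f"{second} - {first}: t = {t:.3f}")
```

With four treatments there are 6 pairwise comparisons, so the Bonferroni adjustment multiplies each unadjusted two-sided P-value by 6 (capped at 1).
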