Solution file for additional exercise 10.10
-------------------------------------------
Data on antibiotic blood serum levels, measured during a pilot trial for
5 subjects at 1, 2, 3 and 6 hours after medication. Each subject went through two
measurement periods, each with a different drug, with a wash-out period in between.

- notation:
y_ijk = antibiotic level for subject i with drug j, measured at time k,
   i = 1,2,3,4,5 (subjects),
   j = 1,2 (drug: A, B),
   k = 1,2,3,4 (hours after medication: 1, 2, 3, 6),
- repeated measures data with 2 series of 4 measurements on each subject,
- the treatment factor (drug) varies within subjects, therefore the
design does not have split-plot character (no whole-plot factor),
- may at first sight be viewed as a block design with
  * drugs & time = treatment factors,
  * subjects = blocks,
however this leaves out one important effect in the model: the subject*drug
interaction, corresponding to measurement periods for each of the
subjects; in fact, the repeated measures are taken over drug*subject
units,
- model: 
y_ijk = mu + A_i + beta_j + AB_ij + gamma_k + (beta gamma)_jk + eps_ijk, 
      where A_i's are assumed i.i.d. N(0,sigma_A^2),
      where AB_ij's are assumed i.i.d. N(0,sigma_AB^2),
      where eps_ijk's are assumed i.i.d. N(0,sigma^2),
Subject effects are taken as random here because the variation between
subjects may be of interest.
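
The degrees of freedom implied by this model can be checked with a small
bookkeeping sketch. The function name and arguments below are illustrative
and not part of the exercise, and the sketch assumes that removing
observations leaves every model term estimable.

```python
# Degrees-of-freedom bookkeeping for the model
#   y_ijk = mu + A_i + beta_j + AB_ij + gamma_k + (beta gamma)_jk + eps_ijk
# Hypothetical helper; assumes every term stays estimable after deletions.

def degrees_of_freedom(n_subjects, n_drugs, n_times, n_missing=0):
    """Return the df of each model term, of the error and of the total."""
    n_obs = n_subjects * n_drugs * n_times - n_missing
    df = {
        "subject": n_subjects - 1,
        "drug": n_drugs - 1,
        "time": n_times - 1,
        "subject*drug": (n_subjects - 1) * (n_drugs - 1),
        "drug*time": (n_drugs - 1) * (n_times - 1),
    }
    # Error df = total df minus the df used by the five model terms.
    df["error"] = (n_obs - 1) - sum(df.values())
    df["total"] = n_obs - 1
    return df

full = degrees_of_freedom(5, 2, 4)
print(full)
```

With 5 subjects, 2 drugs and 4 time points this leaves 24 error degrees of
freedom out of 39 in total; each removed observation (n_missing) lowers both
counts by one.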

Answers to questions:
- experimental design: repeated measures with treatments within
subjects, may be viewed as a block design with subject*drugs as blocks,
(which does however ignore the ordering of measurements over time),
- effects of interest: drug, drug*time, drug*subject,
- experimental unit for drug treatment: single measurement or period 
(NOT subject because drugs are compared within subjects),
- analysis at a single time point: 2-way layout with treatments (drugs)
and blocks (subjects) (2*5 design).

MTB > WOpen "H:\VHM\VHM802\Data_csv\hs10_10.csv";
SUBC>   FType;
SUBC>     CSV;
SUBC>   DecSep;
SUBC>     Period;
SUBC>   Field;
SUBC>     Comma;
SUBC>   TDelimiter;
SUBC>     DoubleQuote.
Retrieving worksheet from file: ‘H:\VHM\VHM802\Data_csv\hs10_10.csv’
Worksheet was saved on 19/03/2011

MTB > Plot 'y'*'time';
SUBC>   Symbol 'subject';
SUBC>   Connect 'subject';
SUBC>   Panel 'drug'.
Scatterplot of y vs time 

MTB > GLM;
SUBC>   Response 'y';
SUBC>   Nodefault;
SUBC>   Categorical 'subject' 'drug' 'time';
SUBC>   Random subject;
SUBC>   Terms subject drug time subject*drug drug*time;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TEMS;
SUBC>   TVariance;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>  GFOURPACK.
General Linear Model: y versus subject, drug, time 

Method
Factor coding  (-1, 0, +1)

Factor Information
Factor   Type    Levels  Values
subject  Random       5  1, 2, 3, 4, 5
drug     Fixed        2  A, B
time     Fixed        4  1, 2, 3, 6

Analysis of Variance
Source          DF   Seq SS  Contribution   Adj SS   Adj MS  F-Value  P-Value
  subject        4   4.4351        34.91%  4.43512  1.10878     1.82    0.288
  drug           1   0.0497         0.39%  0.04970  0.04970     0.08    0.789
  time           3   3.2716        25.75%  3.27159  1.09053    10.85    0.000
  subject*drug   4   2.4365        19.18%  2.43654  0.60913     6.06    0.002
  drug*time      3   0.0988         0.78%  0.09885  0.03295     0.33    0.805
Error           24   2.4125        18.99%  2.41254  0.10052
Total           39  12.7043       100.00%

Model Summary
       S    R-sq  R-sq(adj)   PRESS  R-sq(pred)
0.317053  81.01%     69.14%  6.7015      47.25%

Coefficients
Term             Coef  SE Coef        95% CI        T-Value  P-Value   VIF
Constant       1.1762   0.0501  ( 1.0728,  1.2797)    23.46    0.000
subject
  1             0.610    0.100  (  0.403,   0.817)     6.08    0.000     *
  2            -0.161    0.100  ( -0.368,   0.046)    -1.61    0.121     *
  3             0.052    0.100  ( -0.154,   0.259)     0.52    0.605     *
  4            -0.133    0.100  ( -0.339,   0.074)    -1.32    0.199     *
drug
  A           -0.0352   0.0501  (-0.1387,  0.0682)    -0.70    0.489  1.00
time
  1           -0.2572   0.0868  (-0.4365, -0.0780)    -2.96    0.007  1.50
  2            0.4257   0.0868  ( 0.2465,  0.6050)     4.90    0.000  1.50
  3            0.0967   0.0868  (-0.0825,  0.2760)     1.11    0.276  1.50
subject*drug
  1 A          -0.316    0.100  ( -0.523,  -0.109)    -3.15    0.004     *
  2 A           0.385    0.100  (  0.178,   0.592)     3.84    0.001     *
  3 A           0.154    0.100  ( -0.053,   0.361)     1.54    0.138     *
  4 A          -0.174    0.100  ( -0.380,   0.033)    -1.73    0.096     *
drug*time
  A 1          0.0443   0.0868  (-0.1350,  0.2235)     0.51    0.615  1.50
  A 2          0.0532   0.0868  (-0.1260,  0.2325)     0.61    0.545  1.50
  A 3         -0.0617   0.0868  (-0.2410,  0.1175)    -0.71    0.484  1.50

Regression Equation
y = 1.1762 + 0.610 subject_1 - 0.161 subject_2 + 0.052 subject_3 - 0.133 subject_4 - 0.369 subject_5
    - 0.0352 drug_A + 0.0352 drug_B - 0.2572 time_1 + 0.4257 time_2 + 0.0967 time_3 - 0.2652 time_6
    - 0.316 subject*drug_1 A + 0.316 subject*drug_1 B + 0.385 subject*drug_2 A
    - 0.385 subject*drug_2 B + 0.154 subject*drug_3 A - 0.154 subject*drug_3 B
    - 0.174 subject*drug_4 A + 0.174 subject*drug_4 B - 0.050 subject*drug_5 A
    + 0.050 subject*drug_5 B + 0.0443 drug*time_A 1 + 0.0532 drug*time_A 2 - 0.0617 drug*time_A 3
    - 0.0357 drug*time_A 6 - 0.0443 drug*time_B 1 - 0.0532 drug*time_B 2 + 0.0617 drug*time_B 3
    + 0.0357 drug*time_B 6
Equation treats random terms as though they are fixed.

Fits and Diagnostics for Unusual Observations
Obs      y    Fit  SE Fit      95% CI       Resid  Std Resid  Del Resid   HI  Cook’s D     DFITS
 29  0.320  0.951   0.201  (0.537, 1.365)  -0.631      -2.57      -2.95  0.4      0.28  -2.41204  R
 30  2.120  1.625   0.201  (1.211, 2.039)   0.495       2.02       2.16  0.4      0.17   1.76759  R
 37  1.480  0.591   0.201  (0.177, 1.005)   0.889       3.62       5.26  0.4      0.55   4.29408  R
R  Large residual

Expected Mean Squares, using Adjusted SS
   Source        Expected Mean Square for Each Term
1  subject       (6) + 4.0000 (4) + 8.0000 (1)
2  drug          (6) + 4.0000 (4) + Q[2, 5]
3  time          (6) + Q[3, 5]
4  subject*drug  (6) + 4.0000 (4)
5  drug*time     (6) + Q[5]
6  Error         (6)

Variance Components, using Adjusted SS
Source         Variance  % of Total     StDev  % of Total
subject       0.0624559      21.53%  0.249912      46.40%
subject*drug   0.127153      43.83%  0.356585      66.20%
Error          0.100523      34.65%  0.317053      58.86%
Total          0.290131              0.538638
 
Residual Plots for y  
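
The variance components in the table above follow from the expected mean
squares by the method of moments. The sketch below reproduces that
arithmetic; the function name and argument names are illustrative, and the
default coefficients 4 and 8 are the ones printed in the expected mean
squares table for the full data.

```python
def variance_components(ms_subject, ms_subject_drug, ms_error,
                        coef_sd=4.0, coef_subject=8.0):
    """Method-of-moments estimates from the expected mean squares:
         E[MS error]        = sigma^2
         E[MS subject*drug] = sigma^2 + coef_sd * sigma_AB^2
         E[MS subject]      = sigma^2 + coef_sd * sigma_AB^2
                                      + coef_subject * sigma_A^2
    """
    var_error = ms_error
    var_sd = (ms_subject_drug - ms_error) / coef_sd
    var_subject = (ms_subject - ms_subject_drug) / coef_subject
    return {"subject": var_subject, "subject*drug": var_sd, "error": var_error}

# Adjusted mean squares copied from the first ANOVA table.
vc = variance_components(1.10878, 0.60913, 0.10052)
total = sum(vc.values())
for name, value in vc.items():
    print(f"{name:13s} {value:.6f}  {100 * value / total:.2f}%")
```

The same function can be applied to the later, unbalanced analyses by passing
the non-integer coefficients from their expected mean squares tables
(for example 3.8571 and 7.7143 after removal of one observation).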

Comments:
---------
The residual plots show a very strong outlier: observation 37, which is
the first measurement (1 hour) for subject 5 with drug B. It is the highest
in that series, in contrast to all other series, which peak later than 1 hour. Also,
the value is higher than all values for subject 5 with drug A. The
P-value computed from the deletion residual of 5.26 is about 0.001. 
Before deciding about this observation, we consider a possible
transformation of the outcome; the residual plots showed patterns
that could indicate other problems than just a single outlying
observation. A Box-Cox analysis for the fixed effects model suggested an
optimal power for transformation of 0.5, and as shown below observation
37 remains an extreme outlier also after transformation (in fact, the
deletion residual increases to 5.66).
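
The P-value quoted for the deletion residual can be approximated without a
statistics package. The sketch below rests on two assumptions that the text
does not state: that the deletion residual is referred to a t-distribution
with 24 - 1 = 23 degrees of freedom, and that the quoted value is
Bonferroni-adjusted for the 40 observations. The function name is
illustrative, and the numerical integration requires df > 2.

```python
import math

def t_two_sided_p(t, df, steps=2000):
    """Two-sided P-value of a t statistic, computed as the regularized
    incomplete beta function I_x(df/2, 1/2) with x = df / (df + t^2),
    using Simpson's rule (steps must be even, df must exceed 2)."""
    a, b = df / 2.0, 0.5
    x = df / (df + t * t)

    def integrand(u):
        return u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return integral / beta

p_raw = t_two_sided_p(5.26, 23)   # unadjusted two-sided P-value
p_bonferroni = min(1.0, 40 * p_raw)  # assumed adjustment for 40 observations
print(p_raw, p_bonferroni)
```

Under these assumptions the adjusted value is close to 0.001, in line with
the figure given in the comments.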

MTB > GLM;
SUBC>   Response 'y';
SUBC>   Nodefault;
SUBC>   Categorical 'subject' 'drug' 'time';
SUBC>   Terms subject drug time subject*drug drug*time;
SUBC>   Boxcox;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>  GFOURPACK.
General Linear Model: y versus subject, drug, time 

Box-Cox transformation
Rounded lambda               0.5
Estimated lambda             0.609761
95% CI for lambda            (0.156261, 1.06226)

...

Fits and Diagnostics for Unusual Observations

Original Response
Obs       y     Fit       95% CI
 29  0.3200  0.8636  (0.5381, 1.2656)
 37  1.4800  0.6121  (0.3442, 0.9566)

Transformed Response
Obs      y'     Fit  SE Fit       95% CI         Resid  Std Resid  Del Resid   HI  Cook’s D
 29  0.5657  0.9293  0.0948  (0.7336, 1.1250)  -0.3636      -3.13      -3.98  0.4      0.41
 37  1.2166  0.7824  0.0948  (0.5867, 0.9781)   0.4342       3.74       5.66  0.4      0.58
Obs     DFITS
 29  -3.25363  R
 37   4.62399  R
y' = transformed response
R  Large residual
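
The "Original Response" block above is the back-transformation of the fits
and confidence limits obtained on the transformed scale. A minimal sketch of
that relation for the rounded power 0.5 (the function names are
illustrative):

```python
import math

LAMBDA = 0.5  # rounded Box-Cox power reported in the output

def transform(y, lam=LAMBDA):
    """Power transform for the rounded lambda; lam == 0 means natural log."""
    return math.log(y) if lam == 0 else y ** lam

def back_transform(z, lam=LAMBDA):
    """Inverse of transform(): maps a transformed value back to the y scale."""
    return math.exp(z) if lam == 0 else z ** (1.0 / lam)

# Observation 37: response, and fit with 95% CI on the transformed scale.
y_37 = 1.48
fit_37, ci_37 = 0.7824, (0.5867, 0.9781)

print(round(transform(y_37), 4))
print(round(back_transform(fit_37), 4))
print(tuple(round(back_transform(limit), 4) for limit in ci_37))
```

The printed values agree with the tabulated ones up to rounding of the
four-decimal inputs.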

Comments:
---------
We therefore decide to remove that observation and rerun the analysis.
Before continuing, we note that the first ANOVA table shows no effects
whatsoever of drug or drug*time. There is some effect of drug*subject,
indicating the importance of the drug*subject variation in the data.
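
The F-ratios behind these remarks can be reproduced from the mean squares of
the first ANOVA table. The choice of denominators in the sketch follows the
expected mean squares table: subject and drug are tested against
subject*drug, the remaining terms against the error. The dictionary names
are illustrative.

```python
# Adjusted mean squares copied from the first ANOVA table (full data).
ms = {
    "subject": 1.10878,
    "drug": 0.04970,
    "time": 1.09053,
    "subject*drug": 0.60913,
    "drug*time": 0.03295,
    "error": 0.10052,
}

# Denominator for each term, read off the expected mean squares.
denominator = {
    "subject": "subject*drug",
    "drug": "subject*drug",
    "time": "error",
    "subject*drug": "error",
    "drug*time": "error",
}

f_ratios = {term: ms[term] / ms[den] for term, den in denominator.items()}
for term, f in f_ratios.items():
    print(f"{term:13s} F = {f:.2f}  (tested against {denominator[term]})")
```

Testing drug against subject*drug rather than against the error is what
makes the period (subject*drug) the experimental unit for the drug
comparison.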

MTB > Copy 'y' c5;
SUBC>   Varnames.
MTB > let c5(37)='*'

MTB > Name C6 "SRES".
MTB > GLM;
SUBC>   Response 'y_1';
SUBC>   Nodefault;
SUBC>   Categorical 'subject' 'drug' 'time';
SUBC>   Random subject;
SUBC>   Terms subject drug time subject*drug drug*time;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TEMS;
SUBC>   TVariance;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>  GFOURPACK;
SUBC>  SResiduals 'SRES_1'.
General Linear Model: y_1 versus subject, drug, time 

Method
Factor coding  (-1, 0, +1)
Rows unused    1

Factor Information
Factor   Type    Levels  Values
subject  Random       5  1, 2, 3, 4, 5
drug     Fixed        2  A, B
time     Fixed        4  1, 2, 3, 6

Analysis of Variance
Source          DF   Seq SS  Contribution   Adj SS   Adj MS  F-Value  P-Value
  subject        4   4.8574        38.52%  5.36567  1.34142     2.15    0.238
  drug           1   0.0106         0.08%  0.00012  0.00012     0.00    0.989  x
  time           3   3.8263        30.34%  3.99752  1.33251    27.98    0.000
  subject*drug   4   2.4613        19.52%  2.49481  0.62370    13.10    0.000
  drug*time      3   0.3589         2.85%  0.35886  0.11962     2.51    0.084
Error           23   1.0953         8.69%  1.09534  0.04762
Total           38  12.6097       100.00%
x Not an exact F-test.

Model Summary
       S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
0.218228  91.31%     85.65%  3.22384      74.43%

Coefficients
Term             Coef  SE Coef        95% CI        T-Value  P-Value   VIF
Constant       1.1392   0.0352  ( 1.0664,  1.2121)    32.35    0.000
subject
  1            0.6470   0.0694  ( 0.5035,  0.7905)     9.33    0.000     *
  2           -0.1242   0.0694  (-0.2677,  0.0193)    -1.79    0.087     *
  3            0.0895   0.0694  (-0.0540,  0.2330)     1.29    0.210     *
  4           -0.0955   0.0694  (-0.2390,  0.0480)    -1.38    0.182     *
drug
  A            0.0018   0.0352  (-0.0711,  0.0746)     0.05    0.960  1.01
time
  1           -0.3684   0.0634  (-0.4995, -0.2372)    -5.81    0.000  1.60
  2            0.4628   0.0602  ( 0.3383,  0.5873)     7.69    0.000  1.52
  3            0.1338   0.0602  ( 0.0093,  0.2583)     2.22    0.036  1.52
subject*drug
  1 A         -0.3530   0.0694  (-0.4965, -0.2095)    -5.09    0.000     *
  2 A          0.3482   0.0694  ( 0.2047,  0.4917)     5.02    0.000     *
  3 A          0.1170   0.0694  (-0.0265,  0.2605)     1.69    0.105     *
  4 A         -0.2105   0.0694  (-0.3540, -0.0670)    -3.04    0.006     *
drug*time
  A 1          0.1554   0.0634  ( 0.0242,  0.2865)     2.45    0.022  1.60
  A 2          0.0162   0.0602  (-0.1083,  0.1407)     0.27    0.790  1.52
  A 3         -0.0988   0.0602  (-0.2233,  0.0257)    -1.64    0.114  1.52

Regression Equation
y_1 = 1.1392 + 0.6470 subject_1 - 0.1242 subject_2 + 0.0895 subject_3 - 0.0955 subject_4
      - 0.5169 subject_5 + 0.0018 drug_A - 0.0018 drug_B - 0.3684 time_1 + 0.4628 time_2
      + 0.1338 time_3 - 0.2282 time_6 - 0.3530 subject*drug_1 A + 0.3530 subject*drug_1 B
      + 0.3482 subject*drug_2 A - 0.3482 subject*drug_2 B + 0.1170 subject*drug_3 A
      - 0.1170 subject*drug_3 B - 0.2105 subject*drug_4 A + 0.2105 subject*drug_4 B
      + 0.0984 subject*drug_5 A - 0.0984 subject*drug_5 B + 0.1554 drug*time_A 1
      + 0.0162 drug*time_A 2 - 0.0988 drug*time_A 3 - 0.0728 drug*time_A 6 - 0.1554 drug*time_B 1
      - 0.0162 drug*time_B 2 + 0.0988 drug*time_B 3 + 0.0728 drug*time_B 6
Equation treats random terms as though they are fixed.

Fits and Diagnostics for Unusual Observations
Obs    y_1    Fit  SE Fit       95% CI       Resid  Std Resid  Del Resid        HI  Cook’s D
 13  0.620  0.141   0.144  (-0.157, 0.440)   0.479       2.93       3.61  0.437500      0.42
 29  0.320  0.729   0.144  ( 0.430, 1.027)  -0.409      -2.50      -2.86  0.437500      0.30
 30  2.120  1.699   0.139  ( 1.412, 1.986)   0.421       2.50       2.86  0.404167      0.26
Obs     DFITS
 13   3.18371  R
 29  -2.52318  R
 30   2.35811  R
R  Large residual

Expected Mean Squares, using Adjusted SS
   Source        Expected Mean Square for Each Term
1  subject       (6) + 3.8571 (4) + 7.7143 (1)
2  drug          (6) + 3.8400 (4) + Q[2, 5]
3  time          (6) + Q[3, 5]
4  subject*drug  (6) + 3.8571 (4)
5  drug*time     (6) + Q[5]
6  Error         (6)

Variance Components, using Adjusted SS
Source         Variance  % of Total     StDev  % of Total
subject       0.0930372      32.08%  0.305020      56.64%
subject*drug   0.149354      51.50%  0.386463      71.76%
Error         0.0476234      16.42%  0.218228      40.52%
Total          0.290014              0.538530
 
Residual Plots for y_1 

MTB > NormTest 'SRES'.
Probability Plot of SRES 
The P-value for the Anderson-Darling test is 0.032.

Comments:
---------
The model without observation 37 no longer has any extreme residuals, but
the residual plots look strange, possibly indicating a right-skewed
distribution. Quite remarkably, the drug*time effect is now close to
significance, with a P-value of 0.084. The effect lies in the comparison
between drugs at 1 hour, where now, after the removal of obs. 37, drug
B lies lower than drug A. Obviously, such a conclusion would have to be
treated with some reservation, given its strong dependence on the removal of obs. 37.

For this model, one may try alternative correlation structures for the
repeated measures on the subjects (see SAS and Stata analyses). The results 
show that assuming equal correlation among all time points is actually a
quite good assumption for these data.
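
The equal-correlation structure implied by the random effects model can be
read off the variance components. The sketch below uses the estimates from
the analysis without obs. 37; the variable names are illustrative, and the
two correlations are simply the ratios implied by the model, not output from
the SAS or Stata analyses.

```python
# Variance components from the analysis without obs. 37.
var_subject = 0.0930372       # sigma_A^2
var_subject_drug = 0.149354   # sigma_AB^2
var_error = 0.0476234         # sigma^2

var_total = var_subject + var_subject_drug + var_error

# Two measurements on the same subject within the same drug period
# share both the subject effect and the subject*drug effect.
corr_same_period = (var_subject + var_subject_drug) / var_total

# Two measurements on the same subject in different drug periods
# share only the subject effect.
corr_other_period = var_subject / var_total

print(round(corr_same_period, 3), round(corr_other_period, 3))
```

Under the model, every pair of time points within a period has the same
correlation, whatever the time lag between them; this is the assumption that
the alternative correlation structures relax.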

As the residuals were not yet fully satisfactory, we reconsider the option
of a transformation. Also without obs. 37 the optimal Box-Cox power is 
around 0.5, but as shown below the analysis on square-root transformed scale
shows another extreme observation, namely no. 29. 

MTB > GLM;
SUBC>   Response 'y_1';
SUBC>   Nodefault;
SUBC>   Categorical 'subject' 'drug' 'time';
SUBC>   Terms subject drug time subject*drug drug*time;
SUBC>   Boxcox;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>  GFOURPACK.
General Linear Model: y_1 versus subject, drug, time 

Method
Factor coding           (-1, 0, +1)
Rows unused             1

Box-Cox transformation
Rounded lambda               0.5
Estimated lambda             0.566831
95% CI for lambda            (0.230331, 0.900331)

...

Fits and Diagnostics for Unusual Observations

Original Response
Obs     y_1     Fit       95% CI
 13  0.6200  0.3146  (0.1810, 0.4849)
 29  0.3200  0.6736  (0.4697, 0.9143)
 30  2.1200  1.6342  (1.3183, 1.9839)

Transformed Response
Obs    y_1'     Fit  SE Fit       95% CI         Resid  Std Resid  Del Resid        HI  Cook’s D
 13  0.7874  0.5609  0.0655  (0.4255, 0.6963)   0.2265       3.05       3.87  0.437500      0.45
 29  0.5657  0.8207  0.0655  (0.6853, 0.9562)  -0.2551      -3.44      -4.82  0.437500      0.57
 30  1.4560  1.2784  0.0629  (1.1482, 1.4085)   0.1777       2.33       2.60  0.404167      0.23
Obs     DFITS
 13   3.41106  R
 29  -4.24802  R
 30   2.14180  R
y_1' = transformed response
R  Large residual

Comments:
---------
Observation no. 29 is the first observation (1 hour) for subject 4 with drug B. 
Compared to the other observations in that series, it appears quite low
but not necessarily too unusual. It is only after transformation that
this becomes unusually low. At this point we have several choices. We
could go back to the first analysis of the full dataset on original
scale, acknowledging that the model assumptions are not fully met. We
could also remove both observations (29 and 37) and reconsider the
analysis on either original or transformed scale. This has the advantage
of "balancing" the removal of observations because we would remove one
low and one high value for drug B and 1 hour, thereby presumably not
affecting the drug*time interaction too strongly. Without these
observations the Box-Cox analysis gives an optimal power of 0.16, and 
no evidence against a log-transformation. To complete the analysis, we 
therefore also try a log-transformation of the outcome, without
both of these extreme outliers.

MTB > Copy 'y_1' c7;
SUBC>   Varnames.
MTB > let c7(29)='*'

MTB > GLM;
SUBC>   Response 'y_1_1';
SUBC>   Nodefault;
SUBC>   Categorical 'subject' 'drug' 'time';
SUBC>   Terms subject drug time subject*drug drug*time;
SUBC>   Boxcox;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>  GFOURPACK.
General Linear Model: y_1_1 versus subject, drug, time 

Method
Factor coding           (-1, 0, +1)
Rows unused             2

Box-Cox transformation
Rounded lambda               0
Estimated lambda             0.157766
95% CI for lambda            (-0.138734, 0.452266)

MTB > Name C8 'lny'
MTB > Let 'lny' = ln('y_1_1')


Factor Information

Factor   Type   Levels  Values
subject  Fixed       5  1, 2, 3, 4, 5
drug     Fixed       2  A, B
time     Fixed       4  1, 2, 3, 6


Analysis of Variance for Transformed Response

Source          DF   Seq SS  Contribution   Adj SS    Adj MS  F-Value  P-Value
  subject        4  3.39529        41.19%  3.62168  0.905420    53.95    0.000
  drug           1  0.00000         0.00%  0.00446  0.004461     0.27    0.611
  time           3  2.51007        30.45%  2.38912  0.796373    47.45    0.000
  subject*drug   4  1.91904        23.28%  1.87426  0.468564    27.92    0.000
  drug*time      3  0.04857         0.59%  0.04857  0.016191     0.96    0.427
Error           22  0.36923         4.48%  0.36923  0.016783
Total           37  8.24220       100.00%


Model Summary for Transformed Response

       S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
0.129550  95.52%     92.47%  1.18400      85.63%


Coefficients for Transformed Response

Term             Coef  SE Coef        95% CI        T-Value  P-Value   VIF
Constant       0.0447   0.0216  (-0.0000,  0.0895)     2.07    0.050
subject
  1            0.4894   0.0415  ( 0.4033,  0.5756)    11.78    0.000  1.54
  2           -0.1272   0.0415  (-0.2134, -0.0411)    -3.06    0.006  1.54
  3            0.1131   0.0415  ( 0.0269,  0.1992)     2.72    0.012  1.54
  4            0.0016   0.0442  (-0.0901,  0.0934)     0.04    0.971  1.63
drug
  A            0.0111   0.0216  (-0.0336,  0.0559)     0.52    0.611  1.05
time
  1           -0.2395   0.0410  (-0.3244, -0.1545)    -5.85    0.000  1.79
  2            0.3578   0.0361  ( 0.2829,  0.4328)     9.90    0.000  1.56
  3            0.1220   0.0361  ( 0.0470,  0.1969)     3.38    0.003  1.56
subject*drug
  1 A         -0.2118   0.0415  (-0.2979, -0.1256)    -5.10    0.000  1.54
  2 A          0.3378   0.0415  ( 0.2516,  0.4239)     8.13    0.000  1.54
  3 A          0.0931   0.0415  ( 0.0070,  0.1792)     2.24    0.035  1.54
  4 A         -0.2776   0.0442  (-0.3693, -0.1858)    -6.27    0.000  1.63
drug*time
  A 1          0.0518   0.0410  (-0.0332,  0.1368)     1.26    0.219  1.79
  A 2          0.0208   0.0361  (-0.0541,  0.0957)     0.58    0.571  1.56
  A 3         -0.0363   0.0361  (-0.1112,  0.0386)    -1.01    0.326  1.56


Regression Equation

ln(y_1_1) = 0.0447 + 0.4894 subject_1 - 0.1272 subject_2 + 0.1131 subject_3 + 0.0016 subject_4
            - 0.4769 subject_5 + 0.0111 drug_A - 0.0111 drug_B - 0.2395 time_1 + 0.3578 time_2
            + 0.1220 time_3 - 0.2403 time_6 - 0.2118 subject*drug_1 A + 0.2118 subject*drug_1 B
            + 0.3378 subject*drug_2 A - 0.3378 subject*drug_2 B + 0.0931 subject*drug_3 A
            - 0.0931 subject*drug_3 B - 0.2776 subject*drug_4 A + 0.2776 subject*drug_4 B
            + 0.0585 subject*drug_5 A - 0.0585 subject*drug_5 B + 0.0518 drug*time_A 1
            + 0.0208 drug*time_A 2 - 0.0363 drug*time_A 3 - 0.0363 drug*time_A 6
            - 0.0518 drug*time_B 1 - 0.0208 drug*time_B 2 + 0.0363 drug*time_B 3
            + 0.0363 drug*time_B 6


Fits and Diagnostics for Unusual Observations

Original Response

Obs   y_1_1     Fit       95% CI
 13  0.6200  0.4855  (0.4015, 0.5870)
 21  0.6500  0.7884  (0.6520, 0.9534)


Transformed Response

Obs   y_1_1'      Fit  SE Fit        95% CI          Resid  Std Resid  Del Resid   HI  Cook’s D
 13  -0.4780  -0.7226  0.0916  (-0.9126, -0.5327)   0.2446       2.67       3.17  0.5      0.45
 21  -0.4308  -0.2377  0.0916  (-0.4277, -0.0477)  -0.1931      -2.11      -2.31  0.5      0.28

Obs     DFITS
 13   3.17322  R
 21  -2.30515  R

y_1_1' = transformed response
R  Large residual
 
Residual Plots for y_1_1 

MTB > Name C9 'lny'
MTB > Let 'lny' = ln('y_1_1')

MTB > GLM;
SUBC>   Response 'lny';
SUBC>   Nodefault;
SUBC>   Categorical 'subject' 'drug' 'time';
SUBC>   Random subject;
SUBC>   Terms subject drug time subject*drug drug*time;
SUBC>   Means drug time;
SUBC>   TExpand;
SUBC>   TMethod;
SUBC>   TAnova;
SUBC>   TSummary;
SUBC>   TCoefficients;
SUBC>   TEquation;
SUBC>   TFactor;
SUBC>   TEMS;
SUBC>   TVariance;
SUBC>   TMeans;
SUBC>   TDiagnostics 0;
SUBC>   Rtype 2;
SUBC>  GFOURPACK.
General Linear Model: lny versus subject, drug, time 

Method
Factor coding  (-1, 0, +1)
Rows unused    2

Factor Information
Factor   Type    Levels  Values
subject  Random       5  1, 2, 3, 4, 5
drug     Fixed        2  A, B
time     Fixed        4  1, 2, 3, 6

Analysis of Variance
Source          DF   Seq SS  Contribution   Adj SS    Adj MS  F-Value  P-Value
  subject        4  3.39529        41.19%  3.62168  0.905420     1.93    0.270
  drug           1  0.00000         0.00%  0.00446  0.004461     0.01    0.926  x
  time           3  2.51007        30.45%  2.38912  0.796373    47.45    0.000
  subject*drug   4  1.91904        23.28%  1.87426  0.468564    27.92    0.000
  drug*time      3  0.04857         0.59%  0.04857  0.016191     0.96    0.427
Error           22  0.36923         4.48%  0.36923  0.016783
Total           37  8.24220       100.00%
x Not an exact F-test.

Model Summary
       S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
0.129550  95.52%     92.47%  1.18400      85.63%

Coefficients
Term             Coef  SE Coef        95% CI        T-Value  P-Value   VIF
Constant       0.0447   0.0216  (-0.0000,  0.0895)     2.07    0.050
subject
  1            0.4894   0.0415  ( 0.4033,  0.5756)    11.78    0.000     *
  2           -0.1272   0.0415  (-0.2134, -0.0411)    -3.06    0.006     *
  3            0.1131   0.0415  ( 0.0269,  0.1992)     2.72    0.012     *
  4            0.0016   0.0442  (-0.0901,  0.0934)     0.04    0.971     *
drug
  A            0.0111   0.0216  (-0.0336,  0.0559)     0.52    0.611  1.05
time
  1           -0.2395   0.0410  (-0.3244, -0.1545)    -5.85    0.000  1.79
  2            0.3578   0.0361  ( 0.2829,  0.4328)     9.90    0.000  1.56
  3            0.1220   0.0361  ( 0.0470,  0.1969)     3.38    0.003  1.56
subject*drug
  1 A         -0.2118   0.0415  (-0.2979, -0.1256)    -5.10    0.000     *
  2 A          0.3378   0.0415  ( 0.2516,  0.4239)     8.13    0.000     *
  3 A          0.0931   0.0415  ( 0.0070,  0.1792)     2.24    0.035     *
  4 A         -0.2776   0.0442  (-0.3693, -0.1858)    -6.27    0.000     *
drug*time
  A 1          0.0518   0.0410  (-0.0332,  0.1368)     1.26    0.219  1.79
  A 2          0.0208   0.0361  (-0.0541,  0.0957)     0.58    0.571  1.56
  A 3         -0.0363   0.0361  (-0.1112,  0.0386)    -1.01    0.326  1.56

Regression Equation
lny = 0.0447 + 0.4894 subject_1 - 0.1272 subject_2 + 0.1131 subject_3 + 0.0016 subject_4
      - 0.4769 subject_5 + 0.0111 drug_A - 0.0111 drug_B - 0.2395 time_1 + 0.3578 time_2
      + 0.1220 time_3 - 0.2403 time_6 - 0.2118 subject*drug_1 A + 0.2118 subject*drug_1 B
      + 0.3378 subject*drug_2 A - 0.3378 subject*drug_2 B + 0.0931 subject*drug_3 A
      - 0.0931 subject*drug_3 B - 0.2776 subject*drug_4 A + 0.2776 subject*drug_4 B
      + 0.0585 subject*drug_5 A - 0.0585 subject*drug_5 B + 0.0518 drug*time_A 1
      + 0.0208 drug*time_A 2 - 0.0363 drug*time_A 3 - 0.0363 drug*time_A 6 - 0.0518 drug*time_B 1
      - 0.0208 drug*time_B 2 + 0.0363 drug*time_B 3 + 0.0363 drug*time_B 6
Equation treats random terms as though they are fixed.

Fits and Diagnostics for Unusual Observations
Obs      lny      Fit  SE Fit        95% CI          Resid  Std Resid  Del Resid   HI  Cook’s D
 13  -0.4780  -0.7226  0.0916  (-0.9126, -0.5327)   0.2446       2.67       3.17  0.5      0.45
 21  -0.4308  -0.2377  0.0916  (-0.4277, -0.0477)  -0.1931      -2.11      -2.31  0.5      0.28
Obs     DFITS
 13   3.17322  R
 21  -2.30515  R
R  Large residual

Expected Mean Squares, using Adjusted SS
   Source        Expected Mean Square for Each Term
1  subject       (6) + 3.7143 (4) + 7.4286 (1)
2  drug          (6) + 3.6000 (4) + Q[2, 5]
3  time          (6) + Q[3, 5]
4  subject*drug  (6) + 3.7143 (4)
5  drug*time     (6) + Q[5]
6  Error         (6)

Means
Term  Fitted Mean
drug
  A      0.055873
  B      0.033610
time
  1     -0.194718
  2      0.402571
  3      0.166715
  6     -0.195602

Variance Components, using Adjusted SS
Source         Variance  % of Total     StDev  % of Total
subject       0.0588075      29.82%  0.242503      54.61%
subject*drug   0.121633      61.67%  0.348760      78.53%
Error         0.0167833       8.51%  0.129550      29.17%
Total          0.197224              0.444099
 
Residual Plots for lny 

Comments:
---------
After removal of the two outlying observations the residuals look very nice.
Similar to the original model, there is no indication of significance for the
drug*time interaction (P = 0.427), contrary to the model without obs. 37 only.
The drug main effect is also non-significant, so we conclude that there is no
sign of a difference between the drugs.
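
To express the drug comparison on the original scale, the fitted means on the
log scale can be back-transformed. The sketch below uses the fitted means
from the Means table; the variable names are illustrative, and the
back-transformed values are geometric-mean type estimates, not arithmetic
means.

```python
import math

# Fitted means on the natural-log scale, copied from the Means table.
log_mean_A = 0.055873
log_mean_B = 0.033610

# A difference on the log scale corresponds to a ratio on the original scale.
log_difference = log_mean_A - log_mean_B
ratio_A_to_B = math.exp(log_difference)

print(round(math.exp(log_mean_A), 4), round(math.exp(log_mean_B), 4))
print(round(ratio_A_to_B, 4))
```

The log difference equals twice the drug A coefficient (2 * 0.0111) under the
(-1, 0, +1) coding, and the resulting ratio is very close to 1, consistent
with the absence of a drug effect.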

With the non-significant drug effects we do not need to worry about the possible 
effects of violation of the assumptions in the model from repeated measures over 
time, because those would tend to increase our P-values even more (if the data
do not show sphericity). Analyses in SAS and Stata confirm that our conclusions 
do not change when taking the repeated measures into account.

Analysis at single time points does (not surprisingly) show absolutely
no difference between drugs at any time.

Addendum:
---------
The above analysis did not include checks of the distribution of the subject
random effects. Averaging over time within each subject and drug combination,
and analyzing these means in a two-way ANOVA, shows that there are no
problems with the subject random effects, neither for the untransformed nor
for the transformed data (results not shown).