Supplementary exercise 9.50 of IPS7e
------------------------------------

Data: performance assessments (partial, usual, continuous; see text for
further explanation) of employees in two age groups (under and over 40 years).

Minitab commands and output:

MTB > WOpen "H:\VHM\VHM801\Datasets\Minitab\Chapter 9\ex09_050.mtw".
Retrieving worksheet from file: 'H:\VHM\VHM801\Datasets\Minitab\Chapter 9\ex09_050.mtw'
Worksheet was saved on 24/10/2014

MTB > XTabs 'perf' 'over40';
SUBC>   Layout 1 1;
SUBC>   Frequencies 'count';
SUBC>   Counts;
SUBC>   ColPercents;
SUBC>   ChiSquare;
SUBC>   Expected;
SUBC>   XResiduals;
SUBC>   DMissing 'perf' 'over40'.

Tabulated Statistics: perf, over40

Using frequencies in count

Rows: perf   Columns: over40

              n        y      All

contin       63       32       95
          12.55     4.20     7.52
           37.8     57.2
         16.872   11.130

part         82      237      319
          16.33    31.14    25.26
          126.8    192.2
         15.824   10.438

usual       357      492      849
          71.12    64.65    67.22
          337.4    511.6
          1.133    0.747

All         502      761     1263
         100.00   100.00   100.00

Cell Contents:      Count
                    % of Column
                    Expected count
                    Contribution to Chi-square

Pearson Chi-Square = 56.144, DF = 2, P-Value = 0.000
(Likelihood Ratio Chi-Square = 56.969, DF = 2, P-Value = 0.000)

Minitab technical note: to avoid reordering of rows and columns by their
labels, one can set the value ordering of variables in the
"Editor-Column properties-Value order" menu (available only when the
worksheet is the active window, or by right-clicking a highlighted column
of interest).

Comments:
--------
Little information is given about how the data were collected. It seems
most natural that two samples of employees were taken, one for each age
group. Age group is then an explanatory variable rather than a response
variable, and performance is the response variable. Thus, the appropriate
model is two independent multinomial distributions, one for each column in
the table (age group). We assume multinomial settings within each age group.
Parameter estimates are the sample proportions within each group, listed
as column percentages in the above table. The values are fairly close for
the middle outcome "usual" (71.1% under 40, 64.7% over 40) but differ
considerably for the other outcomes. The over 40 age group shows a higher
proportion of "partial" assessments (31.1% against 16.3%), and conversely
the under 40 age group shows a higher proportion of "continual"
assessments (12.6% against 4.2%).

Null hypothesis H0: same proportions in the two groups; alternative
hypothesis Ha: the proportions differ between the two groups.

Test statistic: X^2 = 56.1, df = (3-1)*(2-1) = 2, P-value < 0.0005
(reported by Minitab as 0.000). All cells have large expected counts (the
smallest is 37.8), so the chi-square distribution may be considered a valid
reference distribution.

Conclusion: the test is highly significant and gives clear evidence that
the distribution of performance assessments differs between employees under
and over 40 years of age. The under 40 group received a higher proportion
of assessments indicating continual exceedance of expectations than the
over 40 group, and the over 40 group received a higher proportion of
assessments indicating (only) expectations to be met than the younger group.
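
Cross-check of the Minitab output: as a minimal sketch (not part of the
original Minitab session), the cell contents and the Pearson chi-square
statistic can be recomputed from the six observed counts using only the
Python standard library. The names observed, chi_square_table and
chi2_upper_tail_df2 are illustrative choices, not anything from the
exercise. The P-value uses the fact that a chi-square variable with exactly
2 degrees of freedom has upper-tail probability exp(-x/2); this shortcut
does not apply for other degrees of freedom.

```python
import math

# Observed counts copied from the Minitab table above.
# Keys are the 'perf' labels; each tuple is (over40 = n, over40 = y).
observed = {
    "contin": (63, 32),
    "part": (82, 237),
    "usual": (357, 492),
}


def chi_square_table(table):
    """Return column percents, expected counts, contributions, X^2 and df."""
    rows = list(table)
    n_cols = len(table[rows[0]])
    row_totals = {r: sum(table[r]) for r in rows}
    col_totals = [sum(table[r][j] for r in rows) for j in range(n_cols)]
    total = sum(col_totals)

    col_pct, expected, contrib = {}, {}, {}
    for r in rows:
        # Sample proportions within each age group (the parameter estimates).
        col_pct[r] = [100 * table[r][j] / col_totals[j] for j in range(n_cols)]
        # Expected count under H0: row total * column total / grand total.
        expected[r] = [row_totals[r] * col_totals[j] / total for j in range(n_cols)]
        # Each cell's contribution: (observed - expected)^2 / expected.
        contrib[r] = [
            (table[r][j] - expected[r][j]) ** 2 / expected[r][j]
            for j in range(n_cols)
        ]

    x2 = sum(sum(values) for values in contrib.values())
    df = (len(rows) - 1) * (n_cols - 1)
    return {"col_pct": col_pct, "expected": expected,
            "contrib": contrib, "x2": x2, "df": df}


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2)


result = chi_square_table(observed)
if result["df"] != 2:
    raise ValueError("the closed-form P-value below is valid only for df = 2")
p_value = chi2_upper_tail_df2(result["x2"])

print("X^2 =", round(result["x2"], 3), " df =", result["df"])
print("P-value =", p_value)
```

The computed statistic should agree with Minitab's Pearson Chi-Square to
the printed precision, and the P-value falls far below the 0.0005 threshold
quoted above, which is why Minitab displays it as 0.000.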