Supplementary exercise 4.9 of IPS7e
-----------------------------------

Simulation of draws of Internet users' age group; specifically whether a
randomly selected Internet user is of age 18-29 years. The (true)
probability of this event is assumed to be 0.3.

We use the Probability applet to carry out the simulation, as described in
the exercise. The PSLS applet does not allow a sample size of 20, so we use
15 (the default) instead for part (a).

It is recommended to type the results (number of heads ~ Internet users)
into a column in a Minitab worksheet, in order to facilitate processing of
the results.

Minitab commands and output for the two distributions (obtained in one
session with the Probability applet, resetting between the two series):

MTB > name c1 "count15"
MTB > name c2 "count200"

* type the counts into these columns *

MTB > Name C3 'prop15'
MTB > Let 'prop15' = 'count15'/15
MTB > Name C4 'prop200'
MTB > Let 'prop200' = 'count200'/200
MTB > Describe 'prop15' 'prop200'.

Descriptive Statistics: prop15, prop200

Variable   N  N*     Mean  SE Mean    StDev  Minimum       Q1   Median       Q3  Maximum
prop15    25   0   0.2747   0.0201   0.1006   0.0667   0.2000   0.2667   0.3333   0.4667
prop200   25   0  0.30440  0.00711  0.03554  0.24000  0.28250  0.30500  0.32500  0.39500

MTB > Dotplot 'prop15' 'prop200';
SUBC>   Overlay.

Dotplot of prop15, prop200

MTB > Boxplot 'prop15' 'prop200';
SUBC>   Overlay;
SUBC>   IQRBox;
SUBC>   Outlier.

Boxplot of prop15, prop200

MTB > Stem-and-Leaf 'prop15' 'prop200';
SUBC>   Increment .05.

Stem-and-Leaf Display: prop15, prop200

Stem-and-leaf of prop15  N = 25
Leaf Unit = 0.010

   1   0  6
   2   1  3
   2   1
   9   2  0000000
  (8)  2  66666666
   8   3  3333
   4   3
   4   4  0
   3   4  666

Stem-and-leaf of prop200  N = 25
Leaf Unit = 0.010

   2   2  44
  11   2  568889999
 (12)  3  000112222444
   2   3  59

We may manually align the two stemplots to facilitate the comparison:

        prop15                    prop200

   1   0  6
   2   1  3
   2   1
   9   2  0000000            2   2  44
  (8)  2  66666666          11   2  568889999
   8   3  3333             (12)  3  000112222444
   4   3                     2   3  59
   4   4  0
   3   4  666

Comments:
---------

As also indicated in the exercise, the distribution for 15 tosses is more
variable and not centered as closely on the true probability (0.3) as the
distribution for 200 tosses. See Exercise 4.10 for a numerical comparison
of the variability based on a larger number of repetitions.
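
For readers without access to the applet, the same kind of experiment can
be sketched in a few lines of Python. This is only an illustration, not
the procedure used for the results above: the function names, the seed and
the use of Python's pseudo-random generator in place of the Probability
applet are all assumptions. Only p = 0.3, the sample sizes 15 and 200, and
the 25 repetitions come from the exercise and the output above. The sketch
also evaluates the standard formula sqrt(p(1-p)/n) for the standard
deviation of a sample proportion, which makes the difference in
variability explicit.

```python
import math
import random
import statistics

P_TRUE = 0.3       # probability assumed in the exercise
REPETITIONS = 25   # number of simulated samples per series (N = 25 above)


def simulate_proportions(n, reps=REPETITIONS, p=P_TRUE, seed=None):
    """Return `reps` sample proportions, each based on n independent draws.

    Hypothetical helper: each draw is a "success" (user aged 18-29)
    with probability p.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


def theoretical_sd(n, p=P_TRUE):
    """Standard deviation of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    for n in (15, 200):
        props = simulate_proportions(n, seed=1)  # seed chosen arbitrarily
        print(
            f"n = {n:3d}: mean = {statistics.mean(props):.4f}, "
            f"sd = {statistics.stdev(props):.4f}, "
            f"theoretical sd = {theoretical_sd(n):.4f}"
        )
```

The formula gives about 0.118 for n = 15 and about 0.032 for n = 200,
which is of the same order as the StDev values 0.1006 and 0.0355 in the
Describe output above. With only 25 repetitions per series the simulated
values vary noticeably from run to run.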