Extra exercise 5
----------------

The results of this run of the PSLS "Probability" applet were typed
into Minitab and displayed together with the observed proportions and
the deviations from the expected values, for both proportions and counts:

MTB > Name C1 'trials'
MTB > Name C2 'count'
MTB > Name C3 'proportion'
MTB > Formula 'proportion' = 'count'/'trials'
MTB > Name C4 'diffprop'
MTB > Formula 'diffprop' = 'proportion'-0.5
MTB > Name C5 'diffcount'
MTB > Formula 'diffcount' = 'count'-0.5*'trials'
MTB > Print 'trials'-'diffcount'.
Data Display 

Row  trials  count  proportion    diffprop  diffcount
  1      50     23    0.460000  -0.0400000         -2
  2     100     50    0.500000   0.0000000          0
  3     200     93    0.465000  -0.0350000         -7
  4     400    183    0.457500  -0.0425000        -17
  5    1000    473    0.473000  -0.0270000        -27
  6    2000    979    0.489500  -0.0105000        -21
  7    4000   1963    0.490750  -0.0092500        -37
  8    6000   2975    0.495833  -0.0041667        -25
  9    8000   3958    0.494750  -0.0052500        -42
 10   10000   4967    0.496700  -0.0033000        -33
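
The derived columns can be double-checked outside Minitab. The following
is a minimal Python sketch (Python is not part of the original solution;
the variable names simply mirror the Minitab column names) that recomputes
proportion, diffprop and diffcount from the trials and counts above:

```python
# Trials and counts copied from the Minitab display above.
trials = [50, 100, 200, 400, 1000, 2000, 4000, 6000, 8000, 10000]
count = [23, 50, 93, 183, 473, 979, 1963, 2975, 3958, 4967]

# Same formulas as in the Minitab session.
proportion = [c / n for c, n in zip(count, trials)]
diffprop = [p - 0.5 for p in proportion]
diffcount = [c - 0.5 * n for c, n in zip(count, trials)]

# Print one row per checkpoint, in the same column order as Minitab.
for n, c, p, dp, dc in zip(trials, count, proportion, diffprop, diffcount):
    print(f"{n:6d} {c:6d} {p:10.6f} {dp:11.7f} {dc:6.0f}")
```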

From the discussion in the lecture and the text(s) we would expect the
observed proportions to get closer to 0.5 as the number of trials
increases. This does indeed appear to be the case, but perhaps not as
rapidly as one would expect. The deviation from 0.5 after 10,000 trials
above is well within the statistical uncertainty (we will see later in
the course how to assess this).

It is less clear from the results above whether the deviations between
the observed and expected counts will stabilize as well. It can be
shown mathematically that they will not: the difference between the
observed and expected counts does not remain within any fixed bounds
around zero as n increases. For example, the probability that the two
numbers differ by at most 100 tends to zero as n becomes very large.
This is a limit result, however, and of less practical use unless one
also knows how large n needs to be for the effect to show.
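
To get a rough feel for how large n has to be, one can approximate the
probability numerically. The sketch below is only an illustration and
rests on an assumption not made in the text above: it uses the normal
approximation, in which the count in n fair trials has mean n/2 and
standard deviation 0.5*sqrt(n). The function name prob_within is
hypothetical.

```python
import math

def prob_within(n, bound=100):
    """Approximate P(|count - n/2| <= bound) for n fair trials.

    Assumption of this sketch: normal approximation, with the count
    having mean n/2 and standard deviation 0.5*sqrt(n).
    """
    sd = 0.5 * math.sqrt(n)
    # For a standard normal Z, P(|Z| <= z) = erf(z / sqrt(2)).
    return math.erf(bound / (sd * math.sqrt(2)))

for n in (10_000, 100_000, 1_000_000, 100_000_000):
    print(f"n = {n:>11,d}   P(|diff| <= 100) is about {prob_within(n):.3f}")
```

Under this approximation the probability falls from about 0.95 at
n = 10,000 to below 0.02 at n = 100,000,000, so the decline toward zero
is real but slow.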

From a more practical perspective we can simply note that the difference
between observed and expected counts is "unpredictable" in the sense that
it can be large in either direction. How large it can be is nevertheless
limited: when the difference is divided by n, the resulting value (the
deviation of the observed proportion from 0.5) tends toward zero as n
increases.
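
The contrast between the two kinds of deviation can also be explored by
simulation. The following Python sketch is a stand-in for the applet,
not the applet's own code; the function name coin_run, the checkpoints
and the seed are all hypothetical choices made for illustration. It
continues a single run of fair trials and reports the count deviation
and the proportion deviation at each checkpoint.

```python
import random

def coin_run(checkpoints, p=0.5, seed=1):
    """Simulate one run of trials; report deviations at each checkpoint."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rows = []
    count = 0
    done = 0
    for n in sorted(checkpoints):
        # Continue the same run until n trials have been made in total.
        count += sum(rng.random() < p for _ in range(n - done))
        done = n
        diffcount = count - p * n
        rows.append((n, count, diffcount, diffcount / n))
    return rows

for n, count, diffcount, diffprop in coin_run([100, 1000, 10000, 100000]):
    print(f"{n:7d} {count:7d} {diffcount:8.1f} {diffprop:10.6f}")
```

Trying several seeds typically shows the count deviation wandering in
either direction while the proportion deviation shrinks.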