Extra Exercise 10
-----------------
The numbers throughout this solution correspond to one run with the
applets. Other runs may give different results; this is the nature of
simulation.

Law of Large Numbers: Dice Rolling Example applet:
--------------------------------------------------
(a)
Assuming all outcomes to be equally likely (as is done in the applet),
we would expect the proportion for each outcome to be somewhere close to
1/6=0.167. With the default seed for the random numbers (123), the proportion 
for 2 spots is quite a bit higher (0.22) than expected, and the proportions 
for 1 and 4 spots (both 0.14) are a bit lower.

From the graph it appears that the first four averages are: 4, 4.5, 5
and 4. That would correspond to getting the outcomes 4,5,6,1 in the
first four rolls.
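
The step from the four averages back to the four outcomes can be checked
with a short sketch. The function name running_averages is made up for
this illustration; the rolls are the ones inferred from the graph above.

```python
from itertools import accumulate


def running_averages(rolls):
    """Return the average of the first k rolls for k = 1, ..., len(rolls)."""
    totals = accumulate(rolls)  # cumulative sums: 4, 9, 15, 16 for the rolls below
    return [total / k for k, total in enumerate(totals, start=1)]


first_rolls = [4, 5, 6, 1]  # outcomes inferred from the graph
print(running_averages(first_rolls))  # [4.0, 4.5, 5.0, 4.0]
```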

(b)
The graph shows that with more rolls (up to 50 for a start), the
average moves closer to 3.5. That is what we would expect from the Law
of Large Numbers (LLN), because the mean of the distribution is 3.5.
If X denotes the outcome of a single die, we have
  EX = 1*(1/6) + 2*(1/6) + 3*(1/6) + 4*(1/6) + 5*(1/6) + 6*(1/6) = 21/6 = 3.5

After 10,000 rolls, the sample proportions for the 6 outcomes are all
very close to 1/6, and the average is close to 3.5. No numerical value
is given, so it is difficult to say exactly how close it is. The graph
shows major fluctuations around 3.5 only for the first few hundred
rolls; thereafter the average gets ever closer to 3.5. That is the
expected behaviour from the LLN.
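
Since the applet gives no numerical value, the behaviour can be
reproduced approximately with a small simulation. This is only a sketch:
it uses Python's own random number generator, so even with seed 123 the
numbers will not match the applet's run. The function names are
hypothetical.

```python
import random
from fractions import Fraction


def expected_value_fair_die():
    """Exact mean of a fair die: sum of k * (1/6) for k = 1, ..., 6."""
    return sum(Fraction(k, 6) for k in range(1, 7))


def simulate_rolls(n_rolls, seed=123):
    """Roll a fair die n_rolls times; return (proportions, average).

    The seed only makes this sketch reproducible. It is NOT the applet's
    random number stream.
    """
    rng = random.Random(seed)
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    proportions = {face: rolls.count(face) / n_rolls for face in range(1, 7)}
    average = sum(rolls) / n_rolls
    return proportions, average


print("exact mean:", float(expected_value_fair_die()))
for n in (50, 1000, 10_000):
    proportions, average = simulate_rolls(n)
    print(n, "rolls: average =", round(average, 3))
```

With 10,000 rolls the standard deviation of the average is roughly
1.71/100 = 0.017, so a simulated average within a few hundredths of 3.5
is what one should expect.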

Sampling distribution applet:
-----------------------------
(i)
The top graph shows a black rectangle representing the density curve of 
the Uniform distribution on the interval (0,50). The density curve is 
flat, and the mean and median of the distribution both equal 25. 
For a sample size of 2 and "1 time", the middle graph shows the
two observations drawn, at their values. Their mean and median are the
same, since for two values both equal the midpoint. The bottom graph
shows the single average obtained, at the same value as the listed mean
in the middle display. For a single value, the mean and median are also
the same.

After resetting and applying "5 times", the top display is unchanged,
the middle display shows the two values obtained in the last run, and
the bottom display shows the 5 averages from the 5 trials in a layout
similar to a histogram.

(ii)
When repeating the experiment a large number of times (well beyond
1000), the histogram at the bottom approaches the triangular
distribution shown on slide 5L-5 for n=2.
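
The triangular shape can also be checked outside the applet. The sketch
below assumes the Uniform(0,50) distribution from (i); for n=2 the
density of the average is then x/625 on (0,25) and (50-x)/625 on
(25,50). All function names, the number of bins and the seed are
choices made for this illustration only.

```python
import random


def triangular_density(x, upper=50.0):
    """Density of the mean of two independent Uniform(0, upper) draws."""
    half = upper / 2
    if x < 0 or x > upper:
        return 0.0
    return (x if x <= half else upper - x) / half**2


def simulate_sample_means(n, repetitions, upper=50.0, seed=1):
    """Draw `repetitions` samples of size n and return their averages."""
    rng = random.Random(seed)
    return [
        sum(rng.uniform(0, upper) for _ in range(n)) / n
        for _ in range(repetitions)
    ]


def histogram_density(values, n_bins=10, upper=50.0):
    """Histogram heights on (0, upper), scaled so the total area is 1."""
    width = upper / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v / width), n_bins - 1)] += 1
    return [c / (len(values) * width) for c in counts]


means = simulate_sample_means(n=2, repetitions=100_000)
bin_width = 50.0 / 10
for i, height in enumerate(histogram_density(means)):
    midpoint = (i + 0.5) * bin_width
    print(f"{midpoint:5.1f}  simulated {height:.4f}  "
          f"triangular {triangular_density(midpoint):.4f}")
```

Changing n in the call to simulate_sample_means gives the histograms for
other sample sizes, although the triangular comparison only applies to
n=2.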

(iii)
With a sample size of 12, say, and many repetitions the histogram seems
to be bell-shaped, corresponding to a normal distribution. It would
be helpful to overlay the normal distribution curve, but the applet does
not seem to allow that.

(iv)
For a bell-shaped (i.e., normal) distribution, the histogram at the
bottom should look normal regardless of the sample size, because an
average of independent, identically distributed (i.i.d.) normal
variables is always exactly normal.
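
The exactness can be illustrated with Python's statistics.NormalDist,
which supports adding independent normal variables and dividing by a
constant. The mean 25 and standard deviation 10 below are made-up values
for illustration, not the applet's settings, and distribution_of_mean is
a hypothetical helper.

```python
from statistics import NormalDist


def distribution_of_mean(single, n):
    """Exact distribution of the average of n i.i.d. copies of `single`."""
    total = single
    for _ in range(n - 1):
        # Adding two NormalDist objects treats them as independent variables.
        total = total + single
    return total / n  # dividing by a constant rescales mean and stdev


single = NormalDist(mu=25, sigma=10)  # assumed parameters, illustration only
for n in (1, 2, 4, 25):
    mean_dist = distribution_of_mean(single, n)
    print(n, round(mean_dist.mean, 3), round(mean_dist.stdev, 3))
```

The result is again a normal distribution with the same mean and with
standard deviation sigma/sqrt(n), whatever the value of n.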

For a right-skewed distribution, the average of two observations is
still very right-skewed, and it takes a fairly large sample size to
eliminate the skewness. For n=20 the histogram looks quite good, but
the mean remains a bit larger than the median, corresponding to a
slight right-skewness.

With a binary distribution, the number of bins in the histogram will be
the sample size +1 (for low sample sizes). This limits how closely the
distribution can be approximated by a normal distribution. Visually, the
approximation appears quite good for
- n=10 for the binary distribution with p=0.5
- n=15 for the binary distribution with p=0.7
- n=30 for the binary distribution with p=0.9

The findings above show that skewness in the distribution of each
component of the sum (average) strongly affects the quality of the
approximation. For the binary distribution with p=0.9, it takes a large
number of observations to smooth out the contributions from the large
peak at 1.
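
The role of skewness can be made concrete with a short calculation. A
binary (Bernoulli) variable with success probability p has skewness
(1-2p)/sqrt(p(1-p)), and the skewness of an average of n i.i.d.
variables is the single-variable skewness divided by sqrt(n). The sketch
below evaluates this for the three cases listed above; the function
names are made up for this illustration.

```python
import math


def bernoulli_skewness(p):
    """Skewness of a single binary variable with P(X = 1) = p."""
    return (1 - 2 * p) / math.sqrt(p * (1 - p))


def skewness_of_mean(p, n):
    """Skewness of the average of n i.i.d. binary variables."""
    return bernoulli_skewness(p) / math.sqrt(n)


for p, n in [(0.5, 10), (0.7, 15), (0.9, 30)]:
    print(f"p={p}, n={n}: single {bernoulli_skewness(p):+.3f}, "
          f"average {skewness_of_mean(p, n):+.3f}")
```

The negative sign for p above 0.5 reflects the peak at 1 with a tail
towards 0. Because the skewness of the average shrinks only like
1/sqrt(n), a strongly skewed case such as p=0.9 needs many more
observations than p=0.5, where the skewness is zero for every n.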