4.8. Central Limit Theorem

We know that many random variables naturally fit the normal distribution model. It turns out that the means and sums of samples drawn from other distributions are also approximately normal, a result known as the central limit theorem.

From a population that has mean \mu and variance \sigma^2, draw m samples, each of size n. The central limit theorem says that when n is large, the distributions of the sample means and sample sums are approximately normal regardless of the underlying population distribution. For each sample i of random variable X, let \bar{X_i} be the sample mean and let Y_i be the sample sum.

\begin{array}{ll}
  \bar{X_i} &= \frac{1}{n} \sum_{j = 1}^n x_{i,j} \\ \\
  Y_i &= \sum_{j = 1}^n x_{i,j}
\end{array}

We also define variable Z as follows.

Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}}
= \frac{Y - n\mu}{\sigma\sqrt{n}}

Then we can define normal distributions from \bar{X}, Y, and Z.

\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right),
\mbox{  } Y \sim N\left(n\mu, n\sigma^2\right),
\mbox{  and  } Z \sim N(0, 1)
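For example, for a fair die the discrete uniform distribution on \{1, \ldots, 6\} has \mu = 3.5 and \sigma^2 = \frac{6^2 - 1}{12} = \frac{35}{12} \approx 2.92, so with samples of size n = 100 the theorem predicts

\bar{X} \sim N\left(3.5, \frac{35/12}{100}\right) \approx N(3.5, 0.0292)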

Let’s see this in action. We will start with X representing the six sides of a die, so we use a discrete uniform distribution. We will put the data in a 100{\times}100 matrix so that each column is a sample. Then we can find the mean and sum of each column to get new random variables with approximately normal distributions.

>> n = 100;
>> X = randi(6, n);  % 100 x 100
>> X_bar = mean(X);  % 1 x 100
>> mu = mean(X(:))
mu =
    3.4960           % 3.5 expected
>> sigma = std(X(:))
sigma =
    1.7025           % sqrt(35/12) = 1.71 expected

% Make Z ~ N(0, 1)
>> Z = (X_bar - mu)/(sigma/sqrt(n));
>> mean(Z)
ans =
   -1.9895e-15
>> var(Z)           % Z ~ N(0,1)
ans =
    0.9791
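
The column sums give the same standard normal through the second form of the standardization, Z = (Y - n\mu)/(\sigma\sqrt{n}). Continuing the session above (a sketch; the exact values will vary with the random draw):

>> Y = sum(X);       % 1 x 100 column sums
>> Zy = (Y - n*mu)/(sigma*sqrt(n));
>> mean(Zy)          % approximately 0
>> var(Zy)           % approximately 1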

Fig. 4.14 Histograms of X \sim U(1, 6) and Z \sim N(0, 1).