7 Population Models

7.5 The Normal Distribution

The normal distribution, also called the Gaussian distribution after Carl Friedrich Gauss, is perhaps the most important distribution in statistics. Here is a technical definition: a continuous random variable X has a normal distribution with mean μ and standard deviation σ if the p.d.f. is

$$f(x)=\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}},\qquad -\infty<x<\infty.$$

Figure 7.10 gives a plot of the distribution.

Figure 7.10: Normal Distribution Plot

Things to know:

  1. Our shorthand for “X is normally distributed with mean μ and standard deviation σ” will be $X \sim N(\mu,\sigma)$. So, for example, $X \sim N(3,8)$ means X is normally distributed with mean μ=3 and standard deviation σ=8.

  2. The distribution is symmetric and bell-shaped, and is centered at the mean μ. Thus, the mean locates the center of the distribution. The standard deviation σ controls the spread of the distribution. The location and spread are illustrated in Figure 7.11.

    Figure 7.11: Different Normal Distributions

  3. The graph of y=f(x) has exactly two inflection points. They are located horizontally at μ-σ and μ+σ, respectively.

  4. If $X \sim N(\mu,\sigma)$, then P(X>μ)=P(X<μ)=0.5.
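The properties listed above can be checked numerically. What follows is a minimal sketch in Python rather than Excel, which is an assumption on our part since the text itself works in Excel. It relies on `statistics.NormalDist` from the standard library (Python 3.8 or later). The helper name `normal_pdf` and the values μ=3, σ=2 are chosen here only for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Evaluate the normal p.d.f. f(x) with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)


mu, sigma = 3, 2           # illustrative values only
X = NormalDist(mu, sigma)  # library model of the same distribution

# The hand-written formula should agree with the library's p.d.f.
print(normal_pdf(4, mu, sigma), X.pdf(4))

# Symmetry about the mean: f(mu - d) equals f(mu + d).
left_height = normal_pdf(mu - 1.5, mu, sigma)
right_height = normal_pdf(mu + 1.5, mu, sigma)
print(left_height, right_height)

# Half of the total area lies to the left of the mean.
p_left = X.cdf(mu)
print(p_left)
```

The last value printed is P(X<μ), which should be 0.5 for any choice of μ and σ.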

Other than a few very special cases, it is not possible to compute exact areas under a normal distribution, making approximation techniques necessary. For our purposes, Excel has two built-in commands for working with a normal distribution that we will use frequently. They are covered in the next two sections.

7.5.1 Excel’s NORM.DIST

The Excel command NORM.DIST approximates areas for a normal distribution. Given $X \sim N(\mu,\sigma)$ and any real number x,

$$P(X<x) \approx \texttt{NORM.DIST}(x,\mu,\sigma,\texttt{TRUE}).$$

It’s vital to recognize that the command gives the area to the left of x, as illustrated below.

Figure 7.12: $P(X<x) \approx \texttt{NORM.DIST}(x,\mu,\sigma,\texttt{TRUE})$

This means that if $X \sim N(\mu,\sigma)$ and you want to approximate P(X>x) using NORM.DIST, you must subtract the result of the command from 1, that is,

$$P(X>x) \approx 1-\texttt{NORM.DIST}(x,\mu,\sigma,\texttt{TRUE}).$$

Similarly, if a and b are real numbers with a<b, then

P(a<X<b)=P(X<b)-P(X<a),

and so

$$P(a<X<b) \approx \texttt{NORM.DIST}(b,\mu,\sigma,\texttt{TRUE})-\texttt{NORM.DIST}(a,\mu,\sigma,\texttt{TRUE}).$$

Figure 7.13: P(a<X<b)

These computations are illustrated in the next example.

Example 7.5.1.

Suppose that $X \sim N(3,2)$, so that X is normally distributed with mean μ=3 and standard deviation σ=2. A graph of the distribution is given below.

Figure 7.14: Plot of $X \sim N(3,2)$

Some probability calculations in Excel:

  1. $P(X<1) \approx \texttt{NORM.DIST(1,3,2,TRUE)} \approx 0.158655254$

    Figure 7.15: Area to the Left of 1

  2. $P(X>1) \approx 1-\texttt{NORM.DIST(1,3,2,TRUE)} \approx 0.841344746$

    Figure 7.16: Area to the Right of 1

  3. $P(1<X<4)=P(X<4)-P(X<1) \approx \texttt{NORM.DIST(4,3,2,TRUE)}-\texttt{NORM.DIST(1,3,2,TRUE)} \approx 0.532807207$

    Figure 7.17: Area between 1 and 4
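For readers who want to cross-check these Excel results outside a spreadsheet, the three areas can be sketched in Python. This is an assumption on our part, not a tool used in the text: `NormalDist.cdf` from the standard library plays the role of NORM.DIST with the last argument TRUE, and the variable names are ours.

```python
from statistics import NormalDist

X = NormalDist(mu=3, sigma=2)    # X ~ N(3, 2)

p_left = X.cdf(1)                # P(X < 1), like NORM.DIST(1,3,2,TRUE)
p_right = 1 - X.cdf(1)           # P(X > 1), area to the right of 1
p_between = X.cdf(4) - X.cdf(1)  # P(1 < X < 4), difference of two left areas

print(p_left, p_right, p_between)
```

The printed values should agree with the Excel results in Example 7.5.1 to the digits shown there.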

7.5.2 Excel’s NORM.INV

Excel’s NORM.INV reverses the process of NORM.DIST. That is, where NORM.DIST computes an area to the left of a given value of X, the command NORM.INV computes a value of X for a given area to the left. Suppose that $X \sim N(\mu,\sigma)$ and that you want to find the value x such that P(X<x)=A, for some desired area A. Then

$$x \approx \texttt{NORM.INV}(A,\mu,\sigma).$$

In other words, NORM.INV computes percentile scores of normal distributions. Following are some example computations.

Example 7.5.2.

Suppose again that $X \sim N(3,2)$.

  1. The value x where P(X<x)=0.8 is

    $$x \approx \texttt{NORM.INV(0.8,3,2)} \approx 4.683242467.$$

    Thus, the 80th percentile of X is approximately 4.683242467.

    Figure 7.18: 80th Percentile of $X \sim N(3,2)$

  2. If we apply NORM.DIST to the prior result, we’ll get 0.8, demonstrating the relationship between NORM.DIST and NORM.INV.

    Figure 7.19: NORM.DIST and NORM.INV are Inverse Functions
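The inverse relationship in Example 7.5.2 can also be sketched in Python, assuming the standard library's `NormalDist.inv_cdf` as a stand-in for NORM.INV (Python is our choice here, not the text's). The variable names `x80` and `area_back` are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=3, sigma=2)  # X ~ N(3, 2)

x80 = X.inv_cdf(0.8)           # 80th percentile, like NORM.INV(0.8,3,2)
area_back = X.cdf(x80)         # feed the percentile back into the c.d.f.

print(x80, area_back)
```

The second value should return to 0.8, up to floating-point rounding, which is the sense in which the two commands undo each other.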

7.5.3 The Standard Normal Distribution

The normal distribution with mean 0 and standard deviation 1 is called the standard normal distribution. For the standard normal distribution, it is customary to use the letter Z for the distribution, so $Z \sim N(0,1)$ is the shorthand notation. The standard normal distribution is displayed in Figure 7.11.

Suppose that $X \sim N(\mu,\sigma)$. It is a theorem of calculus that the random variable

$$\frac{X-\mu}{\sigma}$$

has a standard normal distribution, i.e.,

$$Z=\frac{X-\mu}{\sigma}.$$

(Recall that if a population X has mean μ and standard deviation σ, then $\frac{x-\mu}{\sigma}$ gives the number of standard deviations x is from μ.)

Moreover, for any two real numbers a and b, with a<b, from calculus it can be shown that

$$P(a<X<b)=P\left(\frac{a-\mu}{\sigma}<Z<\frac{b-\mu}{\sigma}\right). \tag{7.5}$$

Equation (7.5) allows for computing areas for any normal distribution by computing areas using only the standard normal, which is of computational significance if one doesn’t have access to commands like NORM.DIST or NORM.INV. The following example illustrates.

Example 7.5.3.

Suppose again that $X \sim N(3,2)$. Note that the value x=4 is 0.5 standard deviations above the mean, since (4-3)/2=0.5. This means that the area to the left of 4 in N(3,2) is the same as the area to the left of 0.5 in N(0,1). See Figure 7.20.

Figure 7.20: Equal Areas Under Different Normal Distributions
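The equal areas in Example 7.5.3, and Equation (7.5) more generally, can be checked with a short Python sketch. Python and the variable names are our assumptions. The endpoints a=1 and b=4 are borrowed from Example 7.5.1.

```python
from statistics import NormalDist

X = NormalDist(3, 2)  # X ~ N(3, 2)
Z = NormalDist(0, 1)  # standard normal
mu, sigma = X.mean, X.stdev

# Standardize x = 4: number of standard deviations above the mean.
z = (4 - mu) / sigma
area_X = X.cdf(4)     # area to the left of 4 under N(3, 2)
area_Z = Z.cdf(z)     # area to the left of z under N(0, 1)
print(z, area_X, area_Z)

# Equation (7.5) with a = 1 and b = 4.
a, b = 1, 4
lhs = X.cdf(b) - X.cdf(a)
rhs = Z.cdf((b - mu) / sigma) - Z.cdf((a - mu) / sigma)
print(lhs, rhs)
```

In each printed pair the two areas should match up to rounding.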

7.5.4 The 68 - 95 - 99.7 “Rule”

Suppose that X is a random variable with mean μ and standard deviation σ. Recall that the phrase “x is within k standard deviations of the mean” means that x satisfies the inequality

$$\mu-k\sigma \le x \le \mu+k\sigma,$$

or that x is in the interval [μ-kσ,μ+kσ]. Knowing

$$P(\mu-k\sigma \le X \le \mu+k\sigma)$$

for various values of k provides information on the behavior of the random variable X.

Suppose that $X \sim N(\mu,\sigma)$ and let k be a positive integer. Equation (7.5) gives that the probability that a randomly generated X is within k standard deviations of the mean is the same as the probability that a randomly generated number from a standard normal distribution is within the interval (-k,k). That is,

P(μ-kσ<X<μ+kσ)=P(-k<Z<k).

So, for a normally distributed population $X \sim N(\mu,\sigma)$, the proportion that falls within 1 standard deviation of the mean is

$$P(\mu-\sigma \le X \le \mu+\sigma)=P(-1<Z<1) \approx \texttt{NORM.DIST(1,0,1,TRUE)}-\texttt{NORM.DIST(-1,0,1,TRUE)} \approx 0.682689492.$$

In words, for a normally distributed population, about 68.3% of all values fall within 1 standard deviation of the mean.

Similarly, again for $X \sim N(\mu,\sigma)$, the proportion that falls within 2 standard deviations of the mean is

$$P(\mu-2\sigma \le X \le \mu+2\sigma)=P(-2<Z<2) \approx \texttt{NORM.DIST(2,0,1,TRUE)}-\texttt{NORM.DIST(-2,0,1,TRUE)} \approx 0.954499736.$$

Thus, for a normally distributed population, about 95% of all values fall within 2 standard deviations of the mean.

The reader should check that for a normally distributed population, the percentage that falls within 3 standard deviations of the mean is about 99.7%. Combining these results gives what is typically called “The 68 - 95 - 99.7 Rule.” The choice of the word “rule” is unfortunate, as this isn’t a rule, but instead is a theorem. Theorems are not arbitrary diktats, but instead are true statements backed by rigorous proofs. This text isn’t concerned with rigorous mathematical proofs, but know that they exist.

A word of caution on the 68 - 95 - 99.7 rule: this holds for normally distributed populations. If you are working with a population that has a different distribution, then these percents are likely different.
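The three percentages in this section can be reproduced with a few lines of Python. This is a sketch under the assumption that Python's standard library `NormalDist` is an acceptable stand-in for NORM.DIST, and the dictionary name `within` is ours.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal

# P(-k < Z < k) for k = 1, 2, 3 standard deviations.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} standard deviation(s): {p:.4%}")
```

By Equation (7.5), the same three proportions apply to any normal population, whatever its mean and standard deviation.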

7.5.5 Central Limit Theorem

We now discuss one of the most important theorems in statistics, which ultimately is the punchline of the prior chapter. Recall that whether looking at sample means or sample proportions, their distributions were roughly bell-shaped and symmetric, centered at the mean of the population from which the samples were taken. Moreover, as the sample size increased, the spread of the sample means and sample proportions shrank. This behavior can be proved, and is known as the Central Limit Theorem (CLT).


Theorem (The Central Limit Theorem). Suppose that a large population X has mean μ and finite standard deviation σ. For random samples of size n, if n is sufficiently large, then the distribution for $\bar{X}$ will be approximately normal, with mean approximately μ and standard deviation approximately $\sigma/\sqrt{n}$. In symbols, $\bar{X} \approx N(\mu,\sigma/\sqrt{n})$. One can also write $\mu_{\bar{X}} \approx \mu$ and $\sigma_{\bar{X}} \approx \sigma/\sqrt{n}$.


Remarks:

  • If the population $X \sim N(\mu,\sigma)$, then for any sample size n, $\bar{X} \sim N(\mu,\sigma/\sqrt{n})$, i.e., if X is normal, $\bar{X}$ is always normal with mean μ and standard deviation $\sigma/\sqrt{n}$.

  • If X is “pathological,” large sample sizes may be necessary to see bell-shaped distributions for $\bar{X}$. For our purposes, we will use $n \ge 30$ as a rule-of-thumb for sufficiently large.

  • Note that the standard deviation of $\bar{X}$ gets smaller as n gets larger.
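To make the last remark concrete, here is a small Python sketch that tabulates σ/√n for a few sample sizes. The value σ=7 and the list of sample sizes are illustrative choices of ours, not values from the theorem.

```python
from math import sqrt

sigma = 7                     # illustrative population standard deviation
sizes = (1, 4, 25, 100, 400)  # illustrative sample sizes

# Standard deviation of the sample mean for each sample size.
sd_of_mean = [sigma / sqrt(n) for n in sizes]

for n, sd in zip(sizes, sd_of_mean):
    print(n, sd)
```

Quadrupling the sample size halves the standard deviation of the sample mean, since the sample size enters through a square root.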

Example 7.5.4.

Suppose population X has mean μ=4 and standard deviation σ=7. Suppose a random sample of size n=30 is drawn from the population. Estimate $P(\bar{X}>5)$. (In words, if a random sample of size 30 is drawn from a population with mean 4 and standard deviation 7, what is the probability that the sample mean is greater than 5?)


Solution: Since n=30, we can use the CLT. Now $\mu_{\bar{X}} \approx 4$ and $\sigma_{\bar{X}} \approx 7/\sqrt{30}$. Since the distribution of $\bar{X}$ is approximately normal, using Excel we have

$$P(\bar{X}>5) \approx 1-\texttt{NORM.DIST(5,4,7/SQRT(30),TRUE)} \approx 0.2170.$$
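The same estimate can be sketched in Python for readers working outside Excel. Python is our assumption here. `NormalDist` models the approximate distribution of the sample mean given by the CLT, and the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 4, 7, 30  # population mean, standard deviation, sample size

# CLT approximation: the sample mean is roughly N(mu, sigma / sqrt(n)).
xbar = NormalDist(mu, sigma / sqrt(n))

p = 1 - xbar.cdf(5)      # P(sample mean > 5), area to the right of 5
print(round(p, 4))
```

The result should agree with the Excel value 0.2170 to four decimal places.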

When working with proportions, the CLT still applies, but the notation is changed a bit.


Corollary. Suppose that a large population contains a proportion p of things of Type A. For random samples of size n, if n is sufficiently large, then the distribution of sample proportions $\hat{P}$ of things of Type A will be approximately normal, with mean approximately p and standard deviation approximately $\sqrt{p(1-p)/n}$. In symbols, $\hat{P} \approx N\left(p,\sqrt{p(1-p)/n}\right)$. One can also write $\mu_{\hat{P}} \approx p$ and $\sigma_{\hat{P}} \approx \sqrt{p(1-p)/n}$.


Remark:

  • If the population is “pathological,” large sample sizes may be necessary to see bell-shaped distributions for $\hat{P}$. For our purposes, we will use $np \ge 10$ and $n(1-p) \ge 10$ as a rule-of-thumb for sufficiently large.

Example 7.5.5.

Suppose a population contains a proportion p=0.3 of things of Type A. Suppose a random sample of size n=100 is drawn from the population. Estimate $P(\hat{P}<0.25)$. (In words, if a random sample of size 100 is drawn from a population with p=0.3, what is the probability that the sample proportion is smaller than 0.25?)


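Solution: Since $np=100\cdot 0.3=30 \ge 10$ and $n(1-p)=70 \ge 10$, we can use the CLT. Since $\hat{P} \approx N\left(0.3,\sqrt{0.3\cdot 0.7/100}\right)$, using Excel we have

$$P(\hat{P}<0.25) \approx \texttt{NORM.DIST(0.25,0.3,SQRT(0.3*0.7/100),TRUE)} \approx 0.1376.$$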
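A Python version of this calculation, including the rule-of-thumb check, might look as follows. Python and the names `rule_ok`, `phat`, and `prob` are our assumptions. The thresholds of 10 are the rule-of-thumb values from the remark above.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.3, 100  # population proportion and sample size

# Rule-of-thumb check before applying the CLT to proportions.
rule_ok = n * p >= 10 and n * (1 - p) >= 10
print(rule_ok)

# CLT approximation: the sample proportion is roughly N(p, sqrt(p(1-p)/n)).
phat = NormalDist(p, sqrt(p * (1 - p) / n))

prob = phat.cdf(0.25)  # P(sample proportion < 0.25)
print(round(prob, 4))
```

The result should agree with the Excel value 0.1376 to four decimal places.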

7.5.6 CLT Simulation - Optional

In the prior chapter, we saw several demonstrations that sample means and sample proportions have approximately bell-shaped distributions centered at the population mean or proportion, respectively. Let’s do that again, while checking on the mean and standard deviation of sample statistics.

The simulation will be sharpest if we make the population normal, so let us use

$$X \sim N(5,2).$$

Because the population is normal, the sample size doesn’t matter, so we’ll use n=10 here. Thus, in this simulation, we will have

$$\bar{X} \sim N(5,2/\sqrt{10}). \tag{7.6}$$

Use Excel to simulate taking 1,000 samples of size 10 from $X \sim N(5,2)$.

Figure 7.21: CLT Simulation

We’ll compare the observed behavior of the sample means to (7.6). For each sample, compute $\bar{x}$, and then create a frequency histogram for the 1,000 sample means. You’ll get a histogram that looks similar to Figure 7.22.

Figure 7.22: Histogram

Note the bell shape, and that the mean of the $\bar{x}$’s appears to be about 5. Compute the sample mean and sample standard deviation of the sample means, as in Figure 7.23.

Figure 7.23: Statistics of the $\bar{x}$’s

Your sample mean of the sample means will be close to 5. Moreover, your sample standard deviation of the sample means will be near $2/\sqrt{10} \approx 0.6325$.

Lastly, compute the percent of $\bar{x}$’s that fall within 1, 2 and 3 standard deviations of the mean. You’ll get results that are very close to the 68-95-99.7 rule for normal populations. First compute the bounds of the three intervals about the mean of the sample means, as in Figure 7.24.

Figure 7.24: Intervals about the Sample Mean of the $\bar{x}$’s

Then count the number of $\bar{x}$’s that fall to the left of the lower and upper bounds of the intervals, as in Figure 7.25.

Figure 7.25: Counts

Now the percents can be computed directly, as in Figure 7.26. The results will give further evidence that the sample means have a normal distribution.

Figure 7.26: Percents in Intervals
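The steps of this simulation can also be sketched in Python instead of Excel. This is only an illustration under our own assumptions: the seed 0 is arbitrary and is fixed so that a run is repeatable, and the variable names are ours. It counts the sample means inside each interval directly, rather than counting to the left of each bound as in Figure 7.25.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)   # arbitrary fixed seed so the run is repeatable
mu, sigma = 5, 2         # population X ~ N(5, 2)
n, n_samples = 10, 1000  # sample size and number of samples

# One sample mean for each simulated sample of size n.
sample_means = [
    mean([rng.gauss(mu, sigma) for _ in range(n)])
    for _ in range(n_samples)
]

m = mean(sample_means)   # expected to be near 5
s = stdev(sample_means)  # expected to be near 2/sqrt(10)
print(m, s, sigma / sqrt(n))

# Proportion of sample means within k standard deviations of their mean.
within = {}
for k in (1, 2, 3):
    lower, upper = m - k * s, m + k * s
    within[k] = sum(lower < x < upper for x in sample_means) / n_samples
    print(k, within[k])
```

As with the spreadsheet version, the exact numbers change with the random draws, but the proportions should land close to 68%, 95% and 99.7%.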

7.5.7 Exercises

  1. Suppose that $X \sim N(3,5)$. Find the mean μ and standard deviation σ of X, compute P(X<4), P(X>4), and P(2<X<4), and sketch each probability.

  2. Compute P(Z<1.5), P(Z>-1.5), and P(-1.5<Z<1.5). Sketch each probability.

  3. Suppose that $X \sim N(3,5)$. Compute the first and third quartiles of X.

  4. For a certain standardized exam, scores X are normally distributed with a mean of 100 and standard deviation of 10. What exam score is needed to be above the 90th percentile?

  5. Compute z>0 such that P(-z<Z<z)=0.95. Sketch a picture that illustrates.

  6. Suppose population X has mean 6 and standard deviation 10. A random sample of size 100 is drawn from X. Estimate $P(5<\bar{X}<7)$.

  7. Suppose population X has mean 8 and standard deviation 4. A random sample of size 40 is drawn from X. Estimate $P(\bar{X}>9)$.

  8. A random sample of size 8 is drawn from $X \sim N(-4,3)$. Compute $P(\bar{X}<-3)$.

  9. At a large school, 65% of students participate in athletics. A random sample of 200 students is drawn. What is the probability that the observed sample percentage of athletes is greater than 70%?

  10. A fair 6-sided die is rolled 300 times, and the proportion of 1’s observed is recorded. Approximate the probability that the proportion of observed 1’s exceeds 0.2.

  11. Suppose that $X \sim N(10,5)$, and suppose that a random sample of size 50 is drawn from X. Compare the values of P(X>11) and $P(\bar{X}>11)$.

  12. (Optional) Let $X \sim U(0,12)$. We know that μ=6, and it turns out that $\sigma=\sqrt{12}$. Use a simulation to study the behavior of $\bar{X}$ for samples of size 50. Compare the results to the CLT.