10 Testing a Single Population Mean

10.10 Optional: Robustness of the T-Test

Suppose X is a population of numbers with mean μ. Because of the Central Limit Theorem, the T-Test is reliable regardless of the distribution of X, as long as the samples are simple random and sufficiently large. We have the skills to simulate this with ease.

For example, suppose that X has a uniform distribution from 0 to 20, so X has a distribution quite different from normal. We know that X has mean μ=10. Let’s run 1000 T-Tests on H0:μ=10 with a significance level α=0.05. This is an abuse of the T-Test, as X is not normal, but if the sample size is large enough, we should see about 5% Type I errors. Since we’ve been using n≥30 as a rule-of-thumb for “large enough,” let’s use samples of size 30 in the simulation.

In Excel cell 𝙰𝟸 execute =𝚁𝙰𝙽𝙳()*𝟸𝟶. This will generate a random number from the population. (Recall that 𝚁𝙰𝙽𝙳 generates random numbers uniformly between 0 and 1.) Drag the command across to cell 𝙰𝙳𝟸, thus generating a random sample of size 30 from X in row 𝟸, as shown in Figure 10.48.

Figure 10.48: First Sample

Now drag that row of commands down to row 𝟷𝟶𝟶𝟷 to create 999 more samples of size 30, as in Figure 10.49. (If you want to create more random samples than 1000, go for it.)

Figure 10.49: 1000 Samples of Size 30

Compute the sample average for the first sample, as in Figure 10.50.

Figure 10.50: x¯ for Sample 1

Similarly, compute the sample standard deviation for the first sample, as shown in Figure 10.51.

Figure 10.51: s for Sample 1

Since X isn’t normally distributed, computing

\[ \frac{\bar{x}-\mu_0}{s/\sqrt{n}} = \frac{\bar{x}-10}{s/\sqrt{30}} \tag{10.9} \]

won’t be a proper t-score, so label the column as you see fit, and then calculate Equation (10.9) for the first sample, as in Figure 10.52.
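For readers who want to check the spreadsheet arithmetic outside Excel, the following is a minimal Python sketch of the same computation for one sample. The function name `t_like_score` is hypothetical, and the sketch assumes the sample standard deviation uses the n-1 divisor (as Excel's `STDEV.S` does); the figures do not state which Excel function was used.

```python
import math
import random
import statistics

MU_0 = 10  # hypothesized mean under H0
N = 30     # sample size used in the text


def t_like_score(sample, mu_0=MU_0):
    """Return (x_bar - mu_0) / (s / sqrt(n)) for one sample, as in Equation (10.9)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (x_bar - mu_0) / (s / math.sqrt(n))


# One sample of size 30 from the uniform population on [0, 20],
# playing the role of RAND()*20 dragged across row 2.
sample = [random.uniform(0, 20) for _ in range(N)]
print(t_like_score(sample))
```

Because the sample is random, the printed value changes on every run, just as the spreadsheet recalculates.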

Figure 10.52: “t”-score for Sample 1

Since Equation (10.9) isn’t a true t-score, applying 𝚃.𝙳𝙸𝚂𝚃 to it isn’t a proper p-value calculation, so label that column as you see fit. Some “t”-scores will be positive and some negative, so we can calculate the two-sided “p”-value using the absolute value command as follows:

=𝟸*𝚃.𝙳𝙸𝚂𝚃(-𝙰𝙱𝚂(𝙰𝙶𝟸),𝟸𝟿,𝚃𝚁𝚄𝙴).

The calculation is shown in Figure 10.53.
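As a cross-check on the spreadsheet formula, here is a hedged Python sketch that approximates the same two-sided tail area, 2·P(T ≤ -|t|), using only the standard library. It is not how Excel computes `T.DIST`; it integrates the Student t density numerically with Simpson's rule. The names `t_pdf` and `two_sided_p` and the step count are illustrative assumptions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Approximate 2 * P(T <= -|t|) via Simpson's rule on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # approximates P(0 <= T <= |t|)
    # Both tails together; clamp tiny negative rounding errors to zero.
    return max(0.0, 2 * (0.5 - central_area))


print(two_sided_p(1.5, 29))
```

Taking the absolute value first means positive and negative scores are treated symmetrically, which is the same role `ABS` plays in the Excel formula.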

Figure 10.53: “p”-value for Sample 1

If the p-value is less than or equal to α, then a Type I error has occurred (we know H0 is true). We can use the 𝙸𝙵 command to encode the result, with 0 denoting a correct decision, and 1 denoting a Type I error, as in Figure 10.54.

Figure 10.54: Type I Error?

Now drag the four commands down to repeat the computations for the other 999 samples, as in Figure 10.55.

Figure 10.55: Compute For All

At last, compute the percent of Type I errors committed. This can be done by adding up the 0’s and 1’s, and then computing the percent, as in Figure 10.56.

Figure 10.56: Percent of Simulated Type I Errors

Because the 𝚁𝙰𝙽𝙳 command is re-executed with each change to the spreadsheet, you can watch how the percent of Type I errors changes from one batch of 1000 simulated tests to the next. Note that the percent of errors stays near 5%, suggesting that the T-Test can be used reliably on a uniform population if the sample size is at least 30.
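The whole spreadsheet simulation can also be sketched in a few lines of Python. This is one possible translation, not the text's method: instead of computing a “p”-value for each sample, it rejects H0 when the absolute “t”-score reaches 2.045, the tabulated two-sided critical value for α=0.05 with 29 degrees of freedom (rounded to three decimals), which gives essentially the same decisions as comparing the “p”-value to 0.05. The function name and parameters are hypothetical.

```python
import math
import random
import statistics


def type_i_error_rate(n=30, t_crit=2.045, trials=1000, mu_0=10, rng=random):
    """Fraction of simulated tests that wrongly reject H0: mu = mu_0.

    Samples come from the uniform population on [0, 20], whose true
    mean is 10, so every rejection is a Type I error.
    """
    errors = 0
    for _ in range(trials):
        sample = [rng.uniform(0, 20) for _ in range(n)]
        x_bar = statistics.mean(sample)
        s = statistics.stdev(sample)  # n - 1 divisor
        t = (x_bar - mu_0) / (s / math.sqrt(n))
        if abs(t) >= t_crit:  # same decision as "p"-value <= 0.05
            errors += 1
    return errors / trials


# Seeded generator so a run can be repeated; drop rng=... for fresh samples.
print(type_i_error_rate(rng=random.Random(1)))
```

To try a different sample size, both `n` and `t_crit` need to change together, since the critical value depends on the degrees of freedom n-1 (for example, about 2.262 for n=10).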

10.10.1 Exercises

  1. Try the simulation with the same population, but use a smaller sample size, say 10. Note what happens to the percent of Type I errors.

  2. Do a simulation on a population that could be described as pathological. For example, try the discrete distribution below:

    x         0     1     2
    P(X=x)    0.2   0.1   0.7

    How large do the samples need to be until you see consistent Type I error rates of your chosen α?
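For the second exercise, drawing from a discrete distribution takes a little more setup than `RAND()*20`. A minimal Python sketch is given below, using `random.choices` with weights; the names `VALUES`, `PROBS`, and `draw_sample` are hypothetical. The population mean works out to 0(0.2) + 1(0.1) + 2(0.7) = 1.5, which would be the value tested in H0.

```python
import random

VALUES = [0, 1, 2]
PROBS = [0.2, 0.1, 0.7]

# Population mean: 0*0.2 + 1*0.1 + 2*0.7 = 1.5
MU = sum(v * p for v, p in zip(VALUES, PROBS))


def draw_sample(n, rng=random):
    """Draw n independent values from the discrete distribution above."""
    return rng.choices(VALUES, weights=PROBS, k=n)


print(MU)
print(draw_sample(10))
```

One thing to watch for with small samples from this population: every value in a sample can come out the same (for instance, all 2s), in which case s = 0 and the “t”-score is undefined, so a simulation needs to decide how to handle that case.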