10 Testing a Single Population Mean

10.5 T-Test on a Single Population Mean

Suppose that X is a population of numbers with (unknown) mean μ on which a hypothesis test will be done. Using the usual notation, let σ denote the population standard deviation, and n the sample size. If it is known that X is normally distributed, then the population of sample means X̄ will be normally distributed with mean μ and standard deviation σ/√n. Thus, as we know already, a Z-Test can be used in this circumstance, as long as the sample can be assumed to be a simple random sample. But what if σ is unknown?

If X has a normal distribution, then regardless of sample size, the random variable

$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}} \tag{10.5}$$

will have a t-distribution with n-1 degrees of freedom. If we can reasonably assume that the population X is normally distributed, we can use Equation (10.5) as the test statistic for a hypothesis test on the unknown mean μ. This test is called the Student’s T-Test, or simply T-Test.

More precisely, suppose that H0 is one of the three claims μ ≤ μ0, μ ≥ μ0, or μ = μ0. Suppose that a simple random sample of size n is selected, giving a sample mean of x̄ and sample standard deviation s. In computing the corresponding p-value, H0 is assumed true and μ = μ0 is used in Equation (10.5), thus giving the following as the test statistic:

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}. \tag{10.6}$$

From here the p-value is computed, just as in the Z-Test, except one uses a t-distribution with df=n-1, instead of using N(0,1).
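The procedure just described can be sketched in code. The following is a minimal illustration in Python using only the standard library, not a reference implementation: the function names `t_pdf`, `t_cdf`, and `t_test_p_value` are hypothetical, and the area under the t-distribution is approximated with Simpson's rule rather than taken from a statistics package.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                      # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


def t_test_p_value(xbar, s, n, mu0, direction):
    """Return (t0, p-value); direction is 'right', 'left', or 'two-sided'."""
    t0 = (xbar - mu0) / (s / math.sqrt(n))    # Equation (10.6)
    df = n - 1
    if direction == "right":
        p = 1 - t_cdf(t0, df)
    elif direction == "left":
        p = t_cdf(t0, df)
    elif direction == "two-sided":
        p = 2 * t_cdf(-abs(t0), df)           # double one tail, by symmetry
    else:
        raise ValueError("direction must be 'right', 'left', or 'two-sided'")
    return t0, p
```

The `direction` argument plays the role of the direction of extreme: it selects which tail (or tails) of the t-distribution with n-1 degrees of freedom is counted as the p-value.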

Remarks: The T-Test is fairly robust, if the sample size is not too small. Even with a small sample, if the population X is nearly normal and not too skewed, the test will yield reliable results. If the sample size is large, the T-Test will yield reliable results, as long as the population distribution is not too pathological. If the sample size is not large, and if it is not reasonable to assume that the sample comes from a nearly normal distribution, then a T-Test should not be used. In this case, a nonparametric test should be considered. See Chapter ???.

Example 10.5.1.

A company has committed to purchase an industrial glue if there is strong evidence that the mean sealing strength, at a room temperature of 100 °F, is greater than 20 lb/in². Following are the sealing strengths, measured in lb/in², of a sample of 15 specimens tested at 100 °F:

19.6, 24, 19.9, 20.1, 21.1, 20.7, 20.2, 20.3, 19.1, 21.4, 21.4, 19.9, 20.2, 19.8, 20.8

At a level of 5%, test whether the company should purchase the glue.

Solution: With a sample of size 15, a T-test is appropriate if we can assume that the population of all glue strengths is approximately normally distributed. Unless the sample size is too small, a low-road, yet reasonable approach is to create a frequency histogram of the sample, and qualitatively assess whether it looks like the sample came from a normally distributed population. Figure 10.24 gives such a graph for the sample, using a small number of bins since the sample is small.

Figure 10.24: Frequency Histogram

And, indeed, it doesn’t appear unreasonable to assume the sample came from a normally distributed population, so we will use a T-test. (Section 10.9 will provide another qualitative approach to assessing whether the normality assumption is reasonable.)

If μ denotes the mean for the population of sealing strengths of the glue, then the hypotheses are as follows:

$$H_0: \mu \le 20 \qquad H_1: \mu > 20$$

Note that the direction of extreme is to the right. From the sample we have n = 15, x̄ ≈ 20.57, and s ≈ 1.15. Thus, the test statistic is

$$t_0(14) = \frac{20.57 - 20}{1.15/\sqrt{15}} \approx 1.90.$$

The p-value is the chance of observing t0(14) = 1.90, or anything larger, assuming H0 is true, i.e., the p-value is

$$P(T \ge 1.90) = \text{area to the right of } 1.90 \text{ under the t-distribution with } df = 14.$$

The p-value is shown in Figure 10.25.

Figure 10.25: p-value

Using the Excel command T.DIST we can compute the corresponding p-value, as shown in Figure 10.26.

Figure 10.26: p-value calculation in Excel

The p-value is approximately 0.0390, which is less than α = 0.05, so we reject H0. There is significant evidence (t(14) = 1.90, p = 0.0390) that the true mean glue strength, at a room temperature of 100 °F, is greater than 20 lb/in².
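The numbers in this example can also be checked outside of Excel. The sketch below is one way to do so in Python with only the standard library. It takes the 15 measurements to be the values listed in the code, uses the hypothetical helper names `t_pdf` and `area_0_to`, and approximates the tail area with Simpson's rule, so the last digits may differ slightly from Excel's.

```python
import math
import statistics

# Sealing strengths (lb/in^2) from Example 10.5.1, read as 15 separate values.
strengths = [19.6, 24, 19.9, 20.1, 21.1, 20.7, 20.2, 20.3,
             19.1, 21.4, 21.4, 19.9, 20.2, 19.8, 20.8]
mu0 = 20                                   # boundary value of mu under H0

n = len(strengths)
xbar = statistics.mean(strengths)
s = statistics.stdev(strengths)            # sample standard deviation (divides by n - 1)
df = n - 1
t0 = (xbar - mu0) / (s / math.sqrt(n))     # Equation (10.6)


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def area_0_to(t, df, steps=2000):
    """Simpson's-rule area under the t density between 0 and t (t >= 0)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return total * h / 3


# Right-tailed p-value P(T >= t0); written this way because t0 > 0 here.
p_value = 0.5 - area_0_to(t0, df)

print(f"n = {n}, xbar = {xbar:.2f}, s = {s:.2f}")
print(f"t0({df}) = {t0:.2f}, p-value = {p_value:.4f}")
```

The test statistic is computed from the unrounded mean and standard deviation. Plugging the rounded values 20.57 and 1.15 into Equation (10.6) by hand gives about 1.92 rather than 1.90, which is a rounding effect and does not change the conclusion.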

Example 10.5.2.

Mathematical skills assessment tests were given to a sample of first-grade children at the end of the school year. Their scores were as follows.

63, 51, 59, 55, 61, 44, 71, 61, 53, 58, 62, 64, 40

At the 10% level of significance, is there significant evidence that the true mean score on the mathematical skills assessment exam is less than 60?

Solution: With a sample of size 13, a T-test is appropriate if we can assume that the population of all exam scores is approximately normally distributed. Figure 10.27 gives a frequency histogram for the sample, and it does appear reasonable to assume the sample came from a normally distributed population.

Figure 10.27: Frequency Histogram

If μ denotes the mean for the population of exam scores, then the hypotheses are as follows:

$$H_0: \mu \ge 60 \qquad H_1: \mu < 60$$

Note that the direction of extreme is to the left. From the sample we have n = 13, x̄ ≈ 57.1, and s ≈ 8.5. Thus, the test statistic is

$$t_0(12) = \frac{57.1 - 60}{8.5/\sqrt{13}} \approx -1.25.$$

The p-value is the chance of observing t0(12)=-1.25, or anything smaller, assuming H0 is true, i.e., the p-value is

$$P(T \le -1.25) = \text{area to the left of } -1.25 \text{ under the t-distribution with } df = 12.$$

The p-value is shown in Figure 10.28.

Figure 10.28: p-value

Figure 10.29 illustrates the computation of the p-value in Excel.

Figure 10.29: p-value calculation in Excel

The p-value is approximately 0.1181, which is greater than α = 0.10, so we fail to reject H0. There is insufficient evidence (t(12) = -1.25, p = 0.1181) that the mean exam score is less than 60.
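For readers who want to verify this left-tailed calculation without Excel, here is a minimal Python sketch using only the standard library. It takes the 13 scores to be the values listed in the code, the helper names `t_pdf` and `area_0_to` are hypothetical, and the tail area is a Simpson's-rule approximation rather than an exact value.

```python
import math
import statistics

# Exam scores from Example 10.5.2, read as 13 two-digit values.
scores = [63, 51, 59, 55, 61, 44, 71, 61, 53, 58, 62, 64, 40]
mu0 = 60                                   # boundary value of mu under H0

n = len(scores)
xbar = statistics.mean(scores)
s = statistics.stdev(scores)               # sample standard deviation (divides by n - 1)
df = n - 1
t0 = (xbar - mu0) / (s / math.sqrt(n))     # Equation (10.6)


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def area_0_to(t, df, steps=2000):
    """Simpson's-rule area under the t density between 0 and t (t >= 0)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return total * h / 3


# Left-tailed p-value P(T <= t0). Since t0 < 0 here, symmetry gives
# P(T <= t0) = 0.5 - (area between 0 and |t0|).
p_value = 0.5 - area_0_to(abs(t0), df)

print(f"n = {n}, xbar = {xbar:.1f}, s = {s:.1f}")
print(f"t0({df}) = {t0:.2f}, p-value = {p_value:.4f}")
```

As in the text, the statistic is based on the unrounded mean and standard deviation, which is why it comes out near -1.25 even though the rounded values 57.1 and 8.5 give about -1.23 when substituted by hand.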

Example 10.5.3.

Use the data from Example 10.5.2, but with a two-sided direction of extreme, i.e., suppose the hypotheses are:

$$H_0: \mu = 60 \qquad H_1: \mu \ne 60$$

Compute the corresponding p-value.

Solution: The calculation of the test statistic is the same, i.e.,

$$t_0(12) = \frac{57.1 - 60}{8.5/\sqrt{13}} \approx -1.25.$$

The p-value is the chance of observing -1.25 or smaller, together with the chance of observing 1.25 or larger, assuming H0 is true. The p-value is illustrated in Figure 10.30.

Figure 10.30: Two-sided p-value

Figure 10.31 shows a method for computing the p-value using Excel. Note that the command computes the area of one tail (the area to the left of -1.25); then, using the symmetry of the t-distribution, the result is multiplied by 2.

Figure 10.31: Excel Calculation of the p-value
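The symmetry argument can be written out as a short worked equation. The numerical value below simply doubles the one-sided p-value reported in Example 10.5.2; it is not read off Figure 10.31, so the last digit may differ slightly from the Excel output.

```latex
\begin{align*}
p\text{-value} &= P(T \le -1.25) + P(T \ge 1.25) \\
               &= 2\,P(T \le -1.25) \\
               &\approx 2(0.1181) = 0.2362, \qquad df = 12.
\end{align*}
```

Since this value exceeds 0.10, a two-sided test at the 10% level would also fail to reject H0.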

Bottom line: a T-test on a single population mean is appropriate when the population X of numbers is known to be normally distributed, or when the sample size is sufficiently large (n ≥ 30) and the population distribution is not too pathological (look at a histogram of the sample to assess).

Exercises

  1. Since 1982, the United States penny is claimed to have average weight 2.5 grams. A sample of 15 pennies was selected, yielding the following weights (in grams):

    2.48, 2.47, 2.52, 2.48, 2.40, 2.51, 2.50, 2.42, 2.48, 2.54, 2.46, 2.56, 2.51, 2.51, 2.54

    At the 10% level, does the sample provide strong evidence that the mean weight of pennies is not 2.5 grams?

    (a) What is the population of interest?

    (b) State the competing hypotheses.

    (c) What is the direction of extreme?

    (d) Why is it reasonable to use a T-Test?

    (e) Compute the test statistic and corresponding p-value.

    (f) Sketch the p-value.

    (g) State your conclusion, i.e., do you reject H0, or fail to reject H0?

    (h) State the conclusion in a manner appropriate for a scientific journal.

    (i) What type of error could have been made?

  2. Nationwide, the average weight of newborns is approximately 7.5 lbs, but a city health official is concerned that the average birth weight in her city is less. Using recent local hospital records, she obtained the following sample of newborn weights (in lbs):

    6.3, 5.5, 6.9, 8.2, 7.4, 5.9, 7.6, 6.9, 7.9, 6.9, 7.7, 9.6

    At the 5% level, does the sample provide strong evidence that the mean weight of newborns in the city is less than 7.5 lbs?

    (a) What is the population of interest?

    (b) State the competing hypotheses.

    (c) What is the direction of extreme?

    (d) Why is it reasonable to use a T-Test?

    (e) Compute the test statistic and corresponding p-value.

    (f) Sketch the p-value.

    (g) State your conclusion, i.e., do you reject H0, or fail to reject H0?

    (h) State the conclusion in a manner appropriate for a scientific journal.

    (i) What type of error could have been made?

  3. A common estimate for the average length of a song played on the radio is 3 minutes and 30 seconds. Rita believes that the average length of songs on her iPod is longer, but she doesn’t have the time to do a census of all 15896 songs. Instead she takes a sample, with the results given below (in seconds):

    279, 290, 245, 152, 176, 222, 200, 219, 255, 234, 245, 291, 211, 219, 185, 229, 254, 296, 205, 245, 140

    At the 10% level, does the sample provide strong evidence that the mean length of songs on Rita’s iPod is greater than 3 minutes and 30 seconds?

    (a) What is the population of interest?

    (b) State the competing hypotheses.

    (c) What is the direction of extreme?

    (d) Why is it reasonable to use a T-Test?

    (e) Compute the test statistic and corresponding p-value.

    (f) Sketch the p-value.

    (g) State your conclusion, i.e., do you reject H0, or fail to reject H0?

    (h) State the conclusion in a manner appropriate for a scientific journal.

    (i) What type of error could have been made?