13 Testing Two Population Proportions

13.3 Testing a Difference of Two Population Proportions

Goals:
Learn how to do a hypothesis test on two population proportions;
Learn how to build a confidence interval on the difference
  of two population proportions.

Suppose that Population 1 and Population 2 have proportions p1 and p2, respectively, of things of Type A. Suppose further that we want to make a decision on p1-p2. In this situation, it may be appropriate to perform a Two-Independent Samples Z-Test on Two Proportions, which we introduce in this section.

Let n1 and n2 denote the sizes of the samples taken from Populations 1 and 2, respectively. Similarly, let x1 and x2 denote the counts of things of Type A in each sample. These counts produce the sample proportions p^1=x1/n1 and p^2=x2/n2, which are point estimates for the unknown p1 and p2. Hence, p^1-p^2 is a point estimate for p1-p2, i.e.,

\[ \hat{p}_1 - \hat{p}_2 \approx p_1 - p_2. \]

In order to use the hypothesis test, the following conditions must be met:

  1. The two samples are independent.

  2. The samples are simple random samples.

  3. The samples are sufficiently large. This is met if

    \[ n_1\hat{p}_1 \ge 10, \quad n_1(1-\hat{p}_1) \ge 10, \quad n_2\hat{p}_2 \ge 10, \quad \text{and} \quad n_2(1-\hat{p}_2) \ge 10. \]

    Algebraically, this is equivalent to

    \[ x_1 \ge 10, \quad n_1 - x_1 \ge 10, \quad x_2 \ge 10, \quad \text{and} \quad n_2 - x_2 \ge 10. \tag{13.1} \]

    In words, “sufficiently large” means that each sample contains at least 10 things of Type A and at least 10 things not of Type A.

  4. Each sample is not too large relative to the size of the population, i.e., each sample size is not more than 5% of the size of the corresponding population.
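
The two numerical conditions (3 and 4) can be checked directly from the counts. The following is a minimal Python sketch, not part of the original text; the function names and the sample values are hypothetical and chosen only for illustration. Conditions 1 and 2 concern how the data were collected and cannot be verified from the counts alone.

```python
def sample_sizes_large_enough(x1, n1, x2, n2, minimum=10):
    """Check condition (13.1): each sample has at least `minimum`
    things of Type A and at least `minimum` things not of Type A."""
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(count >= minimum for count in counts)


def sample_is_small_fraction(n, population_size):
    """Check condition 4: n is at most 5% of the population size.
    Written with integers (20 * n <= N) to avoid rounding issues."""
    return 20 * n <= population_size


# Illustrative values only (hypothetical, not taken from the text).
print(sample_sizes_large_enough(x1=45, n1=80, x2=12, n2=60))  # all four counts are at least 10
print(sample_sizes_large_enough(x1=45, n1=80, x2=7, n2=60))   # x2 = 7 falls below 10
print(sample_is_small_fraction(n=80, population_size=5000))   # 80 is under 5% of 5000
```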

If these conditions are met, then the Z-score for the observed data is

\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} \tag{13.2} \]

where

\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}. \]

The value p^ is called the pooled sample proportion. The following examples show how to apply the test.
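
Before turning to the examples, here is a minimal Python sketch of the pooled sample proportion and the Z-score in Equation (13.2). It is not part of the original text: the function names are hypothetical, and the counts in the usage lines are made up purely to show the call.

```python
from math import sqrt


def pooled_proportion(x1, n1, x2, n2):
    """Pooled sample proportion: all Type A counts over all observations."""
    return (x1 + x2) / (n1 + n2)


def two_proportion_z(x1, n1, x2, n2):
    """Z-score from Equation (13.2), using the pooled sample proportion."""
    p_hat_1 = x1 / n1
    p_hat_2 = x2 / n2
    p_hat = pooled_proportion(x1, n1, x2, n2)
    standard_error = sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p_hat_1 - p_hat_2) / standard_error


# Hypothetical counts, for illustration only.
print(pooled_proportion(x1=60, n1=100, x2=50, n2=100))
print(two_proportion_z(x1=60, n1=100, x2=50, n2=100))
```

The sign of z follows the order of subtraction: it is positive when p^1 exceeds p^2 and negative otherwise.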

Example 13.3.1.

A researcher wanted to test the hypothesis that the proportion of women in Wisconsin who plan to vote in the upcoming election is greater than the proportion of men in Wisconsin who plan to vote in the upcoming election. She sampled 600 women, 324 of whom reported that they plan to vote. In a sample of 516 men, 258 reported that they plan to vote. At the 5% level of significance, is there significant evidence that a greater proportion of women intend to vote?

Solution: Let p1 and p2 be the population proportions of women and men who intend to vote, respectively. Then the competing hypotheses can be written as follows:

\[ H_0: p_1 - p_2 \le 0 \qquad H_1: p_1 - p_2 > 0 \]

Note that the direction of extreme is to the right.

The sample data are as follows:

\[ n_1 = 600, \quad x_1 = 324, \quad n_2 = 516, \quad x_2 = 258 \]

Note that the “sufficiently large” conditions in (13.1) are satisfied, so we can use the test. We have p^1=324/600=0.54 and p^2=258/516=0.5, and the pooled sample proportion is

\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{324 + 258}{600 + 516} = \frac{582}{1116} \approx 0.5215. \]

Thus, the test statistic is

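\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} = \frac{0.54 - 0.5}{\sqrt{\frac{582}{1116}\left(1-\frac{582}{1116}\right)\left(\frac{1}{600}+\frac{1}{516}\right)}} \approx 1.334. \]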

Now the p-value is

\[ P(Z > 1.334) \approx 1 - \texttt{NORM.DIST(1.334, 0, 1, TRUE)} \approx 0.0911. \]

Now α=0.05, so the p-value is larger than α and we fail to reject H0. Thus, there is insufficient evidence to support the claim that a greater proportion of women intend to vote.
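
The calculation in this example can also be reproduced outside of Excel. The following Python sketch is an illustration rather than the text's own method; it assumes that `statistics.NormalDist` from the Python standard library stands in for Excel's NORM.DIST, and the variable names are chosen only for this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Sample data from Example 13.3.1 (sample 1: women, sample 2: men).
n1, x1 = 600, 324
n2, x2 = 516, 258

# Sample proportions and pooled sample proportion.
p_hat_1 = x1 / n1
p_hat_2 = x2 / n2
pooled = (x1 + x2) / (n1 + n2)

# Test statistic from Equation (13.2).
standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p_hat_1 - p_hat_2) / standard_error

# Right-tailed test: the p-value is the area to the right of z
# under the standard normal curve.
p_value = 1 - NormalDist().cdf(z)

alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Because this sketch carries z forward without rounding it to 1.334 first, the printed p-value may differ from the value above in the last displayed digit.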


Two Remarks:

  1. Here are the calculations in Excel. The highlighted cell shows the calculation of Equation (13.2):

    Figure 13.1: Excel Calculations

  2. In failing to reject H0, recall that a Type II error may have been made. However, the probability that the error was made is not β; it is either 0 or 1. It is 0 if the error was not made, and 1 if the error was made.

Example 13.3.2.

A study was conducted to assess whether student satisfaction at Saint Agatho College had improved compared to 10 years ago. Ten years ago, a survey was conducted in which 100 students were asked if they were satisfied with their experience at Saint Agatho. Of the 100, 75 answered “yes.” The same survey was conducted this year, and of the 100 students surveyed, 83 replied “yes.” At the 10% significance level, determine whether there is significant evidence that student satisfaction has improved. Assume that student enrollment at Saint Agatho has exceeded 2,000 during the last 10 years.

Solution: Let p1 be the population proportion of students 10 years ago who would have answered “yes,” and let p2 be the population proportion of current students who would answer “yes.” Then the competing hypotheses can be written as follows:

\[ H_0: p_1 - p_2 \ge 0 \qquad H_1: p_1 - p_2 < 0 \]

Note that the direction of extreme is to the left.

The sample data are as follows:

\[ n_1 = 100, \quad x_1 = 75, \quad n_2 = 100, \quad x_2 = 83 \]

Note that the conditions to use the Z-test on two proportions are met. We have p^1=75/100=0.75 and p^2=83/100=0.83, and the pooled sample proportion is

\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{75 + 83}{100 + 100} = \frac{158}{200} = 0.79. \]

Thus, the test statistic is

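\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} = \frac{0.75 - 0.83}{\sqrt{0.79(1-0.79)\left(\frac{1}{100}+\frac{1}{100}\right)}} \approx -1.389. \]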

Now the p-value is

\[ P(Z < -1.389) \approx \texttt{NORM.DIST(-1.389, 0, 1, TRUE)} \approx 0.0824. \]

Now α=0.1, so the p-value is smaller than α and we reject H0. Thus, there is significant evidence that student satisfaction has improved compared to 10 years ago.
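
Since the two examples differ only in the direction of extreme, the whole procedure can be wrapped in one function. The following is a sketch under stated assumptions, not the text's own method: the function name and its `tail` parameter are hypothetical, and `statistics.NormalDist` from the Python standard library is assumed as a stand-in for Excel's NORM.DIST.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, tail):
    """Return (z, p_value) for the pooled two-proportion Z-test.

    `tail` is "left" when H1 is p1 - p2 < 0 and "right" when H1 is
    p1 - p2 > 0 (a hypothetical parameter name chosen for this sketch).
    """
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / standard_error
    area_to_left = NormalDist().cdf(z)
    if tail == "left":
        p_value = area_to_left
    elif tail == "right":
        p_value = 1 - area_to_left
    else:
        raise ValueError("tail must be 'left' or 'right'")
    return z, p_value


# Example 13.3.2: sample 1 is from 10 years ago, sample 2 is from this year.
z, p_value = two_proportion_z_test(x1=75, n1=100, x2=83, n2=100, tail="left")
alpha = 0.10
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```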


Two Remarks:

  1. Here are the calculations in Excel. The highlighted cell shows the calculation of Equation (13.2):

    Figure 13.2: Excel Calculations

  2. In rejecting H0, recall that a Type I error may have been made. However, the probability that the error was made is not α; it is either 0 or 1. It is 0 if the error was not made, and 1 if the error was made.
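
One way to read the last remark is that α describes how often the procedure rejects a true H0 over many repetitions, while any single decision is simply right or wrong. The simulation below is a rough sketch of that reading and is not part of the original text. It assumes both populations share the same proportion 0.8 (an assumed value), uses samples of size 100 and a left-tailed test at α=0.10, and all names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def left_tailed_p_value(x1, n1, x2, n2):
    """p-value of the pooled two-proportion Z-test when H1 is p1 - p2 < 0."""
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / standard_error
    return NormalDist().cdf(z)


def simulate_rejection_rate(p=0.8, n=100, alpha=0.10, trials=2000, seed=1):
    """Repeat the test many times with p1 = p2 = p, so H0 is true every
    time, and return the fraction of trials in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw each sample count as the number of successes in n trials.
        x1 = sum(rng.random() < p for _ in range(n))
        x2 = sum(rng.random() < p for _ in range(n))
        if left_tailed_p_value(x1, n, x2, n) < alpha:
            rejections += 1  # each rejection here is a Type I error
    return rejections / trials


rate = simulate_rejection_rate()
print(f"Fraction of trials with a Type I error: {rate:.3f}")
```

Under these assumptions the printed fraction should land near α, although within any one trial the error either occurred or it did not.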

13.3.1 Exercises

  1. Mark the following statements as True or False:

    (a) x1 can be negative.

    (b) n1 can be negative.

    (c) p^2 can be larger than p^1.

    (d) In Equation (13.2), the value of z can be negative.

    (e) p^1 and p^2 can be negative.

    (f) The p-value can be negative.

    (g) The p-value has to be less than α in order to reject H0.

  2. Compute p^1 and p^2 for the following scenarios:

    (a) x1=11; n1=19; x2=23; n2=34

    (b) x1=216; n1=368; x2=197; n2=251

    (c) x1=1032; n1=1229; x2=1237; n2=1465

    (d) x1=88; n1=95; x2=102; n2=119

  3. Compute p^ for each of the situations in problem 2.

  4. Compute z for each of the situations in problem 2.

  5. A college provost wished to test whether freshmen and sophomores take more than 16 credits more often than juniors and seniors do. In the fall semester, she took a random sample of size 200 from freshmen and sophomores, and a random sample of size 200 from juniors and seniors. Of the former, 34 took more than 16 credits that semester, and of the latter, 25 took more than 16 credits. At the 10% level, test whether there is sufficient evidence that freshmen and sophomores take more than 16 credits more often than juniors and seniors do. State the competing hypotheses and the direction of extreme, compute the test statistic and corresponding p-value, and interpret. Address, as well, why using this test was appropriate. Assume that both populations have at least 2,000 students.

  6. A basketball player wishes to test whether she is more accurate when shooting three pointers from the top of the key versus shooting from the corner. To randomize the order in which she took shots, she flipped a coin before each shot until she had reached a total of 50 shots from each position. If the coin flip was heads, she shot from the top of the key, and tails meant she shot from the corner. Of the 50 shots from the top of the key, she made 40, and of the 50 shots from the corner, she made 34. At the 10% level, test whether there is sufficient evidence that she is more accurate when shooting three pointers from the top of the key versus shooting from the corner. State the competing hypotheses and the direction of extreme, compute the test statistic and corresponding p-value, and interpret. Address, as well, why using this test was appropriate.