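Suppose that $X_1$ and $X_2$ are populations of numbers with means $\mu_1$ and $\mu_2$, respectively, and suppose that the difference $\mu_1 - \mu_2$ makes sense. Suppose further that we wish to make a decision on the difference $\mu_1 - \mu_2$, and that the samples drawn from $X_1$ and $X_2$ are independent, i.e., selections from $X_1$ have no impact on selections from $X_2$. In this situation, it may be appropriate to perform a Two-Independent Samples $t$-Test, which we introduce in this section.
Suppose that a simple random sample of size $n_1$ is drawn from $X_1$ and a simple random sample of size $n_2$ is independently drawn from $X_2$, with sample means $\bar{x}_1$ and $\bar{x}_2$ and sample standard deviations $s_1$ and $s_2$. If the populations $X_1$ and $X_2$ are normally distributed, or if the sample sizes $n_1$ and $n_2$ are both sufficiently large, then it is reasonable to use a $t$-Test with test statistic given by
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \qquad (12.3)$$
If it is reasonable to assume that the populations $X_1$ and $X_2$ have equal variances, then the test statistic reduces to
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \qquad (12.4)$$
where $s_p$ denotes a pooled standard deviation and is given by
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \qquad (12.5)$$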
If you are unsure whether it is proper to assume that the two populations have the same variance, then use Equation (12.3). But if you can assume the variances are the same, then Equation (12.4) is preferred. (Assuming the variances are equal is additional information, which likely means a stronger result. The pooled standard deviation probably won't change the value of $t$ by much, but the larger degrees of freedom will likely yield a smaller $p$-value.)
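The two statistics can be compared side by side with a minimal Python sketch. It assumes the standard forms of the unpooled and pooled statistics, and it takes the smaller of the two sample sizes minus one as the degrees of freedom in the unpooled case (a conservative convention; statistical software often uses Welch's approximation instead). The function names and the summary statistics are hypothetical and chosen only for illustration.

```python
import math

def unpooled_t(mean1, s1, n1, mean2, s2, n2):
    """Test statistic when equal variances are NOT assumed."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of the difference
    t = (mean1 - mean2) / se
    df = min(n1 - 1, n2 - 1)                  # conservative df (an assumption here)
    return t, df

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation: square root of the weighted average variance."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Test statistic when equal variances ARE assumed."""
    sp = pooled_sd(s1, n1, s2, n2)
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    df = n1 + n2 - 2
    return t, df

# Hypothetical summary statistics, not taken from the text.
summary = dict(mean1=52.0, s1=4.0, n1=12, mean2=49.0, s2=5.0, n2=15)

t_u, df_u = unpooled_t(**summary)
t_p, df_p = pooled_t(**summary)
print(f"unpooled: t = {t_u:.4f}, df = {df_u}")
print(f"pooled:   t = {t_p:.4f}, df = {df_p}")
```

With these made-up inputs the two statistics are close to each other, while the pooled version has more than twice the degrees of freedom, which is the trade-off described above.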
A researcher wished to test whether filling a football with air or helium would change the average distance of a punt. One set of footballs was filled with air and the other with helium. Each football was punted by the researcher, and the distance was recorded in yards. The data is given in the table below. At a significance level of 5%, determine whether there is significant evidence that the true average distance of helium-filled punts is different from the true average distance of air-filled punts. Assume that the two populations of distances are normally distributed.
Solution. Recall that if we let $\mu_A$ and $\mu_H$ denote the true average distances of the researcher's punts with air-filled and helium-filled footballs, respectively, then the competing hypotheses are: $H_0: \mu_A = \mu_H$ versus $H_1: \mu_A \neq \mu_H$.
The direction of extreme is two-sided, and the samples are independent. Since the populations are normally distributed, a two-independent samples $t$-Test is appropriate. One way to assess whether it is reasonable to assume the population standard deviations are equal is to examine the summary statistics, as shown in Figure 12.7.
We will later add to our repertoire of ways to assess whether two standard deviations differ, but here the two sample standard deviations do not strongly suggest different variances, so we will use the pooled test. The computation of the pooled standard deviation $s_p$ using Equation (12.5) is shown in Figure 12.8. (Note that the value you get should lie between the two sample standard deviations. If it does not, then something went wrong in the calculation.)
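The check that the pooled standard deviation lies between the two sample standard deviations follows from writing the pooled variance as a weighted average. A short derivation, assuming the usual pooled formula with sample sizes $n_1, n_2 \ge 2$ and sample standard deviations $s_1, s_2$ (this notation is an assumption made here):

```latex
\begin{align*}
s_p^2 &= \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
       = w\,s_1^2 + (1 - w)\,s_2^2,
       \qquad w = \frac{n_1 - 1}{n_1 + n_2 - 2} \in (0, 1), \\
\min(s_1^2, s_2^2) &\le s_p^2 \le \max(s_1^2, s_2^2)
\quad\Longrightarrow\quad
\min(s_1, s_2) \le s_p \le \max(s_1, s_2).
\end{align*}
```

Because the weights $w$ and $1 - w$ are positive and sum to one, the pooled variance cannot fall outside the range of the two sample variances, and taking square roots preserves the ordering.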
Recalling that this is a two-sided test, the $p$-value calculation is shown in Figure 12.10.
Since the $p$-value is greater than 0.05, we fail to reject $H_0$. That is, there is insufficient evidence that the true average punting distances (when kicked by the researcher) differ depending on whether the football is filled with air or helium.
A study was conducted to assess whether student learning in college algebra would improve in an online course versus a traditional classroom. A single instructor with experience in both environments was selected to run two courses, one traditional and one online. A group of 50 college algebra students were randomly assigned to either of the two sections, so that 25 were in each. At the end of the semester, students were given a common final exam. The scores on the final are given below. Using the data, at a 10% significance level, determine if there is significant evidence that student learning is improved in online college algebra courses. Assume that the two populations of final exam scores are normally distributed.
Solution. Let $\mu_O$ and $\mu_T$ denote the true average exam scores for online and traditional college algebra courses taught by the instructor, respectively. The competing hypotheses are: $H_0: \mu_O = \mu_T$ versus $H_1: \mu_O > \mu_T$.
The direction of extreme is to the right.
By computing the summary statistics as in Figure 12.11, it is evident that assuming the population variances are the same is not reasonable, so we use the test statistic in Equation (12.3).
The $p$-value is given in Figure 12.13.
Since the $p$-value is not larger than 0.1, we reject $H_0$, i.e., there is significant evidence that the true average exam score in the online courses is larger than the true average score in the traditional courses.
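In both examples, the $p$-value is a tail area under the $t$-distribution taken in the direction of extreme. The following is a minimal Python sketch of that step using only the standard library. The helper names `t_pdf`, `t_cdf`, and `p_value` are hypothetical, the CDF is approximated with Simpson's rule rather than taken from a statistics package, and the inputs `t = 1.5`, `df = 20` are made up rather than drawn from either example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x): Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                # area under the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def p_value(t, df, direction):
    """Tail area in the direction of extreme: 'left', 'right', or 'two-sided'."""
    if direction == "right":
        return 1 - t_cdf(t, df)
    if direction == "left":
        return t_cdf(t, df)
    if direction == "two-sided":
        return 2 * (1 - t_cdf(abs(t), df))
    raise ValueError("direction must be 'left', 'right', or 'two-sided'")

# Made-up test statistic and degrees of freedom, for illustration only.
t_stat, df, alpha = 1.5, 20, 0.05
p = p_value(t_stat, df, "two-sided")
print("reject H0" if p <= alpha else "fail to reject H0", f"(p = {p:.4f})")
```

The same test statistic gives a two-sided $p$-value that is twice the one-sided value, which is why the direction of extreme must be settled before the calculation.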
We can construct confidence intervals for the difference of two means using the $t$-distribution, provided the proper assumptions hold: either the two samples come from normal distributions, or the sample sizes are sufficiently large. For the independent case, we must know whether or not the samples come from random variables with the same variance. This, again, will determine whether we use Eq. (12.3) or Eq. (12.4).
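Let’s first look at the case when the samples come from random variables with different variances. In this case, the confidence interval is constructed from the following formula.
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \qquad (12.6)$$
Here $\bar{x}_i$, $s_i$, and $n_i$ are the mean, standard deviation, and size of sample $i$, and $t_{\alpha/2}$ is the critical value for the chosen confidence level. The degrees of freedom are then given by the smaller of $n_1 - 1$ or $n_2 - 1$.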
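As a rough illustration of the unequal-variance interval, the sketch below computes the point estimate plus or minus the critical value times the unpooled standard error. The function name `unpooled_ci` and the summary statistics are hypothetical, and the critical value is assumed to be read from a $t$-table for the smaller of the two sample sizes minus one.

```python
import math

def unpooled_ci(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Confidence interval for mean1 - mean2 without assuming equal variances."""
    diff = mean1 - mean2
    margin = t_crit * math.sqrt(s1**2 / n1 + s2**2 / n2)  # critical value x standard error
    return diff - margin, diff + margin

# Hypothetical summary statistics, not taken from the text.
n1, n2 = 12, 15
df = min(n1 - 1, n2 - 1)   # 11 degrees of freedom
t_crit = 2.201             # table value for 95% confidence with df = 11

low, high = unpooled_ci(52.0, 4.0, n1, 49.0, 5.0, n2, t_crit)
print(f"95% CI for the difference in means: ({low:.3f}, {high:.3f})")
```

An interval that contains 0, as this made-up one does, is consistent with failing to reject equality of the means in a two-sided test at the matching significance level.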
A study was conducted to assess whether student learning in college algebra would improve in an online course versus a traditional classroom. A single instructor with experience in both environments was selected to run two courses, one traditional and one online. A group of 50 college algebra students were randomly assigned to either of the two sections, so that 25 were in each. At the end of the semester, students were given a common final exam. The scores on the final are given below. Assume that the two populations of final exam scores are normally distributed. Construct a 98% confidence interval for the difference in mean exam scores.
Solution. Extracting all of the pertinent descriptive stats we have the following.
After obtaining the critical value and the standard error, we plug the respective values into Eq. (12.6).
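For the case when we can assume equal variances, we instead calculate the pooled variance, $s_p^2$, using Eq. (12.5). The confidence interval formula is then given as
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \qquad (12.7)$$
with $n_1 + n_2 - 2$ degrees of freedom, where $n_1$ and $n_2$ are the two sample sizes.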
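The pooled interval can be sketched the same way. The example below assumes equal population variances and the usual pooled standard deviation. The function name `pooled_ci` and the summary statistics are hypothetical, and the critical value is assumed to come from a $t$-table with the sum of the sample sizes minus two degrees of freedom.

```python
import math

def pooled_ci(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Confidence interval for mean1 - mean2 assuming equal population variances."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    diff = mean1 - mean2
    margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    return diff - margin, diff + margin

# Hypothetical summary statistics, not taken from the text.
n1, n2 = 12, 15
df = n1 + n2 - 2           # 25 degrees of freedom
t_crit = 2.060             # table value for 95% confidence with df = 25

low, high = pooled_ci(52.0, 4.0, n1, 49.0, 5.0, n2, t_crit)
print(f"95% CI for the difference in means: ({low:.3f}, {high:.3f})")
```

The larger degrees of freedom give a smaller critical value than in the unequal-variance case, which tends to produce a somewhat narrower interval when the sample standard deviations are similar.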
A researcher wished to test whether filling a football with air or helium would change the average distance of a punt. One set of footballs was filled with air and the other with helium. Each football was punted by the researcher, and the distance was recorded in yards. Assume that the two populations of distances are normally distributed and that their variances are equal. Construct an 85% confidence interval for the difference in mean distance kicked.
Solution. Extracting all of the pertinent descriptive stats we have the following.
After calculating the pooled standard deviation, the critical value, and the standard error, we plug the values into Eq. (12.7).
An engineer is interested in whether a new sheet metal molding machine performs faster than an older version the company has been using. To test this claim, the times to complete the same mold are recorded for both the new machine and the old machine. Their times in seconds are given below. The data is assumed to come from normally distributed random variables. Assume that the variances are equal. Answer the following.
| Old | 32.73 | 33.32 | 33.83 | 32.27 | 33.39 | 33.11 | 31.70 | 33.82 | 32.67 | 32.62 |
|---|---|---|---|---|---|---|---|---|---|---|
| New | 30.72 | 30.86 | 31.84 | 31.58 | 31.90 | 33.02 | 31.34 | 30.63 | 32.12 | 30.76 |
What is the population of interest?
State the competing hypotheses.
What is the direction of extreme?
What test will you use, and why is it reasonable to use?
Compute the test statistic and corresponding $p$-value.
Sketch the $p$-value.
Using , state your conclusion, i.e., do you reject $H_0$ or fail to reject $H_0$?
State the conclusion in a manner appropriate for a scientific journal.
What type of error could have been made?
Calculate a 95% confidence interval for the difference in means.
Differences in post-test/pre-test scores are calculated in an effort to determine whether curriculum changes are beneficial. Suppose two groups receive the same instruction, except that one extra unit is taught to Group B. Group A does not receive this instruction. Both groups received the same exams regardless of whether they knew the material. Below is a sample taken from each group. Assume the data comes from normally distributed random variables and that the variances are not equal.
| Group A | -1 | -1 | -3 | -3 | 1 | 0 | 2 | 4 | 3 | 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Group B | -1 | 1 | 0 | 2 | 2 | 1 | 1 | 1 | 0 | 1 |
What is the population of interest?
State the competing hypotheses.
What is the direction of extreme?
What test will you use, and why is it reasonable to use?
Compute the test statistic and corresponding $p$-value.
Sketch the $p$-value.
Using , state your conclusion, i.e., do you reject $H_0$ or fail to reject $H_0$?
State the conclusion in a manner appropriate for a scientific journal.
What type of error could have been made?
Calculate a 95% confidence interval for the difference in means.