9 Hypothesis Testing - An Introduction

9.5 Simulations

We can use Microsoft Excel to simulate taking random samples from populations such as these and observing the behavior of probabilities like α, β and p-values.

9.5.1 Simulating α and β

For ease of input into Excel, let’s use the populations as given in Figure 9.5 on page 9.5. Create a column for each box (population), inputting a value for each number in the box, as in Figure 9.11.

Figure 9.11: Populations 1 & 2 in Excel

Let’s use the decision rule“reject H0 if the selected number is 4 or greater,” so that α=1/10=0.1 and β=6/10=0.6. We’ll use Excel to randomly pick a box, and then randomly select a number from the chosen box.

The primary command we will use is 𝙸𝙽𝙳𝙴𝚇. From a block of given cells, 𝙸𝙽𝙳𝙴𝚇 allows the user to select the value from a particular cell within the block. The command takes three inputs:

  • The first input is a block of cells from which the cell will be selected.

  • The second is the row from which to pick.

  • The third input is the column from which to pick.

For example, =𝙸𝙽𝙳𝙴𝚇(𝙰𝟸:𝙱𝟷𝟷,𝟿,𝟷) would output the value in cell 𝙰𝟷𝟶, because the ninth row in the block is row 10, and the first column is column A. The output is shown in Figure 9.12.

Figure 9.12: INDEX example

Now using 𝚁𝙰𝙽𝙳𝙱𝙴𝚃𝚆𝙴𝙴𝙽, we can randomly select a column and row, as shown in 9.13.

Figure 9.13: A Single Sample Example

This is a simulated sample, and from it we must make a decision. Since the selected number is not 4 or greater, we would not reject H0, i.e., there is insufficient evidence to reject the hypothesis that the number came from Population 1. Note that since we failed to reject H0, a Type II error might have occurred, though since we didn’t record which box was selected, it is not possible to know if the error was made. This is the case in practice, in that when a decision is made, it is impossible to know if an error occurred. However, this is a simulation, meaning we can control what what is known, i.e., we can keep track of which population was selected, and hence, determine if an error was made.

All we need to do is remove the 𝚁𝙰𝙽𝙳𝙱𝙴𝚃𝚆𝙴𝙴𝙽(𝟷,𝟸) from inside the 𝙸𝙽𝙳𝙴𝚇 command, so that we can see which population is chosen before the sample is generated. Figure 9.14 demonstrates.

Figure 9.14: Another Single Sample Example

The command in cell 𝙳𝟸 selected column 1, which is then referenced in the 𝙴𝟸 command. Having chosen Population 1 makes H0 true. Since a 2 was chosen from the box, we would fail to reject H0, and hence, we would make a good decision.

Note: We’re going to drag commands down, so you’ll want absolute referencing on the row numbers for the cells that refer to Population 1 and Population 2, i.e., note the $’s in the 𝙸𝙽𝙳𝙴𝚇 command in Figure 9.14.

Using Excel’s 𝙸𝙵 command, we can encode the decision rule of “reject H0 is a 4 or more extreme is observed.” For ease of uisng 𝙸𝙵, we’ll denote a decision as follows:

  1. 1 = Fail to Reject H0;

  2. 2 = Reject H0.

The decision rule can be executed with 𝙸𝙵(𝙲𝙴𝙻𝙻𝟺,𝟸,𝟷), as shown in Figure 9.15.

Figure 9.15: Decision Rule as an 𝙸𝙵

Note that the choice of encoding the decisions as above means the following: If the selected population and decision outcome are the same, then an error has not occurred. For example, if Population 1 is selected, and the decision is a 1, then an error did not occur.

We can use 𝙸𝙵 to test whether a Type I error has occurred. Such an error has occurred if Population 1 was selected and H0 was rejected. We can use Excel’s 𝙰𝙽𝙳 command to check both conditions as shown in Figure 9.16.

Figure 9.16: Checking for Type I errors

Similarly, we can check whether a Type II error has occurred, as in Figure 9.17.

Figure 9.17: Checking for Type II errors

We’re now set to do a large number of simulations in Excel. Drag the commands down, as if Figure 9.18, for at least a few hundred rows.

Figure 9.18: Population 1 & 2 Simulation

Now use Excel to check on the following:

  1. 1.

    When H0 is true, a Type II error never occurs. Similarly, when H0 is false, a Type I error never occurs.

  2. 2.

    When H0 is true, the percent of Type I errors made is about 10%, and when H0 is false, the percent of Type II errors made is about 60%. (Recall that for the decision rule used in this example, α=0.1 and β=0.6.

9.5.2 Simulating Samples of Size 2

Without too much pain, we can use the work above to simulate making decisions with samples of size 2.

First, when working with samples of size greater than 1, we will need to use summary statistics as tools. The examples used in this chapter all involve populations that consist of numbers, so computing a sample mean from sample data is a natural choice, and will be the choice in this simulation.

To make a reasonable comparison, let’s use the same setup as in Figure 9.18 on page 9.5, and the same decision rule of “reject H0 if the sample mean is 4 or greater.” Only column 𝙴 needs to modified: After the box is selected, two numbers need to be chosen from the population, and then their average computed. To make the computation much less complicated, we will allow sampling with replacement, i.e., after selecting a number from the chosen box, it is placed back into the population and can be chosen again. With this assumption, column 𝙴 need only be changed as in Figure 9.19.

Figure 9.19: Sample Size of 2

To construct the command in cell 𝙴𝟸, copy and past the 𝙸𝙽𝙳𝙴𝚇 command into the Excel formula bar, putting a plus sign between the two commands:

𝙸𝙽𝙳𝙴𝚇(𝙰$𝟸:𝙱$𝟷𝟷,𝚁𝙰𝙽𝙳𝙱𝙴𝚃𝚆𝙴𝙴𝙽(𝟷,𝟷𝟶),𝙳𝟸)+𝙸𝙽𝙳𝙴𝚇(𝙰$𝟸:𝙱$𝟷𝟷,𝚁𝙰𝙽𝙳𝙱𝙴𝚃𝚆𝙴𝙴𝙽(𝟷,𝟷𝟶),𝙳𝟸)

Then add parentheses on the outside, and divide by 2:

=(𝙸𝙽𝙳𝙴𝚇(𝙰$𝟸:𝙱$𝟷𝟷,𝚁𝙰𝙽𝙳𝙱𝙴𝚃𝚆𝙴𝙴𝙽(𝟷,𝟷𝟶),𝙳𝟸)+𝙸𝙽𝙳𝙴𝚇(𝙰$𝟸:𝙱$𝟷𝟷,𝚁𝙰𝙽𝙳𝙱𝙴𝚃𝚆𝙴𝙴𝙽(𝟷,𝟷𝟶),𝙳𝟸))/𝟸

Now copy the command down the rest of the column, and recompute the following:

  1. 1.

    When H0 is true, what is the percentage of Type I errors committed?

  2. 2.

    When H0 is false, what is the percentage of Type II errors committed?

You should observe percentages of errors that are greatly reduced, i.e., if the sampling process is reliable, then increasing sample size will increase confidence in making a good decision.


9.5.3 p-value Simulation

We can use the work done previously to quickly build a simulation for estimating p-values for samples of size 2. First, recall that a p-value is the chance of observing the test statistic, or anything more extreme, assuming H0 is true. Thus, to simulate a p-value in this scenario, Population 1 is always chosen, and column 𝙳 can be replaced with 1’s, as shown in 9.20.

Figure 9.20: p-value Simulation

Column 𝙴 will automatically update to simulated means for samples of size 2.

As an example computation, suppose the observed sample mean is 3.5. The corresponding p-value is the chance of observing 3.5 (the test statistic) or anything larger (more extreme) assuming the sample came from Population 1 (H0 is assumed true). We can estimate the chance using the simulated sample means. After generating a large number of sample means, compute the proportion of means that are 3.5 or larger, which can be done using 𝙲𝙾𝚄𝙽𝚃𝙸𝙵 and 𝙲𝙾𝚄𝙽𝚃:

=𝙲𝙾𝚄𝙽𝚃𝙸𝙵(𝙴:𝙴,"3.5")/𝙲𝙾𝚄𝙽𝚃(𝙴:𝙴)

Figure 9.21 illustrates.

Figure 9.21: p-value Estimate

Thus, if a sample mean of 3.5 is observed from a sample of size 2, then the approximate p-value is 0.0467. (The estimate shown was generated from a 1307 simulated sample means.)