2 Low-Tech Simulations

2.1 Family Planning Simulation

We’ll use the following question to introduce simulations:

  • Some couples use the following family plan in having children: The couple will have children until having a boy or having three children. What is the average number of girls in families that use this plan? (Note: if we know enough probability theory, then we can compute this particular average directly, making some assumptions, of course, but for our purposes, this problem works well for illustrating simulations. It is also important to recognize that gender identity is more complicated than the given scenario.)

The problem illustrates much. Note that there are four possible outcomes for the number of girls: 0, 1, 2, or 3. So there is a population of numbers, where each number represents the number of girls for a particular couple, and if we could see all of those numbers, we could directly compute the average number of girls that we seek, say μ. (It is common to use the Greek letter μ to denote a population mean. We will use other common notations, such as x̄ for sample mean, p for population proportion, and p̂ for sample proportion. All of these are formally defined in the next chapter.) But we cannot, and taking a random sample from the population to estimate the average is unrealistic. Note that

\[ \frac{0+1+2+3}{4} = 1.5 \]

is likely a poor estimate for μ, as about half of the population consists of 0’s.
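
To see why equal weighting misleads, here is a sketch of the direct computation mentioned in the note above. It assumes that each child is equally likely to be a boy or a girl, independently of earlier children; the notation P(k), for the proportion of families with k girls, is introduced here only for illustration.

```latex
\[
P(0)=\tfrac{1}{2},\quad P(1)=\tfrac{1}{4},\quad P(2)=\tfrac{1}{8},\quad P(3)=\tfrac{1}{8},
\qquad
\mu = 0\cdot\tfrac{1}{2} + 1\cdot\tfrac{1}{4} + 2\cdot\tfrac{1}{8} + 3\cdot\tfrac{1}{8}
    = \tfrac{7}{8} = 0.875 .
\]
```

Under that assumption the four outcomes are far from equally common, which is why the unweighted average 1.5 overshoots.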

To execute a simulation that will yield an approximation for μ, equipment is necessary: get a coin, something to write on, and something to write with. In terms of the three-legged stool that is computational science, the mathematical model for the simulation is that the chance of having a boy or a girl is the same, and that the sex of the next child is independent of the sex of any prior children; that is, we’ll use a coin to simulate having children, and we’ll assume that the coin is fair. The “computer simulation” is to decide what a flip means (heads = boy, for example), and then to flip the coin and record the results. For example, suppose we decide that a flip of heads means that a boy is born. For a particular family, there are four possible outcomes:

Flips   Kids   Outcome
H       B      0
TH      GB     1
TTH     GGB    2
TTT     GGG    3

A series of flips, such as

HHTTTTTHHHHHTHHHTTH,

would be interpreted as a collection of trials of different families using the family plan. Scanning from left to right, we’d have

Trials     H  H  TTT  TTH  H  H  H  H  TH  H  H  TTH
Outcomes   0  0  3    2    0  0  0  0  1   0  0  2

This gives a sample {0,0,3,2,0,0,0,0,1,0,0,2} of size n=12, and we can estimate μ using the sample average

\[ \bar{x} = \frac{0+0+3+2+0+0+0+0+1+0+0+2}{12} \approx 0.67. \]
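
The left-to-right scan and the sample average can also be sketched in Python. This is a minimal illustration, not part of the original exercise: the function name `outcomes_from_flips` and its parameters are hypothetical, and it assumes heads means a boy and that any incomplete family at the end of the string is dropped.

```python
def outcomes_from_flips(flips, boy="H", max_children=3):
    """Scan a flip string left to right; return the number of girls per family.

    Assumption (for illustration only): `boy` marks a boy, any other
    character marks a girl, and a trailing incomplete family is ignored.
    """
    outcomes = []
    girls = 0
    for flip in flips:
        if flip == boy:
            # A boy ends the family; record the girls born before him.
            outcomes.append(girls)
            girls = 0
        else:
            girls += 1
            if girls == max_children:
                # Three children with no boy also ends the family.
                outcomes.append(girls)
                girls = 0
    return outcomes


flips = "HHTTTTTHHHHHTHHHTTH"
sample = outcomes_from_flips(flips)
sample_mean = sum(sample) / len(sample)

print(sample)                 # [0, 0, 3, 2, 0, 0, 0, 0, 1, 0, 0, 2]
print(round(sample_mean, 2))  # 0.67
```

The same function can be used to check any other hand-recorded sequence of flips.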

This simulation leads to a number of key questions that will be investigated in the course:

  • Unlike μ, which is fixed, the value of x̄ will likely change from sample to sample. How do these changes behave? Does the behavior change if the sample size is changed?

  • How confident can we be that the sample mean x̄ is close to the population mean μ? How does the level of confidence change if the sample size is changed?

A central part of this course is building skills so that you can boss a computer into doing the simulation and calculations for you. It’s efficient and powerful. Being able to do this requires an ability to articulate instructions that can be followed by a machine. This is algorithmic thinking, a key skill in all disciplines. For example, a nurse diagnosing a patient follows algorithms to identify a solution to a problem. Flow charts, with which you are no doubt familiar, are illustrations of algorithmic thinking.

Here is an instruction set for carrying out the family planning simulation: Designate one side of the coin as “a boy is born” and the other side as “a girl is born.” Then repeat the following:

  1. Flip the coin until a stopping criterion is met (a boy is born, or three children have been born).

  2. Count the number of girls, and then add that as a data point in the sample.

  3. Go to Step 1. Stop when you can’t take it anymore.

The data analysis, i.e., computing the sample mean, completes the three-legged stool of computational science. Now this discipline of computational science is deep, sufficiently so that one can earn a Ph.D. in it, but this example does capture a key part of the spirit of the enterprise.
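
As a preview of bossing a computer around, the instruction set above might be sketched in Python as follows. This is one possible sketch under stated assumptions, not a prescribed implementation: the names `simulate_family` and `simulate_sample` are hypothetical, a pseudo-random number below 0.5 stands in for a fair coin landing on "boy", and "stop when you can’t take it anymore" is replaced by a fixed number of families.

```python
import random


def simulate_family(rng, max_children=3):
    """Steps 1 and 2: 'flip' until a boy or three children; return the girl count."""
    girls = 0
    for _ in range(max_children):
        if rng.random() < 0.5:  # assumed fair coin: this branch means a boy
            break
        girls += 1
    return girls


def simulate_sample(n_families, seed=None):
    """Step 3: repeat for a fixed number of families and collect the sample."""
    rng = random.Random(seed)  # a seed makes the run reproducible
    return [simulate_family(rng) for _ in range(n_families)]


sample = simulate_sample(10_000, seed=1)
sample_mean = sum(sample) / len(sample)  # the data analysis step
print(len(sample), sample_mean)
```

Changing `n_families` and rerunning with different seeds is a quick way to see how the sample mean varies from sample to sample.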


Concepts Check

  1. Suppose that a sequence of flips results in TTTTHHHHTTT. Using the procedure outlined in Table 2.1, what are the corresponding outcomes?

  2. Add the outcomes in the prior problem to the sample above, and compute the new sample mean x̄. You should get 15/18 ≈ 0.83.

  3. Of the two sample means, which one would you expect to be closer to the population mean μ? Why?