Math 210
Laboratory 12

Understanding Confidence Intervals

In this lab we will construct confidence intervals based on the duration of eruptions at Old Faithful and the birth weights of children from North Carolina.

  1. Old Faithful. The Old Faithful Geyser in Yellowstone National Park erupts every 35 to 120 minutes. Each eruption lasts for 1½ to 5 minutes. Notice that Old Faithful is not as faithful as one might expect: the time between eruptions and the length of each eruption both vary quite a bit. However, one can estimate the time of the next eruption quite accurately given the duration of the previous eruption. In this lab we will examine data that give the duration of 222 different eruptions of Old Faithful, recorded over a number of days in August 1978 and August 1979. (From Applied Linear Regression, 2nd Edition, by Sanford Weisberg, pp. 231 and 234.) The data set also contains the length of time between consecutive eruptions. We will look at a confidence interval involving the duration. The times in the Old Faithful data set are given in minutes.
    1. Find the mean duration of the eruptions.
    2. Consider the data to be a simple random sample from the durations of all eruptions of Old Faithful. Find a 95% confidence interval for the population mean duration of an eruption.
    3. Make a histogram of the data with the confidence interval marked on it.
    4. Explain what a 95% confidence interval means. Do so in such a way that someone with very little knowledge of statistics would understand.
    5. How would the width of your interval be different if you used a 90% confidence level instead of 95%?
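
The interval calculation in this part can be sketched in Python as follows. This is only an illustration under stated assumptions: the durations listed are made up and are not the Old Faithful data, the function name mean_confidence_interval is hypothetical, and the critical value comes from the standard normal distribution, which is a large-sample approximation. With a small sample, a t critical value with n - 1 degrees of freedom would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(data, level=0.95):
    """Return (low, high) for a large-sample confidence interval for the mean."""
    n = len(data)
    center = mean(data)
    # Standard error of the mean, using the sample standard deviation.
    std_error = stdev(data) / sqrt(n)
    # Two-sided critical value from the standard normal distribution
    # (an approximation; a t critical value is slightly larger for small n).
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_star * std_error
    return center - margin, center + margin


# Made-up durations in minutes, NOT the Old Faithful data set.
durations = [1.8, 3.6, 4.5, 2.0, 4.1, 4.7, 1.9, 3.9, 4.3, 2.2, 4.6, 3.4]

low95, high95 = mean_confidence_interval(durations, 0.95)
low90, high90 = mean_confidence_interval(durations, 0.90)

print(f"mean = {mean(durations):.3f}")
print(f"95% interval: ({low95:.3f}, {high95:.3f}), width {high95 - low95:.3f}")
print(f"90% interval: ({low90:.3f}, {high90:.3f}), width {high90 - low90:.3f}")
```

Both intervals are centered at the sample mean; only the critical value changes with the confidence level, which is what drives the difference in width.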

     
  2. Birth Weights. The data set North Carolina Births gives information about 200 babies. The data come from the 1995 birth registry at the North Carolina State Center for Health and Environmental Statistics. We will look at a couple of confidence intervals involving this data set, using two of its columns: GESTATION, the gestation period for the baby in weeks, and WEIGHT, the birth weight of the baby in ounces.
    1. Find the mean birth weight for the babies.
    2. Consider the data to be a simple random sample from all the births in North Carolina. Find a 95% confidence interval for the mean birth weight of a child in North Carolina.
    3. Make a histogram of the data along with the confidence interval included.
    4. A few of the gestation periods were quite short. Suppose we wanted to estimate the mean weight of a baby born after a gestation period of at least 32 weeks. Remove from the data every birth with a gestation period of less than 32 weeks. Also, some of the records are missing either the gestation period or the birth weight; remove these incomplete records as well. (Remember, if you eliminate a gestation period you must eliminate the corresponding birth weight.) Find a 95% confidence interval for the mean birth weight of a child in North Carolina after a gestation period of at least 32 weeks. Make a histogram of the data with the confidence interval marked on it.
    5. How does your second confidence interval compare to your original in both the width and the center of the interval? Explain why the differences occurred. Be specific.
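
The filtering step in this part can be sketched in Python as follows. The records below are made up and are not the North Carolina Births data; only the column names GESTATION and WEIGHT come from the lab. Missing values are assumed to be stored as None, and the interval uses a standard normal critical value, which is a large-sample approximation.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(data, level=0.95):
    """Return (low, high) for a large-sample confidence interval for the mean."""
    margin = NormalDist().inv_cdf(0.5 + level / 2) * stdev(data) / sqrt(len(data))
    return mean(data) - margin, mean(data) + margin


# Made-up records, NOT the North Carolina Births data.
# None marks a missing value (an assumption about how gaps are stored).
births = [
    {"GESTATION": 39, "WEIGHT": 120},
    {"GESTATION": 40, "WEIGHT": 128},
    {"GESTATION": 28, "WEIGHT": 45},
    {"GESTATION": None, "WEIGHT": 115},
    {"GESTATION": 38, "WEIGHT": None},
    {"GESTATION": 41, "WEIGHT": 131},
    {"GESTATION": 30, "WEIGHT": 60},
    {"GESTATION": 37, "WEIGHT": 109},
    {"GESTATION": 39, "WEIGHT": 117},
    {"GESTATION": 32, "WEIGHT": 88},
]

# Every recorded weight, regardless of gestation period.
all_weights = [r["WEIGHT"] for r in births if r["WEIGHT"] is not None]

# Keep a weight only when the whole record is complete and the
# gestation period is at least 32 weeks.
kept_weights = [
    r["WEIGHT"]
    for r in births
    if r["GESTATION"] is not None
    and r["WEIGHT"] is not None
    and r["GESTATION"] >= 32
]

for label, weights in (("all births", all_weights), ("32+ weeks", kept_weights)):
    low, high = mean_confidence_interval(weights)
    print(f"{label}: n = {len(weights)}, mean = {mean(weights):.1f}, "
          f"95% interval = ({low:.1f}, {high:.1f})")
```

Filtering on the whole record, rather than on each column separately, keeps every gestation period paired with its own birth weight.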

     
  3. We will now look at an applet that shows the effects of sample size and confidence level on the width of a confidence interval. Go to the Rice Virtual Lab in Statistics. Read the instructions that go along with this applet and then select "Begin" under "Confidence Intervals" on the left side of the page. Use the applet enough times, with enough different sample sizes, that you can answer the following questions.
    1. What percent of the 95% confidence intervals would you expect to contain the population mean of 50?
    2. Which confidence level, 95% or 99%, gives wider intervals?
    3. How does sample size affect the width of the intervals?
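
The same kind of experiment can be sketched as a small Python simulation. This is not a reproduction of the applet: it assumes a normal population with the mean of 50 from the question above and a standard deviation of 10 chosen only for illustration, and it treats that standard deviation as known so that each interval is the sample mean plus or minus z* times sigma over the square root of n. The names simulate and z_interval are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

POP_MEAN = 50  # population mean stated in the lab question
POP_SD = 10    # assumed value, for illustration only


def z_interval(sample, level, sigma=POP_SD):
    """Confidence interval for the mean when sigma is treated as known."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_star * sigma / sqrt(len(sample))
    center = mean(sample)
    return center - margin, center + margin


def simulate(sample_size, level, repetitions=2000, seed=0):
    """Return (fraction of intervals containing POP_MEAN, interval width)."""
    rng = random.Random(seed)
    hits = 0
    width = 0.0
    for _ in range(repetitions):
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(sample_size)]
        low, high = z_interval(sample, level)
        hits += low <= POP_MEAN <= high
        # The width depends only on sigma, n, and the level, not on the sample.
        width = high - low
    return hits / repetitions, width


results = {}
for n in (10, 40):
    for level in (0.95, 0.99):
        results[(n, level)] = simulate(n, level)
        coverage, width = results[(n, level)]
        print(f"n = {n:2d}, level = {level:.2f}: "
              f"coverage = {coverage:.3f}, width = {width:.2f}")
```

Comparing the printed rows across confidence levels and across sample sizes mirrors what the three questions above ask about the applet.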