Math 210
Laboratory 5

Scatterplots

Scatterplots are used to display the relationship between two quantitative variables.  Often a relationship does not become evident until the data are displayed graphically.  In this lab, we will look at the relationship between the time of year (as well as the day of the week) and the number of children born.  To do this, we will use a data set that contains the number of births in the United States for each day in 1978.  This data set can be found here.
 

  1. Before plotting the data, think about how birthdays are distributed throughout the year.  A common assumption in probability problems is that birthdays are distributed evenly throughout the year.  Do you think that is the case, or are there times during the year when more births occur than at others?  How about days of the week?  Are there some days of the week when fewer births occur than on others?
  2. Let's now take a look at the data.
    1. Make a scatterplot of the data with "date" as the explanatory variable and "no. of births" as the response variable.  Make sure the axes of your scatterplot are labeled correctly.  Give your scatterplot a title.  (To do this click on Graph > Scatterplot. With the Simple scatterplot highlighted in the first pop-up window click OK.  Put the "no. of births" in as the Y-Variable and "date" as the X-Variable and click OK.)
    2. Describe the shape of your scatterplot and anything interesting about it.
    3. Do more births occur during some parts of the year than others?  If so, which time of year has the larger number of births per day?
  3. You should have seen that the points in your plot split into two groups, an upper one and a lower one.  Let's investigate this phenomenon.
    1. Make a scatterplot of the data again, but this time include the categorical variable "day of the week."  (To do this click on Graph > Scatterplot. With the "With Groups" scatterplot highlighted in the first pop-up window click OK.  Put the "no. of births" in as the Y-Variable, "date" as the X-Variable, and "day of the week" as the categorical variable for grouping, and click OK.)
    2. For the most part, which two days of the week have the fewest births per day?
    3. You should now see that all but six of the symbols in the lower group are Saturdays or Sundays.  What are the dates of these six weekdays?  Are they associated with certain holidays?  If so, which ones?
  4. Let's look at another data set of  birthdates.  These dates do not come from a specific year, but were collected from 480,040 life insurance applications made from 1981 to 1994.  This data set can be found here.
    1. With this new data set, make a simple scatterplot with "date_1" as the explanatory variable (X-Variable) and "no._of_births_1" as the response variable (Y-Variable).  Make sure the axes of your scatterplot are labeled correctly.  Give your scatterplot a title.  
    2. Describe any similarities and differences between this scatterplot and the one you made in question 2.  Why do you think the similarities and differences occurred?
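
For anyone who wants to check the grouping in question 3 outside of Minitab, the idea can be sketched in a few lines of Python. This is only an illustration, not part of the lab procedure: the function names and the cutoff are hypothetical, and the five sample counts are invented placeholders, not values from the 1978 data set. The sketch labels each date with its day of the week and then lists any weekday whose count falls below a cutoff chosen to separate the two groups.

```python
from collections import defaultdict
from datetime import date

# Names indexed by date.weekday(), where Monday is 0 and Sunday is 6.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def group_by_weekday(records):
    """Group (date, births) pairs by day-of-week name (hypothetical helper)."""
    groups = defaultdict(list)
    for day, births in records:
        groups[DAY_NAMES[day.weekday()]].append((day, births))
    return dict(groups)


def low_weekdays(records, cutoff):
    """Return the Monday-Friday records whose count is below the cutoff."""
    return [(day, births) for day, births in records
            if day.weekday() < 5 and births < cutoff]


# Invented placeholder counts, used only to exercise the functions.
# They are NOT the real 1978 values.
sample = [
    (date(1978, 3, 3), 9600),  # Friday
    (date(1978, 3, 4), 8100),  # Saturday
    (date(1978, 3, 5), 7900),  # Sunday
    (date(1978, 3, 6), 9400),  # Monday
    (date(1978, 3, 7), 8200),  # Tuesday (artificially low for the demo)
]

groups = group_by_weekday(sample)
unusual = low_weekdays(sample, cutoff=8500)
```

With the real data, the cutoff would be read off the scatterplot as a value lying between the upper and lower groups, and the dates returned by `low_weekdays` would be the ones to compare against a 1978 calendar.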









