Math 210
Laboratory 6
Regression and Correlation
Least-squares regression is used to describe a linear relationship
between two variables, and correlation is used to describe the strength
of this relationship. For this lab, you are going to determine the
regression equation and correlation coefficients for a couple of data
sets and answer some questions associated with either the regression
equation or the correlation coefficient.
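As a rough illustration of what the software computes, the sketch below works out the least-squares slope, intercept, and correlation coefficient directly from the usual textbook formulas. The function name `least_squares` and the five data points are made up for illustration; they are not part of either lab data set.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return (slope, intercept, r) for paired data xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Made-up data for illustration only (not from the lab data sets).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept, r = least_squares(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} x,  r = {r:.3f}")
```

The slope and r always share the same sign, because both are built from the same cross-product sum.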
- Robert Pershing Wadlow was born in Alton, Illinois, on February 22,
1918. At birth he weighed a normal eight pounds, six ounces. However,
due to an overactive pituitary gland, he grew at an astounding rate and
continued to grow his entire life. When he died at age 22 in Manistee,
Michigan, he had reached a height of 8 feet 11.1 inches. This qualifies
him as the tallest person in history, as recorded in the Guinness Book
of Records. Get the Robert Wadlow data. The age given in this data set
is his age in years, and the height is given in inches.
- Plot the data with age as the explanatory variable (or x-variable)
and height as the response variable (or y-variable). (Graph >
Scatterplot) Make sure the axes of your scatterplot are labeled
correctly. Give your scatterplot a title. Do your points look fairly
linear?
- Find the equation for the least-squares regression line where his age
is the explanatory variable (or predictor) and his height is the
response variable. (Stat > Regression > Regression)
- Describe what the y-intercept of your regression line means in terms
of his age and height. Be specific and use the appropriate numbers in
your explanation. Does this number make sense? Explain.
- Describe what the slope of your regression line means in terms of his
age and height. Be specific.
- Find the correlation for Robert Wadlow's age and height. Based on the
correlation, is his age a good predictor of his height? (Stat > Basic
Statistics > Correlation)
- Plot the data with age as the explanatory variable and height as the
response variable along with the regression line. (Stat > Regression >
Fitted Line Plot)
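To see how the y-intercept and slope questions relate to the fitted equation, the following sketch reads both quantities off a prediction function. The coefficients `INTERCEPT` and `SLOPE` and the function `predict_height` are hypothetical, chosen only for illustration; they are not the values you should obtain from the Robert Wadlow data.

```python
# Hypothetical coefficients for illustration only.
# They are NOT the fitted values for the Robert Wadlow data.
INTERCEPT = 30.0  # inches (assumed)
SLOPE = 3.5       # inches per year of age (assumed)


def predict_height(age_years):
    """Predicted height in inches from the line: height = INTERCEPT + SLOPE * age."""
    return INTERCEPT + SLOPE * age_years


# The y-intercept is the predicted height at age 0.
height_at_zero = predict_height(0)

# The slope is the change in predicted height for each additional year of age.
change_per_year = predict_height(11) - predict_height(10)

print(f"Predicted height at age 0: {height_at_zero} inches")
print(f"Predicted change per year: {change_per_year} inches")
```

Whether the prediction at age 0 is plausible depends on whether age 0 lies near the ages actually present in the data, which is worth checking before interpreting the intercept literally.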
- The world record in the men's mile run was last set on July 7, 1999,
by Hicham El Guerrouj of Morocco. (El Guerrouj later won gold in the
1500 m and 5000 m runs at the 2004 Olympics in Athens and has been
retired since 2006.) Get the mile run data. This table lists the year
in which the record was broken, the runner, his nationality, and his
time in seconds.
- Transform the year column so that it gives years since 1900. (Calc >
Calculator. Store results in some open column. Your expression should
be year - 1900.) Label your new column of data "years since 1900." Plot
the data with years since 1900 as the explanatory variable (x-variable)
and record time as the response variable (y-variable). Make sure your
scatterplot is labeled appropriately. Do these points fall in a fairly
linear pattern?
- Find the equation for the least-squares regression line that fits
these data. Your explanatory variable (or predictor) should be number
of years since 1900 and your response variable should be the time in
seconds.
- Describe what the y-intercept and the slope of your regression line
mean in terms of the year and the record time. Be specific and use the
appropriate numbers in your explanation.
- Plot the data with year (since 1900) as the explanatory variable and
record time as the response variable along with the regression line.
- When the first person ran a mile in less than 4 minutes, it made
headlines around the world. Looking at your data, who was the first
person to run a mile in less than 4 minutes? When did that occur?
- Using your regression equation, in what year does it predict that
someone would run a mile in exactly 4 minutes (240 seconds)? Is the
year the same as your answer to the previous question? If not, why is
there a difference?
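The last question asks you to run the regression equation backwards: set the predicted time to 240 seconds and solve for the year. A minimal sketch of that algebra follows, assuming a line of the form time = intercept + slope × (years since 1900). The coefficient values and the function name `years_since_1900_for_time` are hypothetical and are not the values you should obtain from the mile run data.

```python
# Hypothetical coefficients for illustration only.
# They are NOT the fitted values for the mile run data.
INTERCEPT = 260.0  # seconds, predicted record time in 1900 (assumed)
SLOPE = -0.5       # seconds per year (assumed)


def years_since_1900_for_time(target_seconds, intercept, slope):
    """Solve target = intercept + slope * x for x (years since 1900)."""
    if slope == 0:
        raise ValueError("a horizontal line never reaches a different time")
    return (target_seconds - intercept) / slope


four_minutes = 4 * 60  # the data set records times in seconds
x = years_since_1900_for_time(four_minutes, INTERCEPT, SLOPE)
calendar_year = 1900 + x  # undo the "years since 1900" transformation

print(f"Predicted years since 1900: {x}")
print(f"Predicted calendar year: {calendar_year}")
```

Remember to add 1900 back at the end, since the explanatory variable was transformed before fitting.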