The Olympic gold medal-winning height for the women's high jump, \(\textit{Wgold}\), is often lower than the best height achieved in other international women's high jump competitions in that same year.
The table below lists the Olympic year, \(\textit{year}\), the gold medal-winning height, \(\textit{Wgold}\), in metres, and the best height achieved in all international women's high jump competitions in that same year, \(\textit{Wbest}\), in metres, for each Olympic year from 1972 to 2020.
\begin{array}{|c|c|c|}
\hline
\textit{year} & \textit{Wgold}\ \text{(m)} & \textit{Wbest}\ \text{(m)} \\
\hline
1972 & 1.92 & 1.94 \\
1976 & 1.93 & 1.96 \\
1980 & 1.97 & 1.98 \\
1984 & 2.02 & 2.07 \\
1988 & 2.03 & 2.07 \\
1992 & 2.02 & 2.05 \\
1996 & 2.05 & 2.05 \\
2000 & 2.01 & 2.02 \\
2004 & 2.06 & 2.06 \\
2008 & 2.05 & 2.06 \\
2012 & 2.05 & 2.05 \\
2016 & 1.97 & 2.01 \\
2020 & 2.04 & 2.05 \\
\hline
\end{array}
A scatterplot of \(\textit{Wbest}\) versus \(\textit{Wgold}\) for this data is also provided.
When a least squares line is fitted to the scatterplot, the equation is found to be:
\(\textit{Wbest} = 0.300 + 0.860 \times \textit{Wgold}\)
The correlation coefficient is 0.9318
\begin{array}{|l|l|}
\hline
\rule{0pt}{2.5ex}\text { strength } \rule[-1ex]{0pt}{0pt} & \quad \quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\\
\hline
\rule{0pt}{2.5ex}\text { direction } \rule[-1ex]{0pt}{0pt} & \\
\hline
\end{array}
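As a check on the quoted line, the slope, intercept and correlation coefficient can be recomputed from the tabulated heights. The sketch below is illustrative only: the function name `least_squares` is an assumption introduced here, and the data pairs are read from the table in year order (1972 to 2020).

```python
from math import sqrt

# Wgold and Wbest in metres, in year order 1972..2020 (read from the table)
wgold = [1.92, 1.93, 1.97, 2.02, 2.03, 2.02, 2.05, 2.01, 2.06, 2.05, 2.05, 1.97, 2.04]
wbest = [1.94, 1.96, 1.98, 2.07, 2.07, 2.05, 2.05, 2.02, 2.06, 2.06, 2.05, 2.01, 2.05]

def least_squares(x, y):
    """Return (intercept, slope, r) for the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-deviations about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r

intercept, slope, r = least_squares(wgold, wbest)
print(f"Wbest = {intercept:.3f} + {slope:.3f} x Wgold, r = {r:.4f}")
```

Under these assumptions the printed values should agree with the quoted intercept 0.300, slope 0.860 and correlation coefficient 0.9318.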
Data Analysis, GEN1 2022 VCAA 7-8 MC
The association between the weight of a seal's spleen, spleen weight, in grams, and its age, in months, for a sample of seals is non-linear.
This association can be linearised by applying a \(\log _{10}\) transformation to the variable spleen weight.
The equation of the least squares line for this scatterplot is
\(\log _{10}\) (spleen weight) = 2.698 + 0.009434 × age
Question 7
The equation of the least squares line predicts that, on average, for each one-month increase in the age of the seals, the increase in the value of \(\log _{10}\) (spleen weight) is
- 0.009434
- 0.01000
- 1.020
- 2.698
- 5.213
Question 8
Using the equation of the least squares line, the predicted spleen weight of a 30-month-old seal, in grams, is
- 3
- 511
- 772
- 957
- 1192
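Because the line predicts \(\log _{10}\) (spleen weight) rather than spleen weight itself, a prediction in grams needs the transformation undone by raising 10 to the predicted value. A minimal sketch, assuming the equation above; the function name `predicted_spleen_weight` is hypothetical:

```python
def predicted_spleen_weight(age_months):
    """Predict spleen weight in grams by back-transforming the log10 prediction."""
    log10_weight = 2.698 + 0.009434 * age_months  # value on the log10 scale
    return 10 ** log10_weight                     # undo the log10 transformation

weight_30 = predicted_spleen_weight(30)
print(round(weight_30))
```

On the log scale each extra month adds 0.009434; on the original scale that corresponds to multiplying the predicted weight by a constant factor of 10 raised to the power 0.009434.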
CORE, FUR2 2021 VCAA 3
The time series plot below shows the winning time, in seconds, for the women's 100 m freestyle swim plotted against year, for each year that the Olympic Games were held during the period 1956 to 2016.
A least squares line has been fitted to the plot to model the decreasing trend in the winning time over this period.
The equation of the least squares line is
winning time = 357.1 – 0.1515 × year
The coefficient of determination is 0.8794
- Name the explanatory variable in this time series plot. (1 mark)
- Determine the value of the correlation coefficient (`r`).
Round your answer to three decimal places. (1 mark)
- Write down the average decrease in winning time, in seconds per year, during the period 1956 to 2016. (1 mark)
- The predicted winning time for the women's 100 m freestyle in 2000 was 54.10 seconds.
The actual winning time for the women's 100 m freestyle in 2000 was 53.83 seconds.
Determine the residual value in seconds. (1 mark)
- The following equation can be used to predict the winning time for the women's 100 m freestyle in the future.
winning time = 357.1 – 0.1515 × year
- i. Show that the predicted winning time for the women's 100 m freestyle in 2032 is 49.252 seconds. (1 mark)
- ii. What assumption is being made when this equation is used to predict the winning time for the women's 100 m freestyle in 2032? (1 mark)
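The three calculations this question relies on (a prediction, a residual, and recovering `r` from the coefficient of determination) can be sketched as follows. The constant and function names are assumptions introduced for illustration. The residual is taken as actual minus predicted, and `r` is given the sign of the slope.

```python
from math import copysign, sqrt

INTERCEPT = 357.1
SLOPE = -0.1515        # seconds per year
R_SQUARED = 0.8794     # coefficient of determination

def predicted_time(year):
    """Predicted winning time in seconds from the least squares line."""
    return INTERCEPT + SLOPE * year

# residual = actual value - predicted value
residual_2000 = 53.83 - predicted_time(2000)

# r has the same sign as the slope; its magnitude is the square root of r^2
r = copysign(sqrt(R_SQUARED), SLOPE)

print(round(predicted_time(2032), 3), round(residual_2000, 2), round(r, 3))
```

The year 2032 lies outside the 1956 to 2016 data, so the first value is an extrapolation.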
CORE, FUR2 2020 VCAA 5
The scatterplot below shows body density, in kilograms per litre, plotted against waist measurement, in centimetres, for 250 men.
When a least squares line is fitted to the scatterplot, the equation of this line is
body density = 1.195 – 0.001512 × waist measurement
- Draw the graph of this least squares line on the scatterplot above. (1 mark)
(Answer on the scatterplot above.)
- Use the equation of this least squares line to predict the body density of a man whose waist measurement is 65 cm.
Round your answer to two decimal places. (1 mark)
- When using the equation of this least squares line to make the prediction in part b., are you extrapolating or interpolating? (1 mark)
- Interpret the slope of this least squares line in terms of a man’s body density and waist measurement. (1 mark)
- In this study, the body density of the man with a waist measurement of 122 cm was 0.995 kg/litre.
Show that, when this least squares line is fitted to the scatterplot, the residual, rounded to two decimal places, is –0.02 (1 mark)
- The coefficient of determination for this data is 0.6783
Write down the value of the correlation coefficient `r`.
Round your answer to three decimal places. (1 mark)
- The residual plot associated with fitting a least squares line to this data is shown below.
Does this residual plot support the assumption of linearity that was made when fitting this line to this data? Briefly explain your answer. (1 mark)
CORE, FUR2-NHT 2019 VCAA 4
The scatterplot below plots the variable life span, in years, against the variable sleep time, in hours, for a sample of 19 types of mammals.
On the assumption that the association between sleep time and life span is linear, a least squares line is fitted to this data with sleep time as the explanatory variable.
The equation of this least squares line is
life span = 42.1 – 1.90 × sleep time
The coefficient of determination is 0.416
- Draw the graph of the least squares line on the scatterplot above. (1 mark)
- Describe the linear association between life span and sleep time in terms of strength and direction. (2 marks)
- Interpret the slope of the least squares line in terms of life span and sleep time. (2 marks)
- Interpret the coefficient of determination in terms of life span and sleep time. (1 mark)
- The life span of the mammal with a sleep time of 12 hours is 39.2 years.
Show that, when the least squares line is used to predict the life span of this mammal, the residual is 19.9 years. (2 marks)
CORE, FUR2 2019 VCAA 5
The scatterplot below shows the atmospheric pressure, in hectopascals (hPa), at 3 pm (pressure 3 pm) plotted against the atmospheric pressure, in hectopascals, at 9 am (pressure 9 am) for 23 days in November 2017 at a particular weather station.
A least squares line has been fitted to the scatterplot as shown.
The equation of this line is
pressure 3 pm = 111.4 + 0.8894 × pressure 9 am
- Interpret the slope of this least squares line in terms of the atmospheric pressure at this weather station at 9 am and at 3 pm. (1 mark)
- Use the equation of the least squares line to predict the atmospheric pressure at 3 pm when the atmospheric pressure at 9 am is 1025 hPa.
Round your answer to the nearest whole number. (1 mark)
- Is the prediction made in part b. an example of extrapolation or interpolation? (1 mark)
- Determine the residual when the atmospheric pressure at 9 am is 1013 hPa.
Round your answer to the nearest whole number. (1 mark)
- The mean and the standard deviation of pressure 9 am and pressure 3 pm for these 23 days are shown in Table 4 below.
Use the equation of the least squares line and the information in Table 4 to show that the correlation coefficient for this data, rounded to three decimal places, is `r` = 0.966 (1 mark)
- What percentage of the variation in pressure 3 pm is explained by the variation in pressure 9 am?
Round your answer to one decimal place. (1 mark)
- The residual plot associated with the least squares line is shown below.
- The residual plot above can be used to test one of the assumptions about the nature of the association between the atmospheric pressure at 3 pm and the atmospheric pressure at 9 am.
What is this assumption? (1 mark)
- The residual plot above does not support this assumption.
Explain why. (1 mark)
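The link between the slope of a least squares line and the correlation coefficient can be written out as below. This is the standard identity for a least squares line, not something stated in the question. The symbols \(b\), \(s_x\) and \(s_y\) are introduced here for the slope, the standard deviation of pressure 9 am and the standard deviation of pressure 3 pm. The Table 4 values are not reproduced in this text, so no numbers are substituted for the standard deviations.

\[
b = r \times \frac{s_y}{s_x}
\quad \Longrightarrow \quad
r = b \times \frac{s_x}{s_y} = 0.8894 \times \frac{s_x}{s_y}
\]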
CORE, FUR1 2019 VCAA 11 MC
A study was conducted to investigate the effect of drinking coffee on sleep.
In this study, the amount of sleep, in hours, and the amount of coffee drunk, in cups, on a given day were recorded for a group of adults.
The following summary statistics were generated.
On average, for each additional cup of coffee drunk, the amount of sleep
- decreased by 0.55 hours.
- decreased by 0.77 hours.
- decreased by 1.1 hours.
- increased by 1.1 hours.
- increased by 2.3 hours.
CORE, FUR1 2018 VCAA 14 MC
A least squares line is fitted to a set of bivariate data.
Another least squares line is fitted with response and explanatory variables reversed.
Which one of the following statistics will not change in value?
- the residual values
- the predicted values
- the correlation coefficient `r`
- the slope of the least squares line
- the intercept of the least squares line
CORE, FUR1 2018 VCAA 10 MC
In a study of the association between a person’s height, in centimetres, and body surface area, in square metres, the following least squares line was obtained.
body surface area = –1.1 + 0.019 × height
Which one of the following is a conclusion that can be made from this least squares line?
- An increase of 1 m² in body surface area is associated with an increase of 0.019 cm in height.
- An increase of 1 cm in height is associated with an increase of 0.019 m² in body surface area.
- The correlation coefficient is 0.019
- A person’s body surface area, in square metres, can be determined by adding 1.1 cm to their height.
- A person’s height, in centimetres, can be determined by subtracting 1.1 from their body surface area, in square metres.
CORE, FUR2 2017 VCAA 3
The number of male moths caught in a trap set in a forest and the egg density (eggs per square metre) in the forest are shown in the table below.
- Determine the equation of the least squares line that can be used to predict the egg density in the forest from the number of male moths caught in the trap.
Write the values of the intercept and slope of this least squares line in the appropriate boxes provided below.
Round your answers to one decimal place. (2 marks)
- The number of female moths caught in a trap set in a forest and the egg density (eggs per square metre) in the forest can also be examined.
A scatterplot of the data is shown below.
The equation of the least squares line is
egg density = 191 + 31.3 × number of female moths
- Draw the graph of this least squares line on the scatterplot (provided above). (1 mark)
- Interpret the slope of the regression line in terms of the variables egg density and number of female moths caught in the trap. (1 mark)
- The egg density is 1500 when the number of female moths caught is 55.
Determine the residual value if the least squares line is used to predict the egg density for this number of female moths. (1 mark)
- The correlation coefficient is `r = 0.862`
Determine the percentage of the variation in egg density in the forest explained by the variation in the number of female moths caught in the trap.
Round your answer to one decimal place. (1 mark)
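A residual and the percentage of variation explained both follow directly from the quoted line and correlation coefficient. The sketch below assumes the female moth equation above; the names are hypothetical. The residual is taken as actual minus predicted, and the percentage explained as the square of `r` times 100.

```python
INTERCEPT = 191
SLOPE = 31.3   # eggs per square metre per additional female moth
R = 0.862      # correlation coefficient

def predicted_egg_density(female_moths):
    """Predicted egg density from the least squares line."""
    return INTERCEPT + SLOPE * female_moths

# residual = actual value - predicted value
residual = 1500 - predicted_egg_density(55)

# coefficient of determination expressed as a percentage
percent_explained = R ** 2 * 100

print(round(residual, 1), round(percent_explained, 1))
```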
CORE, FUR2 2006 VCAA 2
The heights (in cm) and ages (in months) of a random sample of 15 boys have been plotted in the scatterplot below. The least squares regression line has been fitted to the data.
The equation of the least squares regression line is
height = 75.4 + 0.53 × age
The correlation coefficient is `r= 0.7541`
- Complete the following sentence.
On average, the height of a boy increases by _______ cm for each one-month increase in age. (1 mark)
- Evaluate the coefficient of determination.
Write your answer, as a percentage, correct to one decimal place. (1 mark)
- Interpret the coefficient of determination in terms of the variables height and age. (1 mark)
CORE, FUR2 2007 VCAA 2
The mean surface temperature (in °C) of Australia for the period 1960 to 2005 is displayed in the time series plot below.
- In what year was the lowest mean surface temperature recorded? (1 mark)
The least squares method is used to fit a trend line to the time series plot.
- The equation of this trend line is found to be
mean surface temperature = –12.361 + 0.013 × year
- Use the trend line to predict the mean surface temperature (in °C) for 2010. Write your answer correct to two decimal places. (1 mark)
The actual mean surface temperature in the year 2000 was 13.55°C.
- Determine the residual value (in °C) when the trend line is used to predict the mean surface temperature for this year. Write your answer correct to two decimal places. (1 mark)
- By how many degrees does the trend line predict Australia's mean surface temperature will rise each year? Write your answer correct to three decimal places. (1 mark)
CORE, FUR2 2008 VCAA 4
The arm spans (in cm) and heights (in cm) for a group of 13 boys have been measured. The results are displayed in the table below.
The aim is to find a linear equation that allows arm span to be predicted from height.
- What will be the explanatory variable in the equation? (1 mark)
- Assuming a linear association, determine the equation of the least squares regression line that enables arm span to be predicted from height. Write this equation in terms of the variables arm span and height. Give the coefficients correct to two decimal places. (2 marks)
- Using the equation that you have determined in part b., interpret the slope of the least squares regression line in terms of the variables height and arm span. (1 mark)
CORE, FUR2 2010 VCAA 2
In the scatterplot below, average annual female income, in dollars, is plotted against average annual male income, in dollars, for 16 countries. A least squares regression line is fitted to the data.
The equation of the least squares regression line for predicting female income from male income is
female income = 13 000 + 0.35 × male income
- What is the explanatory variable? (1 mark)
- Complete the following statement by filling in the missing information.
From the least squares regression line equation it can be concluded that, for these countries, on average, female income increases by $________ for each $1000 increase in male income. (1 mark)
- Use the least squares regression line equation to predict the average annual female income (in dollars) in a country where the average annual male income is $15 000. (1 mark)
- The prediction made in part c.i. is not likely to be reliable.
Explain why. (1 mark)
CORE, FUR2 2012 VCAA 2
The maximum temperature and the minimum temperature at this weather station on each of the 30 days in November 2011 are displayed in the scatterplot below.
The correlation coefficient for this data set is `r = 0.630`.
The equation of the least squares regression line for this data set is
maximum temperature = 13 + 0.67 × minimum temperature
- Draw this least squares regression line on the scatterplot above. (1 mark)
- Interpret the vertical intercept of the least squares regression line in terms of maximum temperature and minimum temperature. (1 mark)
- Describe the relationship between the maximum temperature and the minimum temperature in terms of strength and direction. (1 mark)
- Interpret the slope of the least squares regression line in terms of maximum temperature and minimum temperature. (1 mark)
- Determine the percentage of variation in the maximum temperature that may be explained by the variation in the minimum temperature.
Write your answer, correct to the nearest percentage. (1 mark)
On the day that the minimum temperature was 11.1 °C, the actual maximum temperature was 12.2 °C.
- Determine the residual value for this day if the least squares regression line is used to predict the maximum temperature.
Write your answer, correct to the nearest degree. (2 marks)
CORE, FUR2 2014 VCAA 2
The scatterplot below shows the population and area (in square kilometres) of a sample of inner suburbs of a large city.
The equation of the least squares regression line for the data in the scatterplot is
population = 5330 + 2680 × area
- Write down the response variable. (1 mark)
- Draw the least squares regression line on the scatterplot above.
(Answer on the scatterplot above.) (1 mark)
- Interpret the slope of this least squares regression line in terms of the variables area and population. (2 marks)
- Wiston is an inner suburb. It has an area of 4 km² and a population of 6690.
The correlation coefficient, `r`, is equal to 0.668
- Calculate the residual when the least squares regression line is used to predict the population of Wiston from its area. (1 mark)
- What percentage of the variation in the population of the suburbs is explained by the variation in area?
Write your answer, correct to one decimal place. (1 mark)
CORE, FUR2 2015 VCAA 3
The scatterplot below plots male life expectancy (male) against female life expectancy (female) in 1950 for a number of countries. A least squares regression line has been fitted to the scatterplot as shown.
The slope of this least squares regression line is 0.88
- Interpret the slope in terms of the variables male life expectancy and female life expectancy. (1 mark)
The equation of this least squares regression line is
male = 3.6 + 0.88 × female
- In a particular country in 1950, female life expectancy was 35 years.
Use the equation to predict male life expectancy for that country. (1 mark)
- The coefficient of determination is 0.95
Interpret the coefficient of determination in terms of male life expectancy and female life expectancy. (1 mark)
CORE, FUR1 2007 VCAA 7-8 MC
The lengths and diameters (in mm) of a sample of jellyfish were recorded and displayed in the scatterplot below. The least squares regression line for this data is shown.
The equation of the least squares regression line is
length = 3.5 + 0.87 × diameter
The correlation coefficient is `r = 0.9034`
Part 1
Written as a percentage, the coefficient of determination is closest to
- 0.816%
- 0.903%
- 81.6%
- 90.3%
- 95.0%
Part 2
From the equation of the least squares regression line, it can be concluded that for these jellyfish, on average
- there is a 3.5 mm increase in diameter for each 1 mm increase in length.
- there is a 3.5 mm increase in length for each 1 mm increase in diameter.
- there is a 0.87 mm increase in diameter for each 1 mm increase in length.
- there is a 0.87 mm increase in length for each 1 mm increase in diameter.
- there is a 4.37 mm increase in diameter for each 1 mm increase in length.