SmarterEd

Data Analysis, GEN1 2024 NHT 9-10 MC

The scatterplot below shows the average annual income, in dollars, plotted against life expectancy, in years, for 42 countries in 2020.

A least squares line has been fitted to the scatterplot.

The coefficient of determination is 0.306.
 

Question 9

The equation of the least squares line is closest to

  A. income \(=-19\,000+345 \times\) life expectancy
  B. income \(=-19\,250+355 \times\) life expectancy
  C. income \(=-19\,500+365 \times\) life expectancy
  D. income \(=-19\,750+375 \times\) life expectancy
  E. income \(=-20\,000+385 \times\) life expectancy

 
Question 10

Which one of the following statements is true?

  A. The value of the correlation coefficient is 0.306.
  B. There are more data points above the least squares line than below.
  C. 30.6% of the variation in annual income is not explained by the variation in life expectancy.
  D. The country with the longest life expectancy has a positive residual associated with it.
  E. Using the least squares line to predict the annual income of a country whose citizens have a life expectancy of 54 years is an example of extrapolation.
Show Answers Only

\(\text{Question 9:} \ D \)

\(\text{Question 10:} \ E \)

Show Worked Solution

\(\text{Question 9}\)

\(\text{Using points (58, 2000) and (74, 8000):}\)

\(m=\dfrac{8000-2000}{74-58}=375\)
 

\(\text{Find equation of line:}\)

\(y-y_1\) \(=m(x-x_1)\)  
\(y-2000\) \(=375(x-58) \)  
\(y\) \(=375x-19\,750\)  

 
\(\Rightarrow D\)
 

\(\text{Question 10}\)

\(\text{A life expectancy of  54 years is outside the dataset range}\ \Rightarrow \ \text{extrapolation}\)

\(\Rightarrow E\)
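The two-point method used in Question 9 can be sketched in Python. This is only a minimal check: the function name `line_through` is an illustrative assumption, and the two points are the approximate readings taken from the plot in the worked solution, so the result is only as accurate as those readings.

```python
def line_through(p1, p2):
    """Return (intercept, slope) of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)       # rise over run
    intercept = y1 - slope * x1         # rearranged from y - y1 = m(x - x1)
    return intercept, slope


# Points read from the least squares line in the worked solution (approximate)
intercept, slope = line_through((58, 2000), (74, 8000))
print(f"income = {intercept:.0f} + {slope:.0f} x life expectancy")
# prints: income = -19750 + 375 x life expectancy
```

The printed equation matches option D.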

Filed Under: Correlation and Regression Tagged With: Band 4, smc-265-20-Find LSRL Equation/Gradient, smc-265-60-Extrapolation / Interpolation

Data Analysis, GEN2 2024 VCAA 3

The Olympic gold medal-winning height for the women's high jump, \(\textit{Wgold}\), is often lower than the best height achieved in other international women's high jump competitions in that same year.

The table below lists the Olympic year, \(\textit{year}\), the gold medal-winning height, \(\textit{Wgold}\), in metres, and the best height achieved in all international women's high jump competitions in that same year, \(\textit{Wbest}\), in metres, for each Olympic year from 1972 to 2020.

A scatterplot of \(\textit{Wbest}\) versus \(\textit{Wgold}\) for this data is also provided.

When a least squares line is fitted to the scatterplot, the equation is found to be:

\(Wbest =0.300+0.860 \times Wgold\)

The correlation coefficient is 0.9318

  a. Name the response variable in this equation.   (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  b. Draw the least squares line on the scatterplot above.  (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

  c. Determine the value of the coefficient of determination as a percentage. Round your answer to one decimal place.  (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  d. Describe the association between \(\textit{Wbest}\) and \(\textit{Wgold}\) in terms of strength and direction.  (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

\begin{array}{|l|l|}
\hline
\rule{0pt}{2.5ex}\text { strength } \rule[-1ex]{0pt}{0pt} & \quad \quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\\
\hline
\rule{0pt}{2.5ex}\text { direction } \rule[-1ex]{0pt}{0pt} & \\
\hline
\end{array}

  e. Referring to the equation of the least squares line, interpret the value of the slope in terms of the variables \(\textit{Wbest}\) and \(\textit{Wgold}\).  (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

  f. In 1984, the \(\textit{Wbest}\) value was 2.07 m for a \(\textit{Wgold}\) value of 2.02 m. Show that when this least squares line is fitted to the scatterplot, the residual value for this point is 0.0328.  (2 marks)

    --- 4 WORK AREA LINES (style=lined) ---

  g. The residual plot obtained when the least squares line was fitted to the data is shown below. The residual value from part f is missing from the residual plot.

    i. Complete the residual plot by adding the residual value from part f, drawn as a cross ( X ), to the residual plot above.   (1 mark)

      --- 0 WORK AREA LINES (style=lined) ---

    ii. In part b, a least squares line was fitted to the scatterplot. Does the residual plot from part g justify this? Briefly explain your answer.  (1 mark)

      --- 3 WORK AREA LINES (style=lined) ---

  h. In 1964, the gold medal-winning height, \(\textit{Wgold}\), was 1.90 m. When the least squares line is used to predict \(\textit{Wbest}\), it is found to be 1.934 m. Explain why this prediction is not likely to be reliable.  (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

Show Answers Only

a.    \(Wbest\)

b.    

c.    \(86.8\%\)

d.    \(\text{Strong, positive}\)

e.    \(Wbest\ \text{will increase, on average, by 0.86 metres for every metre of increase in}\ Wgold.\)

f.      \(Wbest\) \(=0.300 +0.86\times 2.02\)
    \(=2.0372\)

\(\therefore\ \text{Residual}\ =2.07-2.0372=0.0328\)

g.i.

g.ii.  \(\text{Yes, it is justified as there is no clear pattern, linear or otherwise.}\)

h.    \(\text{This prediction is outside the data range (1972 – 2020 → extrapolation)}\)

\(\text{and therefore cannot be relied upon.}\)

Show Worked Solution

a.    \(Wbest\)

b.    \(\text{Using points:}\ (1.90, 1.934)\ \text{and}\ (2.00, 2.02)\)
 

Mean mark (b) 51%.

c.    \(r=0.9318\ \ \Rightarrow\ \ r^2=0.9318^2=0.8682\dots\)

\(\therefore\ \text{Coefficient of determination} \approx 86.8\%\)
 

d.    \(\text{Strong, positive}\)
 

e.    \(Wbest\ \text{will increase, on average, by 0.86 metres for every metre of increase in}\ Wgold.\)
 

f.      \(Wbest\) \(=0.300 +0.86\times 2.02\)
    \(=2.0372\)

 
\(\therefore\ \text{Residual}\ =2.07-2.0372=0.0328\)

♦ Mean mark (f) 48%.

g.i.

g.ii.  \(\text{Yes, it is justified as there is no clear pattern, linear or otherwise.}\)

♦ Mean mark (g)(i) 47%.
♦ Mean mark (g)(ii) 40%.

h.    \(\text{This prediction is outside the data range (1972–2020 → extrapolation)}\)

\(\text{and therefore cannot be relied upon.}\)

♦ Mean mark (h) 50%.
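Parts c and f can be checked numerically. The sketch below assumes nothing beyond the equation and values given in the question; the helper names `predict_wbest` and `residual` are hypothetical and chosen only for illustration.

```python
def predict_wbest(wgold):
    """Predicted Wbest (m) from the least squares line Wbest = 0.300 + 0.860 x Wgold."""
    return 0.300 + 0.860 * wgold


def residual(actual, predicted):
    """Residual = actual value - predicted value."""
    return actual - predicted


# Part c: coefficient of determination as a percentage (r = 0.9318 is given)
r_squared_percent = 0.9318 ** 2 * 100

# Part f: the 1984 data point (Wgold = 2.02 m, Wbest = 2.07 m)
predicted_1984 = predict_wbest(2.02)
residual_1984 = residual(2.07, predicted_1984)

print(round(r_squared_percent, 1))   # 86.8
print(round(predicted_1984, 4))      # 2.0372
print(round(residual_1984, 4))       # 0.0328
```

A positive residual means the actual 1984 point sits above the least squares line.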

Filed Under: Correlation and Regression Tagged With: Band 3, Band 4, Band 5, smc-265-10-r / r^2 and Association, smc-265-30-LSRL formula, smc-265-40-Interpret Gradient, smc-265-50-Residuals, smc-265-60-Extrapolation / Interpolation, smc-265-75-Explanatory / Response

Data Analysis, GEN1 2024 VCAA 9-10 MC

The least squares equation for the relationship between the average number of male athletes per competing nation, males, and the number of the Summer Olympic Games, number, is

\(males =67.5-1.27 \times number\)
 

Part 1

The summary statistics for the variables number and males are shown in the table below.
 

The value of Pearson's correlation coefficient, \(r\), rounded to three decimal places, is closest to

  A. \(-0.569\)
  B. \(-0.394\)
  C. \(0.394\)
  D. \(0.569\)

 
Part 2

At which Summer Olympic Games will the predicted average number of males be closest to 25.6 ?

  A. 31st
  B. 32nd
  C. 33rd
  D. 34th
Show Answers Only

Part 1: \(A\)

Part 2: \(C\)

Show Worked Solution

Part 1

\(b\) \(=r \times \dfrac{s_y}{s_x}\)
\(-1.27\) \(=r\times\dfrac{19}{8.51}\)
\(\therefore\ r\) \(=\dfrac{-1.27\times 8.51}{19}\)
  \(=-0.5688\dots\)

 
\(\Rightarrow A\)
 

♦ Mean mark (Part 1) 52%.
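The rearrangement in Part 1 can be sketched as a short Python check. The function name `correlation_from_slope` is an assumption made for illustration; the standard deviations are the values used in the worked solution above.

```python
def correlation_from_slope(slope, s_x, s_y):
    """Rearrange b = r * s_y / s_x to give r = b * s_x / s_y."""
    return slope * s_x / s_y


# s_x = 8.51 (number) and s_y = 19 (males), as used in the worked solution
r = correlation_from_slope(-1.27, s_x=8.51, s_y=19)
print(round(r, 3))   # -0.569
```

Because both standard deviations are positive, \(r\) always takes the same sign as the slope, which rules out the two positive options immediately.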

Part 2

\(males\) \(=67.5-1.27\times number\)
\(25.6\) \(=67.5-1.27\times number\)
\(number\) \(=\dfrac{67.5-25.6}{1.27}\)
  \(=32.992\dots\)
  \(\approx 33\)

 

\(\Rightarrow C\)
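Part 2 runs the least squares equation backwards, from a predicted value to the explanatory variable. A minimal sketch, with the hypothetical helper name `games_number_for`:

```python
def games_number_for(males):
    """Solve males = 67.5 - 1.27 * number for number."""
    return (67.5 - males) / 1.27


number = games_number_for(25.6)
print(round(number, 3))   # 32.992
print(round(number))      # 33
```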

Filed Under: Correlation and Regression Tagged With: Band 4, Band 5, smc-265-10-r / r^2 and Association, smc-265-60-Extrapolation / Interpolation

Data Analysis, GEN2 2023 VCAA 3

The scatterplot below plots the average monthly ice cream consumption, in litres/person, against average monthly temperature, in °C. The data for the graph was recorded in the Northern Hemisphere.
 

When a least squares line is fitted to the scatterplot, the equation is found to be:

consumption = 0.1404 + 0.0024 × temperature

The coefficient of determination is 0.7212

  a. Draw the least squares line on the scatterplot graph above.  (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

  b. Determine the value of the correlation coefficient \(r\). Round your answer to three decimal places.  (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  c. Describe the association between average monthly ice cream consumption and average monthly temperature in terms of strength, direction and form.  (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

    \begin{array} {|l|c|}
    \hline
    \rule{0pt}{2.5ex} \textbf{strength} \rule[-1ex]{0pt}{0pt} & \quad \quad \quad \quad \quad \quad \quad \quad \\
    \hline
    \rule{0pt}{2.5ex} \textbf{direction} \rule[-1ex]{0pt}{0pt} & \\
    \hline
    \rule{0pt}{2.5ex} \textbf{form} \rule[-1ex]{0pt}{0pt} & \\
    \hline
    \end{array}

  d. Referring to the equation of the least squares line, interpret the value of the intercept in terms of the variables consumption and temperature.  (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

  e. Use the equation of the least squares line to predict the average monthly ice cream consumption, in litres per person, when the monthly average temperature is –6°C.  (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  f. Write down whether this prediction is an interpolation or an extrapolation.  (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

Show Answers Only

a.    
         

b.    \(r = \sqrt{0.7212} = 0.849\ \text{(3 d.p.)}\)

c.   

\begin{array} {|l|c|}
\hline
\rule{0pt}{2.5ex} \textbf{strength} \rule[-1ex]{0pt}{0pt} & \text{strong} \\
\hline
\rule{0pt}{2.5ex} \textbf{direction} \rule[-1ex]{0pt}{0pt} & \text{positive} \\
\hline
\rule{0pt}{2.5ex} \textbf{form} \rule[-1ex]{0pt}{0pt} & \text{linear} \\
\hline
\end{array}

 
d.
    \(\text{At 0°C, the predicted average consumption is:}\)

\(\textit{consumption} = 0.1404 + 0.0024 \times 0 = 0.1404\ \text{L/person}\)

 
e.
    \(\text{Find consumption} (c)\ \text{when temperature} (t) = -6:\)

\(c=0.1404 + 0.0024 \times (-6) = 0.126\ \text{L/person} \)

 
f. 
    \(\text{Extrapolation (even though the axes extend to –6°C, the data set} \)

\(\text{range finishes with a lower limit around –4.5°C.)}\)

Show Worked Solution

a.    
         

b.    \(r = \sqrt{0.7212} = 0.849\ \text{(3 d.p.)}\)

c.   

\begin{array} {|l|c|}
\hline
\rule{0pt}{2.5ex} \textbf{strength} \rule[-1ex]{0pt}{0pt} & \text{strong} \\
\hline
\rule{0pt}{2.5ex} \textbf{direction} \rule[-1ex]{0pt}{0pt} & \text{positive} \\
\hline
\rule{0pt}{2.5ex} \textbf{form} \rule[-1ex]{0pt}{0pt} & \text{linear} \\
\hline
\end{array}

 
d.
    \(\text{At 0°C, the predicted average consumption is:}\)

\(\textit{consumption} = 0.1404 + 0.0024 \times 0 = 0.1404\ \text{L/person}\)

 
e.
    \(\text{Find consumption} (c)\ \text{when temperature} (t) = -6:\)

\(c=0.1404 + 0.0024 \times (-6) = 0.126\ \text{L/person} \)

 
f. 
    \(\text{Extrapolation (even though the axes extend to –6°C, the data set} \)

\(\text{range finishes with a lower limit around –4.5°C.)}\)

Filed Under: Correlation and Regression Tagged With: Band 3, Band 4, Band 5, smc-265-10-r / r^2 and Association, smc-265-30-LSRL formula, smc-265-60-Extrapolation / Interpolation

CORE, FUR2 2021 VCAA 4

The time series plot below shows that the winning time for both men and women in the 100 m freestyle swim in the Olympic Games has been decreasing during the period 1912 to 2016.
 

Least squares lines are used to model the trend for both men and women.

The least squares line for the men's winning time has been drawn on the time series plot above.

The equation of the least squares line for men is

winning time men = 356.9 – 0.1544 × year

The equation of the least squares line for women is

winning time women = 538.9 – 0.2430 × year

  a. Draw the least squares line for winning time women on the time series plot above.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  b. The difference between the women's predicted winning time and the men's predicted winning time can be calculated using the formula

          difference = winning time women – winning time men

     Use the equations of the least squares lines and the formula above to calculate the difference predicted for the 2024 Olympic Games. Round your answer to one decimal place.   (2 marks)

    --- 3 WORK AREA LINES (style=lined) ---

  c. The Olympic Games are held every four years. The next Olympic Games will be held in 2024, then 2028, 2032 and so on. In which Olympic year do the two least squares lines predict that the winning time for women will first be faster than the winning time for men in the 100 m freestyle?   (2 marks)

    --- 4 WORK AREA LINES (style=lined) ---

Show Answers Only
  1. `text{See Worked Solutions}`
  2. `2.7 \ text{seconds}`
  3. `2056`
Show Worked Solution

a.   

`text{Find end points for the women’s graph.}`

`1908 \ text{data point} = 538.9-0.2430 xx 1908 = 75.256`

`2020 \ text{data point} = 538.9-0.2430 xx 2020 = 48.04`

`=> \ text{Line passes through} \ (1908, 75.3) \ text{and} \ (2020, 48.0)`
 

b.   `text{difference}` `= (538.9-0.2430 xx 2024)-(356.9-0.1544 xx 2024)`
    `= 2.673 …`
    `= 2.7 \ text{seconds (to 1 d.p.)}`

 
c.    `text{Times will be equal when}`

`538.9-0.2430 xx text{year} = 356.9-0.1544 xx text{year}`

`text{year} = 2054.17 …`

`text{1st Olympic year after} \ 2054.17 = 2056`
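Parts b and c can be sketched in Python as follows. This is a minimal check rather than a required method: the names `predict`, `crossover_year` and `next_olympic_year` are hypothetical, and the four-year cycle counted from 2024 is taken from the wording of part c.

```python
import math

MEN = (356.9, -0.1544)     # (intercept, slope) for winning time men
WOMEN = (538.9, -0.2430)   # (intercept, slope) for winning time women


def predict(line, year):
    """Predicted winning time (seconds) for the given year."""
    intercept, slope = line
    return intercept + slope * year


def crossover_year(line_a, line_b):
    """Year at which two non-parallel lines give the same prediction."""
    (a1, b1), (a2, b2) = line_a, line_b
    return (a2 - a1) / (b1 - b2)


def next_olympic_year(year, reference=2024, cycle=4):
    """First Olympic year strictly after `year`, stepping in 4-year cycles from 2024."""
    steps = math.floor((year - reference) / cycle) + 1
    return reference + steps * cycle


difference_2024 = predict(WOMEN, 2024) - predict(MEN, 2024)
equal_year = crossover_year(MEN, WOMEN)
first_year = next_olympic_year(equal_year)

print(round(difference_2024, 1))   # 2.7
print(round(equal_year, 2))        # 2054.18
print(first_year)                  # 2056
```

Rounding the crossover year to the nearest Olympic year would be a mistake here: the women's line is only below the men's line after the crossover, so the answer is the first Olympic year after it.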

Filed Under: Correlation and Regression Tagged With: Band 4, Band 5, smc-265-30-LSRL formula, smc-265-60-Extrapolation / Interpolation

CORE, FUR2 2021 VCAA 3

The time series plot below shows the winning time, in seconds, for the women's 100 m freestyle swim plotted against year, for each year that the Olympic Games were held during the period 1956 to 2016.

A least squares line has been fitted to the plot to model the decreasing trend in the winning time over this period.
 

The equation of the least squares line is

winning time = 357.1 – 0.1515 × year

The coefficient of determination is 0.8794

  a. Name the explanatory variable in this time series plot.   (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  b. Determine the value of the correlation coefficient (`r`). Round your answer to three decimal places.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  c. Write down the average decrease in winning time, in seconds per year, during the period 1956 to 2016.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  d. The predicted winning time for the women's 100 m freestyle in 2000 was 54.10 seconds. The actual winning time for the women's 100 m freestyle in 2000 was 53.83 seconds. Determine the residual value in seconds.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  e. The following equation can be used to predict the winning time for the women's 100 m freestyle in the future.

          winning time = 357.1 – 0.1515 × year

    i. Show that the predicted winning time for the women's 100 m freestyle in 2032 is 49.252 seconds.   (1 mark)

      --- 2 WORK AREA LINES (style=lined) ---

    ii. What assumption is being made when this equation is used to predict the winning time for the women's 100 m freestyle in 2032?   (1 mark)

      --- 2 WORK AREA LINES (style=lined) ---

Show Answers Only
  1. `text{year}`
  2. `- 0.938`
  3. `0.1515 \ text{seconds per year}`
  4. `-0.27`
  5.  i. `49.252 \ text{seconds}`
  6. ii. `text{The same trend continues when the graph is extended beyond 2016.}`
Show Worked Solution

a.      `text{year}`
 

b. `r^2` `= 0.8794 \ text{(given)}`
  `r` `= ± sqrt{0.8794}`
    `= ± 0.938 \ text{(to 3 d.p.)}`

 
`text{By inspection of graph, correlation is negative}`

`:. \ r = -0.938`
 

c.    `text{Average decrease in winning time = 0.1515 seconds per year}`

`text{(this is given by the slope of the line.)}`
 

d.    `text{Residual Value}` `= text{actual}-text{predicted}`
    `= 53.83-54.10`
    `= -0.27`

 

e.i.      `text{winning time (2032)}` `= 357.1-0.1515 xx 2032`
      `=49.252 \ text{seconds}`

 

e.ii.  `text{The assumption is that the graph is accurate when it is extended}`

  `text{beyond 2016 (i.e decreasing trend continues).}`
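Parts b and e.i can be checked with a short Python sketch. Instead of inspecting the graph, it takes the sign of \(r\) from the slope, which is valid because the slope equals \(r\) multiplied by a ratio of positive standard deviations. The function name `correlation_from_r_squared` is an assumption made for illustration.

```python
import math


def correlation_from_r_squared(r_squared, slope):
    """r = +/- sqrt(r^2), taking the same sign as the slope of the least squares line."""
    return math.copysign(math.sqrt(r_squared), slope)


# Part b: coefficient of determination 0.8794, slope -0.1515
r = correlation_from_r_squared(0.8794, slope=-0.1515)

# Part e.i: predicted winning time for 2032
predicted_2032 = 357.1 - 0.1515 * 2032

print(round(r, 3))                # -0.938
print(round(predicted_2032, 3))   # 49.252
```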

Filed Under: Correlation and Regression Tagged With: Band 3, Band 4, Band 5, smc-265-10-r / r^2 and Association, smc-265-30-LSRL formula, smc-265-40-Interpret Gradient, smc-265-50-Residuals, smc-265-60-Extrapolation / Interpolation, smc-265-75-Explanatory / Response

CORE, FUR2 2020 VCAA 5

The scatterplot below shows body density, in kilograms per litre, plotted against waist measurement, in centimetres, for 250 men.

When a least squares line is fitted to the scatterplot, the equation of this line is

body density = 1.195 – 0.001512 × waist measurement

  a. Draw the graph of this least squares line on the scatterplot above.   (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

  b. Use the equation of this least squares line to predict the body density of a man whose waist measurement is 65 cm. Round your answer to two decimal places.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  c. When using the equation of this least squares line to make the prediction in part b., are you extrapolating or interpolating?   (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  d. Interpret the slope of this least squares line in terms of a man’s body density and waist measurement.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  e. In this study, the body density of the man with a waist measurement of 122 cm was 0.995 kg/litre. Show that, when this least squares line is fitted to the scatterplot, the residual, rounded to two decimal places, is –0.02   (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

  f. The coefficient of determination for this data is 0.6783. Write down the value of the correlation coefficient `r`. Round your answer to three decimal places.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  g. The residual plot associated with fitting a least squares line to this data is shown below.

    Does this residual plot support the assumption of linearity that was made when fitting this line to this data? Briefly explain your answer.   (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

Show Answers Only
  1. `text(See Worked Solutions)`
  2. `1.10\ text(kg/litre)`
  3. `text(extrapolating)`
  4. `text(See Worked Solutions)`
  5. `-0.02`
  6. `-0.824`
  7. `text(See Worked Solutions)`
Show Worked Solution

a.   `text(LSRL passes through)\ (60, 1.1043) and (130, 0.998)`

♦ Mean mark part a. 41%.

b.   `text(body density)` `= 1.195-0.001512 xx 65`
    `= 1.09672`
    `= 1.10\ text{kg/litre (to 2 d.p.)}`
♦ Mean mark part c. 38%.
c.   `text(A waist of 65 cm is outside the)`
  `text(range of the existing data set.)`

 
`:.\ text(Extrapolating)`

 

♦ Mean mark part d. 44%.
d.   `text(Body density decreases by 0.001512 kg/litre)`
  `text(for each increase in waist size of 1 cm.)`

 

e.   `text{Body density (predicted)}`

`= 1.195-0.001512 xx 122`

`~~ 1.0105\ text(kg/litre)`
 

`text(Residual)` `= text(Actual-predicted)`
  `~~ 0.995-1.0105`
  `~~ -0.0155`
  `~~ -0.02\ text{(to 2 d.p.)}`

 

♦♦ Mean mark part f. 25%.

f.   `r` `= -sqrt(0.6783)`
    `=-0.8235…`
    `= -0.824\ text{(to 3 d.p.)}`

 

g.  `text(The residual plot has no pattern and is centred around zero.)`

`:.\ text(It supports the assumption of linearity of the LSRL.)`
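The two points used to draw the line in part a, and the residual in part e, can be reproduced with a small Python sketch. The function name `body_density` is hypothetical; the waist values 60 cm and 130 cm are simply the two points chosen in the worked solution.

```python
def body_density(waist_cm):
    """Predicted body density (kg/litre) from the least squares line."""
    return 1.195 - 0.001512 * waist_cm


# Part a: two points far apart on the line make it easier to rule accurately
endpoints = [(x, round(body_density(x), 4)) for x in (60, 130)]
print(endpoints)   # [(60, 1.1043), (130, 0.9984)]

# Part e: residual = actual - predicted for the man with a 122 cm waist
residual_122 = 0.995 - body_density(122)
print(round(residual_122, 2))   # -0.02
```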

Filed Under: Correlation and Regression Tagged With: Band 4, Band 5, smc-265-10-r / r^2 and Association, smc-265-30-LSRL formula, smc-265-40-Interpret Gradient, smc-265-50-Residuals, smc-265-60-Extrapolation / Interpolation

CORE, FUR2 2019 VCAA 5

The scatterplot below shows the atmospheric pressure, in hectopascals (hPa), at 3 pm (pressure 3 pm) plotted against the atmospheric pressure, in hectopascals, at 9 am (pressure 9 am) for 23 days in November 2017 at a particular weather station.
 

A least squares line has been fitted to the scatterplot as shown.

The equation of this line is

pressure 3 pm = 111.4 + 0.8894 × pressure 9 am

  a. Interpret the slope of this least squares line in terms of the atmospheric pressure at this weather station at 9 am and at 3 pm.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  b. Use the equation of the least squares line to predict the atmospheric pressure at 3 pm when the atmospheric pressure at 9 am is 1025 hPa. Round your answer to the nearest whole number.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  c. Is the prediction made in part b. an example of extrapolation or interpolation?   (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  d. Determine the residual when the atmospheric pressure at 9 am is 1013 hPa. Round your answer to the nearest whole number.   (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

  e. The mean and the standard deviation of pressure 9 am and pressure 3 pm for these 23 days are shown in Table 4 below.

    i. Use the equation of the least squares line and the information in Table 4 to show that the correlation coefficient for this data, rounded to three decimal places, is  `r` = 0.966   (1 mark)

      --- 3 WORK AREA LINES (style=lined) ---

    ii. What percentage of the variation in pressure 3 pm is explained by the variation in pressure 9 am? Round your answer to one decimal place.   (1 mark)

      --- 2 WORK AREA LINES (style=lined) ---

  f. The residual plot associated with the least squares line is shown below.

    i. The residual plot above can be used to test one of the assumptions about the nature of the association between the atmospheric pressure at 3 pm and the atmospheric pressure at 9 am. What is this assumption?   (1 mark)

      --- 2 WORK AREA LINES (style=lined) ---

    ii. The residual plot above does not support this assumption. Explain why.   (1 mark)

      --- 2 WORK AREA LINES (style=lined) ---

Show Answers Only
  1. `text(An increase of 1 hPa in pressure at 9 am is associated)`
    `text(with an increase of 0.8894 hPa of pressure at 3 pm.)`
  2. `1023\ text(hPa)`
  3. `text(Interpolation)`
  4. `3\ text(hPa)`
    1. `0.966`
    2. `93.3%`
    1. `text(The assumption is that a linear relationship)`
      `text(exists between the pressure at 9 am and the)`
      `text(pressure at 3 pm.)`
    2. `text(The residual plot does not appear to be random.)`
Show Worked Solution

a.    `text(An increase of 1 hPa in pressure at 9 am is associated)`

`text(with an increase of 0.8894 hPa of pressure at 3 pm.)`

 

b.   `text(pressure 3 pm)` `= 111.4 + 0.8894 xx 1025`
    `= 1023\ text(hPa)`

 

c.  `text{Interpolation (1025 is within the given data range)}`

 

d.   `text(Residual)` `= text(actual) – text(predicted)`
    `= 1015 – (111.4 + 0.8894 xx 1013)`
    `= 1015 – 1012.36`
    `= 2.63…`
    `~~ 3\ text(hPa)`

 

e.i.   `r= b (s_x)/(s_y)`

    `= 0.8894 xx 4.5477/4.1884`
    `= 0.96569…`
    `= 0.966`

 

e.ii.   `r` `= 0.966`
  `r^2` `= 0.966^2 = 0.9331…`
    `= 93.3%`

 

f.i.   `text(The assumption is that a linear relationship)`
 

`text(exists between the pressure at 9 am and the)`

`text(pressure at 3 pm.)`

 

f.ii.  `text(The residual plot does not appear to be random.)`

Filed Under: Correlation and Regression Tagged With: Band 3, Band 4, Band 5, page-break-before-question, smc-265-10-r / r^2 and Association, smc-265-30-LSRL formula, smc-265-40-Interpret Gradient, smc-265-50-Residuals, smc-265-60-Extrapolation / Interpolation

CORE, FUR2 2018 VCAA 3

Table 3 shows the yearly average traffic congestion levels in two cities, Melbourne and Sydney, during the period 2008 to 2016. Also shown is a time series plot of the same data.

The time series plot for Melbourne is incomplete.

  a. Use the data in Table 3 to complete the time series plot above for Melbourne.   (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

  b. A least squares line is used to model the trend in the time series plot for Sydney. The equation is

       `text(congestion level = −2280 + 1.15 × year)`

    i. Draw this least squares line on the time series plot.   (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

    ii. Use the equation of the least squares line to determine the average rate of increase in percentage congestion level for the period 2008 to 2016 in Sydney.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

    iii. Use the least squares line to predict when the percentage congestion level in Sydney will be 43%.   (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

The yearly average traffic congestion level data for Melbourne is repeated in Table 4 below.

  c. When a least squares line is used to model the trend in the data for Melbourne, the intercept of this line is approximately –1514.75556. Round this value to four significant figures.   (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  d. Use the data in Table 4 to determine the equation of the least squares line that can be used to model the trend in the data for Melbourne. The variable year is the explanatory variable. Write the values of the intercept and the slope of this least squares line in the appropriate boxes provided below. Round both values to four significant figures.   (2 marks)

    --- 0 WORK AREA LINES (style=lined) ---

    congestion level = ________ + ________ × year

  e. Since 2008, the equations of the least squares lines for Sydney and Melbourne have predicted that future traffic congestion levels in Sydney will always exceed future traffic congestion levels in Melbourne.

    Explain why, quoting the values of appropriate statistics.   (2 marks)

    --- 5 WORK AREA LINES (style=lined) ---

Show Answers Only
  1. `text(See Worked Solutions)`
    1. `text(See Worked Solutions)`
    2. `1.15 text(%)`
    3. `2020`
  2. `-1515`
  3. `text(congestion level) = -1515 + 0.7667 xx text(year)`
  4. `text(See Worked Solutions)`
Show Worked Solution
a.   

 

b.i.   

 

b.ii.  `text(The least squares line is 1.15% higher each year.)`

♦ Mean mark (b)(ii) 36%.
COMMENT: Major problems caused by part (b)(ii). Review!

  ` :.\ text(Average rate of increase) = 1.15 text(%)`

 

b.iii.    `text(Find year when:)`
  `43` `= -2280 + 1.15 xx text(year)`
  `text(year)` `= 2323/1.15`
    `= 2020`

 

c.  `-1515`

 

d.   `text(congestion level) = -1515 + 0.7667 xx text(year)`

 

e.   `text(Melbourne congestion level in 2008)`

♦♦♦ Mean mark 18%.

`= -1515 + 0.7667 xx 2008`

`= 24.5 text(%)`

 
`text{In 2008 Sydney has higher congestion (29.2 > 24.5)}`

`text(After 2008, Sydney congestion grows at 1.15% per)`

`text(year and Melbourne grows at 0.7667% per year.)`

`:.\ text(Sydney predicted to always exceed Melbourne.)`
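The argument in part e rests on two facts: Sydney's line is already higher in 2008, and it has the steeper slope, so the gap can only widen. A minimal Python sketch of that check, with hypothetical names and the rounded Melbourne coefficients from part d:

```python
SYDNEY = (-2280, 1.15)        # (intercept, slope) given in part b
MELBOURNE = (-1515, 0.7667)   # (intercept, slope) rounded to four significant figures


def congestion(line, year):
    """Predicted congestion level (%) for the given year."""
    intercept, slope = line
    return intercept + slope * year


gap_2008 = congestion(SYDNEY, 2008) - congestion(MELBOURNE, 2008)
gap_growth_per_year = SYDNEY[1] - MELBOURNE[1]

# A positive gap in 2008 that grows every year stays positive for all later years
always_exceeds = gap_2008 > 0 and gap_growth_per_year > 0

print(round(gap_2008, 1))             # 4.7
print(round(gap_growth_per_year, 4))  # 0.3833
print(always_exceeds)                 # True
```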

Filed Under: Correlation and Regression Tagged With: Band 3, Band 4, Band 5, Band 6, page-break-before-question, smc-265-20-Find LSRL Equation/Gradient, smc-265-60-Extrapolation / Interpolation, smc-265-80-Rounding (Sig Fig)

CORE, FUR2 2017 VCAA 4

The eggs laid by the female moths hatch and become caterpillars.

The following time series plot shows the total area, in hectares, of forest eaten by the caterpillars in a rural area during the period 1900 to 1980.

The data used to generate this plot is also given.
 

The association between area of forest eaten by the caterpillars and year is non-linear.

A log10 transformation can be applied to the variable area to linearise the data.

  a. When the equation of the least squares line that can be used to predict log10 (area) from year is determined, the slope of this line is approximately 0.0085385. Round this value to three significant figures.   (1 mark)

  b. Perform the log10 transformation to the variable area and determine the equation of the least squares line that can be used to predict log10 (area) from year. Write the values of the intercept and slope of this least squares line in the appropriate boxes provided below. Round your answers to three significant figures.  (2 marks)

    --- 0 WORK AREA LINES (style=lined) ---

The least squares line predicts that the log10 (area) of forest eaten by the caterpillars by the year 2020 will be approximately 2.85

  c. i. Using this value of 2.85, calculate the expected area of forest that will be eaten by the caterpillars by the year 2020. Round your answer to the nearest hectare.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

     ii. Give a reason why this prediction may have limited reliability.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

Show Answers Only

a.  `0.00854\ (text(3 sig fig))`

b.  `log_10(text(area)) = −14.4 + 0.00854 xx text(year)`

c.i.  `708\ text(hectares)`

c.ii. `text(This prediction extrapolates significantly from the given)`
        `text(data range and as a result, its reliability decreases.)`

Show Worked Solution

a.   `0.0085385 = 0.00854\ (text(3 sig fig))`

♦ Mean marks of part (a) and (b) 44%.

 

b.    `log_10(text(area))` `= −14.4 + 0.00854 xx text(year)`

 

♦♦ Mean mark part (c)(i) 29%.
COMMENT: When the question specifies using the value 2.85, use it!

c.i.    `log_10(text(Area))` `= 2.85`
  `:.\ text(Area)` `= 10^2.85`
    `= 707.94…`
    `= 708\ text(hectares)`

 

c.ii.   `text(This prediction extrapolates significantly from the given)`

  `text(data range and as a result, its reliability decreases.)`
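Parts a and c.i involve rounding to significant figures and undoing a log10 transformation. Python has no built-in for significant-figure rounding, so the helper `round_sig` below is a hypothetical one written for this sketch.

```python
import math


def round_sig(value, sig):
    """Round a non-zero value to `sig` significant figures."""
    exponent = math.floor(math.log10(abs(value)))   # position of the leading digit
    return round(value, sig - 1 - exponent)


# Part a: slope rounded to three significant figures
slope = round_sig(0.0085385, 3)

# Part c.i: undo the log10 transformation, since log10(area) = 2.85
area_2020 = 10 ** 2.85

print(slope)              # 0.00854
print(round(area_2020))   # 708
```

The leading zeros in 0.0085385 are not significant, which is why the rounded slope keeps five decimal places.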

Filed Under: Correlation and Regression Tagged With: Band 4, Band 5, Band 6, smc-265-60-Extrapolation / Interpolation, smc-265-70-Linearise - log10, smc-265-80-Rounding (Sig Fig)

CORE, FUR2 2010 VCAA 2

In the scatterplot below, average annual female income, in dollars, is plotted against average annual male income, in dollars, for 16 countries. A least squares regression line is fitted to the data.
 


 

The equation of the least squares regression line for predicting female income from male income is

female income = 13 000 + 0.35 × male income

  1. What is the explanatory variable?  (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  2. Complete the following statement by filling in the missing information.

     

    From the least squares regression line equation it can be concluded that, for these countries, on average, female income increases by `text($________)` for each $1000 increase in male income.  (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

    1. Use the least squares regression line equation to predict the average annual female income (in dollars) in a country where the average annual male income is $15 000.  (1 mark)

      --- 1 WORK AREA LINES (style=lined) ---

    2. The prediction made in part c.i. is not likely to be reliable.

       

      Explain why.  (1 mark)

      --- 2 WORK AREA LINES (style=lined) ---


Show Answers Only

  1. `text(Male income)`
  2. `$350`
    1. `$18\ 250`
    2. `text(The model established by the regression)`
      `text(equation cannot be relied upon outside the)`
      `text(range of the given data set.)`

Show Worked Solution

a.   `text(Male income)`
 

b.   `text(Increase in female income)`

`= 0.35 xx 1000`

`= $350`
 

c.i.   `text(Average annual female income)`

`= 13\ 000 + 0.35 xx 15\ 000`

`= $18\ 250`

♦♦ This part was poorly answered (exact data unavailable).
MARKER’S COMMENT: Many students offered “real world” explanations which did not gain a mark here.

 
c.ii.
   `text(The model established by the regression)`

   `text(equation cannot be relied upon outside the)`

   `text(range of the given data set.)`

Filed Under: Correlation and Regression Tagged With: Band 3, Band 4, Band 5, smc-265-40-Interpret Gradient, smc-265-60-Extrapolation / Interpolation, smc-265-75-Explanatory / Response

CORE, FUR2 2015 VCAA 5

The time series plot below displays the life expectancy, in years, of people living in Australia and the United Kingdom (UK) for each year from 1920 to 2010.
 


  a. By how much did life expectancy in Australia increase during the period 1920 to 2010? Write your answer correct to the nearest year.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  b. In 1975, the life expectancies in Australia and the UK were very similar. From 1975, the gap between the life expectancies in the two countries increased, with people in Australia having a longer life expectancy than people in the UK. To investigate the difference in life expectancies, least squares regression lines were fitted to the data for both Australia and the UK for the period 1975 to 2010. The results are shown below.

     The equations of the least squares regression lines are as follows.
`text(Australia:)\ \ \ ` `text(life expectancy) = – 451.7 + 0.2657 xx text(year)`
`text(UK:)` `text(life expectancy) = – 350.4 + 0.2143 xx text(year)`

    i. Use these equations to predict the difference between the life expectancies of Australia and the UK in 2030. Give your answer correct to the nearest year.   (2 marks)

      --- 4 WORK AREA LINES (style=lined) ---

    ii. Explain why this prediction may be of limited reliability.   (1 mark)

      --- 3 WORK AREA LINES (style=lined) ---

Show Answers Only
  1. `text(22 years)`
    1. `3\ text(years)`
    2. `text(The year 2030 is outside the available range)`
      `text(of data and therefore its predictions may)`
      `text(become unreliable.)`

Show Worked Solution

a.   `text{The increase in life expectancy (1920 – 2010)}`

`=82-60`

`=22\ text(years)`

 

b.i.    `text{Life expectancy (Aust)}` `= −451.7 + 0.2657× 2030`
    `= 87.67…\ text(years)`
     
  `text{Life expectancy (UK)}` `= −350.4 + 0.2143× 2030`
    `=84.62…\ text(years)`

 

`:.\ text(Difference)` `= 87.67…-84.62…`
  `= 3\ text(years)\ \ \ text{(nearest year)}`
♦ Mean mark 45%.
MARKER’S COMMENT: Relate answers directly to the limitations of the given statistical data rather than future events in (b)(ii).

 

b.ii.   `text(The year 2030 is outside the available range)`

  `\ text(of data and therefore its predictions may)`

  `\ text(become unreliable.)`

Filed Under: Correlation and Regression Tagged With: Band 3, Band 4, Band 5, smc-265-60-Extrapolation / Interpolation

Copyright © 2014–2025 SmarterEd.com.au