Table 1 lists the Olympic year, \(\textit{year}\), and the gold medal-winning height for the men's high jump, \(\textit{Mgold}\), in metres, for each Olympic Games held from 1928 to 2020. No Olympic Games were held in 1940 or 1944, and the 2020 Olympic Games were held in 2021.
Table 1
\begin{array}{|c|c|}
\hline \quad \textit{year} \quad & \textit{Mgold}(m) \\
\hline 1928 & 1.94 \\
\hline 1932 & 1.97 \\
\hline 1936 & 2.03 \\
\hline 1948 & 1.98 \\
\hline 1952 & 2.04 \\
\hline 1956 & 2.12 \\
\hline 1960 & 2.16 \\
\hline 1964 & 2.18 \\
\hline 1968 & 2.24 \\
\hline 1972 & 2.23 \\
\hline 1976 & 2.25 \\
\hline 1980 & 2.36 \\
\hline 1984 & 2.35 \\
\hline 1988 & 2.38 \\
\hline 1992 & 2.34 \\
\hline 1996 & 2.39 \\
\hline 2000 & 2.35 \\
\hline 2004 & 2.36 \\
\hline 2008 & 2.36 \\
\hline 2012 & 2.33 \\
\hline 2016 & 2.38 \\
\hline 2020 & 2.37 \\
\hline
\end{array}
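A minimal sketch of how a least squares line could be fitted to the Table 1 data, with \(\textit{year}\) assumed to be the explanatory variable. The helper name `least_squares` is illustrative only and is not part of the original question; it applies the standard formulas slope = Sxy / Sxx and intercept = ȳ − slope × x̄.

```python
# Data copied from Table 1: Olympic year and gold medal-winning height (m).
years = [1928, 1932, 1936, 1948, 1952, 1956, 1960, 1964, 1968, 1972, 1976,
         1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008, 2012, 2016, 2020]
mgold = [1.94, 1.97, 2.03, 1.98, 2.04, 2.12, 2.16, 2.18, 2.24, 2.23, 2.25,
         2.36, 2.35, 2.38, 2.34, 2.39, 2.35, 2.36, 2.36, 2.33, 2.38, 2.37]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = least_squares(years, mgold)
print(f"Mgold = {intercept:.4g} + {slope:.4g} x year")
```

The slope is the estimated change in winning height, in metres, per one-year increase in \(\textit{year}\).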
CORE, FUR2 2021 VCAA 2
The two running events in the heptathlon are the 200 m run and the 800 m run. The times taken by the athletes in these two events, time200 and time800, are linearly related.
When a least squares line is fitted to the data, the equation of this line is found to be
time800 = 0.03931 + 5.2756 × time200
- Round the values for the intercept and the slope to three significant figures. Write your answers in the boxes provided. (1 mark)
time800 = ______ + ______ × time200
- The mean and the standard deviation for each variable, time200 and time800, are shown in the table below.
The equation of the least squares line is
time800 = 0.03931 + 5.2756 × time200
Use this information to calculate the coefficient of determination as a percentage. Round your answer to the nearest percentage. (2 marks)
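The two parts of this question can be sketched in Python. The first helper rounds a value to a given number of significant figures. The second uses the standard relation slope = r × s_y / s_x, rearranged to r = slope × s_x / s_y, to obtain the coefficient of determination r² as a percentage. The function names are illustrative, and the standard deviations passed in the last line are made-up placeholders, not the values from the exam table (which is not reproduced here).

```python
import math


def round_sig(x, sig=3):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    # Position of the leading digit decides how many decimal places to keep.
    return round(x, sig - 1 - math.floor(math.log10(abs(x))))


def r_squared_percent(slope, sd_x, sd_y):
    """Coefficient of determination (%) from the slope and standard deviations."""
    r = slope * sd_x / sd_y  # rearranged from slope = r * sd_y / sd_x
    return 100 * r ** 2


# Intercept and slope quoted in the question, rounded to three significant figures.
print(round_sig(0.03931), round_sig(5.2756))

# Placeholder inputs only (NOT the exam's table values): r = 0.5, so r^2 = 25%.
print(r_squared_percent(slope=2.0, sd_x=1.0, sd_y=4.0))
```

With the real table values, sd_x would be the standard deviation of time200 and sd_y that of time800.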
CORE, FUR2 2020 VCAA 6
The table below shows the mean age, in years, and the mean height, in centimetres, of 648 women from seven different age groups.
- What was the difference, in centimetres, between the mean height of the women in their twenties and the mean height of the women in their eighties? (1 mark)
A scatterplot displaying this data shows an association between the mean height and the mean age of these women. In an initial analysis of the data, a line is fitted to the data by eye, as shown.
- Describe this association in terms of strength and direction. (1 mark)
- The line on the scatterplot passes through the points (20,168) and (85,157).
Using these two points, determine the equation of this line. Write the values of the intercept and the slope in the appropriate boxes below.
Round your answers to three significant figures. (1 mark)
mean height = ______ + ______ × mean age
- In a further analysis of the data, a least squares line was fitted.
The associated residual plot that was generated is shown below.
The residual plot indicates that the association between the mean height and the mean age of women is non-linear.
The data presented in the table in part a is repeated below. It can be linearised by applying an appropriate transformation to the variable mean age.
Apply an appropriate transformation to the variable mean age to linearise the data. Fit a least squares line to the transformed data and write its equation below.
Round the values of the intercept and the slope to four significant figures. (2 marks)
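For the line fitted by eye through (20, 168) and (85, 157), the slope and intercept follow from the usual two-point formulas. The sketch below is one way to check the arithmetic; the helper name `line_through` is hypothetical.

```python
def line_through(p1, p2):
    """Return (intercept, slope) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)  # rise over run
    intercept = y1 - slope * x1    # solve y1 = intercept + slope * x1
    return intercept, slope


intercept, slope = line_through((20, 168), (85, 157))

# The .3g format keeps three significant figures.
print(f"mean height = {intercept:.3g} + ({slope:.3g}) x mean age")
```

The negative slope means the fitted mean height decreases as mean age increases.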
CORE, FUR2-NHT 2019 VCAA 3
The life span, in years, and gestation period, in days, for 19 types of mammals are displayed in the table below.
- A least squares line that enables life span to be predicted from gestation period is fitted to this data.
Name the explanatory variable in the equation of this least squares line. (1 mark)
- Determine the equation of the least squares line in terms of the variables life span and gestation period.
Write your answers in the appropriate boxes provided below.
Round the numbers representing the intercept and slope to three significant figures. (2 marks)
______ = ______ + ______ × ______
- Write the value of the correlation rounded to three decimal places. (1 mark)
`r =`
CORE, FUR2 2019 VCAA 4
The relative humidity (%) at 9 am and 3 pm on 14 days in November 2017 is shown in Table 3 below.
A least squares line is to be fitted to the data with the aim of predicting the relative humidity at 3 pm (humidity 3 pm) from the relative humidity at 9 am (humidity 9 am).
- Name the explanatory variable. (1 mark)
- Determine the values of the intercept and the slope of this least squares line.
Round both values to three significant figures and write them in the appropriate boxes provided.
humidity 3 pm = ______ + ______ × humidity 9 am (1 mark)
- Determine the value of the correlation coefficient for this data set.
Round your answer to three decimal places. (1 mark)
CORE, FUR1 2019 VCAA 9-10 MC
A least squares line is used to model the relationship between the monthly average temperature and latitude recorded at seven different weather stations. The equation of the least squares line is found to be
`quad text(average temperature) = 42.9842 - 0.877447 xx text(latitude)`
Part 1
When the numbers in this equation are correctly rounded to three significant figures, the equation will be
- `text(average temperature) = 42.984 - 0.877 xx text(latitude)`
- `text(average temperature) = 42.984 - 0.878 xx text(latitude)`
- `text(average temperature) = 43.0 - 0.878 xx text(latitude)`
- `text(average temperature) = 42.9 - 0.878 xx text(latitude)`
- `text(average temperature) = 43.0 - 0.877 xx text(latitude)`
Part 2
The coefficient of determination was calculated to be 0.893743
The value of the correlation coefficient, rounded to three decimal places, is
- −0.945
- −0.898
- 0.806
- 0.898
- 0.945
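For Part 2, the magnitude of the correlation coefficient is the square root of the coefficient of determination, and its sign matches the sign of the slope of the least squares line. A small sketch of that reasoning, using only the values quoted in the question:

```python
import math

slope = -0.877447     # slope of the least squares line from the question
r_squared = 0.893743  # coefficient of determination from the question

# |r| = sqrt(r^2); copysign gives r the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(round(r, 3))
```

Because the slope is negative, the positive square root on its own would give the wrong sign.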
CORE, FUR2 2018 VCAA 3
Table 3 shows the yearly average traffic congestion levels in two cities, Melbourne and Sydney, during the period 2008 to 2016. Also shown is a time series plot of the same data.
The time series plot for Melbourne is incomplete.
- Use the data in Table 3 to complete the time series plot above for Melbourne. (1 mark)
(Answer on the time series plot above.)
- A least squares line is used to model the trend in the time series plot for Sydney. The equation is
`text(congestion level = −2280 + 1.15 × year)`
- Draw this least squares line on the time series plot. (1 mark)
(Answer on the time series plot above.)
- Use the equation of the least squares line to determine the average rate of increase in percentage congestion level for the period 2008 to 2016 in Sydney.
Write your answer in the box provided below. (1 mark)
______ % per year
- Use the least squares line to predict when the percentage congestion level in Sydney will be 43%. (1 mark)
The yearly average traffic congestion level data for Melbourne is repeated in Table 4 below.
- When a least squares line is used to model the trend in the data for Melbourne, the intercept of this line is approximately –1514.75556
Round this value to four significant figures. (1 mark)
- Use the data in Table 4 to determine the equation of the least squares line that can be used to model the trend in the data for Melbourne. The variable year is the explanatory variable.
Write the values of the intercept and the slope of this least squares line in the appropriate boxes provided below.
Round both values to four significant figures. (2 marks)
congestion level = ______ + ______ × year
- Since 2008, the equations of the least squares lines for Sydney and Melbourne have predicted that future traffic congestion levels in Sydney will always exceed future traffic congestion levels in Melbourne.
Explain why, quoting the values of appropriate statistics. (2 marks)
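For the Sydney line quoted in this question, congestion level = −2280 + 1.15 × year, the average rate of increase is the slope, and the year for a given congestion level comes from rearranging the equation. A minimal sketch, with illustrative function names:

```python
INTERCEPT = -2280  # intercept of the Sydney least squares line
SLOPE = 1.15       # slope: percentage points of congestion per year


def congestion(year):
    """Predicted congestion level (%) for a given year."""
    return INTERCEPT + SLOPE * year


def year_for(level):
    """Year in which the line predicts the given congestion level (%)."""
    return (level - INTERCEPT) / SLOPE


print(f"rate of increase: {SLOPE} % per year")
print(f"43% predicted in year {year_for(43):.0f}")
```

The same two functions would apply to the Melbourne line once its intercept and slope have been calculated from Table 4.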
CORE, FUR2 2017 VCAA 4
The eggs laid by the female moths hatch and become caterpillars.
The following time series plot shows the total area, in hectares, of forest eaten by the caterpillars in a rural area during the period 1900 to 1980.
The data used to generate this plot is also given.
The association between area of forest eaten by the caterpillars and year is non-linear.
A log10 transformation can be applied to the variable area to linearise the data.
- When the equation of the least squares line that can be used to predict log10 (area) from year is determined, the slope of this line is approximately 0.0085385
Round this value to three significant figures. (1 mark)
- Perform the log10 transformation to the variable area and determine the equation of the least squares line that can be used to predict log10 (area) from year.
Write the values of the intercept and slope of this least squares line in the appropriate boxes provided below.
Round your answers to three significant figures. (2 marks)
- The least squares line predicts that the log10 (area) of forest eaten by the caterpillars by the year 2020 will be approximately 2.85
Using this value of 2.85, calculate the expected area of forest that will be eaten by the caterpillars by the year 2020.
Round your answer to the nearest hectare. (1 mark)
- Give a reason why this prediction may have limited reliability. (1 mark)
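Undoing the log10 transformation means raising 10 to the predicted value. The sketch below applies this to the predicted log10 (area) of 2.85 quoted in the question; the helper name `back_transform` is hypothetical.

```python
import math


def back_transform(log_value):
    """Invert a log10 transformation: area = 10 ** log10(area)."""
    return 10 ** log_value


area = back_transform(2.85)

# Rounded to the nearest hectare, as the question asks.
print(round(area), "hectares")

# Sanity check: taking log10 of the result recovers the predicted value.
print(math.log10(area))
```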