SmarterEd

Aussie Maths & Science Teachers: Save your time with SmarterEd

Data Analysis, GEN2 2024 NHT 1

Data was collected to investigate the behaviour of tides in Sydney Harbour.

There are usually two high tides and two low tides each day.

The variables in this study were:

  • Day: the day number in the sample
  • LLT: the height of the lowest low tide for that day (in metres)
  • HHT: the height of the highest high tide for that day (in metres)

Table 1 displays the data collected for a sample of 14 consecutive days in February 2021.

Table 1

\begin{array}{|c|c|c|}
\hline
 \rule{0pt}{2.5ex}\ \ \ \textit{Day}\ \ \ \rule[-1ex]{0pt}{0pt}& \textit{LLT (m)} & \textit{HHT (m)}\\
\hline \rule{0pt}{2.5ex}1 \rule[-1ex]{0pt}{0pt}& 0.43 & 1.65 \\
\hline \rule{0pt}{2.5ex}2 \rule[-1ex]{0pt}{0pt}& 0.49 & 1.55 \\
\hline \rule{0pt}{2.5ex}3 \rule[-1ex]{0pt}{0pt}& 0.55 & 1.44 \\
\hline \rule{0pt}{2.5ex}4 \rule[-1ex]{0pt}{0pt}& 0.61 & 1.42 \\
\hline \rule{0pt}{2.5ex}5 \rule[-1ex]{0pt}{0pt}& 0.68 & 1.42 \\
\hline \rule{0pt}{2.5ex}6 \rule[-1ex]{0pt}{0pt}& 0.73 & 1.42 \\
\hline \rule{0pt}{2.5ex}7 \rule[-1ex]{0pt}{0pt}& 0.72 & 1.42 \\
\hline \rule{0pt}{2.5ex}8 \rule[-1ex]{0pt}{0pt}& 0.65 & 1.47 \\
\hline \rule{0pt}{2.5ex}9 \rule[-1ex]{0pt}{0pt}& 0.57 & 1.55 \\
\hline \rule{0pt}{2.5ex}10 \rule[-1ex]{0pt}{0pt}& 0.48 & 1.64 \\
\hline \rule{0pt}{2.5ex}11 \rule[-1ex]{0pt}{0pt}& 0.39 & 1.74 \\
\hline \rule{0pt}{2.5ex}12 \rule[-1ex]{0pt}{0pt}& 0.30 & 1.83 \\
\hline \rule{0pt}{2.5ex}13 \rule[-1ex]{0pt}{0pt}& 0.25 & 1.90 \\
\hline \rule{0pt}{2.5ex}14 \rule[-1ex]{0pt}{0pt}& 0.22 & 1.92 \\
\hline
\end{array}

  1. For the \(HHT\) values in Table 1:
    1. Calculate the mean, in metres. Round your answer to one decimal place.   (1 mark)

      --- 2 WORK AREA LINES (style=lined) ---

    2. Calculate the standard deviation, in metres. Round your answer to three decimal places.    (1 mark)

      --- 2 WORK AREA LINES (style=lined) ---

  2. Use the \(HHT\) data from Table 1 to construct a boxplot on the grid below.    (2 marks)

    --- 0 WORK AREA LINES (style=lined) ---

     

  3. The five-number summary of the \(LLT\) data is shown in Table 2 below.

Table 2

\begin{array}{|c|c|c|c|c|}
\hline \rule{0pt}{2.5ex}\textbf{Minimum} \rule[-1ex]{0pt}{0pt}& \ \ \textbf{Q1} \ \ & \textbf{Median} & \ \  \textbf{Q3} \ \ & \textbf{Maximum} \\
\hline \rule{0pt}{2.5ex}0.22 \rule[-1ex]{0pt}{0pt}& 0.39 & 0.52 & 0.65 & 0.73 \\
\hline
\end{array}

     Show that the minimum \(LLT\) value of 0.22 m is not an outlier.    (2 marks)

    --- 4 WORK AREA LINES (style=lined) ---

  4. A least squares line can be used to model the association between \(LLT\) and \(HHT\). In this model, \(HHT\) is the response variable. Use the data from Table 1 to determine the equation of this least squares line. Round the values of the intercept and slope to four significant figures. Write your answers in the boxes provided.    (2 marks)

    --- 0 WORK AREA LINES (style=lined) ---

Show Answers Only

a.i.   \(\text{Mean = 1.6}\)

a.ii.  \(\text{Std dev = 0.185}\)

b.   \(\text{See Worked Solutions}\)

c.   \(IQR (LLT) = 0.65-0.39=0.26\)

\(\text{Lower fence}\ =Q_1-1.5 \times IQR = 0.39-1.5 \times 0.26=0\)

\(\text{Since 0.22 > 0, 0.22 is not an outlier.}\)

d.   \(HHT = 2.130 + (-1.054) \times LLT\)

Show Worked Solution

a.i.   \(\text{Mean = 1.6}\)

a.ii.  \(\text{Std dev = 0.185}\)
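The CAS values above can be cross-checked with a short Python sketch. It assumes the sample standard deviation (divisor n − 1), which is the version that reproduces 0.185; the variable names are illustrative only.

```python
from statistics import mean, stdev

# HHT values (metres) for days 1-14, copied from Table 1
hht = [1.65, 1.55, 1.44, 1.42, 1.42, 1.42, 1.42,
       1.47, 1.55, 1.64, 1.74, 1.83, 1.90, 1.92]

hht_mean = mean(hht)   # arithmetic mean
hht_sd = stdev(hht)    # sample standard deviation (divides by n - 1)

print(round(hht_mean, 1))  # 1.6
print(round(hht_sd, 3))    # 0.185
```

Using `pstdev` (divisor n) instead would give a smaller value, so the choice of divisor matters when matching a CAS result.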
 

b.   \(\text{Order \(HHT\) data:}\)

\(1.42, 1.42, 1.42, [1.42], 1.44, 1.47, 1.55 | 1.55, 1.64, 1.65, [1.74],\)

\(1.83, 1.90, 1.92\)

\(\text{High = 1.92, Low = 1.42, \(Q_1=1.42, Q_3=1.74\), Median = 1.55}\)
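The five values needed for the boxplot can be reproduced with a minimal Python sketch. It assumes the quartiles are the medians of the lower and upper halves of the ordered data (the middle value is left out of both halves when the count is odd), which matches the bracketed values above. The function name `five_number_summary` is hypothetical, and other software may use a different quartile convention.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]        # values below the median position
    upper = data[n - half:]    # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# HHT values (metres) from Table 1
hht = [1.65, 1.55, 1.44, 1.42, 1.42, 1.42, 1.42,
       1.47, 1.55, 1.64, 1.74, 1.83, 1.90, 1.92]

summary = five_number_summary(hht)
print(summary)  # (1.42, 1.42, 1.55, 1.74, 1.92)
```

Because the minimum and \(Q_1\) are both 1.42, the boxplot has no lower whisker.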

 
c.
   \(IQR (LLT) = 0.65-0.39=0.26\)

\(\text{Lower fence}\ =Q_1-1.5 \times IQR = 0.39-1.5 \times 0.26=0\)

\(\text{Since 0.22 > 0, 0.22 is not an outlier.}\)
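The fence test can also be written as a short Python sketch, using only the values from Table 2. Floating-point arithmetic may leave a tiny rounding error in the fence, so the comparison is what matters here, not the exact printed value.

```python
# Values taken from the five-number summary in Table 2
q1, q3 = 0.39, 0.65
minimum = 0.22

iqr = q3 - q1                   # approximately 0.26
lower_fence = q1 - 1.5 * iqr    # approximately 0 (up to rounding error)

# A value is an outlier at the low end only if it lies below the lower fence
is_outlier = minimum < lower_fence
print(is_outlier)  # False
```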
 

d.   \(HHT\ \text{is the response \((y)\) variable.}\)

\(\text{By CAS:}\)

\(HHT = 2.130 + (-1.054) \times LLT\)
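For readers without a CAS, the same line can be obtained from the standard least squares formulas (slope = Sxy / Sxx, intercept = ȳ − slope × x̄). The sketch below is one way to do this in plain Python; the function name `least_squares` is hypothetical.

```python
def least_squares(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Table 1 data: LLT is the explanatory variable, HHT the response variable
llt = [0.43, 0.49, 0.55, 0.61, 0.68, 0.73, 0.72,
       0.65, 0.57, 0.48, 0.39, 0.30, 0.25, 0.22]
hht = [1.65, 1.55, 1.44, 1.42, 1.42, 1.42, 1.42,
       1.47, 1.55, 1.64, 1.74, 1.83, 1.90, 1.92]

intercept, slope = least_squares(llt, hht)

# For these values, four significant figures means three decimal places
print(f"{intercept:.3f}", f"{slope:.3f}")  # 2.130 -1.054
```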

Filed Under: Correlation and Regression, Graphs - Stem/Leaf and Boxplots, Summary Statistics Tagged With: Band 3, Band 4, smc-265-20-Find LSRL Equation/Gradient, smc-468-20-Mean, smc-468-30-Std Dev, smc-468-50-IQR / Outliers, smc-643-30-Draw Box Plots

Data Analysis, GEN1 2024 VCAA 6 MC

More than 11 000 athletes from more than 200 countries competed in the Tokyo Summer Olympic Games.

An analysis of the number of athletes per country produced the following five-number summary.

\begin{array}{|c|c|c|c|c|}
\hline
\rule{0pt}{2.5ex} \textbf{Minimum} \rule[-1ex]{0pt}{0pt}& \textbf{First quartile } & \textbf{Median } & \textbf{Third quartile} & \textbf{Maximum } \\
\hline
\rule{0pt}{2.5ex} 2 \rule[-1ex]{0pt}{0pt}& 5 & 11 & 48 & 613 \\
\hline
\end{array}

The smallest number of athletes per country that would display as an outlier on a boxplot of this data is

  A. 49
  B. 112
  C. 113
  D. 613

Show Answers Only

\(C\)

Show Worked Solution

\(IQR=48-5=43\)

\(\text{Upper boundary}\) \(=Q_3+1.5\times IQR\)
  \(=48+1.5\times 43\)
  \(=112.5\)

 
\(\therefore\ \text{Smallest number of athletes to show as an outlier = 113.}\)

\(\Rightarrow C\)
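The last step, moving from the fence to the smallest value that plots as an outlier, can be sketched in Python. This assumes athlete counts are whole numbers and that an outlier must be strictly greater than the upper fence.

```python
import math

# Quartiles from the five-number summary
q1, q3 = 5, 48

iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr

# Smallest whole number strictly above the fence
smallest_outlier = math.floor(upper_fence) + 1
print(upper_fence, smallest_outlier)  # 112.5 113
```

Using `floor(...) + 1` rather than rounding up also handles the case where the fence itself is a whole number, since a value equal to the fence is not an outlier.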

♦ Mean mark 53%.

Filed Under: Summary Statistics Tagged With: Band 5, smc-468-50-IQR / Outliers

Data Analysis, GEN1 2024 VCAA 5 MC

The number of siblings of each member of a class of 24 students was recorded.

The results are displayed in the table below.

\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline
\rule{0pt}{2.5ex} \ \ 2\ \ \rule[-1ex]{0pt}{0pt} & \ \ 1 \ \ & \ \ 3 \ \ & \ \ 2 \ \ & \ \ 1 \ \ & \ \ 1 \ \ & \ \ 1 \ \ & \ \ 4 \ \ & \ \ 1 \ \ & \ \ 1 \ \ & \ \ 1 \ \ & \ \ 1 \ \ \\
\hline
\rule{0pt}{2.5ex} 1 \rule[-1ex]{0pt}{0pt} & 2 & 1 & 2 & 2 & 1 & 3 & 4 & 2 & 2 & 3 & 1 \\
\hline
\end{array}

A boxplot was constructed to display the spread of the data.

Which one of the following statements about this boxplot is correct?

  A. There are no outliers.
  B. The value of the interquartile range (IQR) is 1.5
  C. The value of the median is 1.5
  D. All of the five-number summary values are whole numbers.

Show Answers Only

\(C\)

Show Worked Solution

\(Q_1=1,\ Q_2=1.5,\ Q_3=2\rightarrow\ \ \text{eliminate D}\)

\(IQR=2-1=1\rightarrow\ \ \text{eliminate B}\)

\(Q_2=1.5 \longrightarrow \text{Median }=1.5 \ \rightarrow\ \text{C correct}\)

\(Q_3+1.5\times IQR=2+1.5\times 1 = 3.5 \ \rightarrow\ \ \text{the values of 4 are outliers}\ \rightarrow\ \ \text{eliminate A}\)

\(\Rightarrow C\)
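Each option can be checked directly from the raw data. The Python sketch below assumes the median-of-halves method for the quartiles (24 values, so each half has 12) and lists any values beyond the fences.

```python
from statistics import median

# Number of siblings for the 24 students, read row by row from the table
siblings = [2, 1, 3, 2, 1, 1, 1, 4, 1, 1, 1, 1,
            1, 2, 1, 2, 2, 1, 3, 4, 2, 2, 3, 1]

data = sorted(siblings)
half = len(data) // 2
q1 = median(data[:half])               # median of the lower 12 values
q2 = median(data)                      # overall median
q3 = median(data[len(data) - half:])   # median of the upper 12 values

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print(q1, q2, q3, iqr, upper_fence, outliers)
```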

Filed Under: Graphs - Stem/Leaf and Boxplots Tagged With: Band 3, smc-468-50-IQR / Outliers, smc-643-10-Single Box-Plots, smc-643-60-Outliers

CORE, FUR2 2021 VCAA 1

In the sport of heptathlon, athletes compete in seven events.

These events are the 100 m hurdles, high jump, shot-put, javelin, 200 m run, 800 m run and long jump.

Fifteen female athletes competed to qualify for the heptathlon at the Olympic Games.

Their results for three of the heptathlon events – high jump, shot-put and javelin – are shown in Table 1.

  1. Write down the number of numerical variables in Table 1.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  2. Complete Table 2 below by calculating the mean height jumped for the high jump, in metres, by the 15 athletes. Write your answer in the space provided in the table.   (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

  3. In shot-put, athletes throw a heavy spherical ball (a shot) as far as they can. Athlete number six, Jamilia, threw the shot 14.50 m. Calculate Jamilia's standardised score (`z`). Round your answer to one decimal place.   (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

  4. In the qualifying competition, the heights jumped in the high jump are expected to be approximately normally distributed. Chara's jump in this competition would give her a standardised score of  `z = -1.0`. Use the 68–95–99.7% rule to calculate the percentage of athletes who would be expected to jump higher than Chara in the qualifying competition.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  5. The boxplot below was constructed to show the distribution of high jump heights for all 15 athletes in the qualifying competition.

     Explain why the boxplot has no whisker at its upper end.   (1 mark)

    --- 2 WORK AREA LINES (style=lined) ---

  6. For the javelin qualifying competition (refer to Table 1), another boxplot is used to display the distribution of athletes' results. An athlete whose result is displayed as an outlier at the upper end of the plot is considered to be a potential medal winner in the event. What is the minimum distance that an athlete needs to throw the javelin to be considered a potential medal winner?   (2 marks)

    --- 5 WORK AREA LINES (style=lined) ---

Show Answers Only

  1. `3`
  2. `1.81`
  3. `0.5 \ text{(to 1 d.p.)}`
  4. `84text(%)`
  5. `text{See Worked Solutions}`
  6. `45.89 \ text{m}`

Show Worked Solution

a.    `3 \ text{High jump, shot-put and javelin}`

 `text{Athlete number is not a numerical variable}`
  

b.     `text{High jump mean}`

`= (1.76 + 1.79 + 1.83 + 1.82 + 1.87 + 1.73 + 1.68 + 1.82 +`

`1.83 + 1.87 + 1.87 + 1.80 + 1.83 + 1.87 + 1.78) ÷ 15`

`= 1.81`
 

c.   `z text{-score} (14.50)` `= {14.50-13.74}/{1.43}`
    `= 0.531 …`
    `= 0.5 \ text{(to 1 d.p.)}`
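The standardised score calculation can be written as a small Python helper. The mean (13.74) and standard deviation (1.43) are the shot-put values used in the working above; the function name `z_score` is hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

# Jamilia's throw of 14.50 m against the shot-put mean and standard deviation
z = z_score(14.50, 13.74, 1.43)
print(round(z, 1))  # 0.5
```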

 
d.  `P (z text{-score} < -1 ) = (100text(%)-68text(%)) / 2 = 16text(%)`

`:. P (z text{-score} > -1 ) = 100text(%)-16text(%) = 84text(%)`
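As a rough check, the 68–95–99.7% rule estimate can be compared with the exact normal probability. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the rule is an approximation, so a small difference is expected.

```python
from statistics import NormalDist

# 68-95-99.7 rule: 68% of values lie within one standard deviation of the mean,
# so the remaining 32% is split equally between the two tails
within_one_sd = 68
below_minus_one = (100 - within_one_sd) / 2   # 16% below z = -1
rule_estimate = 100 - below_minus_one         # 84% above z = -1

# Exact percentage above z = -1 for a standard normal distribution
exact = (1 - NormalDist().cdf(-1.0)) * 100

print(rule_estimate, round(exact, 1))  # 84.0 84.1
```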
 

e.  `text{If the} \ Q_3 \ text{value is also the highest value in the data set,}`

`text{there is no whisker at the upper end of a boxplot.}`
 

f.  `text{Javelin (ascending):}`

`38.12, 39.22, 40.62, 40.88, 41.22, 41.32, 42.33, 42.41, `

`42.51, 42.65, 42.75, 42.88, 45.64, 45.68, 46.53`

`Q_1 = 40.88 \ \ , \ Q_3 = 42.88 \ \ , \ \ IQR = 42.88-40.88 = 2`

`text{Upper Fence}` `= Q_3 + 1.5  xx IQR`
  `= 42.88 + 1.5 xx 2`
  `= 45.88`

 
`:. \ text{Minimum distance = 45.89 m  (longer than upper fence value)}`
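The javelin working can be reproduced with a Python sketch. It assumes the median-of-halves quartile method (15 values, so the 8th value is left out of both halves) and that distances are recorded to the nearest 0.01 m, so the smallest outlier is one hundredth of a metre above the fence.

```python
from statistics import median

# Javelin distances (metres) as listed in the worked solution
javelin = [38.12, 39.22, 40.62, 40.88, 41.22, 41.32, 42.33, 42.41,
           42.51, 42.65, 42.75, 42.88, 45.64, 45.68, 46.53]

data = sorted(javelin)
half = len(data) // 2                  # 7 values in each half
q1 = median(data[:half])               # 4th value
q3 = median(data[len(data) - half:])   # 12th value

upper_fence = q3 + 1.5 * (q3 - q1)

# Smallest recordable distance strictly above the fence (0.01 m resolution)
minimum_outlier = round(upper_fence + 0.01, 2)

print(q1, q3, round(upper_fence, 2), minimum_outlier)
```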

Filed Under: Graphs - Stem/Leaf and Boxplots, Normal Distribution, Summary Statistics Tagged With: Band 2, Band 3, Band 4, smc-468-20-Mean, smc-468-50-IQR / Outliers, smc-600-10-Single z-score, smc-643-10-Single Box-Plots

CORE, FUR1 2020 VCAA 1-3 MC

The times between successive nerve impulses (time), in milliseconds, were recorded.

Table 1 shows the mean and the five-number summary calculated using 800 recorded data values.
 


 

Part 1

The difference, in milliseconds, between the mean time and the median time is

  A.  10
  B.  70
  C.  150
  D.  220
  E.  230

 
Part 2

Of these 800 times, the number of times that are longer than 300 milliseconds is closest to

  A. 20
  B. 25
  C. 75
  D. 200
  E. 400

 
Part 3

The shape of the distribution of these 800 times is best described as

  A. approximately symmetric.
  B. positively skewed.
  C. positively skewed with one or more outliers.
  D. negatively skewed.
  E. negatively skewed with one or more outliers.

Show Answers Only

`text(Part 1:)\ B`

`text(Part 2:)\ D`

`text(Part 3:)\ C`

Show Worked Solution

Part 1

`text(Difference)` `= 220 -150`
  `= 70`

`=> B`
 

Part 2

`Q_3 = 300`

`:.\ text(Impulses longer than 300 milliseconds)`

`= 25text(%) xx 800`

`= 200`

`=> D`
 

Part 3

`text(Distribution has a long tail to the right)`

`:.\ text(Positively skewed)`

`text(Upper fence)` `= Q_3 + 1.5 xx IQR`
  `= 300 + 1.5 (300-70)`
  `= 645`

 
`=> C`

♦ Mean mark 50%.
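The arithmetic for the three parts can be collected in one Python sketch. The summary table is not reproduced here, so the values below are assumptions read off the working: mean 220 and median 150 (from the subtraction in Part 1), and \(Q_1 = 70\), \(Q_3 = 300\) (from the fence calculation in Part 3).

```python
n = 800                        # number of recorded times
mean_time, median_time = 220, 150
q1, q3 = 70, 300

# Part 1: difference between the mean and the median
difference = mean_time - median_time

# Part 2: about 25% of values lie above the third quartile
above_q3 = 0.25 * n

# Part 3: any time above the upper fence would plot as an outlier
upper_fence = q3 + 1.5 * (q3 - q1)

print(difference, above_q3, upper_fence)  # 70 200.0 645.0
```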

Filed Under: Summary Statistics Tagged With: Band 2, Band 3, Band 4, smc-468-20-Mean, smc-468-40-Median Mode and Range, smc-468-50-IQR / Outliers

CORE, FUR2 2018 VCAA 1

 

The data in Table 1 relates to the impact of traffic congestion in 2016 on travel times in 23 cities in the United Kingdom (UK).

The four variables in this data set are:

  • city — name of city
  • congestion level — traffic congestion level (high, medium, low)
  • size — size of city (large, small)
  • increase in travel time — increase in travel time due to traffic congestion (minutes per day).
  1. How many variables in this data set are categorical variables?  (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  2. How many variables in this data set are ordinal variables?  (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  3. Name the large UK cities with a medium level of traffic congestion.  (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  4. Use the data in Table 1 to complete the following two-way frequency table, Table 2.  (2 marks)

    --- 0 WORK AREA LINES (style=lined) ---

     

     


     

  5. What percentage of the small cities have a high level of traffic congestion?  (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

Traffic congestion can lead to an increase in travel times in cities. The dot plot and boxplot below both show the increase in travel time due to traffic congestion, in minutes per day, for the 23 UK cities.
 


 

  6. Describe the shape of the distribution of the increase in travel time for the 23 cities.  (1 mark)

    --- 1 WORK AREA LINES (style=lined) ---

  7. The data value 52 is below the upper fence and is not an outlier. Determine the value of the upper fence.  (1 mark)

    --- 3 WORK AREA LINES (style=lined) ---

Show Answers Only

  1. `3\ text(city, congestion level, size)`
  2. `2\ text(congestion level, size)`
  3. `text(Newcastle-Sunderland and Liverpool)`
  4. `text(See Worked Solutions)`
  5. `25 text(%)`
  6. `text(Positively skewed)`
  7. `52.5`

Show Worked Solution

a.   `3\-text(city, congestion level, size)`
 

b.   `2\-text(congestion level, size)`
 

c.   `text(Newcastle-Sunderland and Liverpool)`
 

d.   

 

e.    `text(Percentage)` `= text(Number of small cities high congestion)/text(Number of small cities) xx 100`
    `= 4/16 xx 100`
    `= 25 text(%)`

 
f.
   `text(Positively skewed)`

 

g.   `IQR = 39-30 = 9`
 

`text(Calculate the Upper Fence:)`

`Q_3 + 1.5 xx IQR` `= 39 + 1.5 xx 9`
  `= 52.5`

Filed Under: Graphs - Stem/Leaf and Boxplots, Summary Statistics Tagged With: Band 2, Band 3, Band 4, page-break-before-question, smc-468-10-Data Classification, smc-468-50-IQR / Outliers, smc-643-10-Single Box-Plots, smc-643-60-Outliers, smc-643-70-Distribution Description

CORE, FUR2 2013 VCAA 2

The development index for each country is a whole number between 0 and 100.

The dot plot below displays the values of the development index for each of the 28 countries that has a high development index.
 

 

  1. Using the information in the dot plot, determine each of the following.  (1 mark)

    --- 0 WORK AREA LINES (style=lined) ---

     

     

     

  2. Write down an appropriate calculation and use it to explain why the country with a development index of 70 is an outlier for this group of countries.  (2 marks)

    --- 5 WORK AREA LINES (style=lined) ---

Show Answers Only

  1. `text(Mode = 78,  Range = 9)`
  2. `text(See Worked Solutions)`

Show Worked Solution

a.   `text(Mode = 78)`

`text(Range = 79 − 70 = 9)`

 

b.   `text(An outlier occurs if a data point is below)`

`Q_1 − 1.5 xx IQR`

 

`Q_1 = 75, \ \ Q_3 = 78, and IQR = 78-75=3`

`:. Q_1 − 1.5 xx IQR` `= 75 − 1.5 xx 3`
  `= 70.5`

 

`:. 70\ text{is an outlier  (70 < 70.5)}`

Filed Under: Graphs - Histograms and Other Tagged With: Band 3, Band 4, smc-468-40-Median Mode and Range, smc-468-50-IQR / Outliers, smc-644-10-Dot Plots

CORE, FUR1 2012 VCAA 10 MC

Which one of the following statistics is never negative?

A.  a median

B.  a residual

C.  a standardised score

D.  an interquartile range

E.  a correlation coefficient

Show Answers Only

`D`

Show Worked Solution

`text(Since IQR)\ = Q_3-Q_1, and`

`Q_1\ text(is never greater than)\ Q_3,`

`text(IQR is never negative.)`

`rArr D`
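A quick Python sketch illustrates the point: the IQR can be zero when the middle of the data is all one value, but it cannot drop below zero. The function name `iqr` is hypothetical and the quartiles are assumed to be the medians of the lower and upper halves of the ordered data.

```python
from statistics import median

def iqr(values):
    """Interquartile range using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[len(data) - half:])
    return q3 - q1

print(iqr([3, 3, 3, 3, 3, 3]))    # 0  (zero, but not negative)
print(iqr([1, 2, 4, 7, 9, 12]))   # 7
```

Because the data are sorted first, the upper-half median can never be smaller than the lower-half median, which is why the difference is never negative.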

Filed Under: Summary Statistics Tagged With: Band 4, smc-468-40-Median Mode and Range, smc-468-50-IQR / Outliers

Copyright © 2014–2025 SmarterEd.com.au