Data are sets of facts, such as numbers, words, measurements, observations, etc. Statistics is the study of data, based on the collection, representation and interpretation of data.
Which grade raised the most money per homeroom?
This question could be answered using the mean (the average, written x̄). The 8th grade raised a mean amount of $158 per homeroom. This is higher than the 9th grade mean of $148.71.
However, the 9th grade data contain an outlier, $38, which is much lower than the other data values. The mean is sensitive to outliers, so this one value pulls the 9th grade mean too low to represent the data well. Therefore, the median should be used to represent the 9th grade data.
If we use the median to answer the question, the 9th grade had the higher median, $160. By this measure, the 9th grade raised the greater amount of money per homeroom.
Both the mean and the median are measures of center.
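To see how an outlier affects the two measures of center, here is a small Python sketch. The original data table is not reproduced in this section, so the list `ninth` holds hypothetical homeroom amounts, chosen only so that they match the 9th grade figures quoted above (lowest value $38, mean $148.71, median $160). The real amounts may be different.

```python
from statistics import mean, median

# Hypothetical 9th grade homeroom amounts in dollars (an assumption, not the
# original table); chosen to match the mean and median quoted in the text.
ninth = [38, 142, 150, 160, 170, 188, 193]

# The same list with the outlier ($38) left out.
without_outlier = [x for x in ninth if x != 38]

mean_all = round(mean(ninth), 2)
median_all = median(ninth)
mean_trimmed = round(mean(without_outlier), 2)
median_trimmed = median(without_outlier)

print("with outlier:    mean =", mean_all, " median =", median_all)
print("without outlier: mean =", mean_trimmed, " median =", median_trimmed)
```

With these assumed values, leaving out the outlier moves the mean from 148.71 to 167.17, but the median only moves from 160 to 165. The median changes far less, which is why it is the better measure of center when a data set has an outlier.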
Another way to look at these statistics is to evaluate the variability of the data within each set by determining how spread out the data are. Variability is how much the data values differ from one another within the data set.
Numerically, the range of a data set is the difference between its highest data value and its lowest data value. The 9th grade’s range is $155 ($193 – $38 = $155), while the 8th grade’s range is only $50 ($180 – $130 = $50).
This means the amounts of money the 8th grade homerooms raised were less variable (more consistent) than the amounts from 9th grade homerooms.
Here are the same data graphed as box and whisker plots.
Notice how the 9th grade class has a much wider range than the 8th grade because the data for 9th grade are more spread out than the data for 8th grade.
Also notice that in the 9th grade plot, one data point ($38) is pulling the data set to the left. A data set that is stretched out in one direction like this is called skewed.
Recall the range of a data set is the difference between its highest data value and its lowest data value. The 9th grade’s range is $155 ($193 – $38 = $155), while the 8th grade’s range is only $50 ($180 – $130 = $50).
The box and whisker plots provide a visual model for the variability in the data. The amounts of money the 8th grade homerooms raised were less variable (more consistent) than the amounts from 9th grade homerooms.
Another measure of variability is the interquartile range. The interquartile range (also referred to as IQR) is the difference between the value of the third quartile (Q3) and the value of the first quartile (Q1). The interquartile range is a stronger measure of variability than the range because it looks only at the middle 50% of the data: it is the difference between the highest and lowest values within that middle half, so a very high or very low value has little effect on it.
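The short Python sketch below compares the range and the interquartile range with and without an outlier. It makes two assumptions. First, the list `ninth` holds hypothetical amounts (the original data table is not reproduced in this section), chosen so that the lowest value is $38, the highest is $193, and the quartiles come out to 142 and 188. Second, it finds the quartiles with Python's `statistics.quantiles` and its default "exclusive" method. Textbooks sometimes use a slightly different quartile rule, so other tools may give slightly different values.

```python
from statistics import quantiles

def data_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)

def iqr(values):
    """Third quartile minus first quartile (middle 50% of the data)."""
    # quantiles(..., n=4) returns the three cut points [Q1, median, Q3].
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

# Hypothetical 9th grade amounts in dollars (an assumption, not the original table).
ninth = [38, 142, 150, 160, 170, 188, 193]
without_outlier = [x for x in ninth if x != 38]

range_all, iqr_all = data_range(ninth), iqr(ninth)
range_trimmed, iqr_trimmed = data_range(without_outlier), iqr(without_outlier)

print("with outlier:    range =", range_all, " IQR =", iqr_all)
print("without outlier: range =", range_trimmed, " IQR =", iqr_trimmed)
```

With these assumed values, leaving out the outlier shrinks the range from 155 to 51, while the interquartile range only changes from 46 to 41.25. One extreme value can change the range a great deal, but it barely changes the interquartile range.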
The 9th grade data set has an outlier. This is a data point that is much smaller or larger than most of the other values in a set of data. Here, the outlier is the $38 in the data set.
This makes the box and whisker plot skewed left. That means that most of the data are grouped on the right of the plot, but the outlier makes the whisker stretch to the left.
The interquartile range for the 8th grade is $32 ($170 – $138 = $32).
The interquartile range for the 9th grade is $46 ($188 – $142 = $46).
So, if an award were given to the most consistent grade in this fundraiser, the 8th grade class would be the winner. Its data were less variable in both range and interquartile range.
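The comparison can be checked with a few lines of Python. This sketch uses only the lowest value, highest value, and quartiles quoted in the text. The dictionary layout and the names `summaries` and `spread` are just one way to organize the numbers. Note that the quartiles as quoted, 188 and 142, differ by 46.

```python
# Summary values in dollars, as quoted in the text for each grade.
summaries = {
    "8th grade": {"low": 130, "q1": 138, "q3": 170, "high": 180},
    "9th grade": {"low": 38, "q1": 142, "q3": 188, "high": 193},
}

def spread(summary):
    """Return the range and interquartile range for one grade."""
    return {
        "range": summary["high"] - summary["low"],
        "iqr": summary["q3"] - summary["q1"],
    }

spreads = {grade: spread(s) for grade, s in summaries.items()}

# A smaller spread means the amounts were more consistent.
most_consistent_by_range = min(spreads, key=lambda g: spreads[g]["range"])
most_consistent_by_iqr = min(spreads, key=lambda g: spreads[g]["iqr"])

for grade, values in spreads.items():
    print(grade, "range =", values["range"], " IQR =", values["iqr"])
print("most consistent by range:", most_consistent_by_range)
print("most consistent by IQR:  ", most_consistent_by_iqr)
```

Both measures pick the 8th grade, which agrees with the conclusion above.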