In a previous lesson, we learned about measures of central tendency (mean, median, mode), which describe the 'typical' value in a data set. But that's only half the story. Consider these two data sets:
Both sets have the same mean (50) and the same median (50). But are they the same? Absolutely not! Set A is very spread out, while Set B is tightly clustered together. We need measures of spread (or variability) to describe this difference.
The range is the simplest measure of spread. It's the difference between the maximum and minimum values in a data set.
The range gives us a quick idea of how spread out the data is, but it can be misleading because it depends only on the two most extreme values. A single outlier can make the range much larger even when the rest of the data is tightly clustered.
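To make the outlier problem concrete, here is a minimal Python sketch. The helper name `data_range` and the sample values are assumptions chosen for illustration; they do not come from the lesson's own data sets.

```python
def data_range(values):
    """Return the range (maximum minus minimum) of a non-empty list of numbers."""
    if not values:
        raise ValueError("data_range() needs at least one value")
    return max(values) - min(values)


# Illustrative values only (hypothetical, not the lesson's Set A or Set B).
scores = [48, 49, 50, 51, 52]
scores_with_outlier = scores + [95]  # the same data plus one extreme value

print(data_range(scores))               # 4
print(data_range(scores_with_outlier))  # 47
```

Adding one extreme value moves the range from 4 to 47, even though the other five values have not changed at all.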
A more robust way to understand the spread of data is to look at how it's divided into quarters, or quartiles. This starts with the five-number summary: the minimum, the first quartile (Q1), the median, the third quartile (Q3), and the maximum.
Example: Let's find the five-number summary for the data set {2, 3, 5, 6, 8, 10, 11}.
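The example can be checked with a short Python sketch. It assumes the "median of each half" convention for quartiles, leaving the median itself out of both halves when the number of values is odd. Other conventions exist and can give slightly different quartiles. The function name `five_number_summary` is a hypothetical helper, not a standard library function.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    and the overall median is left out of both halves when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values to form quartiles")
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so that neither half contains it.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


print(five_number_summary([2, 3, 5, 6, 8, 10, 11]))  # (2, 3, 6, 10, 11)
```

Under this convention the minimum is 2, Q1 is 3, the median is 6, Q3 is 10, and the maximum is 11.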
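The interquartile range (IQR) is the range of the middle 50% of the data: IQR = Q3 - Q1. It is a very useful measure of spread because it ignores the lowest and highest quarters of the data, so outliers have little effect on it.
For the example data set {2, 3, 5, 6, 8, 10, 11}, Q1 = 3 and Q3 = 10, so IQR = 10 - 3 = 7. This tells us that the middle half of our data is spread over a range of 7 units.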
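The following sketch computes the IQR in Python and shows why it resists outliers. The helper names `quartiles` and `iqr` are assumptions, as is the "median of each half" quartile convention. The second data set is a hypothetical variant of the example in which the largest value has been replaced by an extreme one.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: the overall median is excluded from both halves when
    the number of values is odd.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values to form quartiles")
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(upper)


def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1


example = [2, 3, 5, 6, 8, 10, 11]
stretched = [2, 3, 5, 6, 8, 10, 110]  # hypothetical: 11 replaced by an outlier

print(iqr(example))    # 7
print(iqr(stretched))  # 7
```

Both data sets have an IQR of 7, because changing the largest value does not move Q1 or Q3.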
What is the range of the data set {15, 2, 9, 7, 22, 11}?
Find the first quartile (Q1) for the data set {1, 3, 4, 6, 9, 10, 12, 15}.
A data set has a first quartile of 20 and a third quartile of 35. What is the interquartile range (IQR)?