Representation of data: diagrams, measures of central tendency and dispersion
📊 Probability & Statistics 1 (S1) – Representation of Data
1️⃣ Types of Diagrams
Data can be shown in many ways. Think of a diagram as a picture that tells a story about numbers. Below are the most common types you’ll use at A‑Level.
- Bar Chart – tall bars for each category.
- Pie Chart – slices of a circle, each slice shows a proportion.
- Histogram – bars that touch each other, used for grouped continuous data; the area of each bar represents the frequency.
- Box Plot – a box with a line inside, shows spread and outliers.
2️⃣ Bar Chart Example
Imagine you surveyed 10 friends about their favourite fruit. The results are:
| Fruit | Number of Friends |
|---|---|
| Apples | 4 |
| Bananas | 3 |
| Cherries | 2 |
| Dates | 1 |
In a bar chart, the height of each bar would be proportional to the numbers above. The taller the bar, the more friends like that fruit. 📈
3️⃣ Pie Chart Example
Using the same data, a pie chart would show each fruit as a slice of a circle. The angle of each slice is proportional to the percentage of friends who chose that fruit.
If you want to calculate the slice for Apples:
$ \displaystyle \frac{4}{10} \times 100\% = 40\% $
So Apples would take up 40% of the pie. 🍰
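The same calculation can be repeated for every fruit. The sketch below is illustrative only: the helper name `pie_slices` is made up for this example, and it also converts each share into an angle (a full circle is 360°, so 40% corresponds to 144°).

```python
# Survey data from the fruit table above.
counts = {"Apples": 4, "Bananas": 3, "Cherries": 2, "Dates": 1}


def pie_slices(counts):
    """Return {category: (percentage, angle_in_degrees)} for a pie chart."""
    total = sum(counts.values())
    return {
        name: (100 * n / total, 360 * n / total)
        for name, n in counts.items()
    }


for name, (percent, angle) in pie_slices(counts).items():
    print(f"{name}: {percent:.0f}% of the pie, {angle:.0f} degree slice")
```

The percentages should always add up to 100% and the angles to 360°, which is a quick way to check your working.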
4️⃣ Histogram Example
Suppose you have the ages of 30 students: 12, 13, 13, 14, 14, 15, …, 18. A histogram groups ages into ranges (bins) and shows how many students fall into each range.
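Bins: 12–13, 14–15, 16–17, 18–19. Because these bins all have the same width, each bar's height can simply be the count of students in that age range. If the bins had different widths, you would plot frequency density (frequency ÷ class width) so that the area of each bar represents the frequency. 📉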
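Grouping values into bins can be sketched in a few lines of Python. Note the assumptions: the full list of 30 ages is not given above, so `sample_ages` is a made-up list of 10 values used only to show the method, and `bin_counts` is a hypothetical helper name.

```python
from collections import Counter

# Bins from the text, as inclusive (low, high) pairs of whole-number ages.
BINS = [(12, 13), (14, 15), (16, 17), (18, 19)]


def bin_counts(ages, bins=BINS):
    """Count how many ages fall into each inclusive (low, high) bin."""
    counts = Counter()
    for age in ages:
        for low, high in bins:
            if low <= age <= high:
                counts[(low, high)] += 1
                break  # each age belongs to at most one bin
    return [counts[b] for b in bins]


# Made-up sample for illustration, NOT the 30 students from the text.
sample_ages = [12, 13, 13, 14, 14, 15, 16, 17, 17, 18]

# Print a rough text histogram: one '#' per student in the bin.
for (low, high), count in zip(BINS, bin_counts(sample_ages)):
    print(f"{low}-{high}: {'#' * count} ({count})")
```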
5️⃣ Box Plot (Box‑and‑Whisker) Example
A box plot summarises data with five key numbers: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. Outliers are shown as individual points.
Think of it as a “data sandwich”: the box is the middle 50% of the data, the line inside is the median, and the whiskers reach to the smallest and largest values that are not outliers. 🥪
📈 Measures of Central Tendency
These give you a single number that represents the “centre” of a data set.
Mean (Average)
$ \displaystyle \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} $
Example: Ages 12, 14, 16, 18. $ \displaystyle \bar{x} = \frac{12+14+16+18}{4} = 15 $
Median
The middle value when data are ordered. If there’s an even number of values, take the average of the two middle ones.
Example (even number of values): Ages 12, 14, 16, 18 → median = (14+16)/2 = 15. Example (odd number of values): Ages 12, 14, 16 → median = 14.
Mode
The value that occurs most often. A data set can have no mode, one mode, or multiple modes.
Example: 12, 14, 14, 16 → mode = 14. Example: 12, 14, 16, 18 → no mode (all unique).
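If you want to check the examples above by computer, Python's standard `statistics` module provides `mean` and `median`. The `modes` helper below is a hypothetical function written for this sketch; it assumes the convention used in the text, where a data set whose values are all different has no mode.

```python
import statistics
from collections import Counter


def modes(data):
    """Return the most frequent value(s), or [] when every value is unique."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []  # all values differ: treated as "no mode", as in the text
    return [value for value, c in counts.items() if c == top]


ages = [12, 14, 16, 18]
print("mean:", statistics.mean(ages))
print("median (even count):", statistics.median(ages))
print("median (odd count):", statistics.median([12, 14, 16]))
print("modes of 12, 14, 14, 16:", modes([12, 14, 14, 16]))
print("modes of 12, 14, 16, 18:", modes(ages))
```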
📏 Measures of Dispersion
These tell you how spread out the data are.
Range
$ \displaystyle \text{Range} = \max(x_i) - \min(x_i) $
Example: Ages 12, 14, 16, 18 → range = 18 – 12 = 6.
Variance
$ \displaystyle s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} $
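This is the sample variance: the sum of the squared deviations from the mean, divided by n − 1, so it is roughly the average squared deviation. Some syllabuses divide by n instead, so check which form your specification uses. A larger variance means the data are more spread out.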
Standard Deviation
$ \displaystyle s = \sqrt{s^2} $
The square root of the variance gives a measure in the same units as the data. It’s easier to interpret than variance.
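As a worked check, the ages 12, 14, 16, 18 have mean 15, so the squared deviations are 9, 1, 1, 9, which sum to 20. The sketch below applies the formula with divisor n − 1 and compares the result with `statistics.variance` from Python's standard library; `sample_variance` is a name invented for this example.

```python
import math
import statistics


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


ages = [12, 14, 16, 18]
s2 = sample_variance(ages)  # 20 / 3
s = math.sqrt(s2)           # back in the original units (years)

print(f"variance = {s2:.3f}, standard deviation = {s:.3f}")
print(f"library check: {statistics.variance(ages):.3f}")
# For comparison, the divisor-n version is statistics.pvariance.
print(f"divisor n:     {statistics.pvariance(ages):.3f}")
```

With divisor n − 1 the variance is 20/3 ≈ 6.667; with divisor n it is 20/4 = 5, which shows why it matters which form you are asked to use.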
Inter‑Quartile Range (IQR)
$ \displaystyle \text{IQR} = Q3 - Q1 $
The IQR shows the spread of the middle 50% of the data, making it less sensitive to outliers.
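A minimal sketch of the calculation follows, under several stated assumptions. Quartile conventions differ between textbooks and software; this version takes Q1 and Q3 as the medians of the lower and upper halves of the ordered data. The 1.5 × IQR rule for flagging outliers is a common convention that is not stated in these notes, and both the data set and the function names are made up for illustration.

```python
import statistics


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    Needs at least two values. With an odd count, the middle value is
    left out of both halves.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )


def outliers(data, k=1.5):
    """Values more than k * IQR below Q1 or above Q3 (assumed rule)."""
    q1, _, q3 = quartiles(data)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]


# Made-up ages for illustration; 30 is deliberately extreme.
sample = [12, 13, 14, 15, 16, 17, 18, 30]
q1, q2, q3 = quartiles(sample)
print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {q3 - q1}")
print("outliers:", outliers(sample))
```

Here Q1 = 13.5 and Q3 = 17.5, so the IQR is 4 and the value 30 lies above 17.5 + 1.5 × 4 = 23.5. The range of this sample is 18, but the IQR is barely affected by the single extreme value.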
🔢 Quick Reference Table
| Measure | Formula | Interpretation |
|---|---|---|
| Mean | $ \displaystyle \bar{x} = \frac{\sum x_i}{n} $ | Average value. |
| Median | Middle value in ordered list. | Half the data below, half above. |
| Mode | Most frequent value. | Commonest observation. |
| Range | $ \displaystyle \max - \min $ | Overall spread. |
| Variance | $ \displaystyle \frac{\sum (x_i-\bar{x})^2}{n-1} $ | Average squared deviation. |
| Standard Deviation | $ \displaystyle \sqrt{s^2} $ | Spread in original units. |
| IQR | $ Q3 - Q1 $ | Spread of middle 50%. |
💡 Tips for Remembering
- Mean = “average” – add up everything, then divide.
- Median = “middle” – order the data first.
- Mode = “most popular” – the value that pops up most.
- Range = “biggest gap” – highest minus lowest.
- Variance = “squared spread” – think of squaring the differences.
- Standard Deviation = “square‑root of variance” – brings it back to normal units.
- IQR = “middle 50%” – ignore the extremes.
🧩 Practice Challenge
You have the following exam scores: 78, 82, 85, 90, 90, 92, 95, 100.
1. Draw a bar chart.
2. Calculate the mean, median, mode, range, variance, and standard deviation.
3. Identify any outliers using a box plot.
Try it on paper, then check your working against the formulas in the quick reference table above. Good luck! 🚀