Grade 10 · Statistics · Cambridge IGCSE 0580 · Age 14–15
Cumulative frequency is a running total of frequencies, used to analyse large grouped datasets. Instead of working with raw data, you build the running total and use it to estimate medians, quartiles, and percentiles, all of which are essential skills for the Cambridge IGCSE 0580 Extended papers.
Running totals from grouped frequency data
S-shaped curve, plot at upper class boundary
Read off n/2, n/4, 3n/4 positions
Measure of spread: UQ − LQ
5-number summary: min, LQ, med, UQ, max
p-th percentile at p/100 × n position
When data is spread over a wide range, we group it into class intervals. Each class has a frequency (the count of values in that interval). The class intervals must not overlap, and together they must cover all the data.
| Height h (cm) | Frequency |
|---|---|
| 140 ≤ h < 150 | 4 |
| 150 ≤ h < 160 | 11 |
| 160 ≤ h < 170 | 18 |
| 170 ≤ h < 180 | 15 |
| 180 ≤ h < 190 | 9 |
| 190 ≤ h < 200 | 3 |
| Total | 60 |
The cumulative frequency for a class is the running total of all frequencies up to and including that class. You add each frequency to the total so far.
| Height h (cm) | Frequency | Cumulative Frequency | Plot at (upper boundary) |
|---|---|---|---|
| 140 ≤ h < 150 | 4 | 4 | 150 |
| 150 ≤ h < 160 | 11 | 15 | 160 |
| 160 ≤ h < 170 | 18 | 33 | 170 |
| 170 ≤ h < 180 | 15 | 48 | 180 |
| 180 ≤ h < 190 | 9 | 57 | 190 |
| 190 ≤ h < 200 | 3 | 60 | 200 |
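
The running total in the table above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the data is the height table from this section.

```python
from itertools import accumulate

# Height classes from the table above: (lower bound, upper bound, frequency)
classes = [
    (140, 150, 4),
    (150, 160, 11),
    (160, 170, 18),
    (170, 180, 15),
    (180, 190, 9),
    (190, 200, 3),
]

frequencies = [f for _, _, f in classes]

# accumulate() yields the running total after each class
cumulative = list(accumulate(frequencies))

for (lower, upper, f), cf in zip(classes, cumulative):
    print(f"{lower} <= h < {upper}: frequency {f}, cumulative frequency {cf}")

# The final cumulative frequency should equal the total number of values, n
n = sum(frequencies)
print("n =", n, "and final CF =", cumulative[-1])
```

A quick self-check in an exam works the same way: the last cumulative frequency must match the total given in the question.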
Once you have your CF table, plot cumulative frequency (y-axis) against upper class boundary (x-axis). Start the curve at (lower boundary of the first class, 0), then join the points with a smooth curve. This creates the characteristic S-shape (also called an ogive).
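The plotting step can be sketched as building a list of (x, y) points. This assumes the height table above and includes the starting point at (lower boundary of the first class, 0) that this lesson lists among the common mistakes. The variable names are illustrative.

```python
# Upper class boundaries and cumulative frequencies from the height table
upper_boundaries = [150, 160, 170, 180, 190, 200]
cumulative = [4, 15, 33, 48, 57, 60]

# Lower boundary of the first class: no values lie below it, so CF = 0 there
first_lower_boundary = 140

# Anchor point first, then one point per class at its UPPER boundary
points = [(first_lower_boundary, 0)] + list(zip(upper_boundaries, cumulative))

for x, cf in points:
    print(f"plot ({x}, {cf})")
```

There is one more point than there are classes, because of the anchor at CF = 0.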
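The median is the middle value when all the data is arranged in order. For n data values in a grouped table, the median is at the n/2 position on the cumulative frequency axis: read across from n/2 to the curve, then down to the horizontal axis to estimate the median.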
Quartiles divide the data into four equal quarters.
| Measure | Position on CF axis | For n = 60 |
|---|---|---|
| Lower Quartile (LQ or Q₁) | n/4 | 60/4 = 15 |
| Median (Q₂) | n/2 | 60/2 = 30 |
| Upper Quartile (UQ or Q₃) | 3n/4 | 3×60/4 = 45 |
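
Reading the three positions off the curve can be approximated in code. This is a minimal sketch that assumes straight-line interpolation between plotted points as a stand-in for the hand-drawn smooth curve. The helper name `read_off` is illustrative, and the points are the height data from earlier in this lesson.

```python
def read_off(points, target_cf):
    """Estimate the x-value where the CF curve reaches target_cf.

    Assumption: linear interpolation between neighbouring plotted points,
    which only approximates a smooth freehand curve.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1:
            if cf1 == cf0:
                return x0  # flat segment: no values in this class
            return x0 + (target_cf - cf0) / (cf1 - cf0) * (x1 - x0)
    raise ValueError("target_cf is outside the range of the curve")


# (upper boundary, cumulative frequency), starting from the anchor (140, 0)
points = [(140, 0), (150, 4), (160, 15), (170, 33), (180, 48), (190, 57), (200, 60)]
n = points[-1][1]  # total frequency, 60

lq = read_off(points, n / 4)          # position 15
median = read_off(points, n / 2)      # position 30
uq = read_off(points, 3 * n / 4)      # position 45

print(f"LQ = {lq:.1f}, median = {median:.1f}, UQ = {uq:.1f}")
```

For the height data this gives LQ = 160 cm, median ≈ 168.3 cm and UQ = 178 cm. Readings from a hand-drawn curve may differ slightly.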
The interquartile range (IQR = UQ − LQ) measures the spread of the middle 50% of the data. A smaller IQR means the data is more consistent; a larger IQR means it is more spread out.
A box-and-whisker plot is a visual summary of a distribution using five key values: the minimum, lower quartile, median, upper quartile, and maximum. These are called the five-number summary.
Exam questions often give two groups (e.g. boys and girls, or two schools) and ask you to compare their distributions. You must always compare both average AND spread.
A percentile divides data into hundredths. The p-th percentile is the value below which p% of the data falls. Quartiles are just special percentiles: LQ = 25th percentile, Median = 50th percentile, UQ = 75th percentile.
| Time t (min) | Freq |
|---|---|
| 20 ≤ t < 30 | 3 |
| 30 ≤ t < 40 | 7 |
| 40 ≤ t < 50 | 12 |
| 50 ≤ t < 60 | 11 |
| 60 ≤ t < 70 | 7 |
| Upper boundary | Cumulative Frequency |
|---|---|
| 30 | 3 |
| 40 | 10 |
| 50 | 22 |
| 60 | 33 |
| 70 | 40 |
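
The percentile rule (position = p/100 × n) can be tried on the time data above. This is a sketch only: the percentiles chosen are illustrative, the function name `percentile` is hypothetical, and straight-line interpolation is assumed in place of a smooth curve.

```python
from bisect import bisect_left

# Time data from the tables above; 20 is the lower boundary of the first class
boundaries = [20, 30, 40, 50, 60, 70]
cumulative = [0, 3, 10, 22, 33, 40]
n = cumulative[-1]  # total frequency, 40


def percentile(p):
    """Estimate the p-th percentile by linear interpolation on the CF points."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    position = p / 100 * n  # where to read on the CF axis
    # First plotted point whose cumulative frequency reaches the position
    i = bisect_left(cumulative, position)
    if i == 0:
        return boundaries[0]
    x0, x1 = boundaries[i - 1], boundaries[i]
    cf0, cf1 = cumulative[i - 1], cumulative[i]
    return x0 + (position - cf0) / (cf1 - cf0) * (x1 - x0)


for p in (25, 50, 90):
    print(f"{p}th percentile: position {p / 100 * n:g}, estimate {percentile(p):.1f} min")
```

With n = 40, the 25th percentile is read at CF = 10, which falls exactly on the boundary at 40 minutes. The 50th and 90th percentiles come out at roughly 48.3 and 64.3 minutes under the linear assumption.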
These are the errors that cost marks most often in IGCSE exams. Study each one carefully.
The cumulative frequency up to a class represents "all values up to the end of that class" — so you plot at the upper boundary. Using the midpoint gives an S-curve shifted to the left and produces wrong estimates for the median and quartiles.
The positions are n/4 for LQ and 3n/4 for UQ, not n/4 + 1 or other adjustments. For grouped continuous data, use exactly n/4 and 3n/4. Some students confuse this with the discrete data rule where you'd use (n+1)/4 — that rule does not apply here.
Without the starting point at (lower boundary, 0), your S-curve will not be anchored correctly and the shape will be wrong at the bottom.
A CF curve should always be drawn as a smooth freehand curve. Straight lines between points imply the data is uniformly distributed within each class, which is rarely true. You will lose the "smooth curve" mark.
Cumulative frequency always counts from the bottom up (how many are BELOW or AT a value). To find how many are ABOVE, subtract from the total.
When comparing distributions, you must (1) state BOTH values, (2) name both groups, (3) give a contextual interpretation. A standalone statement without comparison earns 0 marks.
| Measure | Formula / Method | Notes |
|---|---|---|
| Cumulative Frequency | Running total of frequencies | Final CF = n (total) |
| Plot position | Upper class boundary | NOT midpoint |
| Starting point | (Lower boundary of first class, 0) | Anchors the S-curve |
| Median position | n/2 | Read from CF axis |
| Lower Quartile (LQ) position | n/4 | Read from CF axis |
| Upper Quartile (UQ) position | 3n/4 | Read from CF axis |
| Interquartile Range (IQR) | IQR = UQ − LQ | Middle 50% spread |
| p-th Percentile position | (p/100) × n | Read from CF axis |
| Number exceeding value v | n − CF(v) | Subtract from total |
| Range | Max − Min | From the data/table |
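
The "number exceeding" row of the table, n − CF(v), can be sketched as follows. The example uses the height table from earlier in this lesson and, as a simplifying assumption, only handles values of v that are class boundaries. The name `number_exceeding` is illustrative.

```python
# Cumulative frequencies at the upper class boundaries of the height table
cf_at = {150: 4, 160: 15, 170: 33, 180: 48, 190: 57, 200: 60}
n = max(cf_at.values())  # final CF equals the total, 60


def number_exceeding(v):
    """Number of values above boundary v: subtract CF(v) from the total.

    Assumption: v is one of the class boundaries, so CF(v) can be read
    straight from the table without interpolation.
    """
    return n - cf_at[v]


print("Taller than 180 cm:", number_exceeding(180))
print("Taller than 170 cm:", number_exceeding(170))
```

For a value between boundaries, the CF would first have to be read from the curve before subtracting from n.
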
| n (total frequency) | LQ position (n/4) | Median position (n/2) | UQ position (3n/4) |
|---|---|---|---|
| 40 | 10 | 20 | 30 |
| 60 | 15 | 30 | 45 |
| 80 | 20 | 40 | 60 |
| 100 | 25 | 50 | 75 |
| 120 | 30 | 60 | 90 |
For each question, a frequency table is given. Enter the missing cumulative frequencies.
1. Complete the CF table. Classes: 0–10 (f=2), 10–20 (f=8), 20–30 (f=15). What is the CF at the end of the third class?
2. Classes: 0–5 (f=4), 5–10 (f=9), 10–15 (f=13), 15–20 (f=6). CF at end of 2nd class?
3. Using Q2's table, CF at end of 3rd class?
4. Using Q2's table, total n (CF at end of 4th class)?
5. Classes: 10–20 (f=6), 20–30 (f=14), 30–40 (f=20), 40–50 (f=10). CF at upper boundary 40?
6. Using Q5's table, total n?
7. The CF values for 5 classes are: 5, 18, 35, 44, 50. What is the frequency of the 3rd class?
8. CF values at boundaries 20, 30, 40, 50, 60 are 4, 15, 28, 37, 40. What is the frequency for the class 40–50?
For each value of n, calculate the CF axis position to read off each measure.
1. n = 40. At what CF value is the median?
2. n = 40. At what CF value is the LQ?
3. n = 40. At what CF value is the UQ?
4. n = 80. Median position?
5. n = 80. UQ position?
6. n = 100. LQ position?
7. n = 120. Median position?
8. n = 60. For the 80th percentile, at what CF value do you read off?
Use the given CF table data (n = 60) to answer the questions. The CF curve passes through: (20,0), (30,3), (40,10), (50,22), (60,33), (70,40), (80,50), (90,60). Use linear interpolation within each class interval.
1. n = 60. Median position?
2. LQ position for n = 60?
3. UQ position for n = 60?
4. The CF at x=50 is 22, and at x=60 is 33. By linear interpolation, estimate the median (at CF=30). Give to 1 d.p.
5. The LQ position is 15. Use interpolation between x=40 (CF=10) and x=50 (CF=22). Estimate LQ to 1 d.p.
6. CF at x=70 is 40 (UQ position is 45). Use interpolation between x=70 (CF=40) and x=80 (CF=50). Estimate UQ to 1 d.p.
7. IQR = UQ − LQ. Using LQ=44.2, UQ=75 from Q5 and Q6.
8. How many values are below x = 50 in this dataset? (Read directly from CF table)
1. Five-number summary: Min=10, LQ=25, Median=35, UQ=50, Max=70. What is the IQR?
2. Five-number summary: Min=5, LQ=20, Median=30, UQ=45, Max=65. What is the range?
3. A box plot has: Min=12, LQ=28, Median=36, UQ=44, Max=60. What is the length of the box?
4. The IQR of a dataset is 18 and the LQ is 32. What is the UQ?
5. From a CF curve (n=50), you read: LQ=24, Median=31, UQ=40. Find the IQR.
6. Box plot: Min=0, Max=100, IQR=30, Median=55, LQ=40. What is the UQ?
7. A distribution has Min=15, Range=65, IQR=20, LQ=38. Find the Median if it is 8 above the LQ.
8. Two groups: Group A IQR=12, Group B IQR=20. Which group is MORE consistent? Enter 1 for A or 2 for B.
1. A CF table ends: ..., (70, 45), (80, 60). How many values are in the class 70–80?
2. n = 80. At what CF do you read the 90th percentile?
3. From a curve (n=80): CF at x=50 is 72. How many values exceed 50?
4. A CF curve has n=100. LQ=42, UQ=68. IQR = ?
5. n = 100. At what CF do you read the 35th percentile?
6. Frequency table: class 60–70 (f=8), 70–80 (f=14), 80–90 (f=18), 90–100 (f=10). What is n?
7. From Q6's table, the CF at the upper boundary 80 is?
8. Group A: Median=65, IQR=14. Group B: Median=72, IQR=14. Which group has higher typical values? Enter 1 for A, 2 for B.
Non-calc questions are marked with [NC]. All others may use a calculator.
[NC] 1. Frequency table: 0–10(f=5), 10–20(f=12), 20–30(f=8). What is the CF at upper boundary 20?
[NC] 2. n = 60. Median position on CF axis?
[NC] 3. n = 60. LQ position?
[NC] 4. n = 60. UQ position?
[NC] 5. LQ = 28, UQ = 52. IQR = ?
[NC] 6. Cumulative frequencies: 6, 19, 35, 42, 50. What is the frequency of the 4th class?
[NC] 7. Five-number summary: Min=10, LQ=30, Median=45, UQ=60, Max=90. Range = ?
[NC] 8. Same as Q7. IQR = ?
9. n = 120. UQ position?
[NC] 10. CF table ends at (50, 36) and (60, 50). Frequency of class 50–60?
[NC] 11. n = 50. 60th percentile position on CF axis?
[NC] 12. Box plot: LQ=40, IQR=25. UQ = ?
13. From a CF curve (n=80): CF at x=70 is 56. How many values are above 70?
[NC] 14. n = 40. 75th percentile position?
[NC] 15. Frequencies: 4, 10, 18, 12, 6. Total n?
[NC] 16. The cumulative frequency at the end of class 3 is 32 and at the end of class 4 is 50. Frequency of class 4?
[NC] 17. n = 100. LQ position?
18. From a CF curve (n=100): LQ=45, UQ=73. IQR = ?
[NC] 19. Box plot whisker goes from 20 (min) to LQ=38. Length of left whisker = ?
[NC] 20. The median is at position n/2. For n=90, median position?
21. n = 90. 20th percentile position on CF axis?
[NC] 22. Group A median=58, Group B median=62. Which has higher typical value? Enter 1 for A, 2 for B.
[NC] 23. Group A IQR=15, Group B IQR=22. Which is more consistent? Enter 1 for A, 2 for B.
[NC] 24. For a CF curve, should you plot at the upper class boundary or midpoint? Enter 1 for upper boundary, 2 for midpoint.
[NC] 25. CF values: 8, 20, 38, 50. n = 50. UQ position = 37.5. Using CF=38 at x=40 and CF=20 at x=30, estimate the UQ by interpolation. Give to 1 d.p.
1. A CF curve passes through (50, 0), (60, 8), (70, 26), (80, 44), (90, 56), (100, 60). n=60. By linear interpolation between (70,26) and (80,44), estimate the median (CF=30) to 1 d.p.
2. Using the data in Q1, estimate the LQ (CF=15) by interpolating between (60,8) and (70,26). Give to 1 d.p.
3. Using Q1 data: UQ is at CF=45. Interpolate between (80,44) and (90,56). Estimate UQ to 1 d.p.
4. Using your answers from Q1–Q3, calculate the IQR to 1 d.p.
5. n = 60. Find the 90th percentile position on the CF axis.
6. From Q1's CF curve, estimate the 90th percentile (CF=54) by interpolating between (80,44) and (90,56). Use: x = 80 + ((54−44)/(56−44))×10. Give to 1 d.p. [Hint: CF=54 is BELOW 56, so the value lies in the 80–90 class.]
7. From Q1's data, how many values exceed 85? (Interpolate to find CF at x=85, then subtract from 60.)
8. Two groups both have n=80. Group A: IQR=18. Group B: IQR=27. The median of A=64, median of B=58. Write ONE valid comparison about spread. Enter IQR of the MORE consistent group.
9. A CF table: boundary 30→CF=0, 40→CF=12, 50→CF=35, 60→CF=n. If the median is exactly 50, what must n be? (Median at n/2, and CF at x=50 is 35, so n/2=35.)
10. Box plot: Min=20, Max=90. IQR=24, LQ=38. Find the length of the right whisker (UQ to Max).
11. The 25th and 75th percentiles of a dataset are 36 and 60 respectively. The median is 48. Is the distribution positively skewed, negatively skewed, or symmetric? (Enter 1=positive, 2=negative, 3=symmetric)
12. A dataset has n=200. A student claims the UQ is at CF=150. Is this correct? (Enter 1 for yes, 2 for no — UQ should be at 3n/4=150.)
Mark-scheme style. Show working in your book. Enter final answers for self-marking.
The table shows the masses (kg) of 80 parcels delivered by a courier.
| Mass m (kg) | Frequency |
|---|---|
| 0 ≤ m < 1 | 6 |
| 1 ≤ m < 2 | 14 |
| 2 ≤ m < 3 | 23 |
| 3 ≤ m < 4 | 19 |
| 4 ≤ m < 5 | 12 |
| 5 ≤ m < 6 | 6 |
Using the parcel data from Question 1 (n = 80). The CF curve passes through: (1,6), (2,20), (3,43), (4,62), (5,74), (6,80).
Using the parcel data: Min = 0 kg, Max = 6 kg, LQ = 2.00 kg, Median ≈ 2.87 kg, UQ ≈ 3.89 kg.
Still using the parcel CF data (n = 80). CF curve: (1,6),(2,20),(3,43),(4,62),(5,74),(6,80).
The table shows statistics for delivery times (minutes) for two courier companies.
| Statistic | Company A | Company B |
|---|---|---|
| Minimum | 15 | 10 |
| Lower Quartile | 28 | 22 |
| Median | 35 | 40 |
| Upper Quartile | 44 | 62 |
| Maximum | 60 | 90 |