Inference & Hypothesis Testing | FractionRush A-Level

Welcome to Inference & Hypothesis Testing

Hypothesis testing is the engine of scientific inference. From drug trials that determine whether a new treatment works, to quality control on a production line, to opinion polling before an election — this rigorous framework lets us make principled decisions from data. Rather than guessing, we quantify exactly how surprising our observations are under a default assumption, then decide whether the data is surprising enough to overturn it.

H₀: parameter = value | H₁: parameter > / < / ≠ value
p-value = P(result at least as extreme as observed | H₀ true)
Reject H₀ if p-value ≤ significance level α

In this module you will develop a complete toolkit: setting up hypotheses correctly, choosing the right type of test, computing p-values from Binomial and Normal distributions, identifying critical regions, and understanding the two ways a test can go wrong.

Learning Objectives

Formulate the null hypothesis H₀ and alternative hypothesis H₁ correctly for a given context
Choose between a one-tail and two-tail test based on the wording of the question
State and interpret the significance level α of a hypothesis test
Calculate the test statistic or p-value for a given data set
Find the critical region for a test at a given significance level
Compare the p-value to α and write a conclusion in context
Understand Type I error: rejecting H₀ when it is actually true, with probability α
Understand Type II error: failing to reject H₀ when it is actually false, with probability β
Test a proportion p using the Binomial distribution B(n, p₀)
Test a population mean μ using the Normal distribution with known variance σ²

Topics in This Module

Null & Alternative Hypotheses

Formulating H₀ and H₁ from a problem context

One-Tail vs Two-Tail

Choosing the correct test direction

p-values

Computing and interpreting probability of observed result

Critical Regions

Finding the rejection region from tables

Type I & II Errors

False positives, false negatives, power of a test

Binomial & Normal Tests

Testing proportions and means from data

Learn 1 — Setting Up a Hypothesis Test

Every hypothesis test begins with two competing statements about a population parameter (such as a proportion p or a mean μ).

The Null Hypothesis H₀

H₀ is the default assumption — the status quo. It always contains an equality sign. We assume H₀ is true until the data gives us sufficient evidence to doubt it.

H₀: p = p₀ or H₀: μ = μ₀

The Alternative Hypothesis H₁

H₁ is the claim we are testing. It contains a strict inequality. There are three forms:

H₁: p > p₀ — one-tail test (right-tail). We suspect the parameter is larger than stated.
H₁: p < p₀ — one-tail test (left-tail). We suspect the parameter is smaller than stated.
H₁: p ≠ p₀ — two-tail test. We suspect the parameter is different (either direction).

Why We Can Never "Prove" H₀

A hypothesis test can only tell us whether our data is inconsistent with H₀. If the data is surprising under H₀ (p-value small), we reject H₀. If the data is not surprising, we fail to reject H₀ — but this is not the same as proving H₀ is true. Absence of evidence is not evidence of absence.

Language matters: Always write "fail to reject H₀" or "insufficient evidence to reject H₀". Never write "accept H₀" or "prove H₀".

Significance Level α

The significance level α is the threshold probability we set before seeing the data. If the p-value is at most α, we say the result is statistically significant and reject H₀.

Common values: α = 0.05 (5%), α = 0.01 (1%), α = 0.10 (10%)

The significance level also equals the probability of a Type I error — rejecting H₀ when it is actually true. A smaller α reduces this risk but makes it harder to reject H₀.

Tip: The choice of α is made before the test. If the question says "test at the 5% level", then α = 0.05 throughout — do not adjust it after seeing the data.

Learn 2 — p-values and Critical Regions

The p-value

The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming H₀ is true. It measures how surprising the data is under the null hypothesis.

p-value = P(result at least as extreme as observed | H₀ true)

Decision rule:
If p-value ≤ α → Reject H₀. Significant evidence against H₀.
If p-value > α → Fail to reject H₀. Insufficient evidence against H₀.

One-tail vs Two-tail p-values

For a one-tail test, the p-value is the area in one tail only: P(X ≥ x_obs) when H₁: p > p₀, or P(X ≤ x_obs) when H₁: p < p₀.

For a two-tail test (H₁: p ≠ p₀), the p-value is double the smaller tail probability: 2 × min(P(X ≤ x_obs), P(X ≥ x_obs)), because extreme values in either direction count as evidence. Equivalently, at 5% significance each tail is allotted only 2.5%.

Common error: For a two-tail test at 5%, do not compare the one-tail probability to 0.05. Compare it to 0.025 (half of 5%), or double the probability and compare to 0.05.

The Critical Region

The critical region (or rejection region) is the set of values of the test statistic for which we would reject H₀. The critical value is the boundary of this region.

One-tail right: Reject H₀ if X ≥ c (where P(X ≥ c) ≤ α)
One-tail left: Reject H₀ if X ≤ c (where P(X ≤ c) ≤ α)
Two-tail: Reject H₀ if X ≤ c₁ or X ≥ c₂

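For Binomial tests, the critical region is found by cumulating probabilities inward from the tail and keeping every value for which the running total is still at most α; the first value that pushes the total above α lies outside the region. For Normal tests, we use the standard Normal table (Z-table) to find the critical Z-value.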

Actual significance level: Because the Binomial is discrete, the actual probability of rejecting H₀ at the boundary is often slightly less than α. The "actual significance level" is P(being in the critical region | H₀ true).
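
The tail-cumulating procedure described above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names (binom_pmf, lower_critical_value, upper_critical_value, tail_probability) are made up for this sketch, and the two test cases B(20, 0.3) at 10% (left tail) and at 5% (right tail) are chosen for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower_critical_value(n, p0, alpha):
    """Largest c with P(X <= c) <= alpha under B(n, p0); None if no such c."""
    total, c = 0.0, None
    for k in range(n + 1):            # accumulate from the left tail
        total += binom_pmf(k, n, p0)
        if total > alpha:             # including k would overshoot alpha
            break
        c = k
    return c


def upper_critical_value(n, p0, alpha):
    """Smallest c with P(X >= c) <= alpha under B(n, p0); None if no such c."""
    total, c = 0.0, None
    for k in range(n, -1, -1):        # accumulate from the right tail
        total += binom_pmf(k, n, p0)
        if total > alpha:
            break
        c = k
    return c


def tail_probability(n, p0, c, tail):
    """Actual significance level: P(X <= c) if tail == 'lower', else P(X >= c)."""
    ks = range(0, c + 1) if tail == "lower" else range(c, n + 1)
    return sum(binom_pmf(k, n, p0) for k in ks)


c_low = lower_critical_value(20, 0.3, 0.10)   # left-tail test at 10%
c_up = upper_critical_value(20, 0.3, 0.05)    # right-tail test at 5%
print("reject if X <=", c_low, "actual level", round(tail_probability(20, 0.3, c_low, "lower"), 4))
print("reject if X >=", c_up, "actual level", round(tail_probability(20, 0.3, c_up, "upper"), 4))
```

In both cases the actual significance level comes out below the nominal α, which is the discreteness effect noted above.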

Learn 3 — Binomial Hypothesis Test

When testing a proportion p, the test statistic X is the count of successes in n trials. Under H₀, X ~ B(n, p₀).

Setting Up

H₀: p = p₀ vs H₁: p > p₀ / p < p₀ / p ≠ p₀
X ~ B(n, p₀) under H₀

Computing the p-value

For H₁: p > p₀ → p-value = P(X ≥ x_obs) = 1 − P(X ≤ x_obs − 1)
For H₁: p < p₀ → p-value = P(X ≤ x_obs)
For H₁: p ≠ p₀ → p-value = 2 × min(P(X ≤ x_obs), P(X ≥ x_obs))

These probabilities are found using Binomial tables (Cambridge provides cumulative tables) or the formula:

P(X = k) = C(n,k) · p₀ᵏ · (1−p₀)ⁿ⁻ᵏ

Worked Example: Testing a Biased Coin

A coin is suspected of being biased towards heads. It is tossed 20 times and 15 heads are observed. Test at the 5% significance level.

H₀: p = 0.5 (fair coin) H₁: p > 0.5 (one-tail right, because we suspect more heads)

Under H₀: X ~ B(20, 0.5)

p-value = P(X ≥ 15) = P(15) + P(16) + … + P(20)
Using tables or calculation: P(X ≥ 15) = 1 − P(X ≤ 14) ≈ 1 − 0.9793 = 0.0207

Compare: 0.0207 < 0.05 = α
Conclusion: Reject H₀. There is significant evidence at the 5% level that the coin is biased towards heads.

Always state: (1) H₀ and H₁ with the parameter defined, (2) the distribution under H₀, (3) the p-value calculation, (4) comparison to α, (5) conclusion in context.
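
The three p-value formulas can be checked numerically. The sketch below is one possible way to do it in Python using only the standard library; the function name binom_p_value and the labels "greater", "less" and "two-sided" are assumptions of this sketch, not notation from the syllabus. It reproduces the coin example (15 heads in 20 tosses).

```python
from math import comb


def binom_p_value(x_obs, n, p0, alternative):
    """p-value for H0: p = p0 with X ~ B(n, p0) and observed count x_obs."""
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    upper = sum(pmf[x_obs:])          # P(X >= x_obs)
    lower = sum(pmf[: x_obs + 1])     # P(X <= x_obs)
    if alternative == "greater":      # H1: p > p0
        return upper
    if alternative == "less":         # H1: p < p0
        return lower
    if alternative == "two-sided":    # H1: p != p0, doubled smaller tail, capped at 1
        return min(1.0, 2 * min(lower, upper))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


p_coin = binom_p_value(15, 20, 0.5, "greater")
print(round(p_coin, 4))               # about 0.0207, below alpha = 0.05
```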

Learn 4 — Normal Distribution Test (Testing a Mean)

When the population variance σ² is known and we have a sample of size n, the sample mean X̄ follows a Normal distribution under H₀.

Distribution of X̄ Under H₀

X̄ ~ N(μ₀, σ²/n) under H₀: μ = μ₀

Test Statistic

We standardise to get a Z-score, which follows N(0, 1):

Z = (X̄ − μ₀) / (σ / √n) ~ N(0, 1) under H₀

The denominator σ/√n is the standard error of the mean — not σ itself.

Decision Rule Using Z

One-tail right (H₁: μ > μ₀): Reject H₀ if Z > z_α (e.g., z₀.₀₅ = 1.645)
One-tail left (H₁: μ < μ₀): Reject H₀ if Z < −z_α
Two-tail (H₁: μ ≠ μ₀): Reject H₀ if |Z| > z_{α/2} (e.g., z₀.₀₂₅ = 1.96)
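
The critical values quoted in these rules can be recovered from the inverse Normal CDF instead of a printed table. A minimal sketch using Python's statistics.NormalDist follows; the helper names z_critical and reject_h0 are hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha, two_tail=False):
    """Critical z-value: z_alpha for one tail, z_{alpha/2} for two tails."""
    tail_area = alpha / 2 if two_tail else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def reject_h0(z, alpha, alternative):
    """Apply the decision rule to an observed test statistic z."""
    if alternative == "greater":
        return z > z_critical(alpha)
    if alternative == "less":
        return z < -z_critical(alpha)
    if alternative == "two-sided":
        return abs(z) > z_critical(alpha, two_tail=True)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


print(round(z_critical(0.05), 3))                  # 1.645
print(round(z_critical(0.05, two_tail=True), 3))   # 1.96
print(reject_h0(2.0, 0.05, "greater"))             # True
```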

Worked Example

A machine produces bolts with mean length 50 mm and known standard deviation σ = 2 mm. A sample of n = 25 bolts gives x̄ = 50.8 mm. Test H₀: μ = 50 vs H₁: μ > 50 at 5%.

Under H₀: X̄ ~ N(50, 4/25) = N(50, 0.16), so SE = 2/√25 = 0.4

Test statistic: Z = (50.8 − 50) / 0.4 = 0.8 / 0.4 = 2.00

Critical value for 5% one-tail: z₀.₀₅ = 1.645

Since 2.00 > 1.645, Z falls in the critical region.
Conclusion: Reject H₀. Significant evidence at 5% that the mean bolt length exceeds 50 mm.

Key mistake: The standard error is σ/√n, not σ. Always divide σ by √n before computing Z.
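
As a check on the bolt example, the test statistic and its p-value can be computed directly. The function z_test below is a hypothetical helper written for this illustration; it assumes σ is known, as in the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alternative):
    """Return (Z, p-value) for a test of H0: mu = mu0 with known sigma."""
    se = sigma / sqrt(n)                  # standard error of the mean
    z = (x_bar - mu0) / se
    phi = NormalDist().cdf(z)             # P(Z <= z) under H0
    if alternative == "greater":
        p = 1 - phi
    elif alternative == "less":
        p = phi
    elif alternative == "two-sided":
        p = 2 * min(phi, 1 - phi)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p


z_bolts, p_bolts = z_test(50.8, 50, 2, 25, "greater")
print(round(z_bolts, 2), round(p_bolts, 4))   # Z = 2.0, p-value about 0.0228
```

The p-value route and the critical-value route agree: 0.0228 < 0.05 exactly when Z = 2.00 exceeds 1.645.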

Learn 5 — Type I and Type II Errors

A hypothesis test can make two distinct types of error. Understanding these is essential for designing tests and interpreting their conclusions.

Type I Error (False Positive)

Reject H₀ when H₀ is actually true.

P(Type I error) = α (the significance level)

Example: Concluding a fair coin is biased when it isn't.

Type II Error (False Negative)

Fail to reject H₀ when H₀ is actually false.

P(Type II error) = β

Example: Failing to detect that a coin is biased when it is.

The Power of a Test

Power = 1 − β = P(reject H₀ | H₀ is false)

A powerful test is one that is good at detecting a false H₀. We want high power (low β).

The Trade-off

Reducing α (making the test more stringent) moves the critical boundary further into the tail, making it harder to reject H₀. This reduces Type I errors but increases Type II errors — more genuine effects go undetected.

Smaller α → fewer Type I errors → more Type II errors (higher β, lower power)
Larger α → more Type I errors → fewer Type II errors (lower β, higher power)
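
The trade-off can be made concrete for a right-tail Normal test. The sketch below reuses the bolt setting (μ₀ = 50, σ = 2, n = 25) and assumes, purely for illustration, that the true mean is μ₁ = 51; the function name type_ii_right_tail is hypothetical. It uses β = Φ(z_α − (μ₁ − μ₀)/SE), which follows from evaluating P(X̄ ≤ μ₀ + z_α·SE) under the true mean μ₁.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_right_tail(mu0, mu1, sigma, n, alpha):
    """beta for H0: mu = mu0 vs H1: mu > mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    z_alpha = std_normal.inv_cdf(1 - alpha)
    # We fail to reject when X-bar <= mu0 + z_alpha * se; evaluate that under mean mu1.
    return std_normal.cdf(z_alpha - (mu1 - mu0) / se)


alphas = [0.10, 0.05, 0.01]
betas = [type_ii_right_tail(50, 51, 2, 25, a) for a in alphas]
for a, b in zip(alphas, betas):
    print(f"alpha = {a:.2f}  beta = {b:.3f}  power = {1 - b:.3f}")
```

As α shrinks down the list, the printed β grows and the power falls.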

Calculating P(Type II error)

To find β for a specific alternative value p₁ or μ₁:

Step 1: Find the critical region under H₀ (e.g., reject if X ≥ c).

Step 2: Under the specific alternative H₁ value, find P(X < c). This is P(Type II error) — the probability of not landing in the critical region even though H₁ is true.

Example: H₀: p = 0.5, H₁: p = 0.7, n = 10, α = 5%. The critical region is X ≥ 9, since P(X ≥ 9 | p = 0.5) ≈ 0.0107 ≤ 0.05 while P(X ≥ 8 | p = 0.5) ≈ 0.0547 > 0.05.
β = P(X < 9 | p = 0.7) = P(X ≤ 8 | B(10, 0.7)) ≈ 0.851
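
The two-step calculation for a right-tail region X ≥ c can be sketched as follows. The helper names (binom_cdf, region_size, type_ii_error) are made up for this illustration. For the setting n = 10, p₀ = 0.5, p₁ = 0.7 it reports, for the boundaries c = 8 and c = 9, both the probability of the region under H₀ and β under p = 0.7, so the effect of the choice of boundary is visible.

```python
from math import comb


def binom_cdf(c, n, p):
    """P(X <= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def region_size(c, n, p0):
    """P(X >= c | p = p0): probability of landing in the region when H0 is true."""
    return 1 - binom_cdf(c - 1, n, p0)


def type_ii_error(c, n, p1):
    """beta = P(X < c | p = p1): missing the region when the true value is p1."""
    return binom_cdf(c - 1, n, p1)


for c in (8, 9):
    print(f"X >= {c}: P(region | p = 0.5) = {region_size(c, 10, 0.5):.4f}, "
          f"beta at p = 0.7 = {type_ii_error(c, 10, 0.7):.4f}")
```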

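Exam tip: Type I = reject a true H₀ (probability = α for a continuous test statistic; for a discrete one such as the Binomial it is the actual significance level, which is at most α). Type II = miss a false H₀ (probability β, depends on the true parameter value).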

Worked Examples

Example 1 — Binomial One-Tail Test (Coin)

A coin is tossed 20 times. H₀: p = 0.5, H₁: p > 0.5. Observe 14 heads. Test at 5%.

Under H₀: X ~ B(20, 0.5)

p-value = P(X ≥ 14) = 1 − P(X ≤ 13)
From tables: P(X ≤ 13) ≈ 0.9423, so p-value ≈ 1 − 0.9423 = 0.0577

0.0577 > 0.05 → Fail to reject H₀. [M1A1]
Insufficient evidence at 5% that the coin is biased towards heads.

Example 2 — Binomial Two-Tail Test (Spinner)

A spinner is claimed to give P(red) = 1/3. In 30 spins, 14 reds are observed. Test H₀: p = 1/3, H₁: p ≠ 1/3 at 5%.

Under H₀: X ~ B(30, 1/3). Expected value = 10.

Since 14 > 10, we look at the right tail. p-value = 2 × P(X ≥ 14)
P(X ≥ 14) = 1 − P(X ≤ 13). Using tables: P(X ≤ 13) ≈ 0.9102
One-tail prob ≈ 0.0898. Two-tail p-value ≈ 2 × 0.0898 = 0.180

0.180 > 0.05 → Fail to reject H₀. [M1A1]
Insufficient evidence that the probability of red differs from 1/3.

Example 3 — Normal Test (One-Tail, Bolts)

Machine mean μ = 50 mm, σ = 2 mm, n = 25, x̄ = 50.8. Test H₀: μ = 50, H₁: μ > 50 at 5%.

SE = σ/√n = 2/5 = 0.4

Z = (50.8 − 50)/0.4 = 2.00

Critical value: z₀.₀₅ = 1.645. Since 2.00 > 1.645, reject H₀. [M1A1]
Significant evidence that mean bolt length exceeds 50 mm.

Example 4 — Finding Critical Region (Binomial)

H₀: p = 0.3, H₁: p < 0.3, n = 20, α = 10%.

We need largest c such that P(X ≤ c) ≤ 0.10 under X ~ B(20, 0.3).
P(X ≤ 2) = P(0) + P(1) + P(2)
P(0) = 0.7²⁰ ≈ 0.0008, P(1) ≈ 0.0068, P(2) ≈ 0.0278. Sum ≈ 0.0355 ≤ 0.10 ✓
P(X ≤ 3): add P(3) ≈ 0.0716. Sum ≈ 0.107 > 0.10 ✗

Critical region: X ≤ 2 [A1]
Actual significance level = P(X ≤ 2 | p = 0.3) ≈ 3.55%

Example 5 — Normal Two-Tail Test

H₀: μ = 100, H₁: μ ≠ 100, σ = 15, n = 36, x̄ = 104. Test at 1%.

SE = 15/√36 = 15/6 = 2.5

Z = (104 − 100)/2.5 = 1.60

Two-tail 1%: critical value z₀.₀₀₅ = 2.576. Since |1.60| < 2.576, fail to reject H₀. [M1A1]
Insufficient evidence at 1% that the mean differs from 100.

Example 6 — P(Type I Error)

A test is carried out at 5% significance level. What is the probability of a Type I error?

By definition, P(Type I error) = P(reject H₀ | H₀ true) = α = 0.05 [B1]

Example 7 — Critical Region for Binomial (Left-tail, 5%)

H₀: p = 0.4, H₁: p < 0.4, n = 10. Find critical region at 5%.

X ~ B(10, 0.4) under H₀.
P(X ≤ 0) = 0.6¹⁰ ≈ 0.006 ≤ 0.05 ✓
P(X ≤ 1) ≈ 0.006 + 10 × 0.4 × 0.6⁹ ≈ 0.006 + 0.040 = 0.046 ≤ 0.05 ✓
P(X ≤ 2) ≈ 0.046 + 45 × 0.16 × 0.6⁸ ≈ 0.046 + 0.121 = 0.167 > 0.05 ✗

Critical region: X ≤ 1 [A1]

Example 8 — Normal One-Tail Test (Scores)

Test scores have claimed mean 70. Researcher suspects the true mean is higher. n = 50, x̄ = 72, σ = 10. Test at 5%.

H₀: μ = 70 H₁: μ > 70 (one-tail right)

SE = 10/√50 = 10/7.071 ≈ 1.414

Z = (72 − 70)/1.414 ≈ 1.414

Critical value: 1.645. Since 1.414 < 1.645, fail to reject H₀. [M1A1]
Insufficient evidence at 5% that mean score exceeds 70.

Common Mistakes

1. Confusing H₀ and H₁

✗ H₀: p > 0.5 (wrong — H₀ cannot contain a strict inequality)

✓ H₀: p = 0.5 H₁: p > 0.5 (H₀ always contains the equals sign)

2. Wrong tail probability for two-tail test

✗ Two-tail at 5%: compare p-value to 0.05 after computing one-tail probability

✓ Either compare the one-tail probability to 0.025 (α/2), or double the one-tail probability and compare to 0.05

3. Writing "Accept H₀"

✗ "We accept H₀" or "H₀ is true"

✓ "We fail to reject H₀" or "There is insufficient evidence to reject H₀ at the 5% level"

4. Mixing up Type I and Type II errors

✗ Type I = failing to detect a true effect; Type II = wrongly concluding there is an effect

✓ Type I = rejecting a TRUE H₀ (false positive), P = α. Type II = failing to reject a FALSE H₀ (false negative), P = β

5. Stating the critical region incorrectly

✗ "The critical region is 8" or "X = 8"

✓ "The critical region is X ≥ 8" or "X ≤ 2" — always state as an inequality giving a set of values

6. Using both tails in a one-tail p-value

✗ H₁: p > 0.5 → p-value = P(X ≥ x) + P(X ≤ n − x)

✓ For a one-tail test, the p-value involves ONE tail only: P(X ≥ x) for H₁: p > p₀, or P(X ≤ x) for H₁: p < p₀

7. Forgetting √n in the Normal test

✗ Z = (x̄ − μ₀) / σ (dividing by σ, not the standard error)

✓ Z = (x̄ − μ₀) / (σ/√n) — the denominator is the standard error of the mean, σ divided by √n

Key Formulas

Concept	Formula / Statement
Null hypothesis	H₀: p = p₀ or H₀: μ = μ₀ (always contains = )
Alternative hypothesis	H₁: p > p₀ (one-tail right) | H₁: p < p₀ (one-tail left) | H₁: p ≠ p₀ (two-tail)
Significance level	α = P(Type I error) = P(reject H₀ | H₀ true). Common: 1%, 5%, 10%
p-value	P(result at least as extreme as observed | H₀ true). Reject H₀ if p-value ≤ α
Binomial test statistic	X ~ B(n, p₀) under H₀. P(X ≥ x) or P(X ≤ x) from tables/formula
Binomial probability	P(X = k) = C(n,k) · p₀ᵏ · (1 − p₀)ⁿ⁻ᵏ
Normal test statistic	Z = (X̄ − μ₀) / (σ/√n) ~ N(0,1) under H₀
Standard error	SE(X̄) = σ/√n (population SD divided by √ sample size)
Critical values (N(0,1))	One-tail 5%: 1.645 | Two-tail 5%: 1.960 | One-tail 1%: 2.326 | Two-tail 1%: 2.576
Type I error	Reject H₀ when H₀ is true. P(Type I) = α
Type II error	Fail to reject H₀ when H₀ is false. P(Type II) = β
Power	1 − β = P(reject H₀ | H₀ false)
Critical region	Set of test statistic values for which H₀ is rejected (e.g., X ≥ c or Z > z_α)
Two-tail critical region	X ≤ c₁ or X ≥ c₂, where each tail has probability ≤ α/2

Proof Bank

Proof 1 — Why P(Type I error) = α by construction

The significance level α is defined as the maximum probability of rejecting H₀ when H₀ is true. The critical region C is chosen precisely so that:

P(X ∈ C | H₀ true) ≤ α

For a continuous test statistic (like Z ~ N(0,1)), we choose C = {Z > z_α} where z_α is defined such that P(Z > z_α) = α. Therefore P(reject H₀ | H₀ true) = P(Z > z_α | Z ~ N(0,1)) = α exactly.

For a discrete test statistic (Binomial) in a right-tail test, we choose the smallest critical value c such that P(X ≥ c | H₀) ≤ α, which gives the largest admissible critical region. The actual significance level is P(X ≥ c | H₀), which may be strictly less than α due to discreteness. The critical region is still constructed so that Type I error probability does not exceed the stated α.

This is why P(Type I error) = α is not a coincidence — it is the definition of how the critical region is chosen.

Proof 2 — Derivation of the Normal Test Statistic Z = (X̄ − μ₀)/(σ/√n)

Given: X₁, X₂, …, Xₙ are i.i.d. (independent, identically distributed) with mean μ and known variance σ².

Step 1: Distribution of the sum.
By linearity of expectation: E(X₁ + X₂ + … + Xₙ) = nμ
By independence: Var(X₁ + … + Xₙ) = nσ²

Step 2: Distribution of the sample mean.
X̄ = (X₁ + … + Xₙ)/n
E(X̄) = μ
Var(X̄) = σ²/n (variance scales as 1/n² × nσ² = σ²/n)
So X̄ ~ N(μ, σ²/n) (by the Central Limit Theorem, or exactly if Xᵢ are Normal).

Step 3: Standardise under H₀ (μ = μ₀).
Under H₀: X̄ ~ N(μ₀, σ²/n).
Subtracting the mean and dividing by the standard deviation:

Z = (X̄ − μ₀) / √(σ²/n) = (X̄ − μ₀) / (σ/√n) ~ N(0, 1)

This Z is the test statistic. Values far from 0 (in the relevant tail) give evidence against H₀.
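
The derivation can be sanity-checked by simulation. The sketch below is illustrative only: it assumes Normal observations with μ₀ = 50, σ = 2 and n = 25 (the bolt setting), draws 5000 samples with a fixed seed, and compares the empirical mean and variance of X̄ and of Z with the theoretical values μ₀, σ²/n, 0 and 1. The function name simulate_sample_means is hypothetical.

```python
import random
from math import sqrt
from statistics import fmean, pvariance


def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2) and return their means."""
    rng = random.Random(seed)             # fixed seed so the run is repeatable
    return [fmean(rng.gauss(mu, sigma) for _ in range(n)) for _ in range(reps)]


mu0, sigma, n = 50.0, 2.0, 25
means = simulate_sample_means(mu0, sigma, n, reps=5000)

# Standardise each simulated mean exactly as in Step 3.
z_scores = [(m - mu0) / (sigma / sqrt(n)) for m in means]

print("mean of X-bar:", fmean(means), "theory:", mu0)
print("var of X-bar: ", pvariance(means), "theory:", sigma**2 / n)
print("mean of Z:    ", fmean(z_scores), "theory: 0")
print("var of Z:     ", pvariance(z_scores), "theory: 1")
```

The empirical values should sit close to the theoretical ones, with small differences due to sampling noise.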

Exercise 1 — Setting Up Hypotheses (10 Questions)

Exercise 2 — Binomial Hypothesis Tests (10 Questions)

Exercise 3 — Normal Distribution Tests (10 Questions)

Exercise 4 — Type I and Type II Errors (10 Questions)

Exercise 5 — Critical Regions (10 Questions)

Practice (30 Questions)

Challenge (15 Questions)

Exam Style Questions

Question 1 [6 marks]

A supermarket claims that 40% of customers use self-checkout. A manager suspects the true proportion is lower. She surveys 15 randomly selected customers and finds that 3 used self-checkout.

(i) Write down H₀ and H₁, defining the parameter p. [1]
(ii) Using a 5% significance level, carry out a hypothesis test and state your conclusion in context. [4]
(iii) What is the probability of a Type I error in this test? [1]

(i) p = proportion of customers using self-checkout. H₀: p = 0.4, H₁: p < 0.4 [B1]

(ii) Under H₀: X ~ B(15, 0.4). p-value = P(X ≤ 3).
P(0) = 0.6¹⁵ ≈ 0.000470, P(1) ≈ 0.004702, P(2) ≈ 0.021942, P(3) ≈ 0.063388 [M1]
P(X ≤ 3) ≈ 0.0905 [A1]
0.0906 > 0.05, so fail to reject H₀. [M1]
Insufficient evidence at 5% that fewer than 40% of customers use self-checkout. [A1]

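(iii) The 5% critical region is X ≤ 2, since P(X ≤ 2) ≈ 0.0271 ≤ 0.05 < P(X ≤ 3). So P(Type I error) = P(X ≤ 2 | p = 0.4) ≈ 0.0271, the actual significance level (at most α = 0.05). [B1]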

Question 2 [7 marks]

The heights of adult males in a country are known to be Normally distributed with standard deviation 8 cm. A researcher believes the mean height μ has changed from the historical value of 175 cm. She measures a random sample of 64 males and finds a sample mean of 177.2 cm.

(i) State H₀ and H₁ for an appropriate test. [1]
(ii) Calculate the test statistic Z. [2]
(iii) State the critical region for a 5% significance level and conclude the test. [2]
(iv) Find the p-value for this test. [2]

(i) H₀: μ = 175, H₁: μ ≠ 175 (two-tail, as she suspects it has "changed") [B1]

(ii) SE = 8/√64 = 8/8 = 1 [M1]
Z = (177.2 − 175)/1 = 2.2 [A1]

(iii) Two-tail 5%: critical region |Z| > 1.960. Since |2.2| = 2.2 > 1.960, reject H₀. [B1, A1]
Significant evidence at 5% that the mean height has changed from 175 cm.

(iv) p-value = 2 × P(Z > 2.2) = 2 × (1 − Φ(2.2)) = 2 × 0.0139 = 0.0278 [M1A1]

Question 3 [5 marks]

A hypothesis test uses H₀: p = 0.3 and H₁: p > 0.3 with n = 20 and a 5% significance level.

(i) Find the critical region for this test. [3]
(ii) State the actual significance level of the test. [1]
(iii) If in fact p = 0.5, find P(Type II error). [1]

(i) X ~ B(20, 0.3). Need smallest c with P(X ≥ c) ≤ 0.05.
P(X ≥ 9) = 1 − P(X ≤ 8) ≈ 1 − 0.8867 = 0.1133 > 0.05
P(X ≥ 10) = 1 − P(X ≤ 9) ≈ 1 − 0.9520 = 0.0480 ≤ 0.05 ✓ [M1A1]
Critical region: X ≥ 10 [A1]

(ii) Actual significance level = P(X ≥ 10 | p = 0.3) ≈ 0.0480 = 4.80% [B1]

(iii) P(Type II error) = P(X < 10 | p = 0.5) = P(X ≤ 9 | B(20, 0.5)) ≈ 0.4119 [B1]

Question 4 [4 marks]

Explain the difference between a Type I error and a Type II error in the context of a test where H₀ states that a new drug has no effect.

Type I error: Concluding the drug has an effect when in fact it does not. [B1]
Probability of Type I error = α (the significance level). [B1]
Type II error: Failing to conclude that the drug has an effect when in fact it does have one. [B1]
A more stringent significance level (smaller α) reduces Type I errors but increases Type II errors. [B1]

Question 5 [6 marks]

A random variable X ~ B(25, p). It is required to test H₀: p = 0.2 against H₁: p ≠ 0.2 at the 10% significance level. The observed value is X = 9.

(i) Find the p-value for this test. [3]
(ii) State your conclusion. [2]
(iii) State the critical region for this two-tail test. [1]

(i) X ~ B(25, 0.2). Expected = 5. Since 9 > 5, we look at the right tail.
P(X ≥ 9) = 1 − P(X ≤ 8) ≈ 1 − 0.9532 = 0.0468 [M1A1]
Two-tail p-value = 2 × 0.0468 = 0.0936 [A1]

(ii) 0.0936 < 0.10, so reject H₀. [M1]
Significant evidence at 10% that p ≠ 0.2. [A1]

(iii) Critical region: X ≤ 1 or X ≥ 9 [B1]
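
The two-tail critical region in part (iii) can be found by applying the tail-cumulating rule to each tail with α/2. A minimal sketch, with hypothetical function names, assuming the convention that each tail may hold at most α/2:

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_tail_critical_region(n, p0, alpha):
    """Return (c1, c2): reject if X <= c1 or X >= c2, each tail at most alpha/2.

    Either bound is None when even the most extreme value exceeds alpha/2.
    """
    half = alpha / 2
    pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]

    c1, total = None, 0.0
    for k in range(n + 1):                # lower tail, cumulating upwards
        total += pmf[k]
        if total > half:
            break
        c1 = k

    c2, total = None, 0.0
    for k in range(n, -1, -1):            # upper tail, cumulating downwards
        total += pmf[k]
        if total > half:
            break
        c2 = k
    return c1, c2


c1, c2 = two_tail_critical_region(25, 0.2, 0.10)
print(f"reject if X <= {c1} or X >= {c2}")   # reject if X <= 1 or X >= 9
```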

Question 6 [5 marks]

A factory claims that the mean weight of a product is 500 g with standard deviation 12 g. A quality inspector takes a sample of 36 items and finds a sample mean of 496 g. Test at the 1% level whether the mean weight is less than 500 g.

H₀: μ = 500, H₁: μ < 500 (one-tail left) [B1]
SE = 12/√36 = 12/6 = 2 [M1]
Z = (496 − 500)/2 = −4/2 = −2.00 [A1]
Critical value for 1% one-tail (left): −2.326. Since −2.00 > −2.326, fail to reject H₀. [M1]
Insufficient evidence at 1% that the mean weight is less than 500 g. [A1]

Question 7 [4 marks]

A test has critical region X ≤ 2 where X ~ B(12, p). Given H₀: p = 0.35 and H₁: p < 0.35, find the actual significance level and P(Type II error) when p = 0.2.

Actual significance = P(X ≤ 2 | p = 0.35, n = 12)
P(0) = 0.65¹² ≈ 0.00569, P(1) ≈ 0.03675, P(2) ≈ 0.10885 [M1]
P(X ≤ 2) ≈ 0.151 → actual significance level ≈ 15.1% [A1]

P(Type II | p = 0.2) = P(X > 2 | B(12, 0.2)) = 1 − P(X ≤ 2 | B(12, 0.2))
P(X ≤ 2 | B(12, 0.2)) ≈ 0.5583 → P(Type II) ≈ 1 − 0.5583 = 0.442 [M1A1]

Question 8 [5 marks]

The lifetime T (in hours) of a type of battery is Normally distributed with σ = 5 hours. The manufacturer claims μ = 60 hours. A consumer group tests a random sample of 25 batteries and obtains x̄ = 57.8 hours. Test at the 5% level whether the mean lifetime is less than claimed.

H₀: μ = 60, H₁: μ < 60 [B1]
SE = 5/√25 = 5/5 = 1 [M1]
Z = (57.8 − 60)/1 = −2.2 [A1]
Critical value: −1.645 (one-tail left, 5%). Since −2.2 < −1.645, reject H₀. [M1]
Significant evidence at 5% that mean battery lifetime is less than 60 hours. [A1]

Past Paper Questions (Cambridge A-Level Style)

Past Paper 1 — 9709/62/O/N/19 style [6 marks]

A bag contains a large number of discs. The manufacturer states that 30% of the discs are red. James thinks the proportion of red discs is less than 30%. James takes a random sample of 20 discs and finds 3 are red.

(i) Test James's claim at the 5% significance level. [5]
(ii) Write down the probability of a Type I error. [1]

(i) H₀: p = 0.3, H₁: p < 0.3 where p = proportion of red discs [B1]
X ~ B(20, 0.3) under H₀. p-value = P(X ≤ 3) [M1]
P(X ≤ 3) = P(0)+P(1)+P(2)+P(3) ≈ 0.001 + 0.007 + 0.028 + 0.072 = 0.107 [A1]
0.107 > 0.05, fail to reject H₀ [M1]
Insufficient evidence at 5% that less than 30% of discs are red [A1]

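(ii) The 5% critical region is X ≤ 2, so P(Type I error) = P(X ≤ 2 | p = 0.3) ≈ 0.0355, the actual significance level of the test. [B1]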

Past Paper 2 — 9709/62/M/J/18 style [7 marks]

The random variable X has distribution B(n, p). A single observation x is used to test H₀: p = 0.45 against H₁: p < 0.45. With n = 20, the critical region is X ≤ 5.

(i) Find the actual significance level of the test. [2]
(ii) Find P(Type II error) when p = 0.3. [2]
(iii) State the effect on P(Type II error) if the significance level is increased. [1]
(iv) The observation is x = 4. State your conclusion. [2]

(i) P(X ≤ 5 | B(20, 0.45)) ≈ 0.0553. Actual significance level ≈ 5.53% [M1A1]

(ii) P(Type II | p = 0.3) = P(X > 5 | B(20, 0.3)) = 1 − P(X ≤ 5 | B(20, 0.3))
P(X ≤ 5 | B(20, 0.3)) ≈ 0.4164 → P(Type II) ≈ 0.584 [M1A1]

(iii) Increasing significance level moves the critical boundary (e.g., X ≤ 6), making it easier to reject H₀, so P(Type II error) decreases. [B1]

(iv) x = 4 ≤ 5, so x is in the critical region. Reject H₀. [M1]
Significant evidence that p < 0.45. [A1]
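
The claim in part (iii) can be illustrated numerically by comparing the given region X ≤ 5 with the wider region X ≤ 6. The sketch below uses a hypothetical helper left_tail_summary and takes the wider boundary X ≤ 6 as an illustrative choice, as in the solution.

```python
from math import comb


def binom_cdf(c, n, p):
    """P(X <= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def left_tail_summary(c, n, p0, p1):
    """For the region X <= c, return (actual significance level, beta at p = p1)."""
    level = binom_cdf(c, n, p0)           # P(reject H0 | p = p0)
    beta = 1 - binom_cdf(c, n, p1)        # P(X > c | p = p1)
    return level, beta


level5, beta5 = left_tail_summary(5, 20, 0.45, 0.3)
level6, beta6 = left_tail_summary(6, 20, 0.45, 0.3)
print(f"X <= 5: level = {level5:.4f}, beta = {beta5:.4f}")
print(f"X <= 6: level = {level6:.4f}, beta = {beta6:.4f}")
```

Widening the region raises the actual significance level and lowers β, in line with the stated answer.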

Past Paper 3 — 9709/63/O/N/20 style [6 marks]

The masses of apples in an orchard have been Normally distributed for many years with mean 185 g and standard deviation 22 g. Following a change in growing conditions, a farmer believes the mean mass has increased. He takes a random sample of 30 apples and finds the mean mass is 191 g.

Carry out a hypothesis test at the 10% significance level. State your hypotheses and conclusion clearly. [6]

H₀: μ = 185, H₁: μ > 185 [B1]
SE = 22/√30 ≈ 4.017 [M1]
Z = (191 − 185)/4.017 ≈ 1.494 [A1]
Critical value for 10% one-tail: 1.282. Since 1.494 > 1.282, reject H₀. [M1A1]
Significant evidence at 10% that the mean mass of apples has increased following the change in conditions. [A1]

Past Paper 4 — 9709/62/O/N/21 style [5 marks]

A teacher claims that students score an average of 65% on a test. A student believes the actual mean is different. She collects data from a random sample of 40 students and calculates a sample mean of 62.4%. The population standard deviation is known to be 9%.

Test the student's belief at the 5% level. [5]

H₀: μ = 65, H₁: μ ≠ 65 (two-tail, "different") [B1]
SE = 9/√40 ≈ 1.423 [M1]
Z = (62.4 − 65)/1.423 ≈ −1.827 [A1]
Two-tail 5%: |Z| must exceed 1.960. |−1.827| = 1.827 < 1.960, fail to reject H₀. [M1]
Insufficient evidence at 5% that the mean score differs from 65%. [A1]

Past Paper 5 — 9709/61/M/J/22 style [7 marks]

In a large town, it is claimed that 55% of households recycle regularly. A council member suspects the true proportion is higher. She surveys a random sample of 18 households; let X be the number that recycle regularly.

(i) State suitable hypotheses. [1]
(ii) Find the critical region for a test at the 5% significance level. [3]
(iii) The council member finds 14 households that recycle. State and justify the conclusion. [2]
(iv) State the probability of a Type I error using your critical region. [1]

(i) H₀: p = 0.55, H₁: p > 0.55 where p = proportion of households that recycle [B1]

(ii) X ~ B(18, 0.55). Need P(X ≥ c) ≤ 0.05.
P(X ≥ 13) = 1 − P(X ≤ 12) ≈ 1 − 0.8923 = 0.1077 > 0.05
P(X ≥ 14) = 1 − P(X ≤ 13) ≈ 1 − 0.9589 = 0.0411 ≤ 0.05 ✓ [M1A1]
Critical region: X ≥ 14 [A1]

(iii) x = 14 ≥ 14, so x is in the critical region. Reject H₀. [M1]
Significant evidence at 5% that more than 55% of households recycle regularly. [A1]

(iv) P(Type I error) = P(X ≥ 14 | p = 0.55) ≈ 0.0411 [B1]
