Cambridge Syllabus Notes

Algorithm Design and Problem‑Solving: Suggesting and Using Test Data

Why Test Data Is Essential 🎯

When you design an algorithm, you’re essentially writing a recipe. Just as a chef needs to try a dish with different ingredients to ensure it tastes good, you need to run your algorithm with a variety of input data to make sure it behaves correctly, efficiently, and robustly. Good test data helps you:

Verify correctness (does it give the right answer?)
Check performance (does it run fast enough?)
Detect edge‑case bugs (does it handle the smallest or largest inputs?)
Ensure reliability (does it handle unexpected or invalid data?)

Types of Test Data 📊

Typical (average) data – represents what users normally provide.
Boundary data – tests the limits of input ranges (e.g., minimum, maximum, just inside/outside limits).
Extreme data – the largest and smallest values the algorithm must still accept; very large inputs also test scalability.
Random data – to catch hidden bugs that only appear with unpredictable patterns.
Error data – invalid or malformed inputs to test robustness.
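
As a minimal sketch of how three of these categories (typical, boundary, and error) turn into concrete values, assume a hypothetical routine `is_valid_mark` that accepts whole-number marks from 0 to 100. The function name and the range are illustrative assumptions, not part of the syllabus.

```python
def is_valid_mark(value):
    """Return True if value is a whole number from 0 to 100 inclusive.

    Hypothetical validation rule, used only to illustrate test data categories.
    """
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value <= 100


# Each entry: (category, input, expected result)
test_data = [
    ("typical", 57, True),
    ("boundary, just inside lower limit", 0, True),
    ("boundary, just outside lower limit", -1, False),
    ("boundary, just inside upper limit", 100, True),
    ("boundary, just outside upper limit", 101, False),
    ("error, wrong type", "fifty", False),
    ("error, missing value", None, False),
]

# Compare the actual result with the expected result for every case
results = [(category, is_valid_mark(value) == expected)
           for category, value, expected in test_data]

for category, passed in results:
    print(f"{category}: {'pass' if passed else 'FAIL'}")
```

Each boundary is tested with a pair of values, one that should be accepted and one that should be rejected, which is what catches off-by-one mistakes such as writing `<` instead of `<=`.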

How to Choose Test Data for an Algorithm

1. **Understand the problem specification** – note input size limits, data types, and required outputs.
2. **Identify critical values** – e.g., if the algorithm uses a loop from 1 to n, test with n = 0, 1, 2, 1000.
3. **Create a test data matrix** – list each test case with its purpose and expected outcome.
4. **Automate where possible** – use scripts to generate random or boundary data sets.
5. **Document each test case** – include the input, expected output, and reasoning.
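
The steps above can be sketched in code. The example below assumes a hypothetical function `sum_to_n` that uses a loop from 1 to n, stores the test data matrix as a list of dictionaries, and uses a small helper, `boundary_values`, to generate boundary data. All names are illustrative assumptions.

```python
import random


def sum_to_n(n):
    """Add the integers 1..n using a loop; returns 0 when n < 1."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def boundary_values(low, high):
    """Values just outside, on, and just inside each end of [low, high]."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


# Test data matrix: input, expected output, and purpose of each case
test_matrix = [
    {"n": 0, "expected": 0, "purpose": "loop body never runs"},
    {"n": 1, "expected": 1, "purpose": "loop runs exactly once"},
    {"n": 2, "expected": 3, "purpose": "smallest case with repetition"},
    {"n": 1000, "expected": 500500, "purpose": "large typical value"},
]

for case in test_matrix:
    actual = sum_to_n(case["n"])
    status = "pass" if actual == case["expected"] else "FAIL"
    print(f"n={case['n']}: expected {case['expected']}, got {actual}, "
          f"{status} ({case['purpose']})")

# Automated data: a fixed seed keeps the random cases repeatable
rng = random.Random(1)
random_inputs = [rng.randint(0, 1000) for _ in range(5)]
for n in random_inputs:
    # The closed formula n(n+1)/2 gives an independent expected value
    assert sum_to_n(n) == n * (n + 1) // 2

print(boundary_values(1, 1000))
```

Seeding the random generator is a deliberate choice: if a random case ever fails, the same data can be regenerated to reproduce the failure.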

Example: Sorting Algorithm Test Data

Suppose we design a quicksort algorithm that sorts an array of integers. Here’s a concise test data table:

Test Case	Input Array	Expected Output	Purpose
1	[3, 1, 4, 1, 5]	[1, 1, 3, 4, 5]	Typical data
2	[]	[]	Empty array (boundary)
3	[1, 2, 3, 4, 5]	[1, 2, 3, 4, 5]	Already sorted (performance check; slow case if the pivot is the first or last element)
4	[5, 4, 3, 2, 1]	[1, 2, 3, 4, 5]	Reverse order (worst case if the pivot is the first or last element)
5	[2, 2, 2, 2]	[2, 2, 2, 2]	All equal (edge case)
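
The table above can be run automatically. The sketch below is one possible quicksort, not necessarily the version you will write: it assumes a middle-element pivot and returns a new list instead of sorting in place. The names `quicksort`, `sort_cases`, and `run_sort_tests` are illustrative.

```python
def quicksort(items):
    """Return a new sorted list (simple, non-in-place quicksort)."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]  # assumption: middle element as pivot
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


# (input array, expected output, purpose), taken from the table above
sort_cases = [
    ([3, 1, 4, 1, 5], [1, 1, 3, 4, 5], "typical data"),
    ([], [], "empty array"),
    ([1, 2, 3, 4, 5], [1, 2, 3, 4, 5], "already sorted"),
    ([5, 4, 3, 2, 1], [1, 2, 3, 4, 5], "reverse order"),
    ([2, 2, 2, 2], [2, 2, 2, 2], "all equal"),
]


def run_sort_tests(sort_function, cases):
    """Run every case and return a list describing any failures."""
    failures = []
    for data, expected, purpose in cases:
        actual = sort_function(list(data))  # copy so the test data is never changed
        if actual != expected:
            failures.append((purpose, data, expected, actual))
    return failures


failures = run_sort_tests(quicksort, sort_cases)
print("all tests passed" if not failures else failures)
```

Grouping the values equal to the pivot into their own list is what keeps the all-equal case from recursing on the same data forever.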

Example: Binary Search Test Data

Binary search works on a sorted array. Test data should include cases where the target is in the middle, at either end, and not in the array at all.

Test Case	Array	Target	Expected Index	Purpose
1	[10, 20, 30, 40, 50]	30	2	Target in middle
2	[10, 20, 30, 40, 50]	10	0	Target at start
3	[10, 20, 30, 40, 50]	50	4	Target at end
4	[10, 20, 30, 40, 50]	35	-1	Target not present
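
A minimal sketch of an iterative binary search, checked against the four rows of the table. Returning -1 for a missing target follows the table. The extra cases at the end (empty array, target below and above the range) are additions for illustration, not part of the original table.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


data = [10, 20, 30, 40, 50]

# Cases from the table
assert binary_search(data, 30) == 2    # target in middle
assert binary_search(data, 10) == 0    # target at start
assert binary_search(data, 50) == 4    # target at end
assert binary_search(data, 35) == -1   # target not present

# Additional boundary cases (illustrative additions)
assert binary_search([], 10) == -1     # empty array
assert binary_search(data, 5) == -1    # below the smallest value
assert binary_search(data, 55) == -1   # above the largest value

print("all binary search tests passed")
```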

Testing a More Complex Problem: Maximum Subarray

The maximum subarray problem asks for the contiguous sub‑array with the greatest sum. Test data should cover positive numbers, negative numbers, and a mix of both.

All positives: e.g. [1, 2, 3, 4] → sum = 10.
All negatives: e.g. [-1, -2, -3] → best sum = -1 (the single element [-1], assuming the sub‑array must contain at least one element).
Mixed: e.g. [-2, 1, -3, 4, -1, 2, 1, -5, 4] → sum = 6 (sub‑array [4, -1, 2, 1]).
Single element: [42] → sum = 42.
Large array: generate 10,000 random integers between -1000 and 1000 to test performance.
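
The cases above can be checked with a short sketch. It uses Kadane's algorithm, one common linear-time approach that is chosen here for illustration and not prescribed by these notes, and assumes the sub-array must be non-empty. A slower brute-force version supplies independent expected values for random data. The function names are illustrative.

```python
import random
import time


def max_subarray_sum(values):
    """Kadane's algorithm: best sum of a non-empty contiguous sub-array."""
    if not values:
        raise ValueError("values must contain at least one element")
    best = current = values[0]
    for x in values[1:]:
        current = max(x, current + x)  # extend the current run or start afresh at x
        best = max(best, current)
    return best


def brute_force_max_sum(values):
    """Try every start and end position; slow but easy to trust."""
    best = values[0]
    for start in range(len(values)):
        total = 0
        for end in range(start, len(values)):
            total += values[end]
            best = max(best, total)
    return best


# Hand-worked cases from the list above
assert max_subarray_sum([1, 2, 3, 4]) == 10
assert max_subarray_sum([-1, -2, -3]) == -1
assert max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4]) == 6
assert max_subarray_sum([42]) == 42

# Random data: compare the fast version with the brute-force version
rng = random.Random(42)  # fixed seed so any failure can be reproduced
for _ in range(200):
    small = [rng.randint(-10, 10) for _ in range(rng.randint(1, 8))]
    assert max_subarray_sum(small) == brute_force_max_sum(small)

# Large array: 10,000 random integers between -1000 and 1000
large = [rng.randint(-1000, 1000) for _ in range(10_000)]
start = time.perf_counter()
max_subarray_sum(large)
print(f"10,000 elements took {time.perf_counter() - start:.4f} s")
```

For the large array there is no hand-worked expected answer, so the test only measures running time. Correctness is covered by the smaller cases.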

Practical Tips for Students 🧪

Write a test harness that automatically runs your algorithm against all test cases and reports failures.
Use assertions to check that the output matches the expected result.
Keep a log of test case names, inputs, outputs, and whether they passed.
When a test fails, debug incrementally – start with the simplest failing case.
Remember that time complexity matters: for large n, test with n = 10^5 or more to see if your algorithm scales.
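
A minimal sketch of a harness that follows these tips: it runs named test cases, keeps a log, lists failures with the simplest first, and times the algorithm on growing inputs. Python's built-in `sorted` stands in for the algorithm under test. The names `run_tests` and `time_call` are illustrative assumptions.

```python
import random
import time


def run_tests(function, cases):
    """Run each named case and return a log of dictionaries."""
    log = []
    for name, argument, expected in cases:
        actual = function(argument)
        log.append({
            "name": name,
            "input": argument,
            "output": actual,
            "passed": actual == expected,
        })
    return log


def time_call(function, argument):
    """Return the time in seconds taken by one call to function(argument)."""
    start = time.perf_counter()
    function(argument)
    return time.perf_counter() - start


cases = [
    ("typical", [3, 1, 4, 1, 5], [1, 1, 3, 4, 5]),
    ("empty", [], []),
    ("single element", [42], [42]),
]

log = run_tests(sorted, cases)  # sorted is a stand-in for your own algorithm
for entry in log:
    print(f"{entry['name']}: {'pass' if entry['passed'] else 'FAIL'}")

# Put the smallest failing input first so debugging starts with the simplest case
failures = [entry for entry in log if not entry["passed"]]
failures.sort(key=lambda entry: len(entry["input"]))

# Scaling check: time the algorithm as n grows
rng = random.Random(0)
for n in (10**3, 10**4, 10**5):
    data = [rng.randint(-1000, 1000) for _ in range(n)]
    print(f"n={n}: {time_call(sorted, data):.4f} s")
```

If the time grows roughly tenfold each time n grows tenfold, the algorithm is scaling close to linearly. A hundredfold jump suggests quadratic behaviour.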

Summary of Key Points

Test data is like the ingredients for a recipe – you need variety to ensure a good result.
Cover typical, boundary, extreme, random, and error cases.
Document each test case: input, expected output, and purpose.
Automate testing where possible to save time and reduce human error.
Use test data to guide algorithm design: if you know the worst‑case input, you can optimise accordingly.
