Cambridge Syllabus Notes

13.3 Floating‑point Numbers, Representation and Manipulation

Floating‑point numbers let computers store real numbers (like 3.1415) in a compact binary format. Think of it as a digital version of a ruler that can stretch (float) to show numbers of any size, from tiny fractions to huge values, while keeping a fixed number of digits.

What is a floating‑point number?

A floating‑point number is written as sign × mantissa × 2^exponent.
🔢 Example: 6.5 = +1.625 × 2² (mantissa 1.625, exponent 2, sign +).

Binary floating‑point format (IEEE 754)

The standard format splits the bits into three parts:

Sign bit – 0 for positive, 1 for negative.
Exponent field – stores the exponent + a bias.
Mantissa (fraction) field – stores the significant digits.

The bias is chosen so that the exponent can be stored as an unsigned integer. For single precision (32 bits): bias = 127. For double precision (64 bits): bias = 1023.

Bits	Field	Description
1	Sign	0 = +, 1 = –
8	Exponent	Exponent + bias
23	Mantissa	Fractional part (leading 1 is implicit)

Encoding an example: 6.5 in single precision

1. Convert 6.5 to binary: 110.1₂.
2. Normalise: 1.101 × 2².
3. Sign bit = 0 (positive).
4. Exponent = 2 + bias(127) = 129 → 10000001₂.
5. Mantissa = 101 followed by 20 zeros → 101000000000000000000.

Final 32‑bit pattern: 0 10000001 10100000000000000000000.
🔢 In hex: 0x40D00000.

Key concepts to remember

Floating‑point numbers are stored as sign × mantissa × 2^exponent.
The exponent is stored with a bias to allow negative exponents.
The mantissa has an implicit leading 1 (except for subnormal numbers).
Rounding occurs when the exact binary representation has more bits than the mantissa can hold.
Special bit patterns represent Infinity, NaN, and subnormal numbers.

Exam Tips

💡 Bias calculation: exponent field = true exponent + bias.
💡 Normalise before encoding: shift binary point so that only one non‑zero digit is left of the point.
💡 Rounding rule: round to nearest, ties to even (IEEE 754 default).
💡 Check for special cases: zero, Infinity, NaN – they have unique bit patterns.
💡 Practice converting numbers: start with decimal → binary → normalise → encode. The more you practice, the faster you’ll get.

Quick Practice Problem

Encode the decimal number −12.75 in single‑precision IEEE 754 format. Show the sign bit, exponent field, and mantissa bits.

📝 Hint: 12.75 = 1100.11₂ → normalise to 1.10011 × 2³. Then apply the bias and fill the mantissa.

Describe the format of binary floating-point real numbers