Show understanding that binary representations can give rise to rounding errors

13.3 Floating‑Point Numbers, Representation and Manipulation

What are Floating‑Point Numbers? 🧮

Floating‑point numbers are a way computers store real numbers (numbers with a fractional part) using a fixed amount of memory. They look like this in decimal: -12.34 or 0.00056.

Why Use Binary? 🔢

Computers work in binary (base‑2). A floating‑point number is split into three parts:

  1. Sign bit – 0 for positive, 1 for negative.
  2. Exponent – tells us how many places to shift the binary point.
  3. Mantissa (fraction) – the significant digits.

IEEE 754 – The Standard 📐

The most common format is single precision (32 bits):

Bits Meaning
1 Sign
8 Exponent (biased by 127)
23 Mantissa (fraction)

Binary Representation of 0.1 ⚖️

In binary, 0.1 is a repeating fraction: $$0.0001100110011\ldots_2$$ Because we only have 23 mantissa bits, the computer keeps the first 23 bits and then rounds: $$0.00011001100110011001101_2$$

When you convert that back to decimal you get: $$0.10000000149011612$$ So the stored value is slightly larger than the true 0.1.

Rounding Errors in Arithmetic 🧩

Because of the tiny difference above, adding two numbers can give a result that looks wrong:

  • $0.1 + 0.2$ is expected to be $0.3$, but in binary we get $0.30000000000000004$.
  • When you print it in many programming languages you’ll see the extra digits.

This happens because each number is already an approximation, and the addition introduces another small error that is then rounded again.

Analogy: Measuring with a Ruler 📏

Imagine you have a ruler that only has marks every millimetre. If you try to measure a length that is 1.234 cm, you can only record 1.2 cm or 1.3 cm – you lose the extra 0.034 cm. That’s exactly what happens with floating‑point numbers: the “ruler” (the mantissa) is limited, so we lose precision.

Tips to Handle Rounding Errors

  1. Use higher‑precision types (double, decimal) when possible.
  2. When comparing numbers, use a tolerance: |a - b| < 1e-9.
  3. Round the result to the desired number of decimal places before displaying.
  4. For financial calculations, use fixed‑point or decimal types that avoid binary rounding.

Quick Quiz 🚀

1. What is the binary representation of the decimal number 0.5? 2. Why does $0.1 + 0.2$ not equal $0.3$ in many programming languages? 3. How can you check if two floating‑point numbers are “close enough”?

Revision

Log in to practice.

2 views 0 suggestions