
Hamming Code Error Correction Explained (with Math)

Denny Denny
8 min read
A seven-bit Hamming codeword with parity bits at power-of-two positions glowing distinctly from data bits.

TL;DR: A Hamming(7,4) code adds three parity bits to four data bits so that any single-bit flip — whether in a data bit or in a parity bit itself — can be located and corrected. Each parity bit covers a specific subset of positions defined by binary place values, and the XOR of the received parity bits forms a syndrome that points directly at the flipped position.

A simple parity bit catches the fact that something went wrong, but not what. If a single bit in an 8-bit word flips during transmission, parity tells you to discard the word and ask for a retransmit. That’s fine over a serial link with a back-channel; it’s useless for memory ECC where there is no retry, no second chance to read the same bit out of a DRAM cell. For that you need a code that doesn’t just detect — it corrects. Richard Hamming worked out the canonical solution in 1950, and his name is still on it.

What Does a Hamming Code Actually Do?

A Hamming code embeds redundancy into a transmitted word so that any single-bit error produces a unique pattern of parity-check failures. By examining which checks failed, the receiver computes the position of the flipped bit and inverts it. The receiver never needs to know in advance which bit is suspect: the pattern of failed checks is itself the binary address of the error.

The construction has two pieces:

  1. Bit positions are numbered from 1. Every power-of-two position (1, 2, 4, 8, ...) holds a parity bit; every other position holds a data bit. For Hamming(7,4) that means positions 1, 2, 4 are parity; positions 3, 5, 6, 7 are data.
  2. Each parity bit covers exactly the positions whose binary expansion has the corresponding bit set: the parity bit at position 2^k covers all positions i with bit k set in i.

That second rule is the engine. Let's expand it for n = 7:

  • P1 (position 1, 2^0) covers positions with bit 0 set: 1, 3, 5, 7
  • P2 (position 2, 2^1) covers positions with bit 1 set: 2, 3, 6, 7
  • P4 (position 4, 2^2) covers positions with bit 2 set: 4, 5, 6, 7

Visualizing the three sets as a Venn diagram, the data bits at positions 3, 5, 6, 7 fall in different overlap regions: position 3 is in P1 ∩ P2, position 5 is in P1 ∩ P4, position 6 is in P2 ∩ P4, and position 7 is in all three.

Position        1   2   3   4   5   6   7
Role            P1  P2  D1  P4  D2  D3  D4
Covered by P1   x   -   x   -   x   -   x
Covered by P2   -   x   x   -   -   x   x
Covered by P4   -   -   -   x   x   x   x
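The coverage rule is mechanical enough to compute rather than memorize. As a minimal sketch (the function name `coverage` is mine, not from the article), each parity bit's set falls straight out of a bit test:

```python
# Sketch: derive each parity bit's coverage set from the
# binary place-value rule: parity at position 2^k covers
# every position i whose bit k is set.
def coverage(k, n=7):
    """Positions 1..n whose binary expansion has bit k set."""
    return [i for i in range(1, n + 1) if (i >> k) & 1]

print(coverage(0))  # P1: [1, 3, 5, 7]
print(coverage(1))  # P2: [2, 3, 6, 7]
print(coverage(2))  # P4: [4, 5, 6, 7]
```

The same function with a larger `n` gives the coverage sets for Hamming(15,11) and beyond.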

Computing the Parity Bits

Each parity bit is set so that the XOR of all bits in its coverage set is zero (assuming even parity). Algebraically:

P1 = D1 ⊕ D2 ⊕ D4
P2 = D1 ⊕ D3 ⊕ D4
P4 = D2 ⊕ D3 ⊕ D4

The implementation is a small tree of XOR gates. The XOR is doing exactly the work it's purpose-built for: summing over F2, the two-element field. If that sum is zero on the receive side, the covered set has even weight and parity is consistent; if it's one, something flipped.

Worked Example: Encoding 1011

Let’s encode the 4-bit data word 1011, meaning D1 = 1, D2 = 0, D3 = 1, D4 = 1.

P1 = 1 ⊕ 0 ⊕ 1 = 0
P2 = 1 ⊕ 1 ⊕ 1 = 1
P4 = 0 ⊕ 1 ⊕ 1 = 0

The transmitted codeword, position 1 to position 7, is:

Position  1       2       3       4       5       6       7
Bit       P1 = 0  P2 = 1  D1 = 1  P4 = 0  D2 = 0  D3 = 1  D4 = 1

So the wire carries 0110011 (reading left to right, position 1 first).
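The whole encoding step fits in a few lines. A minimal sketch (the function name `hamming74_encode` and the list-of-bits representation are mine, not from the article), following the parity equations and position layout above:

```python
def hamming74_encode(d):
    """Encode 4 data bits [D1, D2, D3, D4] into a 7-bit codeword.

    Returned list is positions 1..7 left to right, with parity
    bits at positions 1, 2, 4 (even parity)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

print(hamming74_encode([1, 0, 1, 1]))  # [0, 1, 1, 0, 0, 1, 1], i.e. 0110011
```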

Detecting and Correcting an Error

Suppose during transmission position 5 flips: the receiver reads 0110111 instead of 0110011. The receiver re-runs each parity check on the received bits and produces three syndrome bits S4 S2 S1:

S1 = P1 ⊕ D1 ⊕ D2 ⊕ D4 = 0 ⊕ 1 ⊕ 1 ⊕ 1 = 1
S2 = P2 ⊕ D1 ⊕ D3 ⊕ D4 = 1 ⊕ 1 ⊕ 1 ⊕ 1 = 0
S4 = P4 ⊕ D2 ⊕ D3 ⊕ D4 = 0 ⊕ 1 ⊕ 1 ⊕ 1 = 1

Stacking the syndrome bits in order gives S4 S2 S1 = 101 in binary = 5. The syndrome is the binary address of the flipped position. That’s not a coincidence — it’s the entire reason the parity coverage was defined by binary place values. If a single bit at position i flips, exactly the parity checks whose coverage includes i fail, and those checks correspond to the bits set in the binary expansion of i.

The receiver inverts position 5, recovering the original codeword 0110011, and extracts D1 D2 D3 D4 = 1011. Correction complete.

If the syndrome had been 000, no error occurred. If the syndrome had been 010 in binary = 2, parity bit P2 itself flipped — the algorithm corrects parity-bit errors with the same machinery, no special case needed.
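The decode step can be sketched with a compact formulation that is equivalent to running the three checks above (the article doesn't state it this way, so treat it as my reformulation): every valid codeword has the XOR of its set-bit positions equal to zero, so XOR-ing the 1-indexed positions of all set bits in the received word yields the syndrome directly.

```python
def hamming74_correct(cw):
    """Syndrome-decode a received 7-bit word (positions 1..7).

    Returns (corrected word, syndrome). A nonzero syndrome is the
    1-indexed position of the single flipped bit."""
    syndrome = 0
    for pos, bit in enumerate(cw, start=1):
        if bit:
            syndrome ^= pos        # XOR of the positions of all set bits
    if syndrome:                   # nonzero syndrome addresses the error
        cw = cw.copy()
        cw[syndrome - 1] ^= 1      # flip it back
    return cw, syndrome

received = [0, 1, 1, 0, 1, 1, 1]   # 0110111: position 5 flipped in transit
fixed, syn = hamming74_correct(received)
print(syn)    # 5
print(fixed)  # [0, 1, 1, 0, 0, 1, 1], the original 0110011
```

Note that each bit of this position-XOR is exactly one of S1, S2, S4, so the two formulations agree bit for bit.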

Hamming(15, 11) and the General Rule

The construction generalizes. With r parity bits in positions 1, 2, 4, ..., 2^(r−1), the codeword has length n = 2^r − 1, of which k = n − r are data bits. So:

r   n    k    Code
3   7    4    Hamming(7,4)
4   15   11   Hamming(15,11)
5   31   26   Hamming(31,26)
6   63   57   Hamming(63,57)

Larger codes are more efficient: Hamming(63,57) spends 6 parity bits on 57 data bits, an overhead of about 10.5%, versus 3 parity bits on 4 data bits (75%) for Hamming(7,4). But they still correct only a single bit per codeword. There’s no free lunch: stronger codes need more parity.
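The table rows above all come from the same two formulas, which makes them easy to regenerate (a minimal sketch; the overhead figure here is parity bits per data bit):

```python
# Derive (n, k) and the parity overhead for r = 3..6 parity bits.
for r in range(3, 7):
    n = 2**r - 1          # codeword length
    k = n - r             # data bits per codeword
    print(f"r={r}: Hamming({n},{k}), overhead {r / k:.1%}")
```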

Hamming Distance, SEC, and DED

The minimum Hamming distance of a code is the smallest number of bit flips that can transform one valid codeword into another valid codeword. For a code to correct t errors and detect d additional errors (without correcting them), the minimum distance must be at least 2t + d + 1.

Hamming(7,4) has minimum distance 3, which gives:

  • t = 1, d = 0: single-error correction (SEC), or
  • t = 0, d = 2: double-error detection without correction.

The standard mode is single-error correction: any single flip is recovered, but a double flip produces a syndrome that matches some other single-flip pattern, and the decoder confidently miscorrects to the wrong word.
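That miscorrection is easy to demonstrate. A sketch under my own assumptions (the `syndrome` helper uses the position-XOR formulation, equivalent to the three parity checks; bit positions are 1-indexed):

```python
# Demonstrate why plain Hamming(7,4) miscorrects a double error:
# two flips produce a syndrome that names a third, innocent position.
def syndrome(cw):
    s = 0
    for pos, bit in enumerate(cw, start=1):
        if bit:
            s ^= pos               # XOR of set-bit positions
    return s

codeword = [0, 1, 1, 0, 0, 1, 1]   # valid codeword for data 1011
double = codeword.copy()
double[2] ^= 1                      # flip position 3
double[4] ^= 1                      # flip position 5
print(syndrome(double))             # 6 == 3 XOR 5: decoder would
                                    # "correct" untouched position 6
```

The syndrome of a double flip at positions i and j is i XOR j, which is always some other nonzero position, so the decoder cannot tell this case from a single error there.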

To get single-error correction with double-error detection (SEC-DED) you add one more bit: an overall parity bit covering the entire codeword. The minimum distance jumps to 4. The decoder now has four cases:

  • Syndrome zero, overall parity matches: no error.
  • Syndrome nonzero, overall parity mismatch: single error, correct using syndrome.
  • Syndrome nonzero, overall parity matches: double error detected, do not correct.
  • Syndrome zero, overall parity mismatch: the overall parity bit itself flipped.

This extended Hamming code, often written as Hamming(8,4) or Hamming(72,64), underpins mainstream DRAM ECC. Server-grade DIMMs typically use a (72,64) variant — 64 data bits plus 8 ECC bits per 64-bit memory word — to silently correct single-bit errors and halt the system on uncorrectable double-bit errors.
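The four-case decision logic above translates directly into code. A minimal sketch (function name and return strings are mine; `overall_parity_ok` means the extra whole-word parity bit checked out):

```python
def secded_classify(syndrome, overall_parity_ok):
    """Four-way SEC-DED decision for an extended Hamming code.

    `syndrome` is the Hamming syndrome (0 = all checks passed);
    `overall_parity_ok` is True if the whole-word parity matched."""
    if syndrome == 0 and overall_parity_ok:
        return "no error"
    if syndrome != 0 and not overall_parity_ok:
        return "single error: correct the bit at position = syndrome"
    if syndrome != 0 and overall_parity_ok:
        return "double error: detected, do not correct"
    return "overall parity bit itself flipped"
```

A double error leaves overall parity intact (two flips cancel) while disturbing the syndrome, which is exactly why the extra bit separates the single- and double-error cases.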

A Note on Implementation Cost

Encoder and decoder logic are both pure XOR networks. The encoder for Hamming(7,4) needs three 3-input XORs, implementable as six 2-input XOR gates, so it’s a fixed combinational block. The decoder reuses the same XOR structure (now including the received parity bits) to form the syndrome, plus a small decoder that converts the 3-bit syndrome into a 7-bit “flip mask” XORed with the received word.

For storage cost, compare it to plain parity: one parity bit catches single errors but cannot fix them; three Hamming bits both catch and fix them. The crossover where Hamming becomes worthwhile is exactly the situation where retransmission is impossible: RAM, deep-space probes, optical media.

Common Pitfalls

  • Indexing from zero. The Hamming construction uses 1-indexed positions because the syndrome’s value equals the position. Indexing from zero breaks the elegant identity.
  • Mixing odd and even parity inconsistently. Pick one (even is standard) and stick with it on both encode and decode sides.
  • Treating Hamming as a CRC. A CRC catches burst errors of a specific length; Hamming catches a fixed number of arbitrary-position errors. They are different tools.
  • Forgetting that parity bits can themselves flip. The syndrome is meant to flag any of the n positions, including parity positions. The construction handles it automatically — don’t add special cases.
  • Confusing Hamming distance with Levenshtein distance. Hamming counts bit substitutions only, with both words the same length; Levenshtein also allows insertions and deletions. They are different metrics — don’t interchange them.

How This Connects to the Rest of Coding Theory

Hamming codes are the cleanest example of a linear block code: every codeword is a linear combination (over F2) of basis codewords, parity is a linear function of the data, and decoding reduces to multiplying the received word by a parity-check matrix. The historical thread runs all the way back to early studies of binary representation, and forward through Reed-Solomon, BCH, LDPC, and turbo codes — each adding correction strength at the cost of decoding complexity.
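The parity-check-matrix view can be made concrete in a few lines. A sketch under my own conventions (rows of H are ordered S1, S2, S4; column i of H is just the binary expansion of position i, which is the whole trick):

```python
# Parity-check matrix H for Hamming(7,4): column i is the 3-bit
# binary expansion of position i, so H @ r (mod 2) is the syndrome.
H = [[(i >> k) & 1 for i in range(1, 8)] for k in range(3)]  # rows S1, S2, S4

def matvec_mod2(matrix, vec):
    """Matrix-vector product over F2 (GF(2))."""
    return [sum(h * b for h, b in zip(row, vec)) % 2 for row in matrix]

received = [0, 1, 1, 0, 1, 1, 1]   # the worked example: position 5 flipped
print(matvec_mod2(H, received))    # [1, 0, 1]: S1=1, S2=0, S4=1, i.e. position 5
```

Valid codewords are exactly the vectors H maps to zero, which is the linear-algebra restatement of “all parity checks pass.”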

What’s Next

The companion post Parity Bit Tutorial: Single-Bit Error Detection zooms in on the simpler case — a single XOR-based parity check — which is the foundation Hamming codes generalize. After that, Gray Code Explained: Why Rotary Encoders Use It covers a related but distinct use of XOR-based encoding: not for error correction, but for glitch-free transitions in mechanical sensors and clock-domain crossing.

To experiment with the XOR networks at the heart of Hamming encoding, drop several XOR gates into a fresh DigiSim canvas, wire them up per the P1, P2, P4 equations above, and verify the encoded word matches the table for every 4-bit input. Then add a manual “fault injector” switch on each line and watch the syndrome bits track the flipped position.