CPU Flags Register: Carry, Zero, Overflow, Sign

Denny Denny

TL;DR: A CPU flags register stores four single-bit results from the most recent ALU operation: Zero (Z) when the result is zero, Carry (C) on unsigned overflow, Negative/Sign (N) equal to the result’s MSB, and Overflow (V) on signed overflow. Conditional branches read these flags to decide whether to redirect the program counter.

The flags register is what gives a CPU the ability to make decisions. Without it, every program would be a straight line: fetch, execute, fetch, execute, fetch, execute. Conditional jumps need a single bit somewhere that says “the last comparison was equal” or “the last subtraction overflowed,” and the flags register is the standard place to put those bits.

This post defines each of the four standard flags, shows the boolean logic that generates them, walks through worked examples for the tricky cases, and closes by tying everything to the program counter and conditional branches covered earlier in the CPU-architecture track.

What Is a Flags Register?

A flags register — also called a status register, condition code register, or PSW (program status word) on some architectures — is a small set of single-bit latches that record properties of the most recent ALU operation.

Common names across architectures:

| Architecture | Register name | Width |
| --- | --- | --- |
| x86 | EFLAGS / RFLAGS | 32 / 64 bits (only a few flags used) |
| ARM | CPSR / APSR | 32 bits |
| RISC-V | (no architectural flags) | n/a (uses comparison instructions) |
| 6502 | P (processor status) | 8 bits |
| MIPS | (no architectural flags) | n/a |
| DigiSim toy CPU | Flags register | 4 bits: Z, C, N, V |

Note the asymmetry: classic CISC and most ARM-style designs ship with a flags register, while RISC-V skips it and instead provides compare-and-branch instructions like BEQ rs1, rs2, target that test two registers directly. Each approach has trade-offs, but flags are the older, more pedagogically useful pattern, and they map cleanly onto the 4-bit register building block.

DigiSim’s FLAGS_REGISTER is a 4-bit latch that captures Z, C, N, and V from the ALU whenever a flag-updating instruction executes.

The Four Standard Flags

Zero Flag (Z)

The Zero flag is set when the ALU result is exactly zero.

Definition for an $n$-bit result $R = R_{n-1}R_{n-2}\ldots R_1 R_0$:

$$Z = \overline{R_{n-1}} \cdot \overline{R_{n-2}} \cdot \ldots \cdot \overline{R_0}$$

Equivalently, Z is the NOR of every result bit. For an 8-bit ALU, Z is one wire fed by an 8-input NOR gate (often built from four 2-input NOR gates whose outputs feed a small AND tree).

Z is what the BEQ (branch if equal), BZ (branch if zero), and JE (jump if equal) instructions read. Compare-and-branch is implemented as subtract, then test Z: CMP A, B computes A - B and discards the result except for the flags, and BEQ target is “branch if Z is set.”
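
The Zero flag and the CMP idiom can be sketched in a few lines of Python. This is a minimal illustration that assumes an 8-bit ALU; the names `zero_flag`, `zero_flag_gates`, and `cmp_sets_z` are hypothetical and not part of DigiSim. The second function mirrors one possible gate structure (pairwise NORs combined by ANDs) to show that it matches the direct definition.

```python
WIDTH = 8                      # assumed ALU width for this sketch
MASK = (1 << WIDTH) - 1

def zero_flag(result: int) -> int:
    """Z = 1 when every bit of the WIDTH-bit result is 0."""
    return int((result & MASK) == 0)

def zero_flag_gates(result: int) -> int:
    """Same flag, built like hardware: 2-input NORs, then AND them together."""
    bits = [(result >> i) & 1 for i in range(WIDTH)]
    nors = [1 - (bits[i] | bits[i + 1]) for i in range(0, WIDTH, 2)]
    z = 1
    for nor_out in nors:
        z &= nor_out
    return z

def cmp_sets_z(a: int, b: int) -> int:
    """CMP a, b: compute a - b, discard the difference, keep only Z."""
    return zero_flag(a - b)

print(cmp_sets_z(42, 42), cmp_sets_z(42, 41))
```

A BEQ following `cmp_sets_z` would branch exactly when the returned bit is 1.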

Carry Flag (C)

The Carry flag captures the carry-out of the most-significant bit of an unsigned addition or subtraction. It is the flag that detects unsigned overflow.

For an $n$-bit ripple-carry adder built from a chain of full adders (like the 4-bit ripple-carry adder), C is simply the carry-out of the topmost stage:

$$C = c_n$$

For subtraction $A - B$ implemented as $A + \overline{B} + 1$, conventions differ:

  • x86: C is the borrow. It is set when the subtraction needed to borrow (i.e., when $A < B$ unsigned), which is the inverse of the adder’s carry-out.
  • ARM (and the 6502): C is the inverted borrow. It is set when the subtraction did not borrow, which is the adder’s carry-out unchanged.

DigiSim follows the x86 convention: C = 1 after SUB means $A < B$ as unsigned values.
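
The two subtraction conventions are easy to compare side by side. The sketch below assumes an 8-bit datapath; `add_carry`, `sub_carry`, and the `convention` parameter are illustrative names, not an existing API.

```python
MASK = 0xFF  # 8-bit datapath assumed

def add_carry(a: int, b: int) -> tuple[int, int]:
    """Return (result, C) for unsigned 8-bit addition."""
    total = (a & MASK) + (b & MASK)
    return total & MASK, total >> 8

def sub_carry(a: int, b: int, convention: str = "x86") -> tuple[int, int]:
    """Return (result, C) for a - b, computed as a + ~b + 1 on the adder."""
    total = (a & MASK) + (~b & MASK) + 1
    carry_out = total >> 8
    # ARM keeps the raw carry-out; the x86 convention inverts it into a borrow.
    c = carry_out if convention == "arm" else carry_out ^ 1
    return total & MASK, c

print(add_carry(200, 100))        # sum wraps past 255, so C is set
print(sub_carry(3, 5, "x86"))     # 3 < 5 unsigned: borrow
print(sub_carry(3, 5, "arm"))     # same bits, opposite C
```

Under the x86-style convention the returned C equals `a < b` for every pair of 8-bit inputs, which is the property the unsigned branch conditions rely on.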

Negative / Sign Flag (N)

The Negative flag — sometimes called the Sign flag — is just the most significant bit of the result:

$$N = R_{n-1}$$

In two’s-complement representation the MSB is the sign bit, so N reflects whether the result is negative when interpreted as a signed value.

N is wired directly: it is one wire from the ALU’s top result bit to the flags register’s N input. There is no logic in between.

Overflow Flag (V)

The Overflow flag detects signed overflow: the true result of a signed addition or subtraction does not fit in the destination width, so the stored bits no longer represent it.

The clean definition: V is set when the carry into the MSB and the carry out of the MSB differ.

$$V = c_{n-1} \oplus c_n$$

where $c_{n-1}$ is the carry into bit $n-1$ (the sign bit) and $c_n$ is the carry out of bit $n-1$. In hardware, V is a single XOR gate driven by two carries that the adder is already computing.

Why this works: signed overflow only happens when adding two positives produces a negative or adding two negatives produces a positive. In both of those cases the carry into the sign bit and the carry out of it differ. In every non-overflow case they match.
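
The carry-XOR rule can be checked exhaustively against the "does the true result fit?" definition. The sketch below assumes an 8-bit adder with an optional carry-in; the function names are hypothetical.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)

def to_signed(x: int) -> int:
    """Interpret the low WIDTH bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << WIDTH) if x & SIGN else x

def overflow_from_carries(a: int, b: int, carry_in: int = 0) -> int:
    """V = (carry into the sign bit) XOR (carry out of the sign bit)."""
    low = MASK >> 1  # every bit below the sign bit
    c_into_msb = ((a & low) + (b & low) + carry_in) >> (WIDTH - 1)
    c_out = ((a & MASK) + (b & MASK) + carry_in) >> WIDTH
    return c_into_msb ^ c_out

def overflow_by_range(a: int, b: int, carry_in: int = 0) -> int:
    """V = 1 when the true signed sum falls outside [-2^(n-1), 2^(n-1) - 1]."""
    true_sum = to_signed(a) + to_signed(b) + carry_in
    return int(not -SIGN <= true_sum <= SIGN - 1)

# Compare the two definitions over every 8-bit operand pair and carry-in.
agree = all(
    overflow_from_carries(a, b, cin) == overflow_by_range(a, b, cin)
    for a in range(256) for b in range(256) for cin in (0, 1)
)
print(agree)
```

Including the carry-in matters because subtraction reuses the same adder with the second operand inverted and a carry-in of 1, so the same XOR covers SUB as well as ADD.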

Conditional Branches Read the Flags

The whole point of these four bits is to control the program counter. A branch instruction is parameterized by which flag (or boolean combination of flags) it tests.

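Common branch conditions are listed below. Each row pairs an ARM-style mnemonic with its x86 equivalent, and the boolean expressions assume the borrow convention for C that DigiSim uses: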

| Mnemonic | Condition | Boolean expression |
| --- | --- | --- |
| BEQ / JE | Equal / Zero | $Z$ |
| BNE / JNE | Not equal | $\overline{Z}$ |
| BCS / JC | Carry set / unsigned overflow | $C$ |
| BCC / JNC | Carry clear | $\overline{C}$ |
| BMI / JS | Negative / minus | $N$ |
| BPL / JNS | Positive or zero | $\overline{N}$ |
| BVS / JO | Overflow set | $V$ |
| BVC / JNO | Overflow clear | $\overline{V}$ |
| BLT / JL | Signed less than | $N \oplus V$ |
| BGE / JGE | Signed greater or equal | $\overline{N \oplus V}$ |
| BLE / JLE | Signed less or equal | $Z + (N \oplus V)$ |
| BGT / JG | Signed greater than | $\overline{Z} \cdot \overline{N \oplus V}$ |
| BLS / JBE | Unsigned less or equal | $C + Z$ |
| BHI / JA | Unsigned greater than | $\overline{C} \cdot \overline{Z}$ |

That table is exactly what the control unit’s branch-condition decoder implements. It is a small combinational block — a few AND, XOR, and OR gates — that takes the four flag bits and a condition selector from the instruction and outputs a single “take branch” signal feeding the PC’s update mux.
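
As a rough software model of that decoder, the sketch below maps a condition selector to a boolean function of the four flags. The two-letter selector names and `take_branch` are assumptions made for illustration, and C follows the borrow convention (C = 1 after a compare means the first operand was smaller, unsigned).

```python
# Each entry takes (z, c, n, v) and returns True when the branch is taken.
CONDITIONS = {
    "EQ": lambda z, c, n, v: z == 1,
    "NE": lambda z, c, n, v: z == 0,
    "CS": lambda z, c, n, v: c == 1,
    "CC": lambda z, c, n, v: c == 0,
    "MI": lambda z, c, n, v: n == 1,
    "PL": lambda z, c, n, v: n == 0,
    "VS": lambda z, c, n, v: v == 1,
    "VC": lambda z, c, n, v: v == 0,
    "LT": lambda z, c, n, v: n != v,             # N xor V
    "GE": lambda z, c, n, v: n == v,
    "LE": lambda z, c, n, v: z == 1 or n != v,
    "GT": lambda z, c, n, v: z == 0 and n == v,
    "LS": lambda z, c, n, v: c == 1 or z == 1,   # unsigned <=
    "HI": lambda z, c, n, v: c == 0 and z == 0,  # unsigned >
}

def take_branch(cond: str, z: int, c: int, n: int, v: int) -> bool:
    """Model of the 'take branch' signal feeding the PC update mux."""
    return CONDITIONS[cond](z, c, n, v)

# Flags left by CMP 3, 5 on an 8-bit ALU: 3 - 5 = 0xFE with a borrow.
flags = dict(z=0, c=1, n=1, v=0)
print(take_branch("LT", **flags), take_branch("HI", **flags))
```

In hardware the dictionary lookup corresponds to a multiplexer selected by the condition field of the instruction, with each lambda being a handful of gates.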

Worked Examples (8-bit)

The flags register’s behavior is easiest to see in three small cases: one ordinary addition and two corner cases.

Example 1: 8 + 8 — Boring Case

Adding 8 and 8 in 8-bit unsigned arithmetic:

  0000 1000   (8)
+ 0000 1000   (8)
  ---------
  0001 0000   (16)
  • $R = 0001\,0000 = 16$. Z = 0 (result is nonzero).
  • Carry-out of MSB: $c_8 = 0$. C = 0.
  • MSB of result: $R_7 = 0$. N = 0.
  • Carry into MSB: $c_7 = 0$, carry out: $c_8 = 0$. $V = 0 \oplus 0 = 0$.

Flags: $Z=0, C=0, N=0, V=0$. Nothing exciting; result is correct in both signed and unsigned interpretations.

Example 2: 127 + 1 — Signed Overflow

Adding the maximum positive signed 8-bit value (0111 1111 = 127) and 1:

  0111 1111   (+127 signed, 127 unsigned)
+ 0000 0001   (+1)
  ---------
  1000 0000   (-128 signed, 128 unsigned)
  • $R = 1000\,0000$. As signed: $-128$. As unsigned: $128$. Z = 0.
  • Carry-out of MSB: $c_8 = 0$ (no carry escaped the top). C = 0.
  • MSB of result: $R_7 = 1$. N = 1.
  • Carry into MSB: $c_7 = 1$, carry out: $c_8 = 0$. $V = 1 \oplus 0 = 1$.

Flags: $Z=0, C=0, N=1, V=1$. The result is correct for unsigned (127 + 1 = 128) but wrong for signed (127 + 1 should be 128, but 128 doesn’t fit in signed 8-bit). V flags this; C does not.

Example 3: 0 - 1 — Unsigned Borrow, No Signed Overflow

Subtracting 1 from 0 (computed as $0 + \overline{1} + 1 = 0 + 1111\,1110 + 1$):

   0000 0000     (0)
+  1111 1110     (~1 = -2 in two's complement, but here just bits)
+          1     (carry-in for two's-complement subtract)
   ---------
   1111 1111     (-1 signed, 255 unsigned)
  • $R = 1111\,1111$. Z = 0.
  • Carry-out of MSB: $c_8 = 0$. No carry escapes the top: bit 0 is $0 + 0 + 1 = 1$ (using the carry-in) and every higher bit is $0 + 1 + 0 = 1$, so no position generates a carry. Under the x86 convention the C flag for SUB is the inverted carry-out, so a borrow occurred and C = 1.
  • MSB of result: $R_7 = 1$. N = 1.
  • Carry into MSB: $c_7 = 0$, carry out: $c_8 = 0$. $V = 0 \oplus 0 = 0$.

Flags: $Z=0, C=1, N=1, V=0$. The result is correct in both interpretations (-1 signed, 255 unsigned). C is set because we needed to borrow; V is clear because nothing overflowed in signed arithmetic.

These three examples cover the full design intent of the flags: Z and N are simple inspection wires, C handles unsigned, and V handles signed. They are independent and need to be tested independently.
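
The three worked examples can be reproduced with a small model of an 8-bit ALU that returns all four flags at once. This is a sketch under stated assumptions: only ADD and SUB are modeled, C uses the x86-style borrow convention for SUB, and the `alu` function and `Flags` type are illustrative names.

```python
from typing import NamedTuple

class Flags(NamedTuple):
    z: int
    c: int
    n: int
    v: int

def alu(op: str, a: int, b: int) -> tuple[int, Flags]:
    """8-bit ADD or SUB; SUB is a + ~b + 1 on the same adder."""
    a &= 0xFF
    if op == "sub":
        operand, carry_in = ~b & 0xFF, 1
    else:
        operand, carry_in = b & 0xFF, 0
    c7 = ((a & 0x7F) + (operand & 0x7F) + carry_in) >> 7  # carry into MSB
    total = a + operand + carry_in
    c8 = total >> 8                                       # carry out of MSB
    result = total & 0xFF
    c = c8 ^ 1 if op == "sub" else c8   # borrow convention for SUB
    return result, Flags(z=int(result == 0), c=c, n=result >> 7, v=c7 ^ c8)

for op, a, b in [("add", 8, 8), ("add", 127, 1), ("sub", 0, 1)]:
    result, flags = alu(op, a, b)
    print(f"{op} {a}, {b} -> {result:08b} {flags}")
```

The printed flags line up with Examples 1 through 3: all clear for 8 + 8, N and V set for 127 + 1, and C and N set for 0 - 1.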

Implementation: A 4-Bit Latch at the ALU Output

The flags register is structurally trivial. Each flag bit is a single D flip-flop whose data input is the corresponding boolean combination of ALU signals and whose clock-enable is asserted by instructions that should update flags.

ALU result[7:0] ──┬─── 8-input NOR ──── Z ──┐
                  │                          │
                  ├─── result[7] ─────── N ──┤
                  │                          ├──▶ FLAGS_REGISTER (4-bit)
ALU c_n ──────────┼───────────────────── C ──┤        │
                  │                          │        │
ALU c_(n-1) ──────┴────────── XOR c_n ─ V ──┘        │

clock + flag_write_enable ───────────────────────────▶┘

The four bits are written on the same clock edge as the ALU result lands in its destination register. They are read combinationally by the next instruction’s branch-condition logic, with no clocking on the read path.
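
The write-enable behavior can be modeled in software as a register that only captures its inputs on a clock edge when enabled. The class and method names below are hypothetical; this is a behavioral sketch, not DigiSim's implementation.

```python
class FlagsRegister:
    """Four D flip-flops sharing one clock and one write-enable."""

    def __init__(self) -> None:
        self.z = self.c = self.n = self.v = 0

    def clock(self, z: int, c: int, n: int, v: int, write_enable: bool) -> None:
        """Model one rising clock edge: capture the inputs only if enabled."""
        if write_enable:
            self.z, self.c, self.n, self.v = z, c, n, v

    def read(self) -> tuple[int, int, int, int]:
        """Combinational read: no clock involved."""
        return (self.z, self.c, self.n, self.v)

flags = FlagsRegister()
flags.clock(1, 0, 0, 0, write_enable=True)    # e.g. CMP found equality
flags.clock(0, 1, 1, 1, write_enable=False)   # e.g. a MOV: inputs ignored
print(flags.read())
```

The second call leaves the stored bits untouched, which is the property the compare-then-branch idiom depends on.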

Not Every Instruction Updates Flags

ALU instructions like ADD, SUB, AND, OR, XOR, CMP all update flags. Memory loads, register moves, and unconditional jumps typically do not — they must not, because the flags they would set would clobber the result of the previous arithmetic and break the comparison-then-branch idiom.

The control unit gates flag_write_enable based on the opcode bits — a single line out of the instruction-decode network.

In x86 things are messier: shifts and rotates conditionally update flags depending on the shift count; certain instructions (MOV, LEA) never update flags; one (POPF) loads the flags register from a value on the stack. The same pattern holds (a control signal gates the update), but the table of which instructions assert it is large.

Avoiding the “Flags Are Forever” Bug

Because flags persist until the next flag-updating instruction overwrites them, careful assembly programmers rely on the fact that intervening loads, stores, and moves leave the flags alone. Inserting a “harmless” instruction between a CMP and a BEQ is only harmless if that instruction doesn’t update flags. Compilers are fastidious about this; humans frequently aren’t.
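
To make the hazard concrete, here is a toy interpreter for a hypothetical four-instruction machine (CMP, ADD, MOV, BEQ) that tracks only the Z flag. The instruction set, the tuple encoding, and the `run` function are all assumptions for illustration. ADD updates Z; MOV does not.

```python
def run(program, regs) -> bool:
    """Execute the program; return True if the final BEQ would be taken."""
    z = 0
    for op, *args in program:
        if op == "CMP":                      # sets flags, discards result
            z = int(((regs[args[0]] - regs[args[1]]) & 0xFF) == 0)
        elif op == "ADD":                    # writes result AND flags
            dst, src = args
            regs[dst] = (regs[dst] + regs[src]) & 0xFF
            z = int(regs[dst] == 0)
        elif op == "MOV":                    # writes result, leaves flags alone
            dst, src = args
            regs[dst] = regs[src]
        elif op == "BEQ":
            return bool(z)
    return False

safe = [("CMP", "a", "b"), ("MOV", "c", "a"), ("BEQ",)]
buggy = [("CMP", "a", "b"), ("ADD", "c", "a"), ("BEQ",)]

print(run(safe, {"a": 5, "b": 5, "c": 1}))   # branch follows the CMP
print(run(buggy, {"a": 5, "b": 5, "c": 1}))  # ADD overwrote Z before BEQ
```

Both programs compare equal values, but only the first one branches, because the ADD in the second replaces the Z produced by CMP with the Z of its own nonzero sum.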

How RISC-V Skips All of This

RISC-V does not have a flags register. Instead of CMP + BEQ, it has fused compare-and-branch instructions that each test two registers directly:

BEQ rs1, rs2, target   ; branch if rs1 == rs2
BNE rs1, rs2, target   ; branch if rs1 != rs2
BLT rs1, rs2, target   ; signed less than
BLTU rs1, rs2, target  ; unsigned less than
BGE rs1, rs2, target   ; signed greater or equal
BGEU rs1, rs2, target  ; unsigned greater or equal

Each branch reads two registers, performs the comparison combinationally, and updates the PC. There is no global flags state. Trade-offs:

  • Pro: No serial dependency between an arithmetic instruction and a downstream branch — out-of-order execution is simpler.
  • Pro: No bug from “intervening instruction clobbered the flags.”
  • Con: Comparing against a nonzero constant requires loading it into a register first (zero is always available in x0). Slightly larger code.
  • Con: Carry chains across multi-precision arithmetic require explicit sltu (set-less-than-unsigned) sequences to compute the carry bit manually.
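
The last point can be sketched in Python as a 64-bit addition built from 32-bit words, with the carry recovered by an unsigned compare in place of a carry flag. The helper names are hypothetical, and the comments show roughly which RISC-V instruction each line stands for.

```python
MASK32 = 0xFFFFFFFF

def sltu(x: int, y: int) -> int:
    """Set-less-than-unsigned: 1 if x < y as 32-bit unsigned values."""
    return int((x & MASK32) < (y & MASK32))

def add64(a_lo: int, a_hi: int, b_lo: int, b_hi: int) -> tuple[int, int]:
    """Add two 64-bit values held as (lo, hi) 32-bit halves, with no flags."""
    lo = (a_lo + b_lo) & MASK32            # add  lo, a_lo, b_lo
    carry = sltu(lo, a_lo)                 # sltu carry, lo, a_lo
    hi = (a_hi + b_hi + carry) & MASK32    # add  hi, a_hi, b_hi ; add hi, hi, carry
    return lo, hi

print(add64(0xFFFFFFFF, 0, 1, 0))
```

The trick is that the low-word sum wrapped around exactly when it ended up smaller than one of its operands, so one extra compare reconstructs the bit that a flags-based machine would have read from C.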

For pedagogy, the flags-register design is far more illustrative because it makes the connection between arithmetic and control flow explicit. RISC-V’s approach is the result of decades of optimization on top of that mental model.

Build It in DigiSim

Open the CPU Flags Register template. Wire the ALU result into the Z/N detectors, hook up the carry-out and second-to-last carry to the V XOR, and feed the four bits into the FLAGS_REGISTER. Drive the inputs through Examples 1–3 above and watch each flag light up exactly when the worked math says it should.

The next two posts in the CPU-architecture series cover two’s complement representation (the encoding that makes both unsigned and signed arithmetic share the same hardware) and the 8-bit ALU in depth (the source of every signal feeding this register). Read those alongside the Fetch-Decode-Execute case study to see the full data path from PC fetch to flag-driven branch.