
Program Counter Explained: The CPU's Bookmark

Denny Denny
[Figure: a program counter register with an incrementing binary value pointing to a memory map laid out in a grid.]

TL;DR: The program counter (PC) is the register that holds the memory address of the next instruction a CPU will fetch. It increments automatically after every fetch, but branch and jump instructions overwrite it with a new target address — that single piece of hardware is what makes loops, conditionals, and function calls possible.

The program counter is one of the simplest registers in a CPU, and also the one that steers everything else. Every instruction the processor executes is selected by whatever address the PC holds at the start of the next fetch cycle. Without it, the processor has no record of where it is in the program or where to go next.

This post walks through what the PC stores, how it increments, how branches and jumps redirect it, and how to wire one up from a register, an adder, and a multiplexer. Each component links back to a working circuit you can poke at in the Sequential Instruction Executor template.

What Is the Program Counter?

The program counter is a register dedicated to one job: pointing at the next instruction in memory. It is not a general-purpose register that user code reads and writes freely. It is a special-purpose register driven by the CPU’s control logic.

Its width determines how much memory the CPU can address for code:

  • An 8-bit PC can address $2^8 = 256$ memory locations.
  • A 16-bit PC can address $2^{16} = 65{,}536$ bytes.
  • A 32-bit PC can address $2^{32} \approx 4.29$ billion bytes.
  • A 64-bit PC can address $2^{64} \approx 1.8 \times 10^{19}$ bytes.
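
The figures above follow directly from the PC width. A minimal sketch that reproduces them (the function name `addressable_locations` is made up for illustration):

```python
def addressable_locations(pc_bits: int) -> int:
    """Number of distinct addresses an n-bit program counter can hold."""
    return 1 << pc_bits  # 2 ** pc_bits


for bits in (8, 16, 32, 64):
    print(f"{bits:>2}-bit PC: {addressable_locations(bits):,} addresses")
```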

DigiSim’s PROGRAM_COUNTER_8BIT component implements the 8-bit case, which is large enough to run real programs in the simulator while staying small enough to inspect bit-by-bit.

The PC Sits at the Top of the Fetch Stage

The fetch stage of the fetch-decode-execute cycle proceeds in four steps:

  1. The PC drives its current value onto the address bus.
  2. Memory returns the instruction word at that address.
  3. The instruction word is latched into the instruction register.
  4. The PC updates so that the next fetch reads the next instruction.
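
The four steps can be mimicked in a few lines of software. This is only a sketch: the `fetch` function, the 256-entry list standing in for memory, and the 8-bit wraparound are assumptions chosen to match the 8-bit toy case, not a description of any DigiSim internals.

```python
def fetch(pc: int, memory: list[int], width: int = 1) -> tuple[int, int]:
    """Model one fetch and return (instruction_register, next_pc)."""
    address_bus = pc                          # step 1: PC drives the address bus
    instruction_word = memory[address_bus]    # step 2: memory returns the word
    instruction_register = instruction_word   # step 3: word is latched into the IR
    next_pc = (pc + width) & 0xFF             # step 4: PC advances (8-bit wrap assumed)
    return instruction_register, next_pc


memory = [0x10, 0x20, 0x30] + [0x00] * 253    # hypothetical 256-word memory image
ir, pc = fetch(0x00, memory)
print(hex(ir), hex(pc))
```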

Steps 1–3 are addressed in the broader fetch coverage. Step 4 is what this post is about.

Why Must the PC Increment Automatically?

If the PC stayed put, the CPU would fetch the same instruction forever. Hardware therefore wires an automatic increment into the PC so that, by default, every fetch cycle is followed by a step forward.

The increment is computed by a dedicated adder — typically a small ripple-carry adder like the one detailed in Mastering Binary Addition — whose inputs are the current PC and a constant offset. The output of that adder feeds back into the PC’s data input, and the PC latches it on the next clock edge.
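
To make the adder concrete, here is a bit-level sketch of a ripple-carry adder used as a PC incrementer. The function names are illustrative, and dropping the final carry (so that 0xFF + 1 wraps to 0x00) is an assumption about how a fixed-width PC adder behaves.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x: int, y: int, bits: int = 8) -> int:
    """Add two values bit by bit, rippling the carry from LSB to MSB."""
    carry = 0
    result = 0
    for i in range(bits):
        sum_bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= sum_bit << i
    return result  # the final carry is discarded, so the result wraps


pc = 0x0F
pc = ripple_add(pc, 1)  # the constant offset is 1 in the 8-bit toy case
print(hex(pc))
```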

PC + 1 vs PC + 4: Why Instruction Width Matters

The constant added to the PC is the width of an instruction in addressable units, not always the literal value 1.

| Architecture | Instruction width | Increment | Reason |
| --- | --- | --- | --- |
| DigiSim 8-bit toy CPU | 1 byte (1 word) | PC + 1 | Each address holds a complete instruction |
| MIPS, classic RISC-V (RV32I) | 4 bytes | PC + 4 | Memory is byte-addressed; instructions are 4 bytes wide |
| ARM (A32) | 4 bytes | PC + 4 | Same reason |
| ARM Thumb / RV32C | 2 bytes | PC + 2 | Compressed instruction encoding |
| x86 | 1–15 bytes | PC + (decoded length) | Variable-length instructions; the decoder reports the length |

The principle is the same in every case: after fetching an instruction, advance the PC past the bytes that instruction occupies. RISC architectures get a fixed adder; x86 needs the decoder to feed back the length.
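
The "advance past the bytes the instruction occupies" rule can be sketched as a running sum over instruction lengths. The helper name and the sample length lists are made up for illustration; a real x86 decoder derives each length from the instruction bytes themselves.

```python
def sequential_addresses(start: int, lengths: list[int]) -> list[int]:
    """Addresses at which successive instructions begin, given each one's byte length."""
    addresses = []
    pc = start
    for length in lengths:
        addresses.append(pc)
        pc += length  # step past the bytes this instruction occupies
    return addresses


# Fixed 4-byte instructions: the increment is always 4.
print(sequential_addresses(0x0, [4, 4, 4]))
# Variable-length instructions: the increment is whatever the decoder reports.
print(sequential_addresses(0x100, [1, 3, 5, 2]))
```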

Arithmetic Description

For an architecture with fixed instruction width $W$, the default next-PC value is:

$$\text{PC}_{n+1} = \text{PC}_{n} + W$$

In the DigiSim 8-bit case, $W = 1$, so the increment is just a +1 adder.
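
One detail the formula leaves implicit: an $N$-bit PC can only hold values below $2^N$. Assuming the adder's carry-out is simply discarded (the usual behavior of a fixed-width adder, though this post does not state it for DigiSim specifically), the update is modular:

```latex
\text{PC}_{n+1} = (\text{PC}_{n} + W) \bmod 2^{N}
\qquad \text{e.g. } N = 8,\ W = 1:\quad (255 + 1) \bmod 256 = 0
```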

How Do Branches and Jumps Override the PC?

Sequential execution is the boring case. Real programs need loops, conditionals, function calls, and returns — all of which require the PC to skip somewhere other than the next sequential address.

Two operations cover all of them:

  • Jump (unconditional): Replace the PC with a target address regardless of any condition.
  • Branch (conditional): Replace the PC with a target address only if a flag — zero, carry, sign, overflow — is in the required state. Otherwise fall through to PC + W. The flag bits live in the flags register, covered in the upcoming companion post on carry, zero, overflow, and sign.

Both operations boil down to writing a non-sequential value into the PC.
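
The jump/branch distinction reduces to one condition. A minimal sketch, with hypothetical parameter names (`is_jump`, `is_branch`, `flag_ok`) standing in for signals the control unit would supply:

```python
def select_next_pc(pc: int, width: int, target: int,
                   is_jump: bool = False,
                   is_branch: bool = False,
                   flag_ok: bool = False) -> int:
    """Return the target for a jump or a taken branch, else fall through to pc + width."""
    take = is_jump or (is_branch and flag_ok)
    return target if take else pc + width


print(hex(select_next_pc(0x04, 1, 0x02, is_branch=True, flag_ok=True)))   # taken branch
print(hex(select_next_pc(0x04, 1, 0x02, is_branch=True, flag_ok=False)))  # falls through
```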

Where Does the Target Address Come From?

The target is encoded in one of three ways:

  1. Immediate (PC-relative): The instruction word contains an offset. The next PC is PC + offset. RISC-V BEQ, ARM B, and x86 short jumps all use this form. The advantage is position-independent code.
  2. Absolute: The instruction word contains a complete target address. This is common in older architectures and in x86 long jumps.
  3. Register-indirect: The target lives in a general-purpose register or in the ALU output. RISC-V JALR, ARM BX, and x86 JMP rax are examples. This form supports function pointers, virtual dispatch, and computed goto.

In every case, the new PC value arrives at the program counter through a multiplexer.

Implementation: Register + Increment + Mux

The hardware structure of the PC is one of the cleanest data paths in a CPU:

                        ┌──────────────┐
       PC current ─────▶│  Adder (+W)  │──── PC + W ──┐
                        └──────────────┘              │
                                                      ▼
                                              ┌───────────────┐
                                              │ MUX (2:1)     │
                target_addr ─────────────────▶│  sel = take?  │──── next_PC
                                              └───────────────┘
                                                      │
                                                      ▼
                                              ┌───────────────┐
                  clock ─────────────────────▶│   PC register │──── current_PC
                                              └───────────────┘

Three pieces:

  1. A register to hold the current address. This is just an 8-bit register — the same building block covered in Mastering the 4-bit Register, widened to match the address bus.
  2. An adder wired with one input as PC and the other as the constant width $W$.
  3. A 2:1 multiplexer with one input from the adder (default sequential path) and one from the branch target (taken-branch path). The select line is driven by the control unit and is asserted when the current instruction is a jump or a branch whose condition evaluates true. Multiplexers are the topic of The Data Traffic Controller.

On every clock edge the PC latches whatever the mux is currently selecting. That single latch is the entire mechanism that distinguishes “advance” from “branch.”
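
The register, adder, and mux can be modeled together in a small class. This is a behavioral sketch under stated assumptions: the class and method names are invented here, and the value is masked to the register width so it wraps like fixed-width hardware.

```python
class ProgramCounter:
    """Behavioral model: register + (+W) adder + 2:1 mux."""

    def __init__(self, bits: int = 8, width: int = 1) -> None:
        self.mask = (1 << bits) - 1
        self.width = width
        self.value = 0  # the register contents

    def next_value(self, take: bool, target: int) -> int:
        sequential = (self.value + self.width) & self.mask    # adder output
        return (target & self.mask) if take else sequential   # mux output

    def clock_edge(self, take: bool = False, target: int = 0) -> int:
        """Latch whatever the mux is currently selecting."""
        self.value = self.next_value(take, target)
        return self.value


pc = ProgramCounter()
pc.clock_edge()                         # sequential: 0x00 to 0x01
pc.clock_edge()                         # sequential: 0x01 to 0x02
pc.clock_edge(take=True, target=0x10)   # branch taken: 0x02 to 0x10
print(hex(pc.value))
```

Keeping `next_value` separate from `clock_edge` mirrors the hardware split between combinational logic (adder and mux) and the clocked register.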

Optional Inputs to the Mux

Real designs add more sources to the same mux:

| Source | When selected |
| --- | --- |
| PC + W | Default: sequential execution |
| PC + offset (PC-relative) | Conditional branches, short jumps |
| Absolute immediate | Long jumps, calls to fixed addresses |
| Register / ALU output | Indirect jumps, function returns, switch dispatch |
| Reset / boot vector | After power-on or external reset |
| Exception / interrupt vector | After a trap |

The select line widens from 1 bit to 3 bits, since six sources need three select bits, but the structural idea is unchanged.
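
A wider mux is still just an indexed choice. In the sketch below, the `PCSource` encoding (which source gets which select value) is entirely hypothetical; only the list of sources comes from the table above.

```python
from enum import IntEnum


class PCSource(IntEnum):
    """Hypothetical select-line encoding for a six-input next-PC mux."""
    SEQUENTIAL = 0
    PC_RELATIVE = 1
    ABSOLUTE = 2
    REGISTER = 3
    RESET = 4
    EXCEPTION = 5


def pc_mux(select: int, inputs: dict[PCSource, int]) -> int:
    """Return the input chosen by the select value."""
    return inputs[PCSource(select)]


# Select bits needed to encode the highest source index.
select_bits = (len(PCSource) - 1).bit_length()

inputs = {
    PCSource.SEQUENTIAL: 0x05,
    PCSource.PC_RELATIVE: 0x02,
    PCSource.ABSOLUTE: 0x40,
    PCSource.REGISTER: 0x7F,
    PCSource.RESET: 0x00,
    PCSource.EXCEPTION: 0xF0,
}
print(select_bits, hex(pc_mux(PCSource.PC_RELATIVE, inputs)))
```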

Reset Behavior: Where Does the PC Start?

When the CPU is powered on or reset, the PC must hold a known address — otherwise execution begins at random.

There are two common conventions:

  • PC = 0: Execution begins at address 0x0000…0. The DigiSim toy CPU and many embedded microcontrollers use this. ROM is mapped at the bottom of the address space, and the first instruction at address 0 is the first instruction of the program.
  • Boot vector: A fixed non-zero address, or an address loaded from a fixed vector location. x86 cold-resets to physical address 0xFFFFFFF0 (the top of the 4 GB space, near the BIOS ROM). ARM Cortex-M cores read the initial PC from a reset vector stored at address 0x00000004. Some 6502-family chips read the reset vector from 0xFFFC/0xFFFD.

In hardware, “reset” is a wire to the PC’s asynchronous-clear (or synchronous-load) input. When asserted, the PC is forced to the boot value regardless of the mux.

Worked Example: A Six-Instruction Loop Program

Consider the following pseudo-program in an 8-bit CPU with $W = 1$:

0x00: LDI R0, 5      ; load 5 into R0
0x01: LDI R1, 0      ; load 0 into R1
0x02: ADD R1, R0     ; R1 = R1 + R0
0x03: DEC R0         ; R0 = R0 - 1, sets Z flag if R0 hits 0
0x04: BNZ 0x02       ; if Z == 0, branch back to 0x02
0x05: HLT

Stepping through PC values:

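| Cycle | PC at fetch | Instruction | After fetch | Branch taken? | Next PC |
| --- | --- | --- | --- | --- | --- |
| 1 | 0x00 | LDI R0, 5 | PC + 1 = 0x01 | n/a | 0x01 |
| 2 | 0x01 | LDI R1, 0 | PC + 1 = 0x02 | n/a | 0x02 |
| 3 | 0x02 | ADD R1, R0 | PC + 1 = 0x03 | n/a | 0x03 |
| 4 | 0x03 | DEC R0 (R0=4, Z=0) | PC + 1 = 0x04 | n/a | 0x04 |
| 5 | 0x04 | BNZ 0x02 | PC + 1 = 0x05; target = 0x02 | yes (Z=0) | 0x02 |
| 6 | 0x02 | ADD R1, R0 | | | |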

At cycle 5 the mux selects the branch target (0x02) instead of the sequential next address (0x05) because the Z flag is clear. The same one-bit decision drives every loop, every if, and every function call in every CPU on Earth.
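
The full run can be reproduced with a tiny interpreter. This is a sketch, not the DigiSim implementation: the tuple encoding of instructions is invented, and it assumes only DEC updates the Z flag, since that is the only flag behavior the listing describes.

```python
PROGRAM = [
    ("LDI", 0, 5),      # 0x00: R0 = 5
    ("LDI", 1, 0),      # 0x01: R1 = 0
    ("ADD", 1, 0),      # 0x02: R1 = R1 + R0
    ("DEC", 0, None),   # 0x03: R0 = R0 - 1, sets Z if R0 hits 0
    ("BNZ", 0x02, None),  # 0x04: if Z == 0, branch back to 0x02
    ("HLT", None, None),  # 0x05: stop
]


def run(program, max_steps=100):
    """Execute the program and return (list of PC values at each fetch, registers)."""
    regs = [0, 0]
    zero_flag = False
    pc = 0
    trace = []
    for _ in range(max_steps):
        op, a, b = program[pc]
        trace.append(pc)
        next_pc = (pc + 1) & 0xFF          # default sequential path (W = 1)
        if op == "LDI":
            regs[a] = b & 0xFF
        elif op == "ADD":
            regs[a] = (regs[a] + regs[b]) & 0xFF
        elif op == "DEC":
            regs[a] = (regs[a] - 1) & 0xFF
            zero_flag = regs[a] == 0
        elif op == "BNZ":
            if not zero_flag:
                next_pc = a                # mux selects the branch target
        elif op == "HLT":
            break
        pc = next_pc
    return trace, regs


trace, regs = run(PROGRAM)
print([hex(p) for p in trace[:6]])
print(regs)
```

The first six entries of the trace match the table: 0x00 through 0x04, then back to 0x02. The loop body runs five times before Z is set and the branch falls through to HLT.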

Common Pitfalls

  • Wrong increment in PC + W. A common simulator bug is to use PC + 1 for a byte-addressed architecture whose instructions are 4 bytes wide. The CPU will then treat the second byte of each instruction as the start of the next one and decode misaligned garbage, as if every program were random data.
  • Updating PC before the fetch reads memory. The PC must hold the current address while memory is being read. Update on the clock edge, not combinationally.
  • Forgetting the reset. A PC with no reset wire boots to whatever value the flip-flops randomize to on power-up. Always tie the reset.
  • Branching to an unaligned address. On RISC architectures with 4-byte instructions, jumping to address 0x1003 instead of 0x1004 is a fault. The PC’s low bits should be tied to zero or a fault should be raised.
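
The alignment pitfall has two common remedies, both easy to sketch. The helper names are invented, and the code assumes the instruction width is a power of two, which is what makes the bit-mask trick valid.

```python
def is_aligned(address: int, width: int = 4) -> bool:
    """True if the address is a multiple of the instruction width."""
    return address % width == 0


def force_aligned(address: int, width: int = 4) -> int:
    """Remedy 1: tie the low bits to zero (width must be a power of two)."""
    return address & ~(width - 1)


def checked_target(address: int, width: int = 4) -> int:
    """Remedy 2: raise a fault on a misaligned target."""
    if not is_aligned(address, width):
        raise ValueError(f"misaligned branch target: {address:#x}")
    return address


print(is_aligned(0x1004), is_aligned(0x1003))
print(hex(force_aligned(0x1003)))
```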

Build It in DigiSim

Open the Sequential Instruction Executor template. You will see:

  • A PROGRAM_COUNTER_8BIT block at the top of the fetch path.
  • An adder hard-wired to add 1 on every clock cycle.
  • A mux that selects between PC + 1 and a target address driven by the control unit.
  • A reset switch that forces the PC to 0x00.

Step the clock manually and watch the PC advance. Then load a program containing a branch instruction and observe the moment the mux flips from “sequential” to “target” — that single bit is control flow in its purest form.

Where the PC Sits in the Bigger Picture

The program counter is one node in the larger CPU data flow. The instruction it points at gets latched into the instruction register, where the decode stage cracks it into opcode and operand fields — covered in the upcoming Instruction Register and the Decode Stage post. The result of arithmetic operations sets the flags the conditional branches consult — covered in the upcoming CPU Flags Register post.

Read the Fetch-Decode-Execute case study for the full loop, then load the Sequential Instruction Executor template and single-step through a branch instruction. Watching the PC jump non-sequentially for the first time is the moment the abstraction stops being a diagram and starts being a circuit.