5.3 Computer Architecture & Microprocessors

Key Takeaways

The fetch-decode-execute cycle is the fundamental instruction loop, driven by the program counter and the clock.
Pipelining overlaps instruction stages so throughput approaches one instruction per clock despite multi-cycle latency.
The memory hierarchy trades speed for size: registers, then L1/L2/L3 cache, main memory (RAM), and finally disk/SSD.
Effective access time = hit rate × cache time + miss rate × memory time, so even at a 90% hit rate the misses dominate the average.
An n-bit address bus addresses 2^n locations, so a 32-bit byte address space spans 4 GiB.

Last updated: June 2026

The CPU, registers, memory, and buses

A classic von Neumann computer has three parts joined by buses: the central processing unit (CPU), main memory, and input/output (I/O). Instructions and data share one memory, whereas a Harvard architecture gives them separate memories. The CPU contains the arithmetic logic unit (ALU) that performs computation, a control unit that sequences operations, and fast registers.

Key registers include the program counter (PC) holding the next instruction's address, the instruction register (IR) holding the current instruction, the memory address register (MAR) holding the address being accessed, the memory data register (MDR) holding the value read from or written to memory, and the accumulator or general registers.

Three buses connect the parts: the address bus carries the location, the data bus carries the value, and the control bus carries read/write and timing signals. Address-bus width sets maximum addressable memory: an n-bit address bus addresses 2^n locations, so a 32-bit address space spans 4 GiB of byte-addressable memory, and a 16-bit address bus reaches only 64 KiB.
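The 2^n rule above can be checked with a few lines of Python. This is a minimal sketch; the function names `addressable_locations` and `format_size` are illustrative and not from any library, and one byte per address is assumed.

```python
def addressable_locations(address_bits: int) -> int:
    """Number of distinct addresses an n-bit address bus can select: 2**n."""
    if address_bits < 0:
        raise ValueError("address_bits must be non-negative")
    return 2 ** address_bits


def format_size(num_bytes: int) -> str:
    """Render a byte count using binary prefixes (KiB, MiB, GiB)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number below 1024.
        if size < 1024 or unit == units[-1]:
            return f"{size:g} {unit}"
        size /= 1024


# Assumes byte-addressable memory, so locations == bytes.
for bits in (16, 32):
    print(bits, format_size(addressable_locations(bits)))
# 16 64 KiB
# 32 4 GiB
```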

The fetch-decode-execute cycle and addressing

Every instruction passes through the fetch-decode-execute cycle:

Fetch: the control unit copies the instruction at the address in the PC into the IR, then increments the PC.
Decode: the control unit interprets the opcode and identifies the operands.
Execute: the ALU or memory unit performs the operation, possibly writing back to a register; a result may also be stored to memory.
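The three steps above can be sketched as a loop over a toy accumulator machine. The opcodes (`LOADI`, `ADDI`, `JMP`, `HALT`) and the tuple encoding are invented for illustration and do not correspond to any real instruction set.

```python
def run(program, max_steps=100):
    """Run a toy accumulator machine and return the final accumulator value."""
    pc = 0   # program counter: address of the next instruction
    acc = 0  # accumulator
    for _ in range(max_steps):
        ir = program[pc]       # fetch: copy the instruction at PC into the IR
        pc += 1                # ...then increment the PC
        opcode, operand = ir   # decode: separate opcode and operand
        if opcode == "LOADI":  # execute: load an immediate value
            acc = operand
        elif opcode == "ADDI":  # execute: add an immediate value
            acc += operand
        elif opcode == "JMP":   # execute: a branch overwrites the PC
            pc = operand
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    raise RuntimeError("step limit reached without HALT")


program = [("LOADI", 5), ("ADDI", 7), ("HALT", 0)]
print(run(program))  # 12
```

Note that the PC is incremented during fetch, so a jump simply replaces the already-incremented value.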

Operands are located using addressing modes: immediate (the value is in the instruction), register (the operand is in a register), direct (the instruction holds the operand's address), indirect (the instruction holds the address of a location that contains the operand's address), and indexed (base register plus offset). Instruction sets follow two philosophies: RISC uses a small set of simple, fixed-length instructions that pipeline cleanly, so a program needs more of them, while CISC offers complex, variable-length instructions that each do more work, so a program needs fewer. The FE may ask which favors pipelining - the answer is RISC.
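To make the addressing modes concrete, the sketch below resolves one operand under each mode. The function `fetch_operand`, the register name `R1`, and the memory contents are all hypothetical; memory is modeled as a plain Python list indexed by address.

```python
def fetch_operand(mode, field, registers, memory):
    """Return the operand value selected by `mode` and the instruction `field`."""
    if mode == "immediate":
        return field                        # the value is in the instruction
    if mode == "register":
        return registers[field]             # the operand is in a register
    if mode == "direct":
        return memory[field]                # field is the operand's address
    if mode == "indirect":
        return memory[memory[field]]        # field points at the address
    if mode == "indexed":
        base_reg, offset = field
        return memory[registers[base_reg] + offset]  # base register + offset
    raise ValueError(f"unknown addressing mode {mode!r}")


registers = {"R1": 4}
memory = [10, 20, 3, 40, 50, 60]

print(fetch_operand("immediate", 2, registers, memory))        # 2
print(fetch_operand("register", "R1", registers, memory))      # 4
print(fetch_operand("direct", 2, registers, memory))           # 3
print(fetch_operand("indirect", 2, registers, memory))         # 40
print(fetch_operand("indexed", ("R1", 1), registers, memory))  # 60
```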

Pipelining

Pipelining overlaps the stages of consecutive instructions like an assembly line, so while one instruction executes, the next decodes and a third fetches. A classic five-stage RISC pipeline is IF (fetch), ID (decode), EX (execute), MEM (memory), WB (write-back). A k-stage pipeline ideally raises throughput toward one instruction per clock cycle even though each instruction still takes k cycles end to end; the ideal speedup approaches k over a non-pipelined design.
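The throughput claim can be checked numerically. The sketch below assumes an ideal pipeline: every stage takes exactly one cycle and there are no stalls, so n instructions finish in k + (n - 1) cycles instead of n × k. The function names are illustrative.

```python
def pipelined_cycles(stages: int, instructions: int) -> int:
    """Cycles for an ideal pipeline: k to fill, then one more per instruction."""
    return stages + (instructions - 1)


def unpipelined_cycles(stages: int, instructions: int) -> int:
    """Cycles when each instruction runs all k stages before the next starts."""
    return stages * instructions


def speedup(stages: int, instructions: int) -> float:
    """Ideal speedup of the pipelined design over the non-pipelined one."""
    return unpipelined_cycles(stages, instructions) / pipelined_cycles(
        stages, instructions
    )


for n in (1, 10, 1000):
    print(n, pipelined_cycles(5, n), f"{speedup(5, n):.2f}")
# 1 5 1.00
# 10 14 3.57
# 1000 1004 4.98
```

With a single instruction there is no overlap and no gain; as n grows, the speedup approaches the stage count k = 5.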

Pipelines stall on hazards: a data hazard when an instruction needs a result not yet written back, a control hazard after a branch changes the instruction flow, and a structural hazard when two instructions need the same hardware resource in the same cycle. Forwarding (bypassing) routes a result straight from the stage that produced it to a dependent instruction before it is written back, branch prediction guesses the path, and inserted stalls (bubbles) preserve correctness at some throughput cost.
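As a rough illustration of a data hazard, the sketch below scans a short instruction list for read-after-write dependences between nearby instructions. The `(destination, sources)` encoding, the register names, and the `window` of two following instructions are assumptions made for the example; whether a flagged dependence actually stalls depends on the pipeline depth and on forwarding.

```python
def raw_dependences(instructions, window=2):
    """Find read-after-write dependences within `window` following instructions.

    Each instruction is (dest_register, (source_registers...)).
    Returns a list of (producer_index, consumer_index, register).
    """
    found = []
    for i, (dest, _sources) in enumerate(instructions):
        last = min(i + 1 + window, len(instructions))
        for j in range(i + 1, last):
            later_dest, later_sources = instructions[j]
            if dest in later_sources:
                found.append((i, j, dest))
            if later_dest == dest:
                break  # a newer write hides instruction i's result
    return found


program = [
    ("r1", ("r2", "r3")),  # 0: r1 = r2 + r3
    ("r4", ("r1", "r5")),  # 1: reads r1 right after it is produced
    ("r6", ("r1", "r4")),  # 2: reads r1 and r4
    ("r7", ("r8", "r9")),  # 3: independent
]
print(raw_dependences(program))
# [(0, 1, 'r1'), (0, 2, 'r1'), (1, 2, 'r4')]
```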

The memory hierarchy

Faster memory is smaller and costlier per bit, so systems layer several levels. The layering works because of locality of reference: programs reuse recently accessed data (temporal locality) and nearby data (spatial locality).
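The effect of locality can be seen in a small simulation. The sketch below models a direct-mapped cache, which is one simple organization chosen here for illustration; the text does not specify one. The line count and block size are arbitrary assumptions.

```python
def hit_rate(addresses, num_lines=8, block_size=4):
    """Hit rate of a toy direct-mapped cache over a sequence of byte addresses."""
    tags = [None] * num_lines  # one stored tag per cache line
    hits = 0
    for addr in addresses:
        block = addr // block_size   # which memory block holds this byte
        index = block % num_lines    # the cache line that block maps to
        tag = block // num_lines     # identifies the block within that line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag        # miss: load the block, evicting the old one
    return hits / len(addresses)


sequential = list(range(64))         # spatial locality: neighboring bytes
repeated = [0, 1, 2, 3] * 16         # temporal locality: same bytes reused
strided = list(range(0, 256, 4))     # one byte per block: no reuse at all

print(hit_rate(sequential))  # 0.75
print(hit_rate(repeated))    # 0.984375
print(hit_rate(strided))     # 0.0
```

The sequential pattern misses once per 4-byte block and then hits three times, while the strided pattern touches a new block on every access and never hits.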

Level	Typical Access Time	Typical Capacity	Managed By
Registers	< 1 ns	bytes	Compiler / CPU
L1 cache	~1 ns	tens of KB	Hardware
L2 / L3 cache	a few ns to tens of ns	hundreds of KB to tens of MB	Hardware
Main memory (RAM)	~50-100 ns	GB	Operating system
Disk / SSD	microseconds to ms	TB	Operating system

SRAM (fast, used for cache) holds a bit in a flip-flop and needs no refresh; DRAM (dense, used for main memory) stores a bit on a capacitor and must be periodically refreshed. ROM/flash is non-volatile.

Cache effectiveness and virtual memory

A cache hit finds requested data in cache; a miss forces a slower fetch from the next level. The effective access time (EAT) is:

EAT = (hit rate × cache access time) + (miss rate × memory access time)

Worked example: with a 95% hit rate, a 1 ns cache, and 100 ns memory, EAT = 0.95(1) + 0.05(100) = 0.95 + 5 = 5.95 ns. That is far better than 100 ns, yet the 5% of accesses that miss contribute 5 ns of the 5.95 ns average, which is why pushing hit rates above 95% matters so much.
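The EAT formula translates directly into code. This minimal sketch uses the same simple model as the formula above (a miss costs the memory access time only); the function name is illustrative.

```python
def effective_access_time(hit_rate, cache_time_ns, memory_time_ns):
    """EAT = hit rate * cache time + miss rate * memory time (all times in ns)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_rate * cache_time_ns + miss_rate * memory_time_ns


# Same cache (1 ns) and memory (100 ns) as the worked example.
for rate in (0.95, 0.99):
    print(rate, f"{effective_access_time(rate, 1, 100):.2f} ns")
# 0.95 5.95 ns
# 0.99 1.99 ns
```

Raising the hit rate from 95% to 99% cuts the average access time by roughly two thirds, because the miss term dominates.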

Virtual memory uses disk as an extension of RAM, dividing the address space into fixed-size pages mapped to physical frames by a page table; a page fault occurs when a referenced page is not resident and must be loaded from disk. The translation lookaside buffer (TLB) caches recent page-table entries to speed up address translation.
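Address translation can be sketched as splitting a virtual address into a page number and an offset, then looking the page up in the page table. The 4096-byte page size, the dictionary page table, and the `PageFault` exception are assumptions for illustration; a real system does this in hardware, with the operating system handling the fault.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when the referenced page is not resident in physical memory."""


def translate(virtual_address, page_table, page_size=PAGE_SIZE):
    """Map a virtual address to a physical address via the page table."""
    page_number, offset = divmod(virtual_address, page_size)
    frame = page_table.get(page_number)  # None means the page is not resident
    if frame is None:
        raise PageFault(f"page {page_number} is not resident")
    return frame * page_size + offset    # the offset is unchanged by translation


page_table = {0: 5, 1: 2}  # virtual page -> physical frame

print(translate(4100, page_table))  # 8196 (page 1, offset 4 -> frame 2)
try:
    translate(8192, page_table)      # page 2 has no entry
except PageFault as fault:
    print("page fault:", fault)
```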

Test Your Knowledge

A cache has a 90% hit rate, a 2 ns cache access time, and a 100 ns main-memory access time. What is the effective memory access time?

2 ns

11.8 ns

100 ns

51 ns

Interrupts and I/O

The CPU communicates with devices through I/O in three ways the FE tests:

Programmed I/O (polling): the CPU repeatedly reads a status flag, wasting cycles while it waits.
Interrupt-driven I/O: the device raises an interrupt when ready; the CPU saves state (PC and registers), runs an interrupt service routine (ISR), then resumes the interrupted program. This avoids busy-waiting.
Direct memory access (DMA): a DMA controller transfers a block between memory and a device without the CPU moving each word, freeing the processor and interrupting only when the transfer completes.
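The save, service, and resume sequence of interrupt-driven I/O can be sketched as follows. The CPU is modeled as a dictionary and the ISR as a Python function, both purely illustrative; on real hardware the split between what the processor saves automatically and what the ISR must save itself varies by architecture.

```python
def service_interrupt(cpu, stack, isr):
    """Save PC and registers, run the ISR, then restore the saved state."""
    stack.append((cpu["pc"], dict(cpu["regs"])))  # save a copy of the state
    isr(cpu)                                      # the ISR may change registers
    cpu["pc"], cpu["regs"] = stack.pop()          # restore and resume


log = []


def uart_isr(cpu):
    """Hypothetical ISR: uses r0 as scratch and records the event."""
    cpu["regs"]["r0"] = 0xFF
    log.append("byte received")


cpu = {"pc": 0x40, "regs": {"r0": 7, "r1": 9}}
stack = []
service_interrupt(cpu, stack, uart_isr)

print(cpu)  # {'pc': 64, 'regs': {'r0': 7, 'r1': 9}}
print(log)  # ['byte received']
```

The interrupted program sees its registers unchanged even though the ISR overwrote r0 while it ran.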

Interrupts are prioritized so urgent events (timer, power failure) preempt less critical ones; a maskable interrupt can be disabled, while a non-maskable interrupt cannot. Exceptions and traps follow the same save-and-restore pattern, whereas a reset loads the PC from the reset vector without saving state.
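Priority and masking can be combined in a short selection routine. In this sketch each pending interrupt is a `(priority, name, maskable)` tuple, and a lower number means higher priority; both conventions, along with the device names, are assumptions for illustration.

```python
def next_interrupt(pending, interrupts_enabled):
    """Pick the highest-priority interrupt the CPU may service right now.

    pending: list of (priority, name, maskable); lower priority number wins.
    Returns the chosen tuple, or None if nothing is eligible.
    """
    eligible = [
        irq for irq in pending
        if interrupts_enabled or not irq[2]  # masked-off IRQs are skipped
    ]
    return min(eligible, key=lambda irq: irq[0], default=None)


pending = [(2, "uart", True), (1, "timer", True)]
print(next_interrupt(pending, interrupts_enabled=True))   # (1, 'timer', True)
print(next_interrupt(pending, interrupts_enabled=False))  # None

pending.append((0, "power_fail", False))  # non-maskable
print(next_interrupt(pending, interrupts_enabled=False))  # (0, 'power_fail', False)
```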

Test Your Knowledge

What is the maximum number of distinct byte locations a processor with a 20-bit address bus can address?

1,024

65,536

1,048,576
