Let’s Talk Processor Architecture

“Hey Ben Walker”, you say, “you’re really good looking, but can you explain how my computer’s processor works?”. Well, I’m double trouble, and by the end of this post, you’re gonna understand processor architecture.

Actually, you’re gonna understand single-core scalar (as opposed to superscalar) processor architecture. These processors went obsolete in, like, 1993. So you’ll be twenty years out of date. Ladies.


[Schematic: block diagram of the processor components described below]

Examine the schematic above (read left-to-right), and then let’s dive in!

HOLD UP, WHAT’S THIS ABOUT CODE VERSUS DATA? Your processor needs two things. It needs data (like array<string> myAnimes, a list of the 500 animes you own), and it needs instructions that act on that data, or code (like myAnimes.eraseAll() please). Mind you, code is just data. It’s stored on your hard drive as bytes, same as anything else. However, your processor keeps track of which bytes it fetched as code and which it fetched as data, and it handles the two very differently. Anyhow.

SYSTEM BUS: So code and data are just bytes. Problem is, those bytes live in RAM (‘main memory’), where they get loaded from your hard drive; they aren’t stored in the processor itself. The system bus’s job is to take requests from the processor for specific bytes, fetch them from main memory, and forward those bytes along when they arrive.

IF THE SYSTEM BUS RECEIVES DATA: It forwards that data to the memory management unit, which will in turn forward it to the registers.

MEMORY MANAGEMENT UNIT: Called the MMU. It’s a clearinghouse for bytes. It receives requests for code/data from the rest of the processor, figures out where to look in main memory (on real hardware, that means translating virtual addresses into physical ones), and tells the system bus to fetch those bytes. It also forwards received data to the registers, and determines which register to store the data in. It also has a cache for instructions and data, so it can fulfill repeat requests without going to the system bus and main memory.
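To make the cache idea concrete, here’s a minimal C++ sketch of "check the cache first, only bother the system bus on a miss". Everything in it (ToyMemoryUnit, load, busTrips) is a made-up name for illustration, and a real cache is hardware that works on fixed-size lines, not a hash map.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy model (hypothetical names): "main memory" is a byte vector and the
// cache is a map from address to byte.
struct ToyMemoryUnit {
    std::vector<std::uint8_t> mainMemory;                 // slow storage
    std::unordered_map<std::size_t, std::uint8_t> cache;  // fast copies
    int busTrips = 0;                                     // trips over the system bus

    explicit ToyMemoryUnit(std::vector<std::uint8_t> memory)
        : mainMemory(std::move(memory)) {}

    // Return the byte at `address`, going to main memory only on a cache miss.
    std::uint8_t load(std::size_t address) {
        assert(address < mainMemory.size());
        auto hit = cache.find(address);
        if (hit != cache.end()) {
            return hit->second;  // cache hit: no bus traffic at all
        }
        ++busTrips;              // cache miss: ask the system bus
        const std::uint8_t value = mainMemory[address];
        cache[address] = value;  // remember it for next time
        return value;
    }
};
```

Load the same address twice and busTrips only goes up once, which is the whole point of having the cache.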

REGISTERS: Registers are the only memory that the processor’s execution hardware reads and writes directly. There are very few registers. A modern 64-bit x86 processor has 16 general-purpose registers that can hold 8 bytes each, meaning 128 bytes of memory. That’s not enough memory to store a paragraph of text. Because there’s so little memory, it’s very important that the processor is efficient: it can only load data immediately before that data gets used, and once that data is used, it needs to be replaced as soon as possible.
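If you want to check the arithmetic, here’s a tiny C++ sketch. The constant and function names are made up for illustration, and it assumes one character of text per byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical names; the numbers come straight from the paragraph above.
constexpr std::size_t kRegisterCount = 16;                        // general-purpose registers
constexpr std::size_t kBytesPerRegister = sizeof(std::uint64_t);  // 8 bytes each
constexpr std::size_t kRegisterFileBytes = kRegisterCount * kBytesPerRegister;

static_assert(kRegisterFileBytes == 128, "16 registers x 8 bytes = 128 bytes");

// True if `text` would fit in the registers at one character per byte.
bool fitsInRegisters(const std::string& text) {
    return text.size() <= kRegisterFileBytes;
}
```

A three-letter word fits fine. A 129-character sentence already doesn’t.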

IF THE SYSTEM BUS RECEIVES CODE: It forwards code to the instruction pre-fetcher, which in turn passes it on to the decoder, the sequencer, and the ALU.

INSTRUCTION PRE-FETCH: It figures out what instructions we’re going to execute in a few cycles, and sends requests for those instructions to the MMU right now so we’ll have them on hand when the time comes. In other words, it keeps the instructions flowing. Whenever your code branches ( if(x > 0) DoThis(); else DoThat(); ), the instruction pre-fetch has the interesting task of trying to predict whether to pre-fetch DoThis() or DoThat() before we’ve run all the instructions that determine if x>0. That logic is called a branch predictor.
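One classic way to build a branch predictor is a two-bit saturating counter, sketched below in C++. To be clear, this is a textbook scheme picked for illustration, not a claim about what any particular processor does, and the class and method names are made up.

```cpp
#include <cassert>

// Two-bit saturating counter (hypothetical class name).
// Counter values 0-1 predict "not taken", values 2-3 predict "taken".
class TwoBitPredictor {
public:
    // Our current guess for whether the branch will be taken.
    bool predictTaken() const { return counter_ >= 2; }

    // Once the branch really executes, nudge the counter toward what happened.
    void update(bool wasTaken) {
        if (wasTaken) {
            if (counter_ < 3) ++counter_;  // saturate at 3
        } else {
            if (counter_ > 0) --counter_;  // saturate at 0
        }
        assert(counter_ >= 0 && counter_ <= 3);
    }

private:
    int counter_ = 1;  // start at "weakly not taken" (an arbitrary choice)
};
```

Because it takes two wrong guesses in a row to flip a strong prediction, a loop branch that is taken a hundred times and then not taken once doesn’t throw the predictor off the next time around.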

INSTRUCTION DECODE: Remember how instructions are just stored as bytes, same as everything else? The instruction decode unit is what takes the bytes 0x01 0xC1 and decodes them as ADD [REGISTER0] TO [REGISTER1] (on x86, those two bytes encode add ecx, eax). If the decoder decodes an instruction and finds out that it references data that isn’t in our registers yet, the decoder requests that data from the MMU.
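Here’s roughly what decoding those two bytes looks like, as a C++ sketch. It assumes the x86 encoding, where opcode 0x01 is a 32-bit register ADD and the second byte (called ModRM) packs in the two register numbers. By x86’s numbering, 0x01 0xC1 adds register 0 (EAX) into register 1 (ECX). The struct and function names are made up, and this handles only that one instruction form.

```cpp
#include <cstdint>

// Result of decoding one two-byte instruction (hypothetical type).
struct DecodedAdd {
    bool valid;       // false if the bytes are not a register-to-register ADD
    int source;       // register number whose value gets added
    int destination;  // register number that receives the sum
};

// Decode the x86 form "ADD r/m32, r32" (opcode 0x01) when the ModRM byte
// selects register-direct mode (its top two bits are 11).
DecodedAdd decodeAdd(std::uint8_t opcode, std::uint8_t modrm) {
    const int mode = (modrm >> 6) & 0x3;  // bits 7-6: addressing mode
    const int reg = (modrm >> 3) & 0x7;   // bits 5-3: source register
    const int rm = modrm & 0x7;           // bits 2-0: destination register
    if (opcode != 0x01 || mode != 0x3) {
        return DecodedAdd{false, 0, 0};   // not the one form this sketch knows
    }
    return DecodedAdd{true, reg, rm};
}
```

A real decoder does this for hundreds of instruction forms at once, in hardware, but the job is the same: pull bit fields out of bytes and turn them into "do this operation on these registers".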

INSTRUCTION SEQUENCING AND CONTROL: It manages out-of-order execution. Imagine your code says myAnimes[315].MarkWatched(); ++numAnimesWatched; but the processor has yet to load myAnimes[315] into registers. The instruction sequencer recognizes instructions that we can execute immediately, and jumps them ahead of instructions that are still waiting for data. So, it allows numAnimesWatched to increment even though we’re still waiting to load myAnimes[315]. Heck, the instruction sequencer will allow any instructions that aren’t affected by the result of myAnimes[315].MarkWatched() to skip ahead in line, keeping the processor as busy as possible. To save money and power, some processors, including the Xbox 360 processor, don’t include this unit, and can only process instructions in order. Either way, the instructions are then passed to the arithmetic / logic unit.
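The "skip ahead in line" rule can be sketched in a few lines of C++. This is a toy under loud assumptions: each instruction just lists the register numbers it reads and the one it writes, and a register is "busy" while its data is still on the way. All the names here are invented, and real sequencers use far more elaborate hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One waiting instruction (hypothetical type).
struct Instr {
    std::string name;        // label for illustration only
    std::vector<int> reads;  // register numbers the instruction needs
    int writes;              // register number it overwrites, or -1 for none
};

// Return the index of the oldest instruction that can run right now, or -1
// if all of them are stuck. `busy[r]` is true while register r is waiting
// on data. An instruction may only jump ahead of a stuck one if it touches
// none of the registers the stuck one reads or writes.
int pickNextReady(const std::vector<Instr>& queue, std::vector<bool> busy) {
    std::vector<bool> readBySkipped(busy.size(), false);
    for (std::size_t i = 0; i < queue.size(); ++i) {
        const Instr& in = queue[i];
        bool ready = true;
        for (int r : in.reads) {
            assert(r >= 0 && static_cast<std::size_t>(r) < busy.size());
            if (busy[r]) ready = false;  // an input has not arrived yet
        }
        if (in.writes >= 0) {
            assert(static_cast<std::size_t>(in.writes) < busy.size());
            // Don't clobber a register that an older, stuck instruction
            // still needs to read or is going to write.
            if (busy[in.writes] || readBySkipped[in.writes]) ready = false;
        }
        if (ready) return static_cast<int>(i);
        // This one is stuck: later instructions must respect its registers.
        for (int r : in.reads) readBySkipped[r] = true;
        if (in.writes >= 0) busy[in.writes] = true;
    }
    return -1;
}
```

With the MarkWatched instruction stuck waiting on register 1, an increment that only touches register 2 gets picked first, while anything that reads register 1 keeps waiting behind it.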

ARITHMETIC / LOGIC UNIT: Also called the ALU. This is the core of the processor. It receives commands such as ADD THESE NUMBERS TOGETHER (arithmetic) or SAY '1' IF THIS NUMBER IS BIGGER THAN THAT NUMBER, OTHERWISE SAY '0' (logic), and it does them. The results get written into the registers, or get sent to the memory management unit to be written out to main memory.
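In code, the two example commands above boil down to something like this C++ sketch. The enum and function names are made up, and a real ALU supports many more operations and also reports things like overflow.

```cpp
#include <cassert>
#include <cstdint>

// The two operations from the paragraph above (hypothetical names).
enum class AluOp { Add, GreaterThan };

// Carry out one command on two 64-bit inputs and return the result.
std::uint64_t alu(AluOp op, std::uint64_t a, std::uint64_t b) {
    switch (op) {
        case AluOp::Add:
            return a + b;          // arithmetic; wraps around on overflow
        case AluOp::GreaterThan:
            return a > b ? 1 : 0;  // logic: say 1 if a is bigger, otherwise 0
    }
    assert(false && "unknown ALU operation");
    return 0;
}
```

Whatever comes back is what gets written into a register or handed to the memory management unit.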

And hey, you’re done! Don’t get me wrong, this is an absurdly simplified overview. The actual block diagram of an Intel 80386 processor handles plenty of issues I ignored, such as handling overflow/underflow in arithmetic, switching between 16/32 bit operating modes, integer vs. floating point pipelines, and pretty much everything else. But you know what? You did good. Give yourself a cookie. Or email me at walkerb@walkerb.net and complain about everything I did wrong. And happy coding!
