Category Archives: Programming

C++11 and “moving” data

Alright, time to get dirty in C++ land. We’re gonna talk about move semantics, enabled by r-value references — one of the big new feature sets in C++11. Why do they matter? How do you use them?

Backstory: Copying vs Moving

Look at this code:

std::vector<MyClass> myCollection;
MyClass myItem;
myCollection.push_back(myItem);

When you run this code, you end up with two instances of MyClass that look the same: myItem, and myCollection[0], which is a copy of whatever myItem looked like at push_back() time. Because you could play with myItem after the push_back() call, this duplication is reasonable and cool!

But what if you had…


MyClass GimmeInstance()
{
   MyClass ret;
   //do important stuff to ret
   return ret;
}

myCollection.push_back( GimmeInstance() );

Things get more interesting here. The MyClass ret returned from GimmeInstance() is temporary — it’s gonna go away at the end of the push_back() line; there’s no way we’re playing with it after that function is done. However, we’re still going to copy it into myCollection — for a brief moment at the end of push_back(), myCollection[1] and the value returned from GimmeInstance() will be two separate copies of the same data.

Well, that seems stupid. Why copy a variable we know is going away? Why not just… move it?

With C++11, we can — with some pretty important caveats.

Okay, what are the caveats about “moving” an object

There will always be one memory location that holds the return value from GimmeInstance(), and another memory location that holds myCollection[1]. Regardless of anything C++11 adds, at some point our program will have two MyClass instances in memory. We will always be looking at one memory location and writing to another.

Umm, that sounds like “copying” not “moving”

Yeah, it does. Anyhow, because we can’t get around having two different memory locations, move constructors are pointless for plain old data classes. If your class is simple enough that you can copy it via memcpy() without any fears, move constructors offer you nothing.

But let’s say this MyClass you’re dealing with isn’t plain old data. Maybe your class implements RAII, so constructing and destructing instances of MyClass is expensive or scary.

Move semantics allow you to play fast-and-loose with copy construction of your new object, and with destruction of your temporary object. You know the object you’re copying from is going away soon — so you can swap data out of it, steal its open handles instead of creating your own, and in general skip a bunch of steps. That is why move semantics are important enough to exist.

So how do I move instead of copy?

Alright! You move by using r-value references — this is the new feature from C++11. R-values are our temporary values. They’re called that because r-values show up on the right side of an assignment — they exist only to set something else equal to them.

The return value of GimmeInstance() for myCollection.push_back(GimmeInstance()), or a*b for int c = a*b — those are r-values. You can’t take the address of an r-value, and it’s destroyed at the end of the full expression it appears in (roughly, the end of that line of code).

Anyhow, a function can ask for r-values by using the token && in the function header. This passes the r-value by reference — it behaves like the & token, except it only binds to r-values.


class MyClass
{
public:
   int m_fileHandle;

   MyClass()
      : m_fileHandle(-1)
   {
      //Create the handle (making it != -1); this is expensive
   }

   ~MyClass()
   {
      if (m_fileHandle != -1)
      {
         //Clean up the handle; this is expensive
      }
   }

   //Move-construct from an r-value reference! Marking it noexcept lets
   //containers like std::vector use it during reallocation instead of copying.
   MyClass(MyClass&& other) noexcept
      : m_fileHandle(other.m_fileHandle)
   {
      //burn and pillage the temporary; steal its handle
      other.m_fileHandle = -1;
   }
};

To return to the myCollection.push_back( GimmeInstance() ); call from the beginning of this post — the MyClass that gets added to the end of myCollection will be in a different memory location from the MyClass returned from GimmeInstance(), but we’ll copy data into it through MyClass(MyClass&& other) (called a move constructor).

Because of this, we save time in constructing myCollection[1] (because we don’t have to open a new handle), and we save time in destructing the r-value temporary returned from GimmeInstance() (because we don’t have to close its handle). That’s a pretty sweet deal.

Some extra points:

  • Copy constructors are the most obvious scenario for move semantics, but any function can use &&.
  • You can use std::move() to treat a named variable like an r-value. This is great if you know you won’t need a named variable after a certain point, but just be careful!
  • The Rule of Three stating that if you’re modifying the destructor, you also wanna modify the copy constructor and copy assignment operator? With this, it becomes the Rule of Five. If you’re modifying any of those three, modify the move constructor and move assignment operator too.

And that’s what’s up with move semantics! Good luck!

Finding Stuff in 3D Space: An Overview

Let me tell you about a task that 3D games face a bunch. It’s a difficult task. That task is finding out what’s related to you in 3D space. It’s how you solve scenarios like these:

  • I’m a gravity well! I want to know all the triangles close to me so I can pull them towards me (cuz that’s what gravity does)!
  • I’m a physics-simulated crate colliding with the ground! I want to know what part of myself is colliding with what part of the ground, so I can bounce off it!
  • I’m a camera! I want to know all the triangles in my field of view and nearby, so I can draw them!

These are all different queries. The gravity well wants to know everything within a certain distance, regardless of direction. The crate wants to know the location of a collision between two polygon meshes. The camera wants to know everything in a certain direction and distance. The only thing all these queries have in common is that they are about spatial relationships.

So! In all these cases, we want to give our actor a triangle/list-of-triangles representing the scene geometry they care about. But how can we generate that list? Iterating through every triangle in the whole level is a terrible idea — it’ll take O(n) time for n triangles, and game levels can have hundreds of thousands of triangles. We need some way to quickly reject large batches of triangles at once if we want to make this fast.

Well, thankfully, the problem of quickly finding a certain item in a big list of items has been heavily examined. And there’s a super-common solution — don’t represent your list of every-triangle-in-a-scene as a flat array; represent it as a tree!

Starting with the single top node of the tree that contains all triangles, pick one of multiple child nodes, each containing a subset of all triangles. Every time you go down one level in a tree, you auto-reject all the triangles in all the other branches at that level, and your time to find a triangle goes from O(n) to O(log(n)).

Anyhow, that isn’t a new idea. Video games have been using trees-of-triangles since Doom. There are mainly two different types of trees-of-triangles used in video games: Binary Space Partition trees and octrees. Each tree type is named after the number of children branching out of each node (2 for BSP trees, and 8 for octrees).

Octrees work by subdividing 3D space along the X, Y, and Z axes — the root of an octree represents the entire world, and each of its 8 children represents one of 8 equal-spaced volumes to cut the world up in to: front-right-top, front-right-bottom, front-left-top, front-left-bottom, back-right-top, back-right-bottom, back-left-top, and back-left-bottom. That is, going further positive or negative in each of [X,Y,Z].

To find spatially-related objects in an octree, you just traverse the octree and go left/right, up/down, and forward/back at every node. (you may have to traverse multiple leaves of the octree if your query doesn’t line up with the bounding boxes well).

Octrees are cool because they are simple, and because every time you go down one child, you cut out 7/8ths of the world! However, they have drawbacks in that many triangles will have to go in multiple leaves of the octree, because most triangles aren’t perfectly split along every axis.

Binary Space Partition trees can be implemented in many ways, but the core difference between them and octrees is that instead of subdividing volumes into octets along the X, Y, and Z axes, you only split volumes into two sections — but you can split along any axis you want, and you don’t have to split into even halves. Then, you iterate through and determine whether you’re “outside” or “inside” each split plane to advance down the tree.

Although you can use any split plane, some BSP trees exclusively split volumes such that one of the triangles in the scene lies along the split plane. If you abide by that restriction, then, due to some math that I frankly don’t understand, you can quickly generate a list of triangles sorted by distance from any point in the BSP-tree-represented world — which is great for drawing. There are some incredibly long explanations of BSPs in games out there if you want the details.

So anyhow! Spatial partitioning systems are a great idea and you should use them.

What the heck is COM?

You’re likely vaguely aware that COM exists and it’s important. But what the heck is COM?

Well:

  • COM stands for Component-Object Model.
  • It refers to a set of libraries and APIs, as well as the programming patterns required to use them.
  • Its goal is to provide a way for two separate programs with no prior knowledge of each other, possibly written in different languages or on different machines, to exchange data.
  • COM libraries are made by Microsoft and come with the Windows OS (although the ideas behind COM are not Windows-exclusive).

Of course, you can have two programs interface in all sorts of ways — they can define ports to send and receive packets on, check a centralized server, or whatever else you can dream of. COM is just one of many solutions for inter-process communication.

That said, COM is appealing because it’s in-depth and reusable in a way that any communication protocol you hack together won’t be, it already solves a lot of low-level problems, and with the help of some Microsoft-made “wizards” such as the Microsoft Foundation Class library, programmers who don’t know much about the problem space and just want working code can get started quickly. Plus, a lot of Microsoft libraries already work with COM.

Unfortunately, because COM is so in-depth and re-usable, the terms and programming patterns dealing with COM are abstract and vague — since everything has to make sense regardless of programming language or whether you’re talking to another computer or your own. The phrase “Component-Object Model” itself is a great example of that vagueness in action.

See, COM is called “Component-Object Model” because everything in COM is patterned like this:

  • COM provides you with a bunch of Interfaces.
  • These interfaces declare virtual functions, but they don’t define them, so you can’t instantiate an instance of an Interface class.
  • Instead, these interfaces are implemented by Component Object Classes (also called coclasses), which you don’t have direct access to.

That is, COM is an object model (a way to represent the objects that maintain and affect state in your code) that only allows access to objects through defined interfaces (and these interfaces are the components that define those objects).

This object model is great for inter-process communication — since it doesn’t assume much about where an object is located or what your programming language can do to an object — but it has nothing to do with inter-process communication specifically. So the name “COM” doesn’t really gel with the reasons programmers use the Microsoft COM libraries.

Anyhow! How do you create a COM object?

  • Every interface or coclass has an associated GUID (globally unique identifier). In COM, they’re called CLSID for coclasses and IID for interfaces.
    • Example: DEFINE_GUID(IID_ISynchronizedCallBack, 0x74c26041, 0x70d1, 0x11d1, 0xb7, 0x5a, 0x0, 0xa0, 0xc9, 0x5, 0x64, 0xfe); is the GUID for the ISynchronizedCallback interface.
  • You call CoCreateInstance(...) (the ‘Co’ means ‘Component object’), pass in the GUID for the coclass and interface you want, and receive a pointer to an instantiated coclass which gives you the functions of the associated interface on that coclass. It fails if the coclass doesn’t support the requested interface.

You can also have an existing COM coclass object and want a different interface into it. For instance, you could have a handle to a coclass through the IPersistFile interface — the COM interface for files that can be saved/loaded to disk, which offers Save() and Load() functions (and nothing else). That file may also have data that you can only get via an IDataObject interface and its GetData() function. So how do you go from your IPersistFile* to an IDataObject*? HINT: it’s not as simple as casting.

  • All COM interfaces extend from a base class called IUnknown.
  • The IUnknown base class has a function called QueryInterface.
  • You can get a new interface for an existing object by calling QueryInterface(...) on your COM object pointer and passing in the ID of the interface you want.
    • So, pMyPersistFile->QueryInterface(IID_IDataObject, (void**)&pMyDataObject) sets pMyDataObject to represent that same coclass, but with the IDataObject interface.
    • If you pass in an interface that coclass doesn’t implement, you’ll get back a null pointer (and a failing HRESULT).
So, that all looks like a lot more effort than C++ new and casting operations. Why go through all these hoops?

Well, when you do C++ new, the memory for the class you’re new‘ing will only exist on your computer, and will only be managed by your process. When you use CoCreateInstance(...), you can generate these objects on a remote server (or on another process in your own computer) — and you don’t have to worry about the nitty-gritty details of doing so.

Additionally, by having all these restrictions with interfaces, COM classes can be used in programs written in any programming language — you don’t necessarily have to understand C++ and virtual functions and inheritance to call Save() on your IPersistFile, and you can call Save() from C# to save a file managed by a C++ process on a computer a thousand miles away.

And that’s what the heck COM is!

On The History of Programming

We’re starting this post with a Zen Koan. Here it is, from The Gateless Gate, a 13th century Chinese compilation.

A monk asked the master to teach him.
The master asked, “Have you eaten your rice?”
“Yes, I have,” replied the monk.
“Then go wash your bowl”, said the master.
With this, the monk was enlightened.

Cool! We’ll get back to this.

If you are a programmer, and you care about programming, then you should study the history of programming. In fact, for most professionals, I’d argue that studying the history of programming is a better hour-per-hour investment than studying programming itself.

See, the history of programming is different from most other histories. While gravity existed before Newton, and DNA existed before Watson and Crick, programming is literally nothing but the sum of peoples’ historical contributions to programming. It is a field built from nothingness by human thought alone. It is completely artificial.

This leads to a useful fact: every addition to the world of programming that gained enough traction to be in use today was created to solve a problem. FORTRAN was created to make coding for IBM mainframes faster than writing raw assembly. BASIC built on FORTRAN’s ideas and aimed to make programming simple enough for students who weren’t engineers. C kept the portability of high-level languages while staying close enough to the metal to write operating systems in. C++ brought object-oriented programming (an idea pioneered by Simula and Smalltalk) to C, allowing better code re-use across tasks. Java expanded on C++, removing the need to recompile for different computer architectures.

Your favorite programming language is influenced by the contributions and thought patterns of every language before it.

By understanding the problems that birthed each new language, you can appreciate the solutions they offer. Java’s virtual machine was (and is) a Big Deal, the entire reason that Java exists. Learning Java without learning (or at least understanding) C++ robs that internalization from you. And don’t just learn about the languages that inspired your preferred language — learn about the offshoots and derivatives of languages you’re interested in, too. Each language highlights the deficiencies and tradeoffs of its parents and children, and learning Java can be just as useful to a C++ programmer as learning C++ could be to a Java programmer.

Even if you refuse to leave your favorite language, knowledge of its ancestors and children can make you recognize a language’s strengths and weaknesses, and tailor your program to match. And that is a valuable skill.

See, trying to master a programming language without understanding the context it arose from is like trying to understand a Zen Koan without understanding the context in which it’s meant to be read. You can’t.

C++: When should you use copy constructors?

LET’S TALK COPY CONSTRUCTORS.

Well, first, let’s do a refresher course.

1. Copy constructors are when you construct a new instance of a class by giving it an already-constructed instance of that class to clone. MyClass otherMyClass = thisMyClass; and MyClass otherMyClass(thisMyClass); are the two ways to call copy constructors. At the end of copy construction, otherMyClass is theoretically the same as thisMyClass.

2. If you don’t explicitly create a copy constructor for your class (the signature to use in your header is MyClass(const MyClass& copyFromMe) ) then C++ will default-create one for you. This auto-generated default copy constructor just calls the copy constructors of all member variables. For primitive types (int, float, etc.), that’s just a data-copy.

3. Default copy construction of pointers is a shallow copy — you copy the pointer, but not the data it points to.

4. Copy constructors can be called implicitly. When you pass-by-value into a function, what that actually means is “copy-construct clones of the variables you want to pass in”. void DoStuffWith(MyClass thisMyClass) implicitly copy-constructs thisMyClass every time you call it.

5. (A sidebar, but useful to know) MyClass otherMyClass = thisMyClass; and MyClass otherMyClass(thisMyClass);, despite looking different, both call your defined MyClass(const MyClass& copyFromMe) function. MyClass otherMyClass = thisMyClass; doesn’t call your operator=(const MyClass& copyFromMe) function, because that function is meant to be used on already-initialized instances of your class (copy assignment, not copy construction). MyClass otherMyClass; otherMyClass = thisMyClass; will default-construct otherMyClass and then call operator=.

So: when should you use copy constructors? Well, only when you really know what you’re doing, and you probably shouldn’t anyhow.

Let’s see by example, using the two classes below. They both hold a lot of memory; one statically allocates it, one dynamically allocates it.


class MyStaticAllocMem {
public:
   int m_iBuffer[1920*1080];
};

class MyDynamicAllocMem {
public:
   MyDynamicAllocMem() { m_pBuffer = new int[1920*1080]; }
   ~MyDynamicAllocMem() { delete[] m_pBuffer; }

   int* m_pBuffer;
};

void ActOnStaticAllocMem( MyStaticAllocMem staticMemValCpy ) { /* do stuff */ }
void ActOnDynamicAllocMem( MyDynamicAllocMem dynamicMemValCpy ) { /* do stuff */ }

int main()
{
   MyStaticAllocMem staticMem;
   MyDynamicAllocMem dynamicMem;

   ActOnStaticAllocMem( staticMem );
   ActOnDynamicAllocMem( dynamicMem );
}

When you call ActOnStaticAllocMem( staticMem ), the copy constructor generates a clone of staticMem to use inside the function. This means it stack-allocates another 1920*1080 ints and copies over all 7.9 MB of data. This is very, very slow. Plus, if one stack-allocated MyStaticAllocMem is enough to make you worry about stack overflow, well, now you have two!

Still, your program can copy-construct static-allocated memory and survive. It’ll run significantly slower and be more prone to certain errors, but it runs. The same cannot be said about copy-constructing dynamic-allocated memory.

When you call ActOnDynamicAllocMem( dynamicMem ), the copy constructor generates a clone of dynamicMem to use inside the function. This only stack-allocates the memory for one pointer, which isn’t scary — what’s scary is this:

  • When ActOnDynamicAllocMem returns, its function-scope variable dynamicMemValCpy goes out of scope, calling dynamicMemValCpy‘s destructor.
  • Because the default copy constructor is a shallow copy, dynamicMemValCpy has a pointer that points to the same location as your dynamicMem back in main().
  • dynamicMemValCpy dutifully deletes its m_pBuffer, and the function returns, leaving you back at main() with dynamicMem pointing to deallocated data.
  • The next time you do anything with dynamicMem, the world explodes.

There are plenty of workarounds to these issues, of course. You could use reference-counted pointers, or you could pass everything by reference or const-reference or by pointer (you really should do that). But it’s so easy to accidentally write void MyFunc(MyClass val) instead of void MyFunc(const MyClass& val), and the compiler will never complain.

Now, there are times where copy constructors are super-helpful — you may legitimately want a throwaway clone of a given object. Then, you can mess with the clone’s data as much as you want, and there are no repercussions after the function exits. Copy-constructing a class with sizeof(myClass) less than sizeof(void*) may also be faster than passing a pointer or reference (I totally haven’t tested that, though).

And if you don’t want to use copy construction, you can disallow it on a per-class basis by declaring private: MyClass(const MyClass&); — or, in C++11, MyClass(const MyClass&) = delete;.

But still, given how easy it is to implicitly copy construct, and given the amount of ways implicit copy construction can kill you / the amount of engineering needed to ensure it doesn’t, I’m surprised it’s implemented in C++ as an opt-out feature instead of an opt-in feature.

TL;DR — I do not like copy constructors.

WHAT IS THIS ABOUT ASSEMBLY NOW

“Dang Ben,” you say, “it’s absurd how good you are at Starcraft 2!” Well, yeah, you’re right. But what’s almost as absurd is how cool assembly is.

Assembly is the code that your C/C++ gets compiled to (other languages too, but fuck ’em). It’s a super low-level, close-to-the-metal language, where each line of code represents exactly one task for the processor. There’s a bunch of different flavors of assembly, depending on your processor, but we’re talking about 32-bit x86 assembly here.

Let’s see what the code int firstNum = 10; int secondNum = 31; int thirdNum = firstNum + secondNum; looks like in assembly:

mov dword ptr [firstNum],0Ah
mov dword ptr [secondNum],1Fh
mov eax,dword ptr [firstNum]
add eax,dword ptr [secondNum]
mov dword ptr [thirdNum],eax

As you can see, lines of code in assembly are structured command var1, var2.

  • command is one of a preset list of commands, called the instruction set. These instructions are the only things your processor can do; all your code is expressed in terms of these instructions.
  • var1 is the destination, and var2 is the source.
  • [x] means “don’t look at x, look at the memory at the address held in x“. So, it’s a pointer-dereference, like * in C.
  • dword ptr means the memory operand is 32 bits long, or double the size of a 16-bit word.
  • mov means “move”, and 0Ah means “hex value 0A”, so mov dword ptr [firstNum],0Ah writes 0x0A (decimal 10) into firstNum.
  • Sometimes var1 is used as a source as well as a destination — add var1, var2 means var1 = var1 + var2.

Cool! So that tells us everything, except… what is eax? Remember that processors can only do arithmetic and logic on data in registers. eax isn’t a variable, it’s a handle to a physical register! 32-bit x86 has only eight general-purpose registers that you can read and modify at will, and eax is the one you’ll see most (because it’s favored for arithmetic). The differences between those eight registers (oh my god are there differences) are a whole aside of their own.

Anyhow, assembly isn’t just some academic concept. You can read the assembly your code gets compiled into, and even insert your own assembly in-line with C/C++ code for sick micro-optimizations (sort of — there’s caveats).

In Visual Studio, stick a breakpoint in some code and hit alt+8 when you hit it. Congratulations! You’re looking at assembly! You can even step through individual instructions to get some hot debugging action. This is a really powerful tool for learning low-level architecture, and I totally encourage you to play with it. There’s no abstractions left when you’re reading assembly. Check out how for and while loops are actually implemented — it’s all just GOTO instructions (well, the instruction is called JMP).

If you want to write in assembly, you can do that too! Maybe. You can write inline assembly for x86 processors, but compilers for newer x64 processors don’t accept inline assembly and recommend you use a predefined set of highly-optimized, low-level intrinsic functions instead.

This isn’t because of any hardware changes in x64. Instead, it’s because inline assembly isn’t necessarily a speed boost. Having inline assembly defeats a ton of compile-time optimizations, since it means the compiler doesn’t get full control over what data is in which registers at any time. You can string intrinsics together to get the speedy low-level behavior you want, and you aren’t fighting the compiler by doing so.

So, you may not want to use inline assembly as a performance tool, since support for it is going away and it can hurt your perf by ruining compiler optimizations. However, it’s still a great learning tool, so don’t be afraid to try it out! To add inline assembly, just use the __asm { ... } block. For instance:

int myNum = 10;

__asm {
   mov eax, dword ptr [myNum]
   mov ebx, 20
   add eax, ebx
   mov dword ptr [myNum], eax
}

if(myNum == 30)
   cout << "OH DAAAAAAMN";

Anyhow, that’s enough. I’m off to perfect my reaper-into-battlecruiser build. Happy coding!

Let’s Talk Processor Architecture

“Hey Ben Walker”, you say, “you’re really good looking, but can you explain how my computer’s processor works?”. Well, I’m double trouble, and by the end of this post, you’re gonna understand processor architecture.

Actually, you’re gonna understand single-core scalar (as opposed to superscalar) processor architecture. These processors went obsolete in, like, 1993. So you’ll be twenty years out of date. Ladies.



Examine the schematic above (read left-to-right), and then let’s dive in!

HOLD UP, WHAT’S THIS ABOUT CODE VERSUS DATA? Your processor needs two things. It needs data (like array<string> myAnimes, a list of the 500 animes you own), and it needs instructions that act on that data, or code (like myAnimes.eraseAll() please). Mind you, code is just data. It’s stored on your hard drive as bytes, same as anything else. However, your processor knows which bytes represent code and which represent data, and it handles the two very differently. Anyhow.

SYSTEM BUS: So code and data are just bytes. Problem is, those bytes lie in your hard drive or RAM (generally, ‘main memory’) — they aren’t stored in the processor itself. The system bus’ job is to take requests from the processor to grab specific bytes, get them from main memory, and forward those bytes around when received.

IF THE SYSTEM BUS RECEIVES DATA: It forwards that data to the memory management unit, which will in turn forward it to the registers.

MEMORY MANAGEMENT UNIT: Called the MMU. It’s a clearinghouse for bytes. It receives requests for code/data from the rest of the processor, figures out where to look in main memory, and tells the system bus to do so. It also forwards received data to the registers, and determines which register to store the data in. It also has a cache for instructions and data, so it can fulfill requests without going to the system bus and main memory.

REGISTERS: Registers are the only memory the processor can directly compute on. There are very few registers. Modern processors have 16 registers that can hold 8 bytes each, meaning 128 bytes of memory. That’s not enough memory to store a paragraph of text. Because there’s so little memory, it’s very important that the processor is efficient — it can only load data immediately before that data gets used, and once that data is used, it needs to be replaced as soon as possible.

IF THE SYSTEM BUS RECEIVES CODE: It forwards code to the instruction pre-fetcher, which in turn goes to the decoder, the sequencer, and the ALU.

INSTRUCTION PRE-FETCH: It figures out what instructions we’re going to execute in a few cycles, and sends requests for those instructions to the MMU right now so we’ll have them on hand when the time comes. In other words, it keeps the instructions flowing. Whenever your code branches ( if(x > 0) DoThis(); else DoThat(); ), the instruction pre-fetch has the interesting task of trying to predict whether to pre-fetch DoThis() or DoThat() before we’ve run all the instructions that determine if x>0. That logic is called a branch predictor.

INSTRUCTION DECODE: Remember how instructions are just stored as bytes, same as everything else? The instruction decode unit is what takes the data 0x01c1 and decodes it as ADD [REGISTER0] TO [REGISTER2]. If the decoder decodes an instruction and finds out that it references data that isn’t in our registers yet, the decoder requests that data from the MMU.

INSTRUCTION SEQUENCING AND CONTROL: It manages out-of-order execution. Imagine your code says myAnimes[315].MarkWatched(); ++numAnimesWatched; but the processor has yet to load your 315th anime into registers. The instruction sequencer recognizes instructions that we can execute on immediately, and jumps them ahead of instructions that are still waiting for data. So, it allows numAnimesWatched to increment even though we’re still waiting to load myAnimes[315]. Heck, the instruction sequencer will allow any instructions that aren’t affected by the result of myAnimes[315].MarkWatched() to skip ahead in line, keeping the processor as busy as possible. To save money and power, some processors — including the Xbox 360 processor — don’t include this unit, and can only process instructions in order. Sequenced instructions are passed to the arithmetic / logic unit.

ARITHMETIC / LOGIC UNIT: Also called the ALU. This is the core of the processor. It receives commands such as ADD THESE NUMBERS TOGETHER (arithmetic) or SAY '1' IF THIS NUMBER IS BIGGER THAN THAT NUMBER, OTHERWISE SAY '0' (logic), and it does them. The results get written into the registers, or get sent to the memory management unit to be written out to main memory.

And hey, you’re done! Don’t get me wrong, this is an absurdly simplified overview. The actual block diagram of an Intel 80386 processor handles plenty of issues I ignored, such as handling overflow/underflow in arithmetic, switching between 16/32 bit operating modes, integer vs. floating point pipelines, and pretty much everything else. But you know what? You did good. Give yourself a cookie. Or email me at walkerb@walkerb.net and complain about everything I did wrong. And happy coding!