Author Archives: walkerb

C++11 and “moving” data

Alright, time to get dirty in C++ land. We’re gonna talk about move semantics (also known as r-value references), one of the big new featuresets in C++11. Why do they matter? How do you use them?

Backstory: Copying vs Moving

Look at this code:

std::vector<MyClass> myCollection;
MyClass myItem;

When you run this code, you end up with two instances of MyClass that look the same: myItem, and myCollection[0], which is a copy of whatever myItem looked like at push_back() time. Because you could play with myItem after the push_back() call, this duplication is reasonable and cool!

But what if you had…

MyClass GimmeInstance()
   MyClass ret;
   //do important stuff to ret
   return ret;

myCollection.push_back( GimmeInstance() );

Things get more interesting here. The MyClass ret returned from GimmeInstance() is temporary — it’s gonna go away at the end of the push_back() line, there’s no way we’re playing with it after that function is done. However, we’re still going to copy it in to myCollection — for a brief moment at the end of push_back(), the value at myCollection[1] and the value returned from GimmeInstance() will be two separate copies of each other.

Well, that seems stupid. Why copy a variable we know is going away? Why not just… move it?

With C++11, we can — with some pretty important caveats.

Okay, what are the caveats about “moving” an object

There will always be one memory location that holds the return value from GimmeInstance(), and another memory location that holds myCollection[1]. Regardless of anything C++11 adds, at some point our program will have two MyClass instances in memory. We will always be looking at one memory location and writing to another.

Umm, that sounds like “copying” not “moving”

Yeah, it does. Anyhow, because we can’t get around having two different memory locations, move constructors are pointless for plain old data classes. If your class is simple enough that you can copy it via memcpy() without any fears, move constructors offer you nothing.

But let’s say this MyClass you’re dealing with isn’t plain old data. Maybe your class implements RAII, so constructing and destructing instances of MyClass is expensive or scary.

Move semantics allow you to play fast-and-loose with copy construction of your new object, and with destruction of your temporary object. You know the object you’re copying from is going away soon — so you can swap data out of it, steal its open handles instead of creating your own, and in general skip a bunch of steps. That is why move semantics are important enough to exist.

So how do I move instead of copy?

Alright! You move by using r-value references — this is the new feature from C++11. R-values are our temporary values. They’re called that because r-values show up on the right side of an equation — they exist only to set something else equal to them.

The return value of GimmeInstance() for myCollection.push_back(GimmeInstance()), or a*b for int c = a*b — those are r-values. You can’t take the address of r-values, and they’re deleted at the end of the line of code.

Anyhow, a function can ask for r-values by using the token && in the function header. This passes the r-value by reference — it behaves like the & token, except it only accepts r-value variables.

class MyClass
   int m_fileHandle;

      : m_fileHandle(-1)
      //Create the handle (making it != -1); this is expensive

      if (m_fileHandle != -1)
         //Clean up the handle; this is expensive

   MyClass(MyClass&& other) //Copy construct from a r-value reference!
      //burn and pillage the temporary; steal its handle
      m_fileHandle = other.m_fileHandle;
      other.m_fileHandle = -1;

To return to the myCollection.push_back( GimmeInstance() ); call from the beginning of this post — the MyClass that gets added to the end of MyCollection will be in a different memory location from the MyClass returned from GimmeInstance(), but we’ll copy data into it through the MyClass(MyClass&& other) (called a move constructor).

Because of this, we save time in constructing myCollection[1] (because we don’t have to open a new handle), and we save time in destructing the r-value temporary returned from GimmeInstance() (because we don’t have to close its handle). That’s a pretty sweet deal.

Some extra points:

  • Copy constructors are the most obvious scenario for move semantics, but any function can use &&.
  • You can use std::move() to treat a named variable like an r-value. This is great if you know you won’t need a named variable after a certain point, but just be careful!
  • The Rule of Three stating that if you’re modifying the destructor, you also wanna modify the copy constructor and copy assignment operator? With this, it becomes the Rule of Five. If you’re modifying any of those three, modify the move constructor and move assignment operator too.

And that’s what’s up with move semantics! Good luck!

Finding Stuff in 3D Space: An Overview

Let me tell you about a task that 3D games face a bunch. It’s a difficult task. That task is finding out what’s related to you in 3D space. It’s how you solve scenarios like these:

  • I’m a gravity well! I want to know all the triangles close to me so I can pull them towards me (cuz that’s what gravity does)!
  • I’m a physics-simulated crate colliding with the ground! I want to know what part of myself is colliding with what part of the ground, so I can bounce off it!
  • I’m a camera! I want to know all the triangles in my field of view and nearby, so I can draw them!

These are all different queries. The gravity well wants to know everything within a certain distance, regardless of direction. The crate wants to know the location of a collision between two polygon meshes. The camera wants to know everything in a certain direction and distance. The only thing all these queries have in common is that they are about spatial relationships.

So! In all these cases, we want to give our actor a triangle/list-of-triangles representing the scene geometry they care about. But how can we generate that list? Iterating through every triangle in the whole level is a terrible idea — it’ll take O(n) time for n triangles, and game levels can have hundreds of thousands of triangles. We need some way to quickly reject large batches of triangles at once if we want to make this fast.

Well, thankfully, the problem of high-performance finding a certain item in a list of items has been heavily examined. And there’s a super-common solution — don’t represent your list of every-triangle-in-a-scene as a flat array, represent it as a tree!

Starting with the single top node of the tree that contains all triangles, pick one of multiple child nodes, each containing a subset of all triangles. Every time you go down one level in a tree, you auto-reject all the triangles in all the other branches at that level, and your time to find a triangle goes from O(n) to O(log(n)).

Anyhow, that isn’t a new idea. Video games have been using trees-of-triangles since Doom. There’s mainly two different types of trees-of-triangles used in video games: Binary Space Partition Trees, and Octrees. Each tree type is named after the number of leaves branching out of each node (2 for BSP trees, and 8 for octrees).

Octrees work by subdividing 3D space along the X, Y, and Z axes — the root of an octree represents the entire world, and each of its 8 children represents one of 8 equal-spaced volumes to cut the world up in to: front-right-top, front-right-bottom, front-left-top, front-left-bottom, back-right-top, back-right-bottom, back-left-top, and back-left-bottom. That is, going further positive or negative in each of [X,Y,Z].

To find spatially-related objects in an octree, you just traverse the octree and go left/right, up/down, and forward/back at every node. (you may have to traverse multiple leaves of the octree if your query doesn’t line up with the bounding boxes well).

Octrees are cool because they are simple, and because every time you go down one child, you cut out 7/8ths of the world! However, they have drawbacks in that many triangles will have to go in multiple leaves of the octree, because most triangles aren’t perfectly split along every axis.

Binary Space Partition trees can be implemented in many ways, but the core difference between them and octrees is that instead of subdividing volumes into octets along the X, Y, and Z axes, you only split volumes into two sections — but you can split along any axis you want, and you don’t have to split into even halves. Then, you iterate through and determine whether you’re “outside” or “inside” each split plane to advance down the tree.

Although you can use any split plane, some BSP trees exclusively split volumes such that one of the triangles in the scene lies along the split plane. If you abide by that restriction, then, due to some math that I frankly don’t understand, you can quickly generate a list of triangles sorted by distance from any point in the BSP-tree-represented world — which is great for drawing. Read this incredibly long explanation of BSPs in games for information.

So anyhow! Spatial partitioning systems are a great idea and you should use them.

What the heck is COM?

You’re likely vaguely aware that COM exists and it’s important. But what the heck is COM?


  • COM stands for Component-Object Model.
  • It refers to a set of libraries and APIs, as well as the programming patterns required to use them.
  • Its goal is to provide a way for two separate programs with no prior knowledge of each other, possibly written in different languages or on different machines, to exchange data.
  • COM libraries are made by Microsoft and come with the Windows OS (although the ideas behind COM are not Windows-exclusive).

Of course, you can have two programs interface in all sorts of ways — they can define ports to send and receive packets on, check a centralized server, or whatever else you can dream of. COM is just one of many solutions for inter-process communication.

That said, COM is appealing because it’s in-depth and reusable in a way that any communication protocol you hack together won’t be, it already solves a lot of low-level problems, and with the help of some Microsoft-made “wizards” such as the Microsoft Foundation Class library, programmers that don’t know much about the problem space and just want working code can get started quickly. Plus, a lot of Microsoft libraries already work with COM.

Unfortunately, because COM is so in-depth and re-usable, the terms and programming patterns dealing with COM are abstract and vague — since everything has to make sense regardless of programming language or whether you’re talking to another computer or your own. The phrase “Component-Object Model” itself is a great example of that vagueness in action.

See, COM is called “Component-Object Model” because everything in COM is patterned like this:

  • COM provides you with a bunch of Interfaces.
  • These interfaces declare virtual functions, but they don’t define them, so you can’t instantiate an instance of an Interface class.
  • Instead, these interfaces are implemented by Component Object Classes (also called coclasses), which you don’t have direct access to.

That is, COM is an object model (a way to represent the objects that maintain and affect state in your code) that only allows access to objects through defined interfaces (and these interfaces are the components that define those objects).

This object model is great for inter-process communication — since it doesn’t assume much about where an object is located or what your programming language can do to an object — but it has nothing to do with inter-process communication specifically. So the name “COM” doesn’t really gel with the reasons programmers use the Microsoft COM libraries.

Anyhow! How do you create a COM object?

  • Every interface or coclass has an associated GUID (globally unique identifier). In COM, they’re called CLSID for coclasses and IID for interfaces.
    • Example: DEFINE_GUID(IID_ISynchronizedCallBack, 0x74c26041, 0x70d1, 0x11d1, 0xb7, 0x5a, 0x0, 0xa0, 0xc9, 0x5, 0x64, 0xfe); is the GUID for the ISynchronizedCallback interface.
  • You call CoCreateInstance(...) (the ‘Co’ means ‘Component object’), pass in the GUID for the coclass and interface you want, and receive a pointer to an instantiated coclass which gives you the functions of the associated interface on that coclass. It fails if the coclass doesn’t support the requested interface.

You can also have an existing COM coclass object but you want a different interface into it. For instance, you could have a handle to a coclass through the IPersistFile interface — the COM interface for files that can be saved/loaded to disk, which offers Save() and Load() functions (and nothing else). That file also may have data that you can only get via an IDataObject interface and its GetData() function. So how do you go from your IPersistFile* to a IDataObject*? HINT: it’s not as simple as casting.

  • All COM interfaces extend from a base class called IUnknown.
  • The IUnknown base class has a function called QueryInterface
  • You can get a new interface for an existing object by calling QueryInterface(...) on your COM object pointer and passing in the ID of the interface you want
    • So, pMyPersistFile->QueryInterface(IID_IDataObject, &pMyDataObject) sets pMyDataObject to represent that same coclass, but with the IDataObject interface
    • If you pass in an interface that coclass doesn’t implement, you’ll get back a nullptr
  • So, that all looks like a lot more effort than C++ new and casting operations. Why go through all these hoops?

    Well, when you do C++ new, the memory for the class you’re new‘ing will only exist on your computer, and will only be managed by your process. When you use CoCreateInstance(...), you can generate these objects on a remote server (or on another process in your own computer) — and you don’t have to worry about the nitty-gritty details of doing so.

    Additionally, by having all these restrictions with interfaces, COM classes can be used in programs written in any programming language — you don’t have to necessarily understand C++ and virtual functions and inheritance to call Save() on your IPersistFile, and you can call Save() from C# to save a file managed by a C++ process on a computer a thousand miles away.

    And that’s what the heck COM is!

    Matrix Multiplication, A Novella: Chapter 4

    So now you work at a video game company. That happened in between chapters.

    That means you’re big league rolling! All those Khan Academy videos and Intro To Linear Algebra courses keep saying video games use matrices, and you’re gonna find out how.

    Nathan Drake WireframeSo, most big-budget video games use 3D models (there’s 2D big-budget games, too, but fuck ’em). You’re probably familiar with the idea that all 3D models are composed of a bunch of tiny triangles (or squares, which are just two triangles touching bellies). Although each triangle alone is just a geometric shape, you can pile on enough of them (and have pretty enough textures sitting on top of them) that it all starts looking like a human face.

    Anyhow I can’t make a human face so I’m gonna use those triangles to form Simple-O, the Simple Robot.

    You keep doing your thing, Simple-O.

    You keep doing your thing, Simple-O.

    Now, all those triangles that make up Simple-O? You can think of all the triangles combined as being a set of 3D positions (called vertices), and a list of mappings:

    vertex0 = [1.5, 2.3, 0.5] //a 3-dimensional position in [x,y,z]
    vertex1 = [1.7, 2.3, 0.5]
    vertex2 = [1.7, 3.0, 0.5]
    vertex3 = [1.5, 3.0, 0.1]

    triangle0 = [vertex0, vertex1, vertex2] //a triangle with endpoints at vertex0, vertex1, and vertex2
    triangle1 = [vertex0, vertex2, vertex3] //triangle1 touches bellies with triangle0

    Those mappings — the idea that triangle1 is made by connecting vertex0, vertex2, and vertex3 — never really change. The triangles that form Simple-O’s left arm will forever form his left arm.

    That said, although vertex0, vertex2, and vertex3 will forever be a triangle, their individual positions can and will change — for instance, their y value will increase if Simple-O becomes twice as tall.

    Okay, Simple-O, thats kinda weird.

    Okay, Simple-O, that’s kinda weird.

    If Simple-O is going to be twice as tall, although we don’t need to touch our mappings, we need to touch every vertex. Since 3D characters for 360/PS3 games can have 15,000+ vertices, we’re talking lots of vertices becoming twice as tall.

    At this point, there’s some parallels to your job at the North Pole calculating presents for kids based on their actions:

    • We’ve got a super-long list of items
      • Kids in the North Pole
      • Vertices in Simple-O’s 3D model
    • Each item has multiple values associated with it
      • Number of punches/hugs/kisses per kid in the North Pole
      • Positions in x, y and z per vertex in Simple-O’s 3D Model
    • Given these values, we want to compute different-but-related values per item
      • Convert number of punches/hugs/kisses into amount of candy/coal per kid in the North Pole
      • Convert positions in x, y and z into new positions in x, y and z after growing twice as tall in Simple-O’s 3D model

    Now, when we dealt with kids in the North Pole, it was easy to understand the conversion from punches/hugs/kisses to presents. We’re getting a lot subtler when we talk about vertex positions in video games. There’s two important differences to recognize here.

    First, while punches, hugs, and kisses are clearly separate actions, positions in x/y/z are not so separate at first glance. After all, you move between x, y, and z every day without thinking of which dimension you’re moving along, right?

    The key here is to think not in terms of one location in 3D space, but to think in terms of 3 locations in 3 separate 1D spaces. It’s hard to do, because we’ve been trained to think in terms of 3D space since we were babies trying to reach for the food in front of us. Just remember — when a point moves up or down (its y value changes), it does nothing to the x or z values. x, y, and z really are separate. An object moving diagonally is doing two separate things at once — like a child punching while kissing. The fact that punching-while-kissing is harder than moving diagonally doesn’t matter. You are not looking at one value (position), you are looking at three completely different values (distance-along-x-axis, distance-along-y-axis, distance-along-z-axis).

    Second, when we converted actions-per-child into presents-per-child, we went from one set of information (actions a child performs) to a completely different set of information (presents the child deserves). Our matrix to make Simple-O grow twice as tall isn’t really giving us a new set of information — we’re converting from x/y/z values on the vertices that define his model to slightly different x/y/z values. We don’t ever stop talking about x/y/z values-per-vertex, not like we stopped talking about actions-per-child when we moved to presents-per-child.


    This is a weird logical leap, and it has a lot of implications worth thinking about. For instance:


    Let these differences sink in — re-read the past few paragraphs if you need to. They are super-important to feeling comfortable with 3D vertex positions.

    Once you’re done letting them sink in, we can move to the next task: what the heck does B, the position-after-2x-taller per position matrix, look like? Or, how do we represent the idea of “grow twice as tall” as a matrix?

    Actually, it’s pretty simple: let’s think of it in terms of outputX, outputY, and outputZ per inputX, inputY, and inputZ. It’s a spreadsheet just like we’ve been doing.


    In our old presents-per-action spreadsheet, each row defined an action and told us how performing that action affects what presents you’ll get. Each column defined a present and told us what actions would contribute to getting that present.

    We can apply that reading here. Each row defines a distance in either the x, y, or z dimension and tells us how that distance affects our distance along x, y, or z when we’re twice as tall. Each column tells us what our x, y, and z values for a theoretical twice-as-tall Simple-O rely on.

    Let’s fill this in slowly.

    First, we know that when you get taller, your distance along one of the x, y, or z dimensions doesn’t affect your distance along either of the other two dimensions. That is, getting taller doesn’t make you wider or fatter. So, our input x value can only affect our output x value, and same for y and z. Our outputs for a dimension only depend on our inputs in that dimension, and everything else can be set to 0.


    Next, as we just said, we’re only getting taller — not wider or fatter. So our ‘inputX->outputX’ and ‘inputZ->outputZ’ values are both just ‘1’ — our taller Simple-O is 1x as thick and 1x as wide.


    At this point, it should be obvious: since we’re getting twice as tall, our ‘inputY->outputY’ value is 2. That is, our outputY is double the distance along our y-axis than our inputY.


    And that’s a 3D scaling matrix! As you can imagine, setting any of your x, y and z values to a non-1 value will make Simple-O that much wider, taller, and fatter. let’s see what our vertical scale looks like applied to Simple-O:


    Nicely done!

    Compiler-Generated Functions in C++

    Let’s take a simple class:

    class MyClass
        int x;

    How many functions does MyClass have? None?

    Oh, if only.

    MyClass has four functions. These four default functions are generated by the compiler automatically. Here’s the functions and their behaviors:

    • CONSTRUCTOR: MyClass(). It’s called when you create an instance of your class. It calls the constructors of any base classes and member variables. It might do other things, too — more on that below.
    • DESTRUCTOR: ~MyClass(). It’s called when you delete an instance of your class (or when it goes out of scope and gets cleaned up). It calls the destructors of any member variables and base classes, and that’s it.
    • COPY CONSTRUCTOR: MyClass(const MyClass& other). It’s called when you create an instance of a class off another existing instance of that class (MyClass mc1; mc1.x = 1337; MyClass mc2(mc1);). It calls the copy constructors of any base classes and member variables.
    • COPY ASSIGNMENT: MyClass& operator=(const MyClass& other). It’s called when you set an already-created class to equal another already-created class (MyClass mc1; mc1.x = 1337; MyClass mc2; mc2 = mc1;). It calls the copy assignment-ers of any base classes and member variables.

    Intrinsic types (int, float, bool, pointers to anything, etc) have constructors that do nothing (not even initialize to zero), destructors that do nothing, and copy constructors and copy assignment-ers that blindly copy the bytes over.

    Your user-defined classes can replace any of these compiler-generated functions with your own functions by adding one of the listed function declarations in your class definition. This is called overriding.

    Between the compiler-generated default functions and any possible overrides of them, there’s a lot of edge cases to understand.

    Compiler-generated constructors can have multiple behaviors.

    If MyClass has a user-defined constructor, then MyClass item1; and MyClass item2 = MyClass(); will both call your user-defined MyClass() — there’s only one behavior.

    However, if MyClass is relying on the compiler-generated constructor, MyClass item1; performs default initialization, while MyClass item2 = MyClass(); performs value initialization.

    Default initialization calls the constructors of any base classes, and nothing else. Since constructors for intrinsic types don’t do anything, that means all your member variables will have garbage data — whatever data was last seen in those addresses.

    Value initialization also calls the constructors of any base classes. Then, one of two things happens:

    • If MyClass is a “plain old data” class, meaning all its member variables are either intrinsic types or classes that only contain intrinsic types and no user-defined constructor/destructor, it initializes everything to 0.
    • If MyClass is too complicated to qualify as “plain old data”, it doesn’t touch any data, same as default initialization (so member variables have garbage data unless explicitly constructed otherwise).

    To phrase it another way:

    • If you have a simple class (it’s “plain old data”), and you create it in a simple way (MyClass item1;), C++ performs simple behavior (creates the class but doesn’t initialize the memory to anything, it’s all garbage data).
    • If you have a simple class, and you create it in a complex way (MyClass item2 = MyClass();), C++ performs complex behavior (creates the class and initializes memory to zero).
    • If you have a complex class (it’s not just “plain old data”), then regardless of how you create it, C++ doesn’t want to assume anything so it doesn’t initialize any memory unless you override the constructor and tell it to.
    • If you define a constructor that receives parameters, or a copy constructor, then the compiler won’t generate a default constructor.

      If your class has parameter-receiving constructors like MyClass(int i), or a user-defined copy-constructor, that indicates you’re doing trickery at construction (because otherwise, you’d be fine with the compiler-generated default behavior). Therefore, the compiler won’t generate a default MyClass(), in order to guarantee there’s no code paths that don’t apply your trickery. Note that if you define a constructor, the compiler will still generate a default copy constructor.

      If you’re going to subclass, override the destructor and make it virtual.

      Imagine a game architecture where all renderable objects extend from a IRenderable base class that contains a virtual void Render() function. A derived class such as PlayerCharacter will inherit from IRenderable and override Render() with its render code. The game just keeps a list of IRenderable* pointers — one of which will be a pointer to our PlayerCharacter object, typecast as an IRenderable — and it burns through the list calling Render() on every entry in order to draw the scene.

      Now, when the game is done, it will delete every item in its IRenderable* list, thus calling their destructors. However, based on our definition of the default destructor, this will only call the ~IRenderable() destructor and the destructors of base classes of IRenderable. The derived class, PlayerCharacter, will never get its destructor called, since we only deleted an IRenderable — we didn’t know we were looking at a PlayerCharacter to delete.

      This is called slicing — we don’t destroy the full PlayerCharacter class, we destroy a slice of it (the parts that are contained in its IRenderable base class).

      To solve this problem, declare a virtual ~IRenderable() destructor. This will override the compiler-generated default ~IRenderable(). Then, when you delete an IRenderable*, by the magic of virtual functions, you’ll actually call the destructor of the PlayerCharacter subclass. Thus, you get to delete all the data held by PlayerCharacter, not just the slice of it contained within IRenderable.

      If you need to override one of the compiler-generated destructor, copy constructor, or copy assignment-er, then you should override all three.

      This is known as the “Rule of Three”: if the default compiler-generated behavior for one of these three behaviors isn’t good enough, the default behavior for the other two probably isn’t good either.

      Most likely, if you’re overriding one of these three functions, you’re overriding the destructor to delete some pointers that you new‘ed at some point in your object’s lifetime. I’ve already written a whole lot about why compiler-generated copy constructors are bad if you have pointers that get deleted in your destructor — in summary, a copy of a class will point to the exact same data, and when the copy dies, the data dies with it, even if the copied class still needs it.

      So, it’s a pretty firm rule that you should override the copy constructor and copy assignment-er if you override the destructor. However, it’s not necessarily true that if you override one of the copy constructor or copy assignment-er, you need to override the other two. Having only one of them overridden is a “bad smell” — it indicates that you should carefully check your logic. Just make sure that your logic is correct, and you’ll be fine.

      IN CONCLUSION: That’s 1000 words on behaviors that the compiler gives you for free. C++ is scary, man.

    Matrix Multiplication, A Novella: Chapter 3

    There’s too many kids in the world!

    See, you set up a great system going on for tracking a kid’s actions and determining how much coal or candy they get. It’s easy, and it’s foolproof. Why, it’s so good that Santa cut the budget for North Pole Behavior Psychology Lab, because he thinks you’re capable enough to do it without money.

    Now, this is a problem. More and more kids are believing in Santa every day. And the computers you’ve got running the matrix multiplications to determine each kid’s presents? They’re slow. Relics from the 80s.

    In fact, you just crossed a scary threshold — given the number of children Santa needs to deliver to next Christmas, you don’t have enough computing power and time to determine presents for every kid by Christmas Eve. Christmas can’t happen.

    There’s three ways you can fix this problem:

    • Reduce the number of kids to compute presents for
    • Increase the number of computers to run math
    • Decrease the amount of time it takes a computer to compute presents

    Unfortunately, thanks to budget cuts and your lack of desire to kill children, you can’t implement the first two solutions. Sounds like you have to make optimizations if you want to save Christmas again.

    So let’s re-examine our pipeline. Identify anything we can simplify.


    Alright! So we go from knowing the actions per child (matrix A), to knowing the points per child (A*B), to knowing the presents per child (A*B*C).

    Huh. So, when we look at our system this way… why do we even bother with matrix B, the points per action matrix? I mean, we cancel out both terms belonging to B when we multiply A*B*C — we don’t ultimately care about points or actions. Can we fold B into C? Create one presents per action matrix, and discard the idea of “points” entirely?


    Ooooooh. That’s a good idea.

    Yes! That is totally possible! In fact, this is one of the most awesome and convenient things about matrix multiplication. All you need to do to combine B and C into a presents per action matrix is… multiply them.


    And if we multiply our actions per child matrix with this new presents per action matrix…


    It’s the same result as we got before! That’s our exact presents-per-child matrix from previous chapter, but we only did one matrix multiplication to get it!

    This is called the associative property of multiplication — previously, we did (A*B)*C, and now we’re doing A*(B*C), and we get the same results. That is, matrix B can associate with A or C. If B and C are both constant matrices, we can just multiply them together into a new matrix D=B*C and use it everywhere.

    So with one quick calculation of D=B*C, you just halved the amount of processing time your computers need to calculate presents per kid! You are really good at saving Christmas, dude.

    Matrix Multiplication, A Novella: Chapter 2

    Well, your days as an event planner are over. There was a lot of blood at Bacchanal S, and the police are looking for you. So you ditch town, fly to the North Pole, and find a job in Santa’s Behavior Psychology lab.

    “Naughty” and “Nice” aren’t simple concepts, and Santa needs experts to determine which child is which. The lab tracks each child’s actions, and each action is associated with a naughtiness and niceness “score”… hey, does this look familiar to you?


    Good lord, this is just like the spreadsheets you used as an event planner!


    Okay, you just crushed some naughty/nice scores — but there’s more. There’s another spreadsheet that determines what Christmas presents a child gets for a given amount of naughtiness or niceness:


    You can handle this! You have to take that naughtiness/niceness-score-per-child matrix you just generated, and multiply it by this matrix…


    Wait. That can’t be right. Your math is correct, but those numbers are way too huge. There must be a logic error somewhere. Shoot, how does this matrix multiplication thing work, again?

    • Your first matrix is a spreadsheet representing amount of Y per X, for some set of items X and Y. For instance:
      • “Number of [tables/chairs/balloons] I need to rent per event, for [my 5 upcoming events]” (Y = items to rent, X = events to rent items for)
      • “Number of [punches/hugs/kisses] performed per child, for [every child Santa tracks]” (Y = actions children perform, X = children whose actions Santa tracks)
    • Your second matrix is a spreadsheet representing the amount of Z per Y. For instance:
      • “Price charged by [Vendor1/Vendor2/Vendor3], for each [table/chair/balloon]” (Z = vendor prices, Y = items to rent)
      • “How many [naughtiness/niceness points] you get, for each [punch/hug/kiss]” (Z = personality points, Y = actions children perform)
    • By multiplying the two matrices, you go from amount of Y per X to amount of Z per X
      • “I know how many [tables/chairs/balloons] each event requires, and I know how much [Vendor 1/Vendor 2/Vendor 3] charges to rent an individual [table/chair/balloon], so how much would it cost to use [Vendor 1/Vendor 2/Vendor 3] for each event?”
      • “I know how many [punches/hugs/kisses] each child performed, and I know how [naughty/nice] it is to perform an individual [punch/hug/kiss], so how [naughty/nice] has each child been?”

    Now, let’s see what went wrong when you tried going from naughtiness/niceness-score-per-child to presents-per-child:


    Look at that! These units are all wrong. So what are the right units?

    Well, we know that matrix multiplication means “amount of Y per X * amount of Z per Y = Amount of Z per X“. We have the matrix [naughtiness/niceness score] per [each child Santa tracks], and we want to end up with the matrix [amount of coal/candy] per [each child Santa tracks]. So…

    • X = [each child Santa tracks]
    • Y = [naughtiness/niceness score]
    • Z = [amount of coal/candy]

    Therefore, we want our second matrix, amount of Z per Y, to be [amount of coal/candy] per [naughtiness/niceness score]! Then,


    So we want the amount of coal per naughtiness point, not the amount of naughtiness points per coal — we inverted the units, is all. In order to invert our matrix back to the correct position, we have to flip the rows and columns (this is called a transpose) and invert the numbers to match. And now we get:


    Nice! You just solved what presents each kid is getting! You LITERALLY saved Christmas.

    Matrix Multiplication, A Novella: Chapter 1

    So you got hired as an event planner. Congratulations on your new job!

    You’re a clever event planner, and therefore you know that your events need tables for people to eat at. You rent tables per-event from your favorite party supply place, Vendor 1. Here’s the spreadsheet you use to track costs:

    (I hope you didn’t just ignore that spreadsheet. If your eyes glazed over, go back and read it. It’s simple.)

    Everything looks great! However, two nights before Gala A, you wake up in a cold sweat: you forgot chairs. You figure out how many chairs each event needs, call Vendor 1 to get a price, and re-tool your spreadsheet. Nice save! Now your spreadsheet looks like this (click for bigger, and don’t ignore this one either):

    It’s one night before Gala A, and your phone is ringing at 2AM. What could it be? It’s Vendor 1’s archrival, Vendor 2! She wants to scoop her old boss, and she has a deal for you: chairs cost $7 each, but tables are only $8 each. You can’t mix and match tables from Vendor 1 with chairs from Vendor 2 — ever since Vendor 2 keyed Vendor 1’s car, they won’t work together. (Click for bigger)

    Looks like Vendor 2 is cheaper for Gala A, Soiree B, and Reception C. You should keep Vendor 1 for Banquet D, though!

    Man, you’re good at this event planning stuff. Two months pass, and you’re an old pro. This is what your spreadsheet looks like now:

    More events! vendors! More items to rent! This is intense. Frankly, now that you’re a big-shot, these labels are looking like overhead. So you clean it up:

    These are the same spreadsheets, just with all the non-math information chopped out. You decide to call these math-only spreadsheets matrices. To keep things from getting confusing, you implement some rules:

    • Matrix A holds item-to-rent details per event. A row represents a single event, and a column represents the number of a single item-type to rent for each event.
    • Matrix B holds vendor prices per item-to-rent. A row represents a single item-type, and a column represents a vendor’s prices for each item.
    • If we want to add a new item-type (for instance, lamps) then we need to know how much each vendor charges per-lamp. This means that the number of columns in A (representing item-types) must match the number of rows in B (since they also represent item-types).

    Now, you’ve removed everything but the sweet, sweet math. And once you do that, something surprisingly simple shows up:

    That is… (click for bigger)

    And you just invented matrix multiplication. Congratulations!

    Let’s Talk About Cameras: Perspective and Field of View

    [Note: This is technically part of the series on DirectX, but is readable on its own, and does not require programming experience to be understood.]

    At the end of my previous post on rendering 3D scenes with DirectX, you were able to render an object, but you would only see a specific narrow area (-1 to +1 in X and Y, 0 to 1 in Z). Well, that’s because we didn’t set up a camera into the scene. We can do that! But first, we need to know how cameras work.

    Renderers like DirectX only really care about three properties of your camera. The first two properties are its position and its orientation (where it’s looking at); it’s pretty easy to visualize how changing those parameters will change the final rendered scene. The third camera property that DirectX cares about is the camera’s field of view, and that’s actually pretty tricky for people to visualize. So that’s what this article is about!

    Let’s get started! Here’s an overhead view of a living room scene:
    Living Room Top Down View

    And here’s that same scene viewed from a camera at the front of the living room:
    70 Degree FOV Camera Front

    And here’s a fun fact!

    In the top-down view of this scene, the sofa at the back appears five times wider than the black chair at the front (it’s 302 pixels long, the black chair is 54 pixels long). However, in the above image, the sofa appears only twice as wide (355 pixels long, while the black chair is 180 pixels long). So, the black chair has grown to be 2.5x larger than the sofa!

    Of course, that’s not true. Your mind instinctively understands that it’s the same scene, and the same furniture — the chair didn’t really grow bigger. The sofa only appears to be smaller because it’s further away from the camera.

    That decrease in size with distance is called perspective, and it can be controlled. You can change how quickly things get smaller, and it creates different images. Check out this picture of the same scene, and hover your mouse over it to compare it with the previous image:

    That image above is the same living room scene, except things get even smaller as they get farther from the camera. None of the chairs have moved between these two images.

    So how do we describe the degree to which things in a scene get smaller as they get further away? It’s called field of view (or focal length if you’re working with real-world cameras, but I think “field of view” is a clearer term).

    Field of view is measured in degrees, and it represents how much of the scene the camera can “see”. If a camera has a 70 degree field of view, then it can “see” anything at most 35 degrees to the left or right from the direction it’s facing. Hopefully, this image will explain things:

    Two Different Fields of View

    On top, we can see a birds-eye view of the living room with two wedges marking what two different cameras can see. On bottom is the image each camera produces. The camera on the left has a 70 degree field of view, meaning that the triangular visibility wedge in the bird’s-eye view is 70 degrees wide. The camera on the right has a 90 degree field of view, so its wedge is 90 degrees wide.

    When the camera’s field of view increases, the camera can see more stuff. However, it doesn’t see equal amounts more nearby stuff and far-away stuff — it sees way more far-away stuff than nearby stuff. Check out how the visibility wedges for the two cameras are almost the same at the nearby black chair, but the 90 degree FOV camera on the right has a massively wider view wedge at the far-away sofa. In order to fit all that far-away stuff, things that are far away have to get smaller. And that’s perspective!

    One final note — you may have noticed that the 70-degree FOV camera on the left has actually been placed farther from the black chair than the 90-degree FOV camera on the right. I did this on purpose to make the black chair appear equally big in both scenes. Here’s what it looks like to have the camera increase its field of view and move closer at the same time:


    It looks like the world is growing more distant around the chair!

    Movies use this effect way often, because it’s a super-fun way to express an idea to the audience.

    Alfred Hitchcock’s Vertigo moves the camera back and increases the field of view here to make it look like the ground is receding, emphasizing the main character’s vertigo.

    Steven Spielberg does it here in Jaws to show the main character — who just spotted a shark — losing track of everything else around him. Martin Scorsese’s Goodfellas does this trick in reverse (move the camera further away and decrease the field of view) here. This causes the viewer to feel the characters’ paranoia — the world around them is closing in, and they have to watch for an assassin in every corner.

    So that’s pretty cool! You just learned a ton about cameras and it came with a free lesson in film appreciation. And come DirectX part 5, we’ll learn how to write our own 3D cameras and play with this stuff ourselves!

    DirectX Part 4: Loading FBX Models

    “Hey Ben”, you say, “these articles on DirectX are great!” (I know.) “But when you talked about vertices in your previous post, you had us hard-code an array of vertex information. I’m working with snooty artists who want to to use a graphical user interface to place thousands of triangles. Can I import a mesh from Maya?”


    Some AAA shit right here

    Some AAA shit right here

    Chances are, your artists save out to .FBX files. FBX is a proprietary format, owned by Autodesk (the guys who make Maya and Max — the tools 90% of game artists use to generate content). If you don’t have any tools/skills to create FBXes, you can download a shitty FBX of a squashed bell that I made (see a screenshot of bell in Maya viewport on the right).

    Anyhow, so Autodesk are assholes, but they aren’t stupid, and they know nobody will use their formats if their formats aren’t easy for people to do stuff with. So they created an SDK for loading FBX files and giving you all the sexy data however you want it. You can download the SDK here and you can read the documentation here. It will give you libraries to compile against (probably located in C:\Program Files\Autodesk\FBX\FBX SDK\2014.1\lib\vs2012\x64\release, with headers in C:\Program Files\Autodesk\FBX\FBX SDK\2014.1\include) that parse FBXes into data. It’s free and you can use it in commercial products.

    To get the FBX SDK set up in Visual Studio:

    • Download and install the SDK
    • In Visual Studio, right-click on your project > Properties > Configuration Properties > VC++ Directories > Include Directories, and add C:\Program Files\Autodesk\FBX\FBX SDK\2014.1\include . Make sure you do it across debug+release configurations.
    • Right-click on your project > Properties > Configuration Properties > Linker > General > Additional Library Directories, and add C:\Program Files\Autodesk\FBX\FBX SDK\2014.1\lib\vs2012\[x86/x64]\debug for debug configuration, C:\Program Files\Autodesk\FBX\FBX SDK\2014.1\lib\vs2012\[x86/x64]\release for release configuration
    • Right-click on your project > Properties > Configuration Properties > Linker > Input, and add libfbxsdk.lib under “Additional Dependencies” (across debug+release!)
    • Copy C:\Program Files\Autodesk\FBX\FBX SDK\2014.1\lib\vs2012\[x86/x64]\[debug/release]\libfbxsdk.dll into your project’s output directory for the appropriate x86/x64 and debug/release build.
    • You’ll probably want to add a reference to the .pdb containing debug information for the FBX SDK, C:\Program Files\Autodesk\FBX\FBX SDK\2014.1\lib\vs2012\x64\debug\libfbxsdk.pdb, via Tools > Options > Debugging > Symbols

    Now that you’ve got access to the FBX SDK, you’re gonna write some code that looks like this:

    #include <fbxsdk.h>
    #include <vector>

    struct MyVertex
       float pos[3];

    FbxManager* g_pFbxSdkManager = nullptr;

    HRESULT LoadFBX(std::vector* pOutVertexVector)
       if(g_pFbxSdkManager == nullptr)
          g_pFbxSdkManager = FbxManager::Create();

          FbxIOSettings* pIOsettings = FbxIOSettings::Create(g_pFbxSdkManager, IOSROOT );

       FbxImporter* pImporter = FbxImporter::Create(g_pFbxSdkManager,"");
       FbxScene* pFbxScene = FbxScene::Create(g_pFbxSdkManager,"");

       bool bSuccess = pImporter->Initialize("C:\\MyPath\\MyModel.fbx", -1, g_pFbxSdkManager->GetIOSettings() );
       if(!bSuccess) return E_FAIL;

       bSuccess = pImporter->Import(pFbxScene);
       if(!bSuccess) return E_FAIL;


       FbxNode* pFbxRootNode = pFbxScene->GetRootNode();

          for(int i = 0; i < pFbxRootNode->GetChildCount(); i++)
             FbxNode* pFbxChildNode = pFbxRootNode->GetChild(i);

             if(pFbxChildNode->GetNodeAttribute() == NULL)

             FbxNodeAttribute::EType AttributeType = pFbxChildNode->GetNodeAttribute()->GetAttributeType();

             if(AttributeType != FbxNodeAttribute::eMesh)

             FbxMesh* pMesh = (FbxMesh*) pFbxChildNode->GetNodeAttribute();

             FbxVector4* pVertices = pMesh->GetControlPoints();

             for (int j = 0; j < pMesh->GetPolygonCount(); j++)
                int iNumVertices = pMesh->GetPolygonSize(j);
                assert( iNumVertices == 3 );

                for (int k = 0; k < iNumVertices; k++)             {                int iControlPointIndex = pMesh->GetPolygonVertex(j, k);

                   MyVertex vertex;
                   vertex.pos[0] = (float)pVertices[iControlPointIndex].mData[0];
                   vertex.pos[1] = (float)pVertices[iControlPointIndex].mData[1];
                   vertex.pos[2] = (float)pVertices[iControlPointIndex].mData[2];
                   pOutVertexVector->push_back( vertex );


       return S_OK;

    Although a lot of code, it’s not that bad! Here’s what it does:

    • Initialize the FBX loader and tell it what types of data to load (this is the FbxIOSettings, and you can use it to specifically load only meshes, only meshes + materials, etc.)
    • Load the file at C:\MyPath\MyModel.fbx and grab its root node (the “handle” to the FBX contents)
    • Go through each item in the FBX, and ignore any items that aren’t meshes
    • Make sure that every polygon in the mesh is a triangle! This is important because DirectX only draws triangles. Tell your artists to triangulate their meshes (or write something that auto-triangulates meshes here).
    • Copy over the data for the 3 vertices that make up the triangle into your pOutVertexVector

    And once it’s all done, you can call LoadFBX(...), pass in a pointer to an empty vector, and you’ll come out with a vector of vertex positions!

    Now, I’m leaving it to you to hook everything up. No code samples. It’s your chance to fly solo. Just remember these things!

    • Since you stored your vertex data in a std::vector, everything is contiguous memory. If you copy in myVertexVector.size() * sizeof(MyVertex) bytes starting from &myVertexVector[0] to the GPU, you’ll have a perfect array of positions!
    • Right now, your array holds nothing but vertex positions. Modify your pVertexLayout to match, or add the COLOR and SOME_MORE_DATA attributes from the previous example into your MyVertex and give them reasonable values.
    • And in your Render() loop, make sure that m_pDeviceContext->Draw(...) is told the correct amount of vertices to draw.
    • Don’t forget to change your vertex shader if you make any changes to the vertex layout!

    Finally, if you don’t use the shitty-bell FBX attached up at the top of this post, take note. In its default state, DirectX will only draw polygons with an [X,Y] from [-1,-1] to [1,1] and with a Z from 0 to 1. Look through the vertices in your FBX and make sure everything is in that range!

    And by the end of it all, here’s what you’ll see:

    Wow, this image of a janky purple bell was worth all the hours I have spent following your tutorials, Ben!

    Wow, Ben, this image of a janky purple bell was worth all the hours I have spent following your tutorials!