Monthly Archives: July 2013

DirectX Part 3: Vertices and Shaders

Let’s get some real 3D going.


Everything in 3D is just a series of triangles, and triangles are just a series of vertices. Vertices must have 3-dimensional positions (it’s the only absolutely required information DirectX 11 needs), but they can have any number of additional traits: normal vectors, colors (for vertex coloring), lighting information (per-vertex lighting), metadata, etc. So, before anything happens we have to tell DX11 what our vertex layout looks like, that is, what information defines a given vertex:

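D3D11_INPUT_ELEMENT_DESC pVertexLayout[] =
{
   { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D11_INPUT_PER_VERTEX_DATA, 0 },
   { "COLOR", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 12, D3D11_INPUT_PER_VERTEX_DATA, 0 },
   { "SOME_MORE_DATA", 0, DXGI_FORMAT_R32_FLOAT, 0, 24, D3D11_INPUT_PER_VERTEX_DATA, 0 },
};
UINT uiNumElements = ARRAYSIZE( pVertexLayout );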

Most of these toggles are meaningless to beginners. The two important ones are semantic ("POSITION" and "SOME_MORE_DATA"), which is the variable name you’ll call on in shaders, and format (DXGI_FORMAT_R32G32B32_FLOAT and DXGI_FORMAT_R32_FLOAT), which defines how much / what type of data is associated with the named variable.

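You can name your vertex variables anything you want. The names that get special treatment are the system-value semantics, which start with "SV_" (such as the "SV_POSITION" we’ll use in the shader) and must have certain formats associated with them. A name like "POSITION" is a widely used convention rather than a requirement.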

In our pVertexLayout, the format for "COLOR" is 3 RGB floats — easy. "POSITION" is also 3 RGB floats — they’re actually going to be used as XYZ, the RGB nomenclature means nothing. "SOME_MORE_DATA" is just one float for playing with.
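If the byte layout feels abstract, here’s a minimal C++ sketch of it. The Vertex struct and the k-prefixed constants are hypothetical names invented for this illustration (they aren’t part of the tutorial’s code). The struct simply mirrors the three layout entries, so the compiler can tell us where each one starts and how big one vertex is, assuming 4-byte floats:

```cpp
#include <cstddef>  // offsetof, std::size_t

// Hypothetical CPU-side mirror of the vertex layout: three floats for
// POSITION, three for COLOR, and one for SOME_MORE_DATA.
struct Vertex
{
   float position[3];   // matches DXGI_FORMAT_R32G32B32_FLOAT
   float color[3];      // matches DXGI_FORMAT_R32G32B32_FLOAT
   float someMoreData;  // matches DXGI_FORMAT_R32_FLOAT
};

static_assert( sizeof( float ) == 4, "this sketch assumes 4-byte floats" );

// Byte offset of each element inside one vertex.
constexpr std::size_t kPositionOffset     = offsetof( Vertex, position );
constexpr std::size_t kColorOffset        = offsetof( Vertex, color );
constexpr std::size_t kSomeMoreDataOffset = offsetof( Vertex, someMoreData );

// Size in bytes of one whole vertex.
constexpr std::size_t kVertexStride = sizeof( Vertex );
```

With no padding between the floats, POSITION starts at byte 0, COLOR at byte 12 and SOME_MORE_DATA at byte 24, and one vertex takes 28 bytes.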

Next, we’ll create the actual vertices to draw. It’s just going to look like raw numbers — only the pVertexLayout lets the GPU understand how to read the data.

FLOAT pVertexArray[] =
{
   0.0f, 0.5f, 0.5f,   1.0f, 0.0f, 0.0f,   0.2f,
   0.5f, -0.5f, 0.5f,   0.0f, 1.0f, 0.0f,   0.0f,
   -0.5f, -0.5f, 0.5f,   0.0f, 0.0f, 1.0f,   -0.2f
};

So, this defines three vertices:

  • a vertex located at (0.0, 0.5, 0.5) that’s colored red (1, 0, 0) and has a SOME_MORE_DATA of 0.2
  • a vertex located at (0.5, -0.5, 0.5) that’s colored green (0, 1, 0) and has a SOME_MORE_DATA of 0.0
  • a vertex located at (-0.5, -0.5, 0.5) that’s colored blue (0, 0, 1) and has a SOME_MORE_DATA of -0.2
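To make “raw numbers plus a layout” concrete, here’s a small CPU-side sketch that reads a flat float array seven floats at a time, the same way the bullet list above reads it. DecodedVertex, kFloatsPerVertex and DecodeVertices are hypothetical names for this illustration only; the GPU does the equivalent work itself once it knows the layout.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical decoded form of one vertex.
struct DecodedVertex
{
   float x, y, z;       // POSITION
   float r, g, b;       // COLOR
   float someMoreData;  // SOME_MORE_DATA
};

// Floats per vertex: 3 for position, 3 for color, 1 extra.
constexpr std::size_t kFloatsPerVertex = 7;

// Walks the flat array one vertex at a time and splits each group of
// seven floats into named fields. Trailing floats that do not fill a
// whole vertex are ignored.
std::vector<DecodedVertex> DecodeVertices( const float* pData, std::size_t floatCount )
{
   std::vector<DecodedVertex> vertices;
   for( std::size_t i = 0; i + kFloatsPerVertex <= floatCount; i += kFloatsPerVertex )
   {
      const float* v = pData + i;
      vertices.push_back( { v[0], v[1], v[2], v[3], v[4], v[5], v[6] } );
   }
   return vertices;
}
```

Feeding it the 21 floats of pVertexArray gives back three vertices, one per bullet above.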

Next, we’ll write the shader file itself! This should be exciting for you, because this is sorta the heart of rendering. Create a new file and call it “SimpleShaders.hlsl” (that’s the name the loading code later in this post asks for). Just preserve the “.hlsl” extension. HLSL (High-Level Shading Language) is the shader-authoring language DirectX uses, and you’re about to write a hello-world in it. Here it is:

struct VS_INPUT
{
   float4 vPosition : POSITION;
   float3 vColor : COLOR;
   float OffsetX : SOME_MORE_DATA;
};

struct VS_OUTPUT
{
   float4 vPosition : SV_POSITION;
   float3 vColor : COLOR;
};

VS_OUTPUT SimpleVertexShader( VS_INPUT Input )
{
   VS_OUTPUT Output;
   Output.vPosition.x = Input.vPosition.x + Input.OffsetX;
   Output.vPosition.y = Input.vPosition.y;
   Output.vPosition.z = Input.vPosition.z;
   Output.vPosition.w = Input.vPosition.w;
   Output.vColor = Input.vColor;
   return Output;
}

float4 SimplePixelShader( VS_OUTPUT Input ) : SV_Target
{
   return float4( Input.vColor.r, Input.vColor.g, Input.vColor.b, 1.0 );
}

This is fairly simple, largely because DirectX does a lot of magic in the background. We define a vertex shader that receives a pre-defined VS_INPUT struct and outputs a VS_OUTPUT struct. That float myVal : SOMETHING construct means that we want myVal to magically receive the value SOMETHING that we define in our pVertexLayout description.

SOME_MORE_DATA is going to be placed in OffsetX in our VS_INPUT, and POSITION and COLOR will also be there. We’ll create a VS_OUTPUT, copy over position and color, and add our OffsetX to the position’s x value. (By the way, fun fact — instead of saying vPosition.{x,y,z,w}, you can say vPosition.{r,g,b,a} or vPosition.{[0],[1],[2],[3]} — they all compile the same. Use whichever nomenclature makes sense!)

That SV_POSITION in VS_OUTPUT means that it’s a SYSTEM VALUE. System values are hardcoded variables that get special treatment, and the ultimate vertex position is one such special variable.

Then, SimplePixelShader will magically receive that information and return a color to draw to screen (by writing it in SV_Target — the special variable that stores a final color for this pixel).

So that’s everything you need — you’ve defined what your vertices will look like, you’ve made some vertices, and you’ve written a shader to handle them and draw the triangle they form to screen. Now, you need to hook it all up.


First, write a function to handle shader compiling. Note that the .hlsl file we just wrote contains multiple shaders (a vertex shader and a pixel shader), and we’ll have to compile each separately.

#include <C:\Program Files (x86)\Windows Kits\8.0\Include\um\d3dcompiler.h>

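HRESULT CompileShaderFromFile(const WCHAR* pFileURI, const CHAR* pShaderName, const CHAR* pShaderModelName, ID3DBlob** ppOutBlob)
{
   DWORD dwShaderFlags = D3DCOMPILE_ENABLE_STRICTNESS;
   dwShaderFlags |= D3DCOMPILE_DEBUG; //keep debug info in the compiled shader

   ID3DBlob* pErrorBlob = nullptr;

   HRESULT hr = D3DCompileFromFile( pFileURI, nullptr, D3D_COMPILE_STANDARD_FILE_INCLUDE, pShaderName, pShaderModelName, dwShaderFlags, 0, ppOutBlob, &pErrorBlob );

   if( pErrorBlob )
   {
      //the error blob holds the compiler's messages as text; print them, then free the blob whether or not we failed
      OutputDebugStringA( (const char*)pErrorBlob->GetBufferPointer() );
      pErrorBlob->Release();
   }

   return hr;
}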

A lot of confusing toggles, which is par for the course. Pass in the path to your .hlsl file in pFileURI, the name of the shader function in pShaderName (e.g. "SimpleVertexShader"), and the name of the shader model to compile against in pShaderModelName (use "vs_5_0" for compiling vertex shaders, and "ps_5_0" for pixel shaders). The blob returned through ppOutBlob holds the compiled shader. Close your eyes to everything else.

Let’s use it to set up our vertex shader.

ID3DBlob* pVertexShaderBlob = nullptr;
ID3D11InputLayout* pInputLayout = nullptr; //named differently from the pVertexLayout description array, so the two don't clash

CompileShaderFromFile( L"SimpleShaders.hlsl", "SimpleVertexShader", "vs_5_0", &pVertexShaderBlob );
m_pd3dDevice->CreateVertexShader( pVertexShaderBlob->GetBufferPointer(), pVertexShaderBlob->GetBufferSize(), nullptr, &m_pVertexShader );
m_pDeviceContext->VSSetShader( m_pVertexShader, NULL, 0 );

HRESULT hr = m_pd3dDevice->CreateInputLayout( pVertexLayout, uiNumElements, pVertexShaderBlob->GetBufferPointer(), pVertexShaderBlob->GetBufferSize(), &pInputLayout );
pVertexShaderBlob->Release(); //the blob isn't needed once the shader and the layout exist

m_pDeviceContext->IASetInputLayout( pInputLayout );

So we use our new function to compile the SimpleVertexShader, we create a handle to the compiled code (m_pVertexShader) that recognizes it as a vertex shader, and then we tell our D3DDeviceContext to use it. Cool!

Next, we call m_pd3dDevice->CreateInputLayout, to make the GPU aware of our pVertexLayout that we defined all the way at the top, and set it as our official vertex layout. Note that CreateInputLayout requires the vertex shader in addition to the vertex input layout — this is because it cross-checks the two to make sure pVertexLayout contains all the information m_pVertexShader asks for.
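To picture that cross-check, here’s a deliberately simplified C++ sketch. FindMissingSemantics is a hypothetical helper written for this post, not a DirectX function, and it only compares names. The real check happens inside the Direct3D runtime and presumably looks at more than that.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns every semantic the shader asks for that the layout does not
// provide. An empty result means the layout covers the shader's inputs.
std::vector<std::string> FindMissingSemantics(
   const std::vector<std::string>& layoutSemantics,
   const std::vector<std::string>& shaderInputSemantics )
{
   std::vector<std::string> missing;
   for( const std::string& wanted : shaderInputSemantics )
   {
      const bool found =
         std::find( layoutSemantics.begin(), layoutSemantics.end(), wanted ) != layoutSemantics.end();
      if( !found )
      {
         missing.push_back( wanted );
      }
   }
   return missing;
}
```

With our layout and VS_INPUT the result is empty. Drop "SOME_MORE_DATA" from the layout and it comes back as the one missing name, which is the kind of mismatch that makes CreateInputLayout fail.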

Next, we set up our pixel shader, almost the same as we set our vertex shader…

ID3DBlob* pPixelShaderBlob = nullptr;
CompileShaderFromFile( L"SimpleShaders.hlsl", "SimplePixelShader", "ps_5_0", &pPixelShaderBlob );
m_pd3dDevice->CreatePixelShader( pPixelShaderBlob->GetBufferPointer(), pPixelShaderBlob->GetBufferSize(), nullptr, &m_pPixelShader );


m_pDeviceContext->PSSetShader( m_pPixelShader, NULL, 0 );

…And then we set our vertices…

D3D11_BUFFER_DESC bd;
ZeroMemory( &bd, sizeof(bd) );
bd.ByteWidth = sizeof(pVertexArray); //size in bytes of the whole vertex array
bd.Usage = D3D11_USAGE_DEFAULT;
bd.BindFlags = D3D11_BIND_VERTEX_BUFFER;
bd.CPUAccessFlags = 0; //no CPU access necessary

D3D11_SUBRESOURCE_DATA InitData;
ZeroMemory( &InitData, sizeof(InitData) );
InitData.pSysMem = pVertexArray; //Memory in CPU to copy in to GPU

ID3D11Buffer* pVertexBuffer = nullptr;
m_pd3dDevice->CreateBuffer( &bd, &InitData, &pVertexBuffer );

// Set vertex buffer
UINT offset = 0;
UINT stride = 7 * sizeof(float); //how much each vertex takes up in memory -- the size of 7 floats, one each for position XYZ, color RGB, and our SOME_MORE_DATA
m_pDeviceContext->IASetVertexBuffers( 0, 1, &pVertexBuffer , &stride , &offset );

m_pDeviceContext->IASetPrimitiveTopology( D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST );

Which, despite the line count, isn’t actually that scary! Our D3D11_BUFFER_DESC just says we want to allocate some memory on the GPU with size equal to the size of pVertexArray, to be used as a vertex buffer — it’s default behavior in every other way. Our D3D11_SUBRESOURCE_DATA tells the GPU where our vertex data lives on the CPU. We pass both structures in to m_pd3dDevice->CreateBuffer to copy that data to the GPU, then tell the GPU to use it as our VertexBuffer!
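The sizes involved are easy to sanity-check on the CPU. This little sketch repeats the vertex array and lets the compiler do the arithmetic. The k-prefixed constants are hypothetical names for illustration, and the numbers assume 4-byte floats.

```cpp
#include <cstddef>

// Same numbers as the pVertexArray defined earlier in the post.
const float pVertexArray[] =
{
   0.0f, 0.5f, 0.5f,   1.0f, 0.0f, 0.0f,   0.2f,
   0.5f, -0.5f, 0.5f,   0.0f, 1.0f, 0.0f,   0.0f,
   -0.5f, -0.5f, 0.5f,   0.0f, 0.0f, 1.0f,   -0.2f
};

// What ends up in bd.ByteWidth: the size of the whole array in bytes.
constexpr std::size_t kByteWidth = sizeof( pVertexArray );

// What ends up in stride: the size of one vertex (7 floats) in bytes.
constexpr std::size_t kStride = 7 * sizeof( float );

// How many whole vertices the buffer holds.
constexpr std::size_t kVertexCount = kByteWidth / kStride;
```

That works out to 84 bytes in the buffer, 28 bytes per vertex, and 3 vertices in total, which is the vertex count you hand to Draw.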

And now, finally, everything is set up. In your Render loop, call m_pDeviceContext->Draw( 3, 0 ); to draw 3 vertices. And will you look at that.

It took hours, but damn it, that is your triangle.


DirectX Part 2.5: The Rendering Pipeline

Okay, so, we’ve got all our DirectX stuff set up to start rendering pretty pictures.

So it’s important, at this time, to talk about the pipeline that does the rendering. Unfortunately, it’s a beast:

That's a ten-step pipeline, yes it is

Some of these can be simplified or ignored for now — but it’s important you understand this stuff. This is the core of rendering.


Input-Assembler Stage

This stage is where we assemble (as in, gather together) our inputs (as in, our vertices and the data attached to them). The input-assembler stage reads the vertex buffers we bound, uses the input layout to work out which pieces of data belong to which vertex (does every vertex have an associated color, if it needs one? A UV position?), and groups the vertices into primitives such as triangles. It then passes that information down the GPU pipeline for processing.


Vertex Shader Stage

This stage does operations on vertices, and vertices alone. It receives one vertex with associated data, and outputs one vertex with associated data. It’s totally possible you don’t want to affect the vertex at all, in which case your vertex shader will just pass data through, untouched. One of the most common operations in the vertex shader is skinning, or moving vertices to follow a skeleton doing character animations.


Hull Shader, Tessellator, and Domain Shader Stages

These are new for DirectX 11, and a bit advanced, so I’m summarizing all these stages at once. The vertex shader only allows one output vertex per input vertex, so you can’t end up with more vertices than you passed in. However, generating vertices on-the-fly has turned out to be very useful for algorithms like dynamic level of detail. So these pipeline stages were created. They allow you to generate new vertices to pass to further stages. The tessellation stages specifically are designed to create vertices that “smooth” the paths shaped by other vertices. For basic projects, it’s common to not use these stages at all.
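As a rough picture of “creating vertices along the paths shaped by other vertices”, here’s a CPU-side sketch that splits one triangle into four by adding a vertex at the middle of each edge. This is not how the hull shader, tessellator and domain shader are actually programmed (those run on the GPU and are driven by tessellation factors). Point, Triangle, Midpoint and SubdivideOnce are hypothetical names made up for this illustration.

```cpp
#include <array>
#include <vector>

struct Point { float x, y, z; };
using Triangle = std::array<Point, 3>;

// The point halfway between a and b.
Point Midpoint( const Point& a, const Point& b )
{
   return { ( a.x + b.x ) * 0.5f, ( a.y + b.y ) * 0.5f, ( a.z + b.z ) * 0.5f };
}

// Splits one triangle into four smaller ones: three corner triangles
// plus the middle one formed by the edge midpoints. Every new vertex
// lies on an edge of the original triangle.
std::vector<Triangle> SubdivideOnce( const Triangle& t )
{
   const Point ab = Midpoint( t[0], t[1] );
   const Point bc = Midpoint( t[1], t[2] );
   const Point ca = Midpoint( t[2], t[0] );
   return {
      Triangle{ t[0], ab, ca },
      Triangle{ ab, t[1], bc },
      Triangle{ ca, bc, t[2] },
      Triangle{ ab, bc, ca }
   };
}
```

Three vertices go in and twelve come out (four triangles), which is exactly the kind of growth a vertex shader alone can’t give you.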


Geometry Shader Stage

Also fairly new, introduced in DirectX 10. This stage does operations on primitives, which usually means triangles (lines and points are possible too). It takes as input all the vertices that build the primitive (and possibly additional vertices indicating the area around it), and can output a variable number of vertices, including zero, up to a maximum the shader declares. In the sense that it operates on sets of vertices and outputs not-necessarily-the-same-amount of vertices, it’s similar to the hull shader / tessellator / domain shader. However, the geometry shader is different, because it can output fewer vertices than it received, and it allows you to create vertices anywhere, whereas tessellation can only create vertices along the path shaped by other vertices. Before DirectX 11, tessellation was done in the geometry shader, but because it was such a common use case, DX11 moved tessellation into its own special-purpose (and significantly faster) pipeline stages. For basic projects, it’s common to not use this at all.


Stream-Output Stage

After running the geometry shader, you have the full set of vertices you want to operate on. The stream-output stage allows you to write those vertices out to a buffer, which can be fed back into the input-assembler stage for a second pass or copied out to the CPU for further processing. This stage is optional, and will not be used if you only need one pass to generate your vertices and don’t need the CPU to know what those vertices are (which, again, is probably the case for basic projects).


Rasterizer Stage

The GPU outputs a 1920×1080 (or whatever size) RGB image, but right now it only has a bunch of 3D vertex data. The rasterizer bridges that gap. It takes as input the triangle positions in the scene, already transformed by the earlier stages to match where the camera is positioned and where it’s looking, and the size of the image to output. It then determines which triangles the camera would “see”, and which pixels they take up. This sounds easy, but is actually hard.
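Here’s a toy version of the “which pixels does this triangle take up” part, using the classic edge-function test on pixel centers. It’s a sketch under heavy assumptions: 2D only, coordinates running from -1 to +1 across the image, no clipping, no depth, and none of the tie-breaking rules real hardware uses for pixels that sit exactly on an edge. Vec2, EdgeFunction, Covers and CountCoveredPixels are hypothetical names for this illustration.

```cpp
struct Vec2 { float x, y; };

// Twice the signed area of triangle (a, b, p). The sign says which side
// of the edge from a to b the point p falls on.
float EdgeFunction( const Vec2& a, const Vec2& b, const Vec2& p )
{
   return ( b.x - a.x ) * ( p.y - a.y ) - ( b.y - a.y ) * ( p.x - a.x );
}

// True if p is inside (or exactly on the border of) triangle abc.
// Works for both clockwise and counter-clockwise triangles.
bool Covers( const Vec2& a, const Vec2& b, const Vec2& c, const Vec2& p )
{
   const float e0 = EdgeFunction( a, b, p );
   const float e1 = EdgeFunction( b, c, p );
   const float e2 = EdgeFunction( c, a, p );
   const bool allNonNegative = e0 >= 0.0f && e1 >= 0.0f && e2 >= 0.0f;
   const bool allNonPositive = e0 <= 0.0f && e1 <= 0.0f && e2 <= 0.0f;
   return allNonNegative || allNonPositive;
}

// Counts the pixels whose centers fall inside the triangle, for an image
// of width x height pixels spanning -1..+1 on both axes.
int CountCoveredPixels( const Vec2& a, const Vec2& b, const Vec2& c, int width, int height )
{
   int covered = 0;
   for( int row = 0; row < height; ++row )
   {
      for( int col = 0; col < width; ++col )
      {
         const Vec2 center = { -1.0f + ( col + 0.5f ) * 2.0f / width,
                                1.0f - ( row + 0.5f ) * 2.0f / height };
         if( Covers( a, b, c, center ) ) ++covered;
      }
   }
   return covered;
}
```

Testing every pixel against every triangle like this is hopelessly slow for a real scene, which is a big part of why the real thing is hard.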


Pixel Shader Stage

This stage works on each individual pixel of your image, and does things like texturing and lighting. Essentially, it’s where everything is made to look pretty. It receives input data about the vertices that compose the primitive that this pixel “sees”, interpolated to match the position of this pixel on the primitive itself. It then performs operations (such as “look up this corresponding pixel in a texture” or “light this pixel as though it were 38 degrees tilted and 3 meters away from an orange light”), and outputs per-pixel data, most notably the color of the pixel. Arguably, this is the most important stage of the entire DirectX pipeline, because this is where most of an image’s prettiness comes from.


Output-Merger Stage

Although it seems like you’re done at this point, you can’t just take the pixel shader output and arrange it all to make an image. It’s possible for the pixel shader to compute multiple output colors for a single pixel: for instance, if a pixel “sees” an opaque object and a semi-transparent object in front of it, or if the pixel shader was given a far-away object to compute colors for but it later turned out that pixel’s “sight” was blocked by a closer object. This is a problem, since our final rendered frame can only contain one color per pixel. That’s where the output-merger stage comes in. You tell it how to handle differences in depth and alpha values (as well as stencil values, which you can set for fancy rendering tricks) between two outputs of the same pixel. It follows those rules and creates the one final image to draw to screen.
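One possible set of rules, sketched on the CPU: keep a stored color and depth per pixel, throw away any new output that is farther away than what’s already stored, and alpha-blend anything closer on top. In real DirectX 11 you configure this behavior through state objects rather than writing it yourself. Color, PixelState, BlendOver and MergeFragment are hypothetical names, and the “closer wins, blend over” rule is just one common choice picked for this illustration.

```cpp
struct Color { float r, g, b, a; };

// What the render target remembers for one pixel.
struct PixelState
{
   Color color;
   float depth;  // smaller means closer to the camera
};

// Classic "source over" blend: the new color covers the old one in
// proportion to the new color's alpha.
Color BlendOver( const Color& src, const Color& dst )
{
   const float inv = 1.0f - src.a;
   return { src.r * src.a + dst.r * inv,
            src.g * src.a + dst.g * inv,
            src.b * src.a + dst.b * inv,
            src.a + dst.a * inv };
}

// Applies one pixel shader output to the stored pixel.
void MergeFragment( PixelState& pixel, const Color& fragColor, float fragDepth )
{
   if( fragDepth >= pixel.depth )
   {
      return;  // hidden behind what is already stored, so it is dropped
   }
   pixel.color = BlendOver( fragColor, pixel.color );
   pixel.depth = fragDepth;
}
```

Start a pixel at opaque black with depth 1.0, merge an opaque red output at depth 0.5, then a blue one at depth 0.8, and the blue is rejected because red is closer. A half-transparent green at depth 0.2 then blends over the red instead of replacing it.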

And there you go! It’s a lot, but these are all the steps the GPU takes to go from raw data to an output image. There is no magic, just these steps.