C++11 and “moving” data

Alright, time to get dirty in C++ land. We’re gonna talk about move semantics, one of the big new features in C++11, and about r-value references, the language feature that makes them possible. Why do they matter? How do you use them?

Backstory: Copying vs Moving

Look at this code:

std::vector<MyClass> myCollection;
MyClass myItem;
myCollection.push_back(myItem);

When you run this code, you end up with two instances of MyClass that look the same: myItem, and myCollection[0], which is a copy of whatever myItem looked like at push_back() time. Because you could play with myItem after the push_back() call, this duplication is reasonable and cool!
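Here’s a tiny sketch of that duplication, if you want to see it for yourself. Item is a hypothetical stand-in for MyClass with a single field, and the function name is made up for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for MyClass: one field so the copy is observable.
struct Item
{
   int value = 0;
};

// Pushes an Item, edits the original afterwards, and returns what the
// vector still holds.
int StoredValueAfterEditingOriginal()
{
   std::vector<Item> collection;
   Item item;
   item.value = 1;
   collection.push_back(item); // copies item into the vector

   item.value = 2;             // only the original changes
   assert(collection.size() == 1);
   return collection[0].value; // the copy keeps the old value
}
```

Calling StoredValueAfterEditingOriginal() gives back 1, not 2: the vector owns its own copy.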

But what if you had…


MyClass GimmeInstance()
{
   MyClass ret;
   //do important stuff to ret
   return ret;
}

myCollection.push_back( GimmeInstance() );

Things get more interesting here. The MyClass returned from GimmeInstance() is a temporary: it’s gonna go away at the end of the push_back() line, and there’s no way we’re playing with it after that. However, before C++11 we’d still copy it into myCollection. For a brief moment at the end of push_back(), the value at myCollection[1] and the value returned from GimmeInstance() would be two separate copies of each other.

Well, that seems stupid. Why copy a variable we know is going away? Why not just… move it?

With C++11, we can — with some pretty important caveats.

Okay, what are the caveats about “moving” an object?

There will always be one memory location that holds the return value from GimmeInstance(), and another memory location that holds myCollection[1]. Regardless of anything C++11 adds, at some point our program will have two MyClass instances in memory. We will always be looking at one memory location and writing to another.

Umm, that sounds like “copying”, not “moving”

Yeah, it does. Anyhow, because we can’t get around having two different memory locations, move constructors are pointless for plain old data classes. If your class is simple enough that you can copy it via memcpy() without any fears, move constructors offer you nothing.
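To make that concrete, here’s a minimal sketch with a hypothetical Point struct (not from the code above). It uses std::move(), which comes up near the end of this post, to force a “move”. For a plain old data type like this one, the move is just a member-by-member copy, so the source looks exactly the same afterwards.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical plain old data type, used only for this illustration.
struct Point
{
   int x;
   int y;
};

static_assert(std::is_trivially_copyable<Point>::value,
              "Point should be plain old data");

// "Moves" a Point and reports whether the source still holds its values.
bool SourceSurvivesMove()
{
   Point source{3, 4};
   Point dest = std::move(source); // implicit move constructor: copies x and y

   assert(dest.x == 3 && dest.y == 4);
   return source.x == 3 && source.y == 4;
}
```

SourceSurvivesMove() returns true: nothing got stolen, because there was nothing worth stealing.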

But let’s say this MyClass you’re dealing with isn’t plain old data. Maybe your class implements RAII, so constructing and destructing instances of MyClass is expensive or scary.

Move semantics allow you to play fast-and-loose with copy construction of your new object, and with destruction of your temporary object. You know the object you’re copying from is going away soon — so you can swap data out of it, steal its open handles instead of creating your own, and in general skip a bunch of steps. That is why move semantics are important enough to exist.

So how do I move instead of copy?

Alright! You move by using r-value references, the new language feature from C++11. R-values are, roughly, our temporary values. The name comes from the fact that r-values traditionally show up on the right side of an assignment: they exist only so something else can be set equal to them.

The return value of GimmeInstance() in myCollection.push_back(GimmeInstance()), or a*b in int c = a*b: those are r-values. You can’t take the address of an r-value, and a temporary like these is destroyed at the end of the full expression that created it (in practice, usually the end of the line of code).

Anyhow, a function can ask for r-values by using the token && in its parameter list. This passes the r-value by reference. It behaves like the & token, except it only binds to r-values.
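A quick sketch of how the compiler picks between & and && might look like this. Widget, MakeWidget() and the Describe() functions are hypothetical names invented for the example. One gotcha it also shows: once an r-value reference has a name (like a function parameter), using that name gives you an l-value again.

```cpp
#include <cassert>
#include <string>

// Hypothetical type and factory, used only for this illustration.
struct Widget {};
Widget MakeWidget() { return Widget(); }

// Overload chosen for named objects (l-values).
std::string Describe(const Widget&) { return "l-value"; }

// Overload chosen for temporaries (r-values).
std::string Describe(Widget&&) { return "r-value"; }

// w binds to an r-value, but inside the function w has a name,
// so passing it along selects the l-value overload.
std::string DescribeNamedParameter(Widget&& w) { return Describe(w); }

void RunOverloadDemo()
{
   Widget named;
   assert(Describe(named) == "l-value");
   assert(Describe(MakeWidget()) == "r-value");
   assert(DescribeNamedParameter(MakeWidget()) == "l-value");
}
```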


class MyClass
{
public:
   int m_fileHandle;

   MyClass()
      : m_fileHandle(-1)
   {
      //Create the handle (making it != -1); this is expensive
   }

   ~MyClass()
   {
      if (m_fileHandle != -1)
      {
         //Clean up the handle; this is expensive
      }
   }

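   //Move construct from an r-value reference!
   //noexcept lets std::vector use this when it has to grow.
   //Note: declaring this suppresses the implicit copy constructor,
   //so write your own if MyClass should stay copyable.
   MyClass(MyClass&& other) noexcept
      : m_fileHandle(other.m_fileHandle) //burn and pillage the temporary; steal its handle
   {
      //Mark the temporary as empty so its destructor skips the cleanup
      other.m_fileHandle = -1;
   }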
};

To return to the myCollection.push_back( GimmeInstance() ); call from the beginning of this post: the MyClass that gets added to the end of myCollection will be in a different memory location from the MyClass returned from GimmeInstance(), but we’ll fill it in through MyClass(MyClass&& other), which is called a move constructor.

Because of this, we save time in constructing myCollection[1] (because we don’t have to open a new handle), and we save time in destructing the r-value temporary returned from GimmeInstance() (because we don’t have to close its handle). That’s a pretty sweet deal.
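If you want to check those savings, here’s a rough instrumented variant of the class. The global counters, the fake handle value 42, and RunMoveDemo() are all assumptions made up for this sketch; a real class would open and close an actual handle where the counters are bumped.

```cpp
#include <cassert>
#include <vector>

// Illustration-only counters standing in for the expensive work.
int g_handlesOpened = 0;
int g_handlesClosed = 0;
int g_moves = 0;

class MyClass
{
public:
   int m_fileHandle;

   MyClass()
      : m_fileHandle(42) // pretend we opened handle 42
   {
      ++g_handlesOpened;
   }

   ~MyClass()
   {
      if (m_fileHandle != -1)
      {
         ++g_handlesClosed; // pretend we closed the handle
      }
   }

   MyClass(MyClass&& other) noexcept
      : m_fileHandle(other.m_fileHandle) // steal the handle
   {
      other.m_fileHandle = -1; // the temporary no longer owns anything
      ++g_moves;
   }
};

MyClass GimmeInstance()
{
   MyClass ret;
   return ret;
}

void RunMoveDemo()
{
   g_handlesOpened = g_handlesClosed = g_moves = 0;
   {
      std::vector<MyClass> myCollection;
      myCollection.push_back(GimmeInstance());

      assert(g_handlesOpened == 1); // only one handle was ever created
      assert(g_handlesClosed == 0); // the dead temporary closed nothing
      assert(g_moves >= 1);         // exact count depends on copy elision
      assert(myCollection[0].m_fileHandle == 42);
   }
   assert(g_handlesClosed == 1);    // closed once, when the vector went away
}
```

One handle opened, one handle closed, even though more than one MyClass object existed along the way.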

Some extra points:

  • Constructors are the most obvious scenario for move semantics, but any function can take a && parameter.
  • You can use std::move() to treat a named variable like an r-value. std::move() doesn’t move anything by itself; it’s just a cast that lets the && overload get picked. This is great if you know you won’t need a named variable after a certain point, but just be careful: don’t rely on the variable’s contents afterwards!
  • Remember the Rule of Three, which says that if you define the destructor, you also wanna define the copy constructor and copy assignment operator? With move semantics, it becomes the Rule of Five. If you’re defining any of those three, define the move constructor and move assignment operator too.
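To tie the last two points together, here’s one way a Rule of Five class could look, with std::move() used on a named variable. Buffer is a hypothetical class that owns a single heap-allocated int; it’s a sketch for illustration, not a recommendation for how to manage memory (std::unique_ptr would normally do this job for you).

```cpp
#include <cassert>
#include <utility>

// Hypothetical owner of one heap-allocated int, for illustration only.
class Buffer
{
public:
   explicit Buffer(int value) : m_data(new int(value)) {}

   ~Buffer() { delete m_data; }

   // Copy constructor: deep copy, the source keeps its data.
   Buffer(const Buffer& other)
      : m_data(other.m_data ? new int(*other.m_data) : nullptr) {}

   // Copy assignment: allocate first, then release the old data.
   Buffer& operator=(const Buffer& other)
   {
      if (this != &other)
      {
         int* copy = other.m_data ? new int(*other.m_data) : nullptr;
         delete m_data;
         m_data = copy;
      }
      return *this;
   }

   // Move constructor: steal the pointer, leave the source empty.
   Buffer(Buffer&& other) noexcept : m_data(other.m_data)
   {
      other.m_data = nullptr;
   }

   // Move assignment: release our data, then steal the source's.
   Buffer& operator=(Buffer&& other) noexcept
   {
      if (this != &other)
      {
         delete m_data;
         m_data = other.m_data;
         other.m_data = nullptr;
      }
      return *this;
   }

   bool IsEmpty() const { return m_data == nullptr; }
   int Value() const { return *m_data; } // only valid when not empty

private:
   int* m_data;
};

bool MoveLeavesSourceEmpty()
{
   Buffer a(7);
   Buffer b(std::move(a)); // a has a name, so std::move() is needed here
   assert(b.Value() == 7);
   return a.IsEmpty();
}

bool CopyLeavesSourceIntact()
{
   Buffer a(7);
   Buffer b(a); // plain copy: both objects own their own int
   assert(b.Value() == 7);
   return !a.IsEmpty() && a.Value() == 7;
}
```

Both functions return true. Note that after std::move(a), the only things this sketch does with a are check IsEmpty() and let it be destroyed, which is about as much as you should count on from a moved-from object.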

And that’s what’s up with move semantics! Good luck!
