Compiler-Generated Functions in C++

Let’s take a simple class:

class MyClass
{
public:
    int x;
};

How many functions does MyClass have? None?

Oh, if only.

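MyClass has four functions. The compiler generates these four default functions automatically. (C++11 and later can add two more, a move constructor and a move assignment operator, which this post doesn’t cover.) Here are the functions and their behaviors: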

  • CONSTRUCTOR: MyClass(). It’s called when you create an instance of your class. It calls the constructors of any base classes and member variables. It might do other things, too — more on that below.
  • DESTRUCTOR: ~MyClass(). It’s called when you delete an instance of your class (or when it goes out of scope and gets cleaned up). It calls the destructors of any member variables and base classes, and that’s it.
  • COPY CONSTRUCTOR: MyClass(const MyClass& other). It’s called when you create an instance of a class from another existing instance of that class (MyClass mc1; mc1.x = 1337; MyClass mc2(mc1);). It calls the copy constructors of any base classes and member variables.
  • COPY ASSIGNMENT: MyClass& operator=(const MyClass& other). It’s called when you set an already-created instance to equal another already-created instance (MyClass mc1; mc1.x = 1337; MyClass mc2; mc2 = mc1;). It calls the copy assignment-ers of any base classes and member variables.
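Here’s a minimal sketch of the four generated functions in use. It assumes a C++11-or-later compiler (for the <type_traits> checks), and the helper name exerciseGenerated is made up for illustration; MyClass is the class from the top of the post.

```cpp
#include <cassert>
#include <type_traits>

class MyClass
{
public:
    int x;
};

// MyClass declares no functions, yet all four operations are available.
static_assert(std::is_default_constructible<MyClass>::value, "constructor");
static_assert(std::is_destructible<MyClass>::value, "destructor");
static_assert(std::is_copy_constructible<MyClass>::value, "copy constructor");
static_assert(std::is_copy_assignable<MyClass>::value, "copy assignment");

// Hypothetical helper: touches each generated function once.
int exerciseGenerated()
{
    MyClass mc1;        // compiler-generated constructor (x is not initialized)
    mc1.x = 1337;
    MyClass mc2(mc1);   // compiler-generated copy constructor
    MyClass mc3;
    mc3 = mc1;          // compiler-generated copy assignment
    assert(mc2.x == 1337);
    return mc3.x;       // destructors for mc1, mc2, mc3 run on return
}
```

The static_assert lines are checked at compile time, so the file simply won’t build if any of the four functions were missing.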

Intrinsic types (int, float, bool, pointers to anything, etc.) behave as if they have constructors that do nothing (not even initialize to zero), destructors that do nothing, and copy constructors and copy assignment-ers that blindly copy the bytes over.

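Your user-defined classes can replace any of these compiler-generated functions with your own by declaring the corresponding function in your class definition. This post calls that overriding. (Strictly speaking, “overriding” in C++ means redefining a virtual function in a derived class, and the standard calls these replacements user-declared functions, but the shorthand is convenient.)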

Between the compiler-generated default functions and any possible overrides of them, there are a lot of edge cases to understand.

Compiler-generated constructors can have multiple behaviors.

If MyClass has a user-defined constructor, then MyClass item1; and MyClass item2 = MyClass(); will both call your user-defined MyClass() — there’s only one behavior.

However, if MyClass is relying on the compiler-generated constructor, MyClass item1; performs default initialization, while MyClass item2 = MyClass(); performs value initialization.

Default initialization calls the default constructors of any base classes and member variables, and nothing else. Since constructors for intrinsic types don’t do anything, all your intrinsic member variables are left with garbage data (in practice, whatever was last stored at those addresses), and reading them before you assign to them is undefined behavior.

Value initialization also calls the constructors of any base classes. Then, one of two things happens:

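  • If MyClass is a “plain old data” class, meaning all its member variables are either intrinsic types or classes that only contain intrinsic types and no user-defined constructor/destructor, it initializes everything to 0.
  • If MyClass is too complicated to qualify as “plain old data” (say, it has a member variable with its own user-defined constructor), the standard since C++03 still zeroes the intrinsic members and then runs the member constructors. Some older compilers skipped the zeroing in this case, so don’t count on it if your code has to build on legacy toolchains, and note that a member whose own constructor leaves fields unset still holds garbage data.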
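The value-initialization case for a simple class can be sketched as follows. This is an illustration under a few assumptions: a C++11-or-later compiler (for alignas), and two helper names, valueInitOverDirtyMemory and valueInitLocal, invented for this example. It deliberately never reads a default-initialized int, since that would be undefined behavior.

```cpp
#include <cassert>
#include <cstring>
#include <new>

class MyClass
{
public:
    int x;
};

// Hypothetical helper: value-initializes a MyClass on top of memory that
// was deliberately filled with non-zero bytes, and returns the resulting x.
int valueInitOverDirtyMemory()
{
    alignas(MyClass) unsigned char storage[sizeof(MyClass)];
    std::memset(storage, 0xFF, sizeof(storage));

    // Parentheses request value initialization, which zeroes the object.
    // Writing "new (storage) MyClass;" instead would be default
    // initialization and would leave x indeterminate.
    MyClass* item = new (storage) MyClass();
    assert(item->x == 0);
    return item->x;
}

// Same idea with an ordinary local variable.
int valueInitLocal()
{
    MyClass item2 = MyClass();  // value initialization
    return item2.x;
}
```

Filling the buffer with 0xFF first is just a way to show that the zero comes from the initialization itself and not from memory that happened to be clean.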

To phrase it another way:

  • If you have a simple class (it’s “plain old data”), and you create it in a simple way (MyClass item1;), C++ performs simple behavior: it creates the class but doesn’t initialize the memory to anything, so it’s all garbage data.
  • If you have a simple class, and you create it in a complex way (MyClass item2 = MyClass();), C++ performs complex behavior (creates the class and initializes memory to zero).
  • If you have a complex class (it’s not just “plain old data”) that still relies on the compiler-generated constructor, MyClass item1; leaves its intrinsic members as garbage. MyClass item2 = MyClass(); zeroes them on a standards-conforming compiler (C++03 or later), but older compilers were unreliable here, so the safe habit is to override the constructor and initialize the members yourself.
If you define a constructor that receives parameters, or a copy constructor, then the compiler won’t generate a default constructor.

    If your class has parameter-receiving constructors like MyClass(int i), or a user-defined copy constructor, that indicates you’re doing trickery at construction (because otherwise, you’d be fine with the compiler-generated default behavior). Therefore, the compiler won’t generate a default MyClass(), in order to guarantee there are no code paths that don’t apply your trickery. Note that if you define a constructor, the compiler will still generate a default copy constructor.
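A small sketch of that rule, assuming C++11 or later for the <type_traits> checks. The class WithParam and the helper copyOfWithParam are made-up names for illustration.

```cpp
#include <cassert>
#include <type_traits>

class WithParam
{
public:
    explicit WithParam(int i) : x(i) {}  // a user-defined constructor...
    int x;
};

// ...suppresses the compiler-generated default constructor,
static_assert(!std::is_default_constructible<WithParam>::value,
              "no default constructor is generated");
// but the compiler-generated copy constructor is still there.
static_assert(std::is_copy_constructible<WithParam>::value,
              "copy constructor is still generated");

// Hypothetical helper: copies a WithParam using the generated copy constructor.
int copyOfWithParam()
{
    WithParam a(42);
    WithParam b(a);      // compiler-generated copy constructor
    // WithParam c;      // would not compile: no default constructor
    assert(b.x == 42);
    return b.x;
}
```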

If you’re going to subclass, override the base class’s destructor and make it virtual.

    Imagine a game architecture where all renderable objects extend from an IRenderable base class that contains a virtual void Render() function. A derived class such as PlayerCharacter will inherit from IRenderable and override Render() with its render code. The game just keeps a list of IRenderable* pointers (one of which will be a pointer to our PlayerCharacter object, typecast as an IRenderable*), and it burns through the list calling Render() on every entry in order to draw the scene.

    Now, when the game is done, it will delete every item in its IRenderable* list, thus calling their destructors. However, because the compiler-generated ~IRenderable() isn’t virtual, the compiler only knows it’s deleting an IRenderable. Formally, deleting a derived object this way is undefined behavior; in practice it typically runs only ~IRenderable() and the destructors of IRenderable’s members and base classes. The derived class, PlayerCharacter, will never get its destructor called, since we only deleted an IRenderable and didn’t know we were looking at a PlayerCharacter.

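    The effect looks a lot like slicing (strictly, that term describes copying a derived object into a base-class variable by value): we don’t destroy the full PlayerCharacter class, we destroy a slice of it (the parts that are contained in its IRenderable base class).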

    To solve this problem, declare a virtual ~IRenderable() destructor. This will override the compiler-generated default ~IRenderable(). Then, when you delete an IRenderable*, by the magic of virtual functions, you’ll actually call the destructor of the PlayerCharacter subclass. Thus, you get to delete all the data held by PlayerCharacter, not just the slice of it contained within IRenderable.
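The fix can be sketched like this, assuming C++11 or later for the range-based for loops. IRenderable, PlayerCharacter, and Render() come from the description above; the counter g_playerDestroyed and the function destroyScene are hypothetical, added only so the sketch can report what happened.

```cpp
#include <cassert>
#include <vector>

// Counts PlayerCharacter destructor calls; exists only for this sketch.
static int g_playerDestroyed = 0;

class IRenderable
{
public:
    virtual ~IRenderable() {}   // virtual: delete through a base pointer is safe
    virtual void Render() = 0;
};

class PlayerCharacter : public IRenderable
{
public:
    ~PlayerCharacter() { ++g_playerDestroyed; }
    void Render() {}            // real drawing code would go here
};

// Hypothetical stand-in for the game's draw-then-shutdown path.
// Returns how many PlayerCharacter destructors ran during this call.
int destroyScene()
{
    const int before = g_playerDestroyed;

    std::vector<IRenderable*> scene;
    scene.push_back(new PlayerCharacter());
    assert(scene.size() == 1);

    for (IRenderable* item : scene) {
        item->Render();
    }
    for (IRenderable* item : scene) {
        delete item;            // runs ~PlayerCharacter(), then ~IRenderable()
    }
    return g_playerDestroyed - before;
}
```

Removing the virtual keyword from ~IRenderable() makes the delete undefined behavior, so this sketch doesn’t try to demonstrate the broken version.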

If you need to override any one of the compiler-generated destructor, copy constructor, or copy assignment-er, then you should override all three.

    This is known as the “Rule of Three”: if the default compiler-generated behavior for one of these three behaviors isn’t good enough, the default behavior for the other two probably isn’t good either.

    Most likely, if you’re overriding one of these three functions, you’re overriding the destructor to delete some pointers that you new’ed at some point in your object’s lifetime. I’ve already written a whole lot about why compiler-generated copy constructors are bad if you have pointers that get deleted in your destructor. In summary: a copy of a class will point to the exact same data, and when the copy dies, the data dies with it, even if the copied class still needs it.

    So, it’s a pretty firm rule that you should override the copy constructor and copy assignment-er if you override the destructor. However, it’s not necessarily true that if you override one of the copy constructor or copy assignment-er, you need to override the other two. Having only one of them overridden is a “bad smell” — it indicates that you should carefully check your logic. Just make sure that your logic is correct, and you’ll be fine.
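To make the Rule of Three concrete, here’s a minimal sketch of a class that owns heap memory and therefore overrides all three functions. IntBuffer and copiesAreIndependent are hypothetical names invented for this example, and this is one reasonable way to write the copy operations, not the only one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical class that owns a heap-allocated array of ints.
class IntBuffer
{
public:
    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // 1. Destructor: releases the array this object owns.
    ~IntBuffer() { delete[] data_; }

    // 2. Copy constructor: allocates a separate array and copies the contents.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_])
    {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    // 3. Copy assignment: builds the new array first, then frees the old one.
    IntBuffer& operator=(const IntBuffer& other)
    {
        if (this != &other) {
            int* fresh = new int[other.size_];
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    int& at(std::size_t i)
    {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    int* data_;
};

// Hypothetical check: changing the original must not affect its copies.
bool copiesAreIndependent()
{
    IntBuffer a(4);
    a.at(0) = 7;
    IntBuffer b(a);   // copy constructor
    IntBuffer c(1);
    c = a;            // copy assignment
    a.at(0) = 99;
    return b.at(0) == 7 && c.at(0) == 7;
}
```

With only the destructor overridden, b and c would share a’s pointer, and the same array would be deleted three times when they go out of scope.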

    IN CONCLUSION: That’s 1000 words on behaviors that the compiler gives you for free. C++ is scary, man.
