Modern C++ Memory Management With unique_ptr

2016-02-06

C++

This post is going to be a general background article on unique_ptr and how/why you should use it (if you are not already). In my current line of work I still deal with a large C++03 codebase, but with efforts ongoing to pull C++11 into the application I have spent a great deal of time thinking about how we can make the most out of C++11. One of the biggest wins is smart pointers and manual memory management, which are usable in C++03 with boost, but are so much more powerful in C++11 thanks to move semantics. I will focus on unique_ptr in this post as a start, but that doesn’t mean you shouldn’t also use the other smart pointer types included in C++11, shared_ptr and weak_ptr, when appropriate.

Anatomy Of A `unique_ptr`

The unique_ptr introduced with C++11 (based on boost::scoped_ptr) is conceptually very simple. It wraps a raw pointer to a heap-allocated object with RAII semantics, destroying the associated object with the unique_ptr (i.e. at the end of whatever scope it is declared in). Because object destruction is triggered by RAII, there is no runtime or memory overhead for unique_ptr compared to a raw pointer. There are 3 ways to bind an object instance to a unique_ptr:

The constructor that takes a raw T*.

1	std::unique_ptr<MyClass> instance(new MyClass());

With make_unique (the return value of which can be move-assigned).

1	auto instance = std::make_unique<MyClass>();

By using the reset member function.

1 2	std::unique_ptr<MyClass> instance; instance.reset(new MyClass());

~~There is only one way to destroy the object associated with the unique_ptr early, before the unique_ptr itself is destroyed.~~[1] Destroying the contained object early (before the unique_ptr itself is destroyed) can be triggered by calling the reset member function, which can also optionally take another raw pointer that the unique_ptr should subsequently take ownership of. However, before taking ownership of the new pointer it will always ensure the previous object is deleted first. If no new pointer is passed to reset, the unique_ptr will hold nullptr after the current object is deleted. The same logic applies to assignments; a unique_ptr can be move-assigned an object from another unique_ptr, or assigned nullptr. In both cases any object already held by the unique_ptr will be deleted before accepting the new value.

Using smart pointers is semantically the same as with raw pointers. The * and -> operators have both been overloaded to provide familiar mechanics for accessing the underlying object:

1
2
3

std::unique_ptr<MyClass> instance(new MyClass());
std::cout << instance->v << std::endl;
int y = (*instance).calculate();

In fact modern smart pointers in C++ go one further by defining an implicit bool conversion; so the == nullptr, == 0, == NULL, or whatever null-pointer constant your organization has chosen, can be excluded.

std::unique_ptr<MyClass> instance(new MyClass());
std::unique_ptr<MyClass> noinstance;
assert(instance); // `instance` is initialized and valid
assert(!noinstance); // `noinstance` is uninitialised and invalid

Block-Scoped Object Lifetimes

The time/space within the program between the creation and deletion of an object is the object’s lifetime: the part of the program for which the object is valid. With smart pointers, this is dictated by the owning smart pointer(s) controlling the object’s lifetime.

Consider the following C++03-ish example:

{
    MyClass* instance(new MyClass());
    // ... more code ...
    delete instance;
}

With C++11 and unique_ptr this becomes:

{
    std::unique_ptr<MyClass> instance(new MyClass());
    // ... more code ...
} // `instance` deallocated here automatically

Or better yet, using make_unique from C++14:

{
    auto instance = std::make_unique<MyClass>();
    // ... more code ...
} // `instance` deallocated here automatically

This concept can be extended with class scopes to yield a wider range of object lifetimes we can express using unique_ptr. For example, the traditional PIMPL pattern (simplified for this example):

class Outer {
public:
    Outer() : impl(new Inner()) {};
    ~Outer() { delete impl; }
    
private:
    Inner* impl;
}

With modern C++ this can be refactored to:

class Outer {
public:
    Outer() : impl(std::make_unique<Inner>()) {};
    
private:
    std::unique_ptr<Inner> impl;
}

Note we don’t even need to define a destructor anymore, because our impl object is automatically destroyed when the Outer object is destroyed.

Why Smart Pointers Instead Of Manual Memory Management?

I have had people ask this question when I have suggested using smart pointers instead of traditional manual memory management (using delete), because their existing code works perfectly fine and causes no leaks. The most obvious benefit of using smart pointers is avoiding memory leaks (shared_ptr reference cycles being an exception). But how does using smart pointers avoid leaks?

When manual memory management is done correctly it does indeed work, but it becomes more brittle over time as code is refactored and extended. As a simplified example, say we had some code like this hiding somewhere in our application:

{
    MyClass* instance = new MyClass();
    // ... more code ...
    delete instance;
}

Then, while adding a new feature, someone modifies the code so there are multiple places where a pointer is initialized with an instance of different types (using runtime polymorphism). In reality a mistake like this is much less obvious than in this example. This example intentionally shows very poor design to make the flaw more obvious.

{
    MyClass* instance = 0;
    if (someCondition) {
        instance = new SubclassA();
    }
    // ... more code ...
    if (someOtherCondition) {
        instance = new SubclassB();
    }
    // ... more code ...
    delete instance;
}

Whoops, we may have just caused a memory leak: if both conditions are true the first object assigned to instance is never deleted. We could fix this up by calling delete to dispose of the first object before we create the second if the instance pointer is not zero.

{
    MyClass* instance = 0;
    if (someCondition)
        instance = new SubclassA();
    // ... more code ...
    if (someOtherCondition) {
        if (instance != nullptr)
            delete instance;
        instance = new SubclassB();
    }
    // ... more code ...
    delete instance;
}

This sort of breakage can be very common when modifying code that uses traditional manual memory management techniques comprised of delete calls scattered throughout code. This error could have just as well been a double free due to over-application of delete. It could have also been a use after free error where the pointer is not reset to nullptr after deletion, and our program continues to use it unaware until suddenly Cthulhu starts wreaking havoc on our application. The more resilient modern technique to eliminate all of these simple errors is to apply smart pointers in these scenarios instead wherever possible. For example, if we rewrite the above original example using unique_ptr:

{
    auto instance = std::make_unique<MyClass>();
    // ... more code ...
} // `instance` automatically deleted here

With the same naive refactoring applied this would become:

{
    std::unique_ptr<MyClass> instance;
    if (someCondition) {
        instance.reset(new SubclassA()); // no delete here, `instance` starts out as `nullptr` automatically
    }
    // ... more code ...
    if (someOtherCondition) {
        instance.reset(new SubclassB()); // any previous object associated with `instance` delete here
    }
    // ... more code ...
} // `instance` automatically deleted here

And it Just Works™. No pitfalls to be seen here; the API of smart pointers does not allow an already held pointer to be overwritten without first triggering a release of the object associated with that pointer. It’s extremely difficult to screw up the usage of unique_ptr in these types of scenarios.

Raw Pointers/References And `unique_ptr`

So with the advent of smart pointers in modern C++ are we supposed to completely throw away raw pointers and references? Of course not!

Raw pointers and references still serve a purpose for referencing/accessing objects without affecting their lifetime. The usual rules apply: for raw pointers and references to be valid, the lifetime of the raw pointers/references must be a subset of the lifetime of the object being referred to. It is easier to align the lifetimes of both smart and raw pointers if RAII and scope are used to manage both.

Consider the following example:

{
    // Scope 1
    auto instance = std::make_unique<MyClass>();
    {
        // Scope 2
        MyClass* ptrA = instance.get();
        // ... some usage of ptrA ...
    }
    // ... more code ...
    {
        // Scope 3
        MyClass* ptrB = instance.get();
        /// ... some usage of ptrB
    }
}

The lifetimes in this example could be represented by the following Venn diagram:

The scopes of the raw pointers both fall completely within _Scope 1_

Whereas if we got the lifetimes wrong and did something like this:

{
    // Scope 2
    MyClass* ptrA;
    {
        // Scope 1
        auto instance = std::make_unique<MyClass>();
        ptrA = instance.get();
        {
            // Scope 3
            MyClass* ptrB = instance.get();
            /// ... some usage of ptrB
        }
    }
    // ... some usage of ptrA ...
    // oops! ptrA is invalid because `instance` and the associated object no longer exist
}

The associated Venn diagram of the lifetimes would look more like this:
s Scope 2 is outside of instance

This would result in a use-after-free error when we try to access ptrA after the end of Scope 1, and possibly an application crash.

Smart Pointers For A Better Tomorrow

There are a multitude of safety and maintainability advantages when using smart pointers instead of traditional manual memory management with direct calls to delete. Using smart pointers won’t provide any immediate gratification over working delete code, but in the long term they make code much more resilient in the face of maintenance, refactoring and extension.

With the addition of shared_ptr, weak_ptr and move semantics from C++11 there should be few real scenarios which still require manual delete calls. Hopefully as C++ continues to develop and advance this will be ever more true. delete is still useful to have in your C++ development toolbox, but it should be a tool of last resort.

Stay tuned for a follow-up post on shared_ptr and weak_ptr.

Update:
[1] Thanks to /u/immutablestate on reddit for pointing out that assignment operations can also trigger an early release of an object held by a unique_ptr.

Thanks to /u/corysama and /u/malaprop0s for other corrections.

Anatomy Of A unique_ptr

Block-Scoped Object Lifetimes

Why Smart Pointers Instead Of Manual Memory Management?

Raw Pointers/References And unique_ptr

Smart Pointers For A Better Tomorrow

Anatomy Of A `unique_ptr`

Raw Pointers/References And `unique_ptr`