C++11: unique_ptr<T>
There are a lot of great features in C++11, but unique_ptr stands out in the area of code hygiene. Simply put, this is a magic bullet for dynamically created objects. It won’t solve every problem, but it is really, really good at what it does: managing dynamically created objects with simple ownership semantics.
The Basics
The class template unique_ptr<T> manages a pointer to an object of type T. You will usually construct an object of this type by calling new to create an object in the unique_ptr constructor:
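A minimal sketch of that construction, assuming a hypothetical class foo with a single name member (the class and the function name are mine, for illustration only):

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical example type, used only for illustration.
struct foo {
    explicit foo(const std::string& n) : name(n) {}
    std::string name;
};

// Creates a foo with new directly inside the unique_ptr constructor,
// so the raw pointer is never left without an owner.
std::string construct_and_read() {
    std::unique_ptr<foo> p(new foo("useful object"));
    assert(p);             // the unique_ptr holds a non-null pointer
    return (*p).name;      // same object that p->name would reach
}   // p goes out of scope here and deletes the foo
```

Creating the object inside the constructor call means there is never a moment when a bare foo* is floating around unowned.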
After calling the constructor, you can use the object very much like a raw pointer. The * and -> operators work exactly like you would expect, and are very efficient - usually generating nearly the same assembly code as raw pointer access.
As you might expect from a class that wraps raw pointers, the first benefit you will get from using unique_ptr is automatic destruction of the contained object when the pointer goes out of scope. You don’t have to track every possible exit point from a routine to make sure the object is freed properly - it is done automatically. And more importantly, it will be destroyed if your function exits via an exception.
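The exception case is easy to check for yourself. Here is a small sketch, assuming a hypothetical type called tracked that counts its live instances so that destruction becomes observable:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical type that counts live instances.
struct tracked {
    static int live;
    tracked() { ++live; }
    ~tracked() { --live; }
};
int tracked::live = 0;

// Leaves the function via an exception while p still owns an object.
void risky() {
    std::unique_ptr<tracked> p(new tracked);
    throw std::runtime_error("early exit");
    // no delete anywhere: stack unwinding runs p's destructor
}

// Returns the number of live objects once the exception has been caught.
int live_after_exception() {
    try {
        risky();
    } catch (const std::runtime_error&) {
        // p was already destroyed during unwinding
    }
    return tracked::live;
}
```

With a raw pointer in risky(), the same throw would have leaked the object.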
Containers
So far this is nice, but hardly revolutionary. Writing a class that just does what I’ve described is fairly trivial, and you could have done it with the original C++ standard. In fact, the ill-fated (and now deprecated) auto_ptr was just that, a first stab at an RAII pointer wrapper.
Unfortunately, the language hadn’t evolved to the point where auto_ptr could be done properly. As a result, you couldn’t use it for some pretty basic things. For example, you couldn’t store auto_ptr objects in most containers. Kind of a big problem.
C++11 fixed these problems with the addition of rvalue references and move semantics. As a result, unique_ptr objects can be stored in containers, work properly when containers are resized or moved, and will still be destroyed when the container is destroyed. Just like you want.
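To see this in action, here is a sketch that assumes a std::vector as the container and a hypothetical instance-counting type, counted. It pushes enough elements to force the vector to reallocate, then lets the vector go out of scope:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical type that counts live instances.
struct counted {
    static int live;
    counted() { ++live; }
    ~counted() { --live; }
};
int counted::live = 0;

// Fills a vector with owned objects and returns the live count observed
// while the vector still exists.
int fill_and_drop(std::size_t n) {
    int seen = 0;
    {
        std::vector<std::unique_ptr<counted>> v;
        for (std::size_t i = 0; i < n; ++i) {
            // the temporary unique_ptr is an rvalue, so it is moved in
            v.push_back(std::unique_ptr<counted>(new counted));
        }
        seen = counted::live;   // growth moved the pointers, not the objects
    }   // v is destroyed here, and each element deletes its object
    return seen;
}
```

After fill_and_drop returns, counted::live is back to zero without a single explicit delete.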
Uniqueness and Move Semantics
So what exactly is the meaning of the word unique in this context? Mostly just what it says: when you create a unique_ptr object, you are declaring that you are going to have exactly one copy of this pointer. There is never any doubt about who owns it, because you can’t inadvertently make copies of the pointer.
With a classic raw pointer, this kind of code is a bug lying in wait:
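A minimal sketch of the situation, assuming a hypothetical class gadget and a make_use routine whose body the caller normally would not get to see:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical example type.
struct gadget {
    explicit gadget(const std::string& n) : name(n) {}
    std::string name;
};

// Hypothetical routine. Nothing in this signature says whether it
// borrows p, stores a copy of it, or deletes it.
std::size_t make_use(gadget* p) {
    return p->name.size();   // this particular version only borrows
}

std::size_t raw_pointer_caller() {
    gadget* p = new gadget("useful object");
    std::size_t n = make_use(p);
    delete p;   // correct only because this make_use neither deletes nor keeps p
    return n;
}
```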
Here I have allocated an object, and I have a pointer to it. When I call make_use, what happens to that pointer? Does make_use make a copy of it for later use? Does it take ownership of it and delete when done? Does it simply borrow it for a while and then return it to the caller for later destruction?
We can’t really answer any of these questions with confidence, because C++ doesn’t make it easy to have a contract regarding the use of a pointer. You end up relying on code inspection, memory, or documentation. All of these things break regularly.
With unique_ptr, you won’t have these problems. If you want to pass the pointer to another routine, you won’t make a duplicate copy of the pointer that has to be accounted for - the compiler prohibits it.
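You can ask the compiler to confirm this directly. The sketch below uses the standard type traits from <type_traits> on a hypothetical empty type, thing; the helper function is only there to make the result easy to check at run time:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical example type.
struct thing {};

typedef std::unique_ptr<thing> thing_ptr;

// Copying is rejected at compile time...
static_assert(!std::is_copy_constructible<thing_ptr>::value,
              "unique_ptr must not be copy-constructible");
static_assert(!std::is_copy_assignable<thing_ptr>::value,
              "unique_ptr must not be copy-assignable");
// ...while moving is allowed.
static_assert(std::is_move_constructible<thing_ptr>::value,
              "unique_ptr must be move-constructible");

// The same facts as a run-time value.
bool unique_ptr_is_move_only() {
    return !std::is_copy_constructible<thing_ptr>::value
        && std::is_move_constructible<thing_ptr>::value;
}
```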
Who owns the pointer
Let’s take a simple example - I create a pointer and want to store it in a container. As a new user of unique_ptr, I write some pretty straightforward code: I construct a unique_ptr, q, and pass q directly to the container (with push_back, say).
This seems reasonable, but doing this gets me into a gray area: who owns the pointer? Will the container destroy it at some point in its lifetime? Or is it still my job to do so?
The rules of using unique_ptr prohibit this kind of code, and trying to compile it will lead to the classic cascade of template-based compiler errors (the ones that were going to be fixed with concepts, remember?), ending with a complaint that the copy constructor of unique_ptr is deleted or inaccessible.
Anyway, the problem here is that we are only allowed to have one copy of the pointer - unique ownership rules apply. If I want to give the object to another piece of code, I have to invoke move semantics:
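A minimal sketch of that move, assuming a std::vector as the container and a hypothetical type called part. The copying version is left in as a comment because it does not compile:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical example type.
struct part {
    explicit part(int i) : id(i) {}
    int id;
};

// Moves q into the vector and reports whether q is empty afterwards.
bool move_into_container() {
    std::vector<std::unique_ptr<part>> v;
    std::unique_ptr<part> q(new part(42));

    // v.push_back(q);            // does not compile: copying is prohibited
    v.push_back(std::move(q));    // explicit transfer of ownership to v

    assert(v.back()->id == 42);   // the element in v now owns the part
    return q == nullptr;          // q still exists but holds a null pointer
}
```

The call to std::move is the visible marker in the source that ownership changes hands on that line.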
After I move the object into the container, my original unique_ptr, q, has given up ownership of the pointer and it now rests with the container. Although object q still exists, any attempt to dereference it is undefined behavior, which in practice usually means a crash. In fact, after the move operation, the internal pointer owned by q has been set to null.
Move semantics will be used automatically wherever the source is an rvalue, such as a temporary or a local variable being returned. For example, returning a unique_ptr from a function doesn’t require any special code:
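As a sketch, assuming a hypothetical type called token and a factory function I have named create_token:

```cpp
#include <cassert>
#include <memory>

// Hypothetical example type.
struct token {
    explicit token(int v) : value(v) {}
    int value;
};

// Returns a local unique_ptr by value. No std::move is needed:
// a returned local is moved (or the move is elided) automatically.
std::unique_ptr<token> create_token() {
    std::unique_ptr<token> p(new token(7));
    return p;
}

int value_from_factory() {
    std::unique_ptr<token> q = create_token();   // q is now the sole owner
    return q->value;
}
```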
Nor does passing a newly constructed object to a called function:
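A sketch of that case, assuming a hypothetical type called item and a function, consume, that takes its unique_ptr argument by value:

```cpp
#include <cassert>
#include <memory>

// Hypothetical example type.
struct item {
    explicit item(int v) : value(v) {}
    int value;
};

// Takes ownership: the parameter p is the sole owner for the duration
// of the call, and the item is deleted when consume returns.
int consume(std::unique_ptr<item> p) {
    return p->value;
}

int pass_temporary() {
    // The temporary is an rvalue, so it is moved into the parameter
    // without any explicit std::move.
    return consume(std::unique_ptr<item>(new item(5)));
}
```

Taking the parameter by value is how a function signature says "I take ownership of this".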
Legacy code
We all have legacy code to deal with, and even when using unique_ptr, you will find that there are times you just have to pass some function a raw pointer. There are two ways to do this:
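Both can be sketched with the member functions get() and release(). The two legacy routines here, legacy_peek and legacy_consume, are hypothetical stand-ins for code you don't control, and rec is a hypothetical instance-counting type:

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts live instances.
struct rec {
    static int live;
    explicit rec(int v) : value(v) { ++live; }
    ~rec() { --live; }
    int value;
};
int rec::live = 0;

// Hypothetical legacy routine that only borrows the pointer.
int legacy_peek(const rec* p) { return p->value; }

// Hypothetical legacy routine that takes ownership and frees the object.
void legacy_consume(rec* p) { delete p; }

bool hand_to_legacy() {
    std::unique_ptr<rec> q(new rec(3));

    int v = legacy_peek(q.get());     // q still owns the object
    bool still_owned = (q != nullptr);

    legacy_consume(q.release());      // q gives up ownership and becomes null

    return v == 3 && still_owned && !q && rec::live == 0;
}
```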
Calling get() returns a pointer to the underlying object. You really want to avoid calling this if you can, because as soon as you release that raw pointer into the wild, you have lost much of the advantage you achieved by switching to unique_ptr. With careful code inspection you can probably convince yourself that the pointer is indeed only being used ephemerally, and it will disappear once the called routine is done with it.
Extracting the pointer with release() is a more realistic way to do it. At this point you are saying “I’m done with the ownership of this pointer, it’s yours now”, a fair enough thing to say.
As your code base matures, you will need to say this less and less often.
One other place you will find that your code differs slightly from that using raw pointers is that you will now often be passing unique_ptr as a reference argument:
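A sketch of what that looks like, assuming a hypothetical type called counter and a routine I have named bump:

```cpp
#include <cassert>
#include <memory>

// Hypothetical example type.
struct counter {
    counter() : n(0) {}
    int n;
};

// Borrows the object through a reference to the caller's unique_ptr.
// No copy or move takes place, so ownership never leaves the caller.
void bump(std::unique_ptr<counter>& p) {
    ++p->n;
}

int use_by_reference() {
    std::unique_ptr<counter> p(new counter);
    bump(p);
    assert(p);     // the caller still owns the object after the call
    return p->n;
}
```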
When passing by reference, we don’t have to worry about values being inadvertently copied and muddying ownership. The effective meaning of passing in by reference is that of saying, go ahead and use this pointer, but the caller is responsible for lifetime management of the object.
The Cost
Proponents of unique_ptr argue that the cost of using this wrapper is minimal, and this seems to be true. Consider a pair of routines that increment a member of the class foo. One is passed a raw pointer, the other a unique_ptr.
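A sketch of such a pair follows. The member name baz and the routine names are assumptions of mine, as is the choice to pass the unique_ptr by reference:

```cpp
#include <cassert>
#include <memory>

// The class foo from the text; the member name baz is assumed.
struct foo {
    foo() : baz(0) {}
    int baz;
};

// Raw pointer version: p is the address of the foo itself.
void inc_raw(foo* p) {
    p->baz++;
}

// unique_ptr version: p refers to the wrapper, which holds the address.
void inc_unique(std::unique_ptr<foo>& p) {
    p->baz++;
}

int run_both() {
    std::unique_ptr<foo> p(new foo);
    inc_raw(p.get());
    inc_unique(p);
    return p->baz;
}
```

Both routines do the same work on the object; any difference in the generated code comes from how the pointer reaches them.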
In the disassembled code, compiled in Release mode, the unique_ptr version does indeed have one extra pointer dereference in comparison to the raw pointer version. While it is not so easy to count cycles these days, it seems likely that the difference between these two cases is going to be quite low, perhaps as low as 10%. A routine that performed multiple operations on the object would see the penalty reduced further.
One Final Note
I’m sold on unique_ptr, so you will be seeing it plenty in my code. However, you might be feeling a bit more cautious. And that’s okay: with C++11, it is pretty easy for you to test the waters without too much extra work. Just make sure that the routines that use pointers use the auto type to hold all pointers - this means no changes in the consumer code if you switch between raw pointers and unique_ptr. And routines that are passed these pointers can be defined as function templates, making it easy to adapt to whatever type of data is passed in.
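Here is a rough sketch of that approach. The type gauge, the two factory functions, and the inc template are all hypothetical names of mine:

```cpp
#include <cassert>
#include <memory>

// Hypothetical example type.
struct gauge {
    gauge() : count(0) {}
    int count;
};

// Two hypothetical factories: one returns a raw pointer, one a unique_ptr.
gauge* make_raw() { return new gauge; }
std::unique_ptr<gauge> make_owned() { return std::unique_ptr<gauge>(new gauge); }

// A function template accepts either kind of pointer, because both
// support operator->.
template <typename Ptr>
void inc(const Ptr& p) {
    ++p->count;
}

int try_both() {
    auto r = make_raw();     // deduced as gauge*
    auto u = make_owned();   // deduced as std::unique_ptr<gauge>
    inc(r);
    inc(u);
    int total = r->count + u->count;
    delete r;                // the raw pointer still needs a manual delete
    return total;
}
```

The consumer code is identical for both pointer types, apart from the delete that the raw pointer still needs.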
In a perfect world, with no legacy code requiring raw pointer semantics, unique_ptr can literally guarantee that you won’t leak data allocated from the free store. This was simply not possible in a reasonable way until C++11. Use it!