…yet

This post was born from a Reddit thread regarding observer_ptr.

For those unfamiliar, observer_ptr is the “dumb” smart pointer. It simply provides some of the semantics of a T* in a class type.

Why would I want a class to wrap T* when I can just use T*?

Here’s a few good reasons to use an observer_ptr:

  1. observer_ptr<T> captures your intent, just like unique_ptr, shared_ptr and T&. In this case, your intent is to have a non-owning reference to an object.
  2. observer_ptr<T> can be instrumented with runtime debug checks if you try to use it in an invalid manner. This alone is my favorite reason.
  3. observer_ptr<T> does not provide operator[] nor arithmetic.

The thread in question was regarding the nameobserver_ptr”. The name doesn’t quite roll off the tough, and it may cause confusion with concepts like the observer pattern, to which it is unrelated.

I like the idea of observer_ptr. I hope the proposal goes through, but I’d like to address an orthogonal issue and suggest a completely new pointer/reference type that I haven’t seen yet:

optional_ref<T>

Or maybe just

opt_ref<T>

It will fill some gaps we have in our reference/pointer toolbox.

Why Do We Keep Using T*?

Some modern C++ evangelists say “Never use raw pointers!” Other evangelists admit that they have a valid purpose and instead limit themselves to “Never use raw new or delete.”

I’m split between the two camps. I’d rather see the T* syntax be relegated to low-level libraries and never used again, but I admit it still has valid use cases with no good alternatives.

Why T* Is Evil

A huge problem with raw pointers is overloaded semantics. A raw pointer T* may be used as:

  1. A reference to a T
  2. A reference to a T that can be “re-bound”
  3. A reference to an array of T
  4. A reference to an array of T that can be “re-bound”
  5. An optional reference to a T
  6. An optional reference to an array of T

This is a big problem with using T*.

especially in code that follows certain style guides mandating its use. You know the one.

Modern C++ provides us with the following alternatives for each use case of T*, each with a distinct meaningful use case :

  1. We have T& to represent non-nullable references. It’s one of my favorite C++ features. Implicitly nullable references in other languages are complete insanity. It should be the first choice when choosing a reference type.

  2. We have std::reference_wrapper<T>. It may be a mouthful and a lot to type, but where it really shines is when used in containers or as a data member of a class.

    Using a T& as a class member is a difficult. It prevents implementation of class assignment and effectively makes the class “immovable.” Maybe this is what you want: Go for it.

    Using a T& in many containers simply doesn’t work. Since many containers perform assignments internally, we reach the same issue as using T& as a data member.

    In old C++ we’d reach for T* to answer this problem, but now we have a better solution: std::reference_wrapper<T>. It has no default constructor, so one won’t accidentally leave it “un-bound” like one might with a T*. It provides a never-null guarantee like a real T&, and it automatically converts to and from a T&, so passing and returning it as a T& is completely invisible and unsurprising.

    The only downside is that you can’t use the regular . operator to get at the members of T. You need to call .get() or bind it to a real T&. Not a big problem for the upsides, in my opinion. Maybe someday we can overload operator. too.

  3. For references to contiguous sequences, we have a few alternatives, the only standard one being iterator pairs. Using iterator pairs is great if you want to support non-contiguous ranges with no performance penalty on contiguous ones.

    Soon, we’ll also have Ranges, which are even better.

    For contiguous sequences, let’s hope for span<T> or array_view<T>.

  4. Iterators pairs and span<T>/array_view<T> solve the rebinding problem nicely. Nothing more to say.

  5. All we have today is T* or… shudder std::optional<std::reference_wrapper<T>>. This is where an optional_ref<T> could save the day.

  6. Optional arrays can be conveyed with std::optional of another type.

We’re really missing two big cases: (4) and (5).

(4) is solved by gsl::span, and will hopefully have a standard counterpart in the future.

(5) has an ugly solution: std::optional<std::reference_wrapper<T>>. Can’t want.

A Brief and Horribly Incomplete History Lesson on std::optional<T>

std::optional<T> is one of my favorite library additions ever. In technical terms, it creates a type with the domain of all T values plus a null T value. Even if T is conceptually infinite (like std::string, where there are infinite possible values), we can construct a value outside the domain of T, which we represent in code with std::nullopt.

It’s a way to say “Maybe there’s a T. Or there’s this sentinel ‘null’ value”.

Before std::optional was std::optional, it was boost::optional. It had very similar semantics, with a two notable differences:

  • std::nullopt is boost::none.
  • boost::optional supports references. (!!!)

Why did std::optional drop support for reference type parameters?

The problem with having a std::optional<T&> boils down to code like this:

int a = 1729;
int b = 42;
std::optional<int&> int_ref = a;
int_ref = b;
cout << a;  // <-- What should this print?

There are two semantics that might be taken here:

  1. int_ref = b re-binds the reference of int_ref to refer to b.
  2. int_ref = b assigns through to a.

In case (1), the value of a is unaffected by the assignment, so it prints 1729. In case (2), the value of b is assigned to a, and we print 42.

boost::optional takes the first route and re-binds the reference.

As an aside, I used to ardently believe that case (1) is the only sensible behavior to expect. Even while writing this post, I found an insightful comment from Tony van Eerd on the exact topic. Even though behavior (2) is possible to occur, I’d be extremely surprised to see it ever happen should std::optional<T> have been given valid behavior for reference types. I still think std::optional<T&> should have gone through with rebind semantics, but what’s done is done.

This reference-binding confusion (along with some concerns about total ordering of std::optional) was a big hold-up that prevented us from getting std::optional in C++14 as originally hoped.

What about std::optional<std::reference_wrapper<T>>?

A few things:

  • operator* and operator-> on std::optional<std::reference_wrapper<T>> return a std::reference_wrapper<T>, not a T&. If you’re immediately passing the object as a parameter this isn’t a problem, but it makes usage code ugly with calls to .get() everywhere.
  • it has more overhead than a raw pointer T* (without some standard-library trickery). std::reference_wrapper<T> can be implemented using a simple raw pointer, and the simplest std::optional<T> implementation consumes an std::aligned_storage<T> plus a bool to represent the state. A smart standard library could detect the reference_wrapper parameter and instead use a regular pointer with nullptr being the “dis-engaged” state.
  • It’s so much typing, and we’re lazy.

And worst of all:

It’s not constructible from a T&&.

This is an intentional design of std::reference_wrapper, to prevent this code:

std::reference_wrapper<const std::string> my_str = std::string("Hello!");
// my_str now points into the void

Unfortunately, that also prevents this code from working:

void lang_ref(const MyClass&);
void lib_ref(std::reference_wrapper<const MyClass>);

void bar() {
    lang_ref(MyClass{}); // <-- Okay
    lib_ref(MyClass{});  // <-- ERROR!
}

The reference_wrapper<const MyClass> parameter will not bind to a MyClass&& in the above example. This isn’t usually a problem because using std::reference_wrapper<T> as a function parameter is rarely what you actually want.

Because of this prohibition on r-values for reference_wrapper binding, we similarly can’t do this:

void with_optional_ref(std::optional<std::reference_wrapper<const MyClass>>)

void bar() {
    with_optional_ref(MyClass{});
}

optional<T> is only convertible from a U if U is implicitly convertible to T. For a U of MyClass&& and a T of reference_wrapper<const MyClass>, this convertibility is not allowed, so the U cannot convert to optional<T>. Dang.

An Alternative: opt_ref<T>

An extremely primitive implementation of an “optional reference” type might look like this:

template <typename T>
class opt_ref {
    T* _ptr = nullptr;

public:
    opt_ref() noexcept = default;
    opt_ref(T& reference) noexcept
        : _ptr(std::addressof(reference)) {}
    opt_ref(std::nullopt_t) noexcept
        : _ptr(nullptr) {}

    explicit operator bool() const noexcept { return _ptr != nullptr; }

    T& operator*() noexcept {
        assert(_ptr != nullptr && "Dereferencing null opt_ref");
        return *_ptr;
    }

    T* operator->() noexcept {
        assert(_ptr != nullptr && "Dereferencing null opt_ref");
        return _ptr;
    }
};

Here I’ve chosen the name opt_ref for brevity, and because I think it suffices to explain its purpose.

This definition affords a few niceties over both optional<reference_wrapper<T>> and T* as an “optional reference” type.

Over optional<reference_wrapper<T>>:

  • It’s convertible from a T&& for the case of opt_ref<const T>. This can be dangerous if misused, but that’s what C++ is, right?
  • The operator* and operator-> both return the underlying T&. Nice.
  • The same size overhead as a T*.
  • It’s less to type. A lot less.

Over T*:

  • None of the overloaded semantics. It is clear in what it represents.
  • It will bind to plain expressions. No need to invoke operator& or std::addressof.
  • We can put debug assertions on it’s operator* and operator->.

“Why Should I Care?”

It’s may be difficult to see utility in such a tiny class template. It doesn’t solve a huge swath of problems we face. It doesn’t fundamentally change the way one write’s code. It’s just a silly wrapper around a pointer.

But that’s the thing: Everything is just a wrapper around pointers. Some are just bigger than others. std::vector? Just a wrapper around some pointers. std::unique_ptr? Just a wrapper around a pointer. std::reference_wrapper? Just a wrapper of a pointer.

C++ is about building abstractions. The abstractions lead to better applications. After all, libraries are useless until they end up as part of an application.

Libraries like Jonathan Müller’s output_parameter<T> and type_safe library, Niall Douglas’s Outcome (hopefully someday std::expected), or std::chrono are the real bread and butter of C++. These aren’t enormous all-encompassing frameworks that implement half the universe. They’re just little libraries that build upon lower-level primitives to reduce errors and create expressive, meaningful code.

I won’t claim opt_ref is anything as fantastic as the libraries I just mentioned. It’s a concept anyone could have easily come up with and implemented in five minutes like I did.

It was a five minutes well spent. For the relevant project, I swapped out all places using a T* and changed to opt_ref<T>. The payoff was fast, and I could now have real optional reference parameters. It saved me time debugging as well when I accidentally dereference them without checking and I see an assertion fire exactly where the code hit, giving me a clear and meaningful error message in my stderr output.

Finally, I ask: Why keep using pointers when we have better alternatives?