Values, Pointers and References in C++

If you've primarily used high-level languages like Python, you may not be used to explicitly thinking about the ownership or representation of your values in memory.^[1] In systems languages like C++ or Rust, we have direct control over these aspects, and can use the type system to state explicitly when a function takes ownership of a value versus when it only takes a (temporary) reference.^[2]

First, different types of ownership, in order of preference:

T t. A normal owned value of type T, uniquely owned. If declared as a local variable, it is stored on the stack; if it is a member variable of a class or struct, it is stored inline in the enclosing object. In either case the value is destroyed when the variable (or the object containing it) goes out of scope.
std::unique_ptr<T> t. Owned value of type T that is stored on the heap, uniquely owned: when the unique_ptr goes out of scope, the value is freed. The pointer cannot be copied (this would violate unique ownership), only moved.
std::shared_ptr<T> t. Shared value of type T, stored on the heap. Copies of the pointer can be made, all of which refer to the same value. The value is only freed once all copies of the pointer have gone out of scope.
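The three kinds of ownership listed above can be sketched roughly as follows. This is a minimal illustration, not a recommended design: Widget, consume, shared_owner_count and ownership_demo are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type used only for illustration.
struct Widget {
    std::string name;
};

// Taking a unique_ptr by value means "this function takes ownership".
// The caller has to std::move its pointer in.
std::string consume(std::unique_ptr<Widget> w) {
    return w->name;  // w goes out of scope here and the Widget is freed
}

// Counts the owners of one Widget while two shared_ptr copies are alive.
long shared_owner_count() {
    auto first = std::make_shared<Widget>(Widget{"shared"});
    std::shared_ptr<Widget> second = first;  // copy: both refer to the same Widget
    assert(first.get() == second.get());
    return first.use_count();
}

std::string ownership_demo() {
    Widget on_stack{"value"};  // T t: lives on the stack, destroyed at end of scope
    auto on_heap = std::make_unique<Widget>(Widget{"unique"});
    // auto copy = on_heap;    // would not compile: unique_ptr cannot be copied
    std::string moved_name = consume(std::move(on_heap));
    assert(on_heap == nullptr);  // a moved-from unique_ptr is null
    return on_stack.name + "," + moved_name;
}
```

The signature of consume is where the type system does the documenting: a caller cannot hand over the Widget without writing std::move, so the transfer of ownership is visible at the call site.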

Use plain values when possible. Reach for unique_ptr when the value may be very large, or when it needs to be moved but the type itself cannot be moved or would be expensive to move. shared_ptr should be used only rarely: shared ownership is much harder to reason about than unique ownership (the ownership graph may be cyclic rather than an acyclic tree). Often there is an alternative, simpler design with unique ownership.
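As one possible illustration of "needs to be moved when it otherwise could not be moved": a struct holding a std::mutex is neither copyable nor movable, but a unique_ptr to it is movable, so it can be stored in a growing std::vector. Counter and sum_counters are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical type: std::mutex cannot be copied or moved,
// so Counter cannot be copied or moved either.
struct Counter {
    std::mutex lock;
    int value = 0;
};

static_assert(!std::is_move_constructible<Counter>::value,
              "Counter itself cannot be moved");
static_assert(std::is_move_constructible<std::unique_ptr<Counter>>::value,
              "a unique_ptr to a Counter can be moved");

int sum_counters() {
    std::vector<std::unique_ptr<Counter>> counters;
    for (int i = 1; i <= 3; ++i) {
        auto c = std::make_unique<Counter>();
        c->value = i;
        counters.push_back(std::move(c));  // moves the pointer, not the Counter
    }
    assert(counters.size() == 3);

    int total = 0;
    for (const auto& c : counters) {
        std::lock_guard<std::mutex> guard(c->lock);
        total += c->value;
    }
    return total;
}
```

Each Counter stays at a fixed heap address for its whole lifetime; only the pointers are shuffled around when the vector reallocates.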

Without ownership:

T& t. Reference to an object of type T. The object is guaranteed to exist^[3]. The reference cannot be changed to refer to a different object.
T* t. Pointer to an object of type T or to nothing. Can be changed to point at different objects, or to not point at any object at all (nullptr).

Prefer a reference over a pointer if possible, and a const reference over a mutable reference.
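A short sketch of how these choices might look in function signatures. All names here (length_of, append_bang, find_first_starting_with, reseat_demo) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// const reference: the argument must exist and will not be modified.
std::size_t length_of(const std::string& s) { return s.size(); }

// Mutable reference: the argument must exist and may be modified.
void append_bang(std::string& s) { s += "!"; }

// Pointer: "no object" is a valid answer, so callers must check for nullptr.
const std::string* find_first_starting_with(const std::vector<std::string>& words,
                                            char c) {
    for (const std::string& w : words) {
        if (!w.empty() && w.front() == c) return &w;
    }
    return nullptr;
}

// A reference stays bound to one object; a pointer can be re-pointed.
int reseat_demo() {
    int a = 1;
    int b = 2;
    int& ref = a;
    ref = b;             // assigns b's value to a; ref still refers to a
    assert(&ref == &a);
    int* ptr = &a;
    ptr = &b;            // ptr now points at b instead of a
    *ptr = 5;            // modifies b
    return a * 10 + b;   // a == 2, b == 5
}
```

The pointer return type in find_first_starting_with is used here precisely because the result is optional; where a value always exists, a reference states that guarantee in the signature.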

For strings, the owned type is std::string and the reference type is std::string_view. For dense arrays, the owned types are std::array (fixed size) and std::vector (dynamic size), and the reference type is std::span.

[1] You might have had to think about it when using a cache or keeping long-lived references around: even in Python you'll end up with a memory leak if you, e.g., naively use owned entries in a cache.
[2] In Python, we also need to provide this information, but can only do so in the human-readable documentation, which may or may not be read by a developer, and can become out of date without causing any obvious errors in our tests.
[3] Creating a dangling reference is undefined behaviour.

Tags: programming, c++