Type-Safe Unions in C++
Last night I watched Ben Deane’s talk from CppCon on “Using Types Effectively”. In this talk he describes how to effectively use the type system of C++ to enforce invariants at compile-time and make code safer. I highly recommend watching the whole talk. I want to focus on one particular idea which formed the core of this presentation: the implementation of type-safe unions with std::variant.
In his talk, Ben outlines a situation which I have certainly seen occur before in C++ code where a stateful class bundles data for several different states together, even though some of those values might be invalid or unused in certain states.
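A minimal sketch of the kind of struct being described might look like the following. Only m_id, ConnectionState, and CONNECTED are named in this post; the remaining states and members are assumptions made up for illustration, not the code from the talk.

```cpp
#include <chrono>
#include <string>

// Only ConnectionState, CONNECTED, and m_id are named in the text; the other
// states and members are hypothetical, chosen purely for illustration.
enum class ConnectionState {
    DISCONNECTED,
    CONNECTING,
    CONNECTED,
    CONNECTION_INTERRUPTED
};

struct Connection {
    ConnectionState m_state = ConnectionState::DISCONNECTED;
    std::string m_serverAddress;  // used while CONNECTING and CONNECTED
    int m_id = -1;                // only meaningful while CONNECTED
    std::chrono::system_clock::time_point m_connectedTime;     // CONNECTED only
    std::chrono::system_clock::time_point m_disconnectedTime;  // CONNECTION_INTERRUPTED only
};

// Compiles and runs in every state, even when m_id holds no valid value.
int current_id(const Connection& c) { return c.m_id; }
```

Every member is reachable in every state, so nothing in the type system stops current_id from being called on a disconnected Connection.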
As Ben points out in his talk, there are many problems with this format. For example, m_id won’t be used unless the ConnectionState is CONNECTED. In all other states this value could be (and probably is) invalid, so it doesn’t make sense to allow access to it in those states. The solution to this problem presented by Ben was using a separate struct for each state, and combining them with a std::variant.
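A sketch of that restructuring, assuming a C++17 compiler, could look like this. The state struct names and most of their members are assumptions for illustration; only m_id and the Connection struct are named in the text.

```cpp
#include <chrono>
#include <string>
#include <variant>

// Hypothetical per-state structs: each one carries only the data that is
// valid while the connection is in that state.
struct Disconnected {};
struct Connecting {
    std::string m_serverAddress;
};
struct Connected {
    int m_id;
    std::chrono::system_clock::time_point m_connectedTime;
};
struct ConnectionInterrupted {
    std::chrono::system_clock::time_point m_disconnectedTime;
};

struct Connection {
    // Holds exactly one of the state structs at any time; a
    // default-constructed variant holds the first alternative.
    std::variant<Disconnected, Connecting, Connected, ConnectionInterrupted> m_connection;
};

// The active state is queried by type rather than through an enum value.
bool is_connected(const Connection& c) {
    return std::holds_alternative<Connected>(c.m_connection);
}
```

With this layout m_id simply does not exist unless the variant currently holds a Connected value.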
This immediately makes the use of an enum unnecessary since each state is now represented by an individual type, any of which may be held by the std::variant. It also separates out the state variables for each state, making the meaning of each clearer by embedding them within the context they relate to. Additionally, this struct should now be smaller in memory than the original: the variant only takes up as much space as its largest alternative (each of which is smaller than the original struct), plus a little overhead for the variant to store the index (tag) of the contained value and any alignment padding.
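The size claim can be checked at compile time. This is a rough sketch assuming C++17 and three hypothetical state structs; the exact overhead is implementation-defined, so no specific byte count is asserted.

```cpp
#include <cstddef>
#include <string>
#include <variant>

// Hypothetical state structs, for illustration only.
struct Disconnected {};
struct Connecting { std::string m_serverAddress; };
struct Connected { int m_id; };

using State = std::variant<Disconnected, Connecting, Connected>;

// The variant stores its active alternative inside itself, so it can never
// be smaller than any of its alternatives.
static_assert(sizeof(State) >= sizeof(Connecting));
static_assert(sizeof(State) >= sizeof(Connected));

// Bytes used beyond the Connecting alternative (the largest one here, since
// it contains a std::string): the tag plus any alignment padding.
constexpr std::size_t variant_overhead() {
    return sizeof(State) - sizeof(Connecting);
}
```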
Type-Safe Unions in Rust
While watching Ben’s presentation, I couldn’t help but feel I’d seen this all before… in Rust! I’m aware the idea for type-safe unions isn’t unique or original to Rust, but it’s the first language I’ve experimented with that has made them a first-class feature. In Rust, these type-safe unions are implemented using the language’s very powerful enumerations. The example from above adapted for Rust would look something like the following:
This doesn’t look all that dissimilar to the C++ implementation, except that we retain the separate declaration for the enum (rather than having it embedded in the Connection struct).
Using Type-Safe Unions
After looking at std::variant next to Rust’s tagged enum, I can’t help but feel that usage is slightly more ergonomic in Rust. This isn’t particularly surprising given the legacy C++ is bound to.
Initializing the union is easy and concise in both languages:
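On the C++ side, initialization might be sketched as follows, assuming C++17. The state structs and helper function names are hypothetical and exist only for this example.

```cpp
#include <string>
#include <utility>
#include <variant>

// Hypothetical state structs, for illustration only.
struct Disconnected {};
struct Connecting { std::string m_serverAddress; };
struct Connected { int m_id; };

using Connection = std::variant<Disconnected, Connecting, Connected>;

// A default-constructed variant holds its first alternative (Disconnected).
Connection make_disconnected() { return Connection{}; }

// A value of any alternative type converts implicitly to the variant.
Connection make_connecting(std::string address) {
    return Connecting{std::move(address)};
}

// Assignment switches an existing variant over to a different alternative.
void mark_connected(Connection& c, int id) { c = Connected{id}; }
```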
There are three main options for extracting a value from a std::variant. The first two are to:
- Use std::get<type>(variant) to get the value for a specific alternative of the variant. Throws a std::bad_variant_access exception if the variant isn’t currently holding a value of the given type.
- Use std::get_if<type>(&variant) to get a pointer to the value contained in the variant. Returns nullptr if the variant doesn’t contain a value of the given type.
The exception thrown by std::get will be problematic for some use cases, but can be avoided by first checking the variant state with the std::holds_alternative<type>(variant) non-member function, or by using std::get_if<type>(&variant) instead.
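The three accessors can be compared side by side in a small sketch, assuming C++17. The state structs and the helper names (id_or_throw, id_or_default, try_id) are hypothetical.

```cpp
#include <optional>
#include <variant>

// Hypothetical state structs, for illustration only.
struct Disconnected {};
struct Connected { int m_id; };

using Connection = std::variant<Disconnected, Connected>;

// std::get throws std::bad_variant_access if c does not hold a Connected.
int id_or_throw(const Connection& c) {
    return std::get<Connected>(c).m_id;
}

// Guarding with std::holds_alternative keeps std::get from throwing.
int id_or_default(const Connection& c, int fallback) {
    if (std::holds_alternative<Connected>(c)) {
        return std::get<Connected>(c).m_id;
    }
    return fallback;
}

// std::get_if returns nullptr on a mismatch, so the check and the access
// happen in one step.
std::optional<int> try_id(const Connection& c) {
    if (const Connected* connected = std::get_if<Connected>(&c)) {
        return connected->m_id;
    }
    return std::nullopt;
}
```

Wrapping the std::get_if result in a std::optional here is just one way to hide the raw pointer from callers.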
The third option [2] is the std::visit non-member function. It takes a function object (e.g. a lambda or a set of overloaded lambdas) that is callable with every type the variants can hold, plus a list of variants. It then invokes the operator() overload corresponding to the types currently held by the variants passed to the function. A generic lambda also works.
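A sketch of std::visit with one lambda per alternative, assuming C++17. The overloaded helper is a widely used idiom rather than part of the standard library, and the state structs and describe function are hypothetical.

```cpp
#include <string>
#include <variant>

// Hypothetical state structs, for illustration only.
struct Disconnected {};
struct Connecting { std::string m_serverAddress; };
struct Connected { int m_id; };

using Connection = std::variant<Disconnected, Connecting, Connected>;

// Helper idiom (not in the standard library): inherits operator() from
// every lambda it is given, producing a single overload set.
template <class... Ts>
struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;

// std::visit calls whichever overload matches the alternative currently
// held by c. Leaving out an alternative is a compile error.
std::string describe(const Connection& c) {
    return std::visit(
        overloaded{
            [](const Disconnected&) { return std::string("disconnected"); },
            [](const Connecting& s) { return "connecting to " + s.m_serverAddress; },
            [](const Connected& s) { return "connected with id " + std::to_string(s.m_id); }
        },
        c);
}
```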
Rust has two main ways to access the value inside an enum. The first is by using the match construct. match is similar to C++’s switch statement, but it is much more powerful and doesn’t have the same foot-cannons as switch.
This is a pretty nice way to handle extracting values from the enum, and it has features like pattern matching and exhaustiveness checking which help make it versatile and safe at the same time. My original intuition was that it might be possible to do something like this in C++ with std::variant using a mix of the switch statement, std::variant_alternative, and some constexpr or template metaprogramming, and that until C++17 ships and implementations appear in the wild (likely sometime next year) I would just have to imagine how/whether this would work. [2] In C++, std::visit can be used on a std::variant to similar effect, but without some of the extra goodies and guarantees offered by Rust’s match.
The other way to extract a value from an enum in Rust is to use the if let or while let expressions. This allows conditional binding to the enum value if the tag matches the one specified.
The while let expression in Rust is a similar conditional binding, but with the ability to loop until the enum cannot be unpacked. This can be simulated in C++ using std::get and std::holds_alternative:
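One way such a loop might be sketched, assuming C++17. The state structs, the m_attemptsLeft member, and the retry and drive functions are all hypothetical and exist only to give the loop something to do.

```cpp
#include <variant>

// Hypothetical state structs, for illustration only.
struct Disconnected {};
struct Connecting { int m_attemptsLeft; };
struct Connected { int m_id; };

using Connection = std::variant<Disconnected, Connecting, Connected>;

// Hypothetical transition: use up one attempt, and connect (with a made-up
// id of 42) once the last attempt has been used.
Connection retry(const Connecting& state) {
    if (state.m_attemptsLeft > 1) {
        return Connecting{state.m_attemptsLeft - 1};
    }
    return Connected{42};
}

// Loops for as long as c holds a Connecting value, much like Rust's
// `while let`. Returns the number of iterations performed.
int drive(Connection& c) {
    int steps = 0;
    while (std::holds_alternative<Connecting>(c)) {
        // Safe: the loop condition guarantees std::get will not throw.
        c = retry(std::get<Connecting>(c));
        ++steps;
    }
    return steps;
}
```

Unlike while let, the check and the access are two separate calls here, so it is up to the programmer to keep the types in both in sync.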
Closing Comments
I agree with Ben Deane’s closing statement: std::variant will be one of the most important additions to C++ with the introduction of the C++17 standard. It clearly has some deficiencies and ergonomic issues, but it’s largely the tool C++ deserves (and one it needs). To me, the biggest disappointment in the std::variant API is the use of a pointer for the std::get_if return value instead of a std::optional (also coming in C++17). However, given std::optional’s history of being delayed from standardisation, I can understand the reluctance of the std::variant authors to do so.
Rust provides an interesting insight into what a “clean room” implementation of such a variant type might look like, and has excellent first-class facilities for handling tagged unions/variants. I prefer the ergonomics of Rust’s approach, but for integration into existing projects and codebases std::variant strikes a practical compromise. I look forward to making use of it when C++17 finally arrives.
Updates:
[1] /u/ssokolow mentioned on the Rust subreddit that this pattern can be taken even further for state machines (at least in Rust).
[2] /u/evaned correctly pointed out that I forgot to mention the std::visit function, which works much the same as Rust’s match and improves the ergonomics of std::variant.