std.variant Is Everything Cool About D

Jared Hanson has been involved with the D community since 2012, and an active contributor since 2013. Recently, he joined the Phobos team and devised a scheme to make it look like he’s contributing by adding at least 1 tag to every new PR. He holds a Bachelor of Computer Science degree from the University of New Brunswick and works as a Level 3 Support Engineer at one of the largest cybersecurity companies in the world.

I recently read a great article by Matt Kline on how std::visit is everything wrong with modern C++. My C++ skills have grown rusty from disuse (I have long since left for the greener pastures of D), but I was curious as to how things had changed in my absence.

Despite my relative unfamiliarity with post–2003 C++, I had heard about the addition of a library-based sum type in std for C++17. My curiosity was mildly piqued by the news, although like many new additions to C++ in the past decade, it’s something D has had for years. Given the seemingly sensational title of Mr. Kline’s article, I wanted to see just what was so bad about std::visit, and to get a feel for how well D’s equivalent measures up.

My intuition was that the author was exaggerating for the sake of an interesting article. We’ve all heard the oft-repeated criticism that C++ is complex and inconsistent (even some of its biggest proponents think so), and it is true that the ergonomics of templates in D are vastly improved over C++. Nevertheless, the underlying concepts are the same; I was dubious that std::visit could be much harder to use than std.variant.visit, if at all.

For the record, my intuition was completely and utterly wrong.

Exploring std.variant

Before we continue, let me quickly introduce D’s std.variant module. The module centres around the Variant type: this is not actually a sum type like C++’s std::variant, but a type-safe container that can contain a value of any type. It also knows the type of the value it currently contains (if you’ve ever implemented a type-safe union, you’ll realize why that part is important). This is akin to C++’s std::any as opposed to std::variant; very unfortunate, then, that C++ chose to use the name variant for its implementation of a sum type instead. C’est la vie. The type is used as follows:

import std.variant;

Variant a;
a = 42;
assert(a.type == typeid(int));
a += 1;
assert(a == 43);

float f = a.get!float; //convert a to float
assert(f == 43);
a /= 2;
f /= 2;
assert(a == 21 && f == 21.5);

a = "D rocks!";
assert(a.type == typeid(string));

Variant b = new Object();
Variant c = b;
assert(c is b); //c and b point to the same object
b /= 2; //Error: no possible match found for Variant / int

Luckily, std.variant does provide a sum type as well: enter Algebraic. The name Algebraic refers to algebraic data types, of which one kind is a “sum type”. Another example is the tuple, which is a “product type”.

In actuality, Algebraic is not a separate type from Variant; the former is an alias for the latter that takes a compile-time specified list of types. The values which an Algebraic may take on are limited to those whose type is in that list. For example, an Algebraic!(int, string) can contain a value of type int or string, but if you try to assign a string value to an Algebraic!(float, bool), you’ll get an error at compile time. The result is that we effectively get an in-library sum type for free! Pretty darn cool. It’s used like this:

alias Null = typeof(null); //for convenience
alias Option(T) = Algebraic!(T, Null);

Option!size_t indexOf(int[] haystack, int needle) {
    foreach (size_t i, int n; haystack)
        if (n == needle)
            return Option!size_t(i);
    return Option!size_t(null);
}

int[] a = [4, 2, 210, 42, 7];
Option!size_t index = a.indexOf(42); //call indexOf like a method using UFCS
assert(!index.peek!Null); //assert that `index` does not contain a value of type Null
assert(index == size_t(3));

Option!size_t index2 = a.indexOf(117);
assert(index2.peek!Null);

The peek function takes a Variant as a runtime argument, and a type T as a compile-time argument. It returns a pointer to T that points to the Variant’s contained value iff the Variant contains a value of type T; otherwise, the pointer is null.

^{Note: I’ve made use of Universal Function Call Syntax to call the free function indexOf as if it were a member function of int[].}

In addition, just like C++, D’s standard library has a special visit function that operates on Algebraic. It allows the user to supply a visitor for each type of value the Algebraic may hold, which will be executed iff it holds data of that type at runtime. More on that in a moment.

To recap:

std.variant.Variant is the equivalent of std::any. It is a type-safe container that can contain a value of any type.
std.variant.Algebraic is the equivalent of std::variant and is a sum type similar to what you’d find in Swift, Haskell, Rust, etc. It is a thin wrapper over Variant that restricts what type of values it may contain via a compile-time specified list.
std.variant provides a visit function akin to std::visit which dispatches based on the type of the contained value.

With that out of the way, let’s now talk about what’s wrong with std::visit in C++, and how D makes std.variant.visit much more pleasant to use by leveraging its powerful toolbox of compile-time introspection and code generation features.

The problem with std::visit

The main problems with the C++ implementation are that – aside from clunkier template syntax – metaprogramming is very arcane and convoluted, and there are very few static introspection tools included out of the box. You get the absolute basics in std::type_traits, but that’s it (there are a couple third-party solutions, which are appropriately horrifying and verbose). This makes implementing std::visit much more difficult than it has to be, and also pushes that complexity down to consumers of the library, which makes using it that much more difficult as well. My eyes bled at this code from Mr. Kline’s article which generates a visitor struct from the provided lambda functions:

template <class... Fs>
struct overload;

template <class F0, class... Frest>
struct overload<F0, Frest...> : F0, overload<Frest...>
{
    overload(F0 f0, Frest... rest) : F0(f0), overload<Frest...>(rest...) {}

    using F0::operator();
    using overload<Frest...>::operator();
};

template <class F0>
struct overload<F0> : F0
{
    overload(F0 f0) : F0(f0) {}

    using F0::operator();
};

template <class... Fs>
auto make_visitor(Fs... fs)
{
    return overload<Fs...>(fs...);
}

Now, as he points out, this can be simplified down to the following in C++17:

template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

template <class... Fs>
auto make_visitor(Fs... fs)
{
    return overload<Fs...>(fs...);
}

However, this code is still quite ugly (though I suspect I could get used to the elipses syntax eventually). Despite being a massive improvement on the preceding example, it’s hard to get right when writing it, and hard to understand when reading it. To write (and more importantly, read) code like this, you have to know about:

Templates
Template argument packs
SFINAE
User-defined deduction guides
The ellipsis operator (…)
Operator overloading
C++-specific metaprogramming techniques

There’s a lot of complicated template expansion and code generation going on, but it’s hidden behind the scenes. And boy oh boy, if you screw something up you’d better believe that the compiler is going to spit some supremely perplexing errors back at you.

^{Note: As a fun exercise, try leaving out an overload for one of the types contained in your variant and marvel at the truly cryptic error message your compiler prints.}

Here’s an example from cppreference.com showcasing the minimal amount of work necessary to use std::visit:

template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

using var_t = std::variant<int, long, double, std::string>;
std::vector<var_t> vec = {10, 15l, 1.5, "hello"};

for (auto& v: vec) {
    std::visit(overloaded {
        [](auto arg) { std::cout << arg << ' '; },
        [](double arg) { std::cout << std::fixed << arg << ' '; },
        [](const std::string& arg) { std::cout << std::quoted(arg) << ' '; },
    }, v);
}

^{Note: I don’t show it in this article, but if you want to see this example re-written in D, it’s here.}

Why is this extra work forced on us by C++, just to make use of std::visit? Users are stuck between a rock and a hard place: either write some truly stigmata-inducing code to generate a struct with the necessary overloads, or bite the bullet and write a new struct every time you want to use std::visit. Neither is very appealing, and both are a one-way ticket to Boilerplate Hell. The fact that you have to jump through such ridiculous hoops and write some ugly-looking boilerplate for something that should be very simple is just… ridiculous. As Mr. Kline astutely puts it:

The rigmarole needed for std::visit is entirely insane.

We can do better in D.

The D solution

This is how the typical programmer would implement make_visitor, using D’s powerful compile-time type introspection tools and code generation abilities:

import std.traits: Parameters;

struct variant_visitor(Funs...)
{
    Funs fs;
    this(Funs fs) { this.fs = fs; }

    static foreach(i, Fun; Funs) //Generate a different overload of opCall for each Fs
        auto opCall(Parameters!Fun params) { return fs[i](params); }
}

auto make_visitor(Funs...)(Funs fs)
{
    return variant_visitor!Funs(fs);
}

And… that’s it. We’re done. No pain, no strain, no bleeding from the eyes. It is a few more lines than the C++ version, granted, but in my opinion, it is also much simpler than the C++ version. To write and/or read this code, you have to understand a demonstrably smaller number of concepts:

Templates
Template argument packs
static foreach
Operator overloading

However, a D programmer would not write this code. Why? Because std.variant.visit does not take a callable struct. From the documentation:

Applies a delegate or function to the given Algebraic depending on the held type, ensuring that all types are handled by the visiting functions. Visiting handlers are passed in the template parameter list. (emphasis mine)

So visit only accepts a delegate or function, and figures out which one to pass the contained value to based on the functions’ signatures.

Why do this and give the user fewer options? D is what I like to call an anti-boilerplate language. In all things, D prefers the most direct method, and thus, visit takes a compile-time specified list of functions as template arguments. std.variant.visit may give the user fewer options, but unlike std::visit, it does not require them to painstakingly create a new struct that overloads opCall for each case, or to waste time writing something like make_visitor.

This also highlights the difference between the two languages themselves. D may sometimes give the user fewer options (don’t worry though – D is a systems programming language, so you’re never completely without options), but it is in service of making their lives easier. With D, you get faster, safer code, that combines the speed of native compilation with the productivity of scripting languages (hence, D’s motto: “Fast code, fast”). With std.variant.visit, there’s no messing around defining structs with callable methods or unpacking tuples or wrangling arguments; just straightforward, understandable code:

Algebraic!(string, int, bool) v = "D rocks!";
v.visit!(
    (string s) => writeln("string: ", s),
    (int    n) => writeln("int: ", n),
    (bool   b) => writeln("bool: ", b),
);

And in a puff of efficiency, we’ve completely obviated all this machinery that C++ requires for std::visit, and greatly simplified our users’ lives.

For comparison, the C++ equivalent:

template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

std::variant<std::string, int, bool> v = "C++ rocks?";
std::visit(overloaded {
    [](const std::string& s) { std::cout << s << '\n'; },
    [](int  n) { std::cout << n << '\n'; },
    [](bool b) { std::cout << b << '\n'; }
}, v);

^{Note: Unlike the C++ version, the error message you get when you accidentally leave out a function to handle one of the types is comprehendable by mere mortals. Check it out for yourself.}

As a bonus, our D example looks very similar to the built-in pattern matching syntax that you find in many up-and-coming languages that take inspiration from their functional forebears, but is implemented completely in user code.

That’s powerful.

Other considerations

As my final point – if you’ll indulge me for a moment – I’d like to argue with a strawman C++ programmer of my own creation. In his article, Mr. Kline also mentions the new if constexpr feature added in C++17 (which of course, D has had for over a decade now). I’d like to forestall arguments from my strawman friend such as:

But you’re cheating! You can use the new if constexpr to simplify the code and cut out make_visitor entirely, just like in your D example!

Yes, you could use if constexpr (and by the same token, static if in D), but you shouldn’t (and Mr. Kline explicitly rejects using it in his article).There are a few problems with this approach which make it all-around inferior in both C++ and D. One, this method is error prone and inflexible in the case where you need to add a new type to your variant/Algebraic. Your old code will still compile but will now be wrong. Two, doing it this way is uglier and more complicated than just passing functions to visit directly (at least, it is in D). Three, the D version would still blow C++ out of the water on readability. Consider:

//D
v.visit!((arg) {
    alias T = Unqual!(typeof(arg)); //Remove const, shared, etc.

    static if (is(T == string)) {
        writeln("string: ", arg);
    }
    else static if (is(T == int)) {
        writeln("int: ", arg);
    }
    else static if (is(T == bool)) {
        writeln("bool: ", arg);
    }
});

vs.

//C++
visit([](auto& arg) {
    using T = std::decay_t<decltype(arg)>;

    if constexpr (std::is_same_v<T, string>) {
        printf("string: %s\n", arg.c_str());
    }
    else if constexpr (std::is_same_v<T, int>) {
        printf("integer: %d\n", arg);
    }
    else if constexpr (std::is_same_v<T, bool>) {
        printf("bool: %d\n", arg);
    }
}, v);

Which version of the code would you want to have to read, understand, and modify? For me, it’s the D version – no contest.

If this article has whet your appetite and you want to find out more about D, you can visit the official Dlang website, and join us on the mailing list or #D on IRC at freenode.net. D is a community-driven project, so we’re also always looking for people eager to jump in and get their hands dirty – pull requests welcome!.

5 thoughts on “std.variant Is Everything Cool About D”

rob March 29, 2018 at 6:53 pm

those printfs in the last one really hurt
R March 31, 2018 at 5:00 pm

Great article Jared, thanks! Also, I think there’s a small mistake in your last C++ example, ‘is_same’ should be ‘is_same::value’.
1. R March 31, 2018 at 5:08 pm
  
  It seems the comment section doesn’t like less-than/greater-than. ::value should be after greater-than in the ‘if constexpr’ condition.
  1. Jared April 1, 2018 at 4:21 pm
    
    Thanks, I copied the code verbatim from Matt’s article and didn’t check that it actually compiled.
Marcus Manning October 20, 2018 at 12:39 pm

Great article, I even didn’t know about the visit method.
Nonetheless, I’m a bit frustrated with dlangs algebraic/variant types as they should represent Union Types (untagged unions) found in other languages and not sum types (tagged unions) and therefore inherently including implicit conversion between Union types which is not done in dlang, e.g.

int i=1;double d=1.0, string s=”hello”

void fun(Algebraic!(int,double,string) a) {writenln(“Algebraic!(int,double,string)”)}
void fun(string a) {writenln(“string”)}
void fun(Algebraic!(int,float) a) {writenln(“Algebraic!(int,float)”)}

fun(i)//Algebraic!(int,float)
fun(s)//string
fun(d)//Algebraic!(int,double,string)

Also, this is a key property of union types (a.k.a existential types):
int gun(int i)=return i+1
double gun(double d)=return d/2
string gun(string s)=return s+” “+”world”

Algebraic!(int,double,string) a=”hello”
gun(a)//hello world, //always valid even if a gets initialized with integer or double

Comments are closed.

The D Blog

The official blog for the D Programming Language.

std.variant Is Everything Cool About D

Exploring std.variant

To recap:

The problem with std::visit

The D solution

Other considerations

5 thoughts on “std.variant Is Everything Cool About D”