Category Archives: Code

Crafting Self-Evident Code with D

Digital Mars D logo

Have you ever looked at your code from five years ago and had to study it to figure out what it was doing? And the further back in time you look, the worse it gets? Pity me, who is still maintaining code I wrote over 40 years ago. This article illustrates many simple methods of making your code self-evident and much easier to understand and maintain

To let you know what you’re fightin’ for, allow me to introduce this little gem I wrote back in 1987:

#include <stdio.h>
#define O1O printf
#define OlO putchar
#define O10 exit
#define Ol0 strlen
#define QLQ fopen
#define OlQ fgetc
#define O1Q abs
#define QO0 for
typedef char lOL;

lOL*QI[] = {"Use:\012\011dump file\012","Unable to open file '\x25s'\012",
  "\012","   ",""};

main(I,Il)
lOL*Il[];
{       FILE *L;
         unsigned lO;
         int Q,OL[' '^'0'],llO = EOF,

         O=1,l=0,lll=O+O+O+l,OQ=056;
         lOL*llL="%2x ";
         (I != 1<<1&&(O1O(QI[0]),O10(1011-1010))),
         ((L = QLQ(Il[O],"r"))==0&&(O1O(QI[O],Il[O]),O10(O)));
         lO = I-(O<<l<<O);
         while (L-l,1)
         {       QO0(Q = 0L;((Q &~(0x10-O))== l);
                         OL[Q++] = OlQ(L));
                 if (OL[0]==llO) break;
                 O1O("\0454x: ",lO);
                 if (I == (1<<1))
                 {       QO0(Q=Ol0(QI[O<<O<<1]);Q<Ol0(QI[0]);
                         Q++)O1O((OL[Q]!=llO)?llL:QI[lll],OL[Q]);/*"
                         O10(QI[1O])*/
                         O1O(QI[lll]);{}
                 }
                 QO0 (Q=0L;Q<1<<1<<1<<1<<1;Q+=Q<0100)
                 {       (OL[Q]!=llO)? /* 0010 10lOQ 000LQL */
                         ((D(OL[Q])==0&&(*(OL+O1Q(Q-l))=OQ)),
                         OlO(OL[Q])):
                         OlO(1<<(1<<1<<1)<<1);
                 }
                 O1O(QI[01^10^9]);
                 lO+=Q+0+l;}
         }
         D(l) { return l>=' '&&l<='\~';
}

Yes, this is how we wrote C code back then. I even won an award for it!

Although I am a very slow learner, I do learn over time, and gradually the code got better. You’re probably having the same issues with your code. (Or the code written by coworkers, as I agree that your code certainly does not need improvement!)

This article is about techniques that will help make code self-evident. You are probably already doing some of them. I bet there are some you aren’t. I also am sure you’re going to argue with me about some of them. Trust me, you’re wrong! If you don’t agree with me now, you will if you’re still programming five years hence.

I know you’re busy, so let’s jump right in with an observation:

“Anybody can write complicated code. It takes genius to write simple code.”

or, if you prefer:

“The highest accolade your code can garner is: oh pshaw, anybody could have
written that!”

For example, since I started as an aerospace engineer:

consider this lever commonly found in aircraft cockpits. No fair if you already know what it does. Examine it casually. What is it for?

.

.

.

It raises and lowers the landing gear. What’s the clue? It’s got a little tire for a knob! Pull the lever up, and the gear gets sucked up. Push it down, the gear goes down. Is that self-evident or what? It’s a masterpiece of simplicity. It doesn’t even need any labels. If the cockpit is filled with smoke, or you’re focused on what’s outside the window, your hand knows immediately it’s on the gear lever—not the flaps or the throttles or the copilot’s ejection seat (just kidding). This kind of stupid simple control is what cockpit designers strive for because pulling the right lever is literally a life-and-death decision. I mean literally in the literal sense of the word.

This is what we desperately want to achieve in programming. Stupid simple. We’ll probably fail, but the closer the better.

Diving in…

Just Shoot Me Now

#define BEGIN {
#define END   }

Believe it or not, this was common C practice back in the 1980s. It falls into the category of “Don’t try to make your new language look like your previous language”. This problem appears in multiple guises. I still tend to name variables in Fortran style from back when the oceans hadn’t yet formed. Before moving to D, I realized that using C macros to invent a personal custom language on top of C was an abomination. Removing it and replacing it with ordinary C code was a big improvement in clarity.

Don’t Reinvent bool

Learn what bool is and use it as intended. Accept that the following are all the same:

false true
0 1
no yes
off on
0 volts 5 volts

And that this makes code unequivocally worse:

enum { No, Yes };

Just use false and true. Done. And BTW,

enum { Yes, No };

is just an automatic “no hire” decision, as if (Yes) will thoroughly confuse everyone. If you’ve done this, run and fix it before someone curses your entire ancestry.

Horrors Blocked by D

D’s syntax has been designed to prevent some types of coding horrors.

C++ Regex expressions with operator overloading

I’m not even going to link to this. It can be found with diligent searching. What it does is use operator overloading to make ordinary-looking C++ code actually be a regex! It violates the principle that code should not pretend to be in another language. Imagine the inept error messages the compiler will bless you with if there’s a coding mistake with this. D makes this hard to do by only allowing arithmetic operators to be overloaded, which disallows things like an overloaded unary *.

(It’s harder, but still possible, to abuse operator overloading in D. But fortunately, making it harder has largely discouraged it.)

Metaprogramming with macros

Many people have requested macros be added to D. We’ve resisted this because macros inevitably result in people inventing their own custom, undocumented language layered over D. This makes it impractical for anyone else to make use of this code. In my not-so-humble opinion, macros are the reason why Lisp has never caught on in the mainstream. No Lisper can read anyone else’s Lisp code.

C++ Argument Dependent Lookup

Nobody knows what symbol will actually be found. ADL was added so one could do operator overloading on the left operand. D just has a simple syntax for left or right operand overloading.

SFINAE

Nobody knows if SFINAE is in play or not for any particular expression.

Floor Wax or Tasty Dessert Topping

This refers to the confusion between a struct being a value type or a reference type or some chimera of both. In D, a struct is a value type and a class is a reference type. To be fair, some people still try to build a D chimera type, but they should be cashiered.

Multiple inheritance

Nobody has ever made a convincing case for why this is needed. Things get really nasty when diamond inheritance is used. Pity the next guy and avoid the temptation. D has multiple inheritance for interfaces only, which has proved to be more than adequate.

Code Flow

Code should flow from left to right, and top to bottom. Just like how this article is read.

f() + g() // which executes first?

Fortunately, D guarantees a left-to-right ordering (C does not). But what about:

g(f(e(d(c(b(a))),3)));

That executes inside out! Quick, which function call does the 3 get passed to? D’s Universal Function Call Syntax to the rescue:

a.b.c.d(3).e.f.g;

That’s the equivalent, but execution flows clearly left-to-right. Is this an extreme example, or the norm?

import std.stdio;
import std.array;
import std.algorithm;

void main() {
     stdin.byLine(KeepTerminator.yes).
     map!(a => a.idup).
     array.
     sort.
     copy(stdout.lockingTextWriter());
}

This code reads from stdin by lines, puts the lines in an array, sorts the array, and writes the sorted result to stdout. It doesn’t quite meet our “stupid simple” criteria, but it is pretty close. All with a nice flow from left to right and top to bottom.

The example also nicely segues into the next observation.

The More Control Paths, the Less Understandable

Shaw: You know a great deal about computers, don’t you?
Mr. Spock: I know all about them.

I submit that:

version (X)
     doX();
doY();
if (Z)
     doZ();

is less comprehensible than:

doX();
doY();
doZ();

What happened to the conditional expressions? Move them to the interiors of doX() and doZ().

I know what you’re thinking. “But Walter, you didn’t eliminate the conditional expressions, you just moved them!” Quite right, but those conditional expressions properly belong in the functions, rather than enclosing those functions. They are part of the conceptual encapsulation a function provides, so the caller is clean.

Negation

Negation in English:

Dr McCoy: We’re trying to help you, Oxmyx.
Bela Oxmyx: Nobody helps nobody but himself.
Mr. Spock: Sir, you are employing a double negative.

Cowardly Lion: Not nobody! Not nohow!

Negation in English is often used as emphasis, rather than logical negation. Our perception of negation is fuzzy and fraught with error. This is something propagandists use to smear someone.

What the propagandist says: “Bob is not a drunkard!”

What the audience hears: “Bob is a drunkard!”

Skilled communicators avoid negation. Savvy programmers do, too. How many times have you overlooked a not operator? I have many times.

if (!noWay)

is inevitably perceived as:

if (noWay)

I mentioned this discovery to my good friend Andrei Alexandrescu. He didn’t buy it. He said I needed research to back it up. I didn’t have any research, but didn’t change my mind (i.e., hubris). Eventually, I did run across a paper that did do such research and came to the same conclusion as my assumption. I excitedly sent it to Andrei, and to his great credit, he conceded defeat, which is why Andrei is an exceptional man (rare is the person who ever concedes defeat!).

The lesson here is to avoid using negation in identifiers if at all possible.

if (way)

Isn’t that better?

DMD Source Code Hall of Shame

My own code is hardly a paragon of virtue. Some identifiers:

  • tf.isnothrow
  • IsTypeNoreturn
  • Noaccesscheck
  • Ignoresymbolvisibility
  • Include.notComputed
  • not nothrow

I have no excuse and shall have myself flagellated with a damp cauliflower. Did I say I didn’t like the code I wrote five years ago?

This leads us to the D version conditional.

Negation and version

D version conditionals are very simple:

version ( Identifier )

Identifier is usually predefined by the compiler or the command line. Only an identifier is allowed—no negation, AND, OR, or XOR. (Collectively call that version algebra.) Our users often chafe at this restriction, and I get that it’s difficult to accept the rationale at first. It’s not impossible to do version algebra:

version (A) { } else {
    // !A
}

version (A) version (B) {
    // A && B
}

version (A) version = AorB;
version (B) version = AorB;
version (AorB) {
     // A || B
}

and so forth. But it’s clumsy and unattractive on purpose. Why would D do such a thing? It’s meant to encourage thinking about versions in a positive manner. Suppose a project has a Windows and an OSX build:

version (Windows) {
     ...
}
else version (OSX) {
     ...
}
else
     static assert(0, "unsupported operating system");

Isn’t that better than this:

...
version (!Windows){
...
}

I’ve seen an awful lot of that style in C. It makes it pointlessly difficult to add support for a new operating system. After all, what the heck is the “not Windows” operating system? That really narrows things down! The former snippet makes it so much easier.

Taking this a step further:

if (A && B && C && D)

if (A || B || C || D)

are easy for a human to parse. Encountering:

if (A && (!B || C))

is always like transitioning from smooth asphalt to a cobblestone road. Ugh. I’ve made mistakes with such constructions all the time. Not only is it hard to even see the !, but it’s still hard to satisfy yourself that it is correct.

Fortunately, De Morgan’s Theorem can sometimes come to the rescue:

(!A && !B) => !(A || B)
(!A || !B) => !(A && B)

It gets rid of one negation. Repeated application can often transform it into a much more easily understood equation while being equally correct.

Anecdote: When designing digital logic circuits, the NAND gate is more efficient than the AND gate because it has one less transistor. (AND means (A && B), NAND means !(A && B)). But humans just stink at crafting bug-free NAND logic. When I worked on the design of the ABEL programming language back in the 1980s, which was for programming Programmable Logic Devices, ABEL would accept input in positive logic. It would use De Morgan’s theorem to automatically convert it to efficient negative logic. The electronics designers loved it.

To sum up this section, here’s a shameful snippet from Ubuntu’s unistd.h:

#if defined __USE_BSD || (defined __USE_XOPEN && !defined __USE_UNIX98)

Prof Marvel: I can’t bring it back, I don’t know how it works!

Casts Are Bugs

Casts subvert the protections of the typing system. Sometimes you just gotta have them (to implement malloc, for example, the result needs a cast), but far too often they are simply there to correct sloppy misuse of types. Hence, in D casts are done with the keyword cast, not a peculiar syntax, making them easily greppable. It’s worthwhile to occasionally grep a code base for cast and see if the types can be reworked to eliminate the need for the cast and have the type system working for rather than against you.

Pull Request: remove some dyncast calls

Self-Documenting Function Declarations

char* xyzzy(char* p)
  • Does p modify what it points to?
  • Is p returned?
  • Does xyzzy free p?
  • Does xyzzy save p somewhere, like in a global?
  • Does xyzzy throw p?

These crucial bits of information are rarely noted in the documentation for the function. Even worse, the documentation often gets it wrong! What is needed is self-documenting code that is enforced by the compiler. D has attributes to cover this:

const char* xyzzy(return scope const char* p)
  • p doesn’t modify what it points to
  • p is returned
  • p is not free’d
  • xyzzy does not squirrel away a copy of p
  • p is not thrown in an exception

This is all documentation that now isn’t necessary to write, and the compiler will check its accuracy for you. Yes, it is called “attribute soup” for good reason, and takes some getting used to, but it’s still better than bad documentation, and adding attributes is optional.

Function Arguments and Returns

Function inputs and outputs present in the function declaration are the “front door”. Any inputs and outputs that are not in the function declaration are “side doors”. Side doors include things like global variables, environment variables, getting information from the operating system, reading/writing files, throwing exceptions, etc. Side doors are rarely accounted for in the documentation. The poor sap calling a function has to carefully read its implementation to discern what the side doors are.

Self-evident code should strive to run everything through the front door. Not only does this help with comprehension, but it also enables delightful things like easy unit testing.

Memory Allocation

An ongoing problem faced by functions that implement an algorithm that needs to allocate memory is what memory allocation scheme should be used. Often a reusable function imposes the memory allocation method on the caller. That’s backward.

For memory that is allocated and free’d by the function, the solution is that the function decides how to do it. For allocated objects that are returned by the function, the caller should decide the allocation scheme by passing an argument that specifies it. This argument often takes the form of a “sink” to send the output to. More on that later.

Pass Abstract “sink” for Output

The auld way (extracted from the DMD source code):

import dmd.errors;
void gendocfile(Module m) {
     ...
     if (!success)
         error("expansion limit");
}

error() is a function that error messages are sent to. This is a typical formulation seen in conventional code. The error message is going out through the side door. The caller of gendocfile() has no say in what’s done with the error message, and the fact that it even generates error messages is usually omitted by the documentation. Worse, the error message emission makes it impractical to properly unit test the function.

A better way is to pass an abstract interface “sink” as a parameter and send the error messages to the sink:

import dmd.errorsink;
void gendocfile(Module m, ErrorSink eSink) {
     ...
     if (!success)
         eSink.error("expansion limit");
}

Now the caller has total control of what happens to the error messages, and it is implicitly documented. A unit tester can provide a special implementation of the interface to suit testing convenience.

Here’s a real-world PR making this improvement:

doc.d: use errorSink

Pass Files as Buffers Rather than Files to Read

Typical code I’ve written, where the file names are passed to a function to read and process them:

void gendocfile(Module m, const(char)*[] docfiles) {
     OutBuffer mbuf;
     foreach (file; ddocfiles) {
         auto buffer = readFile(file.toDString());
         mbuf.write(buffer.data);
     }
     ...
}

This kind of code is a nuisance to unit test, as adding file I/O to the unit tester is very clumsy, and, as a result, no unit tests get written. Doing file I/O is usually irrelevant to the function, anyway. It just needs the data to operate on.

The fix is to pass the contents of the file in an array:

void gendocfile(Module m, const char[] ddoctext) {
     ...
}

The PR: move ddoc file reads out of doc.d

Write to Buffer, Caller Writes File

A typical function that processes data and writes the result to a file:

void gendocfile(Module m) {
     OutBuffer buf;
     ... fill buf ...
     writeFile(m.loc, m.docfile.toString(), buf[ ]);
}

By now, you know that the caller should write the file:

void gendocfile(Module m, ref OutBuffer outbuf) {
     ... fill outbuf ...
}

And the PR:
doc.d: move file writes to caller

Move Environment Calls to Caller

Here’s a function that obtains input from the environment:

void gendocfile(Module m) {
     char* p = getenv("DDOCFILE");
     if (p)
         global.params.ddoc.files.shift(p);
}

It should be pretty obvious by now what is wrong with that. PR to move the environment read to the caller and then pass the info through the front door:

move DDOCFILE from doc.d to main.d

Use Pointers to Functions (or Templates)

I was recently working on a module that did text processing. One thing it needed to do was identify the start of an identifier string. Since Unicode is complicated, it imported the (rather substantial) module that handled Unicode. But it bugged me that all that was needed was to determine the start of an identifier; the text processor needed no further knowledge of Unicode.

It finally occurred to me that the caller could just pass a function pointer as an argument to the text processor, and the text processor would need no knowledge whatsoever of Unicode.

import dmd.doc;
bool expand(...) {
     if (isIDStart(p))
         ...
}

became:

alias fp_t = bool function(const(char)* p);
bool expand(..., fp_t isIDStart) {
     if (isIDStart(p))
         ...
}

Notice how the import just went away, improving the encapsulation and comprehensibility of the function. The function pointer could also be a template parameter, whichever is more convenient for the application. The more moats one can erect around a function, the easier it is to understand.

The PR: remove dmacro.d dependency on doc.d

Two Categories of Functions

  • Alters the state of the program

Provide a clue in the name of the function, like doAction().

  • Asks a Question

Again, a clue should be in the name. Something like isSomething(), hasCharacteristic(), getInfo(), etc. Consider making the function pure to ensure it has no side effects.

Try not to create functions that both ask a question and modify state. Over time, I’ve been gradually splitting such functions into two.

Visual Pattern Recognition

Source code formatters are great. But I don’t use them. Oof! Here’s why:

final switch (of)
{
     case elf:   lib = LibElf_factory();    break;
     case macho: lib = LibMach_factory();   break;
     case coff:  lib = LibMSCoff_factory(); break;
     case omf:   lib = LibOMF_factory();    break;
}

It turns out your brain is incredibly good at pattern recognition. By lining things up, a pattern is created. Any deviation from that pattern is likely a bug, and your eyes will be drawn to the anomaly like a stink bug to rotting fruit.

I’ve detected so much rotting fruit by using patterns, and a source code formatter doesn’t do a good job of making patterns.

Prof. Marvel: I have reached a cataclysmic decision!

Use ref instead of *

A ref is a restricted form of pointer. Arithmetic is not allowed on it, and ref parameters are not allowed to escape a function. This not only informs the user but informs the compiler, which will ensure the function is a good boy with the ref.

PR: mangleToBuffer(): use ref

Takeaways

  • Use language features as intended (don’t invent your own language
    on top of it)
  • Avoid negation
  • Left to right, top to bottom
  • Functions do everything through the front door
  • Don’t conflate engine with environment
  • Reduce cyclomatic complexity
  • Separate functions that ask a question from functions that alter state
  • Keep trying—this is a process!

The recommendations here are pretty easy to follow. It’ll rarely be necessary to do much refactoring to implement them. I hope the real-life PRs referenced here show how easy it is to make code self-evident!

Action Item

Open your latest coding masterpiece in your favorite editor. Take a good hard look at it. Sorry, it’s a steaming pile of incomprehensibility! (Join the club.)

Go fix it!

Memory Safety in a Systems Programming Language Part 3

The first entry in this series shows how to use the new DIP1000 rules to have slices and pointers refer to the stack, all while being memory safe. The second entry in this series teaches about the ref storage class and how DIP1000 works with aggregate types (classes, structs, and unions).

So far the series has deliberately avoided templates and auto functions. This kept the first two posts simpler in that they did not have to deal with function attribute inference, which I have referred to as “attribute auto inference” in earlier posts. However, both auto functions and templates are very common in D code, so a series on DIP1000 can’t be complete without explaining how those features work with the language changes. Function attribute inference is our most important tool in avoiding so-called “attribute soup”, where a function is decorated with several attributes, which arguably decreases readability.

We will also dig deeper into unsafe code. The previous two posts in this series focused on the scope attribute, but this post is more focused on attributes and memory safety in general. Since DIP1000 is ultimately about memory safety, we can’t get around discussing those topics.

Avoiding repetition with attributes

Function attribute inference means that the language will analyze the body of a function and will automatically add the @safe, pure, nothrow, and @nogc attributes where applicable. It will also attempt to add scope or return scope attributes to parameters and return ref to ref parameters that can’t otherwise be compiled. Some attributes are never inferred. For instance, the compiler will not insert any ref, lazy, out or @trusted attributes, because very likely they are explicitly not wanted where they are left out.

There are many ways to turn on function attribute inference. One is by omitting the return type in the function signature. Note that the auto keyword is not required for this. auto is a placeholder keyword used when no return type, storage class, or attribute is specified. For example, the declaration half(int x) { return x/2; } does not parse, so we use auto half(int x) { return x/2; } instead. But we could just as well write @safe half(int x) { return x/2; } and the rest of the attributes (pure, nothrow, and @nogc) will be inferred just as they are with the auto keyword.

The second way to enable attribute inference is to templatize the function. With our half example, it can be done this way:

int divide(int denominator)(int x) { return x/denominator; }
alias half = divide!2;

The D spec does not say that a template must have any parameters. An empty parameter list can be used to turn attribute inference on: int half()(int x) { return x/2; }. Calling this function doesn’t even require the template instantiation syntax at the call site, e.g., half!()(12) is not required as half(12) will compile.

Another means to turn on attribute inference is to store the function inside another function. These are called nested functions. Inference is enabled not only on functions nested directly inside another function but also on most things nested in a type or a template inside the function. Example:

@safe void parentFun()
{
    // This is auto-inferred.
    int half(int x){ return x/2; }

    class NestedType
    {
        // This is auto inferred
        final int half1(int x) { return x/2; }

        // This is not auto inferred; it's a
        // virtual function and the compiler
        // can't know if it has an unsafe override
        // in a derived class.
        int half2(int x) { return x/2; }
    }

    int a = half(12); // Works. Inferred as @safe.
    auto cl = new NestedType;
    int b = cl.half1(18); // Works. Inferred as @safe.
    int c = cl.half2(26); // Error.
}

A downside of nested functions is that they can only be used in lexical order (the call site must be below the function declaration) unless both the nested function and the call are inside the same struct, class, union, or template that is in turn inside the parent function. Another downside is that they don’t work with Uniform Function Call Syntax.

Finally, attribute inference is always enabled for function literals (a.k.a. lambda functions). The halving function would be defined as enum half = (int x) => x/2; and called exactly as normal. However, the language does not consider this declaration a function. It considers it a function pointer. This means that in global scope it’s important to use enum or immutable instead of auto. Otherwise, the lambda can be changed to something else from anywhere in the program and cannot be accessed from pure functions. In rare cases, such mutability can be desirable, but most often it is an antipattern (like global variables in general).

Limits of inference

Aiming for minimal manual typing isn’t always wise. Neither is aiming for maximal attribute bloat.

The primary problem of auto inference is that subtle changes in the code can lead to inferred attributes turning on and off in an uncontrolled manner. To see when it matters, we need to have an idea of what will be inferred and what will not.

The compiler in general will go to great lengths to infer @safe, pure, nothrow, and @nogc attributes. If your function can have those, it almost always will. The specification says that recursion is an exception: a function calling itself should not be @safe, pure, or nothrow unless explicitly specified as such. But in my testing, I found those attributes actually are inferred for recursive functions. It turns out, there is an ongoing effort to get recursive attribute inference working, and it partially works already.

Inference of scope and return on function parameters is less reliable. In the most mundane cases, it’ll work, but the compiler gives up pretty quickly. The smarter the inference engine is, the more time it takes to compile, so the current design decision is to infer those attributes in only the simplest of cases.

Where to let the compiler infer?

A D programmer should get into the habit of asking, “What will happen if I mistakenly do something that makes this function unsafe, impure, throwing, garbage-collecting, or escaping?” If the answer is “immediate compiler error”, auto inference is probably fine. On the other hand, the answer could be “user code will break when updating this library I’m maintaining”. In that case, annotate manually.

In addition to the potential of losing attributes the author intends to apply, there is also another risk:

@safe pure nothrow @nogc firstNewline(string from)
{
    foreach(i; 0 .. from.length) switch(from[i])
    {
        case '\r':
        if(from.length > i+1 && from[i+1] == '\n')
        {
            return "\r\n";
        }
        else return "\r";

        case '\n': return "\n";

        default: break;
    }

    return "";
}

You might think that since the author is manually specifying the attributes, there’s no problem. Unfortunately, that’s wrong. Suppose the author decides to rewrite the function such that all the return values are slices of the from parameter rather than string literals:

@safe pure nothrow @nogc firstNewline(string from)
{
    foreach(i; 0 .. from.length) switch(from[i])
    {
        case '\r':
        if (from.length > i + 1 && from[i + 1] == '\n')
        {
            return from[i .. i + 2];
        }
        else return from[i .. i + 1];

        case '\n': return from[i .. i + 1];

        default: break;
    }

    return "";
}

Surprise! The parameter from was previously inferred as scope, and a library user was relying on that, but now it’s inferred as return scope instead, breaking client code.

Still, for internal functions, auto inference is a great way to save both our fingers when writing and our eyes when reading. Note that it’s perfectly fine to rely on auto inference of the @safe attribute as long as the function is used in explicitly in @safe functions or unit tests. If something potentially unsafe is done inside the auto-inferred function, it gets inferred as @system, not @trusted. Calling a @system function from a @safe function results in a compiler error, meaning auto inference is safe to rely on in this case.

It still sometimes makes sense to manually apply attributes to internal functions, because the error messages generated when they are violated tend to be better with manual attributes.

What about templates?

Auto inference is always enabled for templated functions. What if a library interface needs to expose one? There is a way to block the inference, albeit an ugly one:

private template FunContainer(T)
{
    // Not auto inferred
    // (only eponymous template functions are)
    @safe T fun(T arg){return arg + 3;}
}

// Auto inferred per se, but since the function it calls
// is not, only @safe is inferred.
auto addThree(T)(T arg){return FunContainer!T.fun(arg);}

However, which attributes a template should have often depends on its compile-time parameters. It would be possible to use metaprogramming to designate attributes depending on the template parameters, but that would be a lot of work, hard to read, and easily as error-prone as relying on auto inference.

It’s more practical to just test that the function template infers the wanted attributes. Such testing doesn’t have to, and probably shouldn’t, be done manually each time the function is changed. Instead:

float multiplyResults(alias fun)(float[] arr)
    if (is(typeof(fun(new float)) : float))
{
    float result = 1.0f;
    foreach (ref e; arr) result *= fun(&e);
    return result;
}

@safe pure nothrow unittest
{
    float fun(float* x){return *x+1;}
    // Using a static array makes sure
    // arr argument is inferred as scope or
    // return scope
    float[5] elements = [1.0f, 2.0f, 3.0f, 4.0f, 5.0f];

    // No need to actually do anything with
    // the result. The idea is that since this
    // compiles, multiplyResults is proven @safe
    // pure nothrow, and its argument is scope or
    // return scope.
    multiplyResults!fun(elements);
}

Thanks to D’s compile-time introspection powers, testing against unwanted attributes is also covered:

@safe unittest
{
    import std.traits : attr = FunctionAttribute,
        functionAttributes, isSafe;

    float fun(float* x)
    {
        // Makes the function both throwing
        // and garbage collector dependant.
        if (*x > 5) throw new Exception("");
        static float* impureVar;

        // Makes the function impure.
        auto result = impureVar? *impureVar: 5;

        // Makes the argument unscoped.
        impureVar = x;
        return result;
    }

    enum attrs = functionAttributes!(multiplyResults!fun);

    assert(!(attrs & attr.nothrow_));
    assert(!(attrs & attr.nogc));

    // Checks against accepting scope arguments.
    // Note that this check does not work with
    // @system functions.
    assert(!isSafe!(
    {
        float[5] stackFloats;
        multiplyResults!fun(stackFloats[]);
    }));

    // It's a good idea to do positive tests with
    // similar methods to make sure the tests above
    // would fail if the tested function had the
    // wrong attributes.
    assert(attrs & attr.safe);
    assert(isSafe!(
    {
        float[] heapFloats;
        multiplyResults!fun(heapFloats[]);
    }));
}

If assertion failures are wanted at compile time before the unit tests are run, adding the static keyword before each of those asserts will get the job done. Those compiler errors can even be had in non-unittest builds by converting that unit test to a regular function, e.g., by replacing @safe unittest with, say, private @safe testAttrs().

The live fire exercise: @system

Let’s not forget that D is a systems programming language. As this series has shown, in most D code the programmer is well protected from memory errors, but D would not be D if it didn’t allow going low-level and bypassing the type system in the same manner as C or C++: bit arithmetic on pointers, writing and reading directly to hardware ports, executing a struct destructor on a raw byte blob… D is designed to do all of that.

The difference is that in C and C++ it takes only one mistake to break the type system and cause undefined behavior anywhere in the code. A D programmer is only at risk when not in a @safe function, or when using dangerous compiler switches such as -release or -check=assert=off (failing a disabled assertion is undefined behavior), and even then the semantics tend to be less UB-prone. For example:

float cube(float arg)
{
    float result;
    result *= arg;
    result *= arg;
    return result;
}

This is a language-agnostic function that compiles in C, C++, and D. Someone intended to calculate the cube of arg but forgot to initialize result with arg. In D, nothing dangerous happens despite this being a @system function. No initialization value means result is default initialized to NaN (not-a-number), which leads to the result also being NaN, which is a glaringly obvious “error” value when using this function the first time.

However, in C and C++, not initializing a local variable means reading it is (sans a few narrow exceptions) undefined behavior. This function does not even handle pointers, yet according to the standard, calling this function could just as well have *(int*) rand() = 0XDEADBEEF; in it, all due to a trivial mistake. While many compilers with enabled warnings will catch this one, not all do, and these languages are full of similar examples where even warnings don’t help.

In D, even if you explicitly requested no default initialization with float result = void, it’d just mean the return value of the function is undefined, not anything and everything that happens if the function is called. Consequently, that function could be annotated @safe even with such an initializer.

Still, for anyone who cares about memory safety, as they probably should for anything intended for a wide audience, it’s a bad idea to assume that D @system code is safe enough to be the default mode. Two examples will demonstrate what can happen.

What undefined behavior can do

Some people assume that “Undefined Behavior” simply means “erroneous behavior” or crashing at runtime. While that is often what ultimately happens, undefined behavior is far more dangerous than, say, an uncaught exception or an infinite loop. The difference is that with undefined behavior, you have no guarantees at all about what happens. This might not sound any worse than an infinite loop, but an accidental infinite loop is discovered the first time it’s entered. Code with undefined behavior, on the other hand, might do what was intended when it’s tested, but then do something completely different in production. Even if the code is tested with the same flags it’s compiled with in production, the behavior may change from one compiler version to another, or when making completely unrelated changes to the code. Time for an example:

// return whether the exception itself is in the array
bool replaceExceptions(Object[] arr, ref Exception e)
{
    bool result;
    foreach (ref o; arr)
    {
        if (&o is &e) result = true;
        if (cast(Exception) o) o = e;
    }

    return result;
}

The idea here is that the function replaces all exceptions in the array with e. If e itself is in the array, it returns true, otherwise false. And indeed, testing confirms it works. The function is used like this:

auto arr = [new Exception("a"), null, null, new Exception("c")];
auto result = replaceExceptions
(
    cast(Object[]) arr,
    arr[3]
);

This cast is not a problem, right? Object references are always of the same size regardless of their type, and we’re casting the exceptions to the parent type, Object. It’s not like the array contains anything other than object references.

Unfortunately, that’s not how the D specification views it. Having two class references (or any references, for that matter) in the same memory location but with different types, and then assigning one of them to the other, is undefined behavior. That’s exactly what happens in

if (cast(Exception) o) o = e;

if the array does contain the e argument. Since true can only be returned when undefined behavior is triggered, it means that any compiler would be free to optimize replaceExceptions to always return false. This is a dormant bug no amount of testing will find, but that might, years later, completely mess up the application when compiled with the powerful optimizations of an advanced compiler.

It may seem that requiring a cast to use a function is an obvious warning sign that a good D programmer would not ignore. I wouldn’t be so sure. Casts aren’t that rare even in fine high-level code. Even if you disagree, other cases are provably bad enough to bite anyone. Last summer, this case appeared in the D forums:

string foo(in string s)
{
    return s;
}

void main()
{
    import std.stdio;
    string[] result;
    foreach(c; "hello")
    {
        result ~= foo([c]);
    }
    writeln(result);
}

This problem was encountered by Steven Schveighoffer, a long-time D veteran who has himself lectured about @safe and @system on more than one occasion. Anything that can burn him can burn any of us.

Normally, this works just as one would think and is fine according to the spec. However, if one enables another soon-to-be-default language feature with the -preview=in compiler switch along with DIP1000, the program starts malfunctioning. The old semantics for in are the same as const, but the new semantics make it const scope.

Since the argument of foo is scope, the compiler assumes that foo will copy [c] before returning it, or return something else, and therefore it allocates [c] on the same stack position for each of the “hello” letters. The result is that the program prints ["o", "o, "o", "o", "o"]. At least for me, it’s already somewhat hard to understand what’s happening in this simple example. Hunting down this sort of bug in a complex codebase could be a nightmare.

(With my nightly DMD version somewhere between 2.100 and 2.101 a compile-time error is printed instead. With 2.100.2, the example runs as described above.)

The fundamental problem in both of these examples is the same: @safe is not used. Had it been, both of these undefined behaviors would have resulted in compilation errors (the replaceExceptions function itself can be @safe, but the cast at the usage site cannot). By now it should be clear that @system code should be used sparingly.

When to proceed anyway

Sooner or later, though, the time comes when the guard rail has to be temporarily lowered. Here’s an example of a good use case:

/// Undefined behavior: Passing a non-null pointer
/// to a standalone character other than '\0', or
/// to an array without '\0' at or after the
/// pointed character, as utf8Stringz
extern(C) @system pure
bool phobosValidateUTF8(const char* utf8Stringz)
{
    import std.string, std.utf;

    try utf8Stringz.fromStringz.validate();
    catch (UTFException) return false;

    return true;
}

This function lets code written in another language validate a UTF-8 string using Phobos. C being C, it tends to use zero-terminated strings, so the function accepts a pointer to one as the argument instead of a D array. This is why the function has to be unsafe. There is no way to safely check that utf8Stringz is pointing to either null or a valid C string. If the character being pointed to is not '\0', meaning the next character has to be read, the function has no way of knowing whether that character belongs to the memory allocated for the string. It can only trust that the calling code got it right.

Still, this function is a good use of the @system attribute. First, it is presumably called primarily from C or C++. Those languages do not get any safety guarantees anyway. Even a @safe function is safe only if it gets only those parameters that can be created in @safe D code. Passing cast(const char*) 0xFE0DA1 as an argument to a function is unsafe no matter what the attribute says, and nothing in C or C++ verifies what arguments are passed.

Second, the function clearly documents the cases that would trigger undefined behavior. However, it does not mention that passing an invalid pointer, such as the aforementioned cast(const char*) 0xFE0DA1, is UB, because UB is always the default assumption with @system-only values unless it can be shown otherwise.

Third, the function is small and easy to review manually. No function should be needlessly big, but it’s many times more important than usual to keep @system and @trusted functions small and simple to review. @safe functions can be debugged to pretty good shape by testing, but as we saw earlier, undefined behavior can be immune to testing. Analyzing the code is the only general answer to UB.

There is a reason why the parameter does not have a scope attribute. It could have it, no pointers to the string are escaped. However, it would not provide many benefits. Any code calling the function has to be @system, @trusted, or in a foreign language, meaning they can pass a pointer to the stack in any case. scope could potentially improve the performance of D client code in exchange for the increased potential for undefined behavior if this function is erroneously refactored. Such a tradeoff is unwanted in general unless it can be shown that the attribute helps with a performance problem. On the other hand, the attribute would make it clearer for the reader that the string is not supposed to escape. It’s a difficult judgment call whether scope would be a wise addition here.

Further improvements

It should be documented why a @system function is @system when it’s not obvious. Often there is a safer alternative—our example function could have taken a D array or the CString struct from the previous post in this series. Why was an alternative not taken? In our case, we could write that the ABI would be different for either of those options, complicating matters on the C side, and the intended client (C code) is unsafe anyway.

@trusted functions are like @system functions, except they can be called from @safe functions, whereas @system functions cannot. When something is declared @trusted, it means the authors have verified that it’s just as safe to use as an actual @safe function with any arguments that can be created within safe code. They need to be just as carefully reviewed, if not more so, as @system functions.

In these situations, it should be documented (for other developers, not users) how the function was deemed to be safe in all situations. Or, if the function is not fully safe to use and the attribute is just a temporary hack, it should have a big ugly warning about that.

Such greenwashing is of course highly discouraged, but if there’s a codebase full of @system code that’s just too difficult to make @safe otherwise, it’s better than giving up. Even as we often talk about the dangers of UB and memory corruption, in our actual work our attitudes tend to be much more carefree, meaning such codebases are unfortunately common.

It might be tempting to define a small @trusted function inside a bigger @safe function to do something unsafe without disabling checks for the whole function:

extern(C) @safe pure
bool phobosValidateUTF8(const char* utf8Stringz)
{
    import std.string, std.utf;

    try (() @trusted => utf8Stringz.fromStringz)()
      .validate();
    catch (UTFException) return false;

    return true;
}

Keep in mind though, that the parent function needs to be documented and reviewed like an overt @trusted function because the encapsulated @trusted function can let the parent function do anything. In addition, since the function is marked @safe, it isn’t obvious on a first look that it’s a function that needs special care. Thus, a visible warning comment is needed if you elect to use @trusted like this.

Most importantly, don’t trust yourself! Just like any codebase of non-trivial size has bugs, more than a handful of @system functions will include latent UB at some point. The remaining hardening features of D, meaning asserts, contracts, invariants, and bounds checking should be used aggressively and kept enabled in production. This is recommended even if the program is fully @safe. In addition, a project with a considerable amount of unsafe code should use external tools like LLVM address sanitizer and Valgrind to at least some extent.

Note that the idea in many of these hardening tools, both those in the language and those of the external tools, is to crash as soon as any fault is detected. It decreases the chance of any surprise from undefined behavior doing more serious damage.

This requires that the program is designed to accept a crash at any moment. The program must never hold such amounts of unsaved data that there would be any hesitation in crashing it. If it controls anything important, it must be able to regain control after being restarted by a user or another process, or it must have another backup program. Any program that “can’t afford” to run potentially crashing checks is in no business to be trusted with systems programming either.

Conclusion

That concludes this blog series on DIP1000. There are some topics related to DIP1000 we have left up to the readers to experiment with themselves, such as associative arrays. Still, this should be enough to get them going.

Though we have uncovered some practical tips in addition to language rules, there surely is a lot more that could be said. Tell us your memory safety tips in the D forums!

Thanks to Walter Bright and Dennis Korpel for providing feedback on this article.

Memory Safety in a Modern Systems Programming Language Part 2

DIP1000: Memory Safety in a Modern System Programming Language Pt. 2

The previous entry in this series shows how to use the new DIP1000 rules to have slices and pointers refer to the stack, all while being memory safe. But D can refer to the stack in other ways, too, and that’s the topic of this article.

Object-oriented instances are the easiest case

In Part 1, I said that if you understand how DIP1000 works with pointers, then you understand how it works with classes. An example is worth more than mere words:

@safe Object ifNull(return scope Object a, return scope Object b)
{
    return a? a: b;
}

The return scope in the above example works exactly as it does in the following:

@safe int* ifNull(return scope int* a, return scope int* b)
{
    return a? a: b;
}

The principle is: if the scope or return scope storage class is applied to an object in a parameter list, the address of the object instance is protected just as if the parameter were a pointer to the instance. From the perspective of machine code, it is a pointer to the instance.

From the point of view of regular functions, that’s all there is to it. What about member functions of a class or an interface? This is how it’s done:

interface Talkative
{
    @safe const(char)[] saySomething() scope;
}

class Duck : Talkative
{
    char[8] favoriteWord;
    @safe const(char)[] saySomething() scope
    {
        import std.random : dice;

        // This wouldn't work
        // return favoriteWord[];

        // This does
        return favoriteWord[].dup;

        // Also returning something totally
        // different works. This
        // returns the first entry 40% of the time,
        // The second entry 40% of the time, and
        // the third entry the rest of the time.
        return
        [
            "quack!",
            "Quack!!",
            "QUAAACK!!!"
        ][dice(2,2,1)];
    }
}

scope positioned either before or after the member function name marks the this reference as scope, preventing it from leaking out of the function. Because the address of the instance is protected, nothing that refers directly to the address of the fields is allowed to escape either. That’s why return favoriteWord[] is disallowed; it’s a static array stored inside the class instance, so the returned slice would refer directly to it. favoriteWord[].dup on the other hand returns a copy of the data that isn’t located in the class instance, which is why it’s okay.

Alternatively one could replace the scope attributes of both Talkative.saySomething and Duck.saySomething with return scope, allowing the return of favoriteWord without duplication.

DIP1000 and Liskov Substitution Principle

The Liskov substitution principle states, in simplified terms, that an inherited function can give the caller more guarantees than its parent function, but never fewer. DIP1000-related attributes fall in that category. The rule works like this:

  • if a parameter (including the implicit this reference) in the parent functions has no DIP1000 attributes, the child function may designate it scope or return scope
  • if a parameter is designated scope in the parent, it must be designated scope in the child
  • if a parameter is return scope in the parent, it must be either scope or return scope in the child

If there is no attribute, the caller can not assume anything; the function might store the address of the argument somewhere. If return scope is present, the caller can assume the address of the argument is not stored other than in the return value. With scope, the guarantee is that the address is not stored anywhere, which is an even stronger guarantee. Example:

class C1
{   double*[] incomeLog;
    @safe double* imposeTax(double* pIncome)
    {
        incomeLog ~= pIncome;
        return new double(*pIncome * .15);
    }
}

class C2 : C1
{
    // Okay from language perspective (but maybe not fair
    // for the taxpayer)
    override @safe double* imposeTax
        (return scope double* pIncome)
    {
        return pIncome;
    }
}

class C3 : C2
{
    // Also okay.
    override @safe double* imposeTax
        (scope double* pIncome)
    {
        return new double(*pIncome * .18);
    }
}

class C4: C3
{
    // Not okay. The pIncome parameter of C3.imposeTax
    // is scope, and this tries to relax the restriction.
    override @safe double* imposeTax
        (double* pIncome)
    {
        incomeLog ~= pIncome;
        return new double(*pIncome * .16);
    }
}

The special pointer, ref

We still have not uncovered how to use structs and unions with DIP1000. Well, obviously we’ve uncovered pointers and arrays. When referring to a struct or a union, they work the same as they do when referring to any other type. But pointers and arrays are not the canonical way to use structs in D. They are most often passed around by value, or by reference when bound to ref parameters. Now is a good time to explain how ref works with DIP1000.

They don’t work like just any pointer. Once you understand ref, you can use DIP1000 in many ways you otherwise could not.

A simple ref int parameter

The simplest possible way to use ref is probably this:

@safe void fun(ref int arg) {
    arg = 5;
}

What does this mean? ref is internally a pointer—think int* pArg—but is used like a value in the source code. arg = 5 works internally like *pArg = 5. Also, the client calls the function as if the argument were passed by value:

auto anArray = [1,2];
fun(anArray[1]); // or, via UFCS: anArray[1].fun;
// anArray is now [1, 5]

instead of fun(&anArray[1]). Unlike C++ references, D references can be null, but the application will instantly terminate with a segmentation fault if a null ref is used for something other than reading the address with the & operator. So this:

int* ptr = null;
fun(*ptr);

…compiles, but crashes at runtime because the assignment inside fun lands at the null address.

The address of a ref variable is always guarded against escape. In this sense @safe void fun(ref int arg){arg = 5;} is like @safe void fun(scope int* pArg){*pArg = 5;}. For example, @safe int* fun(ref int arg){return &arg;} will not compile, just like @safe int* fun(scope int* pArg){return pArg;} will not.

There is a return ref storage class, however, that allows returning the address of the parameter but no other form of escape, just like return scope. This means that @safe int* fun(return ref int arg){return &arg;} works.

reference to a reference

reference to an int or similar type already allows much nicer syntax than one can get with pointers. But the real power of ref shows when it refers to a type that is a reference itself—a pointer or a class, for instance. scope or return scope can be applied to a reference that is referenced to by ref. For example:

@safe float[] mergeSort(ref return scope float[] arr)
{
    import std.algorithm: merge;
    import std.array : Appender;

    if(arr.length < 2) return arr;

    auto firstHalf = arr[0 .. $/2];
    auto secondHalf = arr[$/2 .. $];

    Appender!(float[]) output;
    output.reserve(arr.length);

    foreach
    (
        el;
        firstHalf.mergeSort
        .merge!floatLess(secondHalf.mergeSort)
    )   output ~= el;

    arr = output[];
    return arr;
}

@safe bool floatLess(float a, float b)
{
    import std.math: isNaN;

    return a.isNaN? false:
          b.isNaN? true:
          a<b;
}

mergeSort here guarantees it won’t leak the address of the floats in arr except in the return value. This is the same guarantee that would be had from a return scope float[] arr parameter. But at the same time, because arr is a ref parameter, mergeSort can mutate the array passed to it. Then the client can write:

float[] values = [5, 1.5, 0, 19, 1.5, 1];
values.mergeSort;

With a non-ref argument, the client would have to write values = values.sort instead (not using ref would be a perfectly reasonable API in this case, because we do not always want to mutate the original array). This is something that cannot be accomplished with pointers, because return scope float[]* arr would protect the address of the array’s metadata (the length and ptr fields of the array), not the address of it’s contents.

It is also possible to have a returnable ref argument to a scope reference. Since this example has a unit test, remember to use the -unittest compile flag to include it in the compiled binary.

@safe ref Exception nullify(return ref scope Exception obj)
{
    obj = null;
    return obj;
}

@safe unittest
{
    scope obj = new Exception("Error!");
    assert(obj.msg == "Error!");
    obj.nullify;
    assert(obj is null);
    // Since nullify returns by ref, we can assign
    // to it's return value.
    obj.nullify = new Exception("Fail!");
    assert(obj.msg == "Fail!");
}

Here we return the address of the argument passed to nullify, but still guard both the address of the object pointer and the address of the class instance against being leaked by other channels.

return is a free keyword that does not mandate ref or scope to follow it. What does void* fun(ref scope return int*) mean then? The spec states that return without a trailing scope is always treated as ref return. This example thus is equivalent to void* fun(return ref scope int*). However, this only applies if there is reference to bind to. Writing void* fun(scope return int*) means void* fun(return scope int*). It’s even possible to write void* fun(return int*) with the latter meaning, but I leave it up to you to decide whether this qualifies as conciseness or obfuscation.

Member functions and ref

ref and return ref often require careful consideration to keep track of which address is protected and what can be returned. It takes some experience to get confortable with them. But once you do, understanding how structs and unions work with DIP1000 is pretty straightforward.

The major difference to classes is that where the this reference is just a regular class reference in class member functions, this in a struct or union member function is ref StructOrUnionName.

union Uni
{
    int asInt;
    char[4] asCharArr;

    // Return value contains a reference to
    // this union, won't escape references
    // to it via any other channel
    @safe char[] latterHalf() return
    {
        return asCharArr[2 .. $];
    }

    // This argument is implicitly ref, so the
    // following means the return value does
    // not refer to this union, and also that
    // we don't leak it in any other way.
    @safe char[] latterHalfCopy()
    {
        return latterHalf.dup;
    }
}

Note that return ref should not be used with the this argument. char[] latterHalf() return ref fails to parse. The language already has to understand what ref char[] latterHalf() return means: the return value is a reference. The “ref” in return ref would be redundant anyway.

Note that we did not use the scope keyword here. scope would be meaningless with this union, because it does not contain references to anything. Just like it is meaningless to have a scope ref int, or a scope int function argument. scope makes sense only for types that refer to memory elsewhere.

scope in a struct or union means the same thing as it means in a static array. It means that the memory its members refer to cannot be escaped. Example:

struct CString
{
    // We need to put the pointer in an anonymous
    // union with a dummy member, otherwise @safe user
    // code could assign ptr to point to a character
    // not in a C string.
    union
    {
        // Empty string literals get optimised to null pointers by D
        // compiler, we have to do this for the .init value to really point to
        // a '\0'.
        immutable(char)* ptr = &nullChar;
        size_t dummy;
    }

    // In constructors, the "return value" is the
    // constructed data object. Thus, the return scope
    // here makes sure this struct won't live longer
    // than the memory in arr.
    @trusted this(return scope string arr)
    {
        // Note: Normal assert would not do! They may be
        // removed from release builds, but this assert
        // is necessary for memory safety so we need
        // to use assert(0) instead which never gets
        // removed.
        if(arr[$-1] != '\0') assert(0, "not a C string!");
        ptr = arr.ptr;
    }

    // The return value refers to the same memory as the
    // members in this struct, but we don't leak references
    // to it via any other way, so return scope.
    @trusted ref immutable(char) front() return scope
    {
        return *ptr;
    }

    // No references to the pointed-to array passed
    // anywhere.
    @trusted void popFront() scope
    {
        // Otherwise the user could pop past the
        // end of the string and then read it!
        if(empty) assert(0, "out of bounds!");
        ptr++;
    }

    // Same.
    @safe bool empty() scope
    {
        return front == '\0';
    }
}

immutable nullChar = '\0';

@safe unittest
{
    import std.array : staticArray;

    auto localStr = "hello world!\0".staticArray;
    auto localCStr = localStr.CString;
    assert(localCStr.front == 'h');

    static immutable(char)* staticPtr;

    // Error, escaping reference to local.
    // staticPtr = &localCStr.front();

    // Fine.
    staticPtr = &CString("global\0").front();

    localCStr.popFront;
    assert(localCStr.front == 'e');
    assert(!localCStr.empty);
}

Part One said that @trusted is a terrible footgun with DIP1000. This example demonstrates why. Imagine how easy it’d be to use a regular assert or forget about them totally, or overlook the need to use the anonymous union. I think this struct is safe to use, but it’s entirely possible I overlooked something.

Finally

We almost know all there is to know about using structs, unions, and classes with DIP1000. We have two final things to learn today.

But before that, a short digression regarding the scope keyword. It is not used for just annotating parameters and local variables as illustrated. It is also used for scope classes and scope guard statements. This guide won’t be discussing those, because the former feature is deprecated, and the latter is not related to DIP1000 or control of variable lifetimes. The point of mentioning them is to dispel a potential misconception that scope always means limiting the lifetime of something. Learning about scope guard statements is still a good idea, as it’s a useful feature.

Back to the topic. The first thing is not really specific to structs or classes. We discussed what return, return ref, and return scope usually mean, but there’s an alternative meaning to them. Consider:

@safe void getFirstSpace
(
    ref scope string result,
    return scope string where
)
{
    //...
}

The usual meaning of the return attribute makes no sense here, as the function has a void return type. A special rule applies in this case: if the return type is void, and the first argument is ref or out, any subsequent return [ref/scope] is assumed to be escaped by assigning to the first argument. With struct member functions, they are assumed to be assigned to the struct itself.

@safe unittest
{
    static string output;
    immutable(char)[8] input = "on stack";
    //Trying to assign stack contents to a static
    //variable. Won't compile.
    getFirstSpace(output, input);
}

Since out came up, it should be said it would be a better choice for result here than ref. out works like ref, with the one difference that the referenced data is automatically default-initialized at the beginning of the function, meaning any data to which the out parameter refers is guaranteed to not affect the function.

The second thing to learn is that scope is used by the compiler to optimize class allocations inside function bodies. If a new class is used to initialize a scope variable, the compiler can put it on the stack. Example:

class C{int a, b, c;}
@safe @nogc unittest
{
    // Since this unittest is @nogc, this wouldn't
    // compile without the scope optimization.
    scope C c = new C();
}

This feature requires using the scope keyword explicitly. Inference of scope does not work, because initializing a class this way does not normally (meaning, without the @nogc attribute) mandate limiting the lifetime of c. The feature currently works only with classes, but there is no reason it couldn’t work with newed struct pointers and array literals too.

Until next time

This is pretty much all that there is to manual DIP1000 usage. But this blog series shall not be over yet! DIP1000 is not intended to always be used explicitly—it works with attribute inference. That’s what the next post will cover.

It will also cover some considerations when daring to use @trusted and @system code. The need for dangerous systems programming exists and is part of the D language domain. But even systems programming is a responsible affair when people do what they can to minimize risks. We will see that even there it’s possible to do a lot.

Thanks to Walter Bright and Dennis Korpel for reviewing this article

Memory Safety in a Modern Systems Programming Language Part 1

Memory safety needs no checks

D is both a garbage-collected programming language and an efficient raw memory access language. Modern high-level languages like D are memory safe, preventing users from accidently reading or writing to unused memory or breaking the type system of the language.

As a systems programming language, not all of D can give such guarantees, but it does have a memory-safe subset that uses the garbage collector to take care of memory management much like Java, C#, or Go. A D codebase, even in a systems programming project, should aim to remain within that memory-safe subset where practical. D provides the @safe function attribute to verify that a function uses only memory-safe features of the language. For instance, try this.

@safe string getBeginning(immutable(char)* cString)
{
    return cString[0..3];
}

The compiler will refuse to compile this code. There’s no way to know what will result from the three-character slice of cString, which could be referring to an empty string (i.e., cString[0] is \0), a string with a length of 1, or even one or two characters without the terminating NUL. The result in those cases would be a memory violation.

@safe does not mean slow

Note that I said above that even a low-level systems programming project should use @safe where practical. How is that possible, given that such projects sometimes cannot use the garbage collector, a major tool used in D to guarantee memory safety?

Indeed, such projects must resort to memory-unsafe constructs every now and then. Even higher-level projects often have reasons to do so, as they want to create interfaces to C or C++ libraries, or avoid the garbage collector when indicated by runtime performance. But still, surprisingly large parts of code can be made @safe without using the garbage collector at all.

D can do this because the memory safe subset does not prevent raw memory access per se.

@safe void add(int* a, int* b, int* sum)
{
    *sum = *a + *b;
}

This compiles and is fully memory safe, despite dereferencing those pointers in the same completely unchecked way they are dereferenced in C. This is memory safe because @safe D does not allow creating an int* that points to unallocated memory areas, or to a float**, for instance. int* can point to the null address, but this is generally not a memory safety problem because the null address is protected by the operating system. Any attempt to dereference it would crash the program before any memory corruption can happen. The garbage collector isn’t involved, because D’s GC can only run if more memory is requestend from it, or if the collection is explicitly called.

D slices are similar. When indexed at runtime, they will check at runtime that the index is less than their length and that’s it. They will do no checking whatsoever on whether they are referring to a legal memory area. Memory safety is achieved by preventing creation of slices that could refer to illegal memory in the first place, as demonstrated in the first example of this article. And again, there’s no GC involved.

This enables many patterns that are memory-safe, efficient, and independent of the garbage collector.

struct Struct
{
    int[] slice;
    int* pointer;
    int[10] staticArray;
}

@safe @nogc Struct examples(Struct arg)
{
    arg.slice[5] = *arg.pointer;
    arg.staticArray[0..5] = arg.slice[5..10];
    arg.pointer = &arg.slice[8];
    return arg;
}

As demonstrated, D liberally lets one do unchecked memory handling in @safe code. The memory referred to by arg.slice and arg.pointer may be on the garbage collected heap, or it may be in the static program memory. There is no reason the language needs to care. The program will probably need to either call the garbage collector or do some unsafe memory management to allocate memory for the pointer and the slice, but handling already allocated memory does not need to do either. If this function needed the garbage collector, it would fail to compile because of the @nogc attribute.

However…

There’s a historical design flaw here in that the memory may also be on the stack. Consider what happens if we change our function a bit.

@safe @nogc Struct examples(Struct arg)
{
    arg.pointer = &arg.staticArray[8];
    arg.slice = arg.staticArray[0..8];
    return arg;
}

Struct arg is a value type. Its contents are copied to the stack when examples is called and can be ovewritten after the function returns. staticArray is also a value type. It’s copied along with the rest of the struct just as if there were ten integers in the struct instead. When we return arg, the contents of staticArray are copied to the return value, but ptr and slice continue to point to arg, not the returned copy!

But we have a fix. It allows one to write code just as performant in @safe functions as before, including references to the stack. It even enables a few formerly @system (the opposite of @safe) tricks to be written in a safe way. That fix is DIP1000. It’s the reason why this example already causes a deprecation warning by default if it’s compiled with the latest nightly dmd.

Born first, dead last

DIP1000 is a set of enhancements to the language rules regarding pointers, slices, and other references. The name stands for D Improvement Proposal number 1000, as that document is what the new rules were initially based on. One can enable the new rules with the preview compiler switch, -preview=dip1000. Existing code may need some changes to work with the new rules, which is why the switch is not enabled by default. It’s going to be the default in the future, so it’s best to enable it where possible and work to make code compatible with it where not.

The basic idea is to let people limit the lifetime of a reference (an array or pointer, for example). A pointer to the stack is not dangerous if it does not exist longer than the stack variable it is pointing to. Regular references continue to exist, but they can refer only to data with an unlimited lifetime—that is, garbage collected memory, or static or global variables.

Let’s get started

The simplest way to construct limited lifetime references is to assign to it something with a limited lifetime.

@safe int* test(int arg1, int arg2)
{
    int* notScope = new int(5);
    int* thisIsScope = &arg1;
    int* alsoScope; // Not initially scope...
    alsoScope = thisIsScope; // ...but this makes it so.

    // Error! The variable declared earlier is
    // considered to have a longer lifetime,
    // so disallowed.
    thisIsScope = alsoScope;

    return notScope; // ok
    return thisIsScope; // error
    return alsoScope; // error
}

When testing these examples, remember to use the compiler switch -preview=dip1000 and to mark the function @safe. The checks are not done for non-@safe functions.

Alternatively, the scope keyword can be explicitly used to limit the lifetime of a reference.

@safe int[] test()
{
    int[] normalRef;
    scope int[] limitedRef;

    if(true)
    {
        int[5] stackData = [-1, -2, -3, -4, -5];

        // Lifetime of stackData ends
        // before limitedRef, so this is
        // disallowed.
        limitedRef = stackData[];

        //This is how you do it
        scope int[] evenMoreLimited
            = stackData[];
    }

    return normalRef; // Okay.
    return limitedRef; // Forbidden.
}

If we can’t return limited lifetime references, how they are used at all? Easy. Remember, only the address of the data is protected, not the data itself. It means that we have many ways to pass scoped data out of the function.

@safe int[] fun()
{
    scope int[] dontReturnMe = [1,2,3];

    int[] result = new int[](dontReturnMe.length);
    // This copies the data, instead of having
    // result refer to protected memory.
    result[] = dontReturnMe[];
    return result;

    // Shorthand way of doing the same as above
    return dontReturnMe.dup;

    // Also you are not always interested
    // in the contents as a whole; you
    // might want to calculate something else
    // from them
    return
    [
        dontReturnMe[0] * dontReturnMe[1],
        cast(int) dontReturnMe.length
    ];
}

Getting interprocedural

With the tricks discussed so far, DIP1000 would be restricted to language primitives when handling limited lifetime references, but the scope storage class can be applied to function parameters, too. Because this guarantees the memory won’t be used after the function exits, local data references can be used as arguments to scope parameters.

@safe double average(scope int[] data)
{
    double result = 0;
    foreach(el; data) result += el;
    return result / data.length;
}

@safe double use()
{
    int[10] data = [1,2,3,4,5,6,7,8,9,10];
    return data[].average; // works!
}

Initially, it’s probably best to keep attribute auto inference off. Auto inference in general is a good tool, but it silently adds scope attributes to all parameters it can, meaning it’s easy to lose track of what’s happening. That makes the learning process a lot harder. Avoid this by always explicitly specifying the return type (or lack thereof with void or noreturn): @safe const(char[]) fun(int* val) as opposed to @safe auto fun(int* val) or @safe const fun(int* val). The function also must not be a template or inside a template. We’ll dig deeper on scope auto inference in a future post.

scope allows handling pointers and arrays that point to the stack, but forbids returning them. What if that’s the goal? Enter the return scope attribute:

//Being character arrays, strings also work with DIP1000.
@safe string latterHalf(return scope string arg)
{
    return arg[$/2 .. $];
}

@safe string test()
{
    // allocated in static program memory
    auto hello1 = "Hello world!";
    // allocated on the stack, copied from hello1
    immutable(char)[12] hello2 = hello1;

    auto result1 = hello1.latterHalf; // ok
    return result1; // ok

    auto result2 = hello2[].latterHalf; // ok
    // Nice try! result2 is scope and can't
    // be returned.
    return result2;
}

return scope parameters work by checking if any of the arguments passed to them are scope. If so, the return value is treated as a scope value that may not outlive any of the return scope arguments. If none are scope, the return value is treated as a global reference that can be copied freely. Like scope, return scope is conservative. Even if one does not actually return the address protected by return scope, the compiler will still perform the call site lifetime checks just as if one did.

scope is shallow

@safe void test()
{
    scope a = "first";
    scope b = "second";
    string[] arr = [a, b];
}

In test, initializing arr does not compile. This may be surprising given that the language automatically adds scope to a variable on initialization if needed.

However, consider what the scope on scope string[] arr would protect. There are two things it could potentially protect: the addresses of the strings in the array, or the addresses of the characters in the strings. For this assignment to be safe, scope would have to protect the characters in the strings, but it only protects the top-level reference, i.e., the strings in the array. Thus, the example does not work. Now change arr so that it’s a static array:

@safe void test()
{
    scope a = "first";
    scope b = "second";
    string[2] arr = [a, b];
}

This works because static arrays are not references. Memory for all of their elements is allocated in place on the stack (i.e., they contain their elements), as opposed to dynamic arrays which contain a reference to elements stored elsewhere. When a static array is scope, its elements are treated as scope. And since the example would not compile were arr not scope, it follows that scope is inferred.

Some practical tips

Let’s face it, the DIP1000 rules take time to understand, and many would rather spend that time coding something useful. The first and most important tip is: avoid non-@safe code like the plague if doable. Of course, this advice is not new, but it appears even more important with DIP1000. In a nutshell, the language does not check the validity of scope and return scope in a non-@safe function, but when calling those functions the compiler assumes that the attributes are respected.

This makes scope and return scope terrible footguns in unsafe code. But by resisting the temptation to mark code @trusted to avoid thinking, a D coder can hardly do damage. Misusing DIP1000 in @safe code can cause needless compilation errors, but it won’t corrupt memory and is unlikely to cause other bugs either.

A second important point worth mentioning is that there is no need for scope and return scope for function attributes if they receive only static or GC-allocated data. Many langauges do not let coders refer to the stack at all; just because D programmers can do so does not mean they must. This way, they don’t have to spend any more time solving compiler errors than they did before DIP1000. And if a desire to work with the stack arises after all, the authors can then return to annotate the functions. Most likely they will accomplish this without breaking the interface.

What’s next?

This concludes today’s blog post. This is enough to know how to use arrays and pointers with DIP1000. In principle, it also enables readers to use DIP1000 with classes and interfaces. The only thing to learn is that a class reference, including the this pointer in member functions, works with DIP1000 just like a pointer would. Still, it’s hard to grasp what that means from one sentence, so later posts shall illustrate the subject.

In any case, there is more to know. DIP1000 has some features for ref function parameters, structs, and unions that we didn’t cover here. We’ll also dig deeper on how DIP1000 plays with non-@safe functions and attribute auto inference. Currently, the plan is to do two more posts for this series.

Do let us know in the comment section or the D forums if you have any useful DIP1000 tips that were not covered!

Thanks to Walter Bright for reviewing this article.

Reducing Template Compile Times

Templates have been enormously profitable for the D programming language. They allow the programmer to generate efficient and correct code at compile time. Long gone are the days of preprocessor macros or handwritten, per-type data structures. D templates, though designed in the shadow of C++ templates, were not made in their image. D makes templates cleaner and more expressive, and also enables patterns like “Design by Introspection”.

Here is a simple example of a template that would require the use of preprocessor macros in C or C++:

template sizeOfTypeByName(string name)
{
  enum sizeOfTypeByName = mixin(name, ".sizeof");
}
unittest
{
  assert(sizeOfTypeByName!"int" == 4);
}

D’s templates are powerful tools but should not be used unthinkingly. Carelessness could result in long compile times or excessive code generation.

In this blog post, I introduce some simple concepts that can help in writing templates that minimize resource usage. Deeper intuition can also lead to the discovery of new abstractions or increased confidence in existing ones.

Read the memo

The D compiler memoizes template instantiations: if I instantiate MyTemplate!int once, the compiler produces an AST for that instantiation; if I instantiate that exact template again, the previous computation is reused.

As a demonstration, let’s write a generic addition function and use pragma(msg, ...) to print the number of instantiations at compile time. I’m going to use it twice with integers and twice with floating point numbers.

auto genericAdd(T)(const T x, const T y)
{
  pragma(msg, "genericAdd instantiated with ", T);
  return x + y;
}

// Instantiate with int
writeln(genericAdd!int(4, 5));
writeln(genericAdd!int(6, 1));

// Now for the float type
writeln(genericAdd!float(24.0, 32.0));
writeln(genericAdd!float(0.0, float.nan));

Now let’s compile the code and look at what our pragma(msg, ) says about template instantiations in the compiler.

dmd -c generic_add.d

This yields the following output during compilation:

genericAdd instantiated with int
genericAdd instantiated with float

We can see int and float as we expected, but notice that each is only mentioned once. Newcomers to languages with templates or generics can sometimes mistakenly think that using a template requires a potentially expensive instantiation on every use in the source code. For the benefit of those new users of D, the above is categoric proof that this is not the case; you cannot pay twice for templates you have already asked the compiler to instantiate. (You can, however, convince yourself that you are asking the compiler to do something it’s already done when you in fact are not. We’ll go over contrived and real-world examples of this later in this article.)

The benefit of this feature should be obvious, but what may not be obvious is how it can be employed in writing templates. Within the bounds of our desire for ergonomics, we should design the interfaces of our templates to maximize the number of identical instantiations.

What’s in a name?

The following example, adapted from a real-world change to a large D project, yielded a reduction in compile time of a few percent for unit-test builds.

Let’s say we have an expensive template whose behavior we want to test over a simple type. Our type might be:

struct Vector3
{
  float x;
  float y;
  float z;
}

To demonstrate the phenomenon, we don’t have to do anything fancy, so we’ll just declare a stub called send.

// Let's say this sends a value of type T to a database.
void send(T)(T x);

A note on syntax: Given a variable val of type int, this template could be explicitly instantiated as send!(int)(val). However, the compiler can infer the type T, so we can instantiate it as if it were a normal function call as send(val). Using D’s Uniform Function Call Syntax, we could alternatively call it like a property or member, as val.send() (the approach used in the following example), or even val.send, since parentheses are optional in function calls when there are no arguments.

Our test might then be something like:

struct Vector3
{
  float x;
  float y;
  float z;
}

unittest
{
  Vector3 value;
  value.send();
}

This is reasonable so far. However, an issue arises when we start to write more than one test. Should we want to test different behaviors of a fancy template, but instantiate it with the same type, then we end up spending more time in compilation than we would have expected. A lot more time. And we see large growth in the number of symbols emitted in the resulting binary, resulting in a larger file size than one would expect. Why is that?

Despite our intuition that the compiler should consider multiple declarations of a type like Vector3 in multiple unittest blocks as identical, it actually does not. We can demonstrate this effect with an extremely simple example. We’ll provide an implementation of send that prints at compile time the type of each instantiation. Then we’ll use static foreach to generate five distinct implementations of a single unit test.

void send(T)(T x)
{
  pragma(msg, T); 
}

// Generate 5 unittest blocks
static foreach(_; 0..5)
{
  unittest
  {
    struct JustInt
    {
      int x;
    }
    JustInt value;
    value.send;
  }
}

This results in the following output from the compiler:

JustInt
JustInt
JustInt
JustInt
JustInt

Huh? Doesn’t this violate our “you can’t pay twice” rule? If you were to take this output from the compiler as gospel, then yes, but there’s a more subtle truth here.

Fully qualified names

The name of a type as you would write it in your editor is not the complete name of a type. Let’s amend the implementation of send to print the return value of a template called fullyQualifiedName rather than printing T directly. The rest of the example remains the same.

void send(T)(T x)
{
  import std.traits : fullyQualifiedName;
  pragma(msg, fullyQualifiedName!T); 
}

Assuming the module is named example, this yields something like:

example.__unittest_L13_C3_1.JustInt
example.__unittest_L13_C3_2.JustInt
example.__unittest_L13_C3_3.JustInt
example.__unittest_L13_C3_4.JustInt
example.__unittest_L13_C3_5.JustInt

This explains our previous conundrum. By declaring the type locally in each test, we have actually declared a new type per test, each of which results in a new instantiation.

A type’s fully qualified name includes the name of its enclosing scope ({package-name.}module-name.{scope-name(s).}TypeName). The compiler rewrites each unittest as a unique function with a generated name. We have five unique functions, each with its own local, distinct declaration of a JustInt type. And so we end up with five distinct types.

We want to ensure that one instantiation is reused across unittest blocks. We do that by moving the declaration of JustInt to module scope, outside of the unit tests.

struct JustInt
{
  int x;
}

static foreach(_; 0..5)
{
  unittest
  {
    JustInt value;
    value.send;
  }
}

The send template now prints:

example.JustInt

Much better.

Some hard data

To collect some anecdata about the usefulness of these changes, we’ll look at compilation times and the size of compiled binaries. Since this template is very trivial, let’s generate a hundred copies of the same unittest rather than five so we can see a trend.

On my system, timing the compilation of our programs shows the locally declared types took 243ms to compile, but the version with a single global type declaration took 159ms to compile. A difference of 84ms is not all that much, sure, but in a large codebase, there may be a lot of these speedups waiting to be found. Any reduction in compile times is to be embraced, especially when it’s cumulative.

As for binary size, I saw a savings of 69K on disk. The quantity of machine code generated by the compiler is worth keeping a close eye on. Larger binaries mean more work for the linker, which in turn means more time waiting for builds to complete. The easiest job is the one you don’t have to do.

A more complex example

The following example demonstrates a very simple but fundamental change to a template that yields an enormous improvement in compile times and other metrics.

Let’s say we have a fairly simple interpreter, and we want to expose functions in our D code to the scripts executed by our interpreter. We can do that with some sort of registration function, which we’ll call register.

The signature of the register function

To prove the point I’m discussing, we don’t need to implement this function—its interface is what can cause a big slow down.

Let’s say our register function looks like this:

// Context is something our hypothetical interpreter works with
void register(alias func, string registeredName)(Context x); 

It’s pretty reasonable, right? It takes a template alias parameter that specifies the function to call (a common idiom in D) and a template value parameter of type string that represents the name of the function as it is exposed to scripts. The implementation of register will presumably map the value of registeredName to the func alias, and then scripts can call the function using that value. Functions can be registered with, e.g., the following:

context = createAContext();
context.register!(writeln!string, "writeln")();

The scripts can call the Phobos writeln function template using the name writeln.

The compile-time performance of the register template

The interface for register looks harmless, but it turns out that it has a significant impact on compile time. We can test this by registering some random functions. The actual contents of the functions don’t matter—this article is about template compile times, so we just want a baseline figure for roughly how much time the infrastructure templates take to compile rather than the code they are hooking together.

Although we will pull the functions out of a hat, the thing that will drive our intuition is to realize that a small number of interfaces will likely be reused many times. We could start with a basket of interface stubs like this:

int stub1(string) { return 1; }

int stub2(string) { return 2; }

int stub3(string) { return 3; }

// etc.

More broadly, with a bunch of functions that have identical signatures and a bunch of functions with random parameter lists and return types, we can get a rough baseline. With the set of stubs I used, compile times ended up at roughly 5 seconds.

So what happens if we move the compile-time parameters to run time? Since registeredName is a template value parameter, we can just move it into the function parameter list with no change. We have to handle the func parameter differently. Almost any symbol can bind at compile time to a template alias parameter, but symbols can’t bind at run time to function parameters. We have to use a function pointer instead. In that case, we can use the type of the referenced function as a template parameter.

void register(FuncType)(Context x, FuncType ptr, string registeredName);

With this signature, the compile time drops to roughly 1 second.

What’s going on?

D is a fairly fast language to compile. Good decisions have been made over the lifetime of the language to make that possible. It is also the case that one can happily write slow-to-compile D code. Although we are choosing to ignore the compilation speeds of the non-infrastructure code to simplify the point being made, this can actually (in a certain sense) be the case in real projects, too. As such it is worthwhile to pay attention not to the quantity of metaprogramming being done semantically but rather the quantity of metaprogramming being performed by the compiler.

In this case, with the first interface used for register, the compiler had no opportunity to reuse any instantiations. Because it accepted the registered functions as symbols, each instantiation was unique. By shifting instead to take the type of the registered function as a template parameter and a pointer to the function as a function parameter, the compiler could reuse instantiations. stub1, stub2, and stub3 are distinct symbols, but they each have the same type of (int function(string)).

To be clear, this is not an indictment of template alias parameters. There are good use cases for them (the Phobos algorithms API is an example). The point of this example is to show how the compile-time costs of unique template instantiations can be hidden. A decision about the trade-offs between compile-time and run-time performance can only be made if the programmer is aware there is a decision to make. So when implementing a template, consider how it will be used. If it’s going to end up creating many unique instantiations, then you can weigh the benefits of keeping that interface versus redesigning it to maximize reuse.

A false friend

In linguistics, a false friend is a pair of words from two different languages that look the same but have different meanings. I’m going to abuse this term by using it to refer to a pair of programming patterns that actually result in the same program behavior, but via different routes through the language implementation, i.e., one of these patterns has, say, worse performance or compile times than the other.

A simple example:

Let’s say we have a library that exposes a template as part of its API, like this one that prints a string when a module is loaded during run-time initialization:

template FunTemplate(string op)
{
  shared static this()
  {
    import std.stdio;
    writeln(op);
  }
}

// Use like this
mixin FunTemplate!"Hello from DLang";

Now let’s say we want to refactor the library in some way that it’s desirable to distinguish the name FunTemplate and its implementation. How would you go about doing that?

One way would be to tack Impl onto the implementation name, then declare an eponymous template that aliases the shortened name to the implementation name and forwards the template argument like this:

template FunTemplate(string op)
{
  alias FunTemplate = FunTemplateImpl!op;
}

This does the job, but it also creates an additional instantiation for each different value of op, one instance of FunTemplate, and one of FunTemplateImpl. So if we instantiate it with, e.g., five different values, we end up with ten unique instantiations. Now imagine doing that with a template that’s heavily used throughout a program.

Since we only want to provide an alternate name for the implementation and aren’t doing anything to the parameter list, we can achieve the same result without adding another template into the mix: just use alias by itself.

alias FunTemplate = FunTemplateImpl; 

Since FunTemplate is no longer a template, FunTemplate!"Foo" only creates the one instance of FunTemplateImpl.

Normalization of template arguments

Once we know what we want a template to look like, and we’re satisfied with the interface we want it to have, there are sometimes subtle ways to separate the interface and implementation of a template such that we can minimize the total amount of work the compiler has to do.

The definition of “work” in this context can be important to consider, as we can find ways to balance a tradeoff between compile times and the amount of object code generated for each instantiation. One technique to reduce these costs is by normalizing a given list of template arguments into something called a canonical form.

Canonical Forms

A canonical form, resulting from a process called canonicalization, is a mathematical structure that is intended to reduce multiple different-looking but identical objects into one form that we can then manipulate as we see fit. Using an automatic code formatter is an example of transforming input (in this case, source code) into a canonical form.

Application to templates

Consider a template like this one:

template Expensive(Args...)
{
  /* Some kind of expensive metaprogramming or code generation */
}

If we can think of a useful canonical form that isn’t too hard to compute, we can then write a second template Reduce to implement it, then inject it like in the following example.

template Expensive(Args...)
{
  // Reduce to some kind of canonical form
  alias reduced = Reduce!(Args);

  // Where ExpensiveImpl is the same as above but renamed
  alias Expensive = ExpensiveImpl!(reduced);
}

To be worth doing, ExpensiveImpl must be significantly more expensive than the reduction operation (pay attention to differences in sys time when measuring this), where “significant” is meant statistically rather than informally, i.e., any win is good as long as you can rigorously prove it’s real.

An example: sorting template arguments

Take a templated aggregate like this:

struct ExposeMethods(Types...)
{
  /* Some kind of internal state dependant on Types but not their order */

  static foreach(Type; Types) {
    bool test(Type x) { 
      /* Something slow to compile depending on Type */
    }
  }
}

If it’s instantiated with, e.g., five different types across a large codebase, we could spend a lot of time redoing semantically identical compilation. If all possible types are used as input many times, we could end up with a few permutations, and if not we will probably get a few identical subsets.

A canonical form that might come to mind (i.e., a potential definition of Reduce) is simply sorting the arguments by their names. This can be achieved via the use of staticSort.

Conclusion

D has powerful metaprogramming and code generation features. But like anything in programming, their use isn’t free. If you want to avoid the situation where you find yourself making coffee while your project builds, then it’s imperative to be aware of the cost vs. benefits of the metaprogramming features you use. Then you can make intelligent decisions about your compile-time interfaces and implementations.

Appendix – Tracing the D compiler to count template instantiations

Here’s a simple lesson in Linux userspace tracing: you can use a tool like bpftrace or DTrace to spy on the D compiler compiling other things, so we can get basic figures about the compilation of other D programs without either hacking the compiler or changing their build process.

You’ll need a bpftrace file like the following (saved as e.g., main.bt):

BEGIN
{
  printf("Tracing a D file\n");
}
uprobe:/home/mhh/dlang/dmd-2.097.0/linux/bin64/dmd:_Dmain
{
  printf("This is the main\n");
}
uprobe:/home/mhh/dlang/dmd-2.097.0/linux/bin64/dmd:_D3dmd10dsymbolsem24templateInstanceSemanticFCQBs9dtemplate16TemplateInstancePSQCz6dscope5ScopePSQDr4root5array__T5ArrayTCQEq10expression10ExpressionZQBkZv
{
  //We do nothing with the knowledge here but if you write some code you can get info about the templates relatively easily
  printf("Instantiating a template\n");
}

What am I hooking here? That big mangled name in the middle of the script is templateInstanceSemantic in the dsymbolsem module in the DMD source. By hooking it, we can get a rough idea of when a template is being worked on.

Running it with sudo bpftrace main.bt (eBPF tracing currently requires root) when building DMD, for example, I see there are about 50,800 template hits.

You can use a more complicated script in a system like bcc to reconstruct the compiler’s internal data structures. With that, we can get output a bit more like 09:05:17 69310 b'/home/mhh/d_dev/dmd/src/dmd/errors.d:85' and actually reconstruct the source/line info (alongside a timestamp and PID).

Teaching D from Scratch: Is it a viable first language?

About two-and-a-half years ago, one of my son’s homeschooling friends asked if I would teach a coding class. Both of my kids have shown interest in coding via Scratch, and I also need to pull my weight when it comes to homeschooling them (my wife doing most of the instruction). So I thought, surely, this can’t be too hard. The challenges I would encounter over the course of the last few years surprised me more often than I would have expected!

First, a bit about my background. I’ve been writing software professionally for over 20 years, about half using system languages and half doing web development. I’ve dabbled in a lot of programming projects, including embedded microcontrollers with 256 bytes of RAM, collaborative video systems using high-end Unix workstations, and automating a factory testing rack-mounted appliances. What I haven’t ever done to any official degree is teach programming. For sure, this is a challenge that is novel to me, and also one that I have found very satisfying.

A bit about the kids. I had ten students, aged 9 – 15, all of whom had very little experience writing real software. Since this class was going to be a side project for me on top of my normal job, we went rather slowly, meeting once every two weeks.

One of the goals of my class was to try not to dumb down coding into an abstraction. I wanted the kids to experience real programming without too much of a pre-built hand-holding framework, and I believed they could do it. I’ve seen many (and purchased some) online resources for teaching to code that hide most of the real work and just focus on using custom libraries or simplified languages to learn the “basics”, while not actually teaching to be productive in a standard environment. I also witnessed firsthand my own kids getting bored when the material was presented too slowly or too abstractly.

Note: you can see some of my homework assignments and review pages on a dedicated website that I created for this class

First Try: Javascript and Lua

The first language we worked with, because many of the kids didn’t have actual full-blown computers to work on (this was pre-Covid so it was at my house), was Javascript. Javascript is possible to develop in a browser, using a tablet or a Chromebook, so all the kids I was teaching could participate. My focus in these early days was basic concepts like loops, branching, and basic statements. I think Javascript can be a great first language, though the things you can do with it as a beginner are a bit bland, and to do anything exciting you have to start learning the DOM (including how objects and methods work).

After half a year we moved on to our second language: Lua, specifically in the video game system Roblox. I almost wish we had skipped this one because it was a bit too advanced for these students, and I was not familiar with the language and the system to begin with. The kids did have fun with the model-building UI. However, I can see an experienced teacher finding good ways to instruct using this environment. Roblox, with its pre-built rendering and physics engines, has the nice benefit of allowing one to generate a really cool game with relatively little code—something new coders quite enjoy.

Just use D

And that leads us to the second year, where I decided to just try teaching what I know best: the D programming language. Since all the kids have a keen interest in building games, I searched for a simple library I could use to build games. I have to give a shout-out to the library I discovered, Raylib, and the YouTuber Ki Rill, whose very straightforward and well-made videos gave me the confidence that this also would be something I could teach.

For a quick demo of what Raylib looks like, here is a program I wrote in a few minutes that bounces a Raylib-drawn D-man around the window.

Everyone in the class is using Windows, so the toolchain that I am having them use is the stock DMD compiler and the Visual Studio Code development environment with the code-d extension. Throughout the last year-and-a-half, I have had the kids build such classics as hangman and tic-tac-toe using text-based “graphics”. After that came the introduction to 2D graphics and Raylib, where we built a brick-breaking game similar to Arkanoid and others.

Bricker

Finally, I let the students pick what they wanted for the last project, and their pick was a rock-paper-scissors online tournament game. I’m not sure of the lasting power of that one…

Keep in mind that my goal was to teach them programming, not necessarily D. Working through all these projects allowed me to introduce a lot of the typical programming concepts (branching, looping, functions, data types, arrays, etc.). This year we are focusing on aggregates (specifically arrays and structs), and how they can be useful to encapsulate your coding solution into pieces that are easier to deal with. The game we are writing this year is a snake clone. I’m hoping to wrap things up on that by the end of the year, and then next year add in networking to have the game playable between the students (all the kids these days are into online tournament-style games).

Snake

My Take on Teaching With D

So how is D as a beginner’s programming language? I can say with confidence that the kids understand the flavor of D that I have taught just as well as the other languages. What flavor is that you may ask? Why vanilla of course! As D was the first typed language they were exposed to, I had to first explain types and why they are important. I also had to pick and choose which concepts in D I did not want to confuse them with. To that end, I’ve left out a lot of the advanced features (so far), such as:

  • Templates
  • Type inference
  • Operator overloading
  • Overloading in general!
  • __traits
  • Classes/Interfaces/OOP
  • Pointers/references (mostly)
  • Memory management in general (thank you GC!)

This means that a lot of language features are missing from “Vanilla D”, and a lot of those are pieces I like. I had to force myself to avoid those things while teaching so as not to confuse them. I also try to avoid shortcuts if I can, as I want them to master the basic concepts before learning how to do it quicker or less verbosely. I tried hard to always use curly braces for scope blocks, for instance, instead of just writing single statements without them.

The features that I found the kids handled well were basic types like ints and strings (though what a funny name “string” is, that took a while to sink in), if statements and foreach/while loops, structs, and functions. They had a harder time understanding arrays and associative arrays, which might be due to my lack of experience teaching. I’ll also note that teaching virtually has some significant challenges—there’s just no substitute to being able to walk over to a student’s computer and observe and help their progress or draw out on a whiteboard how data is laid out. One surprising thing (to me anyway) that they handled completely in stride, was nested functions. That’s a nice D feature that is not available in C, barely in C++, and something that I personally took a while to get used to.

What Could Go Wrong?

The challenges were numerous. First I had to try and remember what it’s like to know nothing about programming. I did not take any classes on teaching programming and did not look up the “proper” way to teach it, I just went with what I thought would be the next best thing to do based on their skills and experience. Having class only once every two weeks is a big drawback since the students’ brains tend to garbage-collect some of the stuff we worked on last time. For sure, a daily or maybe twice-weekly class schedule would improve the situation, but that would mean I’d have to generate material to teach that many classes, which I unfortunately don’t have the time for.

Secondly, I previously had no experience with games programming. Raylib made this pretty easy, and as long as you can understand event loops, it’s pretty straightforward (and fun—I wish I had learned it sooner).

Aside from my personal and environmental shortcomings, I have some suggestions for the D language and community that would make D a much better “first language”. In no particular order:

Error Messages

Error messages in D are sometimes esoteric, and I’m not just talking about templates. The most common error the students make is bad punctuation. This leads to some of the worst messages. Most of the time, I need to explain what is happening, but even when explaining it myself, I am sometimes thwarted by the compiler. For instance, if you forget a semicolon:

void main()
{
    import std.stdio;
    int x = 5
    writeln(x); // line 5
}

The resulting error is:

foo.d(5): Error: semicolon expected, not `writeln`

But wait, there is a semicolon on line 5! So why is it having a problem? Technically, putting an extra semicolon just before the writeln call on line 5 would solve the error (after all, D doesn’t have significant whitespace). But really, the compiler shouldn’t be suggesting that.

I’d rather see something like:

foo.d(5): Error: previous statement not terminated, perhaps you need 
a semicolon on line 4?

What if you are missing a brace or have an extra one? That spits out a lot of error vomit that is sometimes really difficult to decipher. I know this isn’t as easy a problem to solve, but since the compiler is not even to the tricky semantic parts, what if it used some other hints of the source to try and tease out a suggested solution? Like maybe taking cues from the indentation.

A great feature of the D compiler is the spelling correction suggestions. But sometimes it’s not easy to see what has changed in the suggested spelling correction (especially when it’s just different capitalization). Why not highlight the changed letters? Since D has gained colorized output for errors, things have become much clearer in my opinion.

I do have to relay a comment from one of my students, who said he appreciated the ability of VS Code to jump to the line that the compiler error is referencing. This helps put into perspective where their mindset is.

Debugger on VSCode for Windows

Another challenge I’ve faced is the lack of a good debugging experience for VS Code on Windows. The C debugger for VS Code displays, for example, a string as a length and a C string (which might have trailing garbage). I’d rather not confuse them with this. I have read that Visual D has a better debugging experience, but I need dub support for these projects, and Visual D focuses on Visual Studio integration, something I don’t necessarily want to deal with in teaching these kids. I’m hoping that a recent focus on debugging (e.g., the LLDB integration project) will help.

I think WebFreak has done a great job on the VS Code plugin, and for the most part, it works amazingly well. Thank you! I know debugging is a piece that is hard to get right.

Linking External Libraries

My next challenge is using dub for linking. Dub is great if you are only doing things with D and OS-supplied libraries. As soon as you need to depend on an external C library that is not part of your OS (such as Raylib), now you are not just adding dependencies but downloading pre-built libraries, or trying to build them yourself. I’m not sure if I have great suggestions on this, but I would like some way for dub to be told where to download these libs or have a central place to put them where the linker can find them outside of specifying the location in your dub.json file. Many libraries are released on GitHub, which makes it a prime candidate for downloading and installing external libraries (maybe even when ImportC becomes viable, it can automatically generate the bindings as well).

I spend an inordinate amount of time helping the kids set up their build environments, and having this built-in to the IDE or dub would be immensely helpful for using D for learning.

Suggestions for the Future

I am confident that D can be a viable first language. I have students that, prior to my class, had never before written real code now creating 2D games using D and Raylib (with a lot of hand-holding, darn it!). What I think D needs for this success to be scalable is a set of resources that assumes no programming experience. Most of D’s learning resources are aimed at teaching experienced programmers how D works in terms of their existing knowledge. They answer the question, “How do I do <insert feature> from <insert language> in D?” What we need is to answer the question, “How do I write code?”, that just happens to be using D.

Editor’s Note:Some readers may be thinking at this point of Ali Çehreli’s Programming in D. Though written for beginner-level programmers, it’s focused more on teaching the language than teaching to program generally.

This shouldn’t be too hard to create, as there are a ton of resources on learning to program in C already. D is so similar you could almost just replace C with D and call it a day. In fact, I’m considering creating a YouTube series on the topic with the experience I’ve gained. One important tool that needs creating is a map of D features and where they fall on the difficulty scale. This allows one to write D tutorials that follow a gradual introduction of features. Broaching the subject of complex features is probably needed early, as they will experience some of them (you are using a template whenever you use writeln, or to), but explanation of the entire feature may be deferred to a later level.

And finally, such materials should be engaging! I can’t stress enough how important it is to be doing something fun when you are learning to program. The kids definitely are more engaged since we are creating games that they might actually want to play vs. simple hello world examples or even Javascript web pages. A nice way to proceed might be to write a game in Vanilla D and then introduce features that allow you to refine and improve the code.

I hope this becomes a reality because I think this area of development has been relegated to either frameworks that attempt to water down programming to get kids engaged or sterile hello world-style programming that is intended for rote instruction or reference. I think we can show the full potential of D and, at the same time, engage eager learners with interesting and fun projects to trick them into learning code with our favorite language!

Interfacing D with C: Strings Part One

Digital Mars D logo

This post is part of an ongoing series on working with both D and C in the same project. The previous two posts looked into interfacing D and C arrays. Here, we focus on a special kind of array: strings. Readers are advised to read Arrays Part One and Arrays Part Two before continuing with this one.

The same but different

D strings and C strings are both implemented as arrays of character types, but they have nothing more in common. Even that one similarity is only superficial. We’ve seen in previous blog posts that D arrays and C arrays are different under the hood: a C array is effectively a pointer to the first element of the array (or, in C parlance, C arrays decay to pointers, except when they don’t); a D dynamic array is a fat pointer, i.e., a length and pointer pair. A D array does not decay to a pointer, i.e., it cannot be implicitly assigned to a pointer or bound to a pointer parameter in an argument list. Example:

extern(C) void metamorphose(int* a, size_t len);

void main() {
    int[] a = [8, 4, 30];
    metamorphose(a, a.length);      // Error - a is not int*
    metamorphose(a.ptr, a.length);  // Okay
}

Beyond that, we’ve got further incompatibilities:

  • each of D’s three string types, string, wstring, and dstring, are encoded as Unicode: UTF-8, UTF-16, and UTF-32 respectively. The C char* can be encoded as UTF-8, but it isn’t required to be. Then there’s the C wchar_t*, which differs in bit size between implementations, never mind encoding.
  • all of D’s string types are dynamic arrays with immutable contents, i.e., string is an alias to immutable(char)[]. C strings are mutable by default.
  • the last character of every C string is required to be the NUL character (the escape character \0, which is encoded as 0 in most character sets); D strings are not required to be NUL-terminated.

It may appear at first blush as if passing D and C strings back and forth can be a major headache. In practice, that isn’t the case at all. In this and subsequent posts, we’ll see how easy it can be. In this post, we start by looking at how we can deal with NUL termination and wrap up by digging deeper into the related topic of how string literals are stored in memory.

NUL termination

Let’s get this out of the way first: when passing a D string to C, the programmer must ensure it is terminated with \0. std.string.toStringz, a simple utility function in the D standard library (Phobos), can be employed for this:

import core.stdc.stdio : puts;
import std.string : toStringz;

void main() {
    string s0 = "Hello C ";
    string s1 = s0 ~ "from D!";
    puts(s1.toStringz());
}

toStringz takes a single argument of type const(char)[] and returns immutable(char)* (there’s more about const vs. immutable in Part Two). The form s1.toStringz, known as UFCS (Uniform Function Call Syntax), is lowered by the compiler into toStringz(s1).

toStringz is the idiomatic approach, but it’s also possible to append "\0" manually. In that case, puts can be supplied with the string’s pointer directly:

import core.stdc.stdio : puts;

void main() {
    string s0 = "Hello C ";
    string s1 = s0 ~ "from D!" ~ "\0";
    puts(s1.ptr);
}

Forgetting to use .ptr will result in a compilation error, but forget to append the "\0" and who knows when someone will catch it (possibly after a crash in production and one of those marathon debugging sessions which can make some programmers wish they had never heard of programming). So prefer toStringz to avoid such headaches.

However, because strings in D are immutable, toStringz does allocate memory from the GC heap. The same is true when manually appending "\0" with the append operator. If there’s a requirement to avoid garbage collection at the point where the C function is called, e.g., in a @nogc function or when -betterC is enabled, it will have to be done in the same manner as in C, e.g., by allocating/reallocating space with malloc/realloc (or some other allocator) and copying the NUL terminator. (Also note that, in some situations, passing pointers to GC-managed memory from D to C can result in unintended consequences. We’ll dig into what that means, and how to avoid it, in Part Two.)

None of this applies when we’re dealing directly with string literals, as they get a bit of special treatment from the compiler that makes puts("Hello D from C!".toStringz) redundant. Let’s see why.

String literals in D are special

D programmers very often find themselves passing string literals to C functions. Walter Bright recognized early on how common this would be and decided that it needed to be just as seamless in D as it is in C. So he implemented string literals in a way that mitigates the two major incompatibilities that arise from NUL terminators and differences in array internals:

  1. D string literals are implicitly NUL-terminated.
  2. D string literals are implicitly convertible to const(char)*.

These two features may seem minor, but they are quite major in terms of convenience. That’s why I didn’t pass a literal to puts in the toStringz example. With a literal, it would look like this:

import core.stdc.stdio : puts;

void main() {
    puts("Hello C from D!");
}

No need for toStringz. No need for manual NUL termination or .ptr. It just works.

I want to emphasize that this only applies to string literals (of type string, wstring, and dstring) and not to string variables; once a string literal is included in an expression, the NUL-termination guarantee goes out the window. Also, no other array literal type is implicitly convertible to a pointer, so the .ptr property must be used to bind them to a pointer function parameter, e.g., `giveMeIntPointer([1, 2, 3].ptr).

But there is a little more to this story.

String literals in memory

Normal array literals will usually trigger a GC allocation (unless the compiler can elide the allocation, such as when assigning the literal to a static array). Let’s do a bit of digging to see what happens with a D string literal:

import std.stdio;

void main() {
    writeln("Where am I?");
}

To make use of a command-line tool particularly convenient for this example, I compiled the above on 64-bit Linux with all three major compilers using the following command lines:

dmd -ofdmd-memloc memloc.d
gdc -o gdc-memloc memloc.d
ldc2 -ofldc-memloc memloc.d

If we were compiling C or C++, we could expect to find string literals in the read-only data segment, .rodata, of the binary. So let’s look there via the readelf command, which allows us to extract specific data from binaries in the elf object file format, to see if the same thing happens with D. The following is abbreviated output for each binary:

readelf -x .rodata ./dmd-memloc | less
Hex dump of section '.rodata':
  0x0008e000 01000200 00000000 00000000 00000000 ................
  0x0008e010 04100000 00000000 6d656d6c 6f630000 ........memloc..
  0x0008e020 57686572 6520616d 20493f00 2f757372 Where am I?./usr
  0x0008e030 2f696e63 6c756465 2f646d64 2f70686f /include/dmd/pho
...

readelf -x .rodata ./gdc-memloc | less
Hex dump of section '.rodata':
  0x00003000 01000200 00000000 57686572 6520616d ........Where am
  0x00003010 20493f00 00000000 2f757372 2f6c6962  I?...../usr/lib
...

readelf -x .rodata ./ldc-memloc | less
Hex dump of section '.rodata':
  0x00001e40 57686572 6520616d 20493f00 00000000 Where am I?.....
  0x00001e50 2f757372 2f6c6962 2f6c6463 2f783836 /usr/lib/ldc/x86

In all three cases, the string is right there in the read-only data segment. The D spec explicitly avoids specifying where a string literal will be stored, but in practice, we can bank on the following: it might be in the binary’s read-only segment, or it might be in the normal data segment, but it won’t trigger a GC allocation, and it won’t be allocated on the stack.

Wherever it is, there’s a positive consequence that we can sometimes take advantage of. Notice in the readelf output that there is a dot (.) immediately following the question mark at the end of each string. That represents the NUL terminator. It is not counted in the string’s .length (so "Where am I?".length is 11 and not 12), but it’s still there. So when we initialize a string variable with a string literal or assign a string literal to a variable, the lack of an allocation also means there’s no copying, which in turn means the variable is pointing to the literal’s location in memory. And that means we can safely do this:

import core.stdc.stdio: puts;

void main() {
    string s = "I'm NUL-terminated.";
    puts(s.ptr);
    s = "And so am I.";
    puts(s.ptr);
}

If you’ve read the GC series on this blog, you are aware that the GC can only have a chance to run a collection if an attempt is made to allocate from the GC heap. More allocations mean a higher chance to trigger a collection and more memory that needs to be scanned when a collection runs. Many applications may never notice, but it’s a good policy to avoid GC allocations when it’s easy to do so. The above is a good example of just that: toStringz allocates, we don’t need it in either call to puts because we can trust that s is NUL-terminated, so we don’t use it.

To be very clear: this is only true for string variables that have been directly initialized with a string literal or assigned one. If the value of the variable was the result of any other operation, then it cannot be considered NUL-terminated. Examples:

string s1 = s ~ "...I'm Unreliable!!";
string s2 = s ~ s1;
string s3 = format("I'm %s!!", "Unreliable");

None of these strings can be considered NUL-terminated. Each case will trigger a GC allocation. The runtime pays no mind to the NUL terminator of any of the literals during the append operations or in the format function, so the programmer can’t trust it will be part of the result. Pass any one of these strings to C without first terminating it and trouble will eventually come knocking.

But hold on…

Given that you’re reading a D blog, you’re probably adventurous or like experimenting. That may lead you to discover another case that looks reliable:

import core.stdc.stdio: puts;

void main() {
    string s = "Am I " ~ "reliable?";
    puts(s.ptr);
}

The above very much looks like appending multiple string literals in an initialization or assignment is just as reliable as using a single string literal. We can strengthen that assumption with the following:

import std.stdio : writeln;

void main() {
    writeln("Am I reliable?".ptr);

    string s = "Am I " ~ "reliable?";
    writeln(s.ptr);
}

writeln is a templated function that recognizes when it’s being given a pointer; rather than treating it as a string and printing what it points to, it prints the pointer’s value. So we can print memory addresses in D without a format string.

Compiling the above, again on 64-bit Linux:

dmd -ofdmd-rely rely.d
gdc -o gdc-rely rely.d
ldc2 -ofldc-rely rely.d

Now let’s execute them all:

./dmd-rely
562363F63010
562363F63030

./gdc-rely
5566145E0008
5566145E0008

./ldc-rely
55C63CFB461C
55C63CFB461C

We see that dmd-rely prints two different addresses, but they’re very close together. Both gdc-rely and ldc-rely print a single address in both cases. And if we make use of readelf as we did with the memloc example above, we’ll find that, in every case, the literals are in the read-only data segment. Case closed!

Well, not so fast.

What’s happening is that all three compilers are performing an optimization known as constant folding. In short, they can recognize when all operands of an append expression are compile-time constants, so they can perform the append at compile-time to produce a single string literal. In this case, the net effect is the same as s = "Am I reliable?". LDC and GDC go further and recognize that the resulting literal is identical to the one used earlier, so they reuse the existing literal’s address (a.k.a. string interning). (Note that DMD also performs string interning, but currently it only kicks in when a string literal appears more than twice.)

To be clear: this only works because all of the operands are string literals. No matter how many string literals are involved in an operation, if only one operand is a variable, then the operation triggers a GC allocation.

Although we see that the result of an append operation involving string literals can be passed directly to C just fine, and we’ve proven that it’s stored in read-only memory alongside its NUL terminator, this is not something we should consider reliable. It’s an optimization that no compiler is required to perform. Though it’s unlikely that any of the three major D compilers will suddenly stop constant folding string literals, a future D compiler could possibly be released without this particular optimization and instead trigger a GC allocation.

In short: rely on this at your own risk.

Addendum: Compile rely.d on Windows with dmd and the binary will yield some very different output:

dmd -m64 -ofwin-rely.exe rely.d
./win-rely
7FF76701D440
7FF76702BB30

There is a much bigger difference in the memory addresses here than in the dmd binary on Linux. We’re dealing with the PE/COFF format in this case, and I’m not familiar with anything similar to readelf for that format on Windows. But I do know a little something about Abner Fog’s objconv utility. Not only does it convert between object file formats, but it can also disassemble them:

objconv -fasm win-rely.obj

This produces a file, win-rely.asm. Open it in a text editor and search for a portion of the string, e.g., "I rel". You’ll find the two entries aren’t too far apart, but one is located in a block of text under this heading:

rdata SEGMENT PARA ‘CONST’ ; section number 4

And the other under this heading:

.data$B SEGMENT PARA ‘DATA’ ; section number 6

In other words, one of them is in the read-only data segment (rdata SEGMENT PARA 'CONST'), and the other is in the regular data segment. This goes back to what I mentioned earlier about the D spec being explicitly silent on where string literals are stored. Regardless, the behavior of the program on Windows is the same as it is on Linux; the second call to puts doesn’t blow anything up because the NUL terminator is still there, one slot past the last character. But it doesn’t change the fact that constant folding of appended string literals is an optimization and only to be relied upon at your own risk.

Conclusion

This post provides all that’s needed for many of the use cases encountered with strings when interacting with C from D, but it’s not the complete picture. In Part Two, we’ll look at how mutability, immutability, and constness come into the picture, how to avoid a potential problem spot that can arise when passing GC-allocated D strings to C, and how to get D strings from C strings. We’ll save encoding for Part Three.

Thanks to Walter Bright, Ali Çehreli, and Iain Buclaw for their valuable feedback on this article.

Symphony of Destruction: Structs, Classes and the GC (Part One)

Digital Mars D logo

This post is part of a broader series on garbage collection in D. The motivation is to explore how destructors and the GC interact. To do that, we first need a bit of background. We do not go into a broader discussion on the ins and outs of object destruction, only what is most relevant to the interaction of destructors and the GC.

I’ve split the discussion into two blog posts. Here in Part One, we look at how deterministic and non-deterministic destruction differ, consider the consequences of having a single destructor for both scenarios, and finally establish two simple guidelines that will help us avoid those consequences. In Part Two, we’ll go further and explore how we can still write solid destructors when circumstances dictate that the guidelines don’t apply.

Deterministic destruction

Destruction is deterministic when it is predictable, meaning the programmer can, simply by following the flow of the code, point out where and when an object’s destructor is invoked. This is possible with struct instances allocated on the stack, as the compiler will insert calls to their destructors at well-defined points for automatic and deterministic destruction.

There are two basic rules for automatic destruction:

  1. The destructors of all stack-allocated structs in a given scope are invoked when the scope exits.
  2. Destructors are invoked in reverse lexical order (i.e., the opposite of the order in which the declarations appear in the source).

With these two rules in mind, we can examine the following example and accurately predict its output.

import std.stdio;
struct Predictable
{
    int number;
    this(int n)
    {
        writeln("Constructor #", n);
        number = n;
    }
    ~this()
    {
        writeln("Destructor #", number);
    }
}

void main()
{
    Predictable s0 = Predictable(0);
    {
        Predictable s1 = Predictable(1);
    }
    Predictable s2 = Predictable(2);
}

We see that both s0 and s2 are directly within the scope of the main function, so their destructors will run when main exits. Given that the declaration of s2 comes after that of s0, the destructor of s2 will run before that of s0.

We also see that s1 is declared in an anonymous inner scope between the declarations of s0 and s2. This scope exits before s2 is constructed, so the destructor of s1 will execute before the constructor of s2.

With that, we can expect the following output:

Constructor #0      // declaration of s0
Constructor #1      // declaration of s1
Destructor #1       // anonymous scope exits, s1 destroyed
Constructor #2      // declaration of s2
Destructor #2       // main exits, s2 then s0 destroyed
Destructor #0

Compiling and executing the example proves us accurate seers.

The programmer can implement deterministic destruction manually, as is necessary when destroying instances allocated on the non-GC heap, e.g., with malloc or std.experimental.allocator. In an earlier post, Go Your Own Way (Part Two: The Heap), I covered how to use std.conv.emplace to allocate instances on the non-GC heap and briefly mentioned that destructors can be invoked manually via destroy. That’s a function template declared in the automatically imported object module so that it’s always available. We won’t retread the allocation discussion, but an example of manual destruction isn’t out of bounds for this post.

In the following example, we’ll reuse the definition of the Predictable struct and a destroyPredictable function to manually invoke the destructors. For completeness, I’ve included functions for allocating and deallocating Predictable instances from the non-GC heap: allocatePredictable and deallocatePredictable. If it isn’t clear to you what these two functions are doing, please read the blog post I mentioned above.

void main()
{
    Predictable* s0 = allocatePredictable(0);
    scope(exit) { destroyPredictable(s0); }
    {
        Predictable* s1 = allocatePredictable(1);
        scope(exit) { destroyPredictable(s1); }
    }
    Predictable* s2 = allocatePredictable(2);
    scope(exit) { destroyPredictable(s2); }
}

void destroyPredictable(Predictable* p)
{
    if(p) {
        destroy(*p);
        deallocatePredictable(p);
    }
}

Predictable* allocatePredictable(int n)
{
    import core.stdc.stdlib : malloc;
    import std.conv : emplace;
    auto p = cast(Predictable*)malloc(Predictable.sizeof);
    return emplace!Predictable(p, n);
}

void deallocatePredictable(Predictable* p)
{
    import core.stdc.stdlib : free;
    free(p);
}

Running this program will result in precisely the same output as the previous example. In the destroyPredictable function, we dereference the struct pointer when calling destroy because there is no overload that takes a pointer. There are specializations for classes, interfaces, and structs passed by reference and a general catch-all that takes all other types by reference. Destructors are invoked on types that have them. Before exiting, the function sets the argument to its default .init value through the reference.

Note that if we were to give destroy a pointer without first dereferencing it, the code would still compile. The pointer would be accepted by reference and simply set to null, the default .init value for pointers, but the struct’s destructor would not be invoked (i.e., the pointer is “destroyed”, not the struct instance).

Inserting writeln(*p) immediately after destroy(*p) should print

Predictable(0)

for each destroyed instance. (The default .init state for a struct in D is the aggregate of the .init property of each of its members; in this case, the sole member, being of type int, has an .init property of 0, so the struct’s default .init state is Predictable(0). This can be changed in the struct definition, e.g., struct Predictable { int id = 1; }.)

destroy is not restricted to instances allocated on the non-GC heap. Any aggregate type instance (struct, class, or interface) is a valid argument no matter where it was allocated.

Non-deterministic destruction

In languages with support for objects and a garbage collector, the responsibility for destroying object instances allocated on the GC heap falls to the GC. This is known as finalization. Before reclaiming an object’s memory, the GC finalizes the object by invoking its finalizer.

Finalization, though convenient, comes with a price. In Java’s particular circumstances, the price was determined to be so high that its maintainers deprecated the Object.finalize method and left a scary warning about its use in the documentation. It’s worth quoting here:

The finalization mechanism is inherently problematic. Finalization can lead to performance issues, deadlocks, and hangs. Errors in finalizers can lead to resource leaks; there is no way to cancel finalization if it is no longer necessary; and no ordering is specified among calls to finalize methods of different objects. Furthermore, there are no guarantees regarding the timing of finalization. The finalize method might be called on a finalizable object only after an indefinite delay, if at all.

Finalization in D isn’t quite the bugbear it is in Java, but we do see a less dramatic warning about it in the D documentation:

The garbage collector is not guaranteed to run the destructor for all unreferenced objects. Furthermore, the order in which the garbage collector calls destructors for unreferenced objects is not specified.

Although there’s no mention of “finalization” or “finalizers” here, that’s precisely what the text is referring to. The core message is the same in both warnings: finalization is non-deterministic and cannot be relied on.

Unlike structs, classes in D are reference types by default. Some consequences: the programmer never has direct access to the underlying class instance; instances declared uninitialized are null by default; the normal use case is to allocate instances via new. When a class is instantiated in D, it is usually going to be managed by the GC and its destructor will serve as a finalizer.

As an experiment, let’s change the definition of struct Predictable in our first example to class Unpredictable and use new to allocate the instances like so:

import std.stdio;
class Unpredictable
{
    int number;
    this(int n)
    {
        writeln("Constructor #", n);
        number = n;
    }
    ~this()
    {
        writeln("Destructor #", number);
    }
}

void main()
{
    Unpredictable s0 = new Unpredictable(0);
    {
        Unpredictable s1 = new Unpredictable(1);
    }
    Unpredictable s2 = new Unpredictable(2);
}

We’ll see that the output is drastically different:

Constructor #0
Constructor #1
Constructor #2
Destructor #0
Destructor #1
Destructor #2

Anyone familiar with the characteristics of the default DRuntime constructor can predict for this very simple program that all the destructors will be run when the GC’s cleanup function is executed as the D runtime shuts down, and that they will be executed in the order in which they were declared (an implementation detail; and note that destruction at shut down can be disabled via a command line argument). But in a more complex program, this ability to predict breaks down. Destructors can be invoked by the GC at almost any time and in any order.

To be clear, the GC will only perform its cleanup duties if and when it finds more memory is needed to fulfill a specific allocation request. In other words, it isn’t constantly running in the background, marking objects unreachable and calling destructors willy nilly. To that extent, we can predict when the GC has the possibility to perform its duties. Beyond that, all bets are off. We cannot predict with accuracy if any destructors will be invoked during any given allocation request or the order in which they will be invoked. This uncertainty has ramifications for how one implements destructors for any GC-managed type.

For starters, destructors of GC-managed objects should never perform any operation that can potentially result in a GC allocation request. Attempting to do so can result in an InvalidMemoryOperationError at run time. I use the word “potentially” because some operations can indirectly cause the error in certain circumstances, but not in others. Some examples: attempting to index an associative array can trigger an attempt to allocate a RangeError if the key is not present; a failed assert will result in allocation of an AssertError; calling any function not annotated with @nogc means GC operations are always possible in the call stack. These and any such operations should be avoided in the destructors of GC-managed objects. (The first seven items on the list of operations disallowed in @nogc functions are collectively a good guide.)

A larger issue is that one cannot rely on any resource still being valid when a destructor is called by the GC. Consider a class that attempts to close a socket handle in its destructor; it’s quite possible that the destructor won’t be called until after the program has already shutdown the network interface. There is no scenario in which the runtime can catch this. In the best case, such circumstances will result in a silent failure, but they could also result in crashes during program shutdown or even sooner.

What it comes down to is that GC-allocated objects should never be used to manage any resource that depends upon deterministic destruction for cleanup.

Designing for destruction

For the D neophyte, it can appear as if destructors in D are useless. Given that both struct and class instances can be allocated from memory that may or may not be managed by the GC, that destructors of GC-managed objects are not guaranteed to run, and that destructors are forbidden to perform GC operations during finalization, how can we ever rely on them?

In practice, it’s not as bad as it may seem. Issues do arise for the unwary, but armed with a basic awareness of the nature of D destructors, it turns out that it’s pretty easy to avoid having problems. This is especially true if the programmer adopts two fairly simple rules.

1. Pretend class destructors don’t exist

Class instances will nearly always be allocated with new. That means their destructors will nearly always be non-deterministic. Much of what one would want to do in a destructor is somehow dependent on program state: either the destructor itself expects a certain state (like writing to a log file that is expected to be open), or the program expects the destructor to have modified the program state in a specific manner (like releasing a resource handle).

Non-deterministic destruction means that all expectations about program state are thrown out the window. That log file might have already been closed, so the message will never be written (I hope it wasn’t important). That system resource handle may never be released until the program ends (I hope that particular resource isn’t scarce). Even if it seems through testing that a class destructor is working the way it’s intended, it’s quite likely down to the fact that the testing has not uncovered the case where it breaks. In a long-running program, that case will inevitably pop up at some point. Have fun debugging when your production game server starts randomly crashing.

So when using classes in D, pretend they have no destructors. Pretend that they are Java classes with a deprecated finalize method.

2. Don’t allocate structs on the GC heap when they have destructors

Since we’re pretending classes have no destructors, then we’re going to turn to structs for all of our destructive needs. Allocating structs as value objects on the stack will cover many use cases, but sometimes we may need to allocate them on the heap. When that situation arises, do not allocate any destructor-bearing struct with new. If we allocate a struct that has a destructor on the GC heap, it completely defeats our purpose of avoiding class destructors in the first place. That destructor we intended to be deterministic is now non-deterministic, so we may as well have just used a class.

As we have seen, struct instances can be allocated on the non-GC heap (e.g., with malloc) and their destructors manually invoked with the destroy function. If we need deterministic destruction and we absolutely must have a heap-allocated struct, then we cannot allocate that struct on the GC heap.

Guidelines schmuidelines

I’m sure someone is reading the above guidelines and thinking, “If I have to pretend that classes have no destructors, then why do classes have destructors?” Well, you don’t have to.

There is no One True Path to follow when deciding if an object should be implemented as a class or struct. Personally, I will always prefer structs over classes, and I will only reach for a class if I need something structs can’t give me easily (like a hierarchy) or efficiently. Other people will consider if the object they need to represent has an identity, e.g., an Actor in a simulation versus the Vertex that defines its 3D coordinates. POD (Plain Old Data) types should always be structs, but beyond that it’s largely a matter of preference.

My two guidelines are based on my experience and that of others with whom I’ve spoken. They are intended to help you keep the full implications of D’s distinction between classes and structs at the forefront of your thoughts when architecting your program. They are not commandments that every D programmer must follow.

Realistically, most D programmers will encounter circumstances at one time or another in which the guidelines do not apply. For example, when mixing GC-managed memory and manually-managed memory in the same program, it’s quite possible for a struct intended for stack use to wind up on the GC heap if the programmer is unaware of the circumstances. And some D programmers will always prefer classes over structs because that’s just the way they want it, and so will simply choose to ignore the guidelines. That’s no problem as long as they fully understand the consequences.

So what does that mean? How do you get over the non-deterministic nature of class destructors if your Actor class absolutely must have a destructor, or if you prefer to always follow The Way of the Class? How do you prevent structs intended for the stack from being GC-allocated? These are things we’ll be looking at in Part Two. See you there.

Thanks to Ali Çehreli and Max Haughton for their feedback. And to Adam D. Ruppe for his conversation on the topic in Discord and the title suggestion (it fits in more nicely with the series than the ‘Appetite For Destruction’ I had intended to go with)

Function Generation in D: The Good, the Bad, the Ugly, and the Bolt

Introduction

Digital Mars D logo

A while ago, Andrei Alexandrescu started a thread in the D Programming Language forums, titled “Perfect forwarding”, about a challenge which came up during the July 2020 beerconf:

Write an idiomatic template forward that takes an alias fun and defines (generates) one overload for each overload of fun.

Several people proposed solutions. In the course of the discussion, it arose that there is sometimes a need to alter a function’s properties, like adding, removing, or hanging user-defined attributes or parameter storage classes. This was exactly the problem I struggled with while trying to support all the bells and whistles of D functions in my openmethods library.

Eventually, I created a module that helps with this problem and contributed it to the bolts meta-programming library. But before we get to that, let’s first take a closer look at function generation in D.

The Good

I speculate that many a programmer who is a moderately advanced beginner in D would quickly come up with a mostly correct solution to the “Perfect forwarding” challenge, especially those with a C++ background who have an interest in performing all sorts of magic tricks by means of template meta-programming. The solution will probably look like this:

template forward(alias fun)
{
  import std.traits: Parameters;
  static foreach (
    ovl; __traits(getOverloads, __traits(parent, fun), __traits(identifier, fun))) {
    auto forward(Parameters!ovl args)
    {
      return ovl(args);
    }
  }
}

...

int plus(int a, int b) { return a + b; }
string plus(string a, string b) { return a ~ b; }

assert(forward!plus(1, 2) == 3);        // pass
assert(forward!plus("a", "b") == "ab"); // pass

This solution is not perfect, as we shall see, but it is not far off either. It covers many cases, including some that a beginner may not even be aware of. For example, forward handles the following function without dropping function attributes or parameter storage classes:

class Matrix { ... }

Matrix times(scope const Matrix a, scope const Matrix b) pure @safe
{
  return ...;
}

pragma(msg, typeof(times));
// pure @safe Matrix(scope const(Matrix) a, scope const(Matrix) b)

pragma(msg, typeof(forward!times));
// pure @safe Matrix(scope const(Matrix) _param_0, scope const(Matrix) _param_1)

It even handles user-defined attributes (UDAs) on parameters:

struct testParameter;

void testPlus(@testParameter int a, @testParameter int b);

pragma(msg, typeof(testPlus));
// void(@(testParameter) int a, @(testParameter) int b)

pragma(msg, typeof(forward!testPlus));
// void(@(testParameter) int a, @(testParameter) int b)

Speaking of UDAs, that’s one of the issues with the solution above: it doesn’t carry function UDAs. It also doesn’t work with functions that return a reference. Both issues are easy to fix:

template forward(alias fun)
{
  import std.traits: Parameters;
  static foreach (ovl; __traits(getOverloads, __traits(parent, fun), __traits(identifier, fun)))
  {
    @(__traits(getAttributes, fun)) // copy function UDAs
    auto ref forward(Parameters!ovl args)
    {
      return ovl(args);
    }
  }
}

This solution is still not 100% correct though. If the forwardee is @trusted, the forwarder will be @safe:

@trusted void useSysCall() { ... }

pragma(msg, typeof(&useSysCall));         // void function() @trusted
pragma(msg, typeof(&forward!useSysCall)); // void function() @safe

This happens because the body of the forwarder consists of a single statement calling the useSysCall function. Since calling a trusted function is safe, the forwarder is automatically deemed safe by the compiler.

The Bad

However, Andrei’s challenge was not exactly what we discussed in the previous section. It came with a bit of pseudocode that suggested the template should not be eponymous. In other words, I believe that the exact task was to write a template that would be used like this: forward!fun.fun(...). Here is the pseudocode:

// the instantiation of forward!myfun would be (stylized):

template forward!myfun
{
    void myfun(int a, ref double b, out string c)
    {
        return myfun(a, b, c);
    }
    int myfun(in string a, inout double b)
    {
        return myfun(a, b);
    }
}

Though this looks like a small difference, if we want to implement exactly this, a complication arises. In the eponymous forward, we did not need to create a new identifier; we simply used the template name as the function name. Thus, the function name was fixed. Now we need to create a function with a name that depends on the forwardee’s name. And the only way to do this is with a string mixin.

The first time I had to do this, I tried the following:

template forward(alias fun)
{
  import std.format : format;
  import std.traits: Parameters;
  enum name = __traits(identifier, fun);
  static foreach (ovl; __traits(getOverloads, __traits(parent, fun), name)) {
    @(__traits(getAttributes, fun))
    auto ref mixin(name)(Parameters!ovl args)
    {
      return ovl(args);
    }
  }
}

This doesn’t work because a string mixin can only be used to create expressions or statements. Therefore, the solution is to simply expand the mixin to encompass the entire function definition. The token-quote operator q{} is very handy for this:

template forward(alias fun)
{
  import std.format : format;
  import std.traits: Parameters;
  enum name = __traits(identifier, fun);
  static foreach (ovl; __traits(getOverloads, __traits(parent, fun), name)) {
    mixin(q{
        @(__traits(getAttributes, fun))
          auto ref %s(Parameters!ovl args)
        {
          return ovl(args);
        }
      }.format(name));
  }
}

Though string mixins are powerful, they are essentially C macros. For many D programmers, resorting to a string mixin can feel like a defeat.

Let us now move on to a similar, yet significantly more difficult, challenge:

Write a class template that mocks an interface.

For example:

interface JsonSerializable
{
  string asJson() const;
}

void main()
{
  auto mock = new Mock!JsonSerializable();
}

Extrapolating the techniques acquired during the previous challenge, a beginner would probably try this first:

class Mock(alias Interface) : Interface
{
  import std.format : format;
  import std.traits: Parameters;
  static foreach (member; __traits(allMembers, Interface)) {
    static foreach (fun; __traits(getOverloads, Interface, member)) {
      mixin(q{
          @(__traits(getAttributes, fun))
          auto ref %s(Parameters!fun args)
          {
            // record call
            static if (!is(ReturnType!fun == void)) {
              return ReturnType!fun.init;
            }
          }
        }.format(member));
    }
  }
}

Alas, this fails to compile, throwing errors like:

Error: function `challenge.Mock!(JsonSerializable).Mock.asJson` return type
inference is not supported if may override base class function

In other words, auto cannot be used here. We have to fall back to explicitly specifying the return type:

class Mock(alias Interface) : Interface
{
  import std.format : format;
  import std.traits: Parameters, ReturnType;
  static foreach (member; __traits(allMembers, Interface)) {
    static foreach (fun; __traits(getOverloads, Interface, member)) {
      mixin(q{
          @(__traits(getAttributes, fun))
          ReturnType!fun %s(Parameters!fun args)
          {
            // record call
            static if (!is(ReturnType!fun == void)) {
              return ReturnType!fun.init;
            }
          }
        }.format(member));
    }
  }
}

This will not handle ref functions though. What about adding a ref in front of the return type, like we did in the first challenge?

// as before
          ref ReturnType!fun %s(Parameters!fun args) ...

This will fail with all the functions in the interface that do not return a reference.

The reason why everything worked almost magically in the first challenge is that we called the wrapped function inside the template. It enabled the compiler to deduce almost all of the characteristics of the original function and copy them to the forwarder function. But we have no model to copy from here. The compiler will copy some of the aspects of the function (pure, @safe, etc.) to match those of the overriden function, but not some others (ref, const, and the other modifiers).

Then, there is the issue of the function modifiers: const, immutable, shared, and static. These are yet another category of function “aspects”.

At this point, there is no other option than to analyze some of the function attributes by means of traits, and convert them to a string to be injected in the string mixin:

      mixin(q{
          @(__traits(getAttributes, fun))
          %sReturnType!fun %s(Parameters!fun args)
          {
            // record call
            static if (!is(ReturnType!fun == void)) {
              return ReturnType!fun.init;
            }
          }
        }.format(
            (functionAttributes!fun & FunctionAttribute.const_ ? "const " : "")
          ~ (functionAttributes!fun & FunctionAttribute.ref_ ? "ref " : "")
          ~ ...,
          member));
    }

If you look at the implementation of std.typecons.wrap, you will see that part of the code deals with synthesizing bits of a string mixin for the storage classes and modifiers.

The Ugly

So far, we have looked at the function storage classes, modifiers, and UDAs, but we have merely passed the parameter list as a single, monolithic block. However, sometimes we need to perform adjustments to the parameter list of the generated function. This may seem far-fetched, but it does happen. I encountered this problem in my openmethods library. During the “Perfect forwarding” discussion, it appeared that I was not the only one who wanted to do this.

I won’t delve into the details of openmethods here (see an older blog post for an overview of the module); for the purpose of this article, it suffices to say that, given a function declaration like this one:

Matrix times(virtual!Matrix a, double b);

openmethods generates this function:

Matrix dispatcher(Matrix a, double b)
{
  return resolve(a)(a, b);
}

The virtual template is a marker; it indicates which parameters should be taken into account (i.e., passed to resolve) when picking the appropriate specialization of times. Note that only a is passed to the resolve function—that is because the first parameter uses the virtual! marker and the second does not.

Bear in mind, though, that dispatcher is not allowed to use the type of the parameters directly. Inside the openmethods module, there is no Matrix type. Thus, when openmethods is handed a function declaration, it needs to synthesize a dispatcher function that refers to the declaration’s parameter types exclusively via the declaration. In other words, it needs to use the ReturnType and Parameters templates from std.traits to extract the types involved in the declaration – just like we did in the examples above.

Let’s put aside function attributes and UDAs – we already discussed those in the previous section. The obvious solution then seems to be:

ReturnType!times dispatcher(
  RemoveVirtual!(Parameters!times[0]) a, Parameters!times[1] b)
{
  return resolve(a)(a, b);
}

pragma(msg, typeof(&dispatcher)); // Matrix function(Matrix, double)

where RemoveVirtual is a simple template that peels off the virtual! marker from the type.

Does this preserve parameter storage classes and UDAs? Unfortunately, it does not:

@nogc void scale(ref virtual!Matrix m, lazy double by);

@nogc ReturnType!scale dispatcher(RemoveVirtual!(Parameters!scale[0]) a, Parameters!scale[1] b)
{
  return resolve(a)(a, b);
}

pragma(msg, typeof(&dispatcher)); // void function(Matrix a, double b)

We lost the ref on the first parameter and the lazy on the second. What happened to them?

The culprit is Parameters. This template is a wrapper around an obscure feature of the is operator used in conjunction with the __parameters type specialization. And it is quite a strange beast. We used it above to copy the parameter list of a function, as a whole, to another function, and it worked perfectly. The problem is what happens when you try to process the parameters separately. Let’s look at a few examples:

pragma(msg, Parameters!scale.stringof); // (ref virtual!(Matrix), lazy double)
pragma(msg, Parameters!scale[0].stringof); // virtual!(Matrix)
pragma(msg, Parameters!scale[1].stringof); // double

We see that accessing a parameter individually returns the type… and discards everything else!

There is actually a way to extract everything about a single parameter: use a slice instead of an element of the paramneter pack (yes, this is getting strange):

pragma(msg, Parameters!scale[0..1].stringof); // (ref virtual!(Matrix))
pragma(msg, Parameters!scale[1..2].stringof); // (lazy double)

So this gives us a solution for handling the second parameter of scale:

ReturnType!scale dispatcher(???, Parameters!scale[1..2]) { ... }

But what can we put in place of ???. RemoveVirtual!(Parameters!scale[0..1]) would not work. RemoveVirtual expects a type, and Parameters!scale[1..2] is not a type—it is a sort of conglomerate that contains a type, and perhaps storage classes, type constructors, and UDAs.

At this point, we have no other choice but to construct a string mixin once again. Something like this:

mixin(q{
    %s ReturnType!(scale) dispatcher(
      %s RemoveVirtual!(Parameters!(scale)[1]) a,
      Parameters!(scale)[1..2] b)
    {
        resolve(a)(a, b);
    }
  }.format(
    functionAttributes!scale & FunctionAttribute.nogc ? "@nogc " : ""
    /* also handle other function attributes */,
    __traits(getParameterStorageClasses, scale, 0)));

pragma(msg, typeof(dispatcher)); // @nogc void(ref double a, lazy double)

This is not quite sufficient though, because it still doesn’t take care of parameter UDAs.

To Boltly Refract…

openmethods once contained kilometers of mixin code like the above. Such heavy use of string mixins was too ugly and messy, so much so that the project began to feel less like fun and more like work. So I decided to sweep all the ugliness under a neat interface, once and for all. The result was a “refraction” module, which I later carved out of openmethods and donated to Ali Akhtarzada’s excellent bolts library. bolts attempts to fill in the gaps and bring some regularity to D’s motley set of meta-programming features.

refraction’s entry point is the refract function template. It takes a function and an “anchor” string, and returns an immutable Function object that captures all the aspects of a function. Function objects can be used at compile-time. It is, actually, their raison d’être.

Function has a mixture property that returns a declaration for the original function. For example:

Matrix times(virtual!Matrix a, double b);
pragma(msg, refract!(times, "times").mixture);
// @system ReturnType!(times) times(Parameters!(times) _0);

Why does refract need the anchor string? Can’t the string "times" be inferred from the function by means of __traits(identifier...)? Yes, it can, but in real applications we don’t want to use this. The whole point of the library is to be used in templates, where the function is typically passed to refract via an alias. In general, the function’s name has no meaning in the template’s scope—or if, by chance, the name exists, it does not name the function. All the meta-expressions used to dissect the function must work in terms of the local symbol that identifies the alias.

Consider:

module matrix;

Matrix times(virtual!Matrix a, double b);

Method!times timesMethod; // openmethods creates a `Method` object for each
                          // declared method

module openmethods;

struct Method(alias fun)
{
    enum returnTypeMixture = refract!(fun, "fun").returnType;
    pragma(msg, returnTypeMixture);              // ReturnType!(fun)
    mixin("alias R = ", returnTypeMixture, ";"); // ok
    pragma(msg, R.stringof);                     // Matrix
}

There is no times and no Matrix in module openmethods. Even if they existed, they could not be the times function and the Matrix class from module matrix, as this would require a circular dependency between the two modules, something that D forbids by default. However, there is a fun symbol, and it aliases to the function; thus, the return type can be expressed as ReturnType!(fun).

All aspects of the function are available piecemeal. For example:

@nogc void scale(ref virtual!Matrix m, lazy double by);
pragma(msg, refract!(scale, "scale").parameters[0].storageClasses); // ["ref"]

Function also has methods that return a new Function object, with an alteration to one of the aspects. They can be used to create a variation of a function. For example:

pragma(msg,
  refract!(scale, "scale")
  .withName("dispatcher")
  .withBody(q{{ resolve(_0[0])(_0); }})
  .mixture
);

@nogc @system ReturnType!(scale) dispatcher(ref Parameters!(scale)[0] _0, lazy Parameters!(scale)[1] _1)
{
  resolve(_0[0])(_0);
}

This is the reason behind the name “refraction”: the module creates a blueprint of a function, performs some alterations on it, and returns a string—called a mixture—which, when passed to mixin, will create a new function.

openmethods needs to change the type of the first parameter while preserving storage classes. With bolts.experimental.refraction, this becomes easy:

original = refract!(scale, "scale");

pragma(msg,
  original
  .withName("dispatcher")
  .withParameters(
    [original.parameters[0].withType(
        "RemoveVirtual!(%s)".format(original.parameters[0].type)),
     original.parameters[1],
    ])
  .withBody(q{{
      return resolve(_0)(%s);
   }}.format(original.argumentMixture))
);

This time, the generated code splits the parameter pack into individual components:

@nogc @system ReturnType!(scale) dispatcher(
  ref RemoveVirtual!(Parameters!(scale)[0]) _0, Parameters!(scale)[1..2] _1)
{
  return resolve(_0)(_0);
}

Note how the first and second parameters are handled differently. The first parameter is cracked open because we need to replace the type. That forces us to access the first Parameters value via indexing, and that loses the storage classes, UDAs, etc. So they need to be re-applied explicitly.

On the other hand, the second parameter does not have this problem. It is not edited; thus, the Parameters slice trick can be used. The lazy is indeed there, but it is inside the parameter conglomerate.

Conclusion

Initially, D looked almost as good as Lisp for generating functions. As we tried to gain finer control of the generated function, our code started to look a lot more like C macros; in fact, in some respects, it was even worse: we had to put an entire function definition in a string mixin just to set its name.

This is due to the fact that D is not as “regular” a language as Lisp. Some of the people helming the evolution of D are working on changing this, and it is my hope that an improved D will emerge in the not-too-distant future.

In the meantime, the experimental refraction module from the bolts meta-programming library offers a saner, easier way of generating functions without compromising on the idiosyncrasies that come with them. It allows you to pretend that functions can be disassembled and reassembled at will, while hiding all the gory details of the string mixins that are necessarily involved in that task.


Jean-Louis Leroy is not French, but Belgian. He got his first taste of programming from a HP-25 calculator. His first real programming language was Forth, where CTFE is pervasive. Later he programmed (a little) in Lisp and Smalltalk, and (a lot) in C, C++, and Perl. He now works for Bloomberg LP in New York. His interests include object-relational mapping, open multi-methods, DSLs, and language extensions in general.

The ABC’s of Templates in D

D was not supposed to have templates.

Several months before Walter Bright released the first alpha version of the DMD compiler in December 2001, he completed a draft language specification in which he wrote:

Templates. A great idea in theory, but in practice leads to numbingly complex code to implement trivial things like a “next” pointer in a singly linked list.

The (freely available) HOPL IV paper, Origins of the D Programming Language, expands on this:

[Walter] initially objected to templates on the grounds that they added disproportionate complexity to the front end, they could lead to overly complex user code, and, as he had heard many C++ users complain, they had a syntax that was difficult to understand. He would later be convinced to change his mind.

It did take some convincing. As activity grew in the (then singular) D newsgroup, requests that templates be added to the language became more frequent. Eventually, Walter started mulling over how to approach templates in a way that was less confusing for both the programmer and the compiler. Then he had an epiphany: the difference between template parameters and function parameters is one of compile time vs. run time.

From this perspective, there’s no need to introduce a special template syntax (like the C++ style <T>) when there’s already a syntax for parameter lists in the form of (T). So a template declaration in D could look like this:

template foo(T, U) {
    // template members here
}

From there, the basic features fell into place and were expanded and enhanced over time.

In this article, we’re going to lay the foundation for future articles by introducing the most basic concepts and terminology of D’s template implementation. If you’ve never used templates before in any language, this may be confusing. That’s not unexpected. Even though many find D’s templates easier to understand than other implementations, the concept itself can still be confusing. You’ll find links at the end to some tutorial resources to help build a better understanding.

Template declarations

Inside a template declaration, one can nest any other valid D declaration:

template foo(T, U) {
    int x;
    T y;

    struct Bar {
        U thing;
    }

    void doSomething(T t, U u) {
        ...
    }
}

In the above example, the parameters T and U are template type parameters, meaning they are generic substitutes for concrete types, which might be built-in types like int, float, or any type the programmer might implement with a class or struct declaration. By declaring a template, it’s possible to, for example, write a single implementation of a function like doSomething that can accept multiple types for the same parameters. The compiler will generate as many copies of the function as it needs to accomodate the concrete types used in each unique template instantiation.

Other kinds of parameters are supported: value parameters, alias parameters, sequence (or variadic) parameters, and this parameters, all of which we’ll explore in future blog posts.

One name to rule them all

In practice, it’s not very common to implement templates with multiple members. By far, the most common form of template declaration is the single-member eponymous template. Consider the following:

template max(T) {
    T max(T a, T b) { ... }
}

An eponymous template can have multiple members that share the template name, but when there is only one, D provides us with an alternate template declaration syntax. In this example, we can opt for a normal function declaration that has the template parameter list in front of the function parameter list:

T max(T)(T a, T b) { ... }

The same holds for eponymous templates that declare an aggregate type:

// Instead of the longhand template declaration...
/* 
template MyStruct(T, U) {
    struct MyStruct { 
        T t;
        U u;
    }
}
*/

// ...just declare a struct with a type parameter list
struct MyStruct(T, U) {
    T t;
    U u;
}

Eponymous templates also offer a shortcut for instantiation, as we’ll see in the next section.

Instantiating templates

In relation to templates, the term instantiate means essentially the same as it does in relation to classes and structs: create an instance of something. A template instance is, essentially, a concrete implementation of the template in which the template parameters have been replaced by the arguments provided in the instantiation. For a template function, that means a new copy of the function is generated, just as if the programmer had written it. For a type declaration, a new copy of the declaration is generated, just as if the programmer had written it.

We’ll see an example, but first we need to see the syntax.

Explicit instantiations

An explicit instantiation is a template instance created by the programmer using the template instantiation syntax. To easily disambiguate template instantiations from function calls, D requires the template instantiation operator, !, to be present in explicit instantiations. If the template has multiple members, they can be accessed in the same manner that members of aggregates are accessed: using dot notation.

import std;

template Temp(T, U) {
    T x;
    struct Pair {
        T t;
        U u;
    }
}

void main()
{
    Temp!(int, float).x = 10;
    Temp!(int, float).Pair p;
    p.t = 4;
    p.u = 3.2;
    writeln(Temp!(int, float).x);
    writeln(p);            
}

Run it online at run.dlang.io

There is one template instantiation in this example: Temp!(int, float). Although it appears three times, it refers to the same instance of Temp, one in which T == int and U == float. The compiler will generate declarations of x and the Pair type as if the programmer had written the following:

int x;
struct Pair {
    int t;
    float u;
}

However, we can’t just refer to x and Pair by themselves. We might have other instantiations of the same template, like Temp!(double, long), or Temp(MyStruct, short). To avoid conflict, the template members must be accessed through a namespace unique to each instantiation. In that regard, Temp!(int, float) is like a class or struct with static members; just as you would access a static x member variable in a struct Foo using the struct name, Foo.x, you access a template member using the template name, Temp!(int, float).x.

There is only ever one instance of the variable x for the instantiation Temp!(int float), so no matter where you use it in a code base, you will always be reading and writing the same x. Hence, the first line of main isn’t declaring and initializing the variable x, but is assigning to the already declared variable. Temp!(int, float).Pair is a struct type, so that after the declaration Temp!(int, float).Pair p, we can refer to p by itself. Unlike x, p is not a member of the template. The type Pair is a member, so we can’t refer to it without the prefix.

Aliased instantiations

It’s possible to simplify the syntax by using an alias declaration for the instantiation:

import std;

template Temp(T, U) {
    T x;
    struct Pair {
        T t;
        U u;
    }
}
alias TempIF = Temp!(int, float);

void main()
{
    TempIF.x = 10;
    TempIF.Pair p = TempIF.Pair(4, 3.2);
    writeln(TempIF.x);
    writeln(p);            
}

Run it online at run.dlang.io

Since we no longer need to type the template argument list, using a struct literal to initialize p, in this case TempIF.Pair(3, 3.2), looks cleaner than it would with the template arguments. So I opted to use that here rather than first declare p and then initialize its members. We can trim it down still more using D’s auto attribute, but whether this is cleaner is a matter of preference:

auto p = TempIF.Pair(4, 3.2);

Run it online at run.dlang.io

Instantiating eponymous templates

Not only do eponymous templates have a shorthand declaration syntax, they also allow for a shorthand instantiation syntax. Let’s take the x out of Temp and rename the template to Pair. We’re left with a Pair template that provides a declaration struct Pair. Then we can take advantage of both the shorthand declaration and instantiation syntaxes:

import std;

struct Pair(T, U) {
    T t;
    U u;
}

// We can instantiate Pair without the dot operator, but still use
// the alias to avoid writing the argument list every time
alias PairIF = Pair!(int, float);

void main()
{
    PairIF p = PairIF(4, 3.2);
    writeln(p);            
}

Run it online at run.dlang.io

The shorthand instantiation syntax means we don’t have to use the dot operator to access the Pair type.

Even shorter syntax

When a template instantiation is passed only one argument, and the argument’s symbol is a single token (e.g., int as opposed to int[] or int*), the parentheses can be dropped from the template argument list. Take the standard library template function std.conv.to as an example:

void main() {
    import std.stdio : writeln;
    import std.conv : to;
    writeln(to!(int)("42"));
}

Run it online at run.dlang.io

std.conv.to is an eponymous template, so we can use the shortened instantiation syntax. But the fact that we’ve instantiated it as an argument to the writeln function means we’ve got three pairs of parentheses in close proximity. That sort of thing can impair readability if it pops up too often. We could move it out and store it in a variable if we really care about it, but since we’ve only got one template argument, this is a good place to drop the parentheses from the template argument list.

writeln(to!int("42"));

Whether that looks better is another case where it’s up to preference, but it’s fairly idiomatic these days to drop the parentheses for a single template argument no matter where the instantiation appears.

Not done with the short stuff yet

std.conv.to is an interesting example because it’s an eponymous template with multiple members that share the template name. That means that it must be declared using the longform syntax (as you can see in the source code), but we can still instantiate it without the dot notation. It’s also interesting because, even though it accepts two template arguments, it is generally only instantiated with one. This is because the second template argument can be deduced by the compiler based on the function argument.

For a somewhat simpler example, take a look at std.utf.toUTF8:

void main()
{    
    import std.stdio : writeln;
    import std.utf : toUTF8;
    string s1 = toUTF8("This was a UTF-16 string."w);
    string s2 = toUTF8("This was a UTF-32 string."d);
    writeln(s1);
    writeln(s2);
}

Run it online at run.dlang.io

Unlike std.conv.to, toUTF8 takes exactly one parameter. The signature of the declaration looks like this:

string toUTF8(S)(S s)

But in the example, we aren’t passing a type as a template argument. Just as the compiler was able to deduce to’s second argument, it’s able to deduce toUTF8’s sole argument.

toUTF8 is an eponymous function template with a template parameter S and a function parameter s of type S. There are two things we can say about this: 1) the return type is independent of the template parameter and 2) the template parameter is the type of the function parameter. Because of 2), the compiler has all the information it needs from the function call itself and has no need for the template argument in the instantiation.

Take the first call to the toUTF8 function in the declaration of s1. In long form, it would be toUTF8!(wstring)("blah"w). The w at the end of the string literal indicates it is of type wstring, with UTF-16 encoding, as opposed to string, with UTF-8 encoding (the default for string literals). In this situation, having to specify !(wstring) in the template instantiation is completely redundant. The compiler already knows that the argument to the function is a wstring. It doesn’t need the programmer to tell it that. So we can drop the template instantiation operator and the template argument list, leaving what looks like a simple function call. The compiler knows that toUTF8 is a template, knows that the template is declared with one type parameter, and knows that the type should be wstring.

Similarly, the d suffix on the literal in the initialization of s2 indicates a UTF-32 encoded dstring, and the compiler knows all it needs to know for that instantiation. So in this case also, we drop the template argument and make the instantiation appear as a function call.

It does seem silly to convert a wstring or dstring literal to a string when we could just drop the w and d prefixes and have string literals that we can directly assign to s1 and s2. Contrived examples and all that. But the syntax the examples are demonstrating really shines when we work with variables.

wstring ws = "This is a UTF-16 string"w;
string s = ws.toUTF8;
writeln(s);

Run it online at run.dlang.io

Take a close look at the initialization of s. This combines the shorthand template instantiation syntax with Uniform Function Call Syntax (UFCS) and D’s shorthand function call syntax. We’ve already seen the template syntax in this post. As for the other two:

  • UFCS allows using dot notation on the first argument of any function call so that it appears as if a member function is being called. By itself, it doesn’t offer much benefit aside from, some believe, readability. Generally, it’s a matter of preference. But this feature can seriously simplify the implementation of generic templates that operate on aggregate types and built-in types.
  • The parentheses in a function call can be dropped when the function takes no arguments, so that foo() becomes foo. In this case, the function takes one argument, but we’ve taken it out of the argument list using UFCS, so the argument list is now empty and the parentheses can be dropped. (The compiler will lower this to a normal function call, toUTF8(ws)—it’s purely programmer convenience.) When and whether to do this in the general case is a matter of preference. The big win, again, is in the simplification of template implementations: a template can be implemented to accept a type T with a member variable foo or a member function foo, or a free function foo for which the first argument is of type T.

All of this shorthand syntax is employed to great effect in D’s range API, which allows chained function calls on types that are completely hidden from the public-facing API (aka Voldemort types).

More to come

In future articles, we’ll explore the different kinds of template parameters, introduce template constraints, see inside a template instantiation, and take a look at some of the ways people combine templates with D’s powerful compile-time features in the real world. In the meantime, here are some template tutorial resources to keep you busy:

  • Ali Çehreli’s Programming in D is an excellent introduction to D in general, suitable even for those with little programming experience. The two chapters on templates (the first called ‘Templates’ and the second ‘More Templates’) provide a great introduction. (The online version of the book is free, but if you find it useful, please consider throwing Ali some love by buying the ebook or print version linked at the top of the TOC.)
  • More experienced programmers may find Phillipe Sigaud’s ‘D Template Tutorial’ a good read. It’s almost a decade old, but still relevant (and still free!). This tutorial goes beyond the basics into some of the more advanced template features. It includes a look at D’s compile-time features, provides a number of examples, and sports an appendix detailing the is expression (a key component of template constraints). It can also serve as a great reference when reading open source D code until you learn your way around D templates.

There are other resources, though other than my book ‘Learning D’ (this is a referral link that will benefit the D Language Foundation), I’m not aware of any that provide as much detail as the above. (And my book is currently not freely available). Eventually, we’ll be able to add this blog series to the list.

Thanks to Stefan Koch for reviewing this article.