D as a Better C


D was designed from the ground up to interface directly and easily to C, and to a lesser extent C++. This provides access to endless C libraries, the Standard C runtime library, and of course the operating system APIs, which are usually C APIs.

But there’s much more to C than that. There are large and immensely useful programs written in C, such as the Linux operating system and a very large chunk of the programs written for it. While D programs can interface with C libraries, the reverse isn’t true. C programs cannot interface with D ones. It’s not possible (at least not without considerable effort) to compile a couple of D files and link them in to a C program. The trouble is that compiled D files refer to things that only exist in the D runtime library, and linking that in (it’s a bit large) tends to be impractical.

D code also can’t exist in a program unless D controls the main() function, which is how the startup code in the D runtime library is managed. Hence D libraries remain inaccessible to C programs, and chimera programs (a mix of C and D) are not practical. One cannot pragmatically “try out” D by adding D modules to an existing C program.

That is, until Better C came along.

It’s been done before; it’s an old idea. Bjarne Stroustrup wrote a paper in 1988 entitled “A Better C”. His early C++ compiler was able to compile C code pretty much unchanged, and then one could start using C++ features here and there as they made sense, all without disturbing the existing investment in C. This was a brilliant strategy, and drove the early success of C++.

A more modern example is Kotlin, which uses a different method. Kotlin syntax is not compatible with Java, but it is fully interoperable with Java, relies on the existing Java libraries, and allows a gradual migration of Java code to Kotlin. Kotlin is indeed a “Better Java”, and this shows in its success.

D as Better C

D takes a radically different approach to making a better C. It is not an extension of C, it is not a superset of C, and does not bring along C’s longstanding issues (such as the preprocessor, array overflows, etc.). D’s solution is to subset the D language, removing or altering features that require the D startup code and runtime library. This is, simply, the charter of the -betterC compiler switch.

Doesn’t removing things from D make it no longer D? That’s a hard question to answer, and it’s really a matter of individual preference. The vast bulk of the core language remains. Certainly the D characteristics that are analogous to C remain. The result is a language somewhere in between C and D, but that is fully upward compatible with D.

Removed Things

Most obviously, the garbage collector is removed, along with the features that depend on the garbage collector. Memory can still be allocated the same way as in C – using malloc() or some custom allocator.
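
For illustration, here’s a minimal sketch (the names and details are mine, not from the article) of manual allocation in a -betterC program:

import core.stdc.stdlib : malloc, free;
import core.stdc.stdio : printf;

extern (C) int main() {
    // Allocate on the C heap, exactly as a C program would.
    int* values = cast(int*) malloc(10 * int.sizeof);
    if (values is null)
        return 1;

    foreach (i; 0 .. 10)
        values[i] = i * i;

    printf("%d\n", values[9]);
    free(values); // no GC, so we release the memory ourselves
    return 0;
}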

Although C++ classes and COM classes will still work, D polymorphic classes will not, as they rely on the garbage collector.

Exceptions, typeid, static construction/destruction, RAII, and unittests are removed. But it is possible we can find ways to add them back in.

Asserts are altered to call the C runtime library assert fail functions rather than the D runtime library ones.

(This isn’t a complete list, for that see http://dlang.org/dmd-windows.html#switch-betterC.)

Retained Things

More importantly, what remains?

What may be initially most important to C programmers is memory safety in the form of array overflow checking, no more stray pointers into expired stack frames, and guaranteed initialization of locals. This is followed by what is expected in a modern language — modules, function overloading, constructors, member functions, Unicode, nested functions, dynamic closures, Compile Time Function Execution, automated documentation generation, highly advanced metaprogramming, and Design by Introspection.
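
As a small, hedged sketch (the names are mine), here are a couple of those retained features working under -betterC: compile-time function execution, guaranteed initialization of locals, and bounds-checked array access.

import core.stdc.stdio : printf;

// CTFE: the compiler evaluates this call at compile time.
int factorial(int n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}

enum fact5 = factorial(5); // forces compile-time evaluation

extern (C) int main() {
    int[4] squares;                    // locals are guaranteed initialized (to 0)
    foreach (i; 0 .. squares.length)
        squares[i] = cast(int)(i * i); // accesses are bounds checked
    printf("5! = %d, last square = %d\n", fact5, squares[3]);
    return 0;
}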

Footprint

Consider a C program:

#include <stdio.h>

int main(int argc, char** argv) {
    printf("hello world\n");
    return 0;
}

It compiles to:

_main:
push EAX
mov [ESP],offset FLAT:_DATA
call near ptr _printf
xor EAX,EAX
pop ECX
ret

The executable size is 23,068 bytes.

Translate it to D:

import core.stdc.stdio;

extern (C) int main(int argc, char** argv) {
    printf("hello world\n");
    return 0;
}

The executable size is the same, 23,068 bytes. This is unsurprising because the C compiler and D compiler generate the same code, as they share the same code generator. (The equivalent full D program would clock in at 194Kb.) In other words, nothing extra is paid for using D rather than C for the same code.

The Hello World program is a little too trivial. Let’s step up in complexity to the infamous sieve benchmark program:

#include <stdio.h>

/* Eratosthenes Sieve prime number calculation. */

#define true    1
#define false   0
#define size    8190
#define sizepl  8191

char flags[sizepl];

int main() {
    int i, prime, k, count, iter;

    printf ("10 iterations\n");
    for (iter = 1; iter <= 10; iter++) {
        count = 0;
        for (i = 0; i <= size; i++)
            flags[i] = true;
        for (i = 0; i <= size; i++) {
            if (flags[i]) {
                prime = i + i + 3;
                k = i + prime;
                while (k <= size) {
                    flags[k] = false;
                    k += prime;
                }
                count += 1;
            }
        }
    }
    printf ("\n%d primes", count);
    return 0;
}

Rewriting it in Better C:

import core.stdc.stdio;

extern (C):

__gshared bool[8191] flags;

int main() {
    int count;

    printf("10 iterations\n");
    foreach (iter; 1 .. 11) {
        count = 0;
        flags[] = true;
        foreach (i; 0 .. flags.length) {
            if (flags[i]) {
                const prime = i + i + 3;
                auto k = i + prime;
                while (k < flags.length) {
                    flags[k] = false;
                    k += prime;
                }
                count += 1;
            }
        }
    }
    printf("%d primes\n", count);
    return 0;
}

It looks much the same, but some things are worthy of note:

  • extern (C): means use the C calling convention.
  • D normally puts static data into thread local storage. C sticks it in global storage. __gshared accomplishes that.
  • foreach is a simpler way of doing for loops over known endpoints.
  • flags[] = true; sets all the elements in flags to true in one go.
  • Using const tells the reader that prime never changes once it is initialized.
  • The types of iter, i, prime and k are inferred, preventing inadvertent type coercion errors.
  • The number of elements in flags is given by flags.length, not some independent variable.

And the last item leads to a very important hidden advantage: accesses to the flags array are bounds checked. No more overflow errors! We didn’t have to do anything in particular to get that, either.

This is only the beginning of how D as Better C can improve the expressivity, readability, and safety of your existing C programs. For example, D has nested functions, which in my experience work very well at prying goto’s from my cold, dead fingers.
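
For instance, where C code might jump to a cleanup label with goto, a nested function can hold the cleanup logic and still see the enclosing locals. A minimal sketch (the function and its names are hypothetical), which also compiles under -betterC:

import core.stdc.stdio : FILE, fopen, fclose;
import core.stdc.stdlib : malloc, free;

extern (C) int processFile(const(char)* path) {
    FILE* f = fopen(path, "r");
    if (f is null)
        return -1;
    int* buffer = cast(int*) malloc(1024 * int.sizeof);

    // The nested function replaces a "goto cleanup;" label and can
    // access f and buffer directly.
    int finish(int code) {
        if (buffer) free(buffer);
        fclose(f);
        return code;
    }

    if (buffer is null)
        return finish(-2);

    // ... the real work would go here ...

    return finish(0); // the success path shares the same cleanup
}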

On a more personal note, ever since -betterC started working, I’ve been converting many of my old C programs still in use into D, one function at a time. Doing it one function at a time, and running the test suite after each change, keeps the program in a correctly working state at all times. If the program doesn’t work, I only have one function to look at to see where it went wrong. I don’t particularly care to maintain C programs anymore, and with -betterC there’s no longer any reason to.

The Better C ability of D is available in the 2.076.0 beta: download it and read the changelog.

Don’t Fear the Reaper


D, like many other programming languages in active use today, comes with a garbage collector out of the box. There are many types of software that can be written without worrying at all about the GC, taking full advantage of its benefits. But the GC does have drawbacks, and there are certainly scenarios in which garbage collection is undesirable. For those situations, the language provides the means to temporarily disable it, or even avoid it completely.

In order to maximize the positive impacts of garbage collection and minimize any negative, it’s necessary to have a good grounding in how the GC works in D. A good place to start is the Garbage Collection page at dlang.org, which outlines the rationale for D’s GC and provides some tips on working with it. This post is the first in a series intended to expand on the information provided on that page.

This time, we’ll look at the very basics, focusing on the language features that can trigger GC allocations. Future posts will introduce ways to disable the GC when necessary, as well as idioms useful in dealing with its nondeterministic nature (e.g. managing resources in the destructors of GC-managed objects).

The first thing to understand about D’s garbage collector is that it only runs during allocation, and only if there is no memory available to allocate. It isn’t sitting in the background, periodically scanning the heap and collecting garbage. This knowledge is fundamental to writing code that makes efficient use of GC-managed memory. Take the following example:

void main() {
    int[] ints;
    foreach(i; 0..100) {
        ints ~= i;
    }
}

This declares a dynamic array of int, then uses D’s append operator to append the numbers 0 to 99 in a foreach range loop. What isn’t obvious to the untrained eye is that the append operator makes use of the GC heap to allocate space for the values it adds to the array.

DRuntime’s array implementation isn’t a dumb one. In the example, there aren’t going to be one hundred allocations, one for each value. When more memory is needed, the implementation will allocate more space than is requested. In this particular case, it’s possible to determine how many allocations are actually made by making use of the capacity property of D’s dynamic arrays and slices. This returns the total number of elements the array can hold before an allocation is necessary.

void main() {
    import std.stdio : writefln;
    int[] ints;
    size_t before, after;
    foreach(i; 0..100) {
        before = ints.capacity;
        ints ~= i;
        after = ints.capacity;
        if(before != after) {
            writefln("Before: %s After: %s",
                before, after);
        }
    }
}

Executing this when compiled with DMD 2.073.2 shows the message is printed six times, meaning there were six total GC allocations in the loop. That means there were six opportunities for the GC to collect garbage. In this small example, it almost certainly didn’t. If this loop were part of a larger program, with GC allocations throughout, it very well might.

On a side note, it’s informative to examine the values of before and after. Doing so shows a sequence of 0, 3, 7, 15, 31, 63, and 127. So by the end, ints contains 100 values and has space for appending 27 more before the next allocation, which, extrapolating from the values in the sequence, should be 255. That’s an implementation detail of DRuntime, though, and could be tweaked or changed in any release. For more details on how arrays and slices are managed by the GC, take a look at Steven Schveighoffer’s excellent article on the topic.

So, six allocations, six opportunities for the GC to initiate one of its pauses of unpredictable length even in that simple little loop. In the general case, that could become an issue depending on if the loop is in a hot part of code and how much total memory is allocated from the GC heap. But even then, it’s not necessarily a reason to disable the GC in that part of the code.

Even with languages that don’t come with a stock GC out of the box, like C and C++, most programmers learn at some point that it’s better for overall performance to allocate as much as possible up front and minimize allocations in the inner loops. It’s one of the many types of premature optimization that are not actually the root of all evil, something we tend to call best practice. Given that D’s GC only runs when memory is allocated, the same strategy can be applied as a simple way to mitigate any potential impact it could have on performance. Here’s one way to rewrite the example:

void main() {
    int[] ints = new int[](100);
    foreach(i; 0..100) {
        ints[i] = i;
    }
}

Now we’ve gone from six allocations down to one. The only opportunity the GC has to run is at the new expression before the loop. This actually allocates space for at least 100 values and initializes them all to 0 before entering the loop. The array will have a length of 100 after new, but will almost certainly have additional capacity.
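
A quick way to see that (a small check of my own, not from the original post):

void main() {
    import std.stdio : writeln;

    int[] ints = new int[](100);
    writeln(ints.length);    // 100
    writeln(ints.capacity);  // at least 100; the exact value is an
                             // implementation detail of DRuntime
}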

There’s an alternative to new for arrays: the reserve function:

void main() {
    int[] ints;
    ints.reserve(100);
    foreach(i; 0..100) {
        ints ~= i;
    }
}

This allocates memory for at least 100 values, but the array is still empty (its length property will return 0) when it returns, so nothing is default initialized. Given that the loop only appends 100 values, it’s guaranteed not to allocate.

In addition to new and reserve, it’s possible to call GC.malloc directly for explicit allocation.

import core.memory;
void* intsPtr = GC.malloc(int.sizeof * 100);
auto ints = (cast(int*)intsPtr)[0 .. 100];

Array literals will usually allocate.

auto ints = [0, 1, 2];

This is also true when an array literal enum is used.

enum intsLiteral = [0, 1, 2];
auto ints1 = intsLiteral;
auto ints2 = intsLiteral;

An enum exists only at compile time and has no memory address. Its name is a synonym for its value. Everywhere you use one, it’s like copying and pasting its value in place of its name. Both ints1 and ints2 trigger allocations exactly as if they were declared like so:

auto ints1 = [0, 1, 2];
auto ints2 = [0, 1, 2];

Array literals do not allocate when the target is a static array. Also, string literals (strings in D are arrays under the hood) are an exception to the rule.

int[3] noAlloc1 = [0, 1, 2];
auto noAlloc2 = "No Allocation!";

The concatenate operator will always allocate:

auto a1 = [0, 1, 2];
auto a2 = [3, 4, 5];
auto a3 = a1 ~ a2;

D’s associative arrays have their own allocation strategy, but you can expect them to allocate when items are added and potentially when removed. They also expose two properties, keys and values, which both allocate arrays and fill them with copies of the respective items. When it’s desirable to modify the underlying associative array during iteration, or when the items need to be sorted or otherwise manipulated independently of the associative array, these properties are just what the doctor ordered. Otherwise, they’re needless allocations that put undue pressure on the GC.

When the GC does run, the total amount of memory it needs to scan is going to determine how long it takes. The smaller, the better. Avoiding unnecessary allocations isn’t going to hurt anyone and is another good mitigation strategy. D’s associative arrays provide three properties that help do just that: byKey, byValue, and byKeyValue. These each return forward ranges that can be iterated lazily. They do not allocate because they actually refer to the items in the associative array, so it should not be modified while iterating them. For more on ranges, see the chapters titled Ranges and More Ranges in Ali Çehreli’s Programming in D.
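
A short sketch of the difference (the data here is made up):

void main() {
    import std.stdio : writeln;

    int[string] prices = ["apple": 3, "pear": 2];

    // keys and values each allocate a new array on the GC heap.
    auto ks = prices.keys;     // a freshly allocated string[]
    auto vs = prices.values;   // a freshly allocated int[]

    // byKey, byValue, and byKeyValue return lazy ranges over the
    // items themselves; no allocation takes place.
    foreach (kv; prices.byKeyValue)
        writeln(kv.key, ": ", kv.value);
}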

Closures, which are delegates or function literals that need to carry around a pointer to the local stack frame, may also allocate. The last allocating language feature listed on the Garbage Collection page is the assertion. An assertion will allocate when it fails because it needs to throw an AssertError, which is part of D’s class-based exception hierarchy (we’ll look at how classes interact with the GC in a future post).
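
Here’s a minimal illustration of the closure case (the names are mine):

// The delegate returned here outlives makeCounter, so the compiler
// copies the enclosing stack frame to the GC heap: a closure allocation.
int delegate() makeCounter() {
    int count;
    return () { return ++count; };
}

void main() {
    auto next = makeCounter();
    assert(next() == 1);
    assert(next() == 2);
}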

Then there’s Phobos, D’s standard library. Once upon a time, much of Phobos was implemented with little concern for GC allocations, making it difficult to use in situations where they were undesirable. However, a massive effort was initiated to make it more conservative in its GC usage. Some functions were made to work with lazy ranges, others were rewritten to take buffers supplied by the caller, and some were reimplemented to avoid unnecessary allocations internally. The result is a standard library more amenable to GC-free code (though there are still probably some corners of the library that have not yet been renovated — PRs welcome).

Now that we’ve seen the basics of using the GC, the next post in this series looks at the tools the language and the compiler provide for turning the GC off and making sure specific sections of code are GC-free.

Thanks to Guillaume Piolat and Steven Schveighoffer for their help with this article.

Editable and Runnable Doc Examples on dlang.org


Sebastian Wilzbach was a GSoC student for the D Language Foundation in 2016 and has since become a regular contributor to Phobos, D’s standard library, and dlang.org.


This article explains the steps that were needed to have editable and runnable examples in the documentation on dlang.org. First, let’s begin with the building blocks.

Unit testing in D

One of D’s coolest features is its unittest block, which allows the insertion of testable code anywhere in a program. It has become idiomatic for a function to be followed directly by its tests. For example, let’s consider a simple function add which is accompanied by two tests:

auto add(int a, int b)
{
    return a + b;
}

unittest
{
    assert(2.add(2) == 4);
    assert(3.add(4) == 7);
}

By default, all unittest blocks will be ignored by the compiler. Specifying -unittest on the compiler’s command line will cause the unit tests to be included in the compiled binary. Combined with -main, tests in D can be directly executed with:

rdmd -main -unittest add.d

If a unittest block is annotated with embedded documentation, a D documentation generator can also display the tests as examples in the generated documentation. The DMD compiler ships with a built-in documentation generator (DDoc), which can be run with the -D flag, so executing:

dmd -D -main add.d

would yield the documentation of the add function above with its tests displayed as examples.

Please note that the documentation on dlang.org is generated with DDoc. However, in case you don’t like DDoc, there are several other options.

Executing code on the web

Frequent readers of the D Blog might remember Damian Ziemba’s DPaste – an online compiler for the D Programming language. In 2012, he made the examples on the front page of D’s website runnable via his service. Back in those old days, the website of the D Programming language had a very different look.

As a matter of fact, until 2015, communication with DPaste was done in XML.

Putting things together

So D has a unit test system that allows placing executable unit tests next to the functions they test, the tests can also be rendered as examples in the generated documentation, and there exists a service, in the form of DPaste, that allows D code to be executed on the web. The only thing missing was to link them together to produce interactive documentation for a compiled language.

There was one big caveat that needed to be addressed before that could happen. While D’s test suite, which is run on ten different build machines, ensures that all unit tests compile & run without errors, an extracted test might contain symbols that were imported at module scope and thus wouldn’t be runnable on dlang.org. A unittest block can only be completely independent of the module in which it is declared if all of its symbols are imported locally in the test’s scope. The solution was rather simple: extract all tests from Phobos, then compile and execute them separately to ensure that a user won’t hit a “missing import” error on dlang.org. Thanks to D’s ultra-fast frontend, this step takes less than a minute on a typical machine in single-core build mode.
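
To make that concrete, here’s a small sketch (module and symbol names are mine) of the difference between a test that depends on a module-scope import and one that imports everything it needs locally:

module example;

import std.algorithm : canFind; // module-scope import

/// Compiles inside this module, but extracted on its own it would
/// fail with a "missing import" error for canFind.
unittest
{
    assert([1, 2, 3].canFind(2));
}

/// Imports what it needs locally, so it can be extracted and run
/// independently, e.g. in the dlang.org editor.
unittest
{
    import std.algorithm : canFind;
    assert([1, 2, 3].canFind(2));
}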

Moreover, to prevent any regressions, this has been integrated into Phobos’s test suite and is run for every PR via CircleCI. As Phobos has extensive coverage with unit tests, we started this transition with a blacklist and, step-by-step, removed modules for which all extracted tests compile. With continuous checking in place, we were certain that none of the exposed unit tests would throw any errors when executed in the documentation, so we could do the flip and replace the syntax-highlighted unit test examples with an interactive code editor.

Going one step further

With this setup in place, hitting the “Run” button would merely show the users “All tests passed”. While that’s always good feedback, it conveys less information than is usually desirable.

Documentation that supports runnable examples tends to send any output to stdout. This allows the reader to take the example and modify it as needed while still seeing useful output about the modifications. So, for example, instead of using assertions to validate the output of a function, which is idiomatic in D unit tests and examples:

assert(myFun() == 4);

Other documentation usually prints to stdout and shows the expected output in a comment. In D, that would look like this:

writeln(myFun()); // 4

I initially tried to do such a transformation with regular expressions, but I was quickly bitten by the complexity of a context-free language. So I made another attempt using Brian Schott’s libdparse, a library to parse and lex D source code. libdparse allows one to traverse the abstract syntax tree (AST) of a D source file. During the traversal of the AST, the transformation tool can rewrite all AssertExpressions into writeln calls, similar to the way other documentation displays examples. To speak in the vocabulary of compiler devs: we are lowering AssertExpressions into the more humanly digestible writeln calls!

Once the AST has been traversed and modified, it needs to be transformed into source code again. This led to improvements in libdparse’s formatting capabilities (1, 2).

The future

As of now, there are still a small number of functions in Phobos that don’t have a nice public example that is runnable on dlang.org. Tooling to check for this has recently been activated in Phobos. So now you can use this tool (make -f posix.mak has_public_example) to find functions lacking public tests and remove those modules from the blacklist.

Another target for improvement is DPaste. For example, it currently doesn’t cache incoming requests, which could improve the performance of executed examples on dlang.org. However, due to the fast compilation speed of the DMD compiler, this “slow-down” isn’t noticeable and is more of a perfectionist wish.

I hope you enjoy the new “Run” button on the documentation and have as much fun playing with it as I do. Click here to get started.

A New Import Idiom


Daniel Nielsen is an Embedded Software Engineer. He is currently using D in his spare time for an unpublished Roguelike and warns that he “may produce bursts of D Evangelism”.


I remember one day in my youth, before the dawn of Internet, telling my teachers about “my” new algorithm, only to learn it had been discovered by the ancient Greeks in ~300 BC. This is the story of my life and probably of many who are reading this. It is easy to “invent” something; being the first, not so much!

Anyway, this is what all the fuss is about this time:

template from(string moduleName)
{
  mixin("import from = " ~ moduleName ~ ";");
}

The TL;DR version: A new idiom to achieve even lazier imports.

Before the C programmers start running for the hills, please forget you ever got burned by C++ templates. The above snippet doesn’t look that complicated, now does it? If you enjoy inventing new abstractions, take my advice and give D a try. Powerful, yet an ideal beginner’s language. No need to be a template archwizard.

Before we proceed further, I’d like to call out Andrei Alexandrescu for identifying that there is a problem which needs solving. Please see his in-depth motivation in DIP 1005. Many thanks also to Dominikus Dittes Scherkl, who helped trigger the magic spark by making his own counter proposal and questioning if there really is a need to change the language specification in order to obtain Dependency-Carrying Declarations (DIP 1005).

D, like many modern languages, has a fully fledged module system where symbols are directly imported (unlike the infamous C #include). This has ultimately resulted in the widespread use of local imports, limiting the scope as much as possible, in preference to the somewhat slower and less maintainable module-level imports:

// A module-level import
import std.datetime;
  
void fun(SysTime time)
{
  import std.stdio; // A local import
  ...
}

Similar lazy import idioms are possible in other languages, for instance Python.

The observant among you might notice that because SysTime is used as the type of a function parameter, std.datetime must be imported at module level. Which brings us to the point of this blog post (and DIP 1005). How can we get around that?

void fun(from!"std.datetime".SysTime time)
{
  import std.stdio;
  ...
}

There you have it, the Scherkl-Nielsen self-important lookup.

In order to fully understand what’s going on, you may need to learn some D-isms. Let’s break it down.

  1. When instantiating a template (via the ! operator), if the TemplateArgument is one token long, the parentheses can be omitted from the template parameters. So from!"std.datetime" is the same as from!("std.datetime"). It may seem trivial, but you’d be surprised how much readability is improved by avoiding ubiquitous punctuation noise.
  2. Eponymous templates. The declaration of a template looks like this:
    template y() {
        int x;
    }

    With that, you have to type y!().x in order to reach the int. Oh, ze horror! Is that a smiley? Give me x already! That’s exactly what eponymous templates accomplish:

    template x() {
        int x;
    }

    Now that the template and its only member have the same name, x!().x can be shortened to simply x.

  3. Renamed imports allow accessing an imported module via a user-specified namespace. Here, std.stdio is imported normally:
    void printSomething(string s) {
        import std.stdio;
        writeln(s);           // The normal way
        std.stdio.writeln(s); // An alternative using the fully qualified
                              // symbol name, for disambiguation
    }

    Now it’s imported and renamed as io:

    void printSomething(string s) {
        import io = std.stdio;
        io.writeln(s);         // Must be accessed like this.
        writeln(s);            // Error
        std.stdio.writeln(s);  // Error
    }

    Combining what we have so far:

    template dt() {
        import dt = std.datetime; 
    }
    void fun(dt!().SysTime time) {}

    It works perfectly fine. The only thing which remains is to make it generic.

  4. String concatenation is achieved with the ~ operator.
    string hey = "Hello," ~ " World!";
    assert(hey == "Hello, World!");
  5. String mixins put the power of a compiler writer at your fingertips. Let’s generate code at compile time, then compile it. This is typically used for domain-specific languages (see Pegged for one prominent use of a DSL in D), but in our simple case we only need to generate one single statement based on the name of the module we want to import. Putting it all together, we get the final form, allowing us to import any symbol from any module inline:
    template from(string moduleName)
    {
      mixin("import from = " ~ moduleName ~ ";");
    }

In the end, is it all really worth the effort? Using one comparison made by Jack Stouffer:

import std.datetime;
import std.traits;

void func(T)(SysTime a, T value) if (isIntegral!T)
{
    import std.stdio : writeln;
    writeln(a, value);
}

Versus:

void func(T)(from!"std.datetime".SysTime a, T value)
    if (from!"std.traits".isIntegral!T)
{
    import std.stdio : writeln;
    writeln(a, value);
}

In this particular case, the total compilation time dropped to ~30% of the original, while the binary size dropped to ~41% of the original.

What about the linker, I hear you cry? Sure, it can remove unused code. But it’s not always as easy as it sounds, in particular due to module constructors (think __attribute__((constructor))). In either case, it’s always more efficient to avoid generating unused code in the first place rather than removing it afterwards.
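
To make the module constructor point concrete, here’s a small sketch (the module is invented for illustration). Any program that transitively imports it must run its shared static this() before main(), which is why the linker can’t simply drop the module as unused:

module config;

__gshared int defaultVerbosity;

// A module constructor: runs before main() in every program that
// imports this module, directly or indirectly.
shared static this()
{
    defaultVerbosity = 1;
}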

So this combination of D features was waiting there to be used, but somehow no one had stumbled on it before. I agreed with the need Andrei identified for Dependency-Carrying Declarations, yet I wanted even more. I wanted Dependency-Carrying Expressions. My primary motivation comes from being exposed to way too much legacy C89 code.

void foo(void)
{
#ifdef XXX /* needed to silence unused variable warnings */
  int x;
#endif
... lots of code ...
#ifdef XXX
  x = bar();
#endif
}

Variables or modules, in the end they’re all just symbols. For the same reason C99 allowed declaring variables in the middle of functions, one should be allowed to import modules where they are first used. D already allows importing anywhere in a scope, but not in declarations or expressions. It was with this mindset that I saw Dominikus Dittes Scherkl’s snippet:

fun.ST fun()
{
   import someModule.SomeType;
   alias ST = SomeType;
   ...
}

Clever, yet for one thing it doesn’t adhere to the DRY principle. Still, it was that tiny dot in fun.ST which caused the spark. There it was again, the Dependency-Carrying Expression of my dreams.

Criteria:

  • It must not require repeating fun, since that causes problems when refactoring
  • It must be lazy
  • It must be possible today with no compiler updates

Templates are the poster children of lazy constructs; they don’t generate any code until instantiated. So that seemed a good place to start.

Typically when using eponymous templates, you would have the template turn into a function, type, variable or alias. But why make the distinction? Once again, they’re all just symbols in the end. We could have used an alias to the desired module (see Scherkl’s snippet above); using the renamed imports feature is just a short-cut for import and alias. Maybe it was this simplified view of modules that made me see more clearly.

Now then, is this the only solution? No. As a challenge to the reader, try to figure out what this does and, more importantly, its flaw. Can you fix it?

static struct STD
{
  template opDispatch(string moduleName)
  {
    mixin("import opDispatch = std." ~ moduleName ~ ";");
  }
}

 

Testing In The D Standard Library


Jack Stouffer is a member of the Phobos team and contributor to dlang.org. You can check out more of his writing on his blog.


In the D standard library, colloquially named Phobos, we take a multi-pronged approach to testing and code review. Currently, there are five different services any addition has to go through:

  1. The whole compiler chain of tests: DMD’s and DRuntime’s test suites, and Phobos’s unit tests
  2. A documentation builder
  3. Coverage analysis
  4. A style checker
  5. And a community project builder/test runner

Using these, we’re able to automatically catch the vast majority of common problems that we see popping up in PRs. And we make regressions much less likely by running the full test suite and examining coverage reports.

Hopefully this will provide some insight into how a project like a standard library can use testing in order to increase stability. Also, it can spark some ideas on how to improve your own testing and review process.

Unit Tests

In D, unit tests are an integrated part of the language rather than a library feature:

size_t sum(int[] a)
{
    size_t result;

    foreach (e; a)
    {
        result += e;
    }

    return result;
}

unittest
{
    assert(sum([1, 2, 3]) == 6);
    assert(sum([0, 50, 100]) == 150);
}

void main() {}

Save this as test.d and run dmd -unittest -run test.d. Before your main function is run, all of the unittest blocks will be executed. If any of the asserts fail, execution is terminated and an error is printed to stderr.

The effect of putting unit tests in the language has been enormous. One of the main ones we’ve seen is tests no longer have the “out of sight, out of mind” problem. Comprehensive tests in D projects are the rule and not the exception. Phobos dogfoods inline unittest blocks and uses them as its complete test suite. There are no other tests for Phobos than the inline tests, meaning for a reviewer to check their changes, they can just run dmd -main -unittest -run std/algorithm/searching.d (this is just for quick and dirty tests; full tests are done via make).

Every PR onto Phobos runs the inline unit tests, DMD’s tests, and the DRuntime tests on the following platforms:

  • Windows 32 and 64 bit
  • MacOS 32 and 64 bit
  • Linux 32 and 64 bit
  • FreeBSD 32 and 64 bit

This is done by Brad Roberts’s auto-tester. As a quick aside, work is currently progressing to bring D to iOS and Android.

Idiot Proof

In order to avoid pulling untested PRs, we have three mechanisms in place. First, only PRs which have at least one Github review by someone with pull rights can be merged.

Second, we don’t use the normal button for merging PRs. Instead, once a reviewer is satisfied with the code, we tell the auto-tester to merge the PR if and only if all tests have passed on all platforms.

Third, every single change to any of the official repositories has to go through the PR review process. This includes changes made by the BDFL Walter Bright and the Language Architect Andrei Alexandrescu. We have even turned off pushing directly to the master branch in Github to make sure that nothing gets around this.

Unit Tests and Examples

Unit tests in D solve the perennial problem of out of date docs by using the unit test code itself as the example code in the documentation. This way, documentation examples are part of the test suite rather than just some comment which will go out of date.

With this format, if the unit test goes out of date, then the test suite fails. When the tests are updated, the docs change automatically.

Here’s an example:

/**
 * Sums an array of `int`s.
 * 
 * Params:
 *      a = the array to sum
 * Returns:
 *      The sum of the array.
 */
size_t sum(int[] a)
{
    size_t result;

    foreach (e; a)
    {
        result += e;
    }

    return result;
}

///
unittest
{
    assert(sum([1, 2, 3]) == 6);
    assert(sum([0, 50, 100]) == 150);
}

// only tests with a doc string above them get included in the
// docs
unittest
{
    assert(sum([100, 100, 100]) == 300);
}

void main() {}

Run dmd -D test.d and it generates un-styled HTML documentation, with the documented unittest block rendered as an example beneath the function.

Phobos uses this to great effect. The vast majority of examples in the Phobos documentation are from unittest blocks. For example, here is the documentation for std.algorithm.find and here is the unit test that generates that example.

This is not a catch-all approach. Wholesale example programs, which are very useful when introducing a complex module or function, still have to be in comments.

Protecting Against Old Bugs

Despite our best efforts, bugs do find their way into released code. When they do, we require the person who’s patching the code to add in an extra unit test underneath the buggy function in order to protect against future regressions.
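
As a sketch of what that looks like in practice (the bug and the test here are invented for illustration), the regression test sits right below the function alongside the existing tests:

size_t sum(int[] a)
{
    size_t result;

    foreach (e; a)
    {
        result += e;
    }

    return result;
}

// Regression test for a (hypothetical) past bug where an empty
// input was mishandled; it guards against the bug coming back.
unittest
{
    int[] empty;
    assert(sum(empty) == 0);
}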

Docs

For Phobos, the documentation pages which were changed are generated on a test server for every PR. Developed by Vladimir Panteleev, the DAutoTest allows reviewers to compare the old page and the new page from one location.

For example, this PR changed the docs for two structs and their member functions. This page on DAutoTest shows a summary of the changed pages with links to view the final result.

Coverage

Perfectly measuring the effectiveness of a test suite is impossible, but we can get a good rough approximation with test coverage. For those unaware, coverage is a ratio which represents the number of lines of code that were executed during a test suite vs. lines that weren’t executed.

DMD has built-in coverage analysis to work in tandem with the built-in unit tests. Instead of dmd -unittest -run main.d, do dmd -unittest -cov -run main.d and a file will be generated showing a report of how many times each line of code was executed with a final coverage ratio at the end.

We generate this report for each PR. Also, we use codecov in order to get details on how well the new code is covered, as well as how coverage for the whole project has changed. If coverage for the patch is lower than 80%, then codecov marks the PR as failed.

At the time of writing, of the 77,601 lines of code (not counting docs or whitespace) in Phobos, 68,549 were covered during testing and 9,052 lines were not. This gives Phobos a test coverage of 88.3%, which is increasing all of the time. This is all achieved with the built in unittest blocks.

Project Tester

Because test coverage doesn’t necessarily “cover” all real world use cases and combinations of features, D uses a Jenkins server to download, build, and run the tests for a select number of popular D projects with the master branches of Phobos, DRuntime, and DMD. If any of the tests fail, the reviewers are notified.

Style And Anti-Pattern Checker

Having a code style set from on high stops a lot of pointless Internet flame wars (tabs vs spaces anyone?) dead in their tracks. D has had such a style guide for a couple of years now, but its enforcement in official code repos was spotty at best, and was mostly limited to brace style.

Now, we use CircleCI in order to run a series of bash scripts and the fantastically helpful dscanner which automatically checks for all sorts of things you shouldn’t be doing in your code. For example, CircleCI will give an error if it finds:

  • bad brace style
  • trailing whitespace
  • using whole module imports inside of functions
  • redundant parentheses

And so on.

The automation of the style checker and coverage reports was done by Sebastian Wilzbach. dscanner was written by Brian Schott.

Closing Thoughts

We’re still working to improve some things. Currently, Sebastian is writing a script to automatically check the documentation of every function for at least one example. Plus, the D Style Guide can be expanded to end arguments over the formatting of template constraints and other contested topics.

Practically speaking, other than getting the coverage of Phobos up to >= 95%, there’s not too much more to do. Phobos is one of the most thoroughly tested projects I’ve ever worked on, and it shows. Just recently, Phobos hit under 1000 open bugs, and that’s including enhancement requests.