Author Archives: Michael Parker

New D Compiler Release: DMD 2.075.0

DMD 2.075.0 was released a few days back. As with every release, the changelog is available so you can browse the list of fixed bugs and new features. 2.075.0 can be fetched from the dlang.org download page, which always makes available the latest DMD release alongside a nightly build.

Notable Changes

Every DMD release brings with it a number of bug fixes, changes, and enhancements. Here are some of the more noteworthy changes in this release.

Two array properties removed

Anyone who does a lot of work with D’s ranges will likely have encountered this little annoyance that arises from the built-in .sort property of arrays.

void main()
{
    import std.algorithm : remove, sort;
    import std.array : array;
    int[] nums = [5, 3, 1, 2, 4];
    nums = nums.sort.remove(2).array;
}

The .sort property has been deprecated for ages, so the above would result in the following error:

sorted.d(6): Deprecation: use std.algorithm.sort instead of .sort property

The workaround would be to add an empty set of parentheses to the sort call. With DMD 2.075.0, this is no longer necessary and the above will compile. Both the .sort and .reverse array properties have finally been removed from the language.

For the uninitiated, D has two features that have proven convenient in the functional pipeline programming style typically used with ranges. One is that parentheses on a function call are optional when there are no parameters. The other is Universal Function Call Syntax (UFCS), which allows a function call to be made using the dot notation on the first argument, so that a function int add(int a, int b) can be called as: 10.add(5).

Each of D’s built-in types comes with a set of built-in properties. Given that the built-in properties are not functions, no parentheses are used to access them. The .sort array property has been around since the early days of D1. At the time, it was rather useful and convenient for anyone who was happy with the default implementation. When D2 came along with the range paradigm, the standard library was given a set of functions that can treat arrays as ranges, opening them up to use with the many range based functions in the std.algorithm package and elsewhere.

With optional parentheses, UFCS, and a range-based function in std.algorithm called sort, conflict was inevitable. Now range-based programmers can put that behind them and take one more pair of parentheses out of their pipelines.

The breaking up of std.datetime

The std.datetime module has had a reputation as the largest module in D’s standard library. Some developers have been known to use it a stress test for their tooling. It was added to the library long before D got the special package module feature, which allows multiple modules in a package to be imported as a single module.

Once package modules were added, Jonathan M. Davis, the original std.datetime developer, found it challenging to split the monolith into multiple modules. Then, at DConf 2017, he could be seen toiling away on his laptop in the conference hall and the hotel lobby. On the final day of the conference, the day of the DConf Hackathon, he announced that std.datetime was now a package. DMD 2.075.0 is the first release where the new module structure is available.

Any existing code using the old module should still compile. However, any static libraries or object files lying around with the old symbols stuffed inside may need to be recompiled.

Colorized compiler messages

This one is missing from the changelog. DMD now has the ability to output colorized messages. The implementation required going through the existing error messages and properly annotating them where appropriate, so there may well be some messages for which the colors are missing. Also, given that this is a brand new feature and people can be picky about their terminal colors, more work will likely be done on this in the future. Perhaps that might include support for customization.

 

Compiler Ddoc documentation online

DMD, though originally written in C++, was converted to D some time ago. Now that more D programmers are able to contribute to the compiler, work has gone into documenting its source using D’s built-in Ddoc syntax. The result is now online, accessible from the sidebar of the existing library reference. A good starting point is the ddmd.mars module.

And more…

The above is a small part of the bigger picture. The bugfix list shows 89 bugs, regressions, and enhancements across the compiler, runtime, standard library, and web site. See the full changelog for the details.

Thanks to everyone who contributed to this release, whether it was by reporting issues, submitting or reviewing pull requests, testing out the beta, or carrying out any of the numerous small tasks that help a new release see the light of day.

Go Your Own Way (Part One: The Stack)

This is my third post in the GC series. In the first post, I introduced D’s garbage collector and the language features that require it, and touched on simple strategies to use it effectively. In the second post, I showed off the tools provided by the language and library to disable or prohibit the GC in specific parts of a code base, how to use the compiler to assist in that endeavor, and recommended that D programs be written initially to embrace the GC, taking advantage of simple strategies to mitigate its impact, and later tuned to avoid it or further optimize its usage only when profiling shows it’s warranted.

When garbage collection is turned off via GC.disable or prevented by the @nogc function annotation, memory will still need to be allocated from somewhere. And even when the GC is fully embraced, it’s still desirable to minimize the size and quantity of GC heap allocations. That means allocating either via the stack, or via the non-GC heap. The focus of this post is the former. Non-GC heap allocations will be covered in my next post in this series.

Stack allocation

The simplest allocation strategy in D is the same as it is in C: avoid the heap and use the stack whenever possible. When a local array is needed and the size can be known at compile time, use a static rather than a dynamic array. Structs, which are value types and stack-allocated by default, should be preferred where possible over classes, which are reference types and are usually allocated from one heap or another. D’s compile-time features can present opportunities here that might not otherwise be available.

Static arrays

Static array declarations in D require the length to be known at compile-time.

// OK
int[10] nums;

// Error: variable x cannot be read at compile time
int x = 10;
int[x] err;

Unlike dynamic arrays, static arrays can be initialized with array literals with no allocation taking place on the GC heap. The lengths must match, otherwise the compiler will emit an error.

@nogc void main() {
    int[3] nums = [1, 2, 3];
}

Static arrays are automatically sliced when passed to any function taking a slice as a parameter, making them interchangeable with dynamic arrays.

void printNums(int[] nums) {
    import std.stdio : writeln;
    writeln(nums);
}

void main() {
    int[]  dnums = [0, 1, 2];
    int[3] snums = [0, 1, 2];
    printNums(dnums);
    printNums(snums);
}

When compiling with -vgc to find the potential GC allocations in a program and eliminate them where possible, this is an easy win. Just be wary of situations like the following:

int[] foo() {
    auto nums = [0, 1, 2];

    // Do work with nums...

    return nums;
}

Converting nums in this example to a static array would be a mistake. The return statement in that case would be returning a slice to stack-allocated memory, which is a programming error. Luckily, doing so will generate a compiler error.

On the other hand, if the return is conditional, it may be desirable to heap-allocate the array only when absolutely necessary rather than every time the function is called. In that scenario, a static array can be declared locally and a dynamic copy made on return. Enter the .dup property:

int[] foo() {
    int[3] nums = [0, 1, 2];
    
    // Let x = the result of some work with nums
    bool condtion = x;

    if(condition) return nums.dup;
    else return [];
}

This function still uses the GC via .dup, but only allocates if it needs to and avoids allocation when it doesn’t. Note that [] is equivalent to null in this case, a slice (or dynamic array) with a .length of 0 and a .ptr of null.

Structs vs. classes

Struct instances in D are allocated on the stack by default, but can be allocated on the heap when desired. Stack-allocated structs have deterministic destruction, with their destructors called as soon as the enclosing scope exits.

struct Foo {
    int x;
    ~this() {
        import std.stdio;
        writefln("#%s says bye!", x);
    }
}
void main() {
    Foo f1 = Foo(1);
    Foo f2 = Foo(2);
    Foo f3 = Foo(3);
}

As expected, this prints:

#3 says bye!
#2 says bye!
#1 says bye!

Classes, being reference types, are almost always allocated on the heap. Usually, that’s the GC heap via new, though it could also be the non-GC heap through a custom allocator. But there’s no rule saying they can’t be allocated on the stack. The standard library template std.typecons.scoped allows us to easily do so.

class Foo {
    int x;

    this(int x) { 
        this.x = x; 
    }
    
    ~this() {
        import std.stdio;
        writefln("#%s says bye!", x);
    }
}
void main() {
    import std.typecons : scoped;
    auto f1 = scoped!Foo(1);
    auto f2 = scoped!Foo(2);
    auto f3 = scoped!Foo(3);
}

Functionally, this is identical to the struct example above; it prints the same results. Deterministic destruction is achieved via the core.object.destroy function, which allows destructors to be called outside of GC collections.

Note that neither scoped nor destroy are currently usable in @nogc functions. This isn’t necessarily a problem, as a function doesn’t have to be annotated such to avoid the GC, but it can be a headache if you are trying to fit everything into a @nogc call tree. In future posts, we’ll look at some of the design issues that crop up when using @nogc and how to avoid them.

Generally, when implementing custom types in D, the choice between struct and class should be dependent on the type’s intended usage. POD types are obvious candidates for struct, whereas for types in something like a GUI system, where inheritance heirarchies and runtime interfaces are extremely useful, class is a more appropriate choice. Beyond those obvious cases, there are a number of other considerations which could be the focus of a separate blog post on the topic. For our purposes, just keep in mind that whether or not a type is implemented as a struct or a class need not always dictate whether or not instances can be allocated on the stack.

alloca

Given that D makes alloca available, it is also an option for stack allocation. This is a candidate especially for arrays when you want to avoid or eliminate a local GC allocation, but the array size is only known at run time. The following example allocates a dynamic array with a runtime size on the stack.

import core.stdc.stdlib : alloca;

void main() {
    size_t size = 10;
    void* mem = alloca(size);

    // Slice the memory block
    int[] arr = cast(int[])mem[0 .. size];
}

The same caution about using alloca in C applies here: be careful not to blow up the stack. And as with local static arrays, don’t return a slice of arr. Return arr.dup instead.

A simple example

Consider an implementation of a Queue data type. An idiomatic implementation in D is going to be a struct that’s templated on the type it’s intended to contain. In Java, collection usage is interface heavy and it’s recommended to declare an instance using an interface type rather than the implementation type. Structs in D can’t implement interfaces, but in many cases they can still be used to program to interfaces thanks to Design by Introspection (DbI). This allows programming to a common interface that is verified via compile-time introspection without the need for an interface type, so it can work with structs, classes and, thanks to Universal Function Call Syntax (UFCS), even free functions (when the functions are in scope).

D’s arrays are an obvious choice as the backing store for a Queue implementation. Moreover, there’s an opportunity to make the backing store a static array when a queue is intended to be bounded with a fixed size. Since it’s already a templated type, an additional parameter, a template value parameter with a default value can easily be added to decide at compile time if the array should be static or not and, if so, how much space it should require.

// A default Size of 0 means to use a dynamic array for the
// backing store; non-zero indicates a static array.
struct Queue(T, size_t Size = 0) 
{
    // This constant will be inferred as a boolean. By making it
    // public, a DbI template outside of this module can determine
    // whether or not the Queue might grow. 
    enum isFixedSize = Size > 0;

    void enqueue(T item) 
    {
        static if(isFixedSize) {
            assert(_itemCount < _items.length);
        }
        else {
            ensureCapacity();
        }
        push(item);
    }

    T dequeue() {
        assert(_itemCount != 0);
        static if(isFixedSize) {
            return pop();
        }
        else {
            auto ret = pop();
            ensurePacked();
            return ret;
        }
    }

    // Only available on a growable array
    static if(!isFixedSize) {
        void reserve(size_t capacity) { /* Allocate space for new items */ }
    }

private:   
    static if(isFixedSize) {
        T[Size] _items;     
    }
    else T[] _items;
    size_t _head, _tail;
    size_t _itemCount;

    void push(T item) { 
        /* Add item, update _head and _tail */
        static if(isFixedSize) { ... }
        else { ... }
    }

    T pop() { 
        /* Remove item, update _head and _tail */ 
        static if(isFixedSize) { ... }
        else { ... }
    }

    // These are only available on a growable array
    static if(!isFixedSize) {
        void ensureCapacity() { /* Alloc memory if needed */ }
        void ensurePacked() { /* Shrink the array if needed */}
    }
}

With this, the client can declare instances like so:

Queue!Foo qUnbounded;
Queue!(Foo, 128) qBounded;

qBounded requires no heap allocations. What happens with qUnbounded depends on the implementation. Moreover, compile-time introspection can be used to test if an instance is a fixed size or not. The isFixedSize constant is a convenience for that. Clients could alternatively use the built-in __traits(hasMember, T, "reserve") or the standard library function std.traits.hasMember!T("reserve") in one compile-time construct or another (__traits and std.traits are great for DbI, and the latter should be preferred when it provides similar functionality), but including the constant in the type is more convenient.

void doSomethingWithQueueInterface(T)(T queue)
{
    static if(T.isFixedSize) { ... }
    else { ... }
}

Conclusion

This has been a brief overview of a few options for stack allocation in D to avoid allocations from the GC heap. Making use of them when possible is an easy way to minimize the size and quantity of GC allocations, a proactive strategy for mitigating potential negative performance impacts from garbage collection.

The next post in this series will cover some of the options available for non-GC heap allocations.

Project Highlight: Derelict

Previous project highlights on this blog were written up both in my own words and in quotes from the project maintainers. This time around is different — it would be a little odd to quote myself while writing about my own project.

Derelict is a collection of D bindings to C libraries. In its present incarnation, it resides in a GitHub organization called DerelictOrg. There you’ll find bindings to a number of libraries such as SDL, GLFW, SFML, OpenGL, OpenAL, FreeType, FreeImage and more. There are currently 26 repositories in the organization: one for the documentation, one for a utility package, 21 active packages and three that are currently unsupported, maintained primarily by me and Guillaume Piolat.

The Derelict packages primarily provide dynamic bindings, though some can be configured as static bindings at compile time. Dynamic bindings require the C libraries to which they bind to be dynamically loaded at run time. The mechanism for this is provided by the DerelictUtil package. Loading a library is as simple as a function call, e.g. DerelictSDL2.load();.

Static bindings have a link-time dependency, requiring either statically or dynamically linking with the bound C library when building an executable. In that case, the dependency on DerelictUtil goes away (some packages may still use a few DerelictUtil declarations, requiring one of its modules to be imported, but it need not be linked into the program). On Windows, this introduces potential issues with object file formats, but these days they are quite easy to solve.

It’s hard to believe now, but Derelict has been around continuously, in one form or another, since March of 2004. I was first drawn to D because it fit almost perfectly between the two languages I had the most experience with back in 2003, C and Java. It addressed some of the frustrations I had with each while combining things I loved from both. But early on I ran into my first frustration with D.

At the time, interfacing with C libraries on Windows was a bit annoying. D has excellent support for and compatibility with C libraries in the language, but the object file format issues I alluded to above were frustrating. DMD on Windows could only output object files in the OMF format and could only use the OPTLINK linker, an ancient 32-bit linker that only understands OMF. This is because DMD was implemented on top of backend and tool chain used by the DMC, the Digital Mars C and C++ compiler. Meanwhile, the rest of the Microsoft ecosystem was (and is) using the COFF format. In practice, a static binding to a C library required either using tools to convert the object files or libraries from OMF to COFF, or compiling the C library with DMC. You don’t tend to find support for DMC in most build scripts.

When I found that the bindings for the libraries I wanted to use (SDL and OpenGL) were static, I resolved to create my own. I’m a big fan of dynamic loading, so it was a no-brainer to create dynamic bindings. When you load a DLL via the LoadLibrary/GetProcAddress API, the object file format is irrelevant. From that point on, I never had to worry about the COFF/OMF problem again. Even though the problem didn’t exist outside of Windows, I made them cross platform anyway to keep a consistent interface.

In those days, DMD 1.0.0 wasn’t yet a thing, but a new web site, dsource.org, had just been launched to host open source D projects (much of the site is still online in archive mode thanks to Vladimir Panteleev). All of the projects there used Subversion, so I decided to learn my way around it and take a stab at maintaining an open source project by making my new bindings available. Part of my motivation for picking “Derelict” as the title is because I fully expected no one else would be interested and, whether they were or not, I would eventually abandon it anyway.

As it turned out, people really were interested. Contributions started coming in almost immediately, along with questions and suggestions on the DSource forums. I added new bindings, set up criteria on new binding submissions (they must be gamedev-related and cross-platform), and added documentation on how to create “Derelictified” bindings. Tomasz Stachowiak (now a Frostbite game engine developer) came along and started making contributions, including a templated loader in DerelictUtil that replaced the one I had hacked together and became the basis for Derelict2.

Eventually, with the dawning of the D2 era, D development moved away from DSource to GitHub. So did Derelict, in the guise of Derelict 3. I had made use of every D build tool that came along before then, but finally settled on a custom build script (written in D) for this iteration, since the build tools all died. When DUB came along and looked like it was here to stay, I fully committed to it. I took the opportunity to finally split up the monolithic repository, gave every package its own repo, and created DerelictOrg as their new home. Aside from the appalling lack of documentation (a state which is rapidly changing, but is still a WIP), things have been fairly stable.

If I were still counting from Derelict 3, I think we might be at Derelict 6 by now, but that’s not how it rolls anymore. Since the move to DerelictOrg, I’ve twice made iterative improvements to DerelictUtil that broke binary compatibility with previous releases. The first time around, I didn’t properly manage the roll out. Anyone building the Derelict libraries manually, i.e. when they weren’t using DUB to manage their application project, could run into issues where one package used the new version and another used the older. It seems like it ought to have been a mess, but I heard very little about it. Still, in the most recent overhaul, I took steps to keep the new stuff distinctly separate from the old stuff and rolled it all out at once (which is why most of the latest releases as I write have a -beta suffix).

As D has evolved over the years, so has Derelict. Now that DMD supports COFF on Windows, some of the packages can now be configured at compile time as a static binding. Long ago, DerelictUtil gained the ability to selectively ignore missing symbols (curiously missing from the latest docs, but described in the old DerelictUtil Wiki) and more recently gained a feature that enables a package implementation to support multiple versions of a C library via a configurable call to the loader (SharedLibVersion), which only a handful of packages support and few more likely will (due to API compatibility issues with type sizes or enum values, not every package can). The latest version of DerelictGL3 is massively configurable at compile time. For a little while, I even relaxed my criteria for allowing new packages under the DerelictOrg umbrella, but now after I ended up being wholly responsible for them that’s something I’m not inclined to continue. With the DUB registry, it’s not really necessary.

There are a number of C library bindings using DerelictUtil out in the wild. You can find them, as well as all of the DerelictOrg packages, at the DUB package registry. Some of the third-party bindings have a name like “derelict-extras-foo”, but others use “derelict-foo” like those in DerelictOrg. You can also find a collection of static bindings in the Deimos organization, and third-party static bindings in the DUB registry that have adopted the Deimos approach of providing C headers along side the D modules.

Anyone who needs help with any of the Derelict packages can often find it in the #D irc channel at freenode.net. I drop by infrequently, but I monitor the D forums every day. Asking for Derelict help in the Learn forum is not off-topic.

The Derelict bindings have gone well beyond targeting game developers. Though they’re often used in games (like Voxelman and the first D game on Steam, Mayhem Intergalactic), they are also used elsewhere (like DLangUI). If you’re using any of the Derelict bindings in your own projects, I’d love to hear about it. Particularly since I’m always looking for projects I’ve never heard about to highlight here on the blog.

Life in the Fast Lane

The first post I wrote in the GC series introduced the D garbage collector and the language features that use it. Two key points that I tried to get across in the article were:

  1. The GC can only run when memory allocations are requested. Contrary to popular misconception, the D GC isn’t generally going to decide to pause your Minecraft clone in the middle of the hot path. It will only run when memory from the GC heap is requested, and then only if it needs to.
  2. Simple C and C++ allocation strategies can mitigate GC pressure. Don’t allocate memory in inner loops – preallocate as much as possible, or fetch it from the stack instead. Minimize the total number of heap allocations. These strategies work because of point #1 above. The programmer can dictate when it is possible for a collection to occur simply by being smart about when GC heap allocations are made.

The strategies in point #2 are fine for code that a programmer writes herself, but they aren’t going to help at all with third-party libraries. For those situations, D provides built-in mechanisms to guarantee that no GC allocations can occur, both in the language and the runtime. There are also command-line options that can help make sure the GC stays out of the way.

Let’s imagine a hypothetical programmer named J.P. who, for reasons he considers valid, has decided he would like to avoid garbage collection completely in his D program. He has two immediate options.

The GC chill pill

One option is to make a call to GC.disable when the program is starting up. This doesn’t stop allocations, but puts a hold on collections. That means all collections, including any that may result from allocations in other threads.

void main() {
    import core.memory;
    import std.stdio;
    GC.disable;
    writeln("Goodbye, GC!");
}

Output:

Goodbye, GC!

This has the benefit that all language features making use of the GC heap will still work as expected. But, considering that allocations are still going without any cleanup, when you do the math you’ll realize this might be problematic. If allocations start to get out of hand, something’s gotta give. From the documentation:

Collections may continue to occur in instances where the implementation deems necessary for correct program behavior, such as during an out of memory condition.

Depending on J.P.’s perspective, this might not be a good thing. But if this constraint is acceptable, there are some additional steps that can help keep things under control. J.P. can make calls to GC.enable or GC.collect as necessary. This provides greater control over collection cycles than the simple C and C++ allocation strategies.

The GC wall

When the GC is simply intolerable, J.P. can turn to the @nogc attribute. Slap it at the front of the main function and thou shalt suffer no collections.

@nogc
void main() { ... }

This is the ultimate GC mitigation strategy. @nogc applied to main will guarantee that the garbage collector will never run anywhere further along the callstack. No more caveats about collecting “where the implementation deems necessary”.

At first blush, this may appear to be a much better option than GC.disable. Let’s try it out.

@nogc
void main() {
    import std.stdio;
    writeln("GC be gone!");
}

This time, we aren’t going to get past compilation:

Error: @nogc function 'D main' cannot call non-@nogc function 'std.stdio.writeln!string.writeln'

What makes @nogc tick is the compiler’s ability to enforce it. It’s a very blunt approach. If a function is annotated with @nogc, then any function called from inside it must also be annotated with @nogc. As may be obvious, writeln is not.

That’s not all:

@nogc 
void main() {
    auto ints = new int[](100);
}

The compiler isn’t going to let you get away with that one either.

Error: cannot use 'new' in @nogc function 'D main'

Any language feature that allocates from the GC heap is out of reach inside a function marked @nogc (refer to the first post in this series for an overview of those features). It’s turtles all the way down. The big benefit here is that it guarantees that third-party code can’t use those features either, so can’t be allocating GC memory behind your back. Another downside is that any third-party library that is not @nogc aware is not going to be available in your program.

Using this approach requires a number of workarounds to make up for non-@nogc language features and library functions, including several in the standard library. Some are trivial, some are not, and others can’t be worked around at all (we’ll dive into the details in a future post). One example that might not be obvious is throwing an exception. The idiomatic way is:

throw new Exception("Blah");

Because of the new in that line, this isn’t possible in @nogc functions. Getting around this requires preallocating any exceptions that will be thrown, which in turn runs into the issue that any exception memory allocated from the regular heap still needs to be deallocated, which leads to ideas of reference counting or stack allocation… In other words, it’s a big can of worms. There’s currently a D Improvement Proposal from Walter Bright intended to stuff all the worms back into the can by making throw new Exception work without the GC when it needs to.

It’s not an insurmountable task to get around the limitations of @nogc main, it just requires a good bit of motivation and dedication.

One more thing to note about @nogc main is that it doesn’t banish the GC from the program completely. D has support for static constructors and destructors. The former are executed by the runtime before entering main and the latter upon exiting. If any of these exist in the program and are not annotated with @nogc, then GC allocations and collections can technically be present in the program. Still, @nogc applied to main means there won’t be any collections running once main is entered, so it’s effectively the same as having no GC at all.

Working it out

Here’s where I’m going to offer an opinion. There’s a wide range of programs that can be written in D without disabling or cutting the GC off completely. The simple strategies of minimizing GC allocations and keeping them out of the hot path will get a lot of mileage and should be preferred. It can’t be repeated enough given how often it’s misunderstood: D’s GC will only have a chance to run when the programmer allocates GC memory and it will only run if it needs to. Use that knowledge to your advantage by keeping the allocations small, infrequent, and isolated outside your inner loops.

For those programs where more control is actually needed, it probably isn’t going to be necessary to avoid the GC entirely. Judicious use of @nogc and/or the core.memory.GC API can often serve to avoid any performance issues that may arise. Don’t put @nogc on main, put it on the functions where you really want to disallow GC allocations. Don’t call GC.disable at the beginning of the program. Call it instead before entering a critical path, then call GC.enable when leaving that path. Force collections at strategic points, such as between game levels, with GC.collect.

As with any performance tuning strategy in software development, it pays to understand as fully as possible what’s actually happening under the hood. Adding calls to the core.memory.GC API in places where you think they make sense could potentially make the GC do needless work, or have no impact at all. Better understanding can be achieved with a little help from the toolchain.

The DRuntime GC option --DRT-gcopt=profile:1 can be passed to a compiled program (not to the compiler!) for some tune-up assistance. This will report some useful GC profiling data, such as the total number of collections and the total collection time.

To demonstrate, gcstat.d appends twenty values to a dynamic array of integers.

void main() {
    import std.stdio;
    int[] ints;
    foreach(i; 0 .. 20) {
        ints ~= i;
    }
    writeln(ints);
}

Compiling and running with the GC profile switch:

dmd gcstat.d
gcstat --DRT-gcopt=profile:1
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
        Number of collections:  1
        Total GC prep time:  0 milliseconds
        Total mark time:  0 milliseconds
        Total sweep time:  0 milliseconds
        Total page recovery time:  0 milliseconds
        Max Pause Time:  0 milliseconds
        Grand total GC time:  0 milliseconds
GC summary:    1 MB,    1 GC    0 ms, Pauses    0 ms <    0 ms

This reports one collection, which almost certainly happened as the program was shutting down. The runtime terminates the GC as it exits which, in the current implementation, will generally trigger a collection. This is done primarily to run destructors on collected objects, even though D does not require destructors of GC-allocated objects to ever be run (a topic for a future post).

DMD supports a command-line option, -vgc, that will display every GC allocation in a program, including those that are hidden behind language features like the array append operator.

To demonstrate, take a look at inner.d:

void printInts(int[] delegate() dg)
{
    import std.stdio;
    foreach(i; dg()) writeln(i);
} 

void main() {
    int[] ints;
    auto makeInts() {
        foreach(i; 0 .. 20) {
            ints ~= i;
        }
        return ints;
    }

    printInts(&makeInts);
}

Here, makeInts is an inner function. A pointer to a non-static inner function is not a function pointer, but a delegate (a context pointer/function pointer pair; if an inner function is static, a pointer of type function is produced instead). In this particular case, the delegate makes use of a variable in its parent scope. Here’s the output of compiling with -vgc:

dmd -vgc inner.d
inner.d(11): vgc: operator ~= may cause GC allocation
inner.d(7): vgc: using closure causes GC allocation

What we’re seeing here is that memory needs to be allocated so that the delegate can carry the state of ints, making it a closure (which is not itself a type – the type is still delegate). Move the declaration of ints inside the scope of makeInts and recompile. You’ll find that the closure allocation goes away. A better option is to change the declaration of printInts to look like this:

void printInts(scope int[] delegate() dg)

Adding scope to any function parameter ensures that any references in the parameter cannot be escaped. In other words, it now becomes impossible to do something like assign dg to a global variable, or return it from the function. The effect is that there is no longer a need to create a closure, so there will be no allocation. See the documentation for more on function pointers, delegates and closures, and function parameter storage classes.

The gist

Given that the D GC is very different from those in languages like Java and C#, it’s certain to have different performance characteristics. Moreover, D programs tend to produce far less garbage than those written in a language like Java, where almost everything is a reference type. It helps to understand this when embarking on a D project for the first time. The strategies an experienced Java programmer uses to mitigate the impact of collections aren’t likley to apply here.

While there is certainly a class of software in which no GC pauses are ever acceptable, that is an arguably small set. Most D projects can, and should, start out with the simple mitigation strategies from point #2 at the top of this article, then adapt the code to use @nogc or core.memory.GC as and when performance dictates. The command-line options demonstrated here can help ferret out the areas where that may be necessary.

As time goes by, it’s going to become easier to micromanage garbage collection in D programs. There’s a concerted effort underway to make Phobos, D’s standard library, as @nogc-friendly as possible. Language improvements such as Walter’s proposal to modify how exceptions are allocated should speed that work considerably.

Future posts in this series will look at how to allocate memory outside of the GC heap and use it alongside GC allocations in the same program, how to compensate for disabled language features in @nogc code, strategies for handling the interaction of the GC with object destructors, and more.

Thanks to Vladimir Panteleev, Guillaume Piolat, and Steven Schveighoffer for their valuable feedback on drafts of this article.

The article has been amended to remove a misleading line about Java and C#, and to add some information about multiple threads.

Compile-Time Sort in D

Björn Fahller recently wrote a blog post showing how to implement a compile-time quicksort in C++17. It’s a skillful demonstration that employs the evolving C++ feature set to write code that, while not quite concise, is more streamlined than previous iterations. He concludes with, “…the usefulness of this is very limited, but it is kind of cool, isn’t it?”

There’s quite a bit of usefulness to be found in evaluating code during compilation. The coolness (of which there is much) arises from the possibilities that come along with it. Starting from Björn’s example, this post sets out to teach a few interesting aspects of compile-time evaluation in the D programming language.

The article came to my attention from Russel Winder’s provocative query in the D forums, “Surely D can do better”, which was quickly resolved with a “No Story”-style answer by Andrei Alexandrescu. “There is nothing to do really. Just use standard library sort,” he quipped, and followed with code:

Example 1

void main() {
    import std.algorithm, std.stdio;
    enum a = [ 3, 1, 2, 4, 0 ];
    static b = sort(a);
    writeln(b); // [0, 1, 2, 3, 4]
}

Though it probably won’t be obvious to those unfamiliar with D, the call to sort really is happening at compile time. Let’s see why.

Compile-time code is runtime code

It’s true. There are no hurdles to jump over to get things running at compile time in D. Any compile-time function is also a runtime function and can be executed in either context. However, not all runtime functions qualify for CTFE (Compile-Time Function Evaluation).

The fundamental requirements for CTFE eligibility are that a function must be portable, free of side effects, contain no inline assembly, and the source code must be available. Beyond that, the only thing deciding whether a function is evaluated during compilation vs. at run time is the context in which it’s called.

The CTFE Documentation includes the following statement:

In order to be executed at compile time, the function must appear in a context where it must be so executed…

It then lists a few examples of where that is true. What it boils down to is this: if a function can be executed in a compile-time context where it must be, then it will be. When it can’t be excecuted (it doesn’t meet the CTFE requirements, for example), the compiler will emit an error.

Breaking down the compile-time sort

Take a look once more at Example 1.

void main() {
    import std.algorithm, std.stdio;
    enum a = [ 3, 1, 2, 4, 0 ];
    static b = sort(a);
    writeln(b);
}

The points of interest that enable the CTFE magic here are lines 3 and 4.

The enum in line 3 is a manifest constant. It differs from other constants in D (those marked immutable or const) in that it exists only at compile time. Any attempt to take its address is an error. If it’s never used, then its value will never appear in the code.

When an enum is used, the compiler essentially pastes its value in place of the symbol name.

enum xinit = 10;
int x = xinit;

immutable yinit = 11;
int y = yinit;

Here, x is initialized to the literal 10. It’s identical to writing int x = 10. The constant yinit is initialized with an int literal, but y is initialized with the value of yinit, which, though known at compile time, is not a literal itself. yinit will exist at run time, but xinit will not.

In Example 1, the static variable b is initialized with the manifest constant a. In the CTFE documentation, this is listed as an example scenario in which a function must be evaluated during compilation. A static variable declared in a function can only be initialized with a compile-time value. Trying to compile this:

Example 2

void main() {
    int x = 10;
    static y = x;
}

Will result in this:

Error: variable x cannot be read at compile time

Using a function call to initialize a static variable means the function must be executed at compile time and, therefore, it will be if it qualifies.

Those two pieces of the puzzle, the manifest constant and the static initializer, explain why the call to sort in Example 1 happens at compile time without any metaprogramming contortions. In fact, the example could be made one line shorter:

Example 3

void main() {
    import std.algorithm, std.stdio;
    static b = sort([ 3, 1, 2, 4, 0 ]);
    writeln(b);
}

And if there’s no need for b to stick around at run time, it could be made an enum instead of a static variable:

Example 4

void main() {
    import std.algorithm, std.stdio;
    enum b = sort([ 3, 1, 2, 4, 0 ]);
    writeln(b);
}

In both cases, the call to sort will happen at compile time, but they handle the result differently. Consider that, due to the nature of enums, the change will produce an equivalent of this: writeln([ 0, 1, 2, 3, 4 ]). Because the call to writeln happens at run time, the array literal might trigger a GC allocation (though it could be, and sometimes will be, optimized away). With the static initializer, there is no runtime allocation, as the result of the function call is used at compile time to initialize the variable.

It’s worth noting that sort isn’t directly returning a value of type int[]. Take a peek at the documentation and you’ll discover that what it’s giving back is a SortedRange. Specifically in our usage, it’s a SortedRange!(int[], "a < b"). This type, like arrays in D, exposes all of the primitives of a random-access range, but additionally provides functions that only work on sorted ranges and can take advantage of their ordering (e.g. trisect). The array is still in there, but wrapped in an enhanced API.

To CTFE or not to CTFE

I mentioned above that all compile-time functions are also runtime functions. Sometimes, it's useful to distinguish between the two inside the function itself. D allows you to do that with the __ctfe variable. Here's an example from my book, 'Learning D'.

Example 5

string genDebugMsg(string msg) {
    if(__ctfe)
        return "CTFE_" ~ msg;
    else
        return "DBG_" ~ msg;
}

pragma(msg, genDebugMsg("Running at compile-time."));
void main() {
    writeln(genDebugMsg("Running at runtime."));
}

The msg pragma prints a message to stderr at compile time. When genDebugMsg is called as its second argument here, then inside that function the variable __ctfe will be true. When the function is then called as an argument to writeln, which happens in a runtime context, __ctfe is false.

It's important to note that __ctfe is not a compile-time value. No function knows if it's being executed at compile-time or at run time. In the former case, it's being evaluated by an interpreter that runs inside the compiler. Even then, we can make a distinction between compile-time and runtime values inside the function itself. The result of the function, however, will be a compile-time value when it's executed at compile time.

Complex compile-time validation

Now let's look at something that doesn't use an out-of-the-box function from the standard library.

A few years back, Andrei published 'The D Programming Language'. In the section describing CTFE, he implemented three functions that could be used to validate the parameters passed to a hypothetical linear congruential generator. The idea is that the parameters must meet a set of criteria, which he lays out in the book (buy it for the commentary -- it's well worth it), for generating the largest period possible. Here they are, minus the unit tests:

Example 6

// Implementation of Euclid’s algorithm
ulong gcd(ulong a, ulong b) { 
    while (b) {
        auto t = b; b = a % b; a = t;
    }
    return a; 
}

ulong primeFactorsOnly(ulong n) {
    ulong accum = 1;
    ulong iter = 2;
    for (; n >= iter * iter; iter += 2 - (iter == 2)) {
        if (n % iter) continue;
        accum *= iter;
        do n /= iter; while (n % iter == 0);
    }
    return accum * n;
}

bool properLinearCongruentialParameters(ulong m, ulong a, ulong c) { 
    // Bounds checking
    if (m == 0 || a == 0 || a >= m || c == 0 || c >= m) return false; 
    // c and m are relatively prime
    if (gcd(c, m) != 1) return false;
    // a - 1 is divisible by all prime factors of m
    if ((a - 1) % primeFactorsOnly(m)) return false;
    // If a - 1 is multiple of 4, then m is a multiple of 4, too. 
    if ((a - 1) % 4 == 0 && m % 4) return false;
    // Passed all tests
    return true;
}

The key point this code was intended to make is the same one I made earlier in this post: properLinearCongruentialParameters is a function that can be used in both a compile-time context and a runtime context. There's no special syntax required to make it work, no need to create two distinct versions.

Want to implement a linear congruential generator as a templated struct with the RNG parameters passed as template arguments? Use properLinearCongruentialParameters to validate the parameters. Want to implement a version that accepts the arguments at run time? properLinearCongruentialParameters has got you covered. Want to implement an RNG that can be used at both compile time and run time? You get the picture.

For completeness, here's an example of validating parameters in both contexts.

Example 7

void main() {
    enum ulong m = 1UL << 32, a = 1664525, c = 1013904223;
    static ctVal = properLinearCongruentialParameters(m, a, c);
    writeln(properLinearCongruentialParameters(m, a, c));
}

If you've been paying attention, you'll know that ctVal must be initialized at compile time, so it forces CTFE on the call to the function. And the call to the same function as an argument to writeln happens at run time. You can have your cake and eat it, too.

Conclusion

Compile-Time Function Evaluation in D is both convenient and painless. It can be combined with other features such as templates (it's particularly useful with template parameters and constraints), string mixins and import expressions to simplify what might otherwise be extremely complex code, some of which wouldn't even be possible in many languages without a preprocessor. As a bonus, Stefan Koch is currently working on a more performant CTFE engine for the D frontend to make it even more convenient. Stay tuned here for more news on that front.

Thanks to the multiple members of the D community who reviewed this article.

Project Highlight: excel-d

Ever had the need to write an Excel plugin? Check this out.

Atila Neves opened his lightning talk at DConf 2017 like this:

I’m going to talk about how you can write Excel add-ins in D. Don’t ask me why. It’s just because people need it.

From there he goes into a quick intro on how to write plugins for Excel and gives a taste of what it looks like to register a single function in a C++ implementation:

Excel12f(
    xlfRegister, NULL, 11,          // 11: Number of args
    &xDLL,                          // name of the DLL
    TempStr12(L"Fibonacci"),        // procedure name
    TempStr12(L"UU"),               // type signature
    TempStr12(L"Compute to..."),    // argument text
    TempStr12(L"1"),                // macro type
    TempStr12(L"Generic Add-In"),   // category
    TempStr12(L""),                 // shortcut text
    TempStr12(L""),                 // help topic
    TempStr12(L"Number to compute to"), // function help
    TempStr12(L"Computes the nth Fibonacci number") // arg help 
);

Two things to note about this. First, Excel12f is a C++ function (wrapping a C API) that must be called in an add-in DLL’s entry point (xlAutoOpen) for each function that needs to be registered when the add-in is loaded by Excel. For a small plugin, the implementations of any registered functions might be in the same source file, but in a larger one they might be a bit of a maintenance headache by being located somewhere else. Also take note of all the comments used to document the function arguments, a common sight in C and C++ code bases.

The example D code Atila showed using excel-d is a world of difference:

@Register(
    ArgumentText("Array"),
    HelpTopic("Length of Array"),
    FunctionHelp("Length of an Array"),
    ArgumentHelp(["array"])
)
double DoublesLength(double[] arg) {
    return arg.length;
}

Here, the boilerplate for the registration is being generated at compile-time via a User Defined Attribute, which is used to annotate the function. Implementation and registration are happening in the same place. Another key difference is that the UDA has fields with descriptive names, eliminating the need to comment each argument. Finally, the UDA only requires four arguments, nine less than the C++ function. This is because it makes use of D’s compile-time introspection features to figure out as much as it possibly can and, at the same time, optional arguments (like the shortcut text) can just be omitted.

Since this is a Project Highlight on the D Blog, we’re going to ignore Atila’s opening request and ask, “Why?” There are actually two parts to that. First, why Excel?

Our customers are traders, and they work with Excel as one of their main tools. They need/want to, amongst other things, receive live stock updates in a cell and have their formulae automatically update. There’s other functionality they’d like to have and that means adding this to Excel somehow.

Of all the possible languages that could be used for this purpose, the business chose D. That brings us to the second part of the question: why D?

This is possible in Visual Basic, Python or C#, and possibly other languages. But none of them match D’s performance. C++ does, but it’s tedious and requires a lot of boilerplate to get going. D combines the speed and power of C++ with the reflection capabilities of those other languages. No boilerplate, just code, runs fast.

There’s more to the story, of course. The company is heavily invested in D.

We use D for nearly everything, even some “scripts”. The bulk of it is calculations for market indicators. Lots of data in -> munge -> new data out that needs to look pretty for traders. So integrating with existing code was an important factor.

Even though excel-d is targeting Windows, much of it was actually developed on Linux.

We use a Linux container as our reference development machine, but people use what they want. I do nearly all of my work on Linux and only boot into Windows when I have to. For the Excel work, that’s a necessity. But, as usual for me, I wrote all the tests to be platform agnostic, so I do the Excel development on Linux and test there. Every now and again that means a particular quirk of Excel wasn’t captured well in my mocking code, but it’s usually a quick fix after that.

He says they use both DMD and LDC for development, and both are running in continuous integration.

Although DMD doesn’t technically require Visual Studio to be installed (out of the box, it generates 32-bit OMF objects, and uses the OPTLINK linker, rather than the VS-compatible COFF), anyone doing serious work on Windows is going to need VS (or the Visual Studio Build Tools and the Windows SDK) for 64-bit and 32-bit COFF support. The latest LDC binary releases currently require the MS tools (support for MinGW was dropped, but according to the D Wiki, could be picked up again if there’s a champion for it).

Atila already had VS on his Windows partition. For this project, he got a bit of help from the VS plugin for D, Visual D.

I had to install VisualD because our reference project for Excel was in a Visual Studio solution, but afterwards I reverse engineered the build and didn’t open Visual Studio ever again.

Currently, excel-d has no support for custom dialogs or menus. Both items are on his TODO list.

If you’re working with D and need to write an Excel add-in, or want to try something cleaner than C++ to do so, excel-d is available in the DUB package registry. If not, the sponsors of the project, Kaleidic Associates and Symmetry Investments, have made several other open source projects available. They are interested in hiring talented hackers with a moral compass who aspire towards excellence and would like to work in D.

excel-d was developed by Stefan Koch and Atila Neves.

DConf 2017 Ex Post Facto

Another May, another DConf in the rear view. This year’s edition, organized and hosted once again by the talented folks from Sociomantic, and emceed for the second consecutive year by their own Dylan Cromwell, brought more talks, more fun, and an extra day. The livestream this year, barring a glitch or two with the audio, went almost perfectly. And for the first time in DConf history, videos of the talks started appearing online almost as soon as the conference was over. The entire playlist is available now.

As usual, there were three days of talks. The first opened with a keynote by Walter and the last with one by Andrei (he gave a longer version of the same talk at Google’s Tel Aviv campus a few days later). The keynote from Scott Meyers on Day Two, in his second DConf appearance, was actually the second talk of the day thanks to some technical issues. He told everyone about the things he finds most important in software development, a talk recommended for any developer no matter their language of preference.

The keynotes were followed by presentations from a mix of DConf veterans and first-time speakers. This year, livestream viewers were treated to some special segments during the lunch breaks. On Day One, Luís Marques showed off a live demo using D as a hardware description language, which had been the subject of his presentation just before the lunch break (he used a Papillo Pro from Gadget Factory in his demo, and the company was kind enough to provide an FPGA for one lucky attendee to take home). On Day Two, Vladimir Panteleev, after being shuffled from the second spot to the first in the lineup, gave a livestream demo of concepts he had discussed in his talk on Practical Metaprogramming. And on the last day of presentations, Bastiaan Veelo presented the livestream audience with an addendum to his talk, Extending Pegged to Parse Another Programming Language.

Day Two closed out with a panel discussion featuring Scott, Walter and Andrei.

Scott Meyers, Walter Bright, and Andrei Alexandrescu in a panel discussion moderated by Dylan Cromwell.

It was during this discussion that Walter made the claim that memory safety will be the bane of C, eliciting skepticism from Scott Meyers. That exchange prompted more discussion on /r/programming almost two weeks later.

The newest segment of the event this year came in the form of an extra day given over entirely to the DConf Hackathon. As originally envisioned, it was never intended to be a hackathon in the traditional sense. The sole purpose was for members of the D community to get together face-to-face and hash out the pain points, issues, and new ideas they feel strongly about. DConf 2016 had featured a “Birds of a Feather” session, with the goal of hashing out a specific category of issues, as part of the regular talk lineup. It didn’t work out as intended. The hackathon, suggested by Sebastian Wilzbach, was conceived as an expansion of and improvement upon that concept.

The initial plan was to present attendees with a list of issues that need resolving in the D ecosystem, allow them to suggest some more, and break off into teams to solve them. Sebastian put a lot of effort into a shared document everyone could add their ideas and their names to. As it turned out, participants flowed naturally through the venue, working, talking, and just getting things done. Some worked in groups, others worked alone. Some, rather than actively coding, hashed out thorny issues through debate, others provided informal tutoring or advice. In the end, it was a highly productive day. Perhaps the most tangible result was, as Walter put it, “a tsunami of pull requests.” It’s already a safe bet that the Hackathon will become a DConf tradition.

In the evenings between it all, there was much food, drink, and discussion to be had. It was in this “downtime” where more ideas were thrown around. Some brought their laptops to hack away in the hotel lobby, working on pet projects or implementing and testing ideas as they were discussed. It was here where old relationships were strengthened and new ones formed. This aspect of DConf can never be overstated.

A small selection of more highlights that came out of the four days of DConf 2017:

  • Walter gave the green light to change the D logo and a strategy was devised for moving forward.
  • Jonathan Davis finally managed to get std.datetime split from a monolithic module into a package of smaller modules.
  • Some contentious issues regarding workflow in the core repositories were settled.
  • Vladimir Panteleev gave DustMite the ability to reduce diffs.
  • Timon Gehr implemented static foreach (yay!).
  • Ali Çehreli finished updating his book Programming in D to 2.074.0.
  • Nemanja Boric fixed the broken exception handling on FreeBSD-CURRENT.
  • Steven Schveighoffer and Atila Neves earned their wizard hats for submitting their first pull requests to DMD.
  • Adrian Matoga and Sönke Ludwig (and probably others) worked on fixing issues with DUB.
  • Progress was made on the D compiler-as-a-library front.

This is far from an exhaustive list. The venue was a hive of activity during that last day, and who knows what else was accomplished in the halls, restaurants, and hotel lobbies. This short list only scratches the surface.

A very big Thank You to everyone at Sociomantic who treated us to another spectacular DConf. We’re already looking forward to DConf 2018!

Thanks Sociomantic!

The DConf Experience

April 23rd, the deadline for DConf 2017 registrations, is just a few days away. Personally, I wasn’t sure if I’d be able to attend this year or not, but fortunately things worked out. While I’m looking forward to more great presentations this year, that’s not what keeps drawing me back (or makes me regret being unable to attend in 2014 and 2015).

The main draw for me is putting faces to all the GitHub and forum handles, then getting the opportunity to reconnect in person with those I’ve met at past conferences to talk about more than just the D programming language or the projects to which we all contribute.

I’m not the only one who feels that way. Walter Bright had this to say of DConf 2016:

Besides the technical content, I enjoyed the camaraderie of the conference. The pleasant walking and talking going between the Ibis hotel and the conference. The great food and conversations at the event. Chatting at the Ibis after hours.

If you’re staying at the Hotel Ibis Neukölln (which, unfortunately, is booked solid the week of the conference last I heard) and you’re an early riser, you’ll likely find Walter down in the lobby having First Breakfast and up for a good conversation.

Of course, the mingling isn’t necessarily confined to the conference hall or the hotel lobby. Conference goers sometimes share a room in order to save costs. That’s a great opportunity to learn something new about someone you might not otherwise have discovered, as GDC meister Iain Buclaw found out at DConf 2013 when he and Manu Evans were roomies for a few days:

I recall that I had gotten some microwavable food from the nearby groceries ahead of his arrival — “I didn’t know what you would like, or if you had allergies, so I went with the safe option and only got Vegan foods”. It was a pleasant surprise to discover we have a diet in common also.

At that edition, Iain, Manu and Brad Roberts were shuttled to and from the conference by Walter in his rental car. Carpooling has been a common part of the Stateside DConfs. At the venue in Berlin, there’s no need for it. The semi-official hotel is in walking distance and there’s a subway stop nearby for those who are further off. And, as Steven Schveighoffer can attest, rental bicycles are always an option.

I remember last year, Joseph Wakeling, Andrew Edwards and his son, and myself went for a great bike tour of Berlin given by native David Eckardt of Sociomantic. This included the experience of dragging our rented bikes on the subway, and getting turned away by police in riot gear when we had inadvertently tried to go somewhere that political demonstrations were happening. Highly recommended for anyone, especially if David is gracious enough to do the tour again! We randomly met Ethan Watson riding a rented bike along the way too 🙂

Speaking of Ethan, it appears there’s a good reason that he was riding his bike alone and in the wrong direction.

My first BeerConf was in 2016, located in the rather interesting city of Berlin. Every evening, we would gather and drink large amounts of very cheap beer. This would also coincide with some rather enjoyable shenanigans, be they in-depth conversations; pub crawls; more in-depth conversations; convenience store crawls (where the beer is even cheaper); even more in-depth conversations; and a remarkable lack of late night food.

Hours during the day were passed away at a parallel conference that I kinda stumbled in to, DConf. Interestingly, this was stocked with the same characters that I socialised with during BeerConf. Maybe I was drunk then. Maybe I’m drunk now. Did I present a talk there? I think I did, although I can only assume the talk did not have anything to do with beer.

I’ve found out recently that I’m dual-booked for BeerConf 2017 as well as presenting again at DConf. Since attending these events can only be achieved through a submission process my deduction is that I had a thoroughly good time meeting all these people and socialising them, as well as sharing knowledge and technical expertise, and decided I must do it again. Either that, or someone else signed me up. Don’t tell me which option is the truth, I’ll work it out.

Yes, the beer is a big part of it, too. This blog was born at the small bar in the lobby of the Ibis, over a couple of cold ones and the background cacophony of D programmer talk. Who could say what new D initiatives will spawn from the fermented grains this year?

But let’s not forget the presentations. While they’ll be livestreamed and those who don’t attend can watch them on YouTube in perpetuity, the social aspect lends a flavor to them that the videos just can’t. Andrew Edwards is a fan of one speaker in particular.

I’m always fascinated by Don Clugston’s talks. The most memorable is DConf 2013, during which he stated that, “The language made minor steps… We got here like miners following a vein of gold, not by an original master plan,” in his explanation of how D arrived at its powerful metaprogramming capabilities.

Where else can you hear “wonks” recounting their first hand experiences of what makes D “berry, berry good” for them? It is just a great experience talking and exchanging ideas with some of the greatest minds in the D community.

And there’s no substitute for being in the room when something unexpected happens, as it did during Stefan Koch’s lightning talk at DConf 2016, when he wanted to do a bit of live coding on stage.

There was no DMD on the computer 🙂 That surprised me.

Whether your interest is social, technical, alcohological, or otherwise, DConf is just a fun place to be. If you have the time and the means to attend, you’re  warmly invited to make your own DConf memories. Do yourself a favor and register before the window closes. If you can’t make it, be sure to follow the livestream (pay attention to the Announce forum for details) or pull up #D on freenode.net to ask questions of the presenters. But we really hope to see you there!

I’ll leave you to consider Andrei Alexandrescu’s take on DConf, which concisely sums up the impact the conference has on many who attend:

To me, DConf has already become the anchoring event that has me inspired and motivated for the whole year. We are a unique programming language community in that we’re entirely decentralized geographically. Even the leadership is spread all over across the two US coasts, Western Europe, Eastern Europe, and Asia. Therefore, the event that brings us all together is powerful – an annual systole of the live, beating heart of a great community.

We don’t lack a good program with strong talks, but to me the best moments are found in the interstices – the communal meals, the long discussions with Walter and others in the hallway, the late nights in the hotel lobby. All of these are much needed accelerated versions of online exchanges.

I’m very much looking forward to this year’s edition. In addition to the usual suspects, we’ll have a strong showing of work done by our graduate scholarship recipients.

Thanks to Walter, Iain, Steven, Ethan, Andrew, Stefan and Andrei for sharing their anecdotes and thoughts.

Project Highlight: workspace-d

Not so long ago, Jan Jurzitza sat down at his keyboard intent on writing a D plugin for Atom, his text editor of choice at the time. Then came disappointment.

“I was pretty unhappy with their API,” he says.

Visual Studio Code was released a short time after. He decided to give it a go and “instantly fell in love with it”. His Atom plugin was pushed aside and he started work on a new plugin for VS Code called code-d.

However, I did not want to maintain the same functionality in two plugins for two different text editors, so I thought that making a program which contains most of the plugin logic, like starting and calling dcd, dscanner, dfmt, etc., would be beneficial and would also help with including D support in more editors and IDEs in the future.

For the uninitiated, DCD (the D Completion Daemon), DScanner, and Dfmt are D-oriented tools for plugin developers, all maintained by Brian Schott. They are, respectively, a client-server based auto-completion program, a source code analyzer, and a code formatter. A number of IDE and text editor plugins employ them directly.

So Jan started work on his new tool and named it workspace-d.

With workspace-d I want to make it simple for plugin developers to integrate D functionality into their editor of choice. workspace-d is designed to work both as a standalone program through stdio and as a D library. Once I ported most of the code from my Atom extension to workspace-d, I could simply spawn it as a subprocess in code-d, which I got working with it quite quickly.

In addition to porting his Atom plugin to use workspace-d, he also created one for Sublime Text. Currently, he’s not devoting any time to either and is looking for someone else to take over maintenance of one or both. Anyone interested might start by submitting pull requests. Aside from workspace-d itself, Jan’s focus is on code-d.

He’s recently been working on version 2.0 of workspace-d, with a focus on streamlining the way it handles requests.

Using traits, templates, and CTFE (Compile-Time Function Execution), basically all D compile time magic, I was able to make an automatic wrapper for the functions for version 2.0. Basically, when a request like {"cmd":"hello"} comes in, it runs the D function hello with its default arguments. If the arguments don’t match, it responds with an error. This system automatically binds function arguments to JSON values and generates a response from the return value.

To deserialize the JSON requests, he’s using painlessjson, a third-party library available in the DUB package registry.

It works really great and I can really recommend it for some simple and easy conversions between D types and JSON. This change really cleaned up all the code and made it possible to use workspace-d as a library.

He’s also working on a new project, serve-d, that works with Microsoft’s Language Server Protocol.

serve-d is an alternative for the workspace-d command line I/O for those who prefer JSON RPC over my custom binary/JSON mix. It’s fiber based and uses workspace-d as a library, which results in really clean code. There’s an alpha version of the implementation on github already, both the server and a new branch on code-d. With the Language Server Protocol, I’m hoping for easier integration in other editors. The concept is basically the same as workspace-d’s command line interface, but, because Microsoft is such a big company, I’m hoping that more editors by big developers are going to implement this protocol.

Building and installing workspace-d should go pretty smoothly on Linux or OS X, but it’s currently a little bumpy on Windows. Because of an issue Jan has yet to resolve, it can only be built on Windows with LDC.

The auto completion didn’t work for some people on Windows because it got stuck in the std.process.execute function when creating a pipe to write to. I couldn’t find any way to reproduce it in a standalone program so I couldn’t file a bug either. So what we did to avoid this issue in the short term was to simply disallow compilation on Windows using DMD. It works just fine when compiled with LDC.

Jan’s primarily a Linux user (he doesn’t own a Mac and only runs Windows in a VM). He credits GitHub user @Andrepuel for getting it operational on OS X, and @aka-demik for finding the issue on Windows and verifying that it compiles with LDC. He’ll be grateful to anyone who can help fully resolve the Windows/DMD issue once and for all.

If you’re looking to develop a D plugin for your favorite editor, consider taking advantage of the work Jan as already done with workspace-d to save yourself some effort. And VS Code users can put it to use via code-d to get code completion and more. Visit its VS Code marketplace page to read reviews and installation instructions.

Don’t Fear the Reaper

D, like many other programming languages in active use today, comes with a garbage collector out of the box. There are many types of software that can be written without worrying at all about the GC, taking full advantage of its benefits. But the GC does have drawbacks, and there are certainly scenarios in which garbage collection is undesirable. For those situations, the language provides the means to temporarily disable it, or even avoid it completely.

In order to maximize the positive impacts of garbage collection and minimize any negative, it’s necessary to have a good grounding in how the GC works in D. A good place to start is the Garbage Collection page at dlang.org, which outlines the rationale for D’s GC and provides some tips on working with it. This post is the first in a series intended to expand on the information provided on that page.

This time, we’ll look at the very basics, focusing on the language features that can trigger GC allocations. Future posts will introduce ways to disable the GC when necessary, as well as idioms useful in dealing with its nondeterministic nature (e.g. managing resources in the destructors of GC-managed objects).

The first thing to understand about D’s garbage collector is that it only runs during allocation, and only if there is no memory available to allocate. It isn’t sitting in the background, periodically scanning the heap and collecting garbage. This knowledge is fundamental to writing code that makes efficient use of GC-managed memory. Take the following example:

void main() {
    int[] ints;
    foreach(i; 0..100) {
        ints ~= i;
    }
}

This declares a dynamic array of int, then uses D’s append operator to append the numbers 0 to 99 in a foreach range loop. What isn’t obvious to the untrained eye is that the append operator makes use of the GC heap to allocate space for the values it adds to the array.

DRuntime’s array implementation isn’t a dumb one. In the example, there aren’t going to be one hundred allocations, one for each value. When more memory is needed, the implementation will allocate more space than is requested. In this particular case, it’s possible to determine how many allocations are actually made by making use of the capacity property of D’s dynamic arrays and slices. This returns the total number of elements the array can hold before an allocation is necessary.

void main() {
    import std.stdio : writefln;
    int[] ints;
    size_t before, after;
    foreach(i; 0..100) {
        before = ints.capacity;
        ints ~= i;
        after = ints.capacity;
        if(before != after) {
            writefln("Before: %s After: %s",
                before, after);
        }
    }
}

Executing this when compiled with DMD 2.073.2 shows the message is printed six times, meaning there were six total GC allocations in the loop. That means there were six opportunities for the GC to collect garbage. In this small example, it almost certainly didn’t. If this loop were part of a larger program, with GC allocations throughout, it very well might.

On a side note, it’s informative to examine the values of before and after. Doing so shows a sequence of 0, 3, 7, 15, 31, 63, and 127. So by the end, ints contains 100 values and has space for appending 27 more before the next allocation, which, extrapolating from the values in the sequence, should be 255. That’s an implementation detail of DRuntime, though, and could be tweaked or changed in any release. For more details on how arrays and slices are managed by the GC, take a look at Steven Schveighoffer’s excellent article on the topic.

So, six allocations, six opportunities for the GC to initiate one of its pauses of unpredictable length even in that simple little loop. In the general case, that could become an issue depending on if the loop is in a hot part of code and how much total memory is allocated from the GC heap. But even then, it’s not necessarily a reason to disable the GC in that part of the code.

Even with languages that don’t come with a stock GC out of the box, like C and C++, most programmers learn at some point that it’s better for overall performance to allocate as much as possible up front and minimize allocations in the inner loops. It’s one of the many types of premature optimization that are not actually the root of all evil, something we tend to call best practice. Given that D’s GC only runs when memory is allocated, the same strategy can be applied as a simple way to mitigate any potential impact it could have on performance. Here’s one way to rewrite the example:

void main() {
    int[] ints = new int[](100);
    foreach(i; 0..100) {
        ints[i] = i;
    }
}

Now we’ve gone from six allocations down to one. The only opportunity the GC has to run is before the inner loop. This actually allocates space for at least 100 values and initializes them all to 0 before entering the loop. The array will have a length of 100 after new, but will almost certainly have additional capacity.

There’s an alternative to new for arrays: the reserve function:

void main() {
    int[] ints;
    ints.reserve(100);
    foreach(i; 0..100) {
        ints ~= i;
    }
}

This allocates memory for at least 100 values, but the array is still empty (its length property will return 0) when it returns, so nothing is default initialized. Given that the loop only appends 100 values, it’s guaranteed not to allocate.

In addition to new and reserve, it’s possible to call GC.malloc directly for explicit allocation.

import core.memory;
void* intsPtr = GC.malloc(int.sizeof * 100);
auto ints = (cast(int*)intsPtr)[0 .. 100];

Array literals will usually allocate.

auto ints = [0, 1, 2];

This is also true when an array literal enum is used.

enum intsLiteral = [0, 1, 2];
auto ints1 = intsLiteral;
auto ints2 = intsLiteral;

An enum exists only at compile time and has no memory address. Its name is a synonym for its value. Everywhere you use one, it’s like copying and pasting its value in place of its name. Both ints1 and ints2 trigger allocations exactly as if they were declared like so:

auto ints1 = [0, 1, 2];
auto ints2 = [0, 1, 2];

Array literals do not allocate when the target is a static array. Also, string literals (strings in D are arrays under the hood) are an exception to the rule.

int[3] noAlloc1 = [0, 1, 2];
auto noAlloc2 = "No Allocation!";

The concatenate operator will always allocate:

auto a1 = [0, 1, 2];
auto a2 = [3, 4, 5];
auto a3 = a1 ~ a2;

D’s associative arrays have their own allocation strategy, but you can expect them to allocate when items are added and potentially when removed. They also expose two properties, keys and values, which both allocate arrays and fill them with copies of the respective items. When its desirable to modify the underlying associative array during iteration, or when the items need to be sorted or otherwise manipulated independently of the associative array, these properties are just what the doctor ordered. Otherwise, they’re needless allocations that put undue pressure on the GC.

When the GC does run, the total amount of memory it needs to scan is going to determine how long it takes. The smaller, the better. Avoiding unnecessary allocations isn’t going to hurt anyone and is another good mitigation strategy. D’s associative arrays provide three properties that help do just that: byKey, byValue, and byKeyValue. These each return forward ranges that can be iterated lazily. They do not allocate because they actually refer to the items in the associative array, so it should not be modified while iterating them. For more on ranges, see the chapters titled Ranges and More Ranges in Ali Çehreli’s Programming in D.

Closures, which are delegates or function literals that need to carry around a pointer to the local stack frame, may also allocate. The last allocating language feature listed on the Garbage Collection page is the assertion. An assertion will allocate when it fails because it needs to throw an AssertError, which is part of D’s class-based exception hierarchy (we’ll look at how classes interact with the GC in a future post).

Then there’s Phobos, D’s standard library. Once upon a time, much of Phobos was implemented with little concern for GC allocations, making it difficult to use in situations where they were undesirable. However, a massive effort was initiated
to make it more conservative in its GC usage. Some functions were made to work with lazy ranges, others were rewritten to take buffers supplied by the caller, and some were reimplemented to avoid unnecessary allocations internally. The result is a standard library more amenable to GC-free code (though there are still probably some corners of the library that have not yet been renovated — PRs welcome).

Now that we’ve seen the basics of using the GC, the next post in this series looks at the tools the language and the compiler provide for turning the GC off and making sure specific sections of code are GC-free.

Thanks to Guillaume Piolat and Steven Schveighoffer for their help with this article.