Category Archives: Compilers & Tools

DCompute: Running D on the GPU

Nicholas Wilson is a student at Murdoch University, studying for his BEng (Hons)/BSc in Industrial Computer Systems (Hons) and Instrumentation & Control / Molecular Biology & Genetics and Biomedical Science. He just finished his thesis on low-cost defect detection of solar cells by electroluminescence imaging, which gives him time to work on DCompute and write about it for the D Blog. He plays the piano, ice skates, and has spent 7 years putting D to use on number bashing, automation, and anything else that he could make a computer do for him.


DCompute is a framework and compiler extension to support writing native kernels for OpenCL and CUDA in D to utilize GPUs and other accelerators for computationally intensive code. Its compute API drivers automate the interactions between user code and the tedious and error-prone APIs, with the goal of enabling the rapid development of high-performance D libraries and applications.

Introduction

This is the second article on DCompute. In the previous article, we looked at the development of DCompute and some trivial examples. While we were able to successfully build kernels, there was no way to run them short of using them with an existing framework or doing everything yourself. This is no longer the case. As of v0.1.0, DCompute comes with native wrappers for both OpenCL and CUDA, making kernel dispatch as easy as it is in CUDA.

In order to run a kernel, we need to pass it off to the appropriate compute API, either CUDA or OpenCL. While both APIs try to achieve similar things, they are different enough that squeezing the last bit of performance out of them requires treating each one separately. But there is sufficient overlap that we can make the interface reasonably consistent between the two. The C bindings to these APIs, however, are very low-level, and using them is tedious and extremely error-prone (yay, void*). On top of the tedium and error-proneness, you have to redundantly specify a lot of information, which further compounds the problem. Fortunately, this is D, and we can remove much of the redundancy through introspection and code generation.

The drivers wrap the C API, providing a clean and consistent interface that’s easy to use. While the documentation is a little sparse at the moment, the source code is for the most part straightforward (if you’re familiar with the C APIs, looking at where a function is used is a good place to start). There is the occasional piece of magic to achieve a sane API.

Taming the beasts

OpenCL’s clGet*Info functions are the way to access properties of the class hidden behind the void*. A typical call looks like

enum CL_FOO_REFERENCE_COUNT = 0x1234;
cl_foo* foo = ...; 
cl_int refCount;
clGetFooInfo(foo, CL_FOO_REFERENCE_COUNT, refCount.sizeof, &refCount, null);

And that’s not even one of the calls where you have to call once to figure out how much memory you need to allocate, then call again with the allocated buffer (and $DEITY help you if you want to get a cl_program’s binaries).
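
For those calls, the two-step dance looks something like this (clGetFooInfo and CL_FOO_NAME are stand-ins continuing the article’s fictional cl_foo API):

size_t size;
// First call: null buffer; just ask how many bytes are needed.
clGetFooInfo(foo, CL_FOO_NAME, 0, null, &size);
auto buffer = new ubyte[size];
// Second call: pass the freshly allocated buffer to actually fetch the data.
clGetFooInfo(foo, CL_FOO_NAME, buffer.length, buffer.ptr, null);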

Using D, I have been able to turn that into this:

struct Foo
{
    void* raw;
    static struct Info
    {
        @(0x1234) int referenceCount;
        ...
    }
    mixin(generateGetInfo!(Info, clGetFooInfo));
}

Foo foo = ...;
int refCount = foo.referenceCount;

All the magic is in generateGetInfo, which generates a property for each member of Foo.Info, enabling much better scalability, with documentation as a bonus.
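
To give a flavor of the code generation involved, here is a simplified sketch (not DCompute’s actual implementation): the generator walks the members of Info at compile time and emits one property per field, pulling the API constant out of the field’s UDA.

string generateGetInfo(Info, alias getInfo)()
{
    string code;
    // __traits(allMembers, Info) is a compile-time sequence of field names,
    // so this foreach is unrolled and `member` is a constant each iteration.
    foreach (member; __traits(allMembers, Info))
    {
        code ~= "@property auto " ~ member ~ "()
        {
            // The UDA on the field, e.g. @(0x1234), is the API constant.
            enum id = __traits(getAttributes, Info." ~ member ~ ")[0];
            typeof(Info." ~ member ~ ") value;
            // `raw` is the void* field of the struct this code is mixed into.
            " ~ __traits(identifier, getInfo) ~ "(raw, id, value.sizeof, &value, null);
            return value;
        }\n";
    }
    return code;
}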

CUDA also exposes properties in a similar manner; however, unlike in OpenCL, they are not essential for getting things done, so their development has been deferred.

Launching a kernel is a large point of pain when dealing with the C API of both OpenCL and (only marginally less horrible) CUDA, due to the complete lack of type safety and the sheer number of &s and void*s involved. In DCompute, this incantation simply becomes

Event e = q.enqueue!(saxpy)([N])(b_res, alpha, b_x, b_y, N);

for OpenCL (1D with N work items), and

q.enqueue!(saxpy)([N, 1, 1], [1, 1, 1])(b_res, alpha, b_x, b_y, N);

for CUDA (the equivalent of saxpy<<<N, 1, 0, q>>>(b_res, alpha, b_x, b_y, N);),

where q is a queue, N is the length of the buffers (b_res, b_x, and b_y), and saxpy (single-precision a times x plus y) is the kernel in this example. A full example may be found here, along with the magic that drives the OpenCL and CUDA enqueue functions.
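
For context, the saxpy kernel being launched looks something like this sketch, based on the example from the previous DCompute article (consult the linked full example for the exact code):

@compute(CompileFor.deviceOnly) module kernels;
import ldc.dcompute;
import dcompute.std.index;

@kernel void saxpy(GlobalPointer!float res,
                   float alpha,
                   GlobalPointer!float x,
                   GlobalPointer!float y,
                   size_t N)
{
    auto i = GlobalIndex.x;       // this work item's global index
    if (i >= N) return;
    res[i] = alpha * x[i] + y[i]; // single-precision a*x + y
}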

The future of DCompute

While DCompute is functional, there is still much to do. The drivers still need some polish and user testing, and I need to set up continuous integration. A driver that unifies the different compute APIs is also in the works so that we can be even more cross-platform than the industry cross-platform standard.

Being able to convert SPIR-V into SPIR would enable targeting cl_khr_spir-capable 1.x and 2.0 CL implementations, dramatically increasing the number of devices that can run D kernel code (there’s nothing stopping you from using the OpenCL driver for other kernels, though).

On the compiler side of things, supporting OpenCL image and CUDA texture & surface operations in LDC would increase the applicability of the kernels that could be written.
I currently maintain a forward-ported fork of Khronos’s SPIR-V LLVM to generate SPIR-V from LLVM IR. I plan to use IWOCL to coordinate efforts to merge it into the LLVM trunk, and in doing so, remove the need for some of the hacks in place to deal with the oddities of the SPIR-V backend.

Using DCompute in your projects

If you want to use DCompute, you’ll need a recent LDC built against LLVM with the NVPTX (for CUDA) and/or SPIRV (for OpenCL 2.1+) targets enabled and should add "dcompute": "~>0.1.0" to your dub.json. LDC 1.4+ releases have NVPTX enabled. If you want to target OpenCL, you’ll need to build LDC yourself against my fork of LLVM.
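
For reference, a minimal dub.json with that dependency added looks like this (the package name myapp is illustrative):

{
    "name": "myapp",
    "dependencies": {
        "dcompute": "~>0.1.0"
    }
}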

DMD, Windows, and C

The ability to interface with C was baked into D from the beginning. Most of the time, it’s something that requires little thought – as long as the declarations on the D side match what exists on the C side, things will usually just work. However, there are a few corner-case gotchas that arise from the simple fact that D, though compatible, is not C.

An upcoming series of posts here on the D blog will delve into some of these dark corners and shine a light on the traps lying in wait. In these posts, readers will be asked to follow along by compiling and executing the examples themselves so they may more thoroughly understand the issues discussed. This means that, in addition to a D compiler, readers will need access to a C compiler.

That raises a potential snafu. On the systems that the DMD frontend groups under the version(Posix) umbrella, it’s a reliable assumption that a C compiler is easily available (if a D compiler is installed and functioning properly, the C compiler will already be installed). The concept of a system compiler is a long established tradition on those systems. On Windows… not so much.

So before diving into a series about C and D, a bit of a primer is called for. That’s where this post comes in. The primary goal is to help ensure a C environment is installed and working on Windows. It’s also useful to understand why things are different on that platform than on the others. Before we get to the why, we’ll dig into the how.

First, assume we have the following two source files in the same directory.

cfoo.c

#include <stdio.h>

void say_hello(void) 
{
    puts("Hello!");
}

dfoo.d

extern(C) void say_hello();

void main() 
{
    say_hello();
}

Now let’s see how to get the two working together.

DMD and C

The DMD packages for Windows ship with everything the compiler needs: a linker and other tools, plus a handful of critical system libraries. So on the one hand, Windows is the only platform where DMD has no external dependencies out of the box. On the other hand, it’s the only platform where a working DMD installation does not imply a C compiler is also installed. And these days, the out-of-the-box experience often isn’t the one you want.

On all the other platforms, the C compiler option is the system compiler, which in practice means GCC or Clang. The system linker to which DMD sends its generated object files might be ld, lld, ld.gold, or any ld-compatible linker. On Windows, there are currently two compiler choices, and neither can be assumed to be installed by default: the Digital Mars C and C++ compiler, dmc, or the Microsoft compiler, cl. We’ll look at each in turn.

DMD and DMC

The linker (Optlink) and other tools that ship with DMD are also part of the DMC distribution. DMD uses these tools by default (or when the -m32 switch is passed on the command line). Any C objects or static libraries to be linked must be in the OMF format. Compilers that can generate OMF are a rare breed these days, and while something like Open Watcom may work, using DMC will guarantee 100% compatibility.

The DMC package (version 8.57 as I write) can be downloaded from digitalmars.com. It’s a 3 MB zip file that can be unzipped anywhere. Personally, since I only ever use it in conjunction with DMD, I keep it in C:\D so that the dm directory is a sibling of the dmd2 directory. Once it’s unzipped, it can be added to the path if desired. Be aware that some of the tools DMD and DMC ship with may conflict with tools in other packages if they are on the global path.

For example, both come with Digital Mars make and Optlink, which is named link.exe. The former might conflict with Cygwin, or a MinGW distribution that’s independent of MSYS2 (if mingw32-make has been renamed), and the latter with Microsoft’s linker (which generally shouldn’t be on the global path anyway). Some may prefer just to keep it all off the global path. In that case, it’s simple to configure a command prompt shortcut that sets the PATH when it launches. For example, create a batch file that looks like this:

echo Welcome to your Digital Mars environment.
@set PATH=C:\D\dmd2\windows\bin;C:\D\dm\bin;%PATH%

Save it as C:\D\dmenv.bat. Right click an empty spot on the desktop and, from the popup menu, select New->Shortcut. In the location field, enter the following:

C:\Windows\System32\cmd.exe /k C:\d\dmenv.bat

Now you have a shortcut that, when double clicked, will launch a command prompt that has both dmd and dmc on the path.

Once installed, documentation on the command-line switches for the tools is available at the Digital Mars site. The most relevant are the docs for DMC, Optlink, and Librarian (lib.exe). The latter two will come in handy even when doing pure D development with vanilla DMD, as those are the tools needed when manually manipulating its object file output.

That’s all there is to it. As long as both dmc.exe and dmd.exe are on the path in any given command prompt, both compilers will find the tools they need via the default settings in their configuration files. For knocking together quick tests with both C and D on Windows, it’s a quick thing to launch a command prompt, compile & link, and execute:

dmc -c cfoo.c
dmd dfoo.d cfoo.obj
dfoo

Easy peasy. Now let’s look at the other option.

DMD and Microsoft’s CL

Getting DMD to work with the Microsoft toolchain requires installing the Microsoft build tools and the Windows SDK. The easiest way to get everything is to use one of the Community editions of Visual Studio. The installer will download and install all the tools and the SDK. The latest is always available from https://www.visualstudio.com. With VS 2017, the installer has been overhauled such that it’s possible to minimize the size of the install more than was possible with past editions. An alternative is to install the Microsoft Build Tools and the Windows SDK separately. However, this is still a large install that isn’t much of a win in light of the new VS 2017 installer options (for those on Windows 8.1 or 10).

Once the tooling is installed, DMD’s configuration file needs to be modified to point its environment variables to the proper locations. Rather than repeat all of that here, I’ll direct you to the DMD installation page at the D Wiki. One of the reasons to prefer the DMD installer over the zip archive is that it will detect any installation of Visual Studio or the Microsoft Build Tools and automatically modify the configuration as needed. This is more convenient than needing to remember to update the configuration every time a new version of DMD is installed. It also offers to install VS 2013 Community if the tooling isn’t found, can install Visual D (the D plugin for Visual Studio), and will add DMD to the system path if you want it to.

It’s a bit of an annoyance to launch Visual Studio for simple tests between C and D. Since it’s not recommended to put the MS tools on the system path, each VS and Microsoft Build Tools installation ships with a number of batch files that will set the path for you (like the one we created for DMC above). The installer sets up shortcuts in the Windows Start menu. There are several different options to choose from. To launch a 64-bit environment with VS 2017 (or the 2017 build tools), find Visual Studio 2017 in the Start menu and select x64 Native Tools Command Prompt for VS 2017. For VS 2015 (or the 2015 build tools), go to Visual Studio 2015 and click on VS 2015 x64 Native Build Tools Command Prompt. Similar options exist for 32-bit (where x86 replaces x64) and cross compiling.
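
If you’d rather configure an existing console, the same environment can be set up by calling the appropriate batch file directly. For VS 2015, for example (the install path may differ on your machine):

"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" x64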

From the VS-enabled 64-bit environment, we can run the following commands to compile our two files.

cl /c cfoo.c
dmd -m64 dfoo.d cfoo.obj
dfoo

In a 32-bit VS environment, replace -m64 with -m32mscoff.

The consequences of history

When a new programming language is born these days, it’s not uncommon for its tooling to be built on top of an existing toolchain rather than completely from scratch. Whether we’re talking about languages like Kotlin built on the JVM, or those like Rust using LLVM, reusing existing tools saves time and allows the developers to focus their precious man-hours on the language itself and any language-specific tooling they require.

When Walter Bright first started putting D together in 1999, that trend had not yet come around. However, he already had an existing toolchain in the form of the Digital Mars C and C++ compiler tools. So it was a no-brainer to make use of his existing tools and compiler backend and just focus on making a new frontend for DMD. There were four major side-effects of this decision, all of which had varying consequences in D’s future development.

First, the DMC tools were Windows-only, so the early versions of DMD would be as well. Second, the linker, Optlink, only supports the OMF format. That meant that DMD’s output would be incompatible with the more common COFF output of most modern C and C++ compilers on Windows. Third, the DMC tools do not support 64-bit, so DMD would be restricted to 32-bit output. Finally, Symantec had the legal rights to the existing backend, which meant their license would apply to DMD. While the frontend was open source, the backend license required one to get permission from Walter to distribute DMD (on a side note, this prevented DMD from being included in official Linux package repositories once Linux support was added, but Symantec granted permission to relicense the backend earlier this year and it is now freely distributable under the Boost license).

DMD 0.00 was released in December of 2001. The 0.63 release brought Linux support in May of 2003. Walter could have based the Linux version on the GCC backend, but as a business owner, and through a caution born from past experience, he was concerned about any legal issues that could arise from his working with GPL code on one platform and maintaining a proprietary backend on another. Instead, he modified the DMD backend to generate ELF objects and hand them off to the GCC tools. This decision to enhance the backend became the approach for all new formats going forward. He did the same when adding support for Mac OS X: he modified the backend to work with the Mach-O object format.

Along with the new formats, the compiler gained the ability to generate 64-bit binaries everywhere except Windows. In order to interface with C on Windows, it was usually necessary to convert COFF object files and static libraries to OMF, to use a tool like coffimplib to generate DLL import libraries in the OMF format, or to create dynamic bindings and load DLLs manually via LoadLibrary and GetProcAddress. Then Remedy Games decided to use D.

Quantum Break was the first AAA game title to ship with D as part of its development process. Remedy used it for their gameplay code, creating their own open source tool to bind with their C++ game engine. Before they could get that far, however, they needed 64-bit support in DMD on Windows. That was the motivator to get it implemented. It took a while (apparently, there are some undocumented quirks in Microsoft’s variant of COFF, a.k.a. PECOFF, a.k.a. MS-COFF), but Walter eventually got it done, and support for 32-bit COFF along with it. Again, as he had on other platforms, he modified the backend to generate object files in the new format.

This is why it’s necessary to have the Microsoft toolchain installed in order to produce 64-bit binaries with DMD on Windows. Microsoft’s cl is as close to a system compiler as one is going to get on Windows. There is, however, an option that has not yet been fully explored: a toolchain that can be freely distributed, packaged with a reasonable download size, supports 32-bit and 64-bit output, and is mostly compatible with PECOFF. It may be investigated as an option for future DMD releases to be built upon.

Going from here

Now that this primer is out of the way, the short series on C is just about ready to go. It will kick off with a brief summary of existing material, showing how easy it is to get D and C to work together in the general case. That will be followed up by two posts on arrays and strings. This is where most of the gotchas come into play, and anyone using D and C in the same program should understand what they are and how to avoid them.

DMD 2.076.0 Released

The core D team is proud to announce that version 2.076.0 of DMD, the reference compiler for the D programming language, is ready for download. The two biggest highlights in this release are the new static foreach feature for improved generative and generic programming, and significantly enhanced C language integration making incremental conversion of C projects to D easy and profitable.

static foreach

As part of its support for generic and generative programming, D allows for conditional compilation by way of constructs such as version and static if statements. These are used to choose different code paths during compilation, or to generate blocks of code in conjunction with string and template mixins. Although these features enable possibilities that continue to be discovered, the lack of a compile-time loop construct has been a steady source of inconvenience.
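
As a quick refresher, both constructs select code paths during compilation:

// A version block compiles only the branch matching the target.
version (Windows)
    enum newline = "\r\n";
else
    enum newline = "\n";

// static if branches on any condition evaluable at compile time.
static if (size_t.sizeof == 8)
    pragma(msg, "building for a 64-bit target");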

Consider this example, where a series of constants named val0 to valN needs to be generated based on a number N+1 specified in a configuration file. A real configuration file would require a function to parse it, but for this example, assume the file val.cfg is defined to contain a single numerical value, such as 10, and nothing else. Further assuming that val.cfg is in the same directory as the valgen.d source file, use the command line dmd -J. valgen.d to compile.

module valgen;
import std.conv : to;

enum valMax = to!uint(import("val.cfg"));

string genVals() 
{
    string ret;
    foreach(i; 0 .. valMax) 
    {
        ret ~= "enum val" ~ to!string(i) ~ "=" ~ to!string(i) ~ ";";
    }
    return ret;
}

string genWrites() 
{
    string ret;
    foreach(i; 0 .. valMax) 
    {
        ret ~= "writeln(val" ~ to!string(i) ~ ");";
    }
    return ret;
}

mixin(genVals);

void main() 
{
    import std.stdio : writeln;
    mixin(genWrites);
}

The manifest constant valMax is initialized by the import expression, which reads in a file during compilation and treats it as a string literal. Since we’re dealing only with a single number in the file, we can pass the string directly to the std.conv.to function template to convert it to a uint. Because valMax is an enum, the call to to must happen during compilation. Finally, because to meets the criteria for compile-time function evaluation (CTFE), the compiler hands it off to the interpreter to do so.

The genVals function exists solely to generate the declarations of the constants val0 to valN, where N is determined by the value of valMax. The string mixin at module scope (the mixin(genVals) line) forces the call to genVals to happen during compilation, which means this function is also evaluated by the compile-time interpreter. The loop inside the function builds up a single string containing the declaration of each constant, then returns it so that it can be mixed in as several constant declarations.

Similarly, the genWrites function has the single-minded purpose of generating one writeln call for each constant produced by genVals. Again, each line of code is built up as a single string, and the string mixin inside the main function forces genWrites to be executed at compile-time so that its return value can be mixed in and compiled.

Even with such a trivial example, the fact that the generation of the declarations and function calls is tucked away inside two functions is a detriment to readability. Code generation can get quite complex, and any functions created only to be executed during compilation add to that complexity. The need for iteration is not uncommon for anyone working with D’s compile-time constructs, and in turn neither is the implementation of functions that exist just to provide a compile-time loop. The desire to avoid such boilerplate has put the idea of a static foreach as a companion to static if high on many wish lists.

At DConf 2017, Timon Gehr rolled up his sleeves during the hackathon and put together a pull request adding support for static foreach to the compiler. He followed that up with a D Improvement Proposal, DIP 1010, to make it official, and the DIP met with enthusiastic approval from the language authors. With DMD 2.076, it’s finally ready for prime time.

With this new feature, the above example can be rewritten as follows:

module valgen2;
import std.conv : to;

enum valMax = to!uint(import("val.cfg"));

static foreach(i; 0 .. valMax) 
{
    mixin("enum val" ~ to!string(i) ~ "=" ~ to!string(i) ~ ";");
}

void main() 
{
    import std.stdio : writeln;
    static foreach(i; 0 .. valMax) 
    {
        mixin("writeln(val" ~ to!string(i) ~ ");");
    }
}

Even such a trivial example brings a noticeable improvement in readability. Don’t be surprised to see compile-time heavy D libraries (and aren’t most of them?) get some major updates in the wake of this compiler release.

Better C integration and interoperation

DMD’s -betterC command line switch has been around for quite a while, though it didn’t really do much and it has languished from inattention while more pressing concerns were addressed. With DMD 2.076, its time has come.

The idea behind the feature is to make it even easier to combine both D and C in the same program, with an emphasis on incrementally replacing C code with D code in a working project. D has been compatible with the C ABI from the beginning and, with some work to translate C headers to D modules, can directly make C API calls without going through any sort of middleman. Going the other way and incorporating D into C programs has also been possible, but not as smooth a process.

Perhaps the biggest issue has been DRuntime. There are certain D language features that depend on its presence, so any D code intended to be used in C needs to bring the runtime along and ensure that it’s initialized. That, or all references to the runtime need to be excised from the D binaries before linking with the C side, something that requires more than a little effort both while writing code and while compiling it.

-betterC aims to dramatically reduce the effort required to bring D libraries into the C world and modernize C projects by partially or entirely converting them to D. DMD 2.076 makes significant progress toward that end. When -betterC is specified on the command line, all asserts in D modules will now use the C assert handler rather than the D assert handler. And, importantly, neither DRuntime nor Phobos, the D standard library, will be automatically linked in as they normally are. This means it’s no longer necessary to manually configure the build process or fix up the binaries when using -betterC. Now, object files and libraries generated from D modules can be directly linked into a C program without any special effort. This is especially easy when using VisualD, the D plugin for Visual Studio. Not too long ago, it gained support for mixing C and D modules in the same project. The updated -betterC switch makes it an even more convenient feature.
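
As a minimal sketch of the workflow (the file and function names here are illustrative), a D module compiled with -betterC can be handed straight to the C linker:

// dsay.d -- compile with: dmd -betterC -c dsay.d
extern(C) void d_say(int count)
{
    // No DRuntime involved; only the C standard library is used.
    import core.stdc.stdio : printf;
    foreach (i; 0 .. count)
        printf("Hello from D! (%d)\n", i);
}

The resulting object file can then be linked into a C program that declares void d_say(int);, with no runtime initialization needed on the C side.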

While the feature is now more usable, it’s not yet complete. More work remains to be done in future releases to allow the use of more D features currently prohibited in betterC. Read more about the feature in Walter Bright’s article here on the D Blog, D as a Better C.

A new release schedule

This isn’t a compiler or language feature, but it’s a process feature worth noting. This is the first release conforming to a new release schedule. From here on out, beta releases will be announced on the 15th of every even month, such as 2017-10-15, 2017-12-15, 2018-02-15, etc. All final releases will be scheduled for the 1st of every odd month: 2017-11-01, 2018-01-01, 2018-03-01, etc. This will bring some reliability and predictability to the release schedule, and make it easier to plan milestones for enhancements, changes, and new features.

Get it now!

As always, the changes, fixes, and enhancements for this release can be found in the changelog. This specific release will always be available for download at http://downloads.dlang.org/releases/2.x/2.076.0, and the latest release plus betas and nightlies can be found at the download page on the DLang website.

On Tilix and D: An Interview with Gerald Nunn

Joakim is the resident interviewer for the D Blog. He has also interviewed members of the D community for This Week in D and is responsible for the Android port of LDC.


Gerald Nunn is the developer of Tilix (formerly called Terminix), an advanced, open-source, GTK3-based tiling terminal emulator, which is the most-starred D project on GitHub, recently surpassing even the D reference compiler, DMD. Earlier this year at DConf in Berlin, he talked about how he chose D. His slides and a recorded video are available. In his day job, which has nothing to do with desktop GUI applications, he is a Senior Middleware Solutions Architect at Red Hat.

Joakim: What is a tiling terminal emulator?

Gerald: A tiling terminal emulator allows you to split the terminal into multiple tiles and rearrange them in any layout that makes the most sense for the particular task you are working on. People that work in multiple terminals simultaneously typically find them the most useful, particularly with ever-increasing monitor sizes and resolutions.

While the tiling is cool, the primary reason I created Tilix was that I wanted a terminal emulator that followed the Gnome Human Interface Guidelines (HIG) and used Client-Side Decorations (CSD). Tilix follows the Gnome HIG published here, which means adhering to spacing, layout and other recommendations. Following the HIG is important in order for your application to be consistent with the desktop experience as a whole.

CSD refers to the titlebar of the window, where the client assumes responsibility for it rather than the display manager, and can populate the titlebar with buttons and other controls. This is part of the Gnome HIG, and most default Gnome applications (gedit, files, videos, etc.) use it. The one exception is gnome-terminal, which does not use CSD at all.

Joakim: Can you give some examples of how you conform better to the Gnome HIG?

Gerald: The Gnome HIG specifies a specific design language with regards to how applications running in Gnome should look and feel. Some examples of Tilix following the HIG include the use of CSDs and application menus, and adherence to various guidelines such as spacing and layouts. Additionally, the Gnome designers have put together a variety of mockups of how they think various applications should look. Tilix uses the mockups developed by Gnome designers for the terminal where feasible. For example, in Tilix the preferences and profiles dialogs used to be separate; however, one of the Gnome designers proposed this mockup for gnome-terminal. I went ahead and implemented it in Tilix, for a much better experience than what I had previously.

As a result of using CSDs and following the Gnome HIG, hopefully the experience of using Tilix in Gnome feels more organic to its users.

The interesting thing is the tension between people who use Tilix on Gnome and those who use it on other distributions. While I make no bones about the fact that Gnome is my primary target, I do try to make the experience better in other desktop environments by allowing the user to disable the CSD in favor of a normal titlebar if they so wish.

Joakim: You come to D from primarily a Java background. Do you still write Java-style code in D? Was that easy, i.e. how much did you have to change to write D instead?

Gerald: Yes, my background is in Java. I found it quite interesting at DConf that, when I asked how many people came from a non-C/C++ background, only one other fellow raised his hand.

If you look at my code in Tilix, it does look a lot like Java code. Some of that is due to my background and some is due to GtkD being a class-based wrapper. I find switching between D and Java to be pretty seamless for the most part; there is much less cognitive friction between the two than, say, switching between Java and Python.

The biggest D idiom I had to learn was ranges, since it is a pretty foundational feature of D. However that wasn’t overly complicated. Compile-Time Function Evaluation (CTFE) is still something that doesn’t come naturally. I have to look it up again every time I need to use it and my current attempts at CTFE from a code perspective are pretty ugh. I’d like to leverage CTFE more in Tilix as I continue to get more experience with D.

Finally, my lack of C experience means that the other area I struggle with a bit is interfacing with C code. While for the most part it’s pretty straightforward, when I have to start deciphering something complex, it gets a bit hoary. Support for Flatpak is currently being held up, as I haven’t summoned up the mental energy to work through some of the issues I am having interfacing with C code.

Having said that, I do have some experience with native code development, as I spent a fair amount of time writing code in Delphi and Object Pascal many, many years ago. This is where most of my GUI experience comes from as well.

Joakim: From your DConf talk, you’re obviously not worried much about the Garbage Collector (GC). Have you had to think about it at all when you’re coding Tilix? Any issues with the GC causing stuttering in the GUI?

Gerald: Coming from Java, the GC is pretty natural for me and I definitely do not consider it a negative for D. I think the GC in D gets a lot of bad press based on Java experiences, but it’s important to remember that the GC in D is quite different than the one in Java. The biggest difference to me is that you have more options to control the GC, since it’s well understood when it can kick off a GC cycle. I’ve been very happy to see more people pushing back on the various reddit threads and forum posts complaining about the use of GC.

I haven’t had any issues with the GC in terms of pauses, and no Tilix users have opened cases about it. I have had a few GC-related issues, primarily around leaking memory due to holding references, but these have all been programmer error rather than an issue with D’s implementation of GC. I have another GTK D application, Visual Grep, where I ran into abysmal performance when loading a large amount of matches in a tight loop. However, simply disabling the GC for that section of the code sped things up immensely.
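
The technique Gerald describes amounts to a pair of calls from core.memory; this sketch is illustrative, with a hypothetical loop body:

import core.memory : GC;

void loadMatches(string[] lines, ref string[] results)
{
    GC.disable();             // suspend collection cycles during the hot loop
    scope (exit) GC.enable(); // restore normal collection afterwards

    foreach (line; lines)
        results ~= line;      // appending allocates, but no collections interrupt it
}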

Joakim: The github repository for Tilix is remarkably clean, no open Pull Requests (PRs) and a low percentage of issues still open. How much time do you spend weekly on Tilix? Is Tilix strictly a hobby or has it become something more?

Gerald: Tilix is strictly a hobby. I probably spend 5 to 10 hours a week on it. At this point it’s a mature application, hence the relatively low number of issues. I also prioritize fixing bugs over adding new features, which helps keep the list manageable.

With regards to pull requests, I’m a firm believer in being responsive, so I will typically respond to a PR within a day or two. As a contributor myself, I know nothing kills interest like seeing your PR languish for weeks, months, or even years. If you want people to contribute, which I definitely do, then I feel you owe contributors the courtesy of responding in a timely fashion.

Now, Tilix is a relatively small project so it’s easy for me to adopt that approach. I can understand why larger projects may have more difficulty in this area.

Joakim: D has a lot of features, how well do you know it? You mentioned that you want to use some of the compile-time capabilities more: which D features do you think Tilix would benefit from in the future and how?

Gerald: I don’t think I know it that well to be honest, beyond the core set of features I use in Tilix. I’m always amazed at the guys on the forum that can argue the merits/drawbacks of the low-level details of the language. That’s definitely not me. I’d like to get better at it, but the reality is that I only use D as a hobby so I can’t invest the same amount of time I do with Java. In addition, my job at Red Hat has more of an infrastructure component than my previous jobs, so a lot of my learning time is spent ramping up on that side of the house.

In terms of features that Tilix would benefit from, I think using CTFE and ranges would be very beneficial for exposing some of the capabilities in GtkD in a more idiomatic way. I have a fair amount of code that could be a lot more concise with appropriate usage of CTFE. A simple example would be using ranges to support iterating over various artifacts with foreach rather than a classical for loop. However, I think some of the more complex use cases, like supporting D-Bus and GObject, would be very useful.

For those not familiar with GObject, it’s the base level object in GTK and is reference-counted. Being able to create GObjects easily in D, like you can in Python, would make it much easier to interface with some of the APIs; right now doing so is the equivalent of writing it in raw C and it’s somewhat laborious. Mike Wey, the GtkD maintainer, has started doing some work on this.

Joakim: You mentioned at DConf that D has a quick edit-compile cycle: how do you enable that, i.e. what IDE, compilers, toolchain do you use, both for development and then for release?

Gerald: I use MS Visual Studio Code on Linux with the excellent code-d plugin, written by Jan “WebFreak” Jurzitza. It gives me all of the features I need (code completion, tooltip hints, etc.). The only thing I find missing from Java IDEs is a refactoring capability. For development, I use DMD as it has the fastest compile time. Release builds are done using LDC, the D compiler with an LLVM backend, since it generates a smaller and faster binary. I rarely need to fire up a debugger, but when I do I just use GDB from the command-line.

Joakim: What problems have you had with D? What features do you dislike?

Gerald: No major problems from my perspective. I’m generally very happy with the language and find it strikes the right balance between ease of use and capabilities. Most of my dislikes would be centered on the standard library, Phobos, rather than the language itself, and those dislikes directly correlate to lack of manpower.

No major issues with Phobos, but rather a bunch of irritants. For example, you cannot easily use immutable with send in std.concurrency, std.experimental.logger is still experimental, the JSON parser has issues if localization is set to use a comma instead of a period for decimals, etc., etc. None of them in and of themselves are showstoppers, and most of these are really about having the manpower to polish things up. I’m actually somewhat reluctant to complain about them because I could probably fix some of them myself and submit PRs.

I tend to get more annoyed about the negativity in the forums with regards to GC. I do feel that sometimes people get so wrapped up in what D needs for it to be a perfect systems language (i.e. no GC, memory safety, etc.), it gets overlooked that it is a very good language for building native applications as it is now. While D is often compared to Rust, in some ways the comparison to Go is more interesting to me. Both are GC-based languages and both started as systems languages, however Go pivoted and doubled down on the GC and has seen success. One of the Red Hat products I support, OpenShift, leverages Kubernetes (a Google project) for container orchestration and it’s written in Go.

I think D as a language is far superior to Go, and I wish we would toot our horn a little more in this regard instead of the constant negative discussion around systems programming. Now in fairness, Go has a large corporate sponsor whereas D does not, however the contrast in positioning is still interesting to me.

Joakim: What are your future plans for Tilix?

Gerald: The two biggest features I’d like to add are support for tmux control mode and the ability for the popout sidebar to be permanently displayed.

For those not familiar with tmux, it is a terminal multiplexer; it essentially does terminal tiling, but within the terminal itself. It also supports a number of other features, but the most interesting is keeping terminal sessions alive outside of the terminal. Since it does tiling within the terminal, there is a bit of a performance hit, plus from a GUI perspective it can’t leverage native widgets like scrollbars. To mitigate this, it supports something called control mode that enables it to integrate with a tiling terminal emulator, so that spawning of new terminals, i.e. tiling, is managed outside of tmux. This greatly improves its performance, while allowing users to leverage other features that tmux supports. At the moment, only iTerm2 on OS X supports this, AFAIK.

The sidebar in Tilix is one of the more controversial UI elements. I opted not to implement a tabbed interface, because I found tabs useless in terms of figuring out which one I wanted; there simply isn’t enough room on a tab to distinguish them, and manually renaming them is a pain. The sidebar is my attempt at an alternative: it renders a thumbnail of each session (aka tab) in a sidebar that can be popped out as needed to switch between sessions.

The Tilix sidebar in action.

Some people have a strong preference for something that is permanently available. Unfortunately, making the sidebar always visible isn’t easy, as generating the thumbnails takes a significant amount of time due to the way GTK is structured. There are potentially ways to make it work, but there is a time investment required to try different options to see what is feasible and then what is the most effective.

In terms of a dream feature, I’d love to switch to using a terminal emulator written natively in D, rather than the C-based GTK VTE (Virtual Terminal Emulator) widget I’m using now. For those not familiar with it, the VTE is the terminal emulation widget used by Gnome Terminal and is available as a reusable widget. Many terminal emulators on Linux use this widget (gnome-terminal, guake, terminator, tilix, etc.), as it provides a production-ready emulator that has been through a huge amount of testing.

The downside with using it is that any custom features you want to implement that involve the actual terminal emulation layer require modifying the VTE and getting those modifications in upstream. I have a few patches that Tilix supports (for triggers and badges), but frankly I’ve done a poor job of getting them into upstream. Part of the reason for this is VTE is written in C and getting up to speed enough with C to make a quality patch is time-consuming.

So having the terminal emulation written in D would make this much easier, however it’s a huge time investment as people underestimate the amount of work involved. There’s a lot of edge cases in terminal emulation, plus adding all of the stuff to make it user-friendly (search, clicking links, etc.), it’s a lot to take on. I simply don’t have the time to make this a reality unless I win the lottery. If someone wanted to take that part on and create a GTK terminal emulation widget that has all of the needed features and stick with it for the long haul, I’d be happy to work with them to get it integrated with Tilix. Adam Ruppe has already created one that works quite well from my testing of it; if someone wanted to work on converting it to a GTK widget, add the necessary improvements and agree to maintain it, feel free to ping me. 🙂

Joakim: Please take us from your experience first discovering and using D to writing Tilix.

Gerald: Throughout my working career, I’ve always had hobby projects going on. Some things I worked on included a popular add-in for Delphi called GExperts, a Windows file explorer replacement, a Java IDE called Gel, and a popular Android app called OnTrack Diabetes that I sold a few years ago. A couple of years ago, I was looking for something new to work on as my hobby project and settled on the idea of building a desktop application for Linux. I knew it needed to use the GTK toolkit, since Gnome is my preferred desktop environment, and I could leverage my past Delphi experience with GUIs. I also knew I wasn’t interested in coding in C or C++, so I had a look at what the alternatives were.

I started with Python, as it has excellent support for GTK and I had done a bit of Jython programming in WebLogic, as the Weblogic Scripting Tool (WLST) used it. However, most of my previous work had been small scripts and I quickly realized that at larger scales Python wasn’t for me. I really prefer statically-typed languages in general and Python’s dynamic typing drove me batty, particularly on a hobby program where I constantly needed to rebuild my mental stack since I worked on it infrequently.

I also looked at Rust and Go, but at the time neither of them had feature-complete GTK bindings. Also, while Rust had a lot of positive press, it had a formidable learning curve and I was less than convinced that its focus on memory safety via the borrow checker was a better approach than GC.

I was aware of D, as I had looked at D previously many years ago and liked the language, but the ecosystem was so weak it just wasn’t that useful for practical work. I gave it a second look and found that it had greatly improved and amazingly, full GtkD bindings were available. It also helped that D and Java are similar enough that picking up D was incredibly easy. Once I learned how ranges worked, it was easy to start cranking out code.

I first created a small application called Visual Grep that wraps grep into a GUI. When I was consulting, I often had to grep large code bases looking for specific patterns and a GUI that made browsing the matches was an absolute necessity. This was a great first application, as I learned quite a few things about D. The application performance was initially shit with large result sets, because the GC was constantly kicking in when loading results due to the constant allocation. Disabling the GC during that tight loop improved the performance immeasurably. I also learned about integrating GTK with D’s multi-threading capabilities.

I was inspired by the UI from Gnome Builder, an IDE, and thought that it would work quite well for a terminal emulator. Thus Tilix was born. Well, actually at first it was called Terminix, but once it started getting popular I received a polite cease-and-desist order from Terminix, the American pest control company. Being Canadian, I wasn’t overly aware of them, hence why I didn’t think too much about the name. Lesson learned: spend time choosing a good name in case your app does become more popular than you expect.

I’ve enjoyed my time with D and it’s been a great language for creating desktop applications in GTK. In many ways, I feel like D is a natural successor to Vala, which was a language created specifically for building GTK applications but has been slowly dying, largely due to its single focus. If you are not building GTK apps, you aren’t using Vala, which means the pool of people using it and working on it is by definition very small.

I also feel that D is a natural successor to Delphi, at least with GtkD, as a potent tool for creating desktop applications. With its fast compile time and as an easy-to-learn language, it brings many of Delphi’s best attributes into the modern age. Plus I can type { and } instead of Begin and End and increase my efficiency by 75% or so. 🙂

A DUB Case Study: Compiling DMD as a Library

In his day job, Jacob Carlborg is a Ruby backend developer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and DVM, the topic of another post on this blog. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.


DUB is the official build tool and package manager for the D programming language. Originally written and currently maintained by Sönke Ludwig as part of the vibe.d web framework, its acceptance as an official part of the D toolchain means it is now shipping with the most recent DMD and LDC compilers.

A Quick Introduction to DUB

If you have the latest DMD or LDC installed, you already have DUB installed as well. If not, or if you want to check for a more recent version, you can get the very latest release, beta, or release candidate from the DUB download page.

You can create a new DUB project by executing the dub init command. This will start an interactive setup that guides you through project creation.

  1. First decide the format of the package recipe. Two formats are supported: JSON and SDLang. Here we picked SDLang.
  2. Then specify the name of the project. Press enter to use the default name, which is displayed in brackets and is inferred from the directory name.
  3. Do the same for the description, author, license, copyright, and dependencies to select the default values.
$ dub init foo
Package recipe format (sdl/json) [json]: sdl
Name [foo]:
Description [A minimal D application.]:
Author name [Jacob Carlborg]:
License [proprietary]:
Copyright string [Copyright © 2017, Jacob Carlborg]:
Add dependency (leave empty to skip) []:
Successfully created an empty project in '/Users/jacob/tmp/foo'.
Package successfully created in foo

After the setup has completed, the following files and directories will have been created:

$ tree foo
foo
├── dub.sdl
└── source
    └── app.d

1 directory, 2 files
  • dub.sdl is the package recipe file, which provides instructions telling DUB how to build the package
  • source is the default path where DUB looks for D source files
  • app.d contains the main function and is an example Hello World generated by DUB with the following content:
import std.stdio;

void main()
{
	writeln("Edit source/app.d to start your project.");
}

The content of the dub.sdl file is the following:

name "foo"
description "A minimal D application."
authors "Jacob Carlborg"
copyright "Copyright © 2017, Jacob Carlborg"
license "proprietary"

All of which was taken from what we specified during project creation. By default, DUB looks for D source files in either source or src directories and compiles all files it finds there and in any subdirectories.

To build and run the application, navigate to the project’s root directory, foo in this case, and invoke dub:

$ dub
Performing "debug" build using dmd for x86_64.
foo ~master: building configuration "application"...
Linking...
Running ./foo
Edit source/app.d to start your project.

To build without running, invoke dub build:

$ dub build
Performing "debug" build using dmd for x86_64.
foo ~master: building configuration "application"...
Linking...

Case Study: DMD as a Library

Recently there has been some progress in making the D compiler (DMD) available as a library. Razvan Nitu has been working on it as part of his D Foundation scholarship at the University Politehnica of Bucharest. He gave a presentation at DConf 2017 (a video of the talk is available, as well as examples in the DMD repository). So I had the idea that, as part of the DConf 2017 hackathon, I could create a simple DUB package for DMD that makes only the lexer and the parser available as a library, something his work has made possible.

Currently DMD is built using make. There are three Makefiles: one for Posix, one for 32-bit Windows, and one for 64-bit Windows (which is only a wrapper of the 32-bit one). I don’t intend to try to completely replicate the Makefiles as a DUB package (they contain some additional tasks besides building the compiler), but instead will start out fresh and only include what’s necessary to build the lexer and parser.

DMD already has all the source code in the src directory, which is one of the directories DUB searches by default. If we left it as is, DUB would include the entirety of DMD, including the backend and other parts we don’t want to include at this point.

The first step is to create the DUB package recipe file. We start simple with only the metadata (here using the SDLang format):

name "dmd"
description "The DMD compiler"
authors "Walter Bright"
copyright "Copyright © 1999-2017, Digital Mars"
license "BSL-1.0"

When we have this we need to figure out which files to include in the package. We can do this by invoking DMD with the -deps flag to generate the imports of a module. A good start is the lexer, which is located in src/ddmd/lexer.d. We run the following command to output the imports that lexer.d is using:

$ dmd -deps=deps.txt -o- -Isrc src/ddmd/lexer.d

This will write a file named deps.txt containing all the imports used by lexer.d. The -o- flag is used to tell the compiler not to generate any code. The -I flag is used to add an import path where the compiler will look for additional modules to import (but not compile). An example of the output looks like this (the long path names have been reduced to save space):

core.attribute (druntime/import/core/attribute.d) : private : object (druntime/import/object.d)
object (druntime/import/object.d) : public : core.attribute (druntime/import/core/attribute.d):selector
ddmd.lexer (ddmd/lexer.d) : private : object (druntime/import/object.d)
core.stdc.ctype (druntime/import/core/stdc/ctype.d) : private : object (druntime/import/object.d)
ddmd.root.array (ddmd/root/array.d) : private : object (druntime/import/object.d)
ddmd.root.array (ddmd/root/array.d) : private : core.stdc.string (druntime/import/core/stdc/string.d)

The most interesting part of this output, in this case, is the first column, which consists of a long list of module names. What we are interested in here is a unique list of modules that are located in the ddmd package. All modules in the core package are part of the D runtime and are already precompiled as a library and automatically linked when compiling a D executable, so these modules don’t need to be compiled. The modules from the ddmd package can be extracted with some search-and-replace in your favorite text editor or using some standard Unix command-line tools:

$ cat deps.txt | cut -d ' ' -f 1 | grep ddmd | sort | uniq
ddmd.console
ddmd.entity
ddmd.errors
ddmd.globals
ddmd.id
ddmd.identifier
ddmd.lexer
ddmd.root.array
ddmd.root.ctfloat
ddmd.root.file
ddmd.root.filename
ddmd.root.hash
ddmd.root.outbuffer
ddmd.root.port
ddmd.root.rmem
ddmd.root.rootobject
ddmd.root.stringtable
ddmd.tokens
ddmd.utf

Here we can see that a set of modules is located in the nested package ddmd.root. This package contains common functionality used throughout the DMD source code. Since it doesn’t have any dependencies on any code outside the package it’s a good fit to place in a DUB subpackage. This can be done using the subPackage directive, as follows:

subPackage {
  name "root"
  targetType "library"
  sourcePaths "src/ddmd/root"
}

We specify the name of the subpackage, root. The targetType directive is used to tell DUB whether it should build an executable or a library (though it’s optional: DUB will build an executable if it finds an app.d in the root of the source directory and a library if it doesn’t). Finally, sourcePaths can be used to specify the paths where DUB should look for the D source files if neither of the default directories is used. Fortunately, we want to include all the files in the src/ddmd/root directory, so using sourcePaths works perfectly fine.

We can verify that the subpackage works and builds by invoking:

$ dub build :root
Building package dmd:root in /Users/jacob/development/d/dlang/dmd/
Performing "debug" build using dmd for x86_64.
dmd:root ~master: building configuration "library"...

:package-name is shorthand that tells DUB to build the package-name subpackage of the current package, in our case the root subpackage.

After removing all the modules from the root package from the initial list of dependencies, the following modules remain:

ddmd.console
ddmd.entity
ddmd.errors
ddmd.globals
ddmd.id
ddmd.identifier
ddmd.lexer
ddmd.tokens
ddmd.utf

The next step is to create a subpackage for the lexer containing the remaining modules.

subPackage {
  name "lexer"
  targetType "library"
  sourcePaths

Again we start by specifying the name of the subpackage and that the target type is a library. Specifying sourcePaths without any value will set it to an empty list, i.e. no source paths. This is done because the source directory contains more files than we want to include in this subpackage.

sourceFiles \
    "src/ddmd/console.d" \
    "src/ddmd/entity.d" \
    "src/ddmd/errors.d" \
    "src/ddmd/globals.d" \
    "src/ddmd/id.d" \
    "src/ddmd/identifier.d" \
    "src/ddmd/lexer.d" \
    "src/ddmd/tokens.d" \
    "src/ddmd/utf.d"

The above specifies all the source files that should be included in this subpackage. The difference between sourcePaths and sourceFiles is that sourcePaths expects whole directories whose source files should all be included, while sourceFiles lists only the individual files that should be included. A list in SDLang is written by separating the items with a space. The backslash (\) is used for line continuation, making it possible to spread the list across multiple lines.

The final step of the lexer subpackage is to add a dependency on the root subpackage. This is done with the dependency directive:

dependency "dmd:root" version="*"
}

The first parameter for the dependency directive is the name of another DUB package. The colon is used to separate the package name from the subpackage name. The version attribute is used to specify which version the package should depend on. The * is used to indicate that any version of the dependency matches, i.e. the latest version should always be used. When implementing subpackages in any given package, this is generally what should be used. External projects that depend on any DUB package should specify a SemVer version number corresponding to a known release version.
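
For example, an external project depending on a published release of this package would pin a version range instead (the version number here is illustrative):

dependency "dmd" version="~>2.76.0"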

If we build the lexer subpackage now it will result in an error:

$ dub build :lexer
Building package dmd:lexer in /Users/jacob/development/d/dlang/dmd/
Performing "debug" build using dmd for x86_64.
dmd:lexer ~master: building configuration "library"...
src/ddmd/globals.d(339,21): Error: need -Jpath switch to import text file VERSION
dmd failed with exit code 1.

Looking at the file and line of the error shows that it contains the following code:

_version = (import("VERSION") ~ '\0').ptr;

This code contains an import expression. Import expressions differ from import statements (e.g. import std.stdio;) in that they take a file from the file system and insert its contents into the current module. It’s just as if you copied and pasted the contents yourself. As a security mechanism, using an import expression requires that the path the file is imported from be passed to the compiler. This can be done using the -J flag. In this case, we want to use the package root, where we are executing DUB, so we can use a single dot: ".". Passing arbitrary flags to the compiler can be done with the dflags build setting, as follows:

dflags "-J."

Add that to the lexer subpackage configuration and it will compile correctly:

$ dub build :lexer
Building package dmd:lexer in /Users/jacob/development/d/dlang/dmd/
Performing "debug" build using dmd for x86_64.
dmd:lexer ~master: building configuration "library"...
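
As an aside, import expressions are not limited to compiler internals; any D program can embed a file this way. Here is a minimal standalone sketch (the file names are hypothetical; compile with dmd -J. embed.d):

import std.stdio;

// Reads the contents of motd.txt at compile time; the compiler looks
// for the file in the paths passed via -J.
enum message = import("motd.txt");

void main()
{
    writeln(message);
}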

For the final subpackage, we have the parser. The parser is located in src/ddmd/parse.d. To get its dependencies we can use the same approach we used for the lexer. But we will filter out all files that are part of the other subpackages:

$ dmd -deps=deps.txt -Isrc -J. -o- src/ddmd/parse.d
$ cat deps.txt | cut -d ' ' -f 1 | grep ddmd | grep -E -v '(root|console|entity|errors|globals|id|identifier|lexer|tokens|utf)' | sort | uniq
ddmd.parse

Here, we’re supplying the -v flag to grep to invert the match, filtering out the listed modules, and the -E flag to enable extended regular expressions. All modules from the root package and the lexer subpackage are filtered out, leaving only the ddmd.parse module.

The subpackage for the parser will look similar to the other subpackages:

subPackage {
  name "parser"
  targetType "library"
  sourcePaths

  sourceFiles "src/ddmd/parse.d"

  dependency "dmd:lexer" version="*"
}

Again, we can verify that it’s working by building the subpackage:

$ dub build :parser
Building package dmd:parser in /Users/jacob/development/d/dlang/dmd/
Performing "debug" build using dmd for x86_64.
dmd:parser ~master: building configuration "library"...

Currently we have three subpackages in the DUB recipe file, but no way to use the main package as a whole. To fix this we add the parser subpackage as a dependency of the main package. We pick the parser subpackage as a dependency because it will include the other two subpackages through its own dependencies.

license "BSL-1.0"

targetType "none"
dependency ":parser" version="*"

subPackage {
  name "root"

In addition to specifying parser as a dependency, we also specify the target type to be none. This will avoid building an empty library out of the main package, since it doesn’t contain any source files of its own.

As a final step, we’ll verify that the whole library is working by creating a separate project that uses the DMD DUB package as a dependency. We create a new DUB project in the test directory, called dub_package:

$ cd test
$ mkdir dub_package
$ cd dub_package
$ cat > dub.sdl <<EOF
> name "dmd-dub-test"
> description "Test of the DMD Dub package"
> license "BSL 1.0"
>
> dependency "dmd" path="../../"
> EOF
$ mkdir source

We create a new file, source/app.d, with the following content:

void main()
{
}

// lexer
unittest
{
    import ddmd.lexer;
    import ddmd.tokens;

    immutable expected = [
        TOKvoid,
        TOKidentifier,
        TOKlparen,
        TOKrparen,
        TOKlcurly,
        TOKrcurly
    ];

    immutable sourceCode = "void test() {} // foobar";
    scope lexer = new Lexer("test", sourceCode.ptr, 0, sourceCode.length, 0, 0);
    lexer.nextToken;

    TOK[] result;

    do
    {
        result ~= lexer.token.value;
    } while (lexer.nextToken != TOKeof);

    assert(result == expected);
}

// parser
unittest
{
    import ddmd.astbase;
    import ddmd.parse;

    scope parser = new Parser!ASTBase(null, null, false);
    assert(parser !is null);
}

The above file contains two unit tests, one for the lexer and one for the parser. We can run dub test to run the unit tests for this package:

$ dub test
No source files found in configuration 'library'. Falling back to "dub -b unittest".
Performing "unittest" build using dmd for x86_64.
dmd:root ~issue-17392-dub: building configuration "library"...
dmd:lexer ~issue-17392-dub: building configuration "library"...
../../src/ddmd/globals.d(339,21): Error: file "VERSION" cannot be found or not in a path specified with -J
dmd failed with exit code 1.

This gives us an error saying the compiler cannot find the VERSION file in any of the string import paths, even though we added the correct directory to them. If we run the tests with verbose output enabled, using the --verbose flag, we get a hint (the output has been reduced to save space):

dmd:lexer ~issue-17392-dub: building configuration "library"...
dmd -J. -lib

Here we see that the compiler is invoked with the -J. flag, which is what we previously specified in the lexer subpackage. The problem is that the current directory is now of the dmd-dub-test DUB package instead of the dmd DUB package. Looking at the documentation of DUB we can see there’s an environment variable, $PACKAGE_DIR, that we can use as the string import path instead of hardcoding it to use a single dot. We update the dflags setting of the lexer subpackage to use the $PACKAGE_DIR environment variable:

dflags "-J$PACKAGE_DIR"
}

Running the tests again shows that the error is fixed, but now we get a new error, a long list of undefined symbols (shortened here):

$ dub test
No source files found in configuration 'library'. Falling back to "dub -b unittest".
Performing "unittest" build using dmd for x86_64.
dmd:root ~issue-17392-dub: building configuration "library"...
dmd:lexer ~issue-17392-dub: building configuration "library"...
dmd:parser ~issue-17392-dub: building configuration "library"...
dmd-dub-test ~master: building configuration "application"...
Linking...
Undefined symbols for architecture x86_64:
  "_D4ddmd7astbase12__ModuleInfoZ", referenced from:
      _D3app12__ModuleInfoZ in dmd-dub-test.o

The reason for this is that we’re importing the ddmd.astbase module in the test of the parser, but it’s never compiled. We can solve that problem by adding it to the parser subpackage in the dmd DUB package. Running dmd again to show all its dependencies shows that it also depends on the ddmd.astbasevisitor module. We add these two modules as follows:

sourceFiles \
  "src/ddmd/astbase.d" \
  "src/ddmd/astbasevisitor.d" \
  "src/ddmd/parse.d"

Finally, running the tests again shows that everything is working correctly:

$ dub test
No source files found in configuration 'library'. Falling back to "dub -b unittest".
Performing "unittest" build using dmd for x86_64.
dmd:root ~issue-17392-dub: building configuration "library"...
dmd:lexer ~issue-17392-dub: building configuration "library"...
dmd:parser ~issue-17392-dub: building configuration "library"...
dmd-dub-test ~master: building configuration "application"...
Linking...
Running ./dmd-dub-test

Having verified that both the lexer and the parser work from a separate DUB package, here is the final package recipe for the dmd DUB package:

name "dmd"
description "The DMD compiler"
authors "Walter Bright"
copyright "Copyright © 1999-2017, Digital Mars"
license "BSL-1.0"

targetType "none"
dependency ":parser" version="*"

subPackage {
  name "root"
  targetType "library"
  sourcePaths "src/ddmd/root"
}

subPackage {
  name "lexer"
  targetType "library"
  sourcePaths

  sourceFiles \
    "src/ddmd/console.d" \
    "src/ddmd/entity.d" \
    "src/ddmd/errors.d" \
    "src/ddmd/globals.d" \
    "src/ddmd/id.d" \
    "src/ddmd/identifier.d" \
    "src/ddmd/lexer.d" \
    "src/ddmd/tokens.d" \
    "src/ddmd/utf.d"

  dflags "-J$PACKAGE_DIR"

  dependency "dmd:root" version="*"
}

subPackage {
  name "parser"
  targetType "library"
  sourcePaths

  sourceFiles \
    "src/ddmd/astbase.d" \
    "src/ddmd/astbasevisitor.d" \
    "src/ddmd/parse.d"

  dependency "dmd:lexer" version="*"
}

All this has now been merged into master and the DUB package is available here: http://code.dlang.org/packages/dmd. Happy hacking!

New D Compiler Release: DMD 2.075.0

DMD 2.075.0 was released a few days back. As with every release, the changelog is available so you can browse the list of fixed bugs and new features. 2.075.0 can be fetched from the dlang.org download page, which always makes available the latest DMD release alongside a nightly build.

Notable Changes

Every DMD release brings with it a number of bug fixes, changes, and enhancements. Here are some of the more noteworthy changes in this release.

Two array properties removed

Anyone who does a lot of work with D’s ranges will likely have encountered this little annoyance that arises from the built-in .sort property of arrays.

void main()
{
    import std.algorithm : remove, sort;
    import std.array : array;
    int[] nums = [5, 3, 1, 2, 4];
    nums = nums.sort.remove(2).array;
}

The .sort property has been deprecated for ages, so the above would result in the following error:

sorted.d(6): Deprecation: use std.algorithm.sort instead of .sort property

The workaround would be to add an empty set of parentheses to the sort call. With DMD 2.075.0, this is no longer necessary and the above will compile. Both the .sort and .reverse array properties have finally been removed from the language.

For the uninitiated, D has two features that have proven convenient in the functional pipeline programming style typically used with ranges. One is that parentheses on a function call are optional when there are no parameters. The other is Universal Function Call Syntax (UFCS), which allows a function call to be made using the dot notation on the first argument, so that a function int add(int a, int b) can be called as: 10.add(5).
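
A minimal sketch showing both features together:

import std.stdio;

int add(int a, int b)
{
    return a + b;
}

void main()
{
    assert(add(10, 5) == 15); // regular call
    assert(10.add(5) == 15);  // UFCS: dot notation on the first argument
    10.add(5).writeln;        // optional parentheses, as in range pipelines
}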

Each of D’s built-in types comes with a set of built-in properties. Given that the built-in properties are not functions, no parentheses are used to access them. The .sort array property has been around since the early days of D1. At the time, it was rather useful and convenient for anyone who was happy with the default implementation. When D2 came along with the range paradigm, the standard library was given a set of functions that can treat arrays as ranges, opening them up to use with the many range based functions in the std.algorithm package and elsewhere.

With optional parentheses, UFCS, and a range-based function in std.algorithm called sort, conflict was inevitable. Now range-based programmers can put that behind them and take one more pair of parentheses out of their pipelines.

The breaking up of std.datetime

The std.datetime module has had a reputation as the largest module in D’s standard library. Some developers have been known to use it as a stress test for their tooling. It was added to the library long before D got the special package module feature, which allows multiple modules in a package to be imported as a single module.

Once package modules were added, Jonathan M. Davis, the original std.datetime developer, found it challenging to split the monolith into multiple modules. Then, at DConf 2017, he could be seen toiling away on his laptop in the conference hall and the hotel lobby. On the final day of the conference, the day of the DConf Hackathon, he announced that std.datetime was now a package. DMD 2.075.0 is the first release where the new module structure is available.

Any existing code using the old module should still compile. However, any static libraries or object files lying around with the old symbols stuffed inside may need to be recompiled.
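
For example, code like the following compiles unchanged, with the import now resolving to the new package module:

import std.datetime; // now a package module that publicly imports the submodules

void main()
{
    import std.stdio : writeln;
    writeln(Clock.currTime()); // Clock now lives in std.datetime.systime
}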

Colorized compiler messages

This one is missing from the changelog. DMD now has the ability to output colorized messages. The implementation required going through the existing error messages and properly annotating them where appropriate, so there may well be some messages for which the colors are missing. Also, given that this is a brand new feature and people can be picky about their terminal colors, more work will likely be done on this in the future. Perhaps that might include support for customization.


Compiler Ddoc documentation online

DMD, though originally written in C++, was converted to D some time ago. Now that more D programmers are able to contribute to the compiler, work has gone into documenting its source using D’s built-in Ddoc syntax. The result is now online, accessible from the sidebar of the existing library reference. A good starting point is the ddmd.mars module.

And more…

The above is a small part of the bigger picture. The bugfix list shows 89 bugs, regressions, and enhancements across the compiler, runtime, standard library, and web site. See the full changelog for the details.

Thanks to everyone who contributed to this release, whether it was by reporting issues, submitting or reviewing pull requests, testing out the beta, or carrying out any of the numerous small tasks that help a new release see the light of day.

DCompute: GPGPU with Native D for OpenCL and CUDA

Nicholas Wilson is a student at Murdoch University, studying for his BEng (Hons)/BSc in Industrial Computer Systems (Hons) and Instrumentation & Control/ Molecular Biology & Genetics and Biomedical Science. He just finished his thesis on low-cost defect detection of solar cells by electroluminescence imaging, which gives him time to work on DCompute and write about it for the D Blog. He plays the piano, ice skates, and has spent 7 years putting D to use on number bashing, automation, and anything else that he could make a computer do for him.


DCompute is a framework and compiler extension to support writing native kernels for OpenCL and CUDA in D to utilise GPUs and other accelerators for computationally intensive code. In development are drivers to automate the interactions between user code and the tedious and error prone compute APIs with the goal of enabling the rapid development of high performance D libraries and applications.

Introduction

After watching John Colvin’s DConf 2016 presentation in May of last year on using D’s metaprogramming to make the OpenCL API marginally less horrible to use, I thought, “This would be so much easier to do if we were able to write kernels in D, rather than doing string manipulations in OpenCL C”. At the time, I was coming up to the end of a rather busy semester and thought that would make a good winter[1] project. After all, LDC, the LLVM D Compiler, has access to LLVM’s SPIR-V and PTX backends, and I thought, “It can’t be too hard, it’s only glue code”. I slightly underestimated the time it would take, finishing the first stage of DCompute (because naming things is hard) and mainlining the changes I made to LDC at the end of February, eight months later — just in time for the close of submissions to DConf, where I gave a talk on the progress I had made.

Apart from familiarising myself with the LDC and DMD front-end codebases, I also had to understand the LLVM SPIR-V and PTX backends that I was trying to target, because they require the use of special metadata (e.g. to denote that a function is a kernel) and address spaces, used to represent __global & friends in OpenCL C and __global__ & friends in CUDA, and I had to introduce these concepts into LDC.

But once I was familiar with the code and had sorted the above discrepancies, it was mostly smooth sailing translating the OpenCL and CUDA modifiers into compiler-recognised attributes and wrapping the intrinsics into an easy to use and consistent interface.

When it was all working and almost ready to merge into mainline LDC, I hit a bit of a snag with regards to CI: the SPIR-V backend that was being developed by Khronos was based on the quite old LLVM 3.6.1 and, despite my suggestions, did not have any releases. So I forward ported the backend and the conversion utility to the master branch of LLVM and made a release myself. Still in progress on this front are converting magic intrinsics to proper LLVM intrinsics and transitioning to a TableGen-driven approach for the backend in preparation for merging the backend into LLVM Trunk. This should hopefully be done soon™.

Current state of DCompute

With the current state of DCompute we are able to write kernels natively in D and have access to most of its language-defining features like templates & static introspection, UFCS, scope guards, ranges & algorithms and CTFE. Notably missing, for hardware and performance reasons, are those features commonly excluded in kernel languages, like function pointers, virtual functions, dynamic recursion, RTTI, exceptions and the use of the garbage collector. Note that unlike OpenCL C++ we allow kernel functions to be templated and have overloads and default values. Still in development is support for images and pipes.

Example code

To write kernels in D, we need to pass -mdcompute-targets=<targets> to LDC, where <targets> is a comma-separated list of the desired targets to build for, e.g. ocl-120,cuda-350 for OpenCL 1.2 and CUDA compute capability 3.5, respectively (yes, we can do them all at once!). We get one file for each target, e.g. kernels_ocl120_64.spv, when built in 64-bit mode, which contains all of the code for that device.

The vector add kernel in D is:

@compute(CompileFor.deviceOnly) module example;
import ldc.dcompute;
import dcompute.std.index;

alias gf = GlobalPointer!float;

@kernel void vadd(gf a, gf b, gf c) 
{
	auto x = GlobalIndex.x;
	a[x] = b[x]+c[x];
}

Modules marked with the @compute attribute are compiled for each of the command line targets, @kernel makes a function a kernel, and GlobalPointer is the equivalent of the __global qualifier in OpenCL.

Kernels are not restricted to just functions — lambdas & templates also work:

@kernel void map(alias F)(KernelArgs!F args)
{
    F(args);
}
//In host code
AutoBuffer!float x,y,z; // y & z initialised with data
q.enqueue!(map!((a,b,c) => a=b+c))(x.length)(x, y, z);

Here, KernelArgs translates host types to device types (e.g. buffers to pointers or, as in this example, AutoBuffers to AutoIndexed Pointers) so that we encapsulate the differences between the host and device types.

The last line is the expected syntax for launching kernels, q.enqueue!kernel(dimensions)(args), akin to CUDA’s kernel<<<dimensions,queue>>>(args). The libraries for launching kernels are in development.

Unlike CUDA, where all the magic for transforming the above expression into code on the host lies in the compiler, q.enqueue!func(sizes)(args) will be processed by static introspection of the driver library of DCompute.
The sole reason we can do this in D is that we are able to query the mangled name the compiler will give to a symbol via the symbol’s .mangleof property. This, in combination with D’s easy to use and powerful templates, means we can significantly reduce the mental overhead associated with using the compute APIs. Also, implementing this in the library will be much simpler, and therefore faster to implement, than putting the same behaviour in the compiler. While this may not seem much for CUDA users, this will be a breath of fresh air to OpenCL users (just look at the OpenCL vector add host code example steps 7-11).
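
As a small illustration of the property (the printed mangling is indicative and depends on the module name):

import std.stdio;

void vadd(float* a, float* b, float* c) {}

void main()
{
    // .mangleof is a compile-time string holding the symbol's mangled name.
    writeln(vadd.mangleof); // e.g. _D7example4vaddFPfPfPfZv
}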

While you can’t do that just yet in DCompute, development should start to progress quickly and hopefully make it a reality soon.

I would like to thank John Colvin for the initial inspiration, Mike Parker for editing, and the LDC folks, David Nadlinger, Kai Nacke, Martin Kinkelin, with a special thanks to Johan Engelen, for their help with understanding the LDC codebase and reviewing my work.

If you would like to help develop DCompute (or be kept in the loop), feel free to drop a line at the libmir Gitter. Similarly, any efforts preparing the SPIR-V backend for inclusion into LLVM are also greatly appreciated.

[1] Southern hemisphere.

The New CTFE Engine

Stefan Koch is the maintainer of sqlite-d, a native D sqlite reader, and has contributed to projects like SDC (the Stupid D Compiler) and vibe.d. He was also responsible for a 10% performance improvement in D’s current CTFE implementation and is currently writing a new CTFE engine, the subject of this post.


For the past nine months, I’ve been working on a project called NewCTFE, a reimplementation of the Compile-Time Function Evaluation (CTFE) feature of the D compiler front-end. CTFE is considered one of the game-changing features of D.

As the name implies, CTFE allows certain functions to be evaluated by the compiler while it is compiling the source code in which the functions are implemented. As long as all arguments to a function are available at compile time and the function is pure (has no side effects), then the function qualifies for CTFE. The compiler will replace the function call with the result.

Since this is an integral part of the language, pure functions can be evaluated anywhere a compile-time constant may go. A simple example can be found in the standard library module, std.uri, where CTFE is used to compute a lookup table. It looks like this:

private immutable ubyte[128] uri_flags = // indexed by character
({
    ubyte[128] uflags;

    // Compile time initialize
    uflags['#'] |= URI_Hash;

    foreach (c; 'A' .. 'Z' + 1)
    {
        uflags[c] |= URI_Alpha;
        uflags[c + 0x20] |= URI_Alpha; // lowercase letters
    }

    foreach (c; '0' .. '9' + 1) uflags[c] |= URI_Digit;

    foreach (c; ";/?:@&=+$,") uflags[c] |= URI_Reserved;

    foreach (c; "-_.!~*'()") uflags[c] |= URI_Mark;

    return uflags;
})();

Instead of populating the table with magic values, a simple expressive function literal is used. This is much easier to understand and debug than some opaque static array. The ({ starts a function literal, the }) closes it. The () at the end tells the compiler to immediately invoke the literal, so that uri_flags becomes the result of the call.

Functions are only evaluated at compile time if they need to be. uri_flags in the snippet above is declared in module scope. When a module-scope variable is initialized in this manner, the initializer must be available at compile time. In this case, since the initializer is a function literal, an attempt will be made to perform CTFE. This particular literal has no arguments and is pure, so the attempt succeeds.
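
A minimal sketch of forcing CTFE in another context:

int square(int x) pure
{
    return x * x;
}

// An enum initializer must be a compile-time constant,
// so the call below is evaluated via CTFE.
enum nine = square(3);
static assert(nine == 9);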

For a more in-depth discussion of CTFE, see this article by H. S. Teoh at the D Wiki.

Of course, the same technique can be applied to more complicated problems as well; std.regex, for example, can build a specialized automaton for a regex at compile time using CTFE. However, as soon as std.regex is used with CTFE for non-trivial patterns, compile times can become extremely high (in D everything that takes longer than a second to compile is bloat-ware :)). Eventually, as patterns get more complex, the compiler will run out of memory and probably take the whole system down with it.
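
That capability is exposed through std.regex’s ctRegex. A minimal sketch:

void main()
{
    import std.regex;

    // The automaton for this pattern is built at compile time via CTFE.
    auto datePattern = ctRegex!`(\d{4})-(\d{2})-(\d{2})`;
    assert(matchFirst("2017-07-28", datePattern));
}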

The blame for this can be laid at the feet of the current CTFE interpreter’s architecture. It’s an AST interpreter, which means that it interprets the AST while traversing it. To represent the result of interpreted expressions, it uses DMD’s AST node classes. This means that every expression encountered will allocate one or more AST nodes. Within a tight loop, the interpreter can easily generate over 100_000_000 nodes and eat a few gigabytes of RAM. That can exhaust memory quite quickly.

Issue 12844 complains about std.regex taking more than 16GB of RAM. For one pattern. Then there’s issue 6498, which executes a simple 0 to 10_000_000 while loop via CTFE and runs out of memory.
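
The latter boils down to code along these lines (a sketch, not the exact snippet from the report):

int count()
{
    int i;
    while (i < 10_000_000)
        ++i;
    return i;
}

// The enum initializer forces CTFE; under the old AST interpreter,
// every iteration of the loop above allocates fresh AST nodes.
enum result = count();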

Simply freeing nodes doesn’t fix the problem; we don’t know which nodes to free and enabling the garbage collector makes the whole compiler brutally slow. Luckily there is another approach which doesn’t allocate for every expression encountered. It involves compiling the function to a virtual ISA (Instruction Set Architecture). This virtual ISA, also known as bytecode, is then given to a dedicated interpreter for that ISA (in the case in which a virtual ISA happens to be the same as the ISA of the host, we call it a JIT (Just in Time) interpreter).

The NewCTFE project concerns itself with implementing such a bytecode interpreter. Writing the actual interpreter (a CPU emulator for a virtual CPU/ISA) is reasonably simple. However, compiling code to a virtual ISA is exactly as much work as compiling it to a real ISA (though, a virtual ISA has the added benefit that it can be extended for customized needs, but that makes it harder to do JIT later). That’s why it took a month just to get the first simple examples running on the new CTFE engine, and why slightly more complicated ones still aren’t running even after 9 months of development. At the end of the post, you’ll find an approximate timeline of the work done so far.

I’ll be giving a presentation at DConf 2017, where I’ll discuss my experience implementing the engine and explain some of the technical details, particularly regarding the trade-offs and design choices I’ve made. The current estimation is that the 1.0 goals will not be met by then, but I’ll keep coding away until it’s done.

Those interested in keeping up with my progress can follow my status updates in the D forums. At some point in the future, I will write another article on some of the technical details of the implementation. In the meantime, I hope the following listing does shed some light on how much work it is to implement NewCTFE 🙂

  • May 9th 2016
    Announcement of the plan to improve CTFE.
  • May 27th 2016
    Announcement that work on the new engine has begun.
  • May 28th 2016
    Simple memory management change failed.
  • June 3rd 2016
    Decision to implement a bytecode interpreter.
  • June 30th 2016
    First code (taken from issue 6498) consisting of simple integer arithmetic runs.
  • July 14th 2016
    ASCII string indexing works.
  • July 15th 2016
    Initial struct support
  • Sometime between July and August
    First switches work.
  • August 17th 2016
    Support for the special cases if(__ctfe) and if(!__ctfe)
  • Sometime between August and September
    Ternary expressions are supported
  • September 08th 2016
    First Phobos unit tests pass.
  • September 25th 2016
    Support for returning strings and ternary expressions.
  • October 16th 2016
    First (almost working) version of the LLVM backend.
  • October 30th 2016
    First failed attempts to support function calls.
  • November 1st 2016
    DRuntime unit tests pass for the first time.
  • November 10th 2016
    Failed attempt to implement string concatenation.
  • November 14th 2016
    Array expansion, e.g. assignment to the length property, is supported.
  • November 14th 2016
    Assignment of array indexes is supported.
  • November 18th 2016
    Support for arrays as function parameters.
  • November 19th 2016
    Performance fixes.
  • November 20th 2016
    Fixing the broken while(true) / for (;;) loops; they can now be broken out of 🙂
  • November 25th 2016
    Fixes to goto and switch handling.
  • November 29th 2016
    Fixes to continue and break handling.
  • November 30th 2016
    Initial support for assert
  • December 02nd 2016
    Bailout on void-initialized values (since they can lead to undefined behavior).
  • December 03rd 2016
    Initial support for returning struct literals.
  • December 05th 2016
    Performance fix to the bytecode generator.
  • December 07th 2016
    Fixes to continue and break in for statements (continue must not skip the increment step)
  • December 08th 2016
    Array literals with variables inside are now supported: [1, n, 3]
  • December 08th 2016
    Fixed a bug in switch statements.
  • December 10th 2016
    Fixed a nasty evaluation order bug.
  • December 13th 2016
    Some progress on function calls.
  • December 14th 2016
    Initial support for strings in switches.
  • December 15th 2016
    Assignment of static arrays is now supported.
  • December 17th 2016
    Fixing goto statements (we were ignoring the last goto to any label :)).
  • December 17th 2016
    De-macrofied string-equals.
  • December 20th 2016
    Implement check to guard against dereferencing null pointers (yes… that one was oh so fun).
  • December 22nd 2016
    Initial support for pointers.
  • December 25th 2016
    static immutable variables can now be accessed (yes the result is recomputed … who cares).
  • January 02nd 2017
    First Function calls are supported !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  • January 17th 2017
    Recursive function calls work now 🙂
  • January 23rd 2017
    The interpret3.d unit-test passes.
  • January 24th 2017
    We are green on 64bit!
  • January 25th 2017
    Green on all platforms !!!!! (with blacklisting though)
  • January 26th 2017
    Fixed special case cast(void*) size_t.max (this one cannot go through the normal pointer support, which assumes that you have something valid to dereference).
  • January 26th 2017
    Member function calls are supported!
  • January 31st 2017
    Fixed a bug in switch handling.
  • February 03rd 2017
    Initial function pointer support.
  • Rest of February 2017
    Wild goose chase for issue #17220
  • March 11th 2017
    Initial support for slices.
  • March 15th 2017
    String slicing works.
  • March 18th 2017
    $ in slice expressions is now supported.
  • March 19th 2017
    The concatenation operator (c = a ~ b) works.
  • March 22nd 2017
    Fixed a switch fallthrough bug.

Project Highlight: workspace-d

Not so long ago, Jan Jurzitza sat down at his keyboard intent on writing a D plugin for Atom, his text editor of choice at the time. Then came disappointment.

“I was pretty unhappy with their API,” he says.

Visual Studio Code was released a short time after. He decided to give it a go and “instantly fell in love with it”. His Atom plugin was pushed aside and he started work on a new plugin for VS Code called code-d.

However, I did not want to maintain the same functionality in two plugins for two different text editors, so I thought that making a program which contains most of the plugin logic, like starting and calling dcd, dscanner, dfmt, etc., would be beneficial and would also help with including D support in more editors and IDEs in the future.

For the uninitiated, DCD (the D Completion Daemon), DScanner, and Dfmt are D-oriented tools for plugin developers, all maintained by Brian Schott. They are, respectively, a client-server based auto-completion program, a source code analyzer, and a code formatter. A number of IDE and text editor plugins employ them directly.

So Jan started work on his new tool and named it workspace-d.

With workspace-d I want to make it simple for plugin developers to integrate D functionality into their editor of choice. workspace-d is designed to work both as a standalone program through stdio and as a D library. Once I ported most of the code from my Atom extension to workspace-d, I could simply spawn it as a subprocess in code-d, which I got working with it quite quickly.

In addition to porting his Atom plugin to use workspace-d, he also created one for Sublime Text. Currently, he’s not devoting any time to either and is looking for someone else to take over maintenance of one or both. Anyone interested might start by submitting pull requests. Aside from workspace-d itself, Jan’s focus is on code-d.

He’s recently been working on version 2.0 of workspace-d, with a focus on streamlining the way it handles requests.

Using traits, templates, and CTFE (Compile-Time Function Execution), basically all D compile time magic, I was able to make an automatic wrapper for the functions for version 2.0. Basically, when a request like {"cmd":"hello"} comes in, it runs the D function hello with its default arguments. If the arguments don’t match, it responds with an error. This system automatically binds function arguments to JSON values and generates a response from the return value.
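
The following is a greatly simplified sketch of that idea (not workspace-d’s actual implementation; the dispatcher and the sample function are ours):

import std.json : JSONValue, parseJSON;
import std.traits : Parameters, ParameterDefaults, ParameterIdentifierTuple;

string hello(string name = "world")
{
    return "hello, " ~ name;
}

// Binds JSON fields to the parameters of fn, falling back to the
// declared default arguments when a field is missing.
JSONValue call(alias fn)(JSONValue request)
{
    Parameters!fn args;
    foreach (i, name; ParameterIdentifierTuple!fn)
    {
        if (auto field = name in request)
            args[i] = field.get!(typeof(args[i]));
        else
            args[i] = ParameterDefaults!fn[i];
    }
    return JSONValue(fn(args));
}

void main()
{
    assert(call!hello(parseJSON(`{}`)).str == "hello, world");
    assert(call!hello(parseJSON(`{"name":"D"}`)).str == "hello, D");
}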

To deserialize the JSON requests, he’s using painlessjson, a third-party library available in the DUB package registry.

It works really great and I can really recommend it for some simple and easy conversions between D types and JSON. This change really cleaned up all the code and made it possible to use workspace-d as a library.

He’s also working on a new project, serve-d, that works with Microsoft’s Language Server Protocol.

serve-d is an alternative for the workspace-d command line I/O for those who prefer JSON RPC over my custom binary/JSON mix. It’s fiber based and uses workspace-d as a library, which results in really clean code. There’s an alpha version of the implementation on github already, both the server and a new branch on code-d. With the Language Server Protocol, I’m hoping for easier integration in other editors. The concept is basically the same as workspace-d’s command line interface, but, because Microsoft is such a big company, I’m hoping that more editors by big developers are going to implement this protocol.

Building and installing workspace-d should go pretty smoothly on Linux or OS X, but it’s currently a little bumpy on Windows. Because of an issue Jan has yet to resolve, it can only be built on Windows with LDC.

The auto completion didn’t work for some people on Windows because it got stuck in the std.process.execute function when creating a pipe to write to. I couldn’t find any way to reproduce it in a standalone program so I couldn’t file a bug either. So what we did to avoid this issue in the short term was to simply disallow compilation on Windows using DMD. It works just fine when compiled with LDC.

Jan’s primarily a Linux user (he doesn’t own a Mac and only runs Windows in a VM). He credits GitHub user @Andrepuel for getting it operational on OS X, and @aka-demik for finding the issue on Windows and verifying that it compiles with LDC. He’ll be grateful to anyone who can help fully resolve the Windows/DMD issue once and for all.

If you’re looking to develop a D plugin for your favorite editor, consider taking advantage of the work Jan has already done with workspace-d to save yourself some effort. And VS Code users can put it to use via code-d to get code completion and more. Visit its VS Code marketplace page to read reviews and installation instructions.

Editable and Runnable Doc Examples on dlang.org

Sebastian Wilzbach was a GSoC student for the D Language Foundation in 2016 and has since become a regular contributor to Phobos, D’s standard library, and dlang.org.


This article explains the steps that were needed to have editable and runnable examples in the documentation on dlang.org. First, let’s begin with the building blocks.

Unit testing in D

One of D’s coolest features is its unittest block, which allows the insertion of testable code anywhere in a program. It has become idiomatic for a function to be followed directly by its tests. For example, let’s consider a simple function add which is accompanied by two tests:

auto add(int a, int b)
{
    return a + b;
}

unittest
{
    assert(2.add(2) == 4);
    assert(3.add(4) == 7);
}

By default, all unittest blocks will be ignored by the compiler. Specifying -unittest on the compiler’s command line will cause the unit tests to be included in the compiled binary. Combined with -main, tests in D can be directly executed with:

rdmd -main -unittest add.d

If a unittest block is annotated with embedded documentation, a D documentation generator can also display the tests as examples in the generated documentation. The DMD compiler ships with a built-in documentation generator (DDoc), which can be run with the -D flag, so executing:

dmd -D -main add.d

would yield the documentation of the add function above with its tests displayed as examples, as demonstrated here:

Please note that the documentation on dlang.org is generated with DDoc. However, in case you don’t like DDoc, there are several other options.

Executing code on the web

Frequent readers of the D Blog might remember Damian Ziemba’s DPaste – an online compiler for the D Programming language. In 2012, he made the examples on the front page of D’s website runnable via his service. Back in those old days, the website of the D Programming language looked like this:

As a matter of fact, until 2015, communication with DPaste was done in XML.

Putting things together

So D has a unit test system that allows placing executable unit tests next to the functions they test, the tests can also be rendered as examples in the generated documentation, and there exists a service, in the form of DPaste, that allows D code to be executed on the web. The only thing missing was to link them together to produce interactive documentation for a compiled language.

There was one big caveat that needed to be addressed before that could happen. While D’s test suite, which is run on ten different build machines, ensures that all unit tests compile & run without errors, an extracted test might contain symbols that were imported at module scope and thus wouldn’t be runnable on dlang.org. A unittest block can only be completely independent of the module in which it is declared if all of its symbols are imported locally in the test’s scope. The solution was rather simple: extract all tests from Phobos, then compile and execute them separately to ensure that a user won’t hit a “missing import” error on dlang.org. Thanks to D’s ultra-fast frontend, this step takes less than a minute on a typical machine in single-core build mode.
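
In practice, that means a documented unit test imports everything it needs locally, along the lines of this sketch:

/// Returns the sum of the squares of `a` and `b`.
auto sumOfSquares(int a, int b)
{
    return a * a + b * b;
}

///
unittest
{
    // Imported inside the test so the extracted example stays self-contained.
    import std.math : sqrt;

    assert(sumOfSquares(3, 4) == 25);
    assert(sqrt(25.0) == 5);
}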

Moreover, to prevent any regressions, this has been integrated into Phobos’s test suite and is run for every PR via CircleCI. As Phobos has extensive coverage with unit tests, we started this transition with a blacklist and, step by step, removed modules for which all extracted tests compile. With continuous checking in place, we were certain that none of the exposed unit tests would throw any errors when executed in the documentation, so we could do the flip and replace the syntax-highlighted unit test examples with an interactive code editor.

Going one step further

With this setup in place, hitting the “Run” button would merely show the users “All tests passed”. While that’s always good feedback, it conveys less information than is usually desirable.

Documentation that supports runnable examples tends to send any output to stdout. This allows the reader to take the example and modify it as needed while still seeing useful output about the modifications. So, for example, instead of using assertions to validate the output of a function, which is idiomatic in D unit tests and examples:

assert(myFun() == 4);

Other documentation usually prints to stdout and shows the expected output in a comment. In D, that would look like this:

writeln(myFun()); // 4

I initially tried to do such a transformation with regular expressions, but I was quickly bitten by the complexity of parsing a context-free language. So I made another attempt using Brian Schott’s libdparse, a library to parse and lex D source code. libdparse allows one to traverse the abstract syntax tree (AST) of a D source file. During the traversal of the AST, the transformation tool can rewrite all AssertExpressions into writeln calls, similar to the way other documentation displays examples. To speak in the vocabulary of compiler devs: we are lowering AssertExpressions into the more humanly digestible writeln calls!
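
A rough sketch of the traversal step, written from memory of libdparse’s visitor API (exact signatures vary between versions, and the real tool rewrites the nodes rather than merely reporting them):

import dparse.ast;
import dparse.lexer;
import dparse.parser : parseModule;
import dparse.rollback_allocator : RollbackAllocator;
import std.stdio : writeln;

// Visits every node in the AST and reports each AssertExpression,
// the node type the real tool lowers into a writeln call.
class AssertFinder : ASTVisitor
{
    alias visit = ASTVisitor.visit;

    override void visit(const AssertExpression expr)
    {
        writeln("found an assert expression");
        expr.accept(this); // continue into nested expressions
    }
}

void main()
{
    auto code = cast(ubyte[]) "unittest { assert(1 + 1 == 2); }".dup;
    LexerConfig config;
    auto cache = StringCache(StringCache.defaultBucketCount);
    auto tokens = getTokensForParser(code, config, &cache);
    RollbackAllocator allocator;
    auto mod = parseModule(tokens, "example.d", &allocator);
    new AssertFinder().visit(mod);
}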

Once the AST has been traversed and modified, it needs to be transformed into source code again. This led to improvements in libdparse’s formatting capabilities (1, 2).

The future

As of now, there are still a small number of functions in Phobos that don’t have a nice public example that is runnable on dlang.org. Tooling to check for this has recently been activated in Phobos. So now you can use this tool (make -f posix.mak has_public_example) to find functions lacking public tests and remove those modules from the blacklist.

Another target for improvement is DPaste. For example, it currently doesn’t cache incoming requests, which could improve the performance of executed examples on dlang.org. However, due to the fast compilation speed of the DMD compiler, this “slow-down” isn’t noticeable and is more of a perfectionist wish.

I hope you enjoy the new “Run” button on the documentation and have as much fun playing with it as I do. Click here to get started.