Category Archives: Compilers & Tools

Snowflake Strings

Walter Bright is the BDFL of the D Programming Language and founder of Digital Mars. He has decades of experience implementing compilers and interpreters for multiple languages, including Zortech C++, the first native C++ compiler. He also created Empire, the Wargame of the Century.


Consider the following D code in file.d:

int foo(int i) {
    assert(i < 3);
    return i;
}

This is equivalent to the C code:

#include <assert.h>

int foo(int i) {
    assert(i < 3);
    return i;
}

The assert() in the D code is “lowered” (i.e. rewritten by the compiler) to the following:

(i < 3 || _d_assertp("file.d", 2))

We’re interested in how the compiler writes that string literal, "file.d", to the generated object file. The most obvious implementation is to write the characters into the data section and push the address of that string when calling _d_assertp().

Indeed, that does work, and it’s tempting to stop there. But since we’re professional compiler nerds obsessed with performance, details, etc., there’s a lot more oil in that olive (to borrow one of Andrei’s favorite sayings). Let’s put it in a press and start turning the screw, because assert()s are used a lot.

First off, string literals are immutable (they were originally mutable in C and C++, but are no longer, and D tries to learn from such mistakes). This suggests the string can
be put in read-only memory. What advantages does that deliver?

  • Read-only memory is, well, read only. Attempts to write to it are met with a seg fault exception, courtesy of the CPU’s memory management logic (see the short demonstration below). Seg faults are a bit graceless, like a cop pulling you over for driving on the wrong side of the road, but at least there wasn’t a head-on collision or corruption of the program’s memory.
  • Read-only pages are never swapped out by the virtual memory system; they never get marked as “dirty” because they are never written to. They may get discarded and reloaded, but that’s much less costly.
  • Read-only memory is safe from corruption by malware (unless the malware infects the MMU, sigh).
  • Read-only memory in a shared library is shared – copies do not have to be made for each user of the shared library.
  • Read-only memory does not need to be scanned for pointers to the heap by the garbage collection system (if the application does GC).

Essentially, shoving as much as possible into read-only memory is good for performance, size and safety. There’s the first drop of oil.
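
To make the first point concrete, here’s a tiny demonstration (a sketch for illustration, not anything the compiler itself generates): defeat the type system by casting away immutability and try to write to a string literal, and the write will typically be rewarded with a seg fault because the literal lives in a read-only page.

void main() {
    immutable(char)* p = "file.d".ptr; // points into read-only memory
    auto q = cast(char*) p;            // casting away immutable: undefined behavior
    // *q = 'X';                       // uncommenting this will typically seg fault
}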

The next issue crops up as soon as there’s more than one assert:

int foo(int i, int j) {
    assert(i < 3);
    assert(j & 1);
    return i + j;
}

The compiler emits two copies of "file.d" into the object file. Since they’re identical, and read-only, it makes sense to only emit one copy:

string TMP = "file.d";
int foo(int i, int j) {
    (i < 3 || _d_assertp(TMP, 2));
    (j & 1 || _d_assertp(TMP, 3));
    return i + j;
}

This is called string pooling and is fairly simple to implement. The compiler maintains a hash table of string literals and their associated symbol names (TMP in this case).
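
As a rough sketch of the idea (hypothetical code, not dmd’s actual implementation), the pool can be as simple as an associative array mapping literal contents to the symbol generated for the first occurrence:

import std.conv : text;

// Minimal sketch of compiler-side string pooling.
struct StringPool {
    private string[string] symbols; // literal contents -> generated symbol name
    private size_t counter;

    // Return the symbol for this literal, creating one only on first use.
    string intern(string literal) {
        if (auto sym = literal in symbols)
            return *sym;
        immutable name = text("TMP", counter++);
        symbols[literal] = name;
        return name;
    }
}

unittest {
    StringPool pool;
    assert(pool.intern("file.d") == pool.intern("file.d")); // one copy per unique literal
}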

So far, this is working reasonably well. But when asserts migrate into header files, macros, and templates, the same string can appear in lots of object files, since the compiler doesn’t know what is happening in other object files (the separate compilation model). Other string literals can exhibit this behavior, too, when generic coding practices are used. There needs to be some way to present these in the object file so the linker can pool identical strings.

The dmd D compiler currently supports four different object file formats on different platforms:

  • Elf, for Linux and FreeBSD
  • Mach-O, for OSX
  • MS-COFF, for Win64
  • OMF, for Win32

Each does it in a different way, with different tradeoffs. The methods tend to be woefully under-documented, and figuring this stuff out is why I get paid the big bucks.

Elf

Elf turns out to have a magic section just for this purpose. It’s named .rodata.strN.M, where N is the number of bytes a character occupies and M is the alignment. For good old char strings, that would be .rodata.str1.1. The compiler just dumps the strings into that section, and the Elf linker looks through it, removing the redundant strings and adjusting the relocations accordingly. It’ll handle the usual string types – char, wchar, and dchar – with aplomb.

There are just a couple of flaws. The end of a string is determined by a nul character. This means that strings cannot have embedded nuls, or the linker will regard them as multiple strings and shuffle them about in unexpected ways. One cannot have relocations in those sections, either. This means it’s only good for C string literals, not other kinds of data.

This poses a problem for D, where strings are length-delineated rather than nul-terminated. Does this mean D is doomed to being unable to take advantage of the C-centric file formats and linker design? Not at all. The D compiler simply appends a nul when emitting string literals. If the string does have an embedded nul (allowed in D), it is not put in these special sections (and the benefit is lost, but such strings are thankfully rare).
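
That appended nul is also what makes D string literals directly usable with C functions expecting a char*. A quick check of the point just described:

import core.stdc.stdio : puts;

void main() {
    string s = "file.d";       // length-delineated D string
    assert(s.length == 6);
    assert(s.ptr[6] == '\0');  // the hidden terminator, one past the length
    puts(s.ptr);               // fine to pass straight to C, thanks to the nul
}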

Mach-O

Mach-O uses a variant of the Elf approach, a special section named __cstring. It’s more limited in that it only works with single-byte chars. No wchar_ts for you! If there ever was confirmation that UTF-16 and UTF-32 are dead-end string types, this should be it.

MS-COFF

Microsoft invented MS-COFF by extending the old Unix COFF format. It has many magic sections, but none specifically for strings. Instead, it uses what are called COMDAT sections, one for each string. COMDATs are sections tagged with a unique name, and when the linker is faced with multiple COMDATs with the same name, one is picked and all references to the other COMDATs are rewritten to refer to the Anointed One. COMDATs first appeared in object formats with the advent of C++ templates, since template code generation tends to generate the same code over and over in separate files.

(Isn’t it interesting how object file formats are driven by the needs of C and C++?)

The COMDAT for "hello" would look something like this:

??_C@_05CJBACGMB@hello?$AA@:
db 'h', 'e', 'l', 'l', 'o', 0

The tty noise there is the mangled name of the COMDAT, which is generated from the string literal’s contents. The algorithm must match across compilation units, as that is how the linker decides which ones are the same (experimenting with it will show that the substring CJBACGMB is some sort of hash). Microsoft’s algorithm for the mangling and hash is undocumented as far as I can determine, but it doesn’t matter anyway; the name only has to have a 1:1 mapping to the string literal. That screams “MD5 hash” to me, so that’s what dmd does. The name is an MD5 hash of the string literal contents, which also has the nice property that no matter how large the string gets, the identifier doesn’t grow.
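
Here’s a rough sketch of that idea in D using std.digest (the helper name and prefix below are made up for illustration; dmd’s actual mangling differs in its exact format): hash the literal’s contents and use the hex digest as the name, so identical literals always get identical, fixed-length names.

import std.digest.md : md5Of;
import std.digest : toHexString;

// Illustrative only: derive a fixed-length symbol name from a literal's contents.
string comdatName(const(char)[] literal) {
    return "__str_" ~ toHexString(md5Of(literal)).idup;
}

unittest {
    assert(comdatName("hello") == comdatName("hello"));  // same contents, same name
    assert(comdatName("hello").length ==
           comdatName("a much longer string literal").length); // name never grows
}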

COMDATs can have anything stuffed in them, so this is a feature that is usable for a lot more than just strings.

The downside of the COMDAT scheme is the space taken up by all those names, so a program shipped with debug symbols in it can get much larger.

OMF

The caboose is OMF, an ancient format going back to the early ’80s. It was extended with a kludgy COMDAT system to support C++ just before most everyone abandoned it. DMD still emits it for Win32 programs. We’re stuck with it because it’s the only format the default linker (OPTLINK) understands, and so we find a way to press it into service.

Since it has COMDATs, that’s the mechanism used. The wrinkle is that COMDATs are code sections or data sections only; there are no other options. We want the strings to be read-only, so the string COMDATs are emitted as code sections (!). Hey, it works.

Conclusion

I don’t think we’ve pressed all the oil out of that olive yet. It may be like memcpy, where every new crop of programmers thinks of a way to speed it up.

I hope you’ve enjoyed our little tour of literals, and may all your string literals be unique snowflakes.

Thanks to Mike Parker for his help with this article.

Project Highlight: DPaste

DPaste is an online compiler and collaboration tool for the D Programming Language. Type in some D code, click run, and see the results. Code can also be saved so that it can be shared with others. Since it was first announced in the forums back in 2012, it has become a frequently used tool in facilitating online discussions in the D community. But Damian Ziemba, the creator and maintainer of DPaste, didn’t set out with that goal in mind.

Actually it was quite spontaneous and random. I was hanging out on the #D IRC channel at freenode. I was quite amazed at how active this channel was. People were dropping by asking questions, and lots of code snippets were floating around. One of the members created an IRC bot that was able to compile code snippets, but it was for his own language that he created with D. Someone else followed and created the same kind of bot, but with the ability to compile code in D, though it didn’t last long as it was run on his own PC. So I wrote my own, purely in D, that compiled D snippets. It took me maybe 4-5 hours to write both an IRC support lib and the logic itself. Then came some hardening of the server where the bot was running and voila, we had nazbot @ #D, which was able to evaluate statements like ^stmt import std.stdio; writeln("hello world"); and would respond with "hello world".

Nazbot became popular and people started floating new ideas. That ultimately led Damian to take a CMS he had already written in PHP and repurpose it to use as a frontend for what then became DPaste.

The frontend is written in PHP and uses MySQL for storage. It acts as a web interface (using a Bootstrap HTML template and jQuery) and as an API provider for third parties. The backend is responsible for actual compilation and execution. It’s possible to use multiple backends; the frontend acts as a kind of load balancer when choosing among them. The frontend and the backend may live on different machines.

DPaste is primarily used through the web interface, but it’s also used by dlang.org.

Once dpaste.dzfl.pl was well received, the idea popped up that maybe we could provide runnable examples on the main site. So it was implemented. The next idea, proposed by Andrei Alexandrescu, was to enable runnable examples on all of the Phobos documentation. I got swallowed by real life and couldn’t contribute at the time, but eventually Sebastian Wilzbach took it up and finished the implementation. So today we have interactive examples in the Phobos documentation.

When Damian first started work on DPaste in 2011, the D ecosystem looked a bit different than it does today.

There weren’t as many 3rd party libraries as we have now; there was no DUB, there was no vibe.d, etc. I wish I’d had vibe.d back then. I would have implemented the frontend in D instead of PHP.

What I enjoy the most about D is just how “nice” to the eye the language is (compared to C and C++, which I work with on a daily basis) and how easy it is to express what’s in your mind. I’ve never had to stop and think, “how the hell can I implement this”, which is quite common with C++ in my case. In the current state, what is also amazing is how D is becoming a “batteries-included” package. Whatever you need, you just dub fetch it.

He’s implemented DPaste such that it requires very little in terms of maintenance costs. It automatically updates itself to the latest compiler release and also knows how to restart itself if the backend hangs for some reason. He says the only real issue he’s had to deal with over the past five years is spam, which has forced him to reimplement the captcha mechanism several times.

As for the future? He has a few things in mind.

I plan to rewrite the backend from scratch, open source it and use a docker image so anybody can easily pick up development or host his own backend (which is almost done). Functionally, I want to maintain different compiler versions like DMD 2.061.0, DMD 2.062.1, DMD 2.063.0, LDC 0.xx, GDC x.xx.xx, etc., and connect more architectures as backends (currently x86, arm and aarch64 are planned).

I also want to rewrite the frontend in D using vibe.d, websockets, and angular.js. In general, I would like to make the created applications more interactive. So, for example, you could use the output from your code snippet in realtime as it is produced. I would like also to split a middle end off from the frontend. The middle end would provide communication with backends and offer both a REST API and websockets. Then the frontend would be responsible purely for user interaction and nothing else.

He would also like to see DPaste become more official, perhaps through making it a part of dlang.org. And for a point further down the road, Damian has an even grander plan.

I hope to make a full-blown online IDE for dlang.org with workspaces, compilers to choose from, and so on.

That would be cool to see!

Project Highlight: The New CTFE Engine

CTFE (Compile-Time Function Execution) is today a core feature of the D Programming Language. D creator Walter Bright first implemented it in DMD as an extension of the constant folding logic that was already there. Don Clugston (of FastDelegate fame) made a pass at improving it and, according to Walter, “took it much further”. Since that time, usage of CTFE has shown up in one D project after another, including in D’s standard library. For example, Dmitry Olshansky employed it in his overhaul of std.regex to great effect.

On the last day of DConf 2016, Stefan Koch gave a lightning talk on his thoughts about CTFE in D. At the end of the talk, in response to a question from Andrei Alexandrescu on how D’s implementation could be improved, he said the following:

CTFE is really a hack. You can see that it’s a hack. It’s implemented as a hack. It is the most useful hack that I’ve ever seen, and it is definitely a hacker’s tool to do stuff that are like magic. But to be fast, it would need to be heavily redesigned, reimplemented, possibly executed in multiple threads, because it is used for stuff that we could never have envisioned when it was invented.

Not long after that, Stefan opened a discussion on the forums and took up the torch to improve the CTFE engine. As to why he got started on this journey in the first place, Stefan says, “I started work on the CTFE engine because I said so at DConf.” But, of course, there’s more to it than that.

I have pretty heavy-weight CTFE needs (I worked on a compile-time trans-compiler). Also my CTFE SQLite reader is failing if you want to read a database bigger than 2MB at CTFE.

His investigations into the performance of the CTFE interpreter shed light on its problems.

The current interpreter directly interprets every AST node it sees. This leaves very little room to collect information about the code that is being interpreted. It doesn’t know when something will be used as a reference, so it needs to copy every variable on every mutation. It has to do a deep copy for this. That means it copies the whole chain of mutations every time.

To clarify, he offers the following example.

Imagine foreach(i; 0 .. 10) { a = i; }. On the first iteration we save a` = 0 and set a`` to 1. On the second iteration we save a``` = 1 and a```` = 0 and we set a````` to 2, then a`````` = 1 and a``````` = 0, and so on. As you can see, the memory requirements just shoot up. It’s basically a factorial function with a very small coefficient. That is why for very small workloads this extreme overhead is not noticeable.

That flaw looked unfixable. Indeed, the whole architecture in dinterpret.d is very convoluted and hard to understand. I did a few experiments on improving the memory management of the interpreter, but they proved fruitless.

Once he realized there was going to be no quick fix, Stefan sat down and drew up a plan to avoid digging himself into the same hole the current interpreter was in. The result of his planning led him down a road he hadn’t expected to travel.

Direct interpretation was out of the question since it would give the new engine too little time to analyze data flow and decide whether a copy was really needed or not. I had to implement an intermediate representation. It had to be portable to different evaluation back ends. I ended up with a solution, inspired by OpenGL, of defining my interface in the form of function calls an evaluation back end had to implement. That meant I would not be able to simply modify the current interpreter. This made the start very steep, but it is a decision I do not regret.

His implementation consists of a front end and a back end.

The front end walks the AST and issues calls to the back end, and the back end transforms those calls into actual bytecode. This bytecode is interpreted by the back end as soon as the front end requires it.
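
To picture that split, here is a heavily simplified sketch with hypothetical names (not the engine’s actual API) of the kind of interface the front end programs against; each back end is free to turn these calls into whatever bytecode representation suits it.

// Hypothetical, stripped-down back-end interface in the spirit of the
// OpenGL-style design described above.
interface CTFEBackend {
    int  genLocal();                           // reserve a slot for a value
    void genLoadImm(int target, long value);   // target = value
    void genAdd(int result, int lhs, int rhs); // result = lhs + rhs
    long run(int resultSlot);                  // interpret the recorded bytecode
}

// The front end walks the AST and only ever issues calls like these;
// it never needs to know how the back end records or executes them.
long evalSum(CTFEBackend gen, long a, long b) {
    immutable lhs = gen.genLocal();
    immutable rhs = gen.genLocal();
    immutable sum = gen.genLocal();
    gen.genLoadImm(lhs, a);
    gen.genLoadImm(rhs, b);
    gen.genAdd(sum, lhs, rhs);
    return gen.run(sum);
}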

In terms of functionality, he likens the current implementation to an immediate mode graphics API, and his revamp to retained mode. In this case, though, it’s the immediate mode that’s the memory hog.

You can read about his progress in the CTFE Status thread, where he has been posting frequent updates. His updates include problems he encounters, features he implements, and performance statistics. Eventually, every compiler that uses the DMD front end will benefit from his improvements.

GSoC Report: DStep

Wojciech Szęszoł is a Computer Science major at the University of Wrocław. As part of Google Summer of Code 2016, he chose to make improvements to Jacob Carlborg’s DStep, a tool to generate D bindings from C and Objective-C header files.


It was December of last year and I was writing an image processing project for a course at my university. I would normally use Python, but the project required some custom processing, so I wasn’t able to use numpy. And writing the inner loops of image processing algorithms in plain Python isn’t the best idea. So I decided to use D.

I’ve been conscious about the existence of the D language for as long as I can remember, but I’d never convinced myself before to try it out. The first thing I needed to do was to load an image. At the time, I didn’t know that there is a DUB repository containing bindings to image loading libraries, so I started writing bindings to libjpeg by myself. It didn’t end very well, so I thought there should be a tool that will do the job for me. That’s when I found DStep and htod.

Unluckily, the capabilities of DStep weren’t satisfying (mostly the lack of any kind of support for the preprocessor) and htod didn’t run on Linux. I ended up coding my project in C++, but as GSoC (Google Summer of Code) was lurking on the horizon, I decided that I should give it a try with DStep. I began by contacting Craig Dillabaugh (Ed. Note: Craig volunteers to organize GSoC projects using D) to learn if there was any need for developing such a project. It sparked some discussion on the forum, the idea was accepted, and, more importantly, Russel Winder agreed to be the mentor of the project. After some time I needed to prepare and submit an official proposal. There was an interview and fortunately I was accepted.

The first commit I made for DStep is dated February 1. It was a proof of concept that C preprocessor definitions can be translated to D using libclang. Then I improved the testing framework by replacing the old Cucumber-based tests with some written in D. I made a few more improvements before the actual GSoC coding period began.

During GSoC, I added support for translation of preprocessor macros (constants and functions). It required implementing a parser for a small part of the C language as the information from libclang was insufficient. I implemented translation of comments, improved formatting of the output code (e.g. made DStep keep the spacing from C sources), fixed most of the issues from the GitHub issue list and ported DStep to Windows. While I was coding I was getting support from Jacob Carlborg. He did a great job by reviewing all of the commits I made. When I didn’t know how to accomplish something with D, I could always count on help on forum.dlang.org.
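
To give a flavor of the macro work, here is the kind of transformation involved (the D output below is a plausible sketch, not necessarily DStep’s verbatim result): a C constant macro becomes a D enum, and a function-like macro becomes a small template.

// C input, shown as comments:
//   #define BUFFER_SIZE 128
//   #define SQUARE(x) ((x) * (x))

// One plausible D translation:
enum BUFFER_SIZE = 128;

auto SQUARE(T)(auto ref T x) {
    return x * x;
}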

DStep was the first project of such a size that I coded in D. I enjoyed its modern features, notably the module system, garbage collector, built-in arrays, and powerful templates. I used unittest blocks a lot. It would be nice to have named unit tests, so that they can be run selectively. From the perspective of a newcomer, the lack of consistency and symmetry in some features is troubling, at least before getting used to it. For example, there is a built-in hash map but no hash set, and some identifiers that should be keywords start with @ (Ed. Note: see the Attributes documentation), etc. I was very sad when I read that the in keyword is not yet fully implemented. Despite those little issues, the language served me very well overall. I suppose I will use D for my personal toy projects in the future. And for DStep of course. I have some unfinished business with it :).

I would like to encourage all students to take part in future editions of GSoC. And I must say the D Language Foundation is a very good place to do this. I learned a lot during this past summer and it was a very exciting experience.

Inside D Version Manager

In his day job, Jacob Carlborg is a Ruby backend developer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and the topic of this post, DVM. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.


D Version Manager (DVM) is a cross-platform tool that allows you to easily download, install, and manage multiple D compiler versions. With DVM, you can select a specific version of the compiler to use without having to manually modify the PATH environment variable. A selected compiler is unique to each shell session, and it’s possible to configure a default compiler.

The main advantage of DVM is the easy downloading and installation of different compiler versions. Specify the version of the compiler you would like to install, e.g. dvm install 2.071.1, and it will automatically download and install that version. Then you can tell DVM to use that version by executing dvm use 2.071.1. After that, you can invoke the compiler as usual with dmd. The selected compiler version will persist until the end of the shell session.

DVM makes it possible for the user to select a specific compiler version without having to modify any makefiles or build scripts. It’s enough for any build script to refer to the compiler by name, i.e. dmd, as long as the user selects the compiler version with DVM before invoking the script.

History

DVM was created in the beginning of 2011. That was a different time for D. No proper installers existed, D1 was still a viable option, and each new release of DMD brought with it a number of regressions. Because of all the regressions, it was basically impossible to always use the latest compiler, and often even older compilers, for all of your projects. Taking into consideration projects from other developers, some were written in D1 and some in D2, making it inconvenient to have only one compiler version installed.

It was for these reasons I created DVM. Being able to have different versions of the compiler active in different shell sessions makes it easy to work on different projects requiring different versions of the compiler. For example, it was possible to open one tab for a D1 compiler and another for a D2 compiler.

The concept of DVM comes directly from the Ruby tool RVM. Where DVM installs D compilers, RVM installs Ruby interpreters. RVM can do everything DVM can do and a lot more. One of the major things I did not want to copy from RVM is that it’s completely written in shell script (bash). I wanted DVM to be written in D. Because it’s written in shell script, RVM enables some really useful features that DVM does not support, but some of them are questionable (some might call them hacks). For example, when navigating to an RVM-enabled project, RVM will automatically select the correct Ruby interpreter. However, it accomplishes this by overriding the built-in cd command. When the command is invoked, RVM will look in the target directory for one of the files .rvmrc or .ruby-version. If either is present, it will read that file to determine which Ruby interpreter to select.

Implementation and Usage

One of the goals of DVM was that it should be implemented in D. In the end, it was mostly written in D with a few bits of shell script. Note that the following implementation details are specific to the platforms that fall under D’s Posix umbrella, i.e. version(Posix), but DVM is certainly available for Windows with the same functionality.

Structure of the DVM Installation

Before DVM can be used, it needs to install itself. This is accomplished with the command dvm install dvm. This will create the ~/.dvm directory, which contains the following subdirectories: archives, bin, compilers, env, and scripts.

archives contains a cache of downloaded zip archives of D compilers.

bin contains shell scripts that act as symbolic links to all installed D compilers. The name of each contains the version of the compiler, e.g. dmd-2.071.1, making it possible to invoke a specific compiler without first having to invoke the use command. This directory also contains one shell script, dvm-current-dc, pointing to the currently active D compiler. This allows the currently active D compiler to be invoked without knowing which version has been set. This can be useful for executing the compiler from within an editor or IDE, for example. A shell script for the default compiler exists as well. Finally, this directory also contains the dvm binary itself.

The compilers directory contains all installed compilers. All of the downloaded compilers are unpacked here. Due to the varying quality of the D compiler archives throughout the years, the install command will also make a few adjustments if necessary. In the old days, there was only one archive for all platforms; in that case, the command includes only the binaries and libraries for the current platform. Another adjustment is to make sure all executables have the executable permission set.

The env directory contains helper shell scripts for the use command. There’s one script for each installed compiler and one for the default selected compiler.

The scripts directory currently only contains one file, dvm. It’s a shell script which wraps the dvm binary in the bin directory. The purpose of this wrapper is to aid the use command.

The use Command

The most interesting part of the implementation is the use command, which selects a specific compiler, e.g. dvm use 2.071.1. The selection of a compiler will persist for the duration of the shell session (window, tab, script file).

The command works by prepending the path of the specified compiler to the PATH environment variable. This can be ~/.dvm/compilers/dmd-2.071.1/{platform}/bin for example, where {platform} is the currently running platform. By prepending the path to the environment variable, it guarantees the selected compiler takes precedence over any other possible compilers in the PATH. The reason the {platform} section of the path exists is related to the structure of the downloaded archive. Keeping this structure avoids having to modify the compiler’s configuration file, dmd.conf.

The interesting part here is that it’s not possible to modify the environment variables of the parent process, which in this case is the shell. The magic behind the use command is that the dvm command that you’re actually invoking is not the D binary; it’s the shell script in the ~/.dvm/scripts path. This shell script contains a function called dvm. This can be verified by invoking type dvm | head -n 1, which should print dvm is a function if everything is installed correctly.

The installation of DVM adds a line to the shell initialization file, .bashrc, .bash_profile or similar. This line will load/source the DVM shell script in the ~/.dvm/scripts path which will make the dvm command available. When the dvm function is invoked, it will forward the call to the dvm binary located in ~/.dvm/bin/dvm. The dvm binary contains all of the command logic. When the use command is invoked, the dvm binary will write a new file to ~/.dvm/tmp/result and exit. This file contains a command for loading/sourcing the environment file available in ~/.dvm/env that corresponds to the version that was specified when the use command was invoked. After the dvm binary has exited, the shell script function takes over again and loads/sources the result file if it exists. Since the shell script is loaded/sourced instead of executed, the code will be evaluated in the current shell instead of a sub-shell. This is what makes it possible to modify the PATH environment variable. After the result file is loaded/sourced, it’s removed.
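
A bare-bones sketch of the D side of that dance might look like the following (hypothetical code and file names, not DVM’s actual source). The key point is that the binary can only write out instructions; the wrapping shell function must source them to affect the current shell’s environment.

import std.file : mkdirRecurse, write;
import std.path : buildPath, expandTilde;

// Write the temporary result file that the dvm shell function sources and then
// removes. A child process cannot change its parent shell's PATH, so the actual
// environment modification has to happen in the shell itself.
void writeUseResult(string compilerVersion) {
    immutable dvmDir  = expandTilde("~/.dvm");
    immutable envFile = buildPath(dvmDir, "env", "dmd-" ~ compilerVersion);
    immutable tmpDir  = buildPath(dvmDir, "tmp");
    mkdirRecurse(tmpDir);
    write(buildPath(tmpDir, "result"), ". " ~ envFile ~ "\n");
}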

If you find yourself with the need to build your D project(s) with multiple compiler versions, such as the current release of DMD, one or more previous releases, and/or the latest beta, then DVM will allow you to do so in a hassle-free manner. Pull up a shell, execute use on the version you want, and away you go.

Core Team Update: Martin Nowak

In the early days of DMD, new releases were put out when Walter decided they were ready. There was no formal process, no commonly accepted set of criteria that defined the point at which a new compiler version was due or the steps involved in building the release packages. Early on, that was just fine. As time passed, not so much.

According to Martin Nowak:

The old release process was completely opaque, inconsistent, irregular, and also very time-consuming for Walter. So at some point it more or less failed and Walter could no longer manage to build the next release.

Martin was eager to do something about it, but he wasn’t the first person to take action on the issue. He decided to start with the work that had come before.

At this point, I took a fairly complex script from Nick Sabalausky, which tried to emulate Walter’s process, and started to improve upon it. It got trimmed down to a much smaller size over time.

But that was just the beginning.

Next, I prepared OS images for Vagrant for all supported platforms. That alone took more than a week. After that, I wired up the release script with Vagrant. From there on we had kind of reproducible builds.

That was a major step forward and eliminated some of the common mistakes that had crept in now and again with the previous, not-so-reproducible, process. With the pieces in place, Martin got some help from another community member.

Andrew Edwards took over the actual release building, but I was still doing the main work of keeping the scripts running and managing bugfixes. At some point, Andrew no longer had time. As the back and forth between us was almost as much work as the release build process itself, I completely took over. Nowadays, we have the script running fully automated to build nightlies.

Thanks to Martin, the process for building DMD release packages is described step-by-step in the D Wiki so that anyone can do it if necessary. Moreover, he coauthored and implemented a D Improvement Proposal to clearly define a release schedule. This process was adopted beginning with DMD 2.068 and continues today.

With the improved release schedule, DMD users can plan ahead to anticipate new releases, or take advantage of nightly builds and point releases to test out bug fixes or, whenever they come around, new features in the language or the standard library. However, the schedule is not etched in stone, as any particular release may be delayed for one reason or another. For example, the 2.071 release introduced major changes to D’s import and symbol lookup to fix some long-standing annoyances, with the subsequent 2.071.1 fixing some regressions they introduced. The release of 2.072 has been delayed until all of the known issues related to the changes have been fixed.

Those who were around before Martin stepped up to take charge of the release process surely notice the difference. He and the others who have contributed along the way (like Nick and Andrew) have done the D community a major service. As a result, Martin has become an essential member of the core D team.

Making Of: LDC 1.0

This is a guest post from Kai Nacke. A long-time contributor to the D community, Kai is the author of D Web Development and the maintainer of LDC, the LLVM D Compiler.


LDC has been under development for more than 10 years. From release to release, the software has gotten better and better, but the version number has always implied that LDC was still the new kid on the block. Who would use a version 0.xx compiler for production code?

These were my thoughts when I raised the question, “Version number: Are we ready for 1.0?” in the forum about a year ago. At that time, the current LDC compiler was 0.15.1. In several discussions, the idea was born that the first version of LDC based on the frontend written in D should be version 1.0, because this would really be a major milestone. Version 0.18.0 should become 1.0!

Was LDC really as mature as I thought? Looking back, this was an optimistic view. At DConf 2015, Liran Zvibel from Weka.IO mentioned in his talk about large scale primary storage systems that he couldn’t use LDC because of bugs! Additionally, the beta version of 0.15.2 had some serious issues and was finally abandoned in favor of 0.16.0. And did I mention that I was busy writing a book about vibe.d?

Fortunately, over the past two years, more and more people began contributing to LDC. The number of active committers grew. Suddenly, the progress of LDC was very impressive: Johan added DMD-style code coverage and worked on merging the new frontend. Dan worked on an iOS version and Joakim on an Android version. Together, they made ARM a first-class target of LDC. Martin and Rainer worked on the Windows version. David went ahead and fixed a lot of the errors which had occurred with the Weka code base. I spent some time working on the ports to PowerPC and AArch64. Countless small fixes from other contributors improved the overall quality.

Now it was obvious that a 1.x version was overdue. Shortly after DMD made the transition to the D-based frontend, LDC was able to use it. After the usual alpha and beta versions, I built the final release version on Sunday, June 5, and officially announced it the next day. Version 1.0 is shipping now!

Creating a release is always a major effort. I would like to say “Thank you!” to everybody who made this special release happen. And a big thanks to our users; your feedback is always a motivation to make the next LDC release even better.

Onward to 1.1!