Category Archives: Compilers & Tools

D News Roundup

Version 2.097.0 of DMD, the D programming language reference compiler, was released on June 5th in the middle of new GDC and LDC release announcements, while preparations for two major D community events were underway: the Symmetry Autumn of Code 2021 and DConf Online 2021. We’ll cover it all in this post, with a focus first on the events.

Symmetry Autumn of Code 2021

Symmetry Investments logo

As I write, Symmetry Investments employs in the neighborhood of 180 full-time workers and manages over US$8 billion of capital, and they’re always on the lookout for more employees, including programmers to work with D and other languages. They sponsored DConf 2019 in London and have sponsored the annual Symmetry Autumn of Code since 2018, in which a handful of programmers are paid to work for four months on projects of benefit to the D ecosystem.

This year marks the fourth annual SAoC, and we are now accepting applications. Participants will plan four milestones for projects that benefit the D ecosystem and will be expected to work at least 20 hours per week on each milestone. Each participant will be rewarded US$1000 for the successful completion of each of the first three milestones. At the end of the final milestone, the SAoC committee will review the overall progress of each of the remaining participants. One will be rewarded with a final $US1000 payment and a free pass to the next real-world DConf, with reimbursement for travel and lodging. In last year’s event, a second participant was also awarded a fourth US$1000 payment.

Participation in SAoC has led to jobs for some lucky coders and has generally been a valuable learning experience for those who have completed it. Students currently enrolled in graduate or postgraduate university programs will be given priority, but applications are open to all. The application deadline is August 18th. Project ideas can be found in the D community’s projects repository at GitHub. See the Symmetry Autumn of Code page here at the D Blog for all the details on how to apply as a participant or as a mentor.

DConf Online 2021

For the second consecutive year, we were unable to hold a real-world DConf. Last year we launched the first annual DConf Online. And when I say annual, I mean annual! We’re doing it again this year and will continue to do it going forward even after the real-world DConfs are back on.

DConf Online 2021 will take place November 20 and 21 on the D Language Foundation’s YouTube channel. Once again, we’re looking for pre-recorded talks, livestream panels, and livecoding sessions. If you’d like to propose something in one of those categories, the application deadline is September 5. Please visit the DConf Online 2021 homepage for all the details.

And if you haven’t seen them yet, the DConf Online 2020 and DConf Online 2020 Q & A playlists are available on the same channel. You can also find a full list of talks and all the links (talk videos, slides, and Q & A videos) on the DConf Online 2020 homepage.

New compiler releases

D 2.097.0 is live in the latest release of DMD and the beta release of LDC, the LLVM-based D compiler. The new version of GDC also came into the world as part of GCC 11.1 at the end of April.

DMD 2.097.0

Digital Mars D logo

This version of DMD comes with 29 major changes and 144(!) fixed Bugzilla issues courtesy of 54 contributors. Changes include a few deprecations and several improvements to the standard library. Two things stand out:

  • while(auto n = expression) has been on a few wishlists for a while. Now it’s a reality. The same syntax that was already possible with if statements is considered idiomatic in certain circumstances (such as when checking if an item exists in an associative array). Expect the while condition assignment to start popping up in open-source D projects soon.
  • std.sumtype is another wishlist item that is a wish no more. The new SumType is a replacement for std.variant.Algebraic. It’s a discriminated union that makes good use of Design by Introspection with a nice match syntax for those looking for that sort of thing. It’s been quite a while since the last time a new module was added to the D standard library. Many thanks to Paul Backus for putting in the effort to see it through, and a very big Congratulations!

LDC 1.27.0-beta1

LDC logo

On the same day the new DMD was released, the first beta of LDC 1.27.0, which also supports D 2.097.0, was announced in the D forums.

On top of 2.097.0 support, this version of LDC provides greatly improved DLL support on Windows. The prebuilt Windows packages ship with DRuntime and Phobos DLLs. This is big news for D developers on Windows. We’ve long had issues with D DLLs that have prevented heavy use outside of simple interfaces (with APIs exported as extern(C) being the most reliable).

There are some limitations to be aware of, such as the inability to directly access TLS variables across DLL boundaries (though it’s fine with accessor functions). Please see the release page for the details.

Thanks to Martin Kinkelin and all the LDC maintainers and contributors for their continued work on LDC. They aren’t getting paid for this. If you are a happy LDC user or just like the idea of the project, you can support their work by sponsoring Martin Kinkelin on GitHub.

GDC 11.1

In the GCC world, Iain Buclaw continues to make strides on the GDC compiler.

GDC 11.1 still uses the old C++ version of the D frontend, which feature-wise is mostly (see below) at D 2.076.1. There were significant issues in upstream DMD that prevented Iain from making the switch to the D version of the frontend in time to make the release window. He is currently aiming to make the switch in time for GDC 12. As a consolation, this release has support for three BSDs, Mac OS X, and MinGW!

Despite the older frontend, Iain has backported several fixes and optimizations, and even a few features, so it isn’t your grandfather’s D 2.076.1 that GDC supports. For example, the new bottom type that recently made its way through the D Improvement Proposal review process has found its way into this GDC release. See the forum announcement for details of all the new D goodness in GDC 11.1 and Please consider sponsoring his work on GitHub.

One-off donations

If you aren’t up for sponsoring Martin or Iain but would still like to support them financially, you can make one-time donations through the D Language Foundation. You can send money to the D General Fund, the D Open Collective, or to our PayPal account. Whichever method you choose, please be sure to leave a note that the donation is intended for LDC, GDC, or any D project you would like to support. We’ll make sure the appropriate person receives the money.

Other options for supporting the D programming language: visit the D Language Foundation donation page and donate to one of our funds, head to the DLang Swag Emporium and purchase any items that catch your eye (the D Rocket stuff rocks, and DConf Online 2021 swag will be available shortly), or consider using smile.amazon.com and selecting the D Language Foundation as your charity the next time you shop at Amazon.com (we are only available through the .com domain; browser extensions like SmartAmazonSmile for Firefox and AmazonSmileRedirect for Chrome make it easy to do).

Thanks to everyone who has, will, or continues to support the D programming language, either through donations of time or money. We’ve gotten where we are through community effort, and community effort will keep pushing us forward. D rocks!

A New Year, A New Release of D

Here in DLang Land we’re beginning the new year with a new release of the D reference compiler (DMD) and a beta release of the popular LLVM-based D compiler (LDC). D 2.095.0 is crammed full of 27 major changes and 78 fixes from 61 contributors. Following are some highlights that I expect some D programmers might find interesting, but please see the changelog for the full rundown. Those more interested in Bugzilla issue numbers can jump straight to the bugfix list

D 2.095.0

Digital Mars D logo

D’s support for other programming languages is important for interacting with existing codebases. C ABI compatibility has been strong from the beginning. Support for Objective-C and C++ came later. Though C++-compatibility is a bear to get right, it keeps improving with every compiler release. This release continues that trend and also enhances Objective-C support. We also see a number of QOL (quality-of-life) improvements throughout the compiler, libraries, and tools. DUB, the D build tool and package manager that ships with the compiler (and is also available separately), especially gets a good bit of love in this release.

C++ header generation

For a little while now, DMD has included experimental support for the generation of C++ header files from D source code, via the -CH command-line option, in order to facilitate calling D libraries from C++. For example, given the following D source file:

cpp-ex.d

extern(C++):
struct A {
    int x;
}

void printA(ref A a) {
    import std.stdio : writeln;
    writeln(a);
}

And the following command line:

dmd -HC cpp-ex.d

The compiler outputs the following to stdout (-HCf to specify a file name, and -HCd a directory):

// Automatically generated by Digital Mars D Compiler

#pragma once

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <math.h>

#ifdef CUSTOM_D_ARRAY_TYPE
#define _d_dynamicArray CUSTOM_D_ARRAY_TYPE
#else
/// Represents a D [] array
template<typename T>
struct _d_dynamicArray
{
    size_t length;
    T *ptr;

    _d_dynamicArray() : length(0), ptr(NULL) { }

    _d_dynamicArray(size_t length_in, T *ptr_in)
        : length(length_in), ptr(ptr_in) { }

    T& operator[](const size_t idx) {
        assert(idx < length);
        return ptr[idx];
    }

    const T& operator[](const size_t idx) const {
        assert(idx < length);
        return ptr[idx];
    }
};
#endif

struct A;

struct A
{
    int32_t x;
    A() :
        x()
    {
    }
};

extern void printA(A& a);

This release brings a number of fixes and improvements to this feature, as can be seen in the changelog. Note that generation of C headers is also supported via -H, -Hf, and -Hd.

Default C++ standard change

Prior to this release, extern(C++) code was guaranteed to link with C++98 binaries out of the box. This is no longer true, and you will need to pass -extern-std=c++98 on the command line to maintain that behavior. The C++11 standard is now the default.

Additionally, the compiler will now accept -extern-std=c++20. In practice, the only effect this has at the moment is to change the compile-time value, __traits(getTargetInfo, "cppStd"), but new types may be added in the future.

Improved Objective-C support

Objective-C compatibility is enhanced in this release with support for Objective-C protocols. This is achieved by repurposing interface in an extern(Objective-C) context. Additionally, the attributes @optional and @selector help get the job done. Read the details and see an example in the changelog.

Improved compile-time feedback

Here’s a QOL issue that really became an annoyance after a deprecation in Phobos, the standard library: when instantiating templates, deprecation messages reported the source location deep inside the library where the deprecated feature was used (e.g., template constraints) and not the user-code instantiation that triggered it. No longer. You’ll now get a template instantiation trace just as you do on errors.

Another QOL feedback issue involved the absence of errors. The compiler would silently allow multiple definitions of identical functions in the same module. The compiler will now raise an error when it encounters this situation. However, multiple declarations are allowed as long as there is at most one definition. For mangling schemes where overloading is not supported (extern(C), extern(Windows), and extern(System)), the compiler will emit a deprecation message.

The mainSourceFile in DUB recipes

The mainSourceFile entry in DUB package recipes was a way to specify a source file containing a main function that should be excluded from unit tests when invoking dub test. However, when setting up other configurations where the file should also not be compiled, or where a different main source file was required, it was necessary to add the file to an excludedSourceFiles entry. This is no longer the case. If a mainSourceFile is specified in any configuration, it will automatically be excluded from other configurations.

Propagating compiler flags to dependencies

Not every existing compiler flag has a corresponding build setting for DUB recipes. The dflags entry allows for such flags to be configured for any project. For example, -fPic, or -preview=in. The catch is, it does not propagate to dependencies. Now, you can explicitly specify compiler flags for dependencies by adding a dflags parameter to any dependency entry in a dub.json recipe. For example:

{
    "name": "example",
    "dependencies": {
        "vibe-d": { "version" : "~>0.9.2", "dflags" : ["-preview=in"] }
    }
}

Unfortunately, it appears the implementation does not work for recipes in SDLang format (dub.sdl), so those of us who prefer that format over JSON will have to wait a bit.

LDC 1.025-beta1

LDC logo

This release of LDC brings the compiler up to date with the D 2.095.0 frontend, with the prebuilt packages based on LLVM v11.0.1. The biggest news in this release looks to be the new -linkonce-templates flag. This experimental feature causes the compiler to emit template symbols into each compilation unit that references them, “with optimizer-discardable linkonce-odr linkage”. The implementation has big wins both in terms of compile times when compiling with optimizations turned on and in cutting down on a class of template-related bugs. See the beta1 release notes for the details.

Happy New Year

On behalf of the D Language Foundation, I wish you all the very best for 2021. As a community, we weren’t affected much by the global pandemic. Sure, we were forced to cancel DConf 2020, but the silver lining is that it also motivated us to finally launch DConf Online in November. We fully intend to make this an annual event alongside of, not in place of, the real-world conference (when physically possible). Other than that, it was business as usual in D Land.

At a personal level, the lives of some in our community were disrupted last year in ways large and small. Please remember that, though the primary object that brings us together is our enthusiasm for the D programming language, we are all still human beings behind our keyboards. The majority of work that gets done in our community is carried out on a volunteer basis. All of us, as the beneficiaries, must never forget that the health and well-being of everyone in our community take top priority over any work we may want or expect to see completed. We encourage everyone to keep an ear open for those who may need to borrow it, and never be afraid to communicate that need when it feels necessary. Sometimes, an open ear can make a very big difference.

Thanks to all of you for your participation in the D community, whether as a user, a contributor, or both. Stay safe, and have a very happy 2021.

DustMite: The General-Purpose Data Reduction Tool

If you’ve been around for a while, or are a particularly adventurous developer who enjoys mixing language features in interesting ways, you may have run into one compiler bug or two:

Implementation bugs are inevitably a part of using cutting-edge programming languages. Should you run into one, the steps to proceed are generally as follows:

  1. Reduce the failing program to a minimal, self-contained example.
  2. Add a description of what happens and what you expect to happen.
  3. Post it on the bug tracker.

Nine years ago, an observation was made that when filing and fixing compiler bugs, a disproportionate amount of time was spent on the first step. When your program stops compiling “out of the blue”, or when the bug stops reproducing after the code is taken out of its context, manually paring down a large codebase by repeatedly cutting out code and checking if the bug still occurs becomes a tedious and repetitive task.

Fortunately, tedious and repetitive tasks are what computers are good for; they just have to be tricked into doing them, usually by writing a program. Enter DustMite.


The first version.

The basic operation is simple. The tool takes as inputs:

  • a data set to reduce (such as, a directory containing D source code which exhibits a particular compiler bug)
  • an oracle (or, more mundanely, a test script), which itself:
  • takes as input a variation of the data set, and
  • produces a yes-or-no answer on whether the input still satisfies the sought property (such as reproducing the particular compiler bug).

DustMite’s output is some local minimum variation of the data set, which it reaches by consecutively trying to remove parts of the data set and saving the results which the oracle approves. In the compiler bug example, this means removing bits of code which are not required to reproduce the problem at hand.

DustMite wouldn’t be very efficient if it attempted to remove things line-by-line or character-by-character. In order to maximize the chance of finding good reductions, the input is parsed into a tree according to the syntax of the input files.

Each tree node consists of a “head” (string), children (list of node pointers), and “tail” (string). Technically, it is redundant to have both “head” and “tail”, but they make representing some constructs and performing reductions much simpler, such as paren/bracket pairs.


Nodes are arranged into a binary tree as an optimization.

Additionally, nodes may have a list of dependencies. The dependency relationship essentially means “if this node is removed, these nodes should be removed too”. These constraints are not representable using just the tree structure described above, and are used to allow reducing things such as lists where trailing item delimiters are not allowed, or removing a function parameter and corresponding arguments from the entire code base at once.

In the case of D source code, declarations, statements, and subexpressions get their own tree nodes, so that they can be removed in one go if unneeded. The parser DustMite uses for D source code is intentionally very simple because it needs to handle potentially invalid D code, and you don’t want your bug reduction tool to also crash on top of the compiler.


How DustMite sees a simple D program.

An algorithm decides the order in which nodes are queued for potential deletion; DustMite implements several (it calls them “strategies”). Fundamentally, a strategy’s interface is (statei, resulti) ⇒ (statei+1, reductioni+1), i.e., which reduction is chosen next depends on the previous reduction and its result. The default “inbreadth” strategy visits nodes in ascending depth order (row by row) and starts over from the top as long as it finds new reductions.

DustMite today supports quite a few more options:


The current version.

Probably, the most interesting of these is the -j switch—one reason being that DustMite’s task is inherently not parallelizable. Which reduction is chosen next, and the tree version to which that reduction is applied, depends on the previous reduction’s result.

DustMite works around this by putting unused CPU cores to work on lookahead: using a highly sophisticated predictor, it guesses what the result of the current reduction will be, and based on that assumption, calculates the next reduction. If the guess was right, great! We get to use that result. Otherwise, the work is wasted. Implementing this meant that strategies now needed to have copyable state, and so had to be refactored from an imperative style to a state machine.

Unfortunately, although the highly expected feature was implemented four years ago, the initial implementation was rather underwhelming. DustMite still did too much work in the main thread and wasted too much CPU time on rescanning the data set tree on every reduction. The problem was so bad that, at high core counts, lookahead mode was even slower than single-threaded mode.

I have recently set out to resolve these inadequacies. The following obstacles stood in the way:

Problem 1: Hashing was too slow. Because the oracle’s services (i.e., running the test script) are usually expensive, DustMite keeps track of a cache of previously attempted reductions and their outcome. This helps because not all tree transformations result in a change of output, and some strategies will retry reductions in successive iterations. A hash of the tree is used as the cache key; however, calculating it requires walking the entire tree every time, which is slow for large inputs.

Would it be possible to make the hash calculation incremental? One approach would be Merkle trees (each node’s hash is the hash of its children’s hashes), however that is suboptimal in the case of e.g., empty leaf nodes. CS erudite Ivan Kazmenko blesses us with an answer: polynomial hashes! By representing strings as polynomials, it is possible to use modulo arithmetic to calculate an incremental fixed-size hash and cache subtree hashes per node.



Each node holds its cumulative hash and length.

The number theory staggered me at first, so I recruited the assistance of feep from #d. After we went through a few draft implementations, I could begin working on the final version. The first improvement was replacing the naive exponentiation algorithm with exponentiation by squaring (D CTFE allowed precomputing a table at compile-time and a faster calculation than the classical method). Next, there was the matter of the modulo.

Initially, we used integer overflow for modulo arithmetic (i.e. q=264), however Ivan cautioned against using powers of two as the modulo, as this makes the algorithm susceptible to Thue-Morse strings. Not long ago I was experimenting with using long multiplication/division CPU instructions (where multiplying one machine word by another yields the result in two machine words with a high and low part, and vice-versa for division). D allows generating assembler code specific to the types that the function template is instantiated with, though in DustMite we only use the unsigned 64-bit variant (on x86 we fall back to using integer overflow).

With the hashing algorithm implemented, all that remained was to mark dirty nodes (they or their children had their content edited) and incrementally recalculate their hashes as needed. Dependencies posed a small obstacle: at the time, they were implemented as simply an array of pointers to the dependency node within the tree. As such, we didn’t know how to get to their parents (to mark them dirty as well), however this was easily overcome by adding a “parent” pointer to each node.

Well, or so I thought, until I got to work on the next problem.

Problem 2: Copying the tree. At the time, the current version of the tree representing the data set was part of the global state. Because of this, applying a reduction was implemented twice:

This was clumsy, but faster and less complicated than making a copy of the entire tree just to change one part of it to test a reduction. However, doing so was a requirement for proper lookahead, otherwise we would be unable to test reductions based on results where past tests predicted a positive outcome, or do nearly anything in a separate thread.

One issue was the tree “internal pointers”—making a copy would require updating all pointers within the tree to point to the new copies in the new tree. This was easy for children/parent pointers (since we can reliably visit every such pointer exactly once), but not quite for dependencies: because they were also implemented as simple pointers to nodes, we would have to keep track of a map of which node was copied where in order to update the dependency pointers.

One way to solve this would be to change the representation of node references from pointers to indices into a node array; this way, copying the tree would be as simple as a .dup. However, large inputs meant many nodes, and I wanted to see if it was possible to avoid iterating over every node in the tree (i.e. O(n)) for every reduction.

Was it possible? It would mean that we would copy only the modified nodes and their parents, leaving the rest of the tree in-place, and only reusing it as the copies’ children. This goal conflicted with the existence of “parent” pointers, because a parent would have to point towards either the old or new root, so to resolve this ambiguity every node would have to be copied. As a result, the way we handled dependencies needed to be rethought.


Editing trees with “copy on write” involves copying just the edited nodes (🔴), and their parents.

With internal pointers out, the next best thing to array indices for referencing a node was a series of instructions for how to reach the node from the tree root: an address. The representation of these addresses that I chose was a bit string represented as a linked list, where each list node holds the child index at that depth, starting from the deep end. Such a representation can be arranged in a tree where the root-side ends are shared, mimicking the structure of the tree containing the nodes for the reduced data, and thus allowing us to reuse memory and minimize allocations.


Nodes cannot hold their own address (as that would make them unmovable),
which is why they need to be stored outside of the main tree.

For addresses to work, the object they point at needs to remain the same, which means that we can no longer simply remove children from tree nodes—an address going through the second child would become invalid if the first child was removed. Rewriting all affected addresses for every tree edit is, of course, impractical, which leads us to the introduction of tombstones—dead nodes that only serve to preserve the index of the children that follow it. Because one of the possible reduction types involves moving subtrees around the tree, we now also have “redirects” (which are just tombstones with a “see here” address attached).

With the above changes in place, we can finally move forward with fixing and optimizing lookahead, as well as implementing incremental rehashing in a way that’s compatible with the above! The mutable global “current” tree variable is gone, save now simply takes a tree root as an argument, and applyReduction is now:

/// Apply a reduction to this tree, and return the resulting tree.
/// The original tree remains unchanged.
/// Copies only modified parts of the tree, and whatever references them.
Entity applyReduction(Entity origRoot, ref Reduction r)

With the biggest hurdle behind us, and a few more rounds of applying Walter Bright’s secret weapon, the performance metrics started to look more like what they should:


Going deeper would likely involve using OS-specific I/O APIs or rewriting D’s GC.

A mere 3.5x speed-up from a 32-fold increase in computational power may seem underwhelming. Here are some reasons for this:

  • With a 50/50 predictor, predictions form a complete binary tree, so doubling the number of parallel jobs gives you +1x more speed. That’s roughly log₂(jobs)-1, or 4 for 32 jobs – not far off!

  • The results will depend on the reduction being performed, so YMMV. For a certain artificial test case, one optimization (not pictured above) yielded a 500x speed-up!

  • DustMite does not try to keep all CPU cores busy all the time. If a prediction turns out false, all lookahead jobs based on it become wasted work, so DustMite only starts new lookahead tasks when a reduction’s outcome is resolved. Perhaps ideally DustMite would keep starting new jobs but kill them as soon as it discovers they’re based on a misprediction. As there is no cross-platform process group management in Phobos, the D standard library, this is something I left for future versions.

  • Some work is still done in the main thread, because moving it to a worker thread actually makes things slower due to the global GC lock.

There still remains one last place where DustMite iterates over every tree node per reduction: saving the tree to disk (so that it could be read by the test script). This seems unavoidable at first, but could actually be avoided by caching each node’s full text contents within the node itself.

I opted to leave this one out. With the other related improvements, such as using lockingBinaryWriter and aggregating writes of contiguous strings as one I/O operation, the increase in memory usage was much more dramatic than the decrease in execution time, even when optimized to just one allocation per reduction (polynomial hashing gives us every node’s total length for free). But, for a brief instant, DustMite processed reductions in sub-O(n) time.

One more addition is worth mentioning: Andrej Mitrovic suggested a switch which would replace removed text with whitespace, which would allow searching for exact line numbers in the test script. At the time, its addition posed significant challenges, as there needed to be some way to keep removed nodes in the tree but exclude them from future removal attempts. With the new tree representation, this became much easier, and also allowed creating the following animation:

In conclusion, I’d like to bring up that DustMite is good at more than just reducing compiler test cases. The wiki lists some ideas:

  • Finding the source of ambiguous or misleading compiler error messages (e.g., errors with the file/line information pointing only inside the standard library).

  • Alternative (much slower, but also much more thorough) method of verifying unit test code coverage. Just because a line of code is executed, that doesn’t mean it’s necessary; DustMite can be made to remove all code that does not affect the execution of your unit tests.

  • Similarly, if you have complete test coverage, it can be used for reducing the source tree to a minimal tree which includes support for only enabled unittests. This can be used to create a version of a program or library with a test-defined subset of features.

  • The --obfuscate mode can obfuscate your code’s identifiers. It can be used for preparing submission of proprietary code to bug trackers.

  • The --fuzz mode (a new feature) can help find bugs in compilers and tools by creating random programs (using fragments of other programs as input).

But DustMite is not limited to D programs (or any kind of programs) as input. With the --split option, we can tell DustMite how to parse and reduce other kinds of files. DustMite successfully handled the following scenarios:

  • reducing C++ programs (the D parser supports some C++-only syntax too);

  • reducing Python programs (using the indent split mode);

  • reducing a large commit to a minimal diff (using the diff split mode);

  • reducing a commit list, when git bisect is insufficient due to the problem being introduced across more than any single commit;

  • reducing a large data set to a minimal one, resulting in the same code coverage, with the purpose of creating a test suite;

  • and many more which I do not remember.

Today, some version of DustMite is readily available in major distributions (usually as part of some D-related package), so I’m happy having a favorite tool one apt-get / pacman -S away when I’m not at my PC.

Discovering a problem which can be elegantly reduced away by DustMite is always exciting for me, and I’m hoping you will find it useful too.

D 2.091.0 Released

Digital Mars D logoThe latest release of DMD, the D reference compiler, ships with 18 major changes and 66 bugfixes from 55 contributors. This release contains, among other goodies, improvements to the Windows experience and enhancements to C and C++ interoperability. As fate would have it, the initial release announcement came in the aftermath of some unfortunate news regarding DConf 2020.

DMD on Windows

Over the years, some D users have remarked that the development of D is Linux-centric, that Windows is the black sheep or red-headed stepchild of D platforms. For anyone familiar with D’s early history, that seems an odd thing to say, given that DMD started out as a Windows-only compiler that could only output 32-bit objects in the OMF format. But it’s also understandable, as anyone not familiar with that history could only see that DMD on Windows lagged behind the Linux releases.

64-bit

One place where the official DMD releases on Windows have continued to differ from the releases on other platforms is the lack of 64-bit binaries in the release packages. Again, there’s a historical reason for this. The default output of the compiler is determined by how it is compiled, e.g., 32-bit versions output 32-bit binaries by default. When Walter first added support to DMD for 64-bit output on Windows, it required giving the back end the ability to generate object files in Microsoft’s version of the COFF format and also requiring users to install the Microsoft Build Tools and Platform SDK for access to the MS linker and system link libraries. This is quite a different experience from other platforms, where you can generally expect a common set of build tools to have been installed via the system package manager on any system set up for C and C++ development.

For a Windows developer who chooses GCC for their C and C++ development (or who does no C or C++ development at all), it’s a big ask to require them to download and install several GBs they might not already have installed and probably will never use for anything else. So D releases on Windows continued to ship with 32-bit binaries and the OPTLINK linker in order to provide a minimum out-of-the-box experience. That was a perfectly fine solution, unless you happened to be someone who really wanted 64-bit output (posts from disgruntled Windows users who didn’t want to install the MS tools can be found sprinkled throughout the forum archives).

Eventually, the LLVM linker (LLD) was added to the DMD Windows release packages, along with system link libraries generated from the MinGW definitions. This allowed users to compile 64-bit output out of the box and, once the kinks were worked out, eliminated the dependency on the MS linker. Yet, the official release packages still did not include a 64-bit version of DMD and still did not support 64-bit output by default.

With DMD 2.091.0, the black sheep has come back into the fold. The official DMD releases on Windows now ship with 64-bit binaries, so those of you masochists out there who cling to Makefiles and custom build scripts can expect the default output be what you expect it to be (for the record, DUB, the build tool and package manager that ships with DMD, has been instructing the compiler to compile 64-bit output by default on 64-bit systems for the past few releases).

Windows gets even more love

There are lots of goodies for Windows in this release. Another biggie is that DMD is now 30-40% faster on Windows. It’s no secret that LDC, the LLVM-based D compiler, generates faster binaries than DMD (for some D users, the general rule of thumb is to develop with DMD for its fast compile times and release with LDC for its faster binaries, though others argue that LDC is plenty fast for development and DMD is fine for production). There have been requests for some time to stop compiling DMD with DMD and start doing it with LDC instead. This release is the first to put that into practice.

There are a number of smaller enhancements to the Windows experience: the install.sh script available on the DMD downloads page that some people prefer now supports POSIX environments on Windows; the system link libraries that ship with the compiler have been upgraded from MinGW  5.0.2 to 7.0.0; LLD has been upgraded to 9.0.0; and there’s plenty more in the changelog.

C++ Header Generation

With just about every major release of DMD, D’s interoperability with C and C++ sees some kind of improvement. This release brings a huge one.

Over the years, some have speculated that it would be excellent if the D compiler could generate headers for C and C++ for D libraries intended to be usable in C or C++ programs. Now that wishful thinking has become a(n experimental) reality. Given a set of extern(C) or extern(C++) functions, DMD can generate header files that contain the appropriate C or C++ declarations. Three compiler switches get the job done:

  • -HC will cause the header to be generated and printed to standard output
  • -HCf=fileName will cause the header to be generated and printed to the specified file
  • -HCd=directoryname will (once it’s implemented) cause the header to be printed to a file in the specified directory

See the changelog for example output.

Other News

While the Corona virus was initially ramping up out of sight from most of the world, plans for DConf 2020 were ramping up online from different locations around the world. Planning began in November, the venue was secured in late December, and the website launched with the announcement in early January.

As news of the virus outbreak spread, the conference organizers grew concerned. Would we be okay in June? In late February, that concern manifested as a discussion of possible contingency plans. Two weeks later, it resulted in the decision to cancel DConf 2020. Thankfully, the D community has been supportive of the decision.

As part of the discussion of contingency plans, the possibility was raised of hosting an online conference. The idea of course came up in the discussion of the cancellation in the forums, and a few people reached out shortly after the initial announcement offering to provide help in setting something up. Walter created a forum thread to discuss the topic for anyone interested.

No one involved with organizing DConf has any experience with hosting an online conference. We’re currently exploring options and looking at what the organizers of other Conferences in the Time of COVID-19 are doing. We want to do it, and we want to do it well. Experience with organizing DConf in the real world has taught us not to jump on any old technology without first having a fallback (ahem, DConf 2018 livestream) and making sure the tech does what we expect it to (ahem, DConf 2019 livestream). So don’t expect a quick announcement. We want to find the right tech that fits our requirements and explore how it works before we move forward with setting dates. But do expect that DConf 2020 Online is looking more and more likely to become a thing.

DMD 2.089.0 Released

Digital Mars logoThe latest release of DMD, the D reference compiler, is ready for download. It’s a relatively light release in terms of changes and features, with 11 major changes and 66 closed Bugzilla issues. Most of the changes cover narrow use cases. To highlight a few: proper non-D mangling in template mixins, a renamed default linker, and expanded support in DUB for LDC.

Proper non-D mangling in template mixins

Maintainers of C bindings, or anyone writing D programs that need to interface with a custom C codebase, may find this change particularly useful. Previously, symbols declared as extern(C), extern(Windows)(which is stdcall), and extern(Objective-C) in template mixins were improperly mangled as D symbols when the templates were mixed in at global scope. It was easy to work around if you didn’t need parameterization (just use string mixins), but if you did need it, then you were either stuck or had to jump through hoops.

The root of the issue was that a template mixin introduces a new scope at the point it’s mixed in and creates aliases to the declarations inside that scope so that they may be accessed externally without the need for dot notation. This is a major convenience when only one instance of the template is mixed into a scope (as is the usual case) and allows for code to be written as if the template mixin doesn’t exist. In other words:

mixin template Foo(T)
{
    T x;
}

mixin Foo;

// Not required
// mixin Foo f;

// Allows this:
int main()
{
    x = 10;

    // Not required
    // f.x = 10;
}

The additional scope meant that, e.g., an extern(C) symbol in a template mixin was never in global scope, so it was mangled as a D symbol rather than a C symbol. With the change, such symbols are promoted from the mixin scope to the global scope and are properly mangled. Now you can choose either string mixins or template mixins for your C/stdcall /Objective-C boilerplate.

No more link.exe conflict

From the first alpha release of DMD, OPTLINK has been the default linker that ships with the compiler on Windows. When Walter Bright first started working on DMD, he used the existing C compiler backend he had been maintaining for 20 years. Since the backend already output object files in the Intel OMF format, he also decided to make use of OPTLINK; it had become a part of the C and C++ compiler package while he was at Borland. It was several years later that he added support Microsoft’s MSCOFF (PE) and the Microsoft linker, cl, to DMD. The fact that both the OPTLINK and cl executables were named link.exe wasn’t an issue early on, but over time it has begun to pop up more often, particularly when performing the compile and link steps separately.

With DMD 2.089, it will never be an issue again. The version of OPTLINK that ships with DMD has been renamed to optlink.exe.

DUB and LDC

DUB, the D build tool and package manager, has shipped with DMD for several releases. When executing dub on a properly configured source package, DMD is the default compiler. It also supports LDC (the LLVM-based D compiler) and GDC (the GCC-based D compiler, which is part of GCC 9) via command-line options, but the full range of DUB features haven’t been available for those compilers.

The latest version improves support for LDC. DUB command-line options to enable code coverage, profiling, keeping the stack frame, and separate linking now work with LDC. Target triples can be given to the --architecture (-a) switch to enable cross-compilation with LDC.

DMD 2.088.0 Released

Digital Mars logoThe newest DMD has rolled off the assembly line and is ready for download. A total of 58 contributors fixed 58 bugs and introduced 27 major changes to version 2.088.0 of the compiler.

I’m always looking for the big ticket items in a new DMD release to highlight on the blog, but this is a workaday release that isn’t showing off anything too shiny in the changleog. Much of it is run-of-the mill maintenance: deprecations, removals, and behavior adjustments. All of that is important, and we all welcome it, but it doesn’t make for great reading on the blog. That said, there are a handful of useful additions that I can point to, one of which actually is a big deal when it comes to C++ interop.

std::string and std::vector

Thanks to the work Manu Evans has been performing and advocating, C++ interoperability gets a big boost in this release with bindings to std::string and std::vector in the DRuntime modules core.stdcpp.string and core.stdcpp.vector, respectively.  There’s one caveat with the std::string binding that anyone intending to use it must be aware of.

When compiling on Linux, where DMD makes use of the GCC libraries and linker, there’s a compatibility issue when using the modern version of std::string which is compliant with C++11. It contains an interior pointer, which in D is both illegal and incompatible with move semantics. The work around is to pass -D_GLIBCXX_USE_CXX11_ABI=0 to g++ and compile your D application with -version=_GLIBCXX_USE_CXX98_ABI. This will be resolved in the future when work on move constructors in D is complete.

New Utilities

The language gets an interesting new compile-time trait in the form of getLocation. Given a symbol, this trait will return a tuple containing the file name, line number, and column number at which the symbol appears in the source code. This opens the door to more informative debug logging and error reporting beyond the functionality already available via __FILE__ and __LINE__. And I’m sure folks will find other uses for it.

The standard library utility module std.file, which provides a lot of convenience functions for working with files as a unit, now has the new function getAvailableDiskSpace. Give it a directory path on Windows, or the path to a directory or file on Posix, and it will give you the number of bytes available on that path.

Other News

The Symmetry Autumn of Code 2019 participants all have mentors now and they are hard at work laying out their milestones. Milestone 1 officially kicks off on September 15, after which we can expect to see weekly updates from the participants in the General forum.

Google Summer of Code 2019 has come to an end. Five of our students submitted their work at the end of August. You can find information about their projects and view their code submissions from our GSOC projects page. Congratulations to all who participated!

The D Language Foundation is currently in discussions to put some of the Human Resource Fund to use in finalizing LDC support for iOS and Android. Hopefully, I’ll have details to report on that front in the very near future. In the meantime, please help us raise the HR Fund even higher than it is now. There’s some important work waiting to be done that will require as much money as we can throw at it. You can donate any amount directly to the HR Fund Campaign or use the special campaign we set up to send $60 to the HR Fund and get a DConf 2019 t-shirt in return.

Speaking of t-shirts, thanks to everyone who has made a purchase in our DLang Swag Emporium. You’ve helped us raise over $77 so far, all of which will go to the General Fund. If you haven’t yet dropped in, what are you waiting for? We’ve got t-shirts, stickers, and coffee mugs, with updates coming soon. It’s an easy way to support our favorite programming language!

 

DMD 2.087.0 Released

Digital Mars logoThe latest release of the Digital Mars D compiler (DMD) is now available. Version 2.087.0 marks 44 closed Bugzilla issues and 22 major changes courtesy of 63 contributors. See the changelog for the details and related links. Visit the Digital Mars Downloads page to get the release package for your platform(s).

One of the changes in this release is the end of a transitional period regarding imports, another involves a certain compiler switch and the compilation of Phobos. There’s also something developers on Windows will find useful, and more options for documenting code with Ddoc.

Endings and beginnings

Once upon a time, two related compiler issues were reported in the D bug tracker, where they remained for years beyond measure (it was actually just shy of a decade). These bugs allowed symbols to sometimes be accessed inside a scope in which they weren’t supposed to be visible. Eventually, once the bugs were fixed, two switches were introduced to help users maintain their existing code: -transtion=import caused the code to compile under the old, incorrect behavior, and -transition=checkimport would report on all occurences of the erroneous behavior in a code base. Steven Schveighoffer did a write up about it all on his blog at the time, which is a good read for anyone interested in the details.

In DMD 2.087.0, the transitional period is over. The -transition=import and -transition=checkimports switches no longer have any effect. Henceforth, if you have any existing code you’ve been compiling with -transition=import, your code will break with the new release if you are still relying on the old buggy behavior.

As one period ends, another begins. A new deprecation will give you warnings if you are initializing immutable global data via a static constructor like so:

immutable int bar;
static this()
{
    bar = 42;
}

This behavior is deprecated. Static constructors and destructors are called once per thread. Given that immutable global data is implicitly shared across threads rather than being thread-local like normal D variables, data like bar would be overwritten every time a new thread is spawned. The fix is to ensure that the static constructor is also shared across threads:

immutable int bar;
shared static this()
{
    bar = 42;
}

Static constructors and destructors marked shared are invoked once per process rather than once per thread.

DIP 1000 and Phobos

DIP 1000 was the first D Improvement Proposal submitted after the DIP process was transformed from an informal, wiki-based approach, to a formal, managed approach with structured review periods. It proposed a feature called “Scoped Pointers” intended to “provide a mechanism to guarantee that a reference cannot escape lexical scope”. Unfortunately, the document itself remained in a sort of limbo as the proposed feature was implemented and evolved. Eventually, the implementation diverged from the proposal to such a degree that the DIP was marked as Superseded and retired. But not the feature!

The -preview=dip1000 flag has been available for some time now, but it has been a bit tricky to use given that the standard library could not be compiled with it. With DMD 2.087.0, that is true no more. Phobos now compiles with -preview=dip1000 and D programmers can now more easily make use of the feature.

Given that this is still in preview mode and hasn’t yet seen enough wide-spread use, please don’t be surprised if you uncover any bugs. Please do report any bugs you find to the issue tracker so that they can be squashed as soon as possible.

Explicitly choose the LLD linker on Windows

From the beginning, DMD shipped with self-contained support for 32-bit development on Windows. This was possible because Walter Bright made use of the existing platform libraries and linker (OPTLINK) that he was already shipping with his Digital Mars C and C++ compiler. Unfortunately, OPTLINK only supports the OMF object format, so the output of DMD on Windows was incompatible with the greater Windows ecosystem which is primarily built around the PE COFF output (see this PDF of the specification) of the Microsoft build tools. PE COFF, known as MSCOFF in the D universe, is a Microsoft-specific version of the COFF format.

Eventually, Walter added MSCOFF support to DMD for 64-bit (and later, 32-bit) development, but that required that developers have the Microsoft linker and platform libraries installed. In the past, that meant installing either Visual Studio or the Microsoft Build Tools package along with the Windows SDK. In recent years, installing Visual Studio Community edition would provide everything necessary. When compiling with -m64 or -m32mscoff, OPTLINK would be ignored in favor of the Microsoft linker.

With the release of DMD 2.079.0, the compiler began shipping with the LLVM linker (LLD), a set of platform libraries derived from those that ship with the MinGW compiler, and a wrapper library for the Visual C++ 2010 runtime. From that point forward, when given -m64 or -m32mscoff on Windows, the compiler would search for the Microsoft installation and, if not found, fallback on the bundled linker and libraries if they were installed. For the first time, DMD had self-contained 64-bit output on Windows. (Interestingly, it’s never been self-contained on other platforms, where DMD relies on the system linker and libraries, the presence of which is a given and not a source of complaint as the dependence on the MS linker has been.)

Now that the new set up has been put through its paces for a while and some of the kinks have been ironed out, DMD 2.087.0 makes it possible to explicitly select the bundled MSCOFF import libraries and LLD linker via the command line switch -mscrtlib=msvcrt100.

Markdown support in Ddoc

Ddoc was originally designed as a macro-based system for documenting source code, but its use has expanded beyond that scenario in the years since. Most sections of the dlang.org website are created with Ddoc, as is the dconf.org site. Ali Çehreli used it to write his book, Programming in D.

Support for Markdown-like syntax has been requested now and again in the forums. Now that’s available in DMD 2.087. It’s currently in preview mode, so it requires the -preview=markdown flag. There are some differences from the other flavors of Markdown you may be familiar with, so be sure to read the list of supported features before putting it to use.

Supporting the development of D

Much of the blood, sweat, and tears that went into this and every release of the compiler was provided by volunteers. They do the work they can within the constraints of their knowledge, skills, experience and, the mother of all limiters, time. There are some tasks that have yet to fit within the bounds of any volunteer’s constraints. These will require dedicated effort to strike off the list.

To that end, I’d like to remind everyone that we are raising money for a Human Resource Fund that we will use to bring in some folks specifically to tackle these difficult tasks. WekaIO seeded the fund with a generous contribution and a few of our community members have thrown some of their own resources into the pot with the gratitude of the D Language Foundation.

But we need more! We still have DConf 2019 t-shirts for anyone who wants to throw $60 at the HR fund, as well as DMan shirts and DConf 2020 registration discounts for those able and willing to donate more. See my recent blog post on the topic for details to make sure you donate to the correct campaign in order to get the shirt you want.

Another reminder: if you want to become a Gold Donor or Personal Sponsor on our Open Collective page (which is separate from the HR Fund – again, read my recent blog post to avoid confusion), please either ensure your email address is included in your profile or contact me directly to let me know who you are. Otherwise, I can’t send you a DMan shirt!

DStep 1.0.0

DStep is a tool for automatically generating D bindings for C and Objective-C libraries. This is implemented by processing C or Objective-C header files and outputting D modules. DStep uses the Clang compiler as a library (libclang) to process the header files.

Background

The first version of DStep was released on the 7th of July, 2012. There have been four subsequent releases, the last of which was on the 16th of January, 2016. Quite a lot has happened in the D world and with DStep since then.

After the release of DStep 0.2.1 in January 2016, there wasn’t much progress on DStep. I had a limited amount of time and chose to spend it on other projects. Fortunately, in 2016, DStep got picked as one of four D-related projects for Google Summer of Code (GSoC). The student who chose to work on DStep was Wojciech Szęszoł. He did a tremendous amount of work and pushed DStep forward by years compared to the time it would have taken me. In fact, I was often a blocker because I couldn’t keep up with reviewing all the changes he made.

New Release

The latest release of DStep contains a huge number of new features and bug fixes. A lot of the new features add support for translating C preprocessor macros in various forms. DStep also gained support for one more platform: Windows. Here follow some of the new features available in DStep 1.0.0:

Support for Simple Defines

This feature adds support for translating a simple form of #define to a manifest constant in D. Example:

#define FOO 1

The above C code is translated to the following D code:

enum FOO = 1;

DStep will try to translate the C code so that the D code looks as much as possible like the original C code. If a #define contains an expression instead of a single literal, DStep will try to preserve the original expression:

#define FOO 1 + 3

Instead of translating this to a manifest constant with the value of 4 (which would be semantically correct), DStep will preserve the original expression and translate it to:

enum FOO = 1 + 3;

This also goes for other types of literals, like hexadecimal literals:

#define FOO 0x1

Here DStep will preserve the hexadecimal literal and translate it to:

enum FOO = 0x1;

Function-Like Macros

DStep is now able to translate function-like macros. This is a pretty advanced feature that requires a small parser for the macros. DStep uses libclang to tokenize the macros and the parses them to be able to do the proper translations. The most basic example looks like:

#define FOO() 0 + 1

The above macro will translate to the following D code:

extern (D) int FOO()
{
    return 0 + 1;
}

Although not shown here (to minimize the examples), DStep will output extern (C): at the top of each file. Therefore, for macros translated to functions, DStep will add extern (D) to give the functions D linkage and mangling.

Here’s an example of a C macro containing parameters:

#define FOO(a, b) a + b

Unfortunately, in C, a and b can be basically anything. D doesn’t have an exact corresponding feature. DStep will translate this as accurately as
possible by outputting a templated function:

extern (D) auto FOO(T0, T1)(auto ref T0 a, auto ref T1 b)
{
    return a + b;
}

The assumption in this translation is that a and b will be a value of some kind of type. They can either be of the same type or of different types. To
avoid copying any of the values, ref parameters are used. Since an rvalue cannot be passed to a ref parameter, auto ref is used instead to properly handle both rvalues and lvalues.

More advanced expressions are supported as well:

#define BAR 4
#define FOO(a, b) a + 3 + (b + BAR) - sizeof(b)

In the above example there’s a combination of parameters, literals, parenthesized expression, usages of other macros, and built-in operators. DStep handles all those and translates it to:

enum BAR = 4;

extern (D) auto FOO(T0, T1)(auto ref T0 a, auto ref T1 b)
{
    return a + 3 + (b + BAR) - b.sizeof;
}

Again, the expression is preserved as closely as possible to the original source code. The parentheses, the reference to the BAR macro, all are preserved.

Token Concatenation

This feature adds support for translating the token concatenation, or token pasting, operator to a D string concatenation:

#define CONCAT(prefix, name) prefix ## name

The above function-like macro concatenates the two given tokens. DStep translates that to a function that converts the arguments to strings and concatenates the two resulting strings. This can then be used together with the string mixin statement to give the same behavior as in C.

extern (D) string CONCAT(T0, T1)(auto ref T0 prefix, auto ref T1 name)
{
    import std.conv : to;

    return to!string(prefix) ~ to!string(name);
}

Another example is parameters combined with tokens:

#define CONCAT(prefix) prefix ## name

This translates similarly to the previous example, but since name is not a parameter this will be translated to a string literal:

extern (D) string CONCAT(T)(auto ref T prefix)
{
    import std.conv : to;

    return to!string(prefix) ~ "name";
}

Preprocessor Constants in Array Sizes

DStep will now preserve preprocessor constants for the size of arrays:

#define Foo 3
int a[Foo];

In previous versions of DStep the translation would just output the size of the array a as 3. This would be semantically accurate but the generated source
code would look less like the original C source code. Now DStep is able to translate preprocessor constants and can, therefore, use the preprocessor constant as the size of the array:

enum Foo = 3;
extern __gshared int[Foo] a;

In the above example, the manifest constant Foo is used as the size of a instead of a plain 3. This more closely matches the original C source code.

Preserving Comments

In previous versions of DStep comments were completely stripped out. With this release DStep is able to preserve comments in the D code from the original C code:

// This comment describes this whole file

// Documentation for the symbol `foo`
void foo();

/* Loose comment */ /* Loose comment */

/*
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
*/ /* Loose comment */

int a; // this is `a`

In the above example there are three types of comments:

  • A header comment for the whole file
  • A preceding comment for the symbol foo
  • Loose comments not belonging to any symbol
  • A trailing comment for the symbol a

All of these comments are now properly preserved:

// This comment describes this whole file

extern (C):

// Documentation for the symbol `foo`
void foo ();

/* Loose comment */ /* Loose comment */

/*
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
*/ /* Loose comment */

extern __gshared int a; // this is `a`

In the above example, notice how the header comment is placed above the extern (C): line. If a module declaration is output in the D file (when the --package flag is used), the header comment will be placed above that as well:

// This comment describes this whole file

module bar.foo;

extern (C):

// Documentation for the symbol `foo`
void foo ();

/* Loose comment */ /* Loose comment */

/*
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
*/ /* Loose comment */

extern __gshared int a; // this is `a`

It’s also possible to disable the preservation of comments using the --comments=false flag.

Package Prefix

To better help organize bindings, DStep supports the
--package <name.of.package> flag. When this flag is enabled DStep will put all the translated modules in the <name> package. Note, this will only add the package prefix to the module declaration of the translated file. It will not place the output file in a directory corresponding to the package.

Removing Excessive Newlines

DStep will now remove excessive newlines but still preserve spacing for the original C code. This is best illustrated with an example:

int a;


int b;

In the above example there are two newlines between the declarations of a and b. DStep will remove the excessive newline and only output one to still preserve the spacing:

extern __gshared int a;

extern __gshared int b;

But if there is no spacing between the declarations, that is respected as well:

int a;
int b;

In the above example there is no newline between the declarations and DStep will preserve that:

extern __gshared int a;
extern __gshared int b;

Preserving Order of Declarations

In previous versions of DStep the declarations in the translated D code would follow a certain order defined by DStep, aliases first, then constants, then types and last functions. With this release, DStep will now preserve the order of the declarations of the original C code:

void bar();

struct Foo
{
    int a;
};

Previous versions would translate the above to:

struct Foo
{
    int a;
}

void bar ();

with the struct first and then the function declaration. With this release, the order is preserved:

void bar ();

struct Foo
{
    int a;
}

Multiple Input Files

Previous versions of DStep only allowed a single header file as input. With this release, multiple files can be passed to DStep at once. Each input file will produce one D source file as input. To pass multiple input files to DStep, just pass the filenames when invoking DStep.

$ dstep foo.h bar.h

Running the above command will produce two D source files: foo.d and bar.d.

If multiple input files and the -o flag are given, the -o flag specifies the output directory where the D source files will be placed. When multiple input
files are given it’s not possible to specify the names of the D source files.

$ dstep foo.h bar.h -o foobar
$ ls foobar
bar.d foo.d

The above command will place the two D source files in the directory foobar.

--reduce-aliases Flag

Normally when DStep translates a header file to a D module it will reduce aliases if possible. DStep contains a set of common typedefs that can be reduced to native D types. That means that code like this:

#include ;
int32_t a = 3;

Will be translated to the following D code:

extern __gshared int a;

In this release of DStep, there’s a new flag, --reduce-aliases. This flag allows the reduce aliases feature to be enabled or disabled. By default it’s enabled, but can be disabled by invoking DStep with the following command: dstep --reduce-aliases=false. When this feature is disabled, it will translate the above example to the following D code:

import core.stdc.stdint;
extern __gshared int32_t a;

It will keep the name int32_t as the type of the variable declaration and add the import for the module that contains the declaration of int32_t.

--alias-enum-members Flag

In C, enum members are accessible directly in the global scope. Example:

enum Foo
{
    foo,
    bar
};

enum Foo a = foo;

In D however, enum members need to be qualified with the enum name. The correct translation of the above would be:

enum Foo
{
    foo = 0,
    bar = 1
}


Foo a = Foo.foo;

In this release of DStep a new flag has been added, --alias-enum-members, that enables the generation of aliases to enum members at module scope. This will allow keeping the translation more closely to the original C code:

enum Foo
{
    foo = 0,
    bar = 1
}

alias foo = Foo.foo;
alias bar = Foo.bar;

By default this feature is not enabled for the translated D code to more closely follow the D conventions.

--translate-macros Flag

In this release, DStep can now translate several kinds of C macros to their equivalent in D. This might not always be desirable because the translations are not bulletproof. Therefore, there’s a new flag, --translate-macros, which will enable or disable the translation of macros. By default translation of macros is enabled.

libclang Bindings

DStep uses Clang as a library to process the C code. There two main ways of using Clang as a library. One is to use the C++ APIs directly. This will give full access to what Clang can do. The problem with this API is that is not a stable API. It’s also a C++ API and when DStep was first implemented, the C++ integration in D was quite lacking. One of my requirements when implementing DStep was to implement it in D. Therefore, the natural choice was to use the C API, called “libclang”, which is also provided. This library is the main interface intended to be used by editors and IDEs that want to leverage Clang as a library. It’s also recommended to use libclang when accessing Clang from a language other than C++—D in my case.

Since libclang is a C library only exposing C header files, I needed bindings to be able to use from it within D. Up until now these bindings were hand made. Fortunately, these headers are the most forgiving I have ever seen when it comes to translating into D. Only a single header file was needed containing hardly any macros at all. It was quite a quick job of translating these headers by using search-and-replace.

With this release of DStep, since DStep has improved so much, the bindings are now self-hosted. That is, DStep has been used to generate the bindings. In addition to that, generation of the bindings has now been added to the test suite to make sure it doesn’t break.

Support for Windows

DStep was originally developed on macOS since that is my main development platform. Thanks to the Posix standard it was easy to port to Linux as well. The first release of DStep was available for both macOS and Linux. Back in 2011 when the development of DStep first started, Clang did not support Windows. There seems to have been some support for MinGW, but that was not a target supported by DMD. The LLVM and Clang team has made huge progress since then and in 2016 when DStep got picked as a GSoC project, support for Windows was available, targeting compatibility with the Visual Studio compiler.

In this release, thanks to Wojciech Szęszoł, DStep is now available on Windows. Due to Clang being compatible with Visual Studio, DStep needs to be built to use the same object format. When compiling with DMD, that means compiling for 64-bit (via the -m64 flag) or using the -m32mscoff flag when compiling for 32 bit. The Dub package automatically takes care of this.

Continuous Integration/Deployment

In the area of CI/CD quite a few things have happened. Originally the test suite of DStep was implemented using Cucumber and Ruby. These tests were a form of end-to-end test and failed to take advantage of the reasons to use Cucumber in the first place. These tests have now been replaced with a combination of unit tests and end-to-end tests, all implemented in D.

LDC

LDC has been added as a supported compiler. That means that DStep is compiled with LDC in addition to DMD as part of the CI pipelines. Every commit and every pull request is now tested with LDC as well.

Upgrade of Compilers

Both DMD and LDC have been upgraded to their latest versions. In addition, beta and nightly releases are being tested in the CI pipelines. A scheduled job has been added to the CI pipelines as well, which will run once every day to make sure new releases of the compilers won’t break DStep even if no changes have been made to DStep. This also means that only the latest versions of LDC and DMD are supported for building DStep.

Testing Windows using AppVeyor

Since DStep now supports Windows as an additional platform a new CI pipeline has been added in the form of AppVeyor. This is a CI service that provides Windows as a platform to run builds on. This build run compiles DStep both using DMD and LDC and it also builds both 32-bit and 64-bit versions.

Complex Floating-Point Types

Another feature that is new in this version of DStep is that complex floating-point types are now supported. There are three complex types that are supported:
float _Complex, double _Complex and long double _Complex.

float _Complex a;
double _Complex b;
long double _Complex c;

The above code snippet in C is translated to the following D code:

extern __gshared cfloat a;
extern __gshared cdouble b;
extern __gshared creal c;

New Alias Syntax

Typedefs in C header files are translated to alias declarations in the D code. Up until this release they used the old alias syntax: alias oldName newName. Since the previous release of DStep the D language has improved and gained new features. One of them is a new (now considered the standard) alias syntax: alias newName = oldName. It’s easier to follow which name is the alias and which is the original when it’s using a more familiar syntax similar to variable declarations. Here’s an example of how the C code is translated into D code with the new alias syntax:

typedef int foo;

The above C code is translated to the following D code:

alias foo = int;

Custom Global Attributes

By default DStep doesn’t add any attributes like @nogc or nothrow to the translated code. In this release of DStep, support for attributes has been added. Custom global attributes can be enabled with the --global-attribute flag. For a C header file with the following content:

int a;

And invoking DStep with the following command:

$ dstep foo.h --global-attribute @nogc --global-attribute nothrow

Will output the following D code:

@nogc:
nothrow:

extern __gshared int a;

Rename Enums

Unlike in D, enums in C don’t create a new scope for their members, even if a name is given to the enum. Example:

enum
{
    a,
    b
};

enum Foo
{
    c,
    d
};

int e = a;
int f = c;

In the above example it’s possible to access the enum members from both of the enums without qualifying the type. In D, this is not the case for named enums. They require the qualifying the enum member with the type name:

enum
{
    a,
    b
}

enum Foo
{
    c,
    d
}

int e = a; // ok since the first enum is anonymous
int f = Foo.c; // need to qualifying the enum member with the type name

To reduce the risk of symbol conflict it’s quite common for C libraries to prefix enum members with the name of the type:

enum Foo
{
    FooC,
    FooD
};

While the following is perfectly fine in D as well, it gets a bit redundant and verbose to have to specify Foo twice:

enum Foo
{
    FooC,
    FooD
}

int c = Foo.FooC;
int d = Foo.FooD;

For this reason DStep now supports a new flag, --rename-enum-members, which when enabled will try to remove any prefix of the enum member names. Given the
following C header file:

enum Foo
{
    FooA,
    FooB
};

And running DStep as follows:

$ dstep foo.h --rename-enum-members

It will produce the following D code:

enum Foo
{
    a = 0,
    b = 1
}

DStep identified the Foo prefix and removed it from the enum member names. It also converted the names to lowercase to better match the standard D naming
conventions.

By default this feature is not enabled to more closely match the original C code.

Normalize Modules

When the --package flag is specified DStep will add a module declaration to all D modules. By default, it will use the name of the input file as the name of the module. In the C world there’s no direct file-naming convention. Some libraries will use all lowercase letters, some will use snake case, some will use camel case, some will use Pascal case, and so on.

The standard D naming convention for modules (and therefore files) is to use only lowercase letters and underscores, i.e snake case. To help with following this convention DStep now supports the new flag --normalize-modules. When this flag is enabled (and the --package flag is used) DStep will try to convert the name of the input file to a name matching the D conventions.

Given a C header file named Foo.h and only invoking DStep with the --package flag:

$ dstep Foo.h --package bar

DStep will produce the following D code:

module bar.Foo;

When the --normalize-modules flag is used as well:

$ dstep Foo.h --package bar --normalize-modules

DStep will output the following D code:

module bar.foo;

Note that Foo has been converted to foo.

Another example with a file using the Pascal naming convention:

$ dstep NSString.h --package bar --normalize-modules

And the result:

module bar.ns_string;

By default this feature is not enabled to more closely match the original C code.

Bit Fields

Another new feature that has been added in this release of DStep is support for bit fields. The bit field is a built-in language construct in C, but there’s no language support for it in D. Fortunately, with the help of D’s metaprogramming capabilities, the bit field has been implemented as a library construct and is available in the standard library [3]. The library construct will generate getters and setters that perform the same bit manipulation that
the C compiler would have generated.

The following snippet in C:

struct Foo
{
    unsigned int a : 1;
    unsigned int b : 2;
    unsigned int c : 5;
};

Is translated to D:

struct Foo
{
    import std.bitmanip : bitfields;

    mixin(bitfields!(
        uint, "a", 1,
        uint, "b", 2,
        uint, "c", 5));
}

The D translation makes use of the bitfields template from the standard library. It’s automatically imported, directly inside the struct, to minimize the scope of where the symbol is available.


In his day job, Jacob Carlborg is a DevOps engineer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and DVM, the topic of another post on this blog. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.

Project Highlight: DPP

D was designed from the beginning to be ABI compatible with C. Translate the declarations from a C header file into a D module and you can link directly with the corresponding C library or object files. The same is true in the other direction as long as the functions in the D code are annotated with the appropriate linkage attribute. These days, it’s possible to bind with C++ and even Objective-C.

Binding with C is easy, but can sometimes be a bit tedious, particularly when done by hand. I can speak to this personally as I originally implemented the Derelict collection of bindings by hand and, though I slapped together some automation when I ported it all over to its successor project, BindBC, everything there is maintained by hand. Tools like dstep exist and can work well enough, though they come with limitations which require careful attention to and massaging of the output.

Tediousness is an enemy of productivity. That’s why several pages of discussion were generated from Átila Neves’s casual announcement a few weeks before DConf 2018 that it was now possible to #include C headers in D code.

dpp is a compiler wrapper that will parse a D source file with the .dpp extension and expand in place any #include directives it encounters, translating all of the C or C++ symbols to D, and then pass the result to a D compiler (DMD by default). Says Átila:

What motivated the project was a day at Cisco when I wanted to use D but ended up choosing C++ for the task at hand. Why? Because with C++ I could include the relevant header and be on my way, whereas with D (or any other language really) I’d have to somehow translate the header and all its transitive dependencies somehow. I tried dstep and it failed miserably. Then there’s the fact that the preprocessor is nearly always needed to properly use a C API. I wanted to remove one advantage C++ has over D, so I wrote dpp.

Here’s the example he presented in the blog post accompanying the initial announcement:

// stdlib.dpp
#include <stdio.h>
#include <stdlib.h>

void main() {
    printf("Hello world\n".ptr);

    enum numInts = 4;
    auto ints = cast(int*) malloc(int.sizeof * numInts);
    scope(exit) free(ints);

    foreach(int i; 0 .. numInts) {
        ints[i] = i;
        printf("ints[%d]: %d ".ptr, i, ints[i]);
    }

    printf("\n".ptr);
}

Three months later, dpp was successfully compiling the julia.h header allowing the Julia language to be embedded in a D program. The following month, it was enabled by default on run.dlang.io.

C support is fairly solid, though not perfect.

Although preprocessor macro support is one of dpp’s key features, some macros just can’t be translated because they expand to C/C++ code fragments. I can’t parse them because they’re not actual code yet (they only make sense in context), and I can’t guess what the macro parameters are. Strings? Ints? Quoted strings? How does one programmatically determine that #define FOO(S) (S) is meant to be a C cast? Did you know that in C macros can have the same name as functions and it all works? Me neither until I got a bug report!

Push the stdlib.dpp code block from above through run.dlang.io and read the output to see an example of translation difficulties.

The C++ story is more challenging. Átila recently wrote about one of the problems he faced. That one he managed to solve, but others remain.

dpp can’t translate C++ template specialisations on reference types because reference types don’t exist in D. I don’t know how to translate anything that depends on SFINAE because it also doesn’t exist in D.

For those not in the know, classes in D are reference types in the same way that Java classes are reference types, and function parameters annotated with ref accept arguments by reference, but when it comes to variable declarations, D has no equivalent for the C++ lvalue reference declarator, e.g. int& someRef = i;.

Despite the difficulties, Átila persists.

The holy grail is to be able to #include a C++ standard library header, but that’s so difficult that I’m currently concentrating on a much easier problem first: being able to successfully translate C++ headers from a much simpler library that happens to use standard library types (std::string, std::vector, std::map, the usual suspects). The idea there is to treat all types that dpp can’t currently handle as opaque binary blobs, focusing instead on the production library types and member functions. This sounds simple, but in practice I’ve run into issues with LLVM IR with ldc, ABI issues with dmd, mangling issues, and the most fun of all: how to create instances of C++ stdlib types in D to pass back into C++? If a function takes a reference to std::string, how do I give it one? I did find a hacky way to pass D slices to C++ functions though, so that was cool!

On the plus side, he’s found some of D’s features particularly helpful in implementing dpp, though he did say that “this is harder for me to recall since at this point I mostly take D’s advantages for granted.” The first thing that came to mind was a combination of built-in unit tests and token strings:

unittest {
    shouldCompile(
        C(q{ struct Foo { int i; } }),
        D(q{ static assert(is(typeof(Foo.i) == int)); })
    );
}

It’s almost self-explanatory: the first parameter to shouldCompile is C code (a header), and the second D code to be compiled after translating the C header. D’s token strings allow the editor to highlight the code inside, and the fact that C syntax is so similar to D lets me use them on C code as well!

He also found help from D’s contracts and the garbage collector.

libclang is a C library and as such has hardly any abstractions or invariant enforcement. All nodes in the AST are represented by a libclang “cursor”, which can have several “kinds”. D’s contracts allowed me to document and enforce at runtime which kind(s) of cursors a function expects, preventing bugs. Also, libclang in certain places requires the client code to manually manage memory. D’s GC makes for a wrapper API in which that is never a concern.

During development, he exposed some bugs in the DMD frontend.

I tried using sumtype in a separate branch of dpp to first convert libclang AST entities into types that are actually enforced at compile time instead of run time. Unfortunately that caused me to have to switch to compiling all code at once since sumtype behaves differently in separate compilation, triggering previously unseen frontend bugs.

For unit testing, he uses unit-threaded, a library he created to augment D’s built-in unit testing feature with advanced functionality. To achieve this, the library makes use of D’s compile-time reflection features. But dpp has a lot of unit tests.

Given the number of tests I wrote for dpp, compiling takes a very long time. This is exacerbated by -unittest, which is a known issue. Not using unit-threaded’s runner would speed up compilation, but then I’d lose all the features. It’d be better if the compile-time reflection required were made faster.

Perhaps he’ll see some joy there when Stefan Koch’s ongoing work on NewCTFE is completed.

Átila will be speaking about dpp on May 9 as part of his presentation at DConf 2019 in London. The conference runs from May 8–11, so as I write there’s still plenty of time to register. For those who can’t make it, you can watch the livestream (a link to which will be provided in the D forums each day of the conference) or see the videos of all the talks on the D Language Foundation’s YouTube channel after DConf is complete.

DMD 2.085.0 and DConf 2019 News

Coinciding with news of a new release of DMD is news about DConf 2019 in London. From new GC options in DRuntime to free beers and free tours at DConf, we may as well kill two birds with one blog post!

Compiler news

The 2.085.0 release of DMD, the D reference compiler, is now ready for download. Among other things, this release sees support for 32-bit OS X removed, more support for interfacing with Objective-C, and big news on the garbage collection front. There’s also been some work on compatibility with the standard C++ library. In total, 2.085.0 represents 58 closed Bugzilla issues and the efforts of 49 contributors. See the changelog for the full details.

Interfacing with Objective-C

DMD has had limited support for binding with Objective-C for some time. This release expands that support to include classes, instance variables, and super calls.

Previously, binding with Objective-C classes required using D interfaces. No longer. Now, Objective-C classes can be declared directly as D classes. Decorate an Objective-C class with extern(Objective-C), make use of the @selector attribute on the methods, and away you go.

To better facilitate interaction between the two languages, such classes have slightly modified behavior. Any static and final methods in an extern(Objective-C) class are virtual. However, final methods still are forbidden from being overridden by subclasses. static methods are overridable.

extern (Objective-C)
class NSObject
{
    static NSObject alloc() @selector("alloc");
    NSObject init() @selector("init");
    void release() @selector("release");
}

extern (Objective-C)
class Foo : NSObject
{
    override static Foo alloc() @selector("alloc");
    override Foo init() @selector("init");

    int bar(int a) @selector("bar:")
    {
        return a;
    }
}

void main()
{
    auto foo = Foo.alloc.init;
    scope (exit) foo.release();

    assert(foo.bar(3) == 3);
}

It’s also now possible to declare instance variables in Objective-C classes and for a method in an Objective-C class to make a super call.

extern (Objective-C)
class NSObject
{
    void release() @selector("release");
}

extern (Objective-C)
class Foo : NSObject
{
    // instance variable
    int bar;

    int foo() @selector("foo")
    {
        return 3;
    }

    int getBar() @selector("getBar")
    {
        return bar;
    }
}

extern (Objective-C)
class Bar : Foo
{
    static Bar alloc() @selector("alloc");
    Bar init() @selector("init");

    override int foo() @selector("foo")
    {
        // super call
        return super.foo() + 1;
    }
}

New GC stuff

Perhaps the biggest of the GC news items is that DRuntime now ships with a precise garbage collector. This can be enabled on any executable compiled and linked against the latest DRuntime by passing the runtime option --DRT-gcopt=gc:precise. To be clear, this is not a DMD compiler option. Read the documentation on the precise GC for more info.

Another new GC configuration option controls the behavior of the GC at program termination. Currently, the GC runs a collection at termination. This is to present the opportunity to finalize any live objects still holding on to resources that might affect the system state. However, this doesn’t guarantee all objects will be finalized as roots may still exist, nor is the need for it very common, so it’s often just a waste of time. As such, a new cleanup option allows the user of a D program to specify three possible approaches to GC clean up at program termination:

  • collect: the default, current, behavior for backward compatibility
  • none: do no cleanup at all
  • finalize: unconditionally finalize all live objects

This can be passed on the command line of a program compiled and linked against DRuntime as, e.g. --DRT-gcopt=cleanup:finalize.

All DRuntime options, including the two new ones, can be set up in code rather than being passed on the command line by declaring and initializing an array of strings called rt_options. It must be both extern(C) and __gshared:

extern(C) __gshared string[] rt_options = ["gcopt=gc:precise cleanup:none"];

See the documentation on configuring the garbage collector for more GC options.

Additional GC-related enhancements include programmatic access to GC profiling statistics and a new GC registry that allows user-supplied GCs to be linked with the runtime (see the documentation for details).

Standard C++

There are two enhancements to D’s C++ interface in this release. The first is found in the new DRuntime module, core.stdcpp.new_. This provides access to the global C++ new and delete operators so that D programs can allocate from the C++ heap. The second is the new core.stdcpp.allocator module, which exposes the std::allocator<T> class of C++ as a foundation for binding to the STL container types that allocate.

DConf 2019 news

There are two interesting perks for conference goers this year.

The nightly gathering spot

We now have an “official” gathering spot. Usually at DConf, we pick an “official” hotel where the D Language Foundation folks and many attendees stay, but where a number of conference goers gather in the evenings after dinner. This year, a number of factors made it difficult to pick a reasonable spot, so we opted for something different.

There’s a cozy little pub around the corner from the venue, the Prince Arthur, that has a nice room on the second floor available for reservation. There’s a limit on how many bodies we can pack in there at once, but folks generally come and go throughout the evening anyway. Any overflow can head downstairs to the public area. We’ve got the room May 8, 9, and 10.

Additionally, we’ll be offering a couple of free rounds each night courtesy of Mercedes-Benz Research & Development North America. Free drinks in a cozy backstreet London pub sounds like a great way to pass the time!

Check out the DConf venue page for details about the Prince Arthur and how to get there.

A free tour by Joolz Guides

Julian McDonnell of Joolz Guides will be taking some DConf registrants on a guided walk May 6 and 7. If you’ve never seen his YouTube channel, we recommend it. His video guides are quirky, informative, and fun.

This is available for free to all registrants, but space is limited! When you register for DConf, you’ll receive information on how to reserve your spot. We’ve arranged to have the tours in the mid-afternoon on both days so that folks arriving in the morning will have a chance to participate. This is a walking tour that will last somewhere between 3 – 4 hours, so be sure to wear comfortable shoes.

The current plan is to start at Temple Station and end at London Bridge. We hope you join us!

Deadlines

The DConf submission deadline of March 10 is just around the corner. Now’s the time to send in your proposal for a talk, demo, panel or research report. See the DConf 2019 homepage for submission guidelines and selection criteria. And remember, speakers are eligible for reimbursement of travel and accommodation expenses.

The early-bird registration deadline of March 17 is fast approaching. Register now to take advantage of the 15% discount!