Category Archives: Compilers & Tools

DMD Compiler as a Library: A Call to Arms

Having a flexible and powerful compiler library has been one of the stated goals of the D Language Foundation for some time now. This makes sense, as a proper compiler library will channel the efforts of contributors into building developer tools, which, in turn, will increase the adoption rate of the language. However, progress on this front has been slow, mainly due to two factors: (1) the lack of a clear direction, and (2) the intimidating complexity of the DMD frontend, which requires significant work on the compiler codebase.

The good news is that we now have a plan, which I will outline in this blog post. The bad news is that implementing this plan requires significant effort, and we need more contributors. However, the silver lining is that the work, while extensive, mostly involves refactoring the code. This provides an excellent opportunity for contributors to familiarize themselves with the compiler codebase while delivering real value. Before delving into the specifics, let me give you some background.

Current Status And How We Got Here

To fully understand the work done so far on the compiler-as-a-library project, I highly recommend watching my talk on this subject.

In summary:

  • Several years ago, we began packaging the compiler as a library.
  • Our goal was to clearly separate compilation phases: lexing, parsing, semantic analysis, optimizations, and code generation.
  • The parsing and semantic analysis modules were interdependent, necessitating a method for separation.
  • We opted to template the parser with an ASTFamily template parameter, defining the AST nodes required for parsing (a simplified sketch of this approach appears below).
  • We created ASTBase (containing AST nodes essential for parsing) and ASTCodegen (containing AST nodes needed for code generation).
  • ASTBase, as it stands, is code duplicated from ASTCodegen.
  • We started extracting semantic routines and fields from AST nodes to eliminate ASTBase’s code duplication by importing a subset of modules used by ASTCodegen.
  • Additionally, we began replacing third-party libraries (like libdparse) with the DMD-as-a-library package.

For more detailed information on each of these points, I recommend watching the talk I referenced.
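
To give a rough idea of what templating the parser over an AST family means, here is a minimal sketch. This is not DMD's actual code; the type names below are simplified stand-ins for the real ASTBase, ASTCodegen, and Parser declarations.

// The parser is a template taking an AST family, so the same parsing logic
// can build either lightweight syntax-only nodes or nodes that also carry
// the state needed for semantic analysis and code generation.
class SyntaxExpression { }      // syntax-only node
class CodegenExpression { }     // node with semantic/codegen state

struct ASTBase    { alias Expression = SyntaxExpression; }
struct ASTCodegen { alias Expression = CodegenExpression; }

class Parser(ASTFamily)
{
    ASTFamily.Expression parseExpression()
    {
        return new ASTFamily.Expression();  // build nodes from the chosen family
    }
}

alias SyntaxOnlyParser = Parser!ASTBase;    // enough for tools that only need parsing
alias FullParser       = Parser!ASTCodegen; // what the compiler itself uses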

Recently, I proposed to Walter a modification to the codebase that would significantly enhance the flexibility of the compiler library, allowing any AST node to be overridden. Walter was hesitant to accept my proposal, concerned about the potential “ugliness” it would introduce to the codebase. He cited the addition of ASTBase and the resulting code duplication as a precedent. He then suggested that if we eliminate ASTBase, he would reconsider my proposal.

What You Can Do To Help

We are now focused on eliminating the duplication in ASTBase. To achieve this, we need to extract all information related to semantic analysis from the existing AST nodes. The challenge is the sheer number of AST nodes and the multitude of functions associated with each. I have been working on this sporadically over the past few months, and progress is slow due to the nature of the work: it mostly involves moving code, creating visitors, breaking dependencies, etc. While not overly complex, it isn’t particularly creative work either. However, for someone interested in understanding a real-life compiler codebase, it’s an ideal starting point.

If you’re willing to support this initiative, I’ve put together a guide on where to start and what you can do. Feel free to contact me on Slack (razvan.nitu), Discord, or email (razvan.nitu1305@gmail.com) for more details or to request a review of your PR.

I see this as an excellent opportunity to onboard new people into compiler development in a way that benefits both the language and the contributor. So, if you have some spare time, please join us in getting this work done!

D News May ’22: D 2.100.0; GDC & LDC Releases; DConf ’22 Schedule Published & Early-Bird Registration Ends

May was a busy month in D land. Early on, a major milestone release of GDC, the GCC-based D compiler, hit the virtual shelves. It was followed in the middle of the month by the release of D 2.100.0 along with a release of DMD, the reference D compiler, of the same version. That was immediately followed by a beta release of the LLVM-based D compiler, LDC, version 1.30.0. Finally, the latter half of the month saw the publication of the DConf ’22 schedule, we found a sponsor for the DConf tradition of BeerConf, and May 31st marks the final day of DConf ’22 early-bird registration.

A video version of this blog post is available on the D Language Foundation YouTube channel.

D 2.100.0

This latest release of DMD comes to us courtesy of 41 contributors who brought us 22 major changes and 179 fixed Bugzilla issues. Although the community attached a bit of significance to the 2.100.0 version number, there isn’t anything overly exciting in the changelog. This is largely a house-cleaning release—a number of deprecation periods that should have already ended have been terminated— but there are a couple of interesting additions to the language.

D1-style operator overloading

One of the terminated deprecations is that of D1-style operator overloads. Originally, these were designed to make their purpose clear. Want to overload the addition operator? Then implement opAdd. Want to overload the multiplication operator? Then implement opMul. Walter took this approach with operator overloading because of one of the major complaints about the feature in C++: people often overload an operator to do something different from what it is expected to do. An example: overloading the + operator to append rather than perform addition. Walter’s reasoning was that if the intent of the operator is included in the name of the function, then anyone overloading it to do something different is essentially violating its contract. Perhaps it would encourage people to stick to the intent.

No one can say for sure if Walter’s approach worked like he hoped, but a more generic design was implemented in D2, and this is the approach all D code must use today. The D1 operators were kept around largely to ease porting D1 code to D2, with the intention that they would one day be deprecated. It finally happened in D 2.088.0, which was released in the fall of 2019. Following the deprecation process, the deprecation period should have ended with 2.098.0 (the first release after 10 non-patch releases including the deprecation).
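
As a quick illustration of the difference (the Meters type here is a made-up example, not anything from the standard library), the D1 scheme baked each operator's intent into the function name, while the D2 scheme funnels the binary operators through a single templated hook:

struct Meters
{
    double value;

    // D1 style (now removed): the operator's intent was part of the function name.
    // Meters opAdd(Meters rhs) { return Meters(value + rhs.value); }

    // D2 style: one templated member handles binary operators.
    Meters opBinary(string op : "+")(Meters rhs)
    {
        return Meters(value + rhs.value);
    }
}

unittest
{
    assert(Meters(2) + Meters(3) == Meters(5));
}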

delete

The delete keyword was another D1 feature that was ultimately axed in D2. It was deprecated in D 2.079.0, which was released in the spring of 2018. This was something that had long been planned (see the deprecation page for the rationale), and its use had been discouraged for some time.

delete would both destroy an object instance (call its destructors) and release the memory allocated for it by the GC. Now, we use the destroy function from the object module, which is imported by default in all D programs. This will call the destructor on an instance and optionally reset the instance to its default .init state. The GC will then free the memory allocated for the instance when necessary, or the programmer can do it manually via the GC.free static member function in core.memory.
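
Here is a minimal sketch of the replacement pattern (the Resource class is hypothetical):

import core.memory : GC;

class Resource
{
    ~this() { /* release non-GC resources here */ }
}

void main()
{
    auto r = new Resource();

    // D1: `delete r;` destroyed the object and freed its memory in one step.
    destroy(r);             // runs the destructor; the GC reclaims the memory later
    GC.free(cast(void*) r); // or release the memory manually right away
}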

@mustuse

Paul Backus took DIP 1038 through the review process from beginning to end. Initially, it introduced an @nodiscard attribute for functions and types. During the Formal Assessment after the review rounds were completed, Walter and Átila were willing to approve it with changes. The final version renamed the attribute to @mustuse and restricted its application to structs and unions.

The feature was implemented in D 2.100.0 as @mustuse, and is now available to use in your D code. When a type marked with the attribute is the result of an expression, the result cannot be ignored.
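
Here is a minimal sketch of how it behaves, assuming the attribute is imported from core.attribute as specified in the final DIP (the ErrorCode type and doWork function are hypothetical):

import core.attribute : mustuse;
import std.stdio : writeln;

@mustuse struct ErrorCode
{
    int value;
}

ErrorCode doWork()
{
    return ErrorCode(0);
}

void main()
{
    // doWork();             // error: discarding a value of the @mustuse type ErrorCode
    ErrorCode rc = doWork(); // fine: the result is used
    writeln(rc.value);
}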

.tupleof for static arrays

Many D programmers are familiar with the .tupleof property of structs, which is particularly useful when interfacing with C libraries:

struct Circle {
    float x, y;
    float radius;
    ubyte r, g, b, a;
}

@nogc nothrow
extern(C) void draw_circle (
    float cx, float cy, float radius,
    ubyte r, ubyte g, ubyte b, ubyte a
);

void foo() {
    Circle c = makeCircle();
    draw_circle(c.tupleof);
}

Now we can do the same thing with static arrays:

void foo(int, int, int) { /* ... */ }

int[3] ia = [1, 2, 3];
foo(ia.tupleof); // same as `foo(1, 2, 3);`

float[3] fa;
//fa = ia; // error
fa.tupleof = ia.tupleof;
assert(fa == [1F, 2F, 3F]);

DConf ’22

DConf ’22 is happening in London, August 1–4. If you haven’t registered yet and you’re reading this on or before May 31st, then register now to take advantage of the 15% early-bird discount. The schedule is online and BeerConf is a go!

DConf ’22 schedule

We love to see and hear first-time speakers at DConf, whether it’s their first conference talk ever or their first DConf talk. This year, we have 11 first-time DConf speakers, 12 if you include our invited keynote speaker Roberto Ierusalimschy (the head designer of the Lua programming language). This is awesome!

The DConf ’22 schedule is set up as follows:

  • three keynotes: two from the language maintainers, one from our guest speaker
  • two panels: the traditional DConf Ask Us Anything involving the language maintainers, and a panel on Programming Language Design
  • a Lightning Talks session
  • 15 presentations (11 of which are from first-time DConf speakers)

We’re limiting the talks to 45 minutes this year so that we’ll have more time to mingle between sessions. One of the talks on Day 3 is slated for 25–30 minutes, so we’ve slotted it such that we have a longer lunch that day.

The schedule (excluding the keynotes, as the details of those haven’t yet been provided) has a loose theme. It’s not perfect, but it’ll do:

  • Day One is mostly status reports and tutorials
  • Day Two is largely intermediate to advanced and heavy on the tech
  • Day Three is about the D ecosystem

All of the talks will be livestreamed and recorded, so they’ll be available on our YouTube channel at some point after the conference has ended. Still, DConf is about more than just the talks, as Razvan Nitu and Dennis Korpel noted in an interview. It’s about getting to know in person the people we encounter online in our regular D community interactions. As Razvan said and I can attest, your perspective will surely change after you can match the internet handles with living, breathing, human beings with whom you’ve interacted in person.

So register!

Early-bird registration ends

May 31st is the last day of early-bird registration. With the 15% discount and 20% VAT, the total is $423.30 USD. We also show the GBP equivalent on the site, based on the HMRC exchange rate for the current month, and accept payments in GBP through PayPal. On June 1st, the general registration rate of $498.00 USD (including 20% VAT) kicks in.

If you are a student, there’s a flat rate of $120.00 USD (including 20% VAT). Email social@dlang.org to take advantage of it.

We also offer a flat rate of $240.00 USD (including 20% VAT) for major open source contributors. The keyword here is major. It’s not something for which we can set specific criteria, and we don’t really want to provide examples that may discourage inquiries. If you would like to see if you qualify for this discount, please email social@dlang.org, and we’ll let you know.

Finally, we also offer a hardship rate. If you would like to attend DConf but can’t afford the registration, just email social@dlang.org and we’ll see about helping you out. We can’t help you with transportation, just the registration.

BeerConf

BeerConf is a DConf tradition going back to the very beginning, though we didn’t call it that back then. Every year, we would designate an “official” hotel somewhere in the vicinity of the venue. This would be our gathering spot in the evenings, usually in the hotel lobby or bar. Typically, people would break off into groups for dinner, then several of them would wander over to the gathering spot to hang out and chat, usually over beers. At DConf 2017, Ethan Watson branded this gathering BeerConf and the name has stuck.

At DConf 2019 in London, we couldn’t find a suitable hotel to select as the site of BeerConf. Instead, we hired out the upper floor of a pub close to the venue, thanks to the sponsorship of Mercedes Benz Research and Development North America. For DConf ’22, we’re back in the same general area, and so we again have to hire out a pub.

The 2019 pub was a bit crowded for us, and is a bit too far of a walk from our ’22 venue, so we’ve got our eyes on another pub within walking distance of the venue and near some of the budget hotels listed at dconf.org. What we’ve been missing is funding.

That has changed, thanks to Funkwerk! With their sponsorship, we’re able to cover the minimum spend the pub asks for each of the evenings of August 1–3. This means that DConf attendees dropping by this pub on those nights can order food and drinks (alcoholic or otherwise) for free until the DConf tab runs out. We’ll have a separate tab for each night so that we don’t blow it all in one go.

Unfortunately, I can’t announce the specifics about the pub just yet. Our DConf host, Symmetry Investments, is handling the arrangements for us since they’re in London and we aren’t. Once I receive confirmation that the deal is set, I’ll announce all of the details in the forums, here on the blog, and at dconf.org. So keep your ears open!

Thanks again to Funkwerk for helping us out.

Next time

The next big news roundup will come in late August or early September, but I’ll keep the blog updated with announcements before DConf as they come. If you are planning to attend DConf, then I’m looking forward to seeing you in London. And if you aren’t, then change your plans!

D News Jan-Mar 2022: SAOC 2021, D 2.099.0, DConf ’22

The first three months of 2022 brought some major milestones:

  • Symmetry Autumn of Code 2021 came to an end on January 15, but the judges didn’t render a decision until the middle of February. And what a surprise it was!
  • The D Language Foundation announced in January that we were hiring for a vacant position sponsored by Symmetry Investments, and in February we found the person to fill it.
  • Also in February, we made a long-awaited announcement regarding DConf.
  • In early March, D 2.099.0 was released.

That’s a pretty solid start to 2022, and most of it was made possible thanks to the generous contributions of Symmetry Investments. If you’re looking for a job, Symmetry is always hiring, including D programmers!

And now on with the news.

Symmetry Autumn of Code 2021

We started SAOC 2021 with five participants, each working on projects that would be of value to the D community. Three of them were unable to make it to the end. So it came down to two: Teodor Dutu and Luís Ferreira. Teodor was working on converting DRuntime hooks to templates, and Luís on getting support for D into LLDB, the LLVM debugger.

SAOC is sponsored by Symmetry Investments. Each year, participants promise to work on their projects at least 20 hours per week across four month-long milestones. At the end of each of the first three milestones, a panel of judges evaluates their progress to decide if they pass or fail. A passing participant is awarded a $1000 payment and allowed to continue in the next milestone. A failing participant might be given a reduced payment or none at all, and removed from the event or given a warning, depending on the circumstances leading to the failure. At the end of the fourth milestone, the judges evaluate the overall progress of each participant across the entire event and select one to receive a final $1000 payment and a free trip to DConf.

For the first time in four editions of the event, the SAOC 2021 judges were unable to agree on who should receive the final rewards. It was a three-judge panel, each of whom is a veteran of every edition of SAOC: Jon Colvin, Átila Neves, and Robert Schadek. Two of them split, and the third felt there wasn’t enough to make either of the two participants stand out above the other. Teodor and Luís both did their work, wrote detailed milestone reports, and kept up with their forum updates to the same degree. So the conflicted judge took a proposal to Laeeth Isharc of Symmetry: why not award both candidates the final payment and the DConf trip?

Congratulations to Teodor and Luís on being the first dual recipients of the final SAOC reward. They have continued working on their projects, and we look forward to seeing the work they do in the future. Thanks to all of the SAOC participants, mentors, and judges, and to Symmetry Investments for sponsoring the event every year.

The New Pull Request and Issue Manager

For over a year, Razvan Nitu has been working hard at closing Bugzilla issues and merging pull requests in his role as our Pull Request and Issue Manager. His position is sponsored by Symmetry Investments, which provided funding for two such positions. Unfortunately, real-world circumstances conspired to prevent the person selected for the second position from filling it, so it remained vacant through most of 2021.

At the beginning of this year, Symmetry committed to continuing funding for both positions (as well as a different position, that of my assistant, filled by Max Haughton). In January, we put out a call for applications. In February, we announced that Dennis Korpel was selected for the job. His proven track record as a volunteer contributor to the core D repositories made him the top contender.

Dennis officially started his new job on March 1, and he hit the ground running. We’re happy to have him on board.

Tell them about it–#dbugfix

Razvan and Dennis are here to make sure the bugs are fixed and pull requests are merged. If you have an issue that’s bugging you because it’s been open for ages, or if you feel like a pull request should be getting more attention, let them know! That’s what they’re here for.

One way you can do that is by tweeting the issue number along with #dbugfix. We initiated this hashtag a while back so that D users could bring attention to specific issues, but then the hard part was finding someone with the time and inclination to fix it. Now, with both Razvan and Dennis paid to make sure issues get fixed, the hard part is a lot easier. You can also post about issues in the forums or email social@dlang.org, and I will make sure that they see it.

Razvan and Dennis have their criteria for deciding their priorities in the absence of input, but if you bring an issue or PR to their attention, they will work to resolve it as quickly as they can.

D 2.099.0

Version 2.099.0 of DMD, the reference D compiler, was released on March 6. This is a massive release, containing 20 major changes and 221 closed Bugzilla issues from 100 contributors. Some highlights from this release: D modules can be imported into C code via ImportC; D now has throw expressions; and PE/COFF output is now the default in DMD on Windows. See the changelog for the complete list.

Import modules in C source code with ImportC

ImportC is proving to be a valuable addition to D. Once all the kinks are ironed out and a solution for handling C preprocessor directives is implemented, the need for bindings to C libraries will largely disappear—you’ll be able to bring C headers, and compile C source files, directly into your D programs without any external tools.

As of D 2.099.0, you can also bring D modules directly into C files via the __import keyword.

// dsayhello.d
import core.stdc.stdio : puts;

extern(C) void helloImport() {
    puts("Hello __import!");
}
// dhelloimport.c
__import dsayhello;
__import core.stdc.stdio : puts;

int main(int argc, char** argv) {
    helloImport();
    puts("Cool, eh?");
    return 0;
}

Compile with:

dmd dhelloimport.c dsayhello.d

You can also use it to import C modules that have been compiled via ImportC:

// csayhello.c
__import core.stdc.stdio : puts;

void helloImport() {
    puts("Hello _import!");
}
// chelloimport.c
__import csayhello;
__import core.stdc.stdio : puts;

int main(int argc, char** argv) {
    helloImport();
    puts("Cool, eh?");
    return 0;
}

Compile with:

dmd chelloimport.c csayhello.c

The throw expression has been implemented

For all of D’s lifetime, throw has been a statement and only a statement. It couldn’t be used in expressions because expressions must have a type, and since throw doesn’t return a value, there was no suitable type. This prevented it from being used with the following syntax:

(string err) => throw new Exception(err);

And required this form instead:

(string err) { throw new Exception(err); }

DIP 1034, which introduced a bottom type to the language, provided the means to enable throw expressions: a throw statement is now seen as an expression returning the bottom type. As of D 2.099.0, the following code snippet compiles:

void foo(int function() f) {}

void main() {
    foo(() => throw new Exception("something went wrong"));
}

PE/COFF is the default DMD output on Windows

For many years, DMD output object files in the OMF format on Windows. There’s a story behind this, a large part of it related to the culture of software development on Windows, but it can be summarized in two bullet points:

  • Walter Bright already had a C compiler backend that generated OMF output, a license to distribute OMF link libraries for the Win32 API, and a linker that understands OMF (OPTLINK).
  • There was no de facto system linker on Windows when he started working on D in 1999, so he could not rely on a specific linker being installed.

Reusing the compiler backend and the linker allowed Walter to distribute DMD as a compiler that worked out of the box, without the need to install any further development tools. He felt this was important for D’s early adoption. The downside was that it also restricted DMD on Windows to 32-bit. Eventually, he had to support PE/COFF and require the Microsoft linker in order to support 64-bit output, and he implemented PE/COFF 32-bit at the same time, but he was adamant that DMD continue to work out of the box for those who didn’t want to install the Microsoft Build Tools (for the linker) and Windows SDK (for the Win32 link libraries).

Eventually, OPTLINK started showing its age. Linker errors became more common as D codebases grew. There were calls to enable PE/COFF by default. Finally, someone raised the idea of shipping the LLVM linker, LLD, along with link libraries generated from the MinGW project. This would allow DMD to eventually default to PE/COFF while maintaining the out-of-the-box experience.

DMD has been shipping with LLD for several releases, and it seems enough of the kinks have been worked out that it has been ready to become the default for a while now. Nicholas Wilson finally took the step to make that happen, Walter eventually gave it his blessing, and now PE/COFF is the default DMD output on Windows.

Practically, this means that the -m32mscoff switch has been deprecated, -m32 now specifies PE/COFF, and the new switch -m32omf can be used to produce OMF output if needed (but its OMF support will eventually be dropped). The -m64 switch has always produced PE/COFF output, so has not changed.
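
In practice, with a hypothetical app.d, the three scenarios look like this: the first command is unchanged, the second now produces PE/COFF, and the third keeps OMF available for now.

dmd -m64 app.d
dmd -m32 app.d
dmd -m32omf app.d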

LDC

The beta release of LDC 1.29.0 was announced on March 10. This version of the LLVM-based D compiler is based on D 2.099.0+. It includes support for LLVM 13, no longer defaults to the ld.gold linker on Linux (LLD is recommended), and includes a breaking change for the extern(D) ABI. See the full release log for details.

DConf ’22 in London

After an unexpected and unwanted hiatus, DConf is returning to the real world! Hosted once again by Symmetry Investments, we’ll be in London, Aug 1–4, 2022. We’re currently accepting submissions and early-bird registration is open.

Guest keynote speaker

Our guest speaker this year is Roberto Ierusalimschy, Associate Professor at the PUC-Rio Department of Informatics and head designer of the Lua programming language. We’re excited that he’s able to join us. Several D community members have used or are using Lua in their D projects, including the gas dynamics toolkit at the University of Queensland that its maintainers wrote about on this blog. (You can also count me in that group. I’ve used Lua in different capacities over the years, and I maintain a set of D bindings for Lua’s C API).

Roberto was the mentor who shepherded the Origins of the D Programming Language paper through the HOPL IV conference, so he already has a connection to the D community.

I don’t know yet if his talk will be related to Lua, but I’m looking forward to hearing what he has to say.

Registration

Early-bird registration is open until May 31. The base early-bird rate is $352.75 ($423.30 after applying 20% VAT), which is 15% off the general registration of $415 ($498 with 20% VAT). We offer a student discount, a discount for major open source contributors, and a hardship rate. You can register now or learn about the discounted rates at dconf.org.

Talks

At past editions of DConf, we’ve allotted talks in 50-minute blocks with 10-minute breaks in between. This year, we’re cutting that down: we’d like to keep the talks no longer than 40–45 minutes. Part of the magic of DConf is the time spent interacting face-to-face with other D enthusiasts, so it only makes sense to make as much room for that as we can while still allowing for educational and informative presentations.

If you have something related to the D programming language that you’d like to share with the world, please send in a submission. Don’t know what to talk about? Then heed Ali Çehreli, from one of his DConf Online 2020 Q & A sessions:

Coming up with an idea for a talk is as simple as the way you use D. Just look at your code, and it makes a presentation…

If you have used the D programming language, then you have material for a talk: describe your project; talk about specific problems you solved or interesting ways in which you’ve employed language features; expound on the ups and downs of your experience learning D so that others can benefit; and so on. Take a look at the DConf and DConf Online talks available on our YouTube channel for inspiration. Even if you’ve never presented at a conference, we encourage you to send us a submission! Several D community members have given their first presentation at DConf, and we are always happy to see more.

The worst that can happen when you submit a talk is that it isn’t accepted. But if it is accepted, then you’ll be entitled to reimbursement for your transportation to and from London, and your lodging for the five nights of the conference. You get to hang out with people who share your interest in D and most of your expenses are covered, with nothing to lose if your talk isn’t accepted.

Don’t let doubt or hesitation hold you back. You can find submission details at dconf.org.

Venue

DConf ’22 is taking place at a nifty venue between Moorgate and Liverpool Street Stations called CodeNode. All of our talks will be in their CTRL room on the first floor, and we’ll have the basement ESC room to ourselves for mingling between talks and during lunch. They have table tennis and foosball tables, and plenty of space in which to chill.

CodeNode isn’t far from our DConf 2019 venue, so the same budget hotels we stayed at then are also within walking distance this year. You can find a list of those and several other budget hotels in the area at dconf.org.

BeerConf!

For every edition of DConf before 2019, we designated one area hotel as the official gathering spot. Many attendees would take rooms there, and a number of us would gather in the evenings in the hotel lobby or bar to chat over drinks and snacks. In one of our Berlin editions, Ethan Watson coined the term “BeerConf” to refer to these evening meetups. In 2019, we couldn’t find a suitable hotel in which to gather, so we hired space in a pub near the venue. When DConf was canceled in 2020, a couple of community members hosted an online BeerConf to make up for the loss of the real-world version, and they’ve been hosting it every month since.

This year, since we’re back in the same part of London, we’re again looking for a space we can rent for BeerConf. We’ve got our eyes on a couple of spaces, and we’re working to secure funding. I hope to have an update on that before the end of April.

In the meantime, keep an eye on the D Announce forum for news of our monthly online version of BeerConf, and consider picking up a BeerConf shirt from our DLang Swag Emporium!

Looking ahead

We’re looking forward to the rest of 2022. One of our big goals for this year is to lay the groundwork for bringing more structure and organization to the D ecosystem. The PR/Issue managers have made a big difference and brought order to a chaotic contribution process, but we still have a long way to get to where we’d like to be.

Soon, I’ll start publishing tutorials on the foundation’s YouTube channel. These tutorials are going to cover more than just the language syntax and semantics. They’ll also dive into the tools we use as D programmers: compilers, linkers, loaders, object files, etc. These days, it’s not unusual for a programmer new to D to have gone years without ever touching a programming language that uses the same compile-link model. Questions about static linking errors, or confusion about compiler vs. linker errors, are not uncommon. These tutorials will be short and focused on specific topics, and will hopefully serve as a means for new D programmers to up their game with the tools they use.

Once I’ve uploaded the tutorials, I’ll apply for our channel to join the YouTube Partner Program so that we can start raising money from the channel. We’re eligible now, but I don’t want to apply until I’ve established a more frequent pattern of updates.

On that note, I’d like to remind you that the D Language Foundation is available to select as a charity for the Amazon Smile program. When you shop via smile.amazon.com, selecting the D Language Foundation as your preferred charity allows us to receive a small percentage of your payment. If you shop at Amazon, it’s an easy way to support the D Language Foundation. You can find browser extensions that will redirect you to smile.amazon.com every time you visit amazon.com, such as Amazon Smile Redirect, which is available for Chrome/Edge and for Firefox. (Amazon Smile charities are domain-specific, so the D Language Foundation is only available through Amazon’s .com domain).

You can also support us by shopping at the DLang Swag Emporium or donating directly via one of the options listed at dlang.org.

We can’t wait to see you in London!

The Binary Language of Moisture Vaporators

I know why you’re reading this. Like other Alpha programmers, you’re not content with just compiling Vaporator code and testing to see if it works. You need to know the binary code that’s generated. But getting at it is clumsy. I want to make it easy for myself, and why not share it?

One of my earliest memories is being curious about how light bulbs worked and sticking my finger in a hot light bulb socket. I was three or four years old at the time. Later, I was always taking things apart to see how they worked. It was years before I could successfully put them back together again. I remember being baffled at the grey dust inside a resistor I cracked open and when I unwrapped the paper in a capacitor. I took my first car to pieces to see how it worked.

When I first learned Fortran, it was a great mystery how the text of the language turned into machine code. Machine code was the language of the gods. This evolved into wanting to make my own compiler. But to build a compiler, you need to be able to see the output. A disassembler had to be built along with the compiler. That became obj2asm.exe. I’ve spent a great deal of time running the disassembler and poring over what the compiler generated. I look at what other compilers generate, too, using obj2asm.

But running obj2asm is a separate process, and the output is filled with all the boilerplate needed to create a proper object file. The boilerplate is rarely of interest, and I’m only interested in the generated code for a function. Why not just give the compiler a switch, call it -vasm (short for Show Me The Vaporator Assembly), and have it emit the binary and assembler code to the screen, function by function? So I ripped the disassembler logic out of obj2asm and put it into the dmd D compiler.

One would think that the way to do this would be to have the compiler generate the assembler source code, which would then be run through an assembler like MASM or gas to create the object file. I figured this would be slow and too much work. Instead, the disassembler logic actually intercepts the binary data being written to the object file and disassembles it to a string, then prints the string to the console.

For example, for the file vaporator.d:

int demo(int x)
{
     return x * x;
}

Compiling with:

dmd vaporator.d -c -vasm

prints:

_D9vaporator4demoFiZi:
0000:   89 F8                   mov     EAX,EDI
0002:   0F AF C0                imul    EAX,EAX
0005:   C3                      ret

and we see the mangled name of the function, the code offsets, the code binary, and the mnemonic representation for those learning binary.

I am not aware of any other compiler that does this in the same way. This is probably because most programmers are not particularly interested in how the sausages are made. But I find it fascinating and fun. I’ve opined before that programmers who don’t know the assembler their code is transformed into are not likely to be Alpha programmers. With the -vasm switch, it’s so easy to look at the output, why not do it? It works as a great way to learn assembler code, too!

I’ve been using it myself, and the convenience is a game changer. What are you waiting for?

P.S. I made the disassembler as a Boost Licensed standalone module that anyone can use who needs a tool to understand the binary language of moisture vaporators.

Using the GCC Static Analyzer on the D Programming Language

Largely thanks to the tireless work of Iain Buclaw, the D programming language is part of GCC. As well as having access to an extremely potent set of compiler optimizations and a large group of target platforms, D also benefits from upstream features added to GCC as a whole or even for specific languages. For some projects, this can be very important, as some of these features require large quantities of careful work, for example, mitigations for transient execution vulnerabilities.

A few years ago, thanks to David Malcolm at Red Hat, GCC gained a static analyzer. This uses a set of algorithms at compile time to find patterns in a program that would lead to memory safety bugs when the program is executed.

How do I turn it on?

Run GDC like you normally would and add the -fanalyzer flag. If you’re already bored of reading and want to have a go, please use Matt Godbolt’s excellent compiler explorer. Start with this simple example.
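
Locally, it’s just one extra flag on an ordinary GDC invocation (the file name here is hypothetical):

gdc -fanalyzer app.d -o app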

Which patterns does it look for?

Some memory bugs

From the GCC documentation, we can get a list of every warning the analyzer can emit:

-Wanalyzer-double-fclose 
-Wanalyzer-double-free 
-Wanalyzer-exposure-through-output-file 
-Wanalyzer-file-leak 
-Wanalyzer-free-of-non-heap 
-Wanalyzer-malloc-leak 
-Wanalyzer-mismatching-deallocation 
-Wanalyzer-possible-null-argument 
-Wanalyzer-possible-null-dereference 
-Wanalyzer-null-argument 
-Wanalyzer-null-dereference 
-Wanalyzer-shift-count-negative 
-Wanalyzer-shift-count-overflow 
-Wanalyzer-stale-setjmp-buffer 
-Wanalyzer-tainted-array-index 
-Wanalyzer-unsafe-call-within-signal-handler 
-Wanalyzer-use-after-free 
-Wanalyzer-use-of-pointer-in-stale-stack-frame 
-Wanalyzer-write-to-const 
-Wanalyzer-write-to-string-literal 

These names are fairly descriptive. However, let’s take a look at some examples before going into detail.

Let’s say we have some code that allocates a buffer for itself via malloc, like the following.

int usesTheHeap(size_t x)
{
    import core.stdc.stdlib : malloc, free;
    int[] slice = (cast(int*) malloc(int.sizeof * x))[0..x];
    slice[] = 0;
    // Algorithm goes here
    return 0;
}

For this code, the static analyzer gives us two warnings, the first of which is the following:

warning: leak of 'slice.ptr' [CWE-401]
   11 | }
      | ^
  'usesTheHeap': events 1-3
    |
    |    8 |     int[] slice = (cast(int*) malloc(int.sizeof * x))[0..x];
    |      |                                     ^
    |      |                                     |
    |      |                                     (1) allocated here
    |    9 |     slice[] = 0;
    |      |     ~                                
    |      |     |
    |      |     (2) assuming 'slice.ptr' is non-NULL
    |   10 |     // Algorithm goes here
    |   11 | }
    |      | ~                                    
    |      | |
    |      | (3) 'slice.ptr' leaks here; was allocated at (1)

As you might expect, since we didn’t free the memory we allocated, the analyzer warns us that the memory leaks at the end of the scope.

The second warning complains that we used the memory from malloc without checking if it was null. Program failure due to dereferencing a null-pointer is sometimes desirable in D, so you can turn this off with -Wno-analyzer-possible-null-dereference if you need to.

Thanks to assert being built into the core language and being lowered to a construct that GCC understands, we can use it to make the analyzer assume a pointer is non-null:

int usesTheHeap(size_t x)
{
    import core.stdc.stdlib : malloc, free;
    void* allocatedBuffer = malloc(int.sizeof * x);
    assert(allocatedBuffer != null);
    // The program may not proceed if the pointer is null
    int[] slice = (cast(int*) allocatedBuffer)[0..x];
    slice[] = 0; //So the analyzer knows this is safe.
    // Algorithm goes here
    return 0;
}

More than malloc and free

Let’s think about something that (obviously) uses memory, but isn’t always considered part of memory safety: although it’s not encouraged, you can use setjmp and longjmp from C in D code. As with many C features, these really can blow up in your face.

Look at the following:

import core.sys.posix.setjmp;

void main()
{
    jmp_buf local;
    void set()
    {
        setjmp(local);
    }
    set();
    longjmp(local, 0);
} 

We prime the buffer inside set, but once set returns, the saved environment refers to a stack frame that no longer exists (technically it still points to something, but that something is chaotic). Thankfully, the analyzer can warn us about this as in the following:

<source>: In function 'D main':
<source>:11:12: warning: 'longjmp' called after enclosing function of 'setjmp' has returned [-Wanalyzer-stale-setjmp-buffer]
   11 |     longjmp(local, 0);
      |            ^
  'D main': events 1-2
    |
    |    3 | void main()
    |      |      ^
    |      |      |
    |      |      (1) entry to 'D main'
    |......
    |   10 |     set();
    |      |        ~
    |      |        |
    |      |        (2) calling 'set' from 'D main'
    |
    +--> 'set': events 3-5
           |
           |    6 |     void set()
           |      |          ^
           |      |          |
           |      |          (3) entry to 'set'
           |    7 |     {
           |    8 |         setjmp(local);
           |      |               ~
           |      |               |
           |      |               (4) 'setjmp' called here
           |    9 |     }
           |      |     ~     
           |      |     |
           |      |     (5) stack frame is popped here, invalidating saved environment
           |
    <------+
    |
  'D main': events 6-7
    |
    |   10 |     set();
    |      |        ^
    |      |        |
    |      |        (6) returning to 'D main' from 'set'
    |   11 |     longjmp(local, 0);
    |      |            ~
    |      |            |
    |      |            (7) 'longjmp' called after enclosing function of 'setjmp' returned at (5)
    |

Beyond skin-deep

While important, stack corruption and (simple) memory leaks are old hat; catching them is usually relatively (touch wood) easy with modern programming practices, programming language design (i.e., sound memory safety analysis), sanitizers, and tooling like Valgrind or your favorite debugger. For less trivial problems, spotting a failure in a controlled environment is still relatively easy with the above tools, but finding out why it happened could require manually instrumenting the program. Finding issues early is important and appreciated.

The analyzer is interprocedural, i.e., it can see across function boundaries (when the information is available). In some older codebases you can sometimes see code like this:

import core.stdc.stdlib : free;

struct Handle
{
    void* x;
    void reset()
    {
        free(x);
    }
    ~this()
    {
        free(x);
    }
}
void accept(Handle x)
{
    x.reset();
    // Destructor called 
}

This yields a double-free. The analyzer is able to see “inside” the destructor and thus correctly warns about the double-free and what causes it.

The following seems to be sensitive to the optimization settings used but is very important when it works: iterator invalidation. That is to say, we hand out a pointer to somewhere, end up (say) realloc-ing, and suddenly that pristine pointer is now a pointer to absolutely nowhere.

import core.stdc.stdlib : free, realloc;

struct Vector
{
    int* handle;
    void expand(size_t sz)
    {
        int* newPtr = cast(int*) realloc(handle, sz);
        assert(newPtr);
        handle = newPtr;
    }
    ~this()
    {
        free(handle);
    }
}
void iter(Vector x)
{
    int* copy = x.handle;
    x.expand(1000);
    *copy = 3;
}

The analyzer sees this and spits out the following:

<source>: In function 'iter':
<source>:23:11: warning: use after 'free' of 'copy_5' [CWE-416] [-Wanalyzer-use-after-free]
   23 |     *copy = 3;
      |           ^
  'iter': events 1-2
    |
    |   19 | void iter(Vector x)
    |      |      ^
    |      |      |
    |      |      (1) entry to 'iter'
    |......
    |   22 |     x.expand(1000);
    |      |             ~
    |      |             |
    |      |             (2) calling 'expand' from 'iter'
    |
    +--> 'expand': events 3-7
           |
           |    8 |     void expand(size_t sz)
           |      |          ^
           |      |          |
           |      |          (3) entry to 'expand'
           |    9 |     {
           |   10 |         int* newPtr = cast(int*) realloc(handle, sz);
           |      |                                         ~
           |      |                                         |
           |      |                                         (4) freed here
           |      |                                         (5) when '__builtin_realloc' succeeds, moving buffer
           |   11 |         assert(newPtr);
           |      |         ~ 
           |      |         |
           |      |         (6) following 'false' branch...
           |   12 |         handle = newPtr;
           |      |                ~
           |      |                |
           |      |                (7) ...to here
           |
    <------+
    |
  'iter': events 8-9
    |
    |   22 |     x.expand(1000);
    |      |             ^
    |      |             |
    |      |             (8) returning to 'iter' from 'expand'
    |   23 |     *copy = 3;
    |      |           ~  
    |      |           |
    |      |           (9) use after 'free' of 'copy_5'; freed at (4)
    |

Inline assembly

The analyzer was partly intended to help eliminate bugs in the Linux kernel. As such, it is useful to be able to analyze inline assembly (which is commonplace in the kernel). An example will not be given here, but GCC has gained the ability to analyze basic X86 inline assembly.

Some idiosyncrasies

The static analyzer is implemented as just another pass inside GCC (there are hundreds). This means that some warnings may magically disappear under certain optimization settings as the compiler eliminates dead code and propagates information.

Similarly, the quality of output does vary with the flags used. We won’t discuss it here, but options exist to increase the usefulness of diagnostics by performing more sophisticated analysis, for example, by propagating constraints through analyzed branches and thus eliminating some paths which are superficially “possible” but can, in fact, be eliminated by considering the semantics of the code.

Finding bugs when combining C and D

The static analyzer was designed for use with C (and C++, but mostly the former) and operates on GCC’s IR. If we use link-time optimization, we can combine the IR from compilation units in different languages (D and C), then use the analyzer to look for bugs across language boundaries.

Let’s say we have an unfortunate C library with two functions, doWork and terminate. They both accept void*, but they expect the memory to be allocated by the user of the library rather than by a matching init function.

#include <stdlib.h>
void doWork(void* ptr)
{
    // Do something, doesn't matter what here
}
void terminate(void* ptr)
{
    // Clean up things attached to ptr
    free(ptr);
}

Assuming we have no access to the C source and assuming the library documentation fails to mention that terminate calls free, we would likely write the following code:

extern(C) void doWork(void*);
extern(C) void terminate(void*);

void main()
{
    import core.stdc.stdlib : malloc, free;
    void* buf = malloc(100);
    scope(exit) free(buf);
    buf.doWork();
    buf.terminate();
}

If we’re lucky, we’ll see an error message like

free(): double free detected in tcache 2
Aborted (core dumped)

which is better than nothing but nonetheless not ideal if we were unfamiliar with the code.

If instead, we compile with gdc d.d c.c -fanalyzer -flto (the last flag is essential), we get this warning:

In function ‘D main’:
d.d:11:14: warning: double-‘free’ of ‘buf_6’ [CWE-415] [-Wanalyzer-double-free]
   11 |  scope(exit) free(buf);
      |              ^
  ‘D main’: event 1
    |
    |/usr/lib/gcc/x86_64-linux-gnu/10/include/d/__entrypoint.di:33:5:
    |   33 | int _Dmain(char[][] args);
    |      |     ^
    |      |     |
    |      |     (1) entry to ‘D main’
    |
  ‘D main’: events 2-3
    |
    |d.d:10:8:
    |   10 |  void* buf = malloc(100);
    |      |        ^
    |      |        |
    |      |        (2) allocated here
    |......
    |   13 |  buf.terminate();
    |      |  ~
    |      |  |
    |      |  (3) calling ‘terminate’ from ‘D main’
    |
    +--> ‘terminate’: events 4-5
           |
           |c.c:6:6:
           |    6 | void terminate(void* ptr)
           |      |      ^
           |      |      |
           |      |      (4) entry to ‘terminate’
           |    7 | {
           |    8 |     free(ptr);
           |      |     ~
           |      |     |
           |      |     (5) first ‘free’ here
           |
    <------+
    |
  ‘D main’: events 6-7
    |
    |d.d:13:2:
    |   11 |  scope(exit) free(buf);
    |      |              ~
    |      |              |
    |      |              (7) second ‘free’ here; first ‘free’ was at (5)
    |   12 |  buf.doWork();
    |   13 |  buf.terminate();
    |      |  ^
    |      |  |
    |      |  (6) returning to ‘D main’ from ‘terminate’
    |

This found our bug straight away. Thank you very much, static analysis.

Conclusion

The way this analyzer is implemented can serve as a lesson on the usefulness of IRs as a tool for analysis rather than merely optimization. A similar analysis is currently performed on the AST in the D frontend, but that’s slow and fairly ugly to write (let alone read).

I don’t think using a static analyzer is a replacement for a carefully designed language-level memory safety story, but I am very glad it exists. The fact that it is usable and useful from D is a testament to the benefits of D’s presence in GCC and diversity of implementation.

DLang News September/October 2021: D 2.098.0, OpenBSD, SAOC, DConf Online Swag

Version 2.098.0 of the D programming language is now available in the form of DMD 2.098.0 (the reference D compiler) and LDC 1.28.0 (the LLVM-based D compiler); D has come to OpenBSD; cool things are happening thanks to the Symmetry Autumn of Code; and DConf Online 2021 t-shirts are available for purchase.

Read on for the deets.

DMD 2.098.0

This release comes with 17 major changes and 160 fixed Bugzilla issues from 62 contributors across the core repositories. The number of fixed issues may well be a record high. The 2.097.0 release had 144, and the 2.094.0 release had 119, but a cursory look at several other major releases shows numbers ranging from the high 40s to under 100, with counts in the 50s showing up frequently. This is the sort of trend we were hoping to see when Razvan Nitu came on board as our Pull Request and Issue Manager, and we couldn’t be more pleased.

There are two items of note that I’d like to point out from the new release, and then I have a little more to say about the work Razvan is doing.

ImportC

The ImportC compiler is a major enhancement to D that allows the D compiler to directly compile C source code. Walter has been working on it for a few months now, and this is the first release in which it’s available. ImportC enables the compiler to inline C function calls and even evaluate them at compile time via CTFE. ImportC targets C11 and does not currently handle preprocessor directives, so any C source you do intend to compile must first be run through a preprocessor. It’s not yet complete, but if you have a use case for it, any help in finding and reporting ImportC bugs is welcome. Contributions to fix said bugs doubly so!
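
For example, one possible workflow, with hypothetical file names and assuming app.d imports the resulting mylib_pp module, is to expand the preprocessor directives ahead of time and then hand the preprocessed C file to DMD along with the D sources:

gcc -E -P mylib.c -o mylib_pp.c
dmd app.d mylib_pp.c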

Fork-based garbage collector

This release also includes an optional concurrent garbage collector for Posix systems. This is cool in and of itself, but more so because the project came to fruition thanks to the Symmetry Autumn of Code. It was originally developed for D1 by Leandro Lucarella but was never included in an official release (using alternative GCs back then required more than just a simple command-line switch). In 2018, for the inaugural edition of SAOC, Francesco Mecca undertook to port the GC to D2. This resulted in a pull request to DRuntime that was ultimately merged in time for this release by Rainer Schuetze.

To use the new GC, provide the DRuntime option --DRT-gcopt=fork:1 on the command-line of any program compiled against DRuntime 2.098.0+ (this is not a compiler option, but an option to any program linked with DRuntime). It can also be configured programmatically via:

extern(C) __gshared string[] rt_options = [ "gcopt=fork:1" ];

See the D documentation for more GC configuration options.

Shrinking the pull-request queues

Razvan has been managing pull requests across several of our repositories, but he’s been laser-focused on reducing the number of PRs in the phobos and druntime repositories, with dmd his next target. This isn’t just about lowering the PR count. He’s been reviving old PRs with the original author where he can (he tells me he was surprised how many PR authors were responsive, even after no activity on a PR for a few years) and has tried to rebase and resolve those where he can’t. Here are some statistics he’s gathered on PR activity so far this year across the phobos, druntime, and dmd repositories:

  • phobos: 568 PRs created, 650 PRs closed
  • druntime: 283 PRs created, 311 PRs closed
  • dmd: 1140 PRs created, 1126 closed

At the time he sent me the stats on October 29th, the number of open PRs in phobos had gone down from 160 to 77 and druntime from 130 to 96. The number of open PRs in dmd has remained fairly constant at around 230.

We want to thank Razvan for all the work he is doing, Symmetry Investments for sponsoring his position, the volunteer members of the “strike teams” Razvan has assembled to squash as many bugs as possible, and every contributor who has donated and continues to donate their time and effort to improving our favorite programming language.

LDC 1.28.0

The latest release of LDC implements D 2.098.0 (D frontend, DRuntime, and Phobos) and is compatible with LLVM 6.0 – 12.0.

A major item in this release is that LDC now supports dynamic casts across binary boundaries. DLL support has long been a weak point in D, often requiring the programmer to resort to extern(C) functions that return handles (pointers, references) to D objects. Martin Kinkelin has worked to improve the situation in LDC, motivated primarily by the desire to provide the standard library and runtime as a DLL on Windows.

Thanks to Martin and all the LDC contributors for the work they do to keep LDC releases in sync with those of DMD. If you benefit from their efforts, please consider sponsoring Martin (and LDC by extension) on GitHub!

D on OpenBSD

The D ecosystem grows primarily because of the efforts of volunteers who step forward to fill in the blanks. New D projects pop up all the time, but it’s pretty rare to hear that someone has brought D to a new platform. Brian Callahan has done just that.

Brian has been on a mission to bring D to OpenBSD. In August of this year, he popped into the D forums with an announcement that GDC, the GCC-based D compiler maintained by Iain Buclaw, was now available in the OpenBSD ports tree as part of GCC 11. In early October, he let us know that DMD was coming to the platform. Then in late October, he had the same news about LDC. Instructions for installing DMD on OpenBSD are on the download page (and can be extrapolated to LDC and GDC).

We are grateful to Brian for the work he has done to make this happen. We’re looking forward to his upcoming DConf Online 2021 talk, Life Outside the Big 4: The Adventure of D on OpenBSD:

The journey of D from pie-in-the-sky to a package officially offered in the OpenBSD package repository serves as a model story for other platforms who want to offer D to their userbase. We will walk through the many interconnected parts required to get a D package on OpenBSD, what the future is like for D outside the Big 4, how you can get started with D on your platform, and how those of us who enjoy life outside the Big 4 can be a positive force for D and the D community.

SAOC News

The SAOC 2021 progress bar is past the 25% mark. The first milestone wrapped up on October 15, and the participants have been posting weekly progress reports in the General Forum. It’s always interesting to read about the challenges they encounter and their solutions. But the latest SAOC isn’t the only edition about which there is news to report.

I’ve written above about the SAOC 2018 forking GC project that has found its way into the latest release of DRuntime. I can’t begin to tell you how pleased I am that another SAOC project has come into its own.

For SAOC 2020, Adela Vais set out to implement a D backend for the venerable Bison parser generator. Not only did Adela successfully complete SAOC, she saw her project through to its ultimate goal. The D backend was officially released as part of Bison 3.8.1 in September.

We want to offer Adela our congratulations and a huge round of applause for a job well done! Getting a project of this scope accepted into a GNU codebase is no mean feat.

DConf Online 2021 T-Shirts

DConf Online 2021 is less than a month away. The D Language Foundation will be providing DConf Online 2021 swag to the DConf speakers and prizes to viewers asking questions in the post-talk live stream Q & A sessions. The cost of the items and their shipping are the only DConf Online expenses, and they’re covered by the D Language Foundation General Fund.

Direct donations to the General Fund and our more targeted funds are always appreciated, but you can also help support the D programming language and DConf Online by purchasing a DConf Online 2021 T-Shirt or other D swag in the DLang Swag Emporium. All proceeds go straight into the General Fund. You get some swag along with our gratitude, and we get a couple of bucks. That’s a pretty good deal!

Looking Forward

As we near the end of 2021, we are looking forward to 2022 and beyond. The D programming language, its ecosystem, and its community have come a long way from the gaggle of curious coders who first took an interest in a one-man project by the guy who had created the game Empire and the Zortech C++ compiler.

The primary means of contributing to the core D projects went from emailing patches to Walter, to posting patches on Bugzilla, to committing to a Subversion repository, to submitting pull requests on GitHub. The web site went from being a few basic HTML pages of the D spec on digitalmars.com maintained only by Walter, to a simple HTML site designed by a community member under the dlang.org domain, to the more complex collection of pages and scripts that today is maintained in Ddoc by multiple contributors. The ecosystem has gone from random libraries and tools hosted by individuals on myriad services, to centralized hosting at dsource.org, to the package repository at code.dlang.org.

These are just some examples of major changes over the years, each in response to growth: as the community grew in size, some of the processes and systems began to burst at the seams. To continue to grow, something had to change. Such improvements have nearly always been the result of community action: discussion and debate in the forums eventually would lead to a champion stepping forward to make it happen. Community action has been the driving force of D since Walter first announced the “D alpha compiler” in late 2001. That’s still true today. We have a handful of paid positions, but we are still primarily driven by volunteers.

The see-a-problem-and-fix-it philosophy that carried D to where we are today has served us well, and we hope it will continue to do so into the future. But that alone is no longer enough. We are bursting at the seams again, and have been for some time. In the monthly foundation meetings, we’ve been discussing specific issues, both low level and high, and how to solve them. But there’s one thing that’s been missing from the equation: organization.

Razvan Nitu’s position as Pull Request & Issue Manager grew out of an email discussion, prompted by Laeeth Isharc, and was a year in the making. We are grateful for every volunteer who has made, and continues to make, themselves available to review pull requests. Razvan is here not to replace them, but to complement them. They can continue as they have done. What Razvan brings to the mix is organization. He’s there to make sure fewer issues and PRs fall through the cracks, and to ensure that as many of the resolvable issues as possible actually get resolved.

In November, the D Language Foundation and a couple of contributors are meeting with a community member who has graciously volunteered his time and expertise to advise us on how to bring the disparate servers in the D community under Foundation management with multiple admins. The end goals are to eliminate the financial burden on the volunteers who maintain these services and, hopefully, reduce the response time when it comes to solving server-related issues or making changes. In other words, organization.

I’m in the middle of revising the Vision Document that we put together over the summer. I’m not just editing it, though. I’m expanding it. My vision of the vision document has evolved since we first discussed a “goal-oriented task list” in our June meeting. I said at the time that I didn’t “know what the initial version of the final list will look like”. I feel that what we came up with falls short of meeting the need it was intended to fill. Now, I’m pretty sure of what it needs to look like. At the moment, I’m swamped with preparations for DConf Online 2021, so I’ve put the document on the back burner. I plan to pick it up again in early December and present my revisions at the last foundation meeting of the year for approval. If all goes well, it should be published on dlang.org in January. This will be a living document, updated to reflect current priorities as time goes by.

Mathias Lang is working on a proposal to bring organization into even more of our processes. It’s a modified version of the governance proposal he brought to the September foundation meeting, the aim of which is to formalize a core team to oversee the day-to-day guidance and management of the D ecosystem. I hope that this will take what already happens in our monthly meetings to the next level. I see this as a means to establish a framework for creating workgroups that can oversee specific tasks and projects, bringing more opportunities for follow-up and follow-through. It should also help provide guidance and establish priorities (e.g., via revisions to the vision document) so that independent contributors can direct their efforts not just to the issues they care about, but those that are seen as a priority by the core team. (I want to emphasize that this is my personal view. Mathias has yet to complete the proposal. But my view is informed by what we discussed in the September meeting.)

With these and future steps aimed at better organizing our community, we intend to level up our ecosystem: motivate library development, improve the onboarding experience, increase retention, make it easier to contribute, and generally resolve the long-standing issues that tarnish the experience of using the best programming language we know. We ask our current volunteers to keep volunteering, and those who aren’t yet doing so to keep an eye out for the right opportunity to pitch in. Together, we can get to where we all want to go.

D News Roundup

Version 2.097.0 of DMD, the D programming language reference compiler, was released on June 5th in the middle of new GDC and LDC release announcements, while preparations for two major D community events were underway: the Symmetry Autumn of Code 2021 and DConf Online 2021. We’ll cover it all in this post, with a focus first on the events.

Symmetry Autumn of Code 2021

Symmetry Investments logo

As I write, Symmetry Investments employs in the neighborhood of 180 full-time workers and manages over US$8 billion of capital, and they’re always on the lookout for more employees, including programmers to work with D and other languages. They sponsored DConf 2019 in London and have sponsored the annual Symmetry Autumn of Code since 2018, in which a handful of programmers are paid to work for four months on projects of benefit to the D ecosystem.

This year marks the fourth annual SAoC, and we are now accepting applications. Participants will plan four milestones for projects that benefit the D ecosystem and will be expected to work at least 20 hours per week on each milestone. Each participant will be rewarded US$1000 for the successful completion of each of the first three milestones. At the end of the final milestone, the SAOC committee will review the overall progress of each of the remaining participants. One will be rewarded with a final US$1000 payment and a free pass to the next real-world DConf, with reimbursement for travel and lodging. In last year’s event, a second participant was also awarded a fourth US$1000 payment.

Participation in SAoC has led to jobs for some lucky coders and has generally been a valuable learning experience for those who have completed it. Students currently enrolled in graduate or postgraduate university programs will be given priority, but applications are open to all. The application deadline is August 18th. Project ideas can be found in the D community’s projects repository at GitHub. See the Symmetry Autumn of Code page here at the D Blog for all the details on how to apply as a participant or as a mentor.

DConf Online 2021

For the second consecutive year, we were unable to hold a real-world DConf. Last year we launched the first annual DConf Online. And when I say annual, I mean annual! We’re doing it again this year and will continue to do it going forward even after the real-world DConfs are back on.

DConf Online 2021 will take place November 20 and 21 on the D Language Foundation’s YouTube channel. Once again, we’re looking for pre-recorded talks, livestream panels, and livecoding sessions. If you’d like to propose something in one of those categories, the application deadline is September 5. Please visit the DConf Online 2021 homepage for all the details.

And if you haven’t seen them yet, the DConf Online 2020 and DConf Online 2020 Q & A playlists are available on the same channel. You can also find a full list of talks and all the links (talk videos, slides, and Q & A videos) on the DConf Online 2020 homepage.

New compiler releases

D 2.097.0 is live in the latest release of DMD and the beta release of LDC, the LLVM-based D compiler. The new version of GDC also came into the world as part of GCC 11.1 at the end of April.

DMD 2.097.0

Digital Mars D logo

This version of DMD comes with 29 major changes and 144(!) fixed Bugzilla issues courtesy of 54 contributors. Changes include a few deprecations and several improvements to the standard library. Two things stand out:

  • while(auto n = expression) has been on a few wishlists for a while. Now it’s a reality. The same syntax has long been possible with if statements, where it’s considered idiomatic in certain circumstances (such as checking whether a key exists in an associative array). Expect the while-condition declaration to start popping up in open-source D projects soon. Short sketches of this and of std.sumtype follow this list.
  • std.sumtype is another wishlist item that is a wish no more. The new SumType is a replacement for std.variant.Algebraic. It’s a discriminated union that makes good use of Design by Introspection with a nice match syntax for those looking for that sort of thing. It’s been quite a while since the last time a new module was added to the D standard library. Many thanks to Paul Backus for putting in the effort to see it through, and a very big Congratulations!
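
Here are minimal, illustrative sketches of both features (my own examples, not taken from the changelog). First, the new while-condition declaration:

import std.stdio;

int countdown = 3;

int next()
{
    // Returns 3, 2, 1, and finally 0.
    return countdown--;
}

void main()
{
    // The declared variable is scoped to the loop, which runs while the value is truthy,
    // mirroring the long-standing if (auto x = ...) form.
    while (auto n = next())
        writeln(n); // prints 3, then 2, then 1
}

And a small taste of std.sumtype’s match syntax:

import std.stdio;
import std.sumtype;

struct Circle { double radius; }
struct Square { double side; }

// A discriminated union that holds exactly one of the listed types at a time.
alias Shape = SumType!(Circle, Square);

double area(Shape s)
{
    // match dispatches on the type currently stored in the SumType;
    // the handlers must cover every member type.
    return s.match!(
        (Circle c) => 3.141592653589793 * c.radius * c.radius,
        (Square q) => q.side * q.side
    );
}

void main()
{
    writeln(area(Shape(Circle(1.0)))); // ~3.14159
    writeln(area(Shape(Square(2.0)))); // 4
}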

LDC 1.27.0-beta1

LDC logo

On the same day the new DMD was released, the first beta of LDC 1.27.0, which also supports D 2.097.0, was announced in the D forums.

On top of 2.097.0 support, this version of LDC provides greatly improved DLL support on Windows. The prebuilt Windows packages ship with DRuntime and Phobos DLLs. This is big news for D developers on Windows. We’ve long had issues with D DLLs that have prevented heavy use outside of simple interfaces (with APIs exported as extern(C) being the most reliable).

There are some limitations to be aware of, such as the inability to directly access TLS variables across DLL boundaries (though it’s fine with accessor functions). Please see the release page for the details.
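
For illustration, here’s a minimal sketch of the accessor-function workaround (the names are made up): the DLL keeps the thread-local variable private and exports a function that hands out a reference to it.

module mylib;

// Module-level variables are thread-local by default in D.
int tlsCounter;

// Direct access to tlsCounter from another binary is the problematic case;
// routing access through an exported function works across the DLL boundary.
export ref int counter() { return tlsCounter; }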

Thanks to Martin Kinkelin and all the LDC maintainers and contributors for their continued work on LDC. They aren’t getting paid for this. If you are a happy LDC user or just like the idea of the project, you can support their work by sponsoring Martin Kinkelin on GitHub.

GDC 11.1

In the GCC world, Iain Buclaw continues to make strides on the GDC compiler.

GDC 11.1 still uses the old C++ version of the D frontend, which feature-wise is mostly (see below) at D 2.076.1. There were significant issues in upstream DMD that prevented Iain from making the switch to the D version of the frontend in time to make the release window. He is currently aiming to make the switch in time for GDC 12. As a consolation, this release has support for three BSDs, Mac OS X, and MinGW!

Despite the older frontend, Iain has backported several fixes and optimizations, and even a few features, so it isn’t your grandfather’s D 2.076.1 that GDC supports. For example, the new bottom type that recently made its way through the D Improvement Proposal review process has found its way into this GDC release. See the forum announcement for details of all the new D goodness in GDC 11.1, and please consider sponsoring his work on GitHub.

One-off donations

If you aren’t up for sponsoring Martin or Iain but would still like to support them financially, you can make one-time donations through the D Language Foundation. You can send money to the D General Fund, the D Open Collective, or to our PayPal account. Whichever method you choose, please be sure to leave a note that the donation is intended for LDC, GDC, or any D project you would like to support. We’ll make sure the appropriate person receives the money.

Other options for supporting the D programming language: visit the D Language Foundation donation page and donate to one of our funds, head to the DLang Swag Emporium and purchase any items that catch your eye (the D Rocket stuff rocks, and DConf Online 2021 swag will be available shortly), or consider using smile.amazon.com and selecting the D Language Foundation as your charity the next time you shop at Amazon.com (we are only available through the .com domain; browser extensions like SmartAmazonSmile for Firefox and AmazonSmileRedirect for Chrome make it easy to do).

Thanks to everyone who has, will, or continues to support the D programming language, either through donations of time or money. We’ve gotten where we are through community effort, and community effort will keep pushing us forward. D rocks!

A New Year, A New Release of D

Here in DLang Land we’re beginning the new year with a new release of the D reference compiler (DMD) and a beta release of the popular LLVM-based D compiler (LDC). D 2.095.0 is crammed full of 27 major changes and 78 fixes from 61 contributors. Following are some highlights that I expect D programmers will find interesting, but please see the changelog for the full rundown. Those more interested in Bugzilla issue numbers can jump straight to the bugfix list.

D 2.095.0

Digital Mars D logo

D’s support for other programming languages is important for interacting with existing codebases. C ABI compatibility has been strong from the beginning. Support for Objective-C and C++ came later. Though C++ compatibility is a bear to get right, it keeps improving with every compiler release. This release continues that trend and also enhances Objective-C support. We also see a number of QOL (quality-of-life) improvements throughout the compiler, libraries, and tools. DUB, the D build tool and package manager that ships with the compiler (and is also available separately), especially gets a good bit of love in this release.

C++ header generation

For a little while now, DMD has included experimental support for the generation of C++ header files from D source code, via the -HC command-line option, in order to facilitate calling D libraries from C++. For example, given the following D source file:

cpp-ex.d

extern(C++):
struct A {
    int x;
}

void printA(ref A a) {
    import std.stdio : writeln;
    writeln(a);
}

And the following command line:

dmd -HC cpp-ex.d

The compiler outputs the following to stdout (-HCf to specify a file name, and -HCd a directory):

// Automatically generated by Digital Mars D Compiler

#pragma once

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <math.h>

#ifdef CUSTOM_D_ARRAY_TYPE
#define _d_dynamicArray CUSTOM_D_ARRAY_TYPE
#else
/// Represents a D [] array
template<typename T>
struct _d_dynamicArray
{
    size_t length;
    T *ptr;

    _d_dynamicArray() : length(0), ptr(NULL) { }

    _d_dynamicArray(size_t length_in, T *ptr_in)
        : length(length_in), ptr(ptr_in) { }

    T& operator[](const size_t idx) {
        assert(idx < length);
        return ptr[idx];
    }

    const T& operator[](const size_t idx) const {
        assert(idx < length);
        return ptr[idx];
    }
};
#endif

struct A;

struct A
{
    int32_t x;
    A() :
        x()
    {
    }
};

extern void printA(A& a);

This release brings a number of fixes and improvements to this feature, as can be seen in the changelog. Note that generation of C headers is also supported via -H, -Hf, and -Hd.

Default C++ standard change

Prior to this release, extern(C++) code was guaranteed to link with C++98 binaries out of the box. This is no longer true, and you will need to pass -extern-std=c++98 on the command line to maintain that behavior. The C++11 standard is now the default.

Additionally, the compiler will now accept -extern-std=c++20. In practice, the only effect this has at the moment is to change the compile-time value, __traits(getTargetInfo, "cppStd"), but new types may be added in the future.
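
As a quick illustration (a sketch of my own, not from the changelog), compiling the following one-liner with and without -extern-std=c++20 shows the value change:

// Prints the targeted C++ standard at compile time; the number follows the
// C++ __cplusplus convention (e.g., 201103 for the default C++11).
pragma(msg, __traits(getTargetInfo, "cppStd"));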

Improved Objective-C support

Objective-C compatibility is enhanced in this release with support for Objective-C protocols. This is achieved by repurposing interface in an extern(Objective-C) context. Additionally, the attributes @optional and @selector help get the job done. Read the details and see an example in the changelog.
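
The changelog has the authoritative example, but as a rough sketch (illustrative names, and only meaningful when targeting platforms with an Objective-C runtime), a protocol declaration looks something like this:

import core.attribute : optional, selector;

extern (Objective-C)
interface Greeter
{
    // A required protocol method.
    void greet() @selector("greet");

    // An optional protocol method, marked with the new @optional attribute.
    @optional void farewell() @selector("farewell");
}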

Improved compile-time feedback

Here’s a QOL issue that really became an annoyance after a deprecation in Phobos, the standard library: when instantiating templates, deprecation messages reported the source location deep inside the library where the deprecated feature was used (e.g., template constraints) and not the user-code instantiation that triggered it. No longer. You’ll now get a template instantiation trace just as you do on errors.

Another QOL feedback issue involved the absence of errors. The compiler would silently allow multiple definitions of identical functions in the same module. The compiler will now raise an error when it encounters this situation. However, multiple declarations are allowed as long as there is at most one definition. For mangling schemes where overloading is not supported (extern(C), extern(Windows), and extern(System)), the compiler will emit a deprecation message.
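
For example (a trivial sketch), the following now triggers an error where it previously compiled without complaint:

void process(int x);      // a declaration with no body: still fine
void process(int x) { }   // one definition: fine
void process(int x) { }   // a second identical definition: now an error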

The mainSourceFile in DUB recipes

The mainSourceFile entry in DUB package recipes was a way to specify a source file containing a main function that should be excluded from unit tests when invoking dub test. However, when setting up other configurations where the file should also not be compiled, or where a different main source file was required, it was necessary to add the file to an excludedSourceFiles entry. This is no longer the case. If a mainSourceFile is specified in any configuration, it will automatically be excluded from other configurations.
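
For instance, in a hypothetical dub.json recipe like the one below, source/main.d is now excluded from the library configuration automatically, with no excludedSourceFiles entry required:

{
    "name": "myapp",
    "configurations": [
        {
            "name": "application",
            "targetType": "executable",
            "mainSourceFile": "source/main.d"
        },
        {
            "name": "library",
            "targetType": "library"
        }
    ]
}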

Propagating compiler flags to dependencies

Not every existing compiler flag has a corresponding build setting for DUB recipes. The dflags entry allows for such flags to be configured for any project. For example, -fPIC or -preview=in. The catch is, it does not propagate to dependencies. Now, you can explicitly specify compiler flags for dependencies by adding a dflags parameter to any dependency entry in a dub.json recipe. For example:

{
    "name": "example",
    "dependencies": {
        "vibe-d": { "version" : "~>0.9.2", "dflags" : ["-preview=in"] }
    }
}

Unfortunately, it appears the implementation does not work for recipes in SDLang format (dub.sdl), so those of us who prefer that format over JSON will have to wait a bit.

LDC 1.25.0-beta1

LDC logo

This release of LDC brings the compiler up to date with the D 2.095.0 frontend, with the prebuilt packages based on LLVM v11.0.1. The biggest news in this release looks to be the new -linkonce-templates flag. This experimental feature causes the compiler to emit template symbols into each compilation unit that references them, “with optimizer-discardable linkonce-odr linkage”. The implementation has big wins both in terms of compile times when compiling with optimizations turned on and in cutting down on a class of template-related bugs. See the beta1 release notes for the details.

Happy New Year

On behalf of the D Language Foundation, I wish you all the very best for 2021. As a community, we weren’t affected much by the global pandemic. Sure, we were forced to cancel DConf 2020, but the silver lining is that it also motivated us to finally launch DConf Online in November. We fully intend to make this an annual event alongside, not in place of, the real-world conference (when physically possible). Other than that, it was business as usual in D Land.

At a personal level, the lives of some in our community were disrupted last year in ways large and small. Please remember that, though the primary object that brings us together is our enthusiasm for the D programming language, we are all still human beings behind our keyboards. The majority of work that gets done in our community is carried out on a volunteer basis. All of us, as the beneficiaries, must never forget that the health and well-being of everyone in our community take top priority over any work we may want or expect to see completed. We encourage everyone to keep an ear open for those who may need to borrow it, and never be afraid to communicate that need when it feels necessary. Sometimes, an open ear can make a very big difference.

Thanks to all of you for your participation in the D community, whether as a user, a contributor, or both. Stay safe, and have a very happy 2021.

DustMite: The General-Purpose Data Reduction Tool

If you’ve been around for a while, or are a particularly adventurous developer who enjoys mixing language features in interesting ways, you may have run into one compiler bug or two.

Implementation bugs are inevitably a part of using cutting-edge programming languages. Should you run into one, the steps to proceed are generally as follows:

  1. Reduce the failing program to a minimal, self-contained example.
  2. Add a description of what happens and what you expect to happen.
  3. Post it on the bug tracker.

Nine years ago, an observation was made that when filing and fixing compiler bugs, a disproportionate amount of time was spent on the first step. When your program stops compiling “out of the blue”, or when the bug stops reproducing after the code is taken out of its context, manually paring down a large codebase by repeatedly cutting out code and checking if the bug still occurs becomes a tedious and repetitive task.

Fortunately, tedious and repetitive tasks are what computers are good for; they just have to be tricked into doing them, usually by writing a program. Enter DustMite.


The first version.

The basic operation is simple. The tool takes as inputs:

  • a data set to reduce (such as, a directory containing D source code which exhibits a particular compiler bug)
  • an oracle (or, more mundanely, a test script), which itself:
      • takes as input a variation of the data set, and
      • produces a yes-or-no answer on whether the input still satisfies the sought property (such as reproducing the particular compiler bug).

DustMite’s output is some local minimum variation of the data set, which it reaches by consecutively trying to remove parts of the data set and saving the results which the oracle approves. In the compiler bug example, this means removing bits of code which are not required to reproduce the problem at hand.
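
For the compiler-bug case, an invocation might look roughly like this (a sketch with made-up paths and error text, assuming the quoted test command is run inside each candidate copy of the directory and must exit with status 0 while the bug still reproduces; the DustMite documentation spells out the exact conventions):

dustmite ./bug 'dmd -c app.d 2>&1 | grep -qF "Internal error"'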

DustMite wouldn’t be very efficient if it attempted to remove things line-by-line or character-by-character. In order to maximize the chance of finding good reductions, the input is parsed into a tree according to the syntax of the input files.

Each tree node consists of a “head” (string), children (list of node pointers), and “tail” (string). Technically, it is redundant to have both “head” and “tail”, but they make representing some constructs and performing reductions much simpler, such as paren/bracket pairs.


Nodes are arranged into a binary tree as an optimization.

Additionally, nodes may have a list of dependencies. The dependency relationship essentially means “if this node is removed, these nodes should be removed too”. These constraints are not representable using just the tree structure described above, and are used to allow reducing things such as lists where trailing item delimiters are not allowed, or removing a function parameter and corresponding arguments from the entire code base at once.
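
In simplified D (a sketch of the shape described above, not DustMite’s actual types), a node looks something like this:

struct Node
{
    string head;        // leading text, e.g., an opening paren or a keyword
    Node*[] children;   // nested nodes making up the body
    string tail;        // trailing text, e.g., the matching closing paren

    Node*[] dependents; // nodes that must also be removed if this one is removed
}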

In the case of D source code, declarations, statements, and subexpressions get their own tree nodes, so that they can be removed in one go if unneeded. The parser DustMite uses for D source code is intentionally very simple because it needs to handle potentially invalid D code, and you don’t want your bug reduction tool to also crash on top of the compiler.


How DustMite sees a simple D program.

An algorithm decides the order in which nodes are queued for potential deletion; DustMite implements several (it calls them “strategies”). Fundamentally, a strategy’s interface is (stateᵢ, resultᵢ) ⇒ (stateᵢ₊₁, reductionᵢ₊₁), i.e., which reduction is chosen next depends on the previous reduction and its result. The default “inbreadth” strategy visits nodes in ascending depth order (row by row) and starts over from the top as long as it finds new reductions.

DustMite today supports quite a few more options:


The current version.

Probably the most interesting of these is the -j switch—one reason being that DustMite’s task is inherently not parallelizable. Which reduction is chosen next, and the tree version to which that reduction is applied, depends on the previous reduction’s result.

DustMite works around this by putting unused CPU cores to work on lookahead: using a highly sophisticated predictor, it guesses what the result of the current reduction will be, and based on that assumption, calculates the next reduction. If the guess was right, great! We get to use that result. Otherwise, the work is wasted. Implementing this meant that strategies now needed to have copyable state, and so had to be refactored from an imperative style to a state machine.

Unfortunately, although this much-anticipated feature was implemented four years ago, the initial implementation was rather underwhelming. DustMite still did too much work in the main thread and wasted too much CPU time on rescanning the data set tree on every reduction. The problem was so bad that, at high core counts, lookahead mode was even slower than single-threaded mode.

I have recently set out to resolve these inadequacies. The following obstacles stood in the way:

Problem 1: Hashing was too slow. Because the oracle’s services (i.e., running the test script) are usually expensive, DustMite keeps track of a cache of previously attempted reductions and their outcome. This helps because not all tree transformations result in a change of output, and some strategies will retry reductions in successive iterations. A hash of the tree is used as the cache key; however, calculating it requires walking the entire tree every time, which is slow for large inputs.

Would it be possible to make the hash calculation incremental? One approach would be Merkle trees (each node’s hash is the hash of its children’s hashes); however, that is suboptimal in the case of, e.g., empty leaf nodes. CS erudite Ivan Kazmenko blesses us with an answer: polynomial hashes! By representing strings as polynomials, it is possible to use modulo arithmetic to calculate an incremental fixed-size hash and cache subtree hashes per node.



Each node holds its cumulative hash and length.

The number theory staggered me at first, so I recruited the assistance of feep from #d. After we went through a few draft implementations, I could begin working on the final version. The first improvement was replacing the naive exponentiation algorithm with exponentiation by squaring (D CTFE allowed precomputing a table at compile-time and a faster calculation than the classical method). Next, there was the matter of the modulo.

Initially, we used integer overflow for modulo arithmetic (i.e., q = 2⁶⁴); however, Ivan cautioned against using powers of two as the modulo, as this makes the algorithm susceptible to Thue-Morse strings. Not long ago I was experimenting with using long multiplication/division CPU instructions (where multiplying one machine word by another yields the result in two machine words with a high and low part, and vice versa for division). D allows generating assembler code specific to the types that the function template is instantiated with, though in DustMite we only use the unsigned 64-bit variant (on x86 we fall back to using integer overflow).
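
Here’s a toy version of the idea (a sketch only; for simplicity it leans on 64-bit overflow for the modulo, the very shortcut the paragraph above cautions about):

struct SubHash
{
    ulong hash;    // polynomial hash of this subtree's text
    size_t length; // total text length of the subtree
}

enum ulong B = 1_000_003; // an arbitrary base for the sketch

// Exponentiation by squaring (mod 2^64 via natural overflow).
ulong power(ulong base, size_t exp)
{
    ulong result = 1;
    for (; exp; exp >>= 1)
    {
        if (exp & 1) result *= base;
        base *= base;
    }
    return result;
}

SubHash hashString(string s)
{
    SubHash h;
    foreach (char c; s)
    {
        h.hash = h.hash * B + c;
        h.length++;
    }
    return h;
}

// The key property: a node's hash can be recomputed from its children's
// cached hashes and lengths alone, without rescanning their text.
SubHash combine(SubHash left, SubHash right)
{
    return SubHash(left.hash * power(B, right.length) + right.hash,
                   left.length + right.length);
}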

With the hashing algorithm implemented, all that remained was to mark dirty nodes (they or their children had their content edited) and incrementally recalculate their hashes as needed. Dependencies posed a small obstacle: at the time, they were implemented as simply an array of pointers to the dependency node within the tree. As such, we didn’t know how to get to their parents (to mark them dirty as well), however this was easily overcome by adding a “parent” pointer to each node.

Well, or so I thought, until I got to work on the next problem.

Problem 2: Copying the tree. At the time, the current version of the tree representing the data set was part of the global state. Because of this, applying a reduction was implemented twice: once in the routine that saved the data set to disk, applying the reduction on the fly as the files for the test were written out, and once as an operation that permanently edited the in-memory tree when a reduction was accepted.

This was clumsy, but faster and less complicated than making a copy of the entire tree just to change one part of it to test a reduction. However, doing so was a requirement for proper lookahead, otherwise we would be unable to test reductions based on results where past tests predicted a positive outcome, or do nearly anything in a separate thread.

One issue was the tree “internal pointers”—making a copy would require updating all pointers within the tree to point to the new copies in the new tree. This was easy for children/parent pointers (since we can reliably visit every such pointer exactly once), but not quite for dependencies: because they were also implemented as simple pointers to nodes, we would have to keep track of a map of which node was copied where in order to update the dependency pointers.

One way to solve this would be to change the representation of node references from pointers to indices into a node array; this way, copying the tree would be as simple as a .dup. However, large inputs meant many nodes, and I wanted to see if it was possible to avoid iterating over every node in the tree (i.e. O(n)) for every reduction.

Was it possible? It would mean that we would copy only the modified nodes and their parents, leaving the rest of the tree in-place, and only reusing it as the copies’ children. This goal conflicted with the existence of “parent” pointers, because a parent would have to point towards either the old or new root, so to resolve this ambiguity every node would have to be copied. As a result, the way we handled dependencies needed to be rethought.


Editing trees with “copy on write” involves copying just the edited nodes (🔴), and their parents.

With internal pointers out, the next best thing to array indices for referencing a node was a series of instructions for how to reach the node from the tree root: an address. The representation of these addresses that I chose was a bit string represented as a linked list, where each list node holds the child index at that depth, starting from the deep end. Such a representation can be arranged in a tree where the root-side ends are shared, mimicking the structure of the tree containing the nodes for the reduced data, and thus allowing us to reuse memory and minimize allocations.


Nodes cannot hold their own address (as that would make them unmovable),
which is why they need to be stored outside of the main tree.
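
A rough sketch of the address scheme (illustrative, not the actual implementation):

// Minimal node shape for this sketch; see the earlier Node sketch for the full fields.
struct Node { Node*[] children; }

// One step of an address; the chain leads from the deep end back towards the root,
// so the root-side tails of many addresses can be shared.
struct Address
{
    size_t childIndex; // which child to descend into at this depth
    Address* parent;   // the rest of the path, towards the root (null = the root itself)
}

// Walk from the root down to the node the address names.
Node* resolve(Node* root, Address* addr)
{
    if (addr is null)
        return root;
    Node* parentNode = resolve(root, addr.parent);
    return parentNode.children[addr.childIndex];
}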

For addresses to work, the object they point at needs to remain the same, which means that we can no longer simply remove children from tree nodes—an address going through the second child would become invalid if the first child was removed. Rewriting all affected addresses for every tree edit is, of course, impractical, which leads us to the introduction of tombstones—dead nodes that only serve to preserve the indices of the children that follow them. Because one of the possible reduction types involves moving subtrees around the tree, we now also have “redirects” (which are just tombstones with a “see here” address attached).

With the above changes in place, we can finally move forward with fixing and optimizing lookahead, as well as implementing incremental rehashing in a way that’s compatible with the above! The mutable global “current” tree variable is gone, save now simply takes a tree root as an argument, and applyReduction is now:

/// Apply a reduction to this tree, and return the resulting tree.
/// The original tree remains unchanged.
/// Copies only modified parts of the tree, and whatever references them.
Entity applyReduction(Entity origRoot, ref Reduction r)

With the biggest hurdle behind us, and a few more rounds of applying Walter Bright’s secret weapon, the performance metrics started to look more like what they should:


Going deeper would likely involve using OS-specific I/O APIs or rewriting D’s GC.

A mere 3.5x speed-up from a 32-fold increase in computational power may seem underwhelming. Here are some reasons for this:

  • With a 50/50 predictor, predictions form a complete binary tree, so doubling the number of parallel jobs gives you +1x more speed. That’s roughly log₂(jobs)-1, or 4 for 32 jobs – not far off!

  • The results will depend on the reduction being performed, so YMMV. For a certain artificial test case, one optimization (not pictured above) yielded a 500x speed-up!

  • DustMite does not try to keep all CPU cores busy all the time. If a prediction turns out false, all lookahead jobs based on it become wasted work, so DustMite only starts new lookahead tasks when a reduction’s outcome is resolved. Perhaps ideally DustMite would keep starting new jobs but kill them as soon as it discovers they’re based on a misprediction. As there is no cross-platform process group management in Phobos, the D standard library, this is something I left for future versions.

  • Some work is still done in the main thread, because moving it to a worker thread actually makes things slower due to the global GC lock.

There still remains one last place where DustMite iterates over every tree node per reduction: saving the tree to disk (so that it could be read by the test script). This seems unavoidable at first, but could actually be avoided by caching each node’s full text contents within the node itself.

I opted to leave this one out. With the other related improvements, such as using lockingBinaryWriter and aggregating writes of contiguous strings as one I/O operation, the increase in memory usage was much more dramatic than the decrease in execution time, even when optimized to just one allocation per reduction (polynomial hashing gives us every node’s total length for free). But, for a brief instant, DustMite processed reductions in sub-O(n) time.

One more addition is worth mentioning: Andrej Mitrovic suggested a switch which would replace removed text with whitespace, allowing the test script to check exact line numbers. At the time, its addition posed significant challenges, as there needed to be some way to keep removed nodes in the tree but exclude them from future removal attempts. With the new tree representation, this became much easier, and it also made it possible to create an animation of the reduction in progress.

In conclusion, I’d like to bring up that DustMite is good at more than just reducing compiler test cases. The wiki lists some ideas:

  • Finding the source of ambiguous or misleading compiler error messages (e.g., errors with the file/line information pointing only inside the standard library).

  • Alternative (much slower, but also much more thorough) method of verifying unit test code coverage. Just because a line of code is executed, that doesn’t mean it’s necessary; DustMite can be made to remove all code that does not affect the execution of your unit tests.

  • Similarly, if you have complete test coverage, it can be used for reducing the source tree to a minimal tree which includes support for only enabled unittests. This can be used to create a version of a program or library with a test-defined subset of features.

  • The --obfuscate mode can obfuscate your code’s identifiers. It can be used for preparing submission of proprietary code to bug trackers.

  • The --fuzz mode (a new feature) can help find bugs in compilers and tools by creating random programs (using fragments of other programs as input).

But DustMite is not limited to D programs (or any kind of programs) as input. With the --split option, we can tell DustMite how to parse and reduce other kinds of files (a sketch of such an invocation follows the list below). DustMite successfully handled the following scenarios:

  • reducing C++ programs (the D parser supports some C++-only syntax too);

  • reducing Python programs (using the indent split mode);

  • reducing a large commit to a minimal diff (using the diff split mode);

  • reducing a commit list, when git bisect is insufficient because the problem was introduced across several commits rather than any single one;

  • reducing a large data set to a minimal one, resulting in the same code coverage, with the purpose of creating a test suite;

  • and many more which I do not remember.
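
As a rough sketch of such an invocation (the paths and error text are made up, and I believe --split takes a file mask and a splitter name, but dustmite --help is the authority on the exact form):

dustmite --split '*.py:indent' ./repro 'python3 main.py 2>&1 | grep -qF "IndexError"'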

Today, some version of DustMite is readily available in major distributions (usually as part of some D-related package), so I’m happy to have a favorite tool just one apt-get / pacman -S away when I’m not at my PC.

Discovering a problem which can be elegantly reduced away by DustMite is always exciting for me, and I’m hoping you will find it useful too.

D 2.091.0 Released

Digital Mars D logo

The latest release of DMD, the D reference compiler, ships with 18 major changes and 66 bugfixes from 55 contributors. This release contains, among other goodies, improvements to the Windows experience and enhancements to C and C++ interoperability. As fate would have it, the initial release announcement came in the aftermath of some unfortunate news regarding DConf 2020.

DMD on Windows

Over the years, some D users have remarked that the development of D is Linux-centric, that Windows is the black sheep or red-headed stepchild of D platforms. For anyone familiar with D’s early history, that seems an odd thing to say, given that DMD started out as a Windows-only compiler that could only output 32-bit objects in the OMF format. But it’s also understandable, as anyone not familiar with that history could only see that DMD on Windows lagged behind the Linux releases.

64-bit

One place where the official DMD releases on Windows have continued to differ from the releases on other platforms is the lack of 64-bit binaries in the release packages. Again, there’s a historical reason for this. The default output of the compiler is determined by how it is compiled, e.g., 32-bit versions output 32-bit binaries by default. When Walter first added support to DMD for 64-bit output on Windows, it required giving the back end the ability to generate object files in Microsoft’s version of the COFF format and also requiring users to install the Microsoft Build Tools and Platform SDK for access to the MS linker and system link libraries. This is quite a different experience from other platforms, where you can generally expect a common set of build tools to have been installed via the system package manager on any system set up for C and C++ development.

For a Windows developer who chooses GCC for their C and C++ development (or who does no C or C++ development at all), it’s a big ask to require them to download and install several GBs they might not already have installed and probably will never use for anything else. So D releases on Windows continued to ship with 32-bit binaries and the OPTLINK linker in order to provide a minimum out-of-the-box experience. That was a perfectly fine solution, unless you happened to be someone who really wanted 64-bit output (posts from disgruntled Windows users who didn’t want to install the MS tools can be found sprinkled throughout the forum archives).

Eventually, the LLVM linker (LLD) was added to the DMD Windows release packages, along with system link libraries generated from the MinGW definitions. This allowed users to compile 64-bit output out of the box and, once the kinks were worked out, eliminated the dependency on the MS linker. Yet, the official release packages still did not include a 64-bit version of DMD and still did not support 64-bit output by default.

With DMD 2.091.0, the black sheep has come back into the fold. The official DMD releases on Windows now ship with 64-bit binaries, so those of you masochists out there who cling to Makefiles and custom build scripts can expect the default output to be what you expect it to be (for the record, DUB, the build tool and package manager that ships with DMD, has been instructing the compiler to produce 64-bit output by default on 64-bit systems for the past few releases).

Windows gets even more love

There are lots of goodies for Windows in this release. Another biggie is that DMD is now 30-40% faster on Windows. It’s no secret that LDC, the LLVM-based D compiler, generates faster binaries than DMD (for some D users, the general rule of thumb is to develop with DMD for its fast compile times and release with LDC for its faster binaries, though others argue that LDC is plenty fast for development and DMD is fine for production). There have been requests for some time to stop compiling DMD with DMD and start doing it with LDC instead. This release is the first to put that into practice.

There are a number of smaller enhancements to the Windows experience: the install.sh script available on the DMD downloads page that some people prefer now supports POSIX environments on Windows; the system link libraries that ship with the compiler have been upgraded from MinGW 5.0.2 to 7.0.0; LLD has been upgraded to 9.0.0; and there’s plenty more in the changelog.

C++ Header Generation

With just about every major release of DMD, D’s interoperability with C and C++ sees some kind of improvement. This release brings a huge one.

Over the years, some have speculated that it would be excellent if the D compiler could generate headers for C and C++ for D libraries intended to be usable in C or C++ programs. Now that wishful thinking has become a(n experimental) reality. Given a set of extern(C) or extern(C++) functions, DMD can generate header files that contain the appropriate C or C++ declarations. Three compiler switches get the job done:

  • -HC will cause the header to be generated and printed to standard output
  • -HCf=fileName will cause the header to be generated and printed to the specified file
  • -HCd=directoryname will (once it’s implemented) cause the header to be printed to a file in the specified directory

See the changelog for example output.

Other News

While the coronavirus was initially ramping up out of sight from most of the world, plans for DConf 2020 were ramping up online from different locations around the world. Planning began in November, the venue was secured in late December, and the website launched with the announcement in early January.

As news of the virus outbreak spread, the conference organizers grew concerned. Would we be okay in June? In late February, that concern manifested as a discussion of possible contingency plans. Two weeks later, it resulted in the decision to cancel DConf 2020. Thankfully, the D community has been supportive of the decision.

As part of the discussion of contingency plans, the possibility was raised of hosting an online conference. The idea of course came up in the discussion of the cancellation in the forums, and a few people reached out shortly after the initial announcement offering to provide help in setting something up. Walter created a forum thread to discuss the topic for anyone interested.

No one involved with organizing DConf has any experience with hosting an online conference. We’re currently exploring options and looking at what the organizers of other Conferences in the Time of COVID-19 are doing. We want to do it, and we want to do it well. Experience with organizing DConf in the real world has taught us not to jump on any old technology without first having a fallback (ahem, DConf 2018 livestream) and making sure the tech does what we expect it to (ahem, DConf 2019 livestream). So don’t expect a quick announcement. We want to find the right tech that fits our requirements and explore how it works before we move forward with setting dates. But do expect that DConf 2020 Online is looking more and more likely to become a thing.