DLang News September/October 2021: D 2.098.0, OpenBSD, SAOC, DConf Online Swag

Digital Mars D logo

Version 2.098.0 of the D programming language is now available in the form of DMD 2.098.0 (the reference D compiler) and LDC 1.28.0 (the LLVM-based D compiler), D has come to OpenBSD, cool things are happening thanks to the Symmetry Autumn of Code, and DConf Online 2021 t-shirts are available for purchase.

Read on for the deets.

DMD 2.098.0

This release comes with 17 major changes and 160 fixed Bugzilla issues from 62 contributors across the core repositories. The number of fixed issues may well be a record high. The 2.097.0 release had 144, and the 2.094.0 release had 119, but a cursory look at several other major releases shows numbers ranging from the high 40s to under 100, with counts in the 50s showing up frequently. This is the sort of trend we were hoping to see when Razvan Nitu came on board as our Pull Request and Issue Manager, and we couldn’t be more pleased.

There are two items of note that I’d like to point out from the new release, and then I have a little more to say about the work Razvan is doing.

ImportC

The ImportC compiler is a major enhancement to D that allows the D compiler to directly compile C source code. Walter has been working on it for a few months now, and this is the first release in which it’s available. ImportC enables the compiler to inline C function calls and even evaluate them at compile time via CTFE. ImportC targets C11 and does not currently handle preprocessor directives, so any C source you do intend to compile must first be run through a preprocessor. It’s not yet complete, but if you have a use case for it, any help in finding and reporting ImportC bugs is welcome. Contributions to fix said bugs doubly so!

Fork-based garbage collector

This release also includes an optional concurrent garbage collector for Posix systems. This is cool in and of itself, but more so because the project came to fruition thanks to the Symmetry Autumn of Code. It was originally developed for D1 by Leandro Lucarella but was never included in an official release (using alternative GCs back then required more than just a simple command-line switch). In 2018, for the inaugural edition of SAOC, Francesco Mecca undertook to port the GC to D2. This resulted in a pull request to DRuntime that was ultimately merged in time for this release by Rainer Schuetze.

To use the new GC, provide the DRuntime option --DRT-gcopt=fork:1 on the command-line of any program compiled against DRuntime 2.098.0+ (this is not a compiler option, but an option to any program linked with DRuntime). It can also be configured programmatically via:

extern(C) __gshared string[] rt_options = [ "gcopt=fork:1" ];

See the D documentation for more GC configuration options.

Shrinking the pull-request queues

Razvan has been managing pull requests across several of our repositories, but he’s been laser-focused on reducing the number of PRs in the phobos and druntime repositories, with dmd his next target. This isn’t just about lowering the PR count. He’s been reviving old PRs with the original author where he can (he tells me he was surprised how many PR authors were responsive, even after no activity on a PR for a few years) and has tried to rebase and resolve those where he can’t. Here are some statistics he’s gathered on PR activity so far this year across the phobos, druntime, and dmd repositories:

  • phobos: 568 PRs created, 650 PRs closed
  • druntime: 283 PRs created, 311 PRs closed
  • dmd: 1140 PRs created, 1126 closed

At the time he sent me the stats on October 29th, the number of open PRs in phobos had gone down from 160 to 77 and druntime from 130 to 96. The number of open PRs in dmd has remained fairly constant at around 230.

We want to thank Razvan for all the work he is doing, Symmetry Investments for sponsoring his position, the volunteer members of the “strike teams” Razvan has assembled to squash as many bugs as possible, and every contributor who has donated and continues to donate their time and effort to improving our favorite programming language.

LDC 1.28.0

The latest release of LDC implements D 2.098.0 (D frontend, DRuntime, and Phobos) and is compatible with LLVM 6.0 – 12.0.

A major item in this release is that LDC now supports dynamic casts across binary boundaries. DLL support has long been a weak point in D, often requiring the programmer to resort to extern(C) functions that return handles (pointers, references) to D objects. Martin Kinkelin has worked to improve the situation in LDC, motivated primarily by the desire to provide the standard library and runtime as a DLL on Windows.

Thanks to Martin and all the LDC contributors for the work they do to keep LDC releases in sync with those of DMD. If you benefit from their efforts, please consider sponsoring Martin (and LDC by extension) on GitHub!

D on OpenBSD

The D ecosystem grows primarily because of the efforts of volunteers who step forward to fill in the blanks. New D projects pop up all the time, but it’s pretty rare to hear that someone has brought D to a new platform. Brian Callahan has done just that.

Brian has been on a mission to bring D to OpenBSD. In August of this year, he popped into the D forums with an announcement that GDC, the GCC-based D compiler maintained by Iain Buclaw, was now available in the OpenBSD ports tree as part of GCC 11. In early October, he let us know that DMD was coming to the platform. Then in late October, he had the same news about LDC. Instructions for installing DMD on OpenBSD are on the download page (and can be extrapolated to LDC and GDC).

We are grateful to Brian for the work he has done to make this happen. We’re looking forward to his upcoming DConf Online 2021 talk, Life Outside the Big 4: The Adventure of D on OpenBSD:

The journey of D from pie-in-the-sky to a package officially offered in the OpenBSD package repository serves as a model story for other platforms who want to offer D to their userbase. We will walk through the many interconnected parts required to get a D package on OpenBSD, what the future is like for D outside the Big 4, how you can get started with D on your platform, and how those of us who enjoy life outside the Big 4 can be a positive force for D and the D community.

SAOC News

The SAOC 2021 progress bar is past the 25% mark. The first milestone wrapped up on October 15, and the participants have been posting weekly progress reports in the General Forum. It’s always interesting to read about the challenges they encounter and their solutions. But the latest SAOC isn’t the only edition about which there is news to report.

I’ve written above about the SAOC 2018 forking GC project that has found its way into the latest release of DRuntime. I can’t begin to tell you how pleased I am that another SAOC project has come into its own.

For SAOC 2020, Adela Vais set out to implement a D backend for the venerable Bison parser generator. Not only did Adela successfully complete SAOC, she saw her project through to its ultimate goal. The D backend was officially released as part of Bison 3.81 in September.

We want to offer Adela our congratulations and a huge round of applause for a job well done! Getting a project of this scope accepted into a GNU codebase is no mean feat.

DConf Online 2021 T-Shirts

DConf Online 2021 is less than a month away. The D Language Foundation will be providing DConf Online 2021 swag to the DConf speakers and prizes to viewers asking questions in the post-talk live stream Q & A sessions. The cost of the items and their shipping are the only DConf Online expenses, and they’re covered by the D Language Foundation General Fund.

Direct donations to the General Fund and our more targeted funds are always appreciated, but you can also help support the D programming language and DConf Online by purchasing a DConf Online 2021 T-Shirt or other D swag in the DLang Swag Emporium. All proceeds go straight into the General Fund. You get some swag along with our gratitude, and we get a couple of bucks. That’s a pretty good deal!

Looking Forward

As we near the end of 2021, we are looking forward to 2022 and beyond. The D programming language, its ecosystem, and its community have come a long way from the gaggle of curious coders who first took an interest in a one-man project by the guy who had created the game Empire and the Zortech C++ compiler.

The primary means of contributing to the core D projects went from emailing patches to Walter, to posting patches on Bugzilla, to committing to a Subversion repository, to submitting pull requests on GitHub. The web site went from being a few basic HTML pages of the D spec on digitalmars.com maintained only by Walter, to a simple HTML site designed by a community member under the dlang.org domain, to the more complex collection of pages and scripts that today is maintained in Ddoc by multiple contributors. The ecosystem has gone from random libraries and tools hosted by individuals on myriad services, to centralized hosting at dsource.org, to the package repository at code.dlang.org.

These are just some examples of major changes over the years, each in response to growth: as the community grew in size, some of the processes and systems began to burst at the seams. To continue to grow, something had to change. Such improvements have nearly always been the result of community action: discussion and debate in the forums eventually would lead to a champion stepping forward to make it happen. Community action has been the driving force of D since Walter first announced the “D alpha compiler” in late 2001. That’s still true today. We have a handful of paid positions, but we are still primarily driven by volunteers.

The see-a-problem-and-fix-it philosophy that carried D to where we are today has served us well, and we hope it will continue to do so into the future. But that alone is no longer enough. We are bursting at the seams again, and have been for some time. In the monthly foundation meetings, we’ve been discussing specific issues, both low level and high, and how to solve them. But there’s one thing that’s been missing from the equation: organization.

Razvan Nitu’s position as Pull Request & Issue Manager grew out of an email discussion, prompted by Laeeth Isharc, and was a year in the making. We are grateful for every volunteer who has and continues to make themselves available to review pull requests. Razvan is here not to replace them, but to complement them. They can continue as they have done. What Razvan brings to the mix is organization. He’s there to make sure fewer issues and PRs fall through the cracks, to ensure that as many issues as possible that can be resolved are resolved.

In November, the D Language Foundation and a couple of contributors are meeting with a community member who has graciously volunteered his time and expertise to advise us on how to bring the disparate servers in the D community under Foundation management and multiple admins. The end goals are to eliminate the financial burden on the volunteers who maintain these services and, hopefully, reduce the response time when it comes to solving server-related issues or making changes. In other words, organization.

I’m in the middle of revising the Vision Document that we put together over the summer. I’m not just editing it, though. I’m expanding it. My vision of the vision document has evolved since we first discussed a “goal-oriented task list” in our June meeting. I said at the time that I didn’t “know what the initial version of the final list will look like”. I feel that what we came up with falls short of meeting the need it was intended to fill. Now, I’m pretty sure of what it needs to look like. At the moment, I’m swamped in preparations for DConf Online 2021, so I’ve put the document on the backburner. I plan to pick it up again in early December and present my revisions at the last foundation meeting of the year for approval. If all goes well, it should be published on dlang.org in January. This will be a living document, updated to reflect current priorities as time goes by.

Mathias Lang is working on a proposal to bring organization into even more of our processes. It’s a modified version of the governance proposal he brought to the September foundation meeting, the aim of which is to formalize a core team to oversee the day-to-day guidance and management of the D ecosystem. I hope that this will take what already happens in our monthly meetings to the next level. I see this as a means to establish a framework for creating workgroups that can oversee specific tasks and projects, bringing more opportunities for follow-up and follow-through. It should also help provide guidance and establish priorities (e.g., via revisions to the vision document) so that independent contributors can direct their efforts not just to the issues they care about, but those that are seen as a priority by the core team. (I want to emphasize that this is my personal view. Mathias has yet to complete the proposal. But my view is informed by what we discussed in the September meeting.)

With these and future steps aimed at better organizing our community, we intend to level up our ecosystem: motivate library development, improve the onboarding experience, increase retention, make it easier to contribute, and generally resolve the long-standing issues that tarnish the experience of using the best programming language we know. We ask our current volunteers to keep volunteering, and those who aren’t yet doing so to keep an eye out for the right opportunity to pitch in. Together, we can get to where we all want to go.

Bugzilla Reward System

Digital Mars D logo

The Dlang bot has been updated to track Bugzilla issues
that have been fixed. It went live for testing on the 2nd of July. Each GitHub user who fixes a bug via a merged pull request is awarded a number of points depending on the severity of the issue. The current results can always be seen on the contributor stats page. This blog post covers all of the details regarding the implementation, rules, and prizes of the reward system.

Raison d’être

I want to start by saying that the motivation of this system is not to start a fierce competition between contributors to fix as many issues as possible. The primary reasons: we see this as a means for the D Language Foundation to reward committed contributors and to channel their efforts towards more important bugs. If, as a side effect, the system motivates people to fix more bugs, that’s great! We won’t complain.

There are some negative side effects that are possible with any sort of gamification system, and we’ll be keeping an eye out for them. We think we have one of the best online communities out there. Our members are generally friendly and helpful, and we don’t want to do anything that causes tension or proves negatively disruptive. We think this will be a fun way to reward our contributors, but we will pull the plug if it proves otherwise.

Scoring system

The scoring is designed to reward contributors based on the importance of the issues they fix, rather than the total number fixed. As such, issues are awarded points based on severity:

  • enhancement: 10
  • trivial: 10
  • minor: 15
  • normal: 20
  • major: 50
  • critical/blocker: 75
  • regression: 100

Of course, the severity of an issue does not necessarily reflect the complexity of the solution. There might be regressions that are trivial to solve, and enhancements that require an extremely complicated fix. The message that we are trying to send is that complexity is secondary to need. That is why regressions are given top priority and critical/blocker/major issues aren’t far behind.

Rules

The following rules will guide how points are awarded from the initial launch of the reward system. They are not set in stone and are open to revision over time.

Rule #1: The severity of an issue will be decided by the reviewers of a proposed patch.

Severity levels are not always accurately set when issues are first reported and may not have been updated since. The reviewer of a pull request that closes a Bugzilla issue will evaluate the issue’s severity level and may change it if he or she determines it is inaccurate. I will moderate any disagreements that may arise about severity levels.

Rule #2: A PR fixing a bug may not be merged by the same person that proposed the patch.

This is already an unwritten rule that applies to the DLang repositories, so it should not surprise anyone.

Rule #3: Anyone who adopts an orphaned PR that fixes a bug may be awarded its associated points.

To avoid any authorship conflicts, it is best if the adopter contacts the original author to ask if it is okay to adopt the PR. Rule #3 will apply if there is no response or if the response is affirmative. Otherwise, no points will be awarded.

Rule #4: Only one person may receive points per fixed issue.

This rule is specifically designed for reverted PRs. Imagine that a PR that presumably fixes an issue is merged and the author gets points for it. Later on, it is decided that the fix is incorrect and the PR is reverted. If someone else proposes the correct fix, the points will be subtracted from the original contributor and awarded to the new author. Hopefully, this will motivate the original contributor to propose the correct fix after the reversion.

Rule #5: Incomplete fixes still get points.

A Bugzilla report usually includes a snippet of code that reproduces the issue. A frequent pattern is that the bug is correctly fixed for the provided snippet, then someone comes up with a slightly modified example that does not work and reopens the issue. Since the original fix was correct, but not complete, the procedure here is that the original issue should be left closed and a new one should be opened. The original author keeps the points awarded for the original issue.

Implementation

Since most of you are die-hard geeks and are eagerly awaiting the code, here’s the database implementation hosted on the dlang-bot, and here’s the web page implementation. You will notice that the web page is extremely minimal. That is because I am a total n00b when it comes to web programming, so if anyone has the skills and the time to make a cooler web page, feel free to make a PR :D.

In short, for each of the issues that are fixed, the database stores the Bugzilla issue number, the GitHub ID of the person who fixed it, the date when the fix was merged, and the severity of the issue. Every time the leaderboard page is accessed, a query is issued to the database to compute the total points for all of the contributors and sort them in descending order. Easy peasy.

I would like to thank Vladimir Pantaleev for his continued support and assistance throughout the period that I implemented the system.

Prizes

As Mike briefly described in this forum post, we are going to have quarterly competitions. The quarterly prizes will vary. At the end of the year, the person who has acquired the most points will be awarded a bigger prize.

For the inaugural competition, which will officially start on the 20th of September 2021 and will last until the start of DConf Online (the 20th of November), the prizes will be:

  • First Place: a $300 Amazon eGift Card
  • Second Place: a $200 Amazon eGift Card
  • Third Place: a $100 Amazon eGift Card

The next set of prizes will be announced at DConf Online, so stay tuned!

That’s all folks!

If there are any questions or suggestions regarding any aspect of the bugfix reward system, please contact me at razvan.nitu1305@gmail.com. Also, feel free to directly propose changes to the existing infrastructure.

Happy coding everyone!

SAOC 2021 Projects

Digital Mars D logo

The applications have been reviewed, the results decided, and the applicants notified. Five coders will be participating in the 2021 edition of the Symmetry Autumn of Code, one of whom will be the first to take part in SAOC two times.

Following is a brief introduction to each participant and an equally brief summary of their projects. The project planning phase officially kicks off on September 1st, so any details I could provide from their applications would likely change by the time they finalize their initial milestones with their mentors. If you’re eager for more detail, please hold out a little while longer. The participants will start posting updates in the forums once their projects are underway. Their first updates should include more information.

Rethinking the default class hierarchy

If you followed SAOC 2020, you may recall that Robert Aron was a fourth-year student at University POLITEHNICA of Bucharest who worked on implementing D client libraries for the Google APIs, along with a tool to generate client libraries for said APIs (all of which can be found in his GitHub repositories). He also was a recipient of the final SAOC payment (one of two last year, where usually we have only one) and is owed a free trip to a future real-world DConf.

Robert is now working toward an MSc in Security of Complex Networks at the same university, and he’s back with us for SAOC 2021. His project this time around is a DIP for and implementation of the ProtoObject concept that Eduard Staniloiu described in his DConf 2019 talk. This will set a ProtoObject class as the root class of D’s object hierarchy and the ancestor of the existing Object class. It will allow users to opt-in to features currently provided by default through Object, such as the inclusion of a monitor to support synchronization.

Once again, Robert will be working with Eduard Staniloiu and Razvan Nitu as his mentors.

Welcome back, Robert!

Replace DRuntime hooks with templates

Teodor Dutu is also at university in Bucharest working on a master’s degree in Advanced Cybersecurity. He has experience in C and Java, and it’s the low-level experience he gained working on projects like a file system, a kernel module, and an asynchronous HTTP server that he wants to apply toward improving the D ecosystem. The D language grabbed his interest when he participated in Razvan and Edi’s D Summer School, and he is eager to help out where he can.

To that end, Teodor is entering SAOC to work on a change to DRuntime. Currently, certain operations in user code are rewritten to call functions in the runtime known as runtime hooks (if you’ve ever seen a linker error mentioning something like _d_newArrayT or a symbol with a similar name, that was a runtime hook). There are some significant downsides to this approach, such as code bloat (the entire DRuntime library is linked in when linking statically), negative performance impact (due to the use of TypeInfo to pass runtime information to the hooks), and code that’s hard to maintain (the hooks are inserted at the IR level, a component of the compiler that’s difficult to understand).

Teodor’s plan is to replace each of the runtime hooks with templates. Dan Printzell already did some work on this, and Teodor will be following in his footsteps intending to take it all the way.

Eduard Staniloiu and Razvan Nitu will be Teodor’s mentors.

Implement support for D in LLVM Debugger (LLDB)

Luís Ferreira has extensive experience with C, C++, and D. He has contributed to DMD, DRuntime, and Phobos, and has a WIP implementation of DIP 1029 (Add throw as a Function Attribute) underway.

One of the projects Luís has been working on in his free time is a rewrite of DRuntime’s demangler to avoid exceptions, taken on because of his interest in mangling and demangling. He also has an interest in LLVM. The combination sparked the idea for his SAOC project. His rough goals for the project are to add support to LLDB for demangling D symbols, recognizing D-specific data structures, and parsing D expressions.

Mathias Lang has taken on the role of mentor for this project.

Light Weight DRuntime (LWDR)

Dylan Graham made waves on this blog when he wrote about the custom gearbox controller he built, using D for its firmware. That project led him to a new one: D needs a runtime that is suitable for embedded, Internet of Things, and real-time operating systems. That’s when he started work on LWDR.

You can see from that link that LWDR is not a port of DRuntime, but “a completely new implementation for low-resource environments”, and you’ll find a list of features that are currently supported. For SAOC, Dylan will be working on expanding feature support, shoring up what’s already there and adding new features along the way.

Dylan is a university student in Australia, currently pursuing a Bachelor of Computer Science through Monash University. He’s been programming since he was 11 years old, starting with C on the Arduino Uno and BASIC on the Maximite. His courses have exposed him to several other languages, and he has shown he’s a good hand with D.

His mentor for SAOC 2021 is Adam D. Ruppe.

Improve DUB: solve dependency hell

Ahmet Sait Koçak is a Computer Engineering student from Turkey. He has a strong background in C#, but considers D his second-most comfortable language. Some might be familiar with his work maintaining bindbc-harfbuzz.

For his SAOC project, he made use of our Projects repository and settled on the idea of solving the “dependency hell” problem that can arise when using DUB. Essentially, if library A depends on libraries B and C, which in turn depend on two different versions of library D, dub will error out without any effort to resolve the version conflict.

In reviewing the application, the judges identified some issues with the project as proposed, but it was still accepted with the understanding that Ahmet may need to take a different approach. His project subsequently gained the distinction of being the first SAOC project application discussed in a D Language Foundation meeting. The goal was to determine if there might be another way.

Ahmet’s mentor is Max Haughton, who was present for the meeting. He will be working with Ahmet to investigate the solution arrived at in the meeting and, if that proves infeasible, to move forward with the initial idea. Either way, you’ll hear the details from Ahmet in his weekly forum updates.

Onward!

The SAOC judges (Átila Neves, Robert Schadek, and John Colvin) were impressed with the quality of the applications this year and are eager to see how the projects turn out. Please keep an eye out for the weekly updates that should start arriving in the forums around September 22nd, a week after Milestone 1 begins. This will help you keep abreast of the progress of each project and also provide an opportunity for suggestions that might help our SAOC 2021 coders along their paths.

Milestone 1 kicks off on September 15th, and Milestone 4 will end on January 15th. The D Language Foundation and our sponsors, Symmetry Investments, wish these five coders well in all they do over those four months. Their success is the D community’s success, so we hope everyone will join us in ensuring they have all the support and help they need to get through their four milestones and see this thing through to the end.

D Summer School v3

The third edition of the D Summer School, held at University POLITEHNICA of Bucharest, took place from the 5th to the 25th of July. It was three weeks of boot camping bachelor students into the basics of D across eight sessions of hands-on workshops and a hackathon. We will describe our experience in organizing the program, teaching the students, and trying to integrate them into the D community.

Recap from past editions

For the first two editions we had the following process:

  • we started advertising the summer school two months before the event;
  • we selected students from among the applicants;
  • the students had to complete a project during the summer school;
  • we helped students improve their projects during the hackathon;
  • we collected feedback at the end.

For more information, we recommend our previous article on the first edition.

DSSv3

Pre-event actions

In contrast to previous years, this time we started marketing the event very early in February, five months before the start date. We used the most influential vectors we could to promote it: the most popular professors. We nagged them ceaselessly to promote DSSv3 during their courses. The results were spectacular: we received 86 student applications (as opposed to 25 in the previous years). Since this was an online version of the summer school, we decided to drop the selection process and open the door to everyone. This had the added benefit that we didn’t have to conduct interviews with everyone and a wider range of students had a chance to be introduced to D.

To cope with the larger number of participants, we had to grow our team. Former Symmetry Autumn of Code participants Robert Aron and Adela Vais, and Symmetry employee Alexandru Militaru, agreed to join us. As such, we were able to diversify our teaching materials and raise the overall quality of the presentations.

The teaching materials were mostly the same as in the previous years; we simply reshuffled the order to have an incremental level of complexity and added a lab, “WebApp Tutorial using Vibe.d”. Besides this, we also changed the project competition. During the previous editions of DSS we found that students were not very engaged with the project assignment, so this time we came up with something different. We created a Dlang Bug Fix Competition: the top three students who had the largest number of merged PRs that solved a Bugzilla issue would win Raspberry Pi 400 kits (you may have noticed the “DSSv3” labeled PRs on our three main repositories). You may think that contributing to a programming language is a scary task, however, the truth is there are dozens of approachable, easy-to-fix issues that give instant gratification to the contributor.

With all the planning into place, we were now ready to start DSSv3.

The teaching sessions

A teaching session is comprised of an hour-long lecture and two hours of hands-on exercises. This year, we abandoned the slides in favor of tutorial-like examples. Since everything was online, we simply shared our screen and highlighted the different aspects of D in a practical way by directly compiling and running examples. This made the lessons more interactive as students enthusiastically asked “what happens if…?” questions, and we could easily demonstrate the results.

For the exercises, we followed a team-play system where students were grouped in teams of four, and they worked together to solve their tasks. This made it easier for us to organize everything on the Teams platform (we would enter rooms of four students instead of talking students individually) and it encouraged them to help each other.

From among the 86 applicants, we had an average of 35 students attending the sessions, with a record high of 56 (“Introduction to D”) and a low of 25 (“C\C++ Interoperability and Tooling”). It seems that from the first lecture to the last, almost half of the students abandon the course. This may seem like a grim figure, but note that we had students from all types of backgrounds, some of them in their first year at university. Since we are teaching subjects like memory management and multithreading, it’s only natural that some of them will be lost along the way. Regardless, the lowest number of attendees was higher than the highest number from previous years.

The hackathon was attended by eight people. Again, a low figure, but that was not a surprise. Keep in mind that the majority of the participants had never made an open-source contribution. We expected that only the best students would manage to contribute. One other factor that may have influenced the low number was our uninspired decision to organize the hackathon on a Sunday; several people noted in our feedback form that they would have participated had they not had other plans. The result of the hackathon:

  • 9 PRs submitted to Phobos: 5 were merged, 2 were closed but led to closing the associated Bugzilla entries, and 2 were rejected
  • 1 PR submitted to DRuntime that was merged
  • 4 PRs submitted to DMD: 2 were merged, 2 were rejected

We had hoped that students would submit PRs before the hackathon, but we were wrong. It seems that students should be assisted when making their first PR.

The winners of the hackathon were:

  1. danandrei279 with 3 PRs merged
  2. vladchicos with 2 PRs merged
  3. lucica28 with 1 PR merged

Feedback

We asked the students to fill out a feedback form, and we received 15 responses. It is highly possible that the results are biased since the feedback form was available at the end of the hackathon; by that time some students had already dropped out. Although it would have been great to understand their perspective, we still had valuable feedback on what went well and what could be improved.

From aggregating the results, we have the following conclusions.

  • The difficulty of DSS is perceived as being high. Those who are well prepared love it, but those who don’t have too much programming experience are lost along the way.
  • The introductory courses are much more popular than advanced ones such as “C\C++ interop” and “Multithreading in D”.
  • The hackathon was appreciated by hardcore programmers (a small percentage of the total number of attendees), but the rest were intimidated by it.
  • Students appreciated the relaxed interactive atmosphere of the sessions, with some commenting: “The general feel of the summer school was a chill evening hanging out with your friends.”

Overall, DSSv3 generated enthusiasm among the programming geeks, but we still have some work to do to make it attractive to a less savvy audience.

Future plans

Now that our team has grown, we plan to do a bigger restructuring of the course. Given the high drop-out rate, we would like to make the course welcoming for any type of CS student regardless of background or experience. To that end, we are considering creating two tracks for the course: one at a beginner level, and one for more advanced students. That way we can accommodate any type of audience.

Another aspect to think about is the hackathon. We still haven’t found the most appealing project that would motivate students to commit to it. Experience has shown that creating a project from scratch in a language you’ve just learned doesn’t really work (or we haven’t found the adequate project) and contributing to the D language may be intimidating. We are still searching for better solutions, so if you have any ideas, please contact us.

Also, right now, UPB is going through a major redesign of its curriculum. Proposals for new courses are being accepted, and we will forward this course as our choice. There’s a long wait for acceptance, but we’re keeping our fingers crossed.

Conclusions

Overall, we are happy with this year’s edition. We managed to expand our team, grow our reach, and motivate some students to contribute to the language. Even though we still have some work to do to engage less passionate students, we think we are on the right track.

See you next time at DSSv4!

D News Roundup

Version 2.097.0 of DMD, the D programming language reference compiler, was released on June 5th in the middle of new GDC and LDC release announcements, while preparations for two major D community events were underway: the Symmetry Autumn of Code 2021 and DConf Online 2021. We’ll cover it all in this post, with a focus first on the events.

Symmetry Autumn of Code 2021

Symmetry Investments logo

As I write, Symmetry Investments employs in the neighborhood of 180 full-time workers and manages over US$8 billion of capital, and they’re always on the lookout for more employees, including programmers to work with D and other languages. They sponsored DConf 2019 in London and have sponsored the annual Symmetry Autumn of Code since 2018, in which a handful of programmers are paid to work for four months on projects of benefit to the D ecosystem.

This year marks the fourth annual SAoC, and we are now accepting applications. Participants will plan four milestones for projects that benefit the D ecosystem and will be expected to work at least 20 hours per week on each milestone. Each participant will be rewarded US$1000 for the successful completion of each of the first three milestones. At the end of the final milestone, the SAoC committee will review the overall progress of each of the remaining participants. One will be rewarded with a final $US1000 payment and a free pass to the next real-world DConf, with reimbursement for travel and lodging. In last year’s event, a second participant was also awarded a fourth US$1000 payment.

Participation in SAoC has led to jobs for some lucky coders and has generally been a valuable learning experience for those who have completed it. Students currently enrolled in graduate or postgraduate university programs will be given priority, but applications are open to all. The application deadline is August 18th. Project ideas can be found in the D community’s projects repository at GitHub. See the Symmetry Autumn of Code page here at the D Blog for all the details on how to apply as a participant or as a mentor.

DConf Online 2021

For the second consecutive year, we were unable to hold a real-world DConf. Last year we launched the first annual DConf Online. And when I say annual, I mean annual! We’re doing it again this year and will continue to do it going forward even after the real-world DConfs are back on.

DConf Online 2021 will take place November 20 and 21 on the D Language Foundation’s YouTube channel. Once again, we’re looking for pre-recorded talks, livestream panels, and livecoding sessions. If you’d like to propose something in one of those categories, the application deadline is September 5. Please visit the DConf Online 2021 homepage for all the details.

And if you haven’t seen them yet, the DConf Online 2020 and DConf Online 2020 Q & A playlists are available on the same channel. You can also find a full list of talks and all the links (talk videos, slides, and Q & A videos) on the DConf Online 2020 homepage.

New compiler releases

D 2.097.0 is live in the latest release of DMD and the beta release of LDC, the LLVM-based D compiler. The new version of GDC also came into the world as part of GCC 11.1 at the end of April.

DMD 2.097.0

Digital Mars D logo

This version of DMD comes with 29 major changes and 144(!) fixed Bugzilla issues courtesy of 54 contributors. Changes include a few deprecations and several improvements to the standard library. Two things stand out:

  • while(auto n = expression) has been on a few wishlists for a while. Now it’s a reality. The same syntax that was already possible with if statements is considered idiomatic in certain circumstances (such as when checking if an item exists in an associative array). Expect the while condition assignment to start popping up in open-source D projects soon.
  • std.sumtype is another wishlist item that is a wish no more. The new SumType is a replacement for std.variant.Algebraic. It’s a discriminated union that makes good use of Design by Introspection with a nice match syntax for those looking for that sort of thing. It’s been quite a while since the last time a new module was added to the D standard library. Many thanks to Paul Backus for putting in the effort to see it through, and a very big Congratulations!

LDC 1.27.0-beta1

LDC logo

On the same day the new DMD was released, the first beta of LDC 1.27.0, which also supports D 2.097.0, was announced in the D forums.

On top of 2.097.0 support, this version of LDC provides greatly improved DLL support on Windows. The prebuilt Windows packages ship with DRuntime and Phobos DLLs. This is big news for D developers on Windows. We’ve long had issues with D DLLs that have prevented heavy use outside of simple interfaces (with APIs exported as extern(C) being the most reliable).

There are some limitations to be aware of, such as the inability to directly access TLS variables across DLL boundaries (though it’s fine with accessor functions). Please see the release page for the details.

Thanks to Martin Kinkelin and all the LDC maintainers and contributors for their continued work on LDC. They aren’t getting paid for this. If you are a happy LDC user or just like the idea of the project, you can support their work by sponsoring Martin Kinkelin on GitHub.

GDC 11.1

In the GCC world, Iain Buclaw continues to make strides on the GDC compiler.

GDC 11.1 still uses the old C++ version of the D frontend, which feature-wise is mostly (see below) at D 2.076.1. There were significant issues in upstream DMD that prevented Iain from making the switch to the D version of the frontend in time to make the release window. He is currently aiming to make the switch in time for GDC 12. As a consolation, this release has support for three BSDs, Mac OS X, and MinGW!

Despite the older frontend, Iain has backported several fixes and optimizations, and even a few features, so it isn’t your grandfather’s D 2.076.1 that GDC supports. For example, the new bottom type that recently made its way through the D Improvement Proposal review process has found its way into this GDC release. See the forum announcement for details of all the new D goodness in GDC 11.1 and Please consider sponsoring his work on GitHub.

One-off donations

If you aren’t up for sponsoring Martin or Iain but would still like to support them financially, you can make one-time donations through the D Language Foundation. You can send money to the D General Fund, the D Open Collective, or to our PayPal account. Whichever method you choose, please be sure to leave a note that the donation is intended for LDC, GDC, or any D project you would like to support. We’ll make sure the appropriate person receives the money.

Other options for supporting the D programming language: visit the D Language Foundation donation page and donate to one of our funds, head to the DLang Swag Emporium and purchase any items that catch your eye (the D Rocket stuff rocks, and DConf Online 2021 swag will be available shortly), or consider using smile.amazon.com and selecting the D Language Foundation as your charity the next time you shop at Amazon.com (we are only available through the .com domain; browser extensions like SmartAmazonSmile for Firefox and AmazonSmileRedirect for Chrome make it easy to do).

Thanks to everyone who has, will, or continues to support the D programming language, either through donations of time or money. We’ve gotten where we are through community effort, and community effort will keep pushing us forward. D rocks!

Driving with D

Here is what comes to mind when I think of D: fast, expressive, easy, and… driving? That’s right, I drive with D.

Enter my venerable Holden VZ Ute daily driver. From the factory, it came with a rubbish four-speed automatic gearbox. During 18 months of ownership, I destroyed four gearboxes. I could not afford a new vehicle at the time (I’m a 20-year-old Australian computer science student at Monash University), so I had to get creative. I purchased a rock-solid, bulletproof, six-speed automatic gearbox from another car. But that’s where the solutions ended. To make it work, I had to build my own circuit board, computer system, and firmware to control the solenoids, hydraulics, and clutches inside the gearbox, handle user input, perform shifting decisions, and interface to my car by pretending to be the four-speed automatic.

I’m quite proud of my solution. It can perform a shift in 250 milliseconds, which is great for racing. It has a steep first gear, giving it a swift takeoff. It has given some more powerful cars a run for their money. It’s got flappy paddles, diagnostic data on the screen, and the ability to go ahead and change the way it works whenever I want.

Here’s a very old video of the system working. It’s not representative of the current system—that ghastly blue screen is gone, the speedo works, and shifting has improved.

The computer is split into two parts: the user interface board, which drives an OLED display and uses an STM32F042, and the mainboard, which handles everything else, utilizing an STM32F407. The two cooperate over a CAN bus (Controller Area Network). All the firmware to handle this is written in D.

I picked D (as -betterC) because of its ingenious Uniform Function Call Syntax (UFCS), design by contract, metaprogramming, ease of interfacing with C, unit testing, portability, shared, @safe, and codebase maintenance features. Another bonus is the helpful, welcoming community. It has genuinely been a joy discussing D on the forums, and with the founders and community leaders.

The Advantages of D

Uniform Function Call Syntax (UFCS)
This has made my code significantly clearer. My code can accurately follow the flow of data without polluting my stack with single-use variables, nesting many function calls, or other sorts of clutter.
Here is an example of how you could potentially use it in an ECU (Engine Control Unit):

immutable injectorTime = airStoich(100.kpa, 25.degCelsius)
    .airMass
    .fuelMass((14.7f).afr)
    .fuelMol
    .calculateInjectorWidth;

This is equivalent to:

immutable injectorTime =
    calculateInjectorWidth(
        fuelMol(
            fuelMass(
                airMass(
                    airStoich(kpa(100), degCelsius(25))
                ),
                afr(14.7)
            )
        )
    ); // brackets have been expanded for reading clarity

Please note: the values in this example are hardcoded to simplify the code and demonstrate how UFCS can give a unit of measurement to a value.

Both representations are valid D code; you can use either.

With UFCS, there’s no need to read the code backward or count your brackets, no need to use a gazillion single-use variables. Function calls mirror the flow of data. It’s concise.

Design by Contract
D’s contract programming is quite similar to Ada’s. Functions can be marked with preconditions, designated by in, and postconditions, designated by out. Should a contract fail, an assertion is thrown.

// This demonstrates D’s contract programming for a function.
// It uses the short-hand expression based syntax.
int iHaveAContract(void* ptr)
in(ptr !is null) // this is a precondition, if ptr is null, an assertion is raised
in(ptr !is null, "ptr is null :(") // this is a precondition, if ptr is null, an assertion with the error message "ptr is null :(" is raised
out(result; result > 0) // this is a post condition, it captures the return value as result and tests it
out(result; result > 0, "result was too low") // this is a post condition, it captures the return value as result, tests it, and if it fails, raise an assertion with the message "result was too low"
{
    // normal function code here
}

If you want to do something a bit more complex in your contracts, an alternative syntax is available:

int iHaveAContract(void* ptr)
in {
    assert(ptr !is null); // this is equivalent to in(ptr !is null) from above.
    MyStruct* ms = cast(MyStruct*)ptr; // we can introduce variables local to this contract
    assert(ms.blah == 2, "MyStruct.blah must be 2!"); // test ms.blah, if it fails, raise an assert with an error message
}
out(result) { // captures the return value as result
    int squareIt = result * result;
    assert(squareIt == 4);
}
do { // this designates the function body
    // normal function code here
}

void main(string[] args)
{
    auto i = iHaveAContract(null); // this will violate the contract
}

Structs and classes can include contracts, called invariants, that sanity check the state of an instance for its whole lifetime. Invariants are checked after the constructor is run, before the destructor is run, and before and after public function calls.

struct MyStruct
{
    int blah;

    invariant {
        assert(blah == 2, "blah must always be 2 for some reason!");
    }
    // shorthand:
    invariant(blah == 2, "blah must be 2 for some reason!");

    // The invariant is checked at function entry and exit. If value is anything other than 2, the invariant will fail when the function exits
    void setBlah(int value)
    {
        blah = value;
    }
}

void main(string[] args)
{
    MyStruct s; // invariant is run after construction
    s.setBlah(3); // invariant contract will be violated.
}

Metaprogramming
The Don’t Repeat Yourself (DRY) principle is often touted by programmers. D’s metaprogramming is an incredible tool to achieve that goal. I use it in my CAN bus implementation. For example:

struct CANPacket(ushort ID) {
    enum id = ID;
    ubyte[8] data;
}
alias HeartbeatPacket = CANPacket!10;
alias BeepHornPacket = CANPacket!140;

I’ve got specific aliased types as HeartbeatPacket and BeepHornPacket, but I haven’t needed to repeat any code. They all follow the same underlying structure, so if I modify CANPacket, every alias is also updated.

Maybe you want a more descriptive CAN packet? Mixin templates can help with that!

struct GenericCanPacket
{
    ushort id;
    ubyte[8] data; // 8 bytes to store the CAN packet payload
}

struct HeartbeatPacket
{
    ubyte deviceID; // first byte is the device ID
    ubyte statusID; // second byte is the status
}

To translate a GenericCanPacket to the descriptive HeartbeatPacket, we can use mixin templates.

mixin template CanPacketHelperFunctions(ushort ID)
{
    enum id = ID;

    // typeof(this) means that the return type of readFromGeneric will be the struct this template is instantiated in.
    static typeof(this) readFromGeneric(const ref GenericCanPacket p)
    in(p.id == ID, "Generic packet cannot be converted!")
    {
        // do some cast
    }
}

struct HeartbeatPacket
{
    // the stuff declared in CanPacketHelperFunctions is pretty much copy-pasted (not literally) into here
    mixin CanPacketHelperFunctions!10;

    ubyte deviceID; // first byte is the device ID
    ubyte statusID; // second byte is the status
}

void main()
{
    GenericCanPacket generic;
    generic.id = 10;
    generic.data = [ 2 /* deviceID */, 3 /* statusID */ ];

    HeartbeatPacket heartbeat = HeartbeatPacket.readFromGeneric(generic);
    writeln(HeartbeatPacket.id); // the packet ID is 10
    writeln(heartbeat.deviceID); // 2
    writeln(heratbeat.statusID); // 3
}

The mixin template CanPacketHelperFunctions can be used over and over for all sorts of packet representations, and since it is only declared once, the implementation remains consistent across all types that use it.

Interfacing to C
I frequently must communicate with my microcontroller’s HAL and RTOS; D’s C interface made that a breeze. Just add an extern(C) and it’s good to go.

extern(C) c_setPwm(int solenoid, void* userData); // declaration
c_setPwm(4, null); // usage, pretty easy!

Unit Testing
D’s built-in unit testing has saved me from blowing my foot off a few times. I can run all my unit tests on Windows to guarantee logical correctness, and then build a final target for my microcontroller. Here is an example:

struct MyStruct
{
    int x;
    int squareIt() { return x * x; }
}

unittest
{
    MyStruct s;
    s.x = 9;
    assert(s.squareIt == 9 * 9); // If for some reason the implementation breaks, then the unit test fails
}

Deprecation
Codebases can be ever-changing, and sometimes certain functionality may no longer be considered good practice but must be retained for legacy reasons. D provides a way to explicitly mark this in code. Any use of such deprecated code will trigger a deprecation warning from the compiler. Example:

deprecated struct Example
{
    // ...
}

// This deprecation includes a reason
deprecated("This is the reason for deprecation..") struct ExampleWithMessage
{
    // ...
}

void main()
{
    // This will generate the following compiler warning:
    // "Deprecation: struct `Example` is deprecated"
    Example e;

    // This will generate the following compiler warning:
    // "Deprecation: struct `ExampleWithMessage` is deprecated - This is the reason for deprecation.."
    ExampleWithMessage ewm;
}

Of course, deprecated can be applied to all sorts of things, not just structs.

Portability
Following on from above, D supports a surprisingly large number of targets via GDC and LDC. If it weren’t for D’s portability, I would have had to write my project in C++ (ugh). I use LDC, and cross-compiling can be performed by simply adjusting my command line arguments.

Shared
shared is D’s way of guarding against multi-threaded access of code. It’s not perfect, but I use it as-is, and I think it works well. I do have multiple threads in my codebase, and they need to synchronize data. I mark certain variables as shared, which means I must take special care accessing that data. It works with system locks and mutexes. While locked, I can cast shared away and use it like a normal variable. This is handy with structs and classes.

shared int sensorValue;
sensorValue = 4; // using it like a single-thread variable, error
atomicStore(sensorValue, 4); // works with atomics

SafeD
@safe exists to prohibit sketchy memory activities and enforce best behavior. I haven’t had to fight @safe much yet because I don’t do anything wicked with my memory, but it is comfortable knowing that if I am going to make a mistake, the compiler can assist me in stopping it.

Mental Friction
Adam D. Ruppe puts it succinctly: D has low mental friction. The flexibility and expressiveness of the language make it easy to translate one’s thoughts into written code and maintain productivity. I don’t have to fight D much. This is my personal opinion, but I feel like D is the language in which I’m most productive.

Final Thoughts

D is the perfect fit for this sort of project— I think it’s going to have a bright future ahead in the embedded world. I’m going to continue using D for my projects. I’ve got another D-powered automotive project in the works which I hope to show off in the future. Even if D isn’t yet suitable for your project, keep an eye on it. D has been making enormous strides in the past few years, especially in regards to memory safety.

The examples shown in this article are purely meant to demonstrate how D’s features can be used in the real world. Do not take them as gospel as to how you should program.

Interfacing D with C: Strings Part One

Digital Mars D logo

This post is part of an ongoing series on working with both D and C in the same project. The previous two posts looked into interfacing D and C arrays. Here, we focus on a special kind of array: strings. Readers are advised to read Arrays Part One and Arrays Part Two before continuing with this one.

The same but different

D strings and C strings are both implemented as arrays of character types, but they have nothing more in common. Even that one similarity is only superficial. We’ve seen in previous blog posts that D arrays and C arrays are different under the hood: a C array is effectively a pointer to the first element of the array (or, in C parlance, C arrays decay to pointers, except when they don’t); a D dynamic array is a fat pointer, i.e., a length and pointer pair. A D array does not decay to a pointer, i.e., it cannot be implicitly assigned to a pointer or bound to a pointer parameter in an argument list. Example:

extern(C) void metamorphose(int* a, size_t len);

void main() {
    int[] a = [8, 4, 30];
    metamorphose(a, a.length);      // Error - a is not int*
    metamorphose(a.ptr, a.length);  // Okay
}

Beyond that, we’ve got further incompatibilities:

  • each of D’s three string types, string, wstring, and dstring, are encoded as Unicode: UTF-8, UTF-16, and UTF-32 respectively. The C char* can be encoded as UTF-8, but it isn’t required to be. Then there’s the C wchar_t*, which differs in bit size between implementations, never mind encoding.
  • all of D’s string types are dynamic arrays with immutable contents, i.e., string is an alias to immutable(char)[]. C strings are mutable by default.
  • the last character of every C string is required to be the NUL character (the escape character \0, which is encoded as 0 in most character sets); D strings are not required to be NUL-terminated.

It may appear at first blush as if passing D and C strings back and forth can be a major headache. In practice, that isn’t the case at all. In this and subsequent posts, we’ll see how easy it can be. In this post, we start by looking at how we can deal with NUL termination and wrap up by digging deeper into the related topic of how string literals are stored in memory.

NUL termination

Let’s get this out of the way first: when passing a D string to C, the programmer must ensure it is terminated with \0. std.string.toStringz, a simple utility function in the D standard library (Phobos), can be employed for this:

import core.stdc.stdio : puts;
import std.string : toStringz;

void main() {
    string s0 = "Hello C ";
    string s1 = s0 ~ "from D!";
    puts(s1.toStringz());
}

toStringz takes a single argument of type const(char)[] and returns immutable(char)* (there’s more about const vs. immutable in Part Two). The form s1.toStringz, known as UFCS (Uniform Function Call Syntax), is lowered by the compiler into toStringz(s1).

toStringz is the idiomatic approach, but it’s also possible to append "\0" manually. In that case, puts can be supplied with the string’s pointer directly:

import core.stdc.stdio : puts;

void main() {
    string s0 = "Hello C ";
    string s1 = s0 ~ "from D!" ~ "\0";
    puts(s1.ptr);
}

Forgetting to use .ptr will result in a compilation error, but forget to append the "\0" and who knows when someone will catch it (possibly after a crash in production and one of those marathon debugging sessions which can make some programmers wish they had never heard of programming). So prefer toStringz to avoid such headaches.

However, because strings in D are immutable, toStringz does allocate memory from the GC heap. The same is true when manually appending "\0" with the append operator. If there’s a requirement to avoid garbage collection at the point where the C function is called, e.g., in a @nogc function or when -betterC is enabled, it will have to be done in the same manner as in C, e.g., by allocating/reallocating space with malloc/realloc (or some other allocator) and copying the NUL terminator. (Also note that, in some situations, passing pointers to GC-managed memory from D to C can result in unintended consequences. We’ll dig into what that means, and how to avoid it, in Part Two.)

None of this applies when we’re dealing directly with string literals, as they get a bit of special treatment from the compiler that makes puts("Hello D from C!".toStringz) redundant. Let’s see why.

String literals in D are special

D programmers very often find themselves passing string literals to C functions. Walter Bright recognized early on how common this would be and decided that it needed to be just as seamless in D as it is in C. So he implemented string literals in a way that mitigates the two major incompatibilities that arise from NUL terminators and differences in array internals:

  1. D string literals are implicitly NUL-terminated.
  2. D string literals are implicitly convertible to const(char)*.

These two features may seem minor, but they are quite major in terms of convenience. That’s why I didn’t pass a literal to puts in the toStringz example. With a literal, it would look like this:

import core.stdc.stdio : puts;

void main() {
    puts("Hello C from D!");
}

No need for toStringz. No need for manual NUL termination or .ptr. It just works.

I want to emphasize that this only applies to string literals (of type string, wstring, and dstring) and not to string variables; once a string literal is included in an expression, the NUL-termination guarantee goes out the window. Also, no other array literal type is implicitly convertible to a pointer, so the .ptr property must be used to bind them to a pointer function parameter, e.g., `giveMeIntPointer([1, 2, 3].ptr).

But there is a little more to this story.

String literals in memory

Normal array literals will usually trigger a GC allocation (unless the compiler can elide the allocation, such as when assigning the literal to a static array). Let’s do a bit of digging to see what happens with a D string literal:

import std.stdio;

void main() {
    writeln("Where am I?");
}

To make use of a command-line tool particularly convenient for this example, I compiled the above on 64-bit Linux with all three major compilers using the following command lines:

dmd -ofdmd-memloc memloc.d
gdc -o gdc-memloc memloc.d
ldc2 -ofldc-memloc memloc.d

If we were compiling C or C++, we could expect to find string literals in the read-only data segment, .rodata, of the binary. So let’s look there via the readelf command, which allows us to extract specific data from binaries in the elf object file format, to see if the same thing happens with D. The following is abbreviated output for each binary:

readelf -x .rodata ./dmd-memloc | less
Hex dump of section '.rodata':
  0x0008e000 01000200 00000000 00000000 00000000 ................
  0x0008e010 04100000 00000000 6d656d6c 6f630000 ........memloc..
  0x0008e020 57686572 6520616d 20493f00 2f757372 Where am I?./usr
  0x0008e030 2f696e63 6c756465 2f646d64 2f70686f /include/dmd/pho
...

readelf -x .rodata ./gdc-memloc | less
Hex dump of section '.rodata':
  0x00003000 01000200 00000000 57686572 6520616d ........Where am
  0x00003010 20493f00 00000000 2f757372 2f6c6962  I?...../usr/lib
...

readelf -x .rodata ./ldc-memloc | less
Hex dump of section '.rodata':
  0x00001e40 57686572 6520616d 20493f00 00000000 Where am I?.....
  0x00001e50 2f757372 2f6c6962 2f6c6463 2f783836 /usr/lib/ldc/x86

In all three cases, the string is right there in the read-only data segment. The D spec explicitly avoids specifying where a string literal will be stored, but in practice, we can bank on the following: it might be in the binary’s read-only segment, or it might be in the normal data segment, but it won’t trigger a GC allocation, and it won’t be allocated on the stack.

Wherever it is, there’s a positive consequence that we can sometimes take advantage of. Notice in the readelf output that there is a dot (.) immediately following the question mark at the end of each string. That represents the NUL terminator. It is not counted in the string’s .length (so "Where am I?".length is 11 and not 12), but it’s still there. So when we initialize a string variable with a string literal or assign a string literal to a variable, the lack of an allocation also means there’s no copying, which in turn means the variable is pointing to the literal’s location in memory. And that means we can safely do this:

import core.stdc.stdio: puts;

void main() {
    string s = "I'm NUL-terminated.";
    puts(s.ptr);
    s = "And so am I.";
    puts(s.ptr);
}

If you’ve read the GC series on this blog, you are aware that the GC can only have a chance to run a collection if an attempt is made to allocate from the GC heap. More allocations mean a higher chance to trigger a collection and more memory that needs to be scanned when a collection runs. Many applications may never notice, but it’s a good policy to avoid GC allocations when it’s easy to do so. The above is a good example of just that: toStringz allocates, we don’t need it in either call to puts because we can trust that s is NUL-terminated, so we don’t use it.

To be very clear: this is only true for string variables that have been directly initialized with a string literal or assigned one. If the value of the variable was the result of any other operation, then it cannot be considered NUL-terminated. Examples:

string s1 = s ~ "...I'm Unreliable!!";
string s2 = s ~ s1;
string s3 = format("I'm %s!!", "Unreliable");

None of these strings can be considered NUL-terminated. Each case will trigger a GC allocation. The runtime pays no mind to the NUL terminator of any of the literals during the append operations or in the format function, so the programmer can’t trust it will be part of the result. Pass any one of these strings to C without first terminating it and trouble will eventually come knocking.

But hold on…

Given that you’re reading a D blog, you’re probably adventurous or like experimenting. That may lead you to discover another case that looks reliable:

import core.stdc.stdio: puts;

void main() {
    string s = "Am I " ~ "reliable?";
    puts(s.ptr);
}

The above very much looks like appending multiple string literals in an initialization or assignment is just as reliable as using a single string literal. We can strengthen that assumption with the following:

import std.stdio : writeln;

void main() {
    writeln("Am I reliable?".ptr);

    string s = "Am I " ~ "reliable?";
    writeln(s.ptr);
}

writeln is a templated function that recognizes when it’s being given a pointer; rather than treating it as a string and printing what it points to, it prints the pointer’s value. So we can print memory addresses in D without a format string.

Compiling the above, again on 64-bit Linux:

dmd -ofdmd-rely rely.d
gdc -o gdc-rely rely.d
ldc2 -ofldc-rely rely.d

Now let’s execute them all:

./dmd-rely
562363F63010
562363F63030

./gdc-rely
5566145E0008
5566145E0008

./ldc-rely
55C63CFB461C
55C63CFB461C

We see that dmd-rely prints two different addresses, but they’re very close together. Both gdc-rely and ldc-rely print a single address in both cases. And if we make use of readelf as we did with the memloc example above, we’ll find that, in every case, the literals are in the read-only data segment. Case closed!

Well, not so fast.

What’s happening is that all three compilers are performing an optimization known as constant folding. In short, they can recognize when all operands of an append expression are compile-time constants, so they can perform the append at compile-time to produce a single string literal. In this case, the net effect is the same as s = "Am I reliable?". LDC and GDC go further and recognize that the resulting literal is identical to the one used earlier, so they reuse the existing literal’s address (a.k.a. string interning). (Note that DMD also performs string interning, but currently it only kicks in when a string literal appears more than twice.)

To be clear: this only works because all of the operands are string literals. No matter how many string literals are involved in an operation, if only one operand is a variable, then the operation triggers a GC allocation.

Although we see that the result of an append operation involving string literals can be passed directly to C just fine, and we’ve proven that it’s stored in read-only memory alongside its NUL terminator, this is not something we should consider reliable. It’s an optimization that no compiler is required to perform. Though it’s unlikely that any of the three major D compilers will suddenly stop constant folding string literals, a future D compiler could possibly be released without this particular optimization and instead trigger a GC allocation.

In short: rely on this at your own risk.

Addendum: Compile rely.d on Windows with dmd and the binary will yield some very different output:

dmd -m64 -ofwin-rely.exe rely.d
./win-rely
7FF76701D440
7FF76702BB30

There is a much bigger difference in the memory addresses here than in the dmd binary on Linux. We’re dealing with the PE/COFF format in this case, and I’m not familiar with anything similar to readelf for that format on Windows. But I do know a little something about Abner Fog’s objconv utility. Not only does it convert between object file formats, but it can also disassemble them:

objconv -fasm win-rely.obj

This produces a file, win-rely.asm. Open it in a text editor and search for a portion of the string, e.g., "I rel". You’ll find the two entries aren’t too far apart, but one is located in a block of text under this heading:

rdata SEGMENT PARA ‘CONST’ ; section number 4

And the other under this heading:

.data$B SEGMENT PARA ‘DATA’ ; section number 6

In other words, one of them is in the read-only data segment (rdata SEGMENT PARA 'CONST'), and the other is in the regular data segment. This goes back to what I mentioned earlier about the D spec being explicitly silent on where string literals are stored. Regardless, the behavior of the program on Windows is the same as it is on Linux; the second call to puts doesn’t blow anything up because the NUL terminator is still there, one slot past the last character. But it doesn’t change the fact that constant folding of appended string literals is an optimization and only to be relied upon at your own risk.

Conclusion

This post provides all that’s needed for many of the use cases encountered with strings when interacting with C from D, but it’s not the complete picture. In Part Two, we’ll look at how mutability, immutability, and constness come into the picture, how to avoid a potential problem spot that can arise when passing GC-allocated D strings to C, and how to get D strings from C strings. We’ll save encoding for Part Three.

Thanks to Walter Bright, Ali Çehreli, and Iain Buclaw for their valuable feedback on this article.

A Pull Request Manager’s Perspective

Since January of this year, I have been working as a part-time PR (Pull Request) manager. During this time, I have mostly been reviewing PRs and going through issues on the D Bugzilla. I have also been trying to come up with ways of creating organizational structures and procedures that will ultimately aid the D leadership in motivating and focusing community effort. This blog post presents a few insights I’ve had regarding the PR queues of the dmd, druntime, and phobos repositories, and a couple of proposals that, in my opinion, could benefit the D contribution process.

PR rounds

As a PR manager, I spend most of my time reviewing PRs. Since I started on the job, I have been involved in the merger of more than 400 PRs across our repositories. From this experience I have extracted a few insights:

  • If a new PR is not reviewed within the first 3 days after it was opened, chances are that it will get abandoned.
  • If a PR is not merged during the first 2 weeks after it was opened, chances are that it will be abandoned.
  • Contributions, in terms of PRs per month, are as follows: phobos (130), dmd (85), druntime (30).
  • Although phobos benefits from more contributions, dmd has a larger contributor base.
  • Druntime needs morelove.
  • Veteran contributors are more likely to abandon PRs than new/first-time contributors.

Given the first 2 points, I try to make contact as fast as possible with PR authors. It often happens that I do not have the necessary expertise to technically review a PR. In that case, I try to find people who are willing to take a look. However, since we do not have a concrete community hierarchy, it is sometimes difficult to find the needed reviewers. A solution to this problem is proposed later in the blog post.

Regarding the ratio of contributions per repository, it is noteworthy that phobos and dmd get a lot of attention, whereas druntime is by far the least attractive repository. Another interesting aspect is the diversity of the contributor base: in the last month, there have been ten contributors who opened more than one PR for dmd, five for phobos, and four for druntime. Ths emphasizes the fact that druntime needs more love.

Lastly, I noticed that veteran contributors tend to abandon their PRs more often than newcomers. This can be explained by the fact that veteran contributors usually tackle multiple PRs at the same time, whereas newcomers usually focus on a single PR. I want to take this opportunity to urge all contributors not to abandon their PRs. It is disappointing for reviewers such as myself to put in the time to properly investigate the patch and offer advice to then see it go to waste. I know that it is much more appealing to start working on new things, but it is highly important not to let any work go to waste.

Upcoming projects

From my perspective, D has come a long way from its early days: language features are maturing, adoption is steadily growing and the community is expanding around a nucleus of veteran contributors. But given that growth, it is surprising that from an organizational standpoint we are basically in the same spot: if a critical issue appears (a critical bug report, a CI failure, an expired certificate, etc.), the solution is to make a forum post or a comment on Slack and hope that someone who can fix it, or can get it fixed, notices it soon; non-critical issues depend on someone taking an interest: an issue might eventually be fixed, or we might be stuck with it indefinitely. The problem is not manpower or skill; our community has a lot of talent. Unfortunately, we fail to utilize it to its full potential.

If we want certain things to be done, it is the leadership’s responsibility to:

  1. specifically state what work needs to be done,
  2. organize the community, and
  3. incentivize contributors to do the needed work.

Although there is room for improvment, (1) has usually taken place in the form of forum discussions, DIPs, and blog posts. (2) is difficult to implement, given that people contribute in their own free time. As for (3), the mantra has been “fix it if you need it”, which works well for interesting topics, but not that well for important, hard-to-fix bugs, or high-impact, boring tasks.

Implementing points (2) and (3) in an open-source community and with limited financial resources is difficult. However, there are alternative approaches that have not been explored in the DLang ecosystem. I will outline them below.

Creating strike teams

One way of organizing the community is to create dedicated groups of people, or strike teams, that can be called upon for specific tasks. One will be assigned to each repository (dmd, druntime, phobos, dlang.org). The idea is to add people to these groups who either have expertise but lack time to contribute, or don’t lack expertise but are willing to actively contribute. This way, if you do not have time to contribute code, you can still help the community by offering implementation advice, whereas if you do have time to offer, you can contribute and develop expertise. The strike teams will be populated by a limited number of people who are trusted members of the community. These teams will be approached directly by the leadership (Walter, Atila, Mike, PR Managers) to fix issues or implement work defined in point (1). The components of the strike teams will receive recognition by having their name listed on the dlang.org site (thus satisfying point (3)).

Of course, this will work as long as there are folks out there willing to dedicate their time. If you want to contribute in some form to any of the strike teams, please contact me directly on Slack or via email.

Bugzilla Gamification

The D compiler has around 3000 reported bugs, druntime around 300, and phobos 900. These numbers have grown over time. Although some issues are fixed, we have had no means to incentivize people to work on the critical ones. To that end, we propose a simple gamification scheme: each issue has a severity associated; once a PR that closes an issue is merged, the github author of the PR is awarded points according to its severity level; a leaderboard, which is updated in real-time, is presented on dlang.org, and anyone can see who the top contributors are. At the end of each “season”, contributors will be awarded prizes and recognition based on different criteria, such as overall point total, number of total contributions, and so on (we have yet to finalize the kinds of prizes that will be awarded).

By implementing this scoring scheme, we offer some incentive for more experienced contributors to prioritize blocker/critical/major/regression issues over the more trivial or simpler ones, and encourage new contributors to try their hand at a level with which they’re comfortable. We are already working on implementing this and will announce the rules and prize categories once everything is up and running.

Conclusion

We are at a point in the evolution of the D programming language and its ecosystem where motivating community effort towards a common goal is crucial. This is a long-term, complicated task, but we need to start somewhere. I hope that with this initiative we can pave the way to a more sophisticated and better-organized contribution process that is a more satisfying and rewarding experience for our contributors.

D 2.096.0 Released and Other News

Digital Mars D logo

The latest version of DMD, the D reference compiler, is now available for download. The changelog notes 17 major changes and 81 resolved Bugzilla issues from 54 contributors. After we get into some notable items from the changelog, we’ll turn our attention to other items of note from the D community: a new release of LDC is right around the corner and one of GDC just beyond the horizon; the Symmetry Autumn of Code wrapped up with a surprise ending; and there are two sometimes forgotten places where anyone can go to find ways to contribute and sharpen their D coding skills.

DMD 2.096.0

This release of DMD is one of those where so many of the improvements are highlight-worthy that it’s difficult to decide which ones to focus on: there are more improvements to the experimental C++ header generation like those I highlighted in the previous release; support for DWARF debug info is surely important to a significant segment of the D community; and changing plain synchronized statements to use runtime-allocated mutexes fixes a critical issue.

The full changelog is there with all the details for anyone who wants them. The features below are a couple that I believe warrant a bit of exposition.

New C-compatible complex types

The complex types cfloat, cdouble, and creal (and their imaginary i-prefixed counterparts) have been a part of D for a very long time, but for much of that time they have been on the block to face future deprecation.

What became clear over time is that they are accompanied by a high maintenance cost, requiring special cases in the frontend to maintain compatibility with other language features. With that and their specialized nature, they have the wrong cost-benefit ratio. The std.complex module introduced the Complex type to shift that cost-benefit ratio in the other direction.

For several reasons, the built-in types have yet to be deprecated, but new D code should be written to use std.complex. One potential problem there is that the library type is not compatible with the C _Complex type, so anyone needing that compatibility has continued to reach for the built-ins.

This release introduces three new aliased types that are automatically configured at compile-time to be ABI compatible with C’s _Complex and should be used in place of the built-ins when interacting with C or C++: c_complex_float, c_complex_double, and c_complex_real. They are declared in the stdc.config module, which must be imported to use them (and which includes other aliased types that have ABI variations between C compilers, such as c_long and c_long_double).

Postblit and copy constructor priority

The postblit constructor has long been the second step in D’s approach to copy constructing struct instances:

  1. “blit” (copy) the fields from the source to the destination
  2. invoke the destination’s postblit constructor

The first step would always take place, the second only if the struct declaration included an implementation of this(this). The idea behind this is that the first step does a shallow copy, and the postblit constructor can take care of any extra work that is required, such as making “deep copies” (copying referenced data) or incrementing a reference count. The postblit constructor does not have access to the source object.

Unfortunately, time and usage uncovered issues with postblit, particularly in how it behaves in the presence of qualifiers like const, immutable, and shared. Changing the behavior of such a long-lived and deeply ingrained feature is problematic due to the impact on existing code. Instead, the approval of DIP 1018 introduced copy constructors to the language.

Now, newly written code should make use of copy constructors rather than postblits, but postblits are not deprecated and remain in the language. Consequently, the two features need to learn to play nice together.

Until now, when both a postblit and a copy constructor were present, priority was given to the postblit. Unfortunately, there was a corner case that slipped by.

The example in the changelog looks like this:

// library code using postblit
struct A
{
    this(this) {}
}

// new code using copy constructor
struct B
{
    A a;
    this(const scope ref B) {}
}

Because A implements a postblit, then given an instance b of B, the postblit of a should be invoked anytime a copy of b is made. Since the user defined no postblit for B, the compiler will generate one that, when invoked, will in turn invoke the postblit of a.

Now that postblit constructors have priority over copy constructors, the programmer who implemented B is expecting its copy constructor to run, but it never will.

With 2.096.0, the above code will result in a deprecation message informing the programmer that a generated postblit will be invoked instead of the copy constructor. The programmer then has three options:

  • disable the postblit with @disable this(this);
  • implement a postblit, which will then have priority over any copy constructors
  • remove all copy constructors from B in preference to the generated postblit

LDC

Up-to-date beta versions of the LLVM-based D compiler, LDC, tend to come not too long after DMD releases. As I write, the LDC maintainer Martin Kinkelin is working toward the next beta release, and that will support D 2.096.0. Keep an eye on the D Announce forum and the LDC release page for news of the release.

GDC

Using the current release of GDC, the D compiler which is distributed as part of the GNU Compiler Collection (GCC), the __VERSION__ constant reports 2.076. LDC and DMD moved on from that version with a D frontend that was ported from C++ to D, but to facilitate bootstrapping, GDC had to stick with the C++ version of the runtime for inclusion into GCC. Maintainer Iain Buclaw has been backporting some fixes and improvements from upstream, so the 2.076 of GDC is no longer at full bug/feature parity with DMD 2.076. Just one example of many, a performance boost to static foreach that was merged into DMD last year is also included in the GDC that shipped with GCC 10.2.1.

The upside of this is that GDC is rock-solid stable. The downside is that code that compiles on the latest DMD and LDC may fail to compile on GDC. But in between backporting bugfixes, his day job, and his normal life, Iain has been expending his free time on replacing the old C++ frontend with the newer D implementation. Given time constraints and the amount of work left to do, version 11 of GDC will remain on D 2.076. Once the GCC feature freeze is lifted in May, he will start laying the foundations for making the switch to the newer frontend in a future GCC release.

Symmetry Autumn of Code 2020

The latest edition of SAOC kicked off in September of 2020. With the sponsorship of Symmetry Investments, four students were to work through four milestones to complete projects that would benefit the D ecosystem. Upon successful completion of each of the first three milestones, each student would receive $1,000. At the end of the fourth milestone, the SAOC judges would award one of them with a final $1,000 payment and a free trip to the next real-world DConf.

All four students completed the first two milestones, but two of them were unable to continue past the third. Milestone 4 ended on January 15th. At the end of the month, the two remaining students submitted their final milestone reports and a summary of their experience working on their projects: what they learned, how reality compared to their expectations, and what their plans are going forward. With those documents and the mentors’ student evaluations in hand, the SAOC judges had to decide which of the two should be awarded the final payment.

Their decision was neither obvious nor easy. Both of the students did outstanding work throughout the event. Their milestone reports sufficiently documented their progress. They kept up with their forum updates as required. Their mentors gave them glowing reviews. There was very little to separate them. In the end, the judges looked at the state of both projects and determined that one would have a broader and more immediate benefit to the D community than the other, and so made a decision.

But they didn’t leave it there. The SAOC judges felt that both students had done so well that they both should be rewarded, but our sponsor had only allocated funding for one reward. The judges devised alternative scenarios and asked our sponsor which of the options they would support. They received an answer, and I informed the students in the last week of February (after they had patiently waited several days beyond our initial deadline).

Robert Aron was selected to receive the final payment and a free trip to the next real-world DConf. His project: implementing D clients for Google APIs. By the end, he had two fully functional libraries for talking to Google Drive and Google Calendar, and an in-progress library for interfacing with Gmail, all of which you can find on GitHub. He also has a work-in-progress template-based Google API generator. He intends to finish both it and the GMail client. He also plans to eventually generate libraries for Google APIs that use RPC.

The “runner-up” is Adela Vais. With Symmetry’s permission, she was offered a choice between a final $1,000 payment and a free trip to the next real-world DConf (I’ll leave it to her to tell folks which she chose). The initial goal of her SAOC project was to implement a D GLR parser for GNU Bison. After seeing the state of the existing D LALR(1) pull parser support in Bison, she shifted her goal to first implement its missing features and add support for a push parser before moving on to the GLR parser. By the end of the event, the push parser was 95% complete, and she intended to take it all the way and then turn to the GLR parser support.

Congratulations to both Robert and Adela! And a special thanks to Laeeth Isharc and Symmetry Investments for providing this opportunity to young programmers every year. Robert and Adela told us they learned more from this experience than they had expected to, and they were obviously proud of the work they had done.

D ecosystem projects and tasks

The DLang Project Idea Repository was created during DConf 2019 in London after a discussion among some of the attendees. Not only is it a solid source of project ideas for potential participants in Google Summer of Code and SAOC applicants, but it’s also useful for anyone looking to make a meaningful contribution to the D community.

The D Ecosystem Task List came about after discussions with representatives of companies using D in production. Any company that is willing to allow one of their employees at least one day per quarter to work on smaller tasks that benefit the D ecosystem can use the repository to get ideas and indicate which task they’re working on. Funkwerk, who have been using D in production since 2008, have been the only ones to take us up on this so far. Their employees have contributed improvements to D-Scanner, dfmt, and dub. We are grateful to them for all they’ve done.

The task list isn’t limited to company employees. It’s a list of broad categories, not specific tasks, for anyone who has an hour or two they’d like to spend on writing D code. So if you are looking for D experience or would simply like to contribute, and you don’t have time for a full-on project, the Ecosystem Task List is a great place to go for ideas. And if you have ideas for other broad task categories that can be added to the list, please submit a pull request!

Symphony of Destruction: Structs, Classes and the GC (Part One)

Digital Mars D logo

This post is part of a broader series on garbage collection in D. The motivation is to explore how destructors and the GC interact. To do that, we first need a bit of background. We do not go into a broader discussion on the ins and outs of object destruction, only what is most relevant to the interaction of destructors and the GC.

I’ve split the discussion into two blog posts. Here in Part One, we look at how deterministic and non-deterministic destruction differ, consider the consequences of having a single destructor for both scenarios, and finally establish two simple guidelines that will help us avoid those consequences. In Part Two, we’ll go further and explore how we can still write solid destructors when circumstances dictate that the guidelines don’t apply.

Deterministic destruction

Destruction is deterministic when it is predictable, meaning the programmer can, simply by following the flow of the code, point out where and when an object’s destructor is invoked. This is possible with struct instances allocated on the stack, as the compiler will insert calls to their destructors at well-defined points for automatic and deterministic destruction.

There are two basic rules for automatic destruction:

  1. The destructors of all stack-allocated structs in a given scope are invoked when the scope exits.
  2. Destructors are invoked in reverse lexical order (i.e., the opposite of the order in which the declarations appear in the source).

With these two rules in mind, we can examine the following example and accurately predict its output.

import std.stdio;
struct Predictable
{
    int number;
    this(int n)
    {
        writeln("Constructor #", n);
        number = n;
    }
    ~this()
    {
        writeln("Destructor #", number);
    }
}

void main()
{
    Predictable s0 = Predictable(0);
    {
        Predictable s1 = Predictable(1);
    }
    Predictable s2 = Predictable(2);
}

We see that both s0 and s2 are directly within the scope of the main function, so their destructors will run when main exits. Given that the declaration of s2 comes after that of s0, the destructor of s2 will run before that of s0.

We also see that s1 is declared in an anonymous inner scope between the declarations of s0 and s2. This scope exits before s2 is constructed, so the destructor of s1 will execute before the constructor of s2.

With that, we can expect the following output:

Constructor #0      // declaration of s0
Constructor #1      // declaration of s1
Destructor #1       // anonymous scope exits, s1 destroyed
Constructor #2      // declaration of s2
Destructor #2       // main exits, s2 then s0 destroyed
Destructor #0

Compiling and executing the example proves us accurate seers.

The programmer can implement deterministic destruction manually, as is necessary when destroying instances allocated on the non-GC heap, e.g., with malloc or std.experimental.allocator. In an earlier post, Go Your Own Way (Part Two: The Heap), I covered how to use std.conv.emplace to allocate instances on the non-GC heap and briefly mentioned that destructors can be invoked manually via destroy. That’s a function template declared in the automatically imported object module so that it’s always available. We won’t retread the allocation discussion, but an example of manual destruction isn’t out of bounds for this post.

In the following example, we’ll reuse the definition of the Predictable struct and a destroyPredictable function to manually invoke the destructors. For completeness, I’ve included functions for allocating and deallocating Predictable instances from the non-GC heap: allocatePredictable and deallocatePredictable. If it isn’t clear to you what these two functions are doing, please read the blog post I mentioned above.

void main()
{
    Predictable* s0 = allocatePredictable(0);
    scope(exit) { destroyPredictable(s0); }
    {
        Predictable* s1 = allocatePredictable(1);
        scope(exit) { destroyPredictable(s1); }
    }
    Predictable* s2 = allocatePredictable(2);
    scope(exit) { destroyPredictable(s2); }
}

void destroyPredictable(Predictable* p)
{
    if(p) {
        destroy(*p);
        deallocatePredictable(p);
    }
}

Predictable* allocatePredictable(int n)
{
    import core.stdc.stdlib : malloc;
    import std.conv : emplace;
    auto p = cast(Predictable*)malloc(Predictable.sizeof);
    return emplace!Predictable(p, n);
}

void deallocatePredictable(Predictable* p)
{
    import core.stdc.stdlib : free;
    free(p);
}

Running this program will result in precisely the same output as the previous example. In the destroyPredictable function, we dereference the struct pointer when calling destroy because there is no overload that takes a pointer. There are specializations for classes, interfaces, and structs passed by reference and a general catch-all that takes all other types by reference. Destructors are invoked on types that have them. Before exiting, the function sets the argument to its default .init value through the reference.

Note that if we were to give destroy a pointer without first dereferencing it, the code would still compile. The pointer would be accepted by reference and simply set to null, the default .init value for pointers, but the struct’s destructor would not be invoked (i.e., the pointer is “destroyed”, not the struct instance).

Inserting writeln(*p) immediately after destroy(*p) should print

Predictable(0)

for each destroyed instance. (The default .init state for a struct in D is the aggregate of the .init property of each of its members; in this case, the sole member, being of type int, has an .init property of 0, so the struct’s default .init state is Predictable(0). This can be changed in the struct definition, e.g., struct Predictable { int id = 1; }.)

destroy is not restricted to instances allocated on the non-GC heap. Any aggregate type instance (struct, class, or interface) is a valid argument no matter where it was allocated.

Non-deterministic destruction

In languages with support for objects and a garbage collector, the responsibility for destroying object instances allocated on the GC heap falls to the GC. This is known as finalization. Before reclaiming an object’s memory, the GC finalizes the object by invoking its finalizer.

Finalization, though convenient, comes with a price. In Java’s particular circumstances, the price was determined to be so high that its maintainers deprecated the Object.finalize method and left a scary warning about its use in the documentation. It’s worth quoting here:

The finalization mechanism is inherently problematic. Finalization can lead to performance issues, deadlocks, and hangs. Errors in finalizers can lead to resource leaks; there is no way to cancel finalization if it is no longer necessary; and no ordering is specified among calls to finalize methods of different objects. Furthermore, there are no guarantees regarding the timing of finalization. The finalize method might be called on a finalizable object only after an indefinite delay, if at all.

Finalization in D isn’t quite the bugbear it is in Java, but we do see a less dramatic warning about it in the D documentation:

The garbage collector is not guaranteed to run the destructor for all unreferenced objects. Furthermore, the order in which the garbage collector calls destructors for unreferenced objects is not specified.

Although there’s no mention of “finalization” or “finalizers” here, that’s precisely what the text is referring to. The core message is the same in both warnings: finalization is non-deterministic and cannot be relied on.

Unlike structs, classes in D are reference types by default. Some consequences: the programmer never has direct access to the underlying class instance; instances declared uninitialized are null by default; the normal use case is to allocate instances via new. When a class is instantiated in D, it is usually going to be managed by the GC and its destructor will serve as a finalizer.

As an experiment, let’s change the definition of struct Predictable in our first example to class Unpredictable and use new to allocate the instances like so:

import std.stdio;
class Unpredictable
{
    int number;
    this(int n)
    {
        writeln("Constructor #", n);
        number = n;
    }
    ~this()
    {
        writeln("Destructor #", number);
    }
}

void main()
{
    Unpredictable s0 = new Unpredictable(0);
    {
        Unpredictable s1 = new Unpredictable(1);
    }
    Unpredictable s2 = new Unpredictable(2);
}

We’ll see that the output is drastically different:

Constructor #0
Constructor #1
Constructor #2
Destructor #0
Destructor #1
Destructor #2

Anyone familiar with the characteristics of the default DRuntime constructor can predict for this very simple program that all the destructors will be run when the GC’s cleanup function is executed as the D runtime shuts down, and that they will be executed in the order in which they were declared (an implementation detail; and note that destruction at shut down can be disabled via a command line argument). But in a more complex program, this ability to predict breaks down. Destructors can be invoked by the GC at almost any time and in any order.

To be clear, the GC will only perform its cleanup duties if and when it finds more memory is needed to fulfill a specific allocation request. In other words, it isn’t constantly running in the background, marking objects unreachable and calling destructors willy nilly. To that extent, we can predict when the GC has the possibility to perform its duties. Beyond that, all bets are off. We cannot predict with accuracy if any destructors will be invoked during any given allocation request or the order in which they will be invoked. This uncertainty has ramifications for how one implements destructors for any GC-managed type.

For starters, destructors of GC-managed objects should never perform any operation that can potentially result in a GC allocation request. Attempting to do so can result in an InvalidMemoryOperationError at run time. I use the word “potentially” because some operations can indirectly cause the error in certain circumstances, but not in others. Some examples: attempting to index an associative array can trigger an attempt to allocate a RangeError if the key is not present; a failed assert will result in allocation of an AssertError; calling any function not annotated with @nogc means GC operations are always possible in the call stack. These and any such operations should be avoided in the destructors of GC-managed objects. (The first seven items on the list of operations disallowed in @nogc functions are collectively a good guide.)

A larger issue is that one cannot rely on any resource still being valid when a destructor is called by the GC. Consider a class that attempts to close a socket handle in its destructor; it’s quite possible that the destructor won’t be called until after the program has already shutdown the network interface. There is no scenario in which the runtime can catch this. In the best case, such circumstances will result in a silent failure, but they could also result in crashes during program shutdown or even sooner.

What it comes down to is that GC-allocated objects should never be used to manage any resource that depends upon deterministic destruction for cleanup.

Designing for destruction

For the D neophyte, it can appear as if destructors in D are useless. Given that both struct and class instances can be allocated from memory that may or may not be managed by the GC, that destructors of GC-managed objects are not guaranteed to run, and that destructors are forbidden to perform GC operations during finalization, how can we ever rely on them?

In practice, it’s not as bad as it may seem. Issues do arise for the unwary, but armed with a basic awareness of the nature of D destructors, it turns out that it’s pretty easy to avoid having problems. This is especially true if the programmer adopts two fairly simple rules.

1. Pretend class destructors don’t exist

Class instances will nearly always be allocated with new. That means their destructors will nearly always be non-deterministic. Much of what one would want to do in a destructor is somehow dependent on program state: either the destructor itself expects a certain state (like writing to a log file that is expected to be open), or the program expects the destructor to have modified the program state in a specific manner (like releasing a resource handle).

Non-deterministic destruction means that all expectations about program state are thrown out the window. That log file might have already been closed, so the message will never be written (I hope it wasn’t important). That system resource handle may never be released until the program ends (I hope that particular resource isn’t scarce). Even if it seems through testing that a class destructor is working the way it’s intended, it’s quite likely down to the fact that the testing has not uncovered the case where it breaks. In a long-running program, that case will inevitably pop up at some point. Have fun debugging when your production game server starts randomly crashing.

So when using classes in D, pretend they have no destructors. Pretend that they are Java classes with a deprecated finalize method.

2. Don’t allocate structs on the GC heap when they have destructors

Since we’re pretending classes have no destructors, then we’re going to turn to structs for all of our destructive needs. Allocating structs as value objects on the stack will cover many use cases, but sometimes we may need to allocate them on the heap. When that situation arises, do not allocate any destructor-bearing struct with new. If we allocate a struct that has a destructor on the GC heap, it completely defeats our purpose of avoiding class destructors in the first place. That destructor we intended to be deterministic is now non-deterministic, so we may as well have just used a class.

As we have seen, struct instances can be allocated on the non-GC heap (e.g., with malloc) and their destructors manually invoked with the destroy function. If we need deterministic destruction and we absolutely must have a heap-allocated struct, then we cannot allocate that struct on the GC heap.

Guidelines schmuidelines

I’m sure someone is reading the above guidelines and thinking, “If I have to pretend that classes have no destructors, then why do classes have destructors?” Well, you don’t have to.

There is no One True Path to follow when deciding if an object should be implemented as a class or struct. Personally, I will always prefer structs over classes, and I will only reach for a class if I need something structs can’t give me easily (like a hierarchy) or efficiently. Other people will consider if the object they need to represent has an identity, e.g., an Actor in a simulation versus the Vertex that defines its 3D coordinates. POD (Plain Old Data) types should always be structs, but beyond that it’s largely a matter of preference.

My two guidelines are based on my experience and that of others with whom I’ve spoken. They are intended to help you keep the full implications of D’s distinction between classes and structs at the forefront of your thoughts when architecting your program. They are not commandments that every D programmer must follow.

Realistically, most D programmers will encounter circumstances at one time or another in which the guidelines do not apply. For example, when mixing GC-managed memory and manually-managed memory in the same program, it’s quite possible for a struct intended for stack use to wind up on the GC heap if the programmer is unaware of the circumstances. And some D programmers will always prefer classes over structs because that’s just the way they want it, and so will simply choose to ignore the guidelines. That’s no problem as long as they fully understand the consequences.

So what does that mean? How do you get over the non-deterministic nature of class destructors if your Actor class absolutely must have a destructor, or if you prefer to always follow The Way of the Class? How do you prevent structs intended for the stack from being GC-allocated? These are things we’ll be looking at in Part Two. See you there.

Thanks to Ali Çehreli and Max Haughton for their feedback. And to Adam D. Ruppe for his conversation on the topic in Discord and the title suggestion (it fits in more nicely with the series than the ‘Appetite For Destruction’ I had intended to go with)