Making Of: LDC 1.0

Posted on

This is a guest post from Kai Nacke. A long-time contributor to the D community, Kai is the author of D Web Development and the maintainer of LDC, the LLVM D Compiler.


LDC has been under development for more than 10 years. From release to release, the software has gotten better and better, but the version number has always implied that LDC was still the new kid on block. Who would use a version 0.xx compiler for production code?

These were my thoughts when I raised the question, “Version number: Are we ready for 1.0?” in the forum about a year ago. At that time, the current LDC compiler was 0.15.1. In several discussions, the idea was born that the first version of LDC based on the frontend written in D should be version 1.0, because this would really be a major milestone. Version 0.18.0 should become 1.0!

Was LDC really as mature as I thought? Looking back, this was an optimistic view. At DConf 2015, Liran Zvibel from Weka.IO mentioned in his talk about large scale primary storage systems that he couldn’t use LDC because of bugs! Additionally, the beta version of 0.15.2 had some serious issues and was finally abandoned in favor of 0.16.0. And did I mention that I was busy writing a book about vibe.d?

Fortunately, over the past two years, more and more people began contributing to LDC. The number of active committers grew. Suddenly, the progress of LDC was very impressive: Johan added DMD-style code coverage and worked on merging the new frontend. Dan worked on an iOS version and Joakim on an Android version. Together, they made ARM a first class target of LDC. Martin and Rainer worked on the Windows version. David went ahead and fixed a lot of the errors which had occurred with the Weka code base. I spent some time working on the ports to PowerPC and AArch64. Uncounted small fixes from other contributors improved the overall quality.

Now it was obvious that a 1.x version was overdue. Shortly after DMD made the transition to the D-based frontend, LDC was able to use it. After the usual alpha and beta versions, I built the final release version on Sunday, June 5, and officially announced it the next day. Version 1.0 is shipping now!

Creating a release is always a major effort. I would like to say “Thank you!” to everybody who made this special release happen. And a big thanks to our users; your feedback is always a motivation to make the next LDC release even better.

Onward to 1.1!

Find Was Too Damn Slow, So We Fixed It

Posted on

This is a guest post from Andreas Zwinkau, a problem solving thinker, working as a doctoral researcher at the IPD Snelting within the InvasIC project on compiler and language perfection. He manages and teaches students at the KIT. With a beautiful wife and two jolly kids, he lives in Karlsruhe, Germany.


Please throw this hat into the ring as well, Andrei wrote when he submitted the winning algorithm. Let me tell you about this ring and how we made string search in D’s standard library, Phobos, faster. You might learn something about performance engineering and benchmarking. Maybe you’ll want to contribute to Phobos yourself after you read this.

The story started for me when I decided to help out D for some recreational programming. I went to the Get Involved wiki page. There was a link to the issue tracker for preapproved stuff and there I found issue 9646, which was about splitter being too slow in a specific case. It turned out that it was actually find, called inside splitter, which was slow. The find algorithm searches for one string (the needle) inside another (the haystack). It returns a substring of the haystack, starting with the first occurrence of the needle and ending with the end of the haystack. Find being slow meant that a naively implemented find (two nested for loops) was much faster than what the standard library provided.

So, let’s fix find!

Before I started coding, the crucial question was: How will I know I am done? At that moment, I only had one test case from the splitter issue. I had to ensure that any solution was fast in the general case. I wanted to fix issue 9646 without making find worse for everybody else. I needed a benchmark.

As a first step, I created a repository. It initially contained a simple program which compared Phobos’s find, a naive implementation from issue 9646, and a copy from Phobos I could modify and tune. My first change: insert the same naive algorithm into my Phobos copy. This was the Hello World of bugfixing and proved two things:

  1. I was working on the correct code. Phobos contained multiple overloads of find and I wanted to work on the right one. For example, there was an indirection which cast string into a ubyte array to avoid auto decoding.
  2. It was not an issue of meta programming. Phobos code is generic and uses D’s capabilities for meta programming. This means the compiler is responsible for specializing the generic code to every specific case. Fixing that would have required changing the compiler, but not the standard library.

At this point I knew which specific lines I needed to speed up and I had a benchmark to quickly see the effects of my changes. It was time to get creative, try things, and find a better find.

For a start, I tried the good old classic Boyer-Moore, which the standard library provides but wasn’t using for find. I quickly discarded it, as it was orders of magnitude slower in my benchmark. Gigabytes of data are probably needed to make that worthwhile with a modern processor.

I considered simply inserting the naive algorithm. It would have fixed the problem. On the other hand, Phobos contained a slightly more advanced algorithm which tried to skip elements. It first checks the end of the needle and, on a mismatch, we can advance the needle its whole length if the end element does not appear twice in the needle. This requires one pass over the needle initially to determine the step size. That algorithm looked nice. Someone had probably put some thought into it. Maybe my benchmark was biased? To be safe, I decided to fix the performance problem without changing the algorithm (too much).

How? Did the original code have any stupid mistakes? How else could you fix a performance problem without changing the whole algorithm?

One trick could be to use D’s meta programming. The code was generic, but in certain cases we could use a more efficient version. D has static-if, which means we could switch between the versions at compile time without any runtime overhead.

static if (isRandomAccessRange!Needle) {
   // new optimized algorithm
} else {
   // old algorithm
}

The main difference from the old algorithm was that we could avoid creating a temporary slice to use startsWith on. Instead, a simple for-loop was enough. The requirement was that the needle must be a random access range.

When I had a good version, the time was ripe for a pull request. Of course, I had to fix issues like style guide formatting before the pull request was acceptable. The D community wants high-quality code, so the autotester checked my pull request by running tests on different platforms. Reviewers checked it manually.

Meanwhile in the forum, others chimed in. Chris and Andrei proposed more algorithms. Since we had a benchmark now, it was easy to include them. Here are some numbers:

DMD:                       LDC:
std find:    178 ±32       std find:    156 ±33
manual find: 140 ±28       manual find: 117 ±24
qznc find:   102 ±4        qznc find:   114 ±14
Chris find:  165 ±31       Chris find:  136 ±25
Andrei find: 130 ±25       Andrei find: 112 ±26

You see the five mentioned algorithms. The first number is the mean slowdown compared to the fastest one on each single run. The annotated ± number is the mean absolute deviation. I considered LDC’s performance more relevant than DMD’s. You see manual, qznc, and Andrei find report nearly the same slowdown (117, 114, 112), and the deviation was much larger than the differences. This meant they all had roughly the same speed. Which algorithm would you choose?

We certainly needed to pick one of the three top algorithms and we had to base the decision on this benchmark. Since the numbers were not clear, we needed to improve the benchmark. When we ran it on different computers (usually an Intel i5 or i7) the numbers changed a lot.

So far, the benchmark had been generating a random scenario and then measuring each algorithm against it. The fastest algorithm got a speed of 100 and the others got higher numbers which measured their slowdown. Now we could generate a lot of different scenarios and measure the mean across them for each algorithm. This design placed a big responsibility on the scenario generator. For example, it chose the length of the haystack and the needle from a uniform distribution within a certain range. Was the uniform distribution realistic? Were the boundaries of the range realistic?

After discussion in the forum, it came down to three basic use cases:

  1. Searching for a few words within english text. The benchmark has a copy of ‘Alice in Wonderland’ and the task is to search for a part of the last sentence.
  2. Searching for a short needle in a short haystack. This corresponds to something like finding line breaks as in the initial splitter use case. This favors naive algorithms which do not require any precomputation or other overhead.
  3. Searching in a long haystack for a needle which it doesn’t contain. This favors more clever algorithms which can skip over elements. To guarantee a mismatch, the generator inserts a special character into the needle, which we do not use to generate the haystack.
  4. Just for comparison, the previous random scenario is still measured.

At this point, we had a good benchmark on which we could base a decision.

Please throw this hat into the ring as well.

Andrei found another algorithm in his magic optimization bag. He knew the algorithm was good in some cases, but how would it fare in our benchmark? What were the numbers with this new contender?

In short: Andrei’s second algorithm completely dominated the ring. It has two names in the plot: Andrei2 as he posted it and A2Phobos as I generalized it and integrated it into Phobos. In the plots you see those two algorithms always close to the optimal result 100.

It was interesting that the naive algorithm still won in the ‘Alice’ benchmark, but the new Phobos was really close. The old Phobos std was roughly twice as slow for short strings, which we already knew from issue 9646.

What did this new algorithm look like? It used the same nested loop design as Andrei’s first one, but it computed the skip length only on demand. This meant one more conditional branch, but modern branch predictors seem to handle that easily.

Here is the final winning algorithm. The version in Phobos is only slightly more generic.

T[] find(T)(T[] haystack, T[] needle) {
  if (needle.length == 0) return haystack;
  immutable lastIndex = needle.length - 1;
  auto last = needle[lastIndex];
  size_t j = lastIndex, skip = 0;
  while (j < haystack.length) {
    if (haystack[j] != last) {
      ++j;
      continue;
    }
    immutable k = j - lastIndex;
    // last elements match, check rest of needle
    for (size_t i = 0; ; ++i) {
      if (i == lastIndex)
        return haystack[k..$]; // needle found
      if (needle[i] != haystack[k + i])
        break;
    }
    if (skip == 0) { // compute skip length
      skip = 1;
      while (skip < needle.length &&
             needle[$-1-skip] != needle[$-1]) {
        ++skip;
      }
    }
    j += skip;
  }
  return haystack[$ .. $];
}

Now you might want to run the benchmark yourself on your specific architecture. Get it from Github and run it with make dmd or make ldc. We are still interested in results from a wide range of architectures.

For me personally, this was my biggest contribution to D’s standard library so far. I’m pleased with the community. I deserved all criticism and it was professionally expressed. Now we can celebrate a faster find and fix the next issue. If you want to help, the D community will welcome you!

Core Team Update: Vladimir Panteleev

Posted on

When Walter Bright unleashed the first public release of DMD on the world, the whole release and maintenance process surrounding the language and the compiler was largely a one man show. Today, there are a number of regular contributors at every step of the process, but at the heart, in addition to Walter, are three core team members who keep the engine running. One of them is Vladimir Panteleev.

Vladimir maintains a number of D projects that have proven useful to D users over the years. Today’s entry focuses on one that had as much of an impact as any other D project you could name at the time upon its release, if not more.

If the name DFeed isn’t familiar to you, it’s the software that powers the D Forums. Posts pop up now and again from new users (and even some old ones!) asking for features like post editing or deletion, or other capabilities that are common in forum software found around the web. What they don’t realize is that DFeed is actually not the traditional sort of forum package they may be familiar with.

Go back a few years and the primary vehicle powering D discussions was an NNTP server that most people accessed via newsreaders. Over time, there were additional interfaces added, as Vladimir points out:

D only had an NNTP news server, a mailing list connected to it, and two shoddy PHP web interfaces for the NNTP server, both somehow worse than the other. A lot of people were asking for a regular web forum, such as phpBB. However, Walter would always give the same answer – they were inferior in many aspects to NNTP, thus he wouldn’t use them. What’s the point of an official forum if the community leaders don’t visit them?

Vladimir decided to take the initiative and do something about it. He took a project he had already been working on and enhanced it.

The program was originally simply an IRC bot (and called “DIrcFeed”), which announced newsgroup posts, GitHub activity, Wikipedia edits and such to the #d channel on FreeNode. Adding a local message cache and a web interface thus turned it into a forum.

And so DFeed was born. Walter announced that it had gone live on Valentine’s Day 2012. It even managed to get a positive reddit thread behind it, which was a big deal for D in those days. The NNTP server and the mailing list interface to it are still alive and well, but it’s probably a safe assumption that a number of users eventually ditched their newsreaders for DFeed’s web interface and never looked back. I know I did.

So if you’ve ever wondered why you can’t delete your posts in the D Forums, you now have the reason. DFeed does not have the authoritative database. That belongs to the NNTP server. To illustrate two major problems this brings about, consider the idea of adding Markdown support to DFeed.

First, people using NNTP/email won’t see the rendered versions. Which isn’t a big deal by itself since it’s just text, but does create feature imparity. It *is* possible to write Markdown that looks fine when rendered but is unreadable in source form, especially with some common extensions such as GitHub Flavored Markdown.

Second, unless we’re careful with this, people using NNTP/mailing lists might trigger Markdown formatting that could make their post unreadable. This could be avoided, though, by only rendering messages with Markdown if they originate from the web interface, which allows previewing posts.

Even so, he’s hoping to add support for Markdown at some point in the future. Another enhancement he’s eyeing is optional delayed NNTP/email posting.

A lot of younger people communicate mainly through web forums, and are very used to being able to edit their post after sending it. This is not supported on forum.dlang.org for the simple reason that you can’t unsend an email. One solution to this problem is to save all messages sent from the web interface locally, but delay their propagation to NNTP/email for a few minutes. This creates a window during which the message can still be edited, or even deleted.

Other features coming in a future update are OAuth support, a thread overview widget, and performance improvements. He says DFeed isn’t as fast as it used to be. He wants to change that. Don’t look for the new version of DFeed too soon, though. Right now, Vladimir’s attention is turned in directions that he believes will have a bigger impact on D. As soon as I know what that means, you’ll read about it here.

In the meantime, if you have any ideas on how to improve the forums, keeping in mind that any new features have to play nice with both the NNTP server and the mailing list, don’t hesitate to bring it up.

Thanks to Vladimir for taking the time to provide me with an update and for maintaining DFeed over the years. Here’s to many more!

The D Website and You

Posted on

For those who have been around the D community for a long time, it’s all too easy to look at the website we have today and think how much better it is than anything we’ve had in the past. It’s miles ahead. Unfortunately, that perspective doesn’t lend itself well to recognizing actual problems that newcomers might face when visiting the site for the first time. Their point of reference is quite often the current website of another language, making their perspective on what does and doesn’t work very different.

It goes without saying that D is primarily a community-driven language. It doesn’t have a large company with a dedicated team of paid workers pushing its development. The website is one of the areas where this has a major impact. Its quality is almost entirely dependent on community contributions. Quite often, it’s recent additions to the community, those who have fresh eyes, who step up and push for improvements to the site. Several have been implemented in the past few weeks, thanks to community members who saw a void and took the initiative to get it filled.

André Stein put together an interactive tour of the language and set it up online. Now, that is part of the official site as tour.dlang.org, available from the menu bar at the top of most dlang.org pages by clicking on Learn. This is a tremendous improvement over what existed before, when users had to browse through different link categories to find the resources they needed, none of which were such a quick introduction to the language.

Sebastian Wilzbach has initiated a number of additions and improvements to the website in the short time he has been active in the community. One such addition is a page listing a number of organizations currently using D. Before, this information was only available on the Wiki. Now, it’s a first-class citizen of dlang.org.

Sebastian also put forth a suggestion for a major change in how the website is deployed. Previously, changes to the site had to be manually deployed, a process which caused a delay between when the changes were made and when they became visible. This led to a situation where news on the front page could become horribly outdated. Sebastian’s suggestion was approved and now updates to the site are deployed automatically when they are merged.

Another user, Ozan Nurettin Süel, was looking for the D Foundation website and coming up empty. A post about it in the forum led to a new page being added to the site. Anyone who wants to learn about the Foundation and, once memberships are open, how to join now has somewhere to go for that information under the dlang.org umbrella.

As Andrei said in his DConf keynote this year, the first five minutes become the next five years. The website plays a major role in that first five minutes. It’s up to us as a community to ensure that it meets the needs of potential new users instead of getting in their way. Small changes can often have a big impact. Whether you are an old timer or a newcomer, if you see something that is missing from the site, or have an idea for how to improve it, please let us know. If you notice broken links, incorrect text, or other such issues, please report them. Only then can dlang.org evolve to fully meet the needs of those coming to it for the first time.

Recent D Foundation Activities

Posted on

Welcome to the official D Blog!

My name is Michael Parker. With the launching of The D Blog, I’ve been fortunate enough to land the role of Blog Author, which means I’ll be making most of the posts around here. This blog will become your source for DLang news, inside looks at projects in the D community, and updates on happenings behind the scenes of language development and at the D Foundation. Consider it a complement to Adam Ruppe’s excellent weekly summary of D goings on, This Week in D. For this inaugural post, I’ve got three news items to pass along about some of the ways the D Foundation is working to promote D.

Lately, Andrei Alexandrescu has been doing a lot of talking in places other than conference stages. For one, he’s been having discussions with the Tech Lounge Foundation in Romania, an organization which, among other endeavors, works to help CS and engineering students and aspiring professionals move beyond their education. Specifically, the D Foundation is interviewing recent graduates and current graduate-level students to explore ideas for high-impact academic and industrial D projects. Not only could this result in a killer project or two for D, it could bring new faces into the community.

A second line of communication is active between Andrei and the University “Politehnica” of Bucharest, where he has had talks with the university’s Vice President Corneliu Burileanu and Director of the Telecommunications Department Eduard-Cristian Popovici. The focus of these discussions has been on finding opportunities for the university and the D Foundation to collaborate. Potential areas of cooperation in this realm include D-centric courses and joint projects developed with D.

Moving out of Romania and up into Scandanavia, Andrei was scheduled to host a D workshop on June 6 at NDC Oslo. That has now been canceled because of a scheduling conflict. Instead, he will be presenting a day-long consulting session at Cisco Norway. He will also be presenting a talk on D during lunch. All proceeds from the event will go straight to the D Foundation.

If you have any feedback on the blog theme, I would love to see it over in the forums. Be sure to keep an eye on this space, or subscribe to the feed, so you can stay up to date. And if you maintain a D project large or small, be on the lookout for an email from me. I’m getting ready to pick a target for the D Blog’s first project highlight.