Inside D Version Manager

In his day job, Jacob Carlborg is a Ruby backend developer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and the topic of this post, DVM. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.


D Version Manager (DVM) is a cross-platform tool that allows you to easily download, install and manage multiple D compiler versions. With DVM, you can select a specific version of the compiler to use without having to manually modify the PATH environment variable. The selected compiler is specific to each shell session, and it’s possible to configure a default compiler.

The main advantage of DVM is the easy downloading and installation of different compiler versions. Specify the version of the compiler you would like to install, e.g. dvm install 2.071.1, and it will automatically download and install that version. Then you can tell DVM to use that version by executing dvm use 2.071.1. After that, you can invoke the compiler as usual with dmd. The selected compiler version will persist until the end of the shell session.

DVM makes it possible for the user to select a specific compiler version without having to modify any makefiles or build scripts. It’s enough for any build script to refer to the compiler by name, i.e. dmd, as long as the user selects the compiler version with DVM before invoking the script.

History

DVM was created at the beginning of 2011. That was a different time for D. No proper installers existed, D1 was still a viable option, and each new release of DMD brought with it a number of regressions. Because of all the regressions, it was basically impossible to use the latest compiler, or often any single older compiler, for all of your projects. Taking into consideration projects from other developers, some were written in D1 and some in D2, making it inconvenient to have only one compiler version installed.

It was for these reasons I created DVM. Being able to have different versions of the compiler active in different shell sessions makes it easy to work on different projects requiring different versions of the compiler. For example, it was possible to open one tab for a D1 compiler and another for a D2 compiler.

The concept of DVM comes directly from the Ruby tool RVM. Where DVM installs D compilers, RVM installs Ruby interpreters. RVM can do everything DVM can do and a lot more. One of the major things I did not want to copy from RVM is that it’s completely written in shell script (bash). I wanted DVM to be written in D. Because it’s written in shell script, RVM enables some really useful features that DVM does not support, but some of them are questionable (some might call them hacks). For example, when navigating to an RVM-enabled project, RVM will automatically select the correct Ruby interpreter. However, it accomplishes this by overriding the built-in cd command. When the command is invoked, RVM will look in the target directory for one of the files .rvmrc or .ruby-version. If either is present, it will read that file to determine which Ruby interpreter to select.

Implementation and Usage

One of the goals of DVM was that it should be implemented in D. In the end, it was mostly written in D with a few bits of shell script. Note that the following implementation details are specific to the platforms that fall under D’s Posix umbrella, i.e. version(Posix), but DVM is certainly available for Windows with the same functionality.

Structure of the DVM Installation

Before DVM can be used, it needs to install itself. This is accomplished with the command, dvm install dvm. This will create the ~/.dvm directory. It contains the following subdirectories: archives, bin, compilers, env and scripts.

archives contains a cache of downloaded zip archives of D compilers.

bin contains shell scripts, acting as symbolic links, to all installed D compilers. The name of each contains the version of the compiler, e.g. dmd-2.071.1, making it possible to invoke a specific compiler without first having to invoke the use command. This directory also contains one shell script, dvm-current-dc, pointing to the currently active D compiler. This allows the currently active D compiler to be invoked without knowing which version has been set. This can be useful for executing the compiler from within an editor or IDE, for example. A shell script for the default compiler exists as well. Finally, this directory also contains the binary dvm itself.

The compilers directory contains all installed compilers. All of the downloaded compilers are unpacked here. Due to the varying quality of the D compiler archives throughout the years, the install command will also make a few adjustments if necessary. In the old days, there was only one archive for all platforms. This command will only include binaries and libraries for the current platform. Another adjustment is to make sure all executables have the executable permission set.

The env directory contains helper shell scripts for the use command. There’s one script for each installed compiler and one for the default selected compiler.

The scripts directory currently only contains one file, dvm. It’s a shell script which wraps the dvm binary in the bin directory. The purpose of this wrapper is to aid the use command.

The use Command

The most interesting part of the implementation is the use command, which selects a specific compiler, e.g. dvm use 2.071.1. The selection of a compiler will persist for the duration of the shell session (window, tab, script file).

The command works by prepending the path of the specified compiler to the PATH environment variable. This can be ~/.dvm/compilers/dmd-2.071.1/{platform}/bin for example, where {platform} is the currently running platform. By prepending the path to the environment variable, it guarantees the selected compiler takes precedence over any other possible compilers in the PATH. The reason the {platform} section of the path exists is related to the structure of the downloaded archive. Keeping this structure avoids having to modify the compiler’s configuration file, dmd.conf.
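As a rough illustration of the prepending step, here is a minimal sketch in D. It is not DVM's actual code: the function name, its parameters, and the example home directory and platform name are all assumptions made for the example.

```d
import std.array : join, split;
import std.path : buildPath, pathSeparator;
import std.stdio : writeln;

/// Returns `currentPath` with the selected compiler's bin directory first.
/// The parameter names are illustrative, not taken from the DVM sources.
string prependCompiler(string currentPath, string dvmHome,
                       string compilerName, string platform)
{
    string binDir = buildPath(dvmHome, "compilers", compilerName, platform, "bin");

    // Keep the existing entries, dropping empty ones and any earlier copy of binDir.
    string[] entries = [binDir];
    foreach (entry; currentPath.split(pathSeparator))
    {
        if (entry.length != 0 && entry != binDir)
            entries ~= entry;
    }
    return entries.join(pathSeparator);
}

void main()
{
    // Hypothetical home directory and platform name, for illustration only.
    writeln(prependCompiler("/usr/local/bin" ~ pathSeparator ~ "/usr/bin",
                            "/home/me/.dvm", "dmd-2.071.1", "osx"));
}
```

Because the new entry comes first, a lookup of dmd stops at the selected compiler before it reaches any system-wide installation.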

The interesting part here is that it’s not possible to modify the environment variables of the parent process, which in this case is the shell. The magic behind the use command is that the dvm command that you’re actually invoking is not the D binary; it’s the shell script in the ~/.dvm/scripts path. This shell script contains a function called dvm. This can be verified by invoking type dvm | head -n 1, which should print dvm is a function if everything is installed correctly.

The installation of DVM adds a line to the shell initialization file, .bashrc, .bash_profile or similar. This line will load/source the DVM shell script in the ~/.dvm/scripts path which will make the dvm command available. When the dvm function is invoked, it will forward the call to the dvm binary located in ~/.dvm/bin/dvm. The dvm binary contains all of the command logic. When the use command is invoked, the dvm binary will write a new file to ~/.dvm/tmp/result and exit. This file contains a command for loading/sourcing the environment file available in ~/.dvm/env that corresponds to the version that was specified when the use command was invoked. After the dvm binary has exited, the shell script function takes over again and loads/sources the result file if it exists. Since the shell script is loaded/sourced instead of executed, the code will be evaluated in the current shell instead of a sub-shell. This is what makes it possible to modify the PATH environment variable. After the result file is loaded/sourced, it’s removed.
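The binary's half of that hand-off could look roughly like the sketch below. Only the directory names come from the description above; the function name and the exact contents of the result file are assumptions, and the shell function that later sources and deletes the file is not shown.

```d
import std.file : mkdirRecurse, readText, rmdirRecurse, tempDir, write;
import std.path : buildPath;
import std.stdio : writeln;

/// What the binary's side of `use` might do: leave a one-line script behind.
/// The file contents are an assumption; only the locations come from the text.
string writeUseResult(string dvmHome, string compilerName)
{
    string tmpDir = buildPath(dvmHome, "tmp");
    mkdirRecurse(tmpDir);

    string envFile = buildPath(dvmHome, "env", compilerName);
    string resultFile = buildPath(tmpDir, "result");

    // "." is the POSIX spelling of `source`; the shell function would
    // evaluate this line in the current shell and then delete the file.
    write(resultFile, ". \"" ~ envFile ~ "\"\n");
    return resultFile;
}

void main()
{
    // Work in a scratch directory instead of a real ~/.dvm.
    string home = buildPath(tempDir, "dvm-sketch");
    scope (exit) rmdirRecurse(home);

    string resultFile = writeUseResult(home, "dmd-2.071.1");
    writeln(readText(resultFile));
}
```

The point of the detour through a file is that the child process never touches the shell's environment; it only leaves instructions that the shell itself carries out.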

If you find yourself with the need to build your D project(s) with multiple compiler versions, such as the current release of DMD, one or more previous releases, and/or the latest beta, then DVM will allow you to do so in a hassle-free manner. Pull up a shell, execute use on the version you want, and away you go.

The Origins of the D Cookbook

Adam Ruppe is the author of D Cookbook and maintainer of This Week in D. Modules from his freely available arsd package are used throughout the D community. He is also known for his legendary DConf 2014 presentation.


In 2013, Packt Publishing approached me to write D Cookbook. Over the next several months, I was tasked with collecting over 100 D programming language “recipes” and writing a couple of pages about each one.

I wanted it to be a mix of practical and fun examples of what D can do over the wide variety of fields in which I have used it, each one illustrative of some language concept that the reader could use in different contexts. As such, for the majority of the book, I avoided the use of libraries, even keeping the presence of Phobos or my own modules to a minimum, allowing me to focus on the implementations of the ideas.

That said, I didn’t come up with a hundred recipes out of thin air! They came from three main sources: my arsd libraries; my support efforts on the D forums, the #D IRC channel, and Stack Overflow (it was Stack Overflow that got Packt’s attention in the first place); and a few other side projects, like my minimal D runtime (zip file).

I first saw D way back in 2002. At the time, I was still fairly new to programming and was going to download another copy of Digital Mars C++ (which I used to compile 16-bit DOS programs!) when I saw a link on the Digital Mars website about this D programming language. My first impression – upon seeing the import keyword – was that it was too Java-like and I wasn’t interested. Oh, how I regret that naive snap decision! I didn’t take my next look at D until 2006, and that time, with far more real world experience and theoretical knowledge under my belt, I quickly fell in love.

Over the next year, I wrote a game library based on SDL and OpenGL and made a few little games for myself. My D code at this point wasn’t terribly unique. Much of it was based on my existing C++ codebases. That fairly uninteresting fact would later become one of my most recurring D tips: a great deal of your C, C++, and Java knowledge can carry over directly to D! Thanks to the similarities between the languages, it’s easy to learn. While your code is unlikely to work perfectly if simply copy/pasted into a .d file, porting it probably isn’t hard. I found the D language very comfortable right off the bat. Even today, I tell people with D questions to simply consider how they would do it in C++ and apply that existing knowledge to D. This can also work with C and C++ solutions gleaned from Internet resources like Stack Overflow.

I started working as a full-time web developer in 2008, which put a time constraint on my hobby game development efforts, but didn’t ice my love of D. Indeed, it wasn’t long at all before I worked D into my professional life by writing web libraries and eventually transitioning my work projects away from PHP and onto D!

At the same time, the D language was going through a series of rapid changes. Templates and compile time function evaluation got overhauled, immutable was introduced… Most interestingly to me, compile time reflection got massively expanded and a few features like opDispatch got added. Compile time reflection in older D was limited to the is expression and template tricks, but new D had __traits, making it competitive even with dynamic languages, without sacrificing D’s other strengths.
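For readers who have not met these features, here is a small sketch of the three mechanisms named above. The User and Bag types are made up for the example and do not come from the arsd libraries.

```d
import std.stdio : writeln;

struct User
{
    string name;
    int age;
}

// The older style: an `is` expression answering a yes/no question about a type.
static assert(is(typeof(User.age) == int));

// __traits hands back the member names themselves, at compile time.
enum userMembers = [__traits(allMembers, User)];

/// opDispatch turns unknown member names into template arguments.
struct Bag
{
    string[string] values;

    string opDispatch(string key)()
    {
        return values.get(key, "");
    }
}

void main()
{
    writeln(userMembers);

    Bag bag;
    bag.values["language"] = "D";
    writeln(bag.language); // rewritten to bag.opDispatch!"language"()
}
```

Everything here is resolved by the compiler: the member list is an ordinary constant by the time the program runs, and the opDispatch call is a normal, statically typed function call.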

I was a very early adopter of these features and set out to discover how to combine them in ways to make my work easier, to compete with the other web languages, and to just show off a little 🙂 If someone came on the forum and said “D can’t do this”, then I’d reply “Challenge accepted.”

In the following years, I wrote: a DOM and JSON library, taking the most interesting ideas from JavaScript into D; a web framework, realizing some dreams I had in rapidly churning out prototypes that could also survive the change process of real world development; OAuth and database libraries to support the needs of the projects; and more. One of the most interesting modules was my web.d, which takes a D class definition and builds web infrastructure around it: an automatically generated web site with CRUD forms from the static type information as well as JavaScript and PHP code for API access to that functionality. This stretched D’s reflection capabilities and hit quite a few bugs on the bleeding edge, necessitating creative workarounds or alternative approaches. If I was really desperate, I’d fix the bug in DMD myself!

At the same time, I was regularly seen in the D community, helping other people with their problems, and, of course, reading insights from other members of the community. Every few days, I had another tip, and was also building up a mental picture of people’s common difficulties with D.

By 2013, I had years of experience with almost every corner and combination of D. Now, you can get a good slice of that in just a few days by reading the book!

Programming in D: A Happy Accident

This is a guest post from Ali Çehreli, who not only uses D as a hobby, but also gets to use it as an employee of Weka.io. He is the author of Programming in D and is frequently found in the D Learn forum with ready answers to questions on using the language. He also is an officer of the D Foundation.


I consider the book Programming in D a happy accident, because initially it was not intended to be a book.

I used to be a frequent contributor to Turkish C and C++ forums. Although there were many smart and motivated members on those forums, most of them were not well-versed enough to follow programming resources in English. If they were patient enough to wait about ten years and if a publisher decided to have them translated, then they might get their hands on Turkish versions of their favorite books.

In 2009, around the time when my interest in C++ had started to diminish, I read with great excitement Andrei Alexandrescu’s The Case for D article in ACCU’s C Vu magazine (also available at Dr. Dobb’s). To a person coming from a C++ background, D was a fresh breath of air, removing some of C++’s warts and bringing many new features, some unique, some borrowed from other languages.

I was instantly hooked. I immediately created the Turkish D site ddili.org, translated Andrei’s article to Turkish, and published it there. One of the reasons for my excitement was the potential that D could be one software technology that Turkish programmers would not be left in the dark about. Since D was still being designed and implemented, there was time to write fresh Turkish documentation for it. I translated other D articles and started writing an HTML tutorial that would later become the book.

I knew very well that attempting to teach a topic is one of the best ways of learning that topic. I knew that I would be learning D myself. Little did I know then that this project would make me a better software engineer in general as well.

Teaching programming is a notoriously difficult task. According to some academic papers I found when I started the tutorial, one of the difficulties comes from the fact that different people model new concepts in their minds in different ways, rendering particular teaching methods inefficient at least for some students. Encouraged by the lack of one correct way of teaching programming, I picked one that was the easiest for me: introducing concepts in linear fashion with as few forward references as possible, starting with the most basic concepts like the = character confusingly meaning something different than “is equal to”.

Starting from the basics made it necessary for me to introduce lower-level concepts before higher-level concepts. For example, although the foreach statement is much more commonly used in practice, while, for, and foreach statements are introduced in the book in that order. I think that choice created a better foundation for the reader.

It took me two years to finish writing a flow of chapters from the assignment operator all the way to the garbage collector. It was very challenging and very rewarding to find a natural flow of presentation not only throughout the book but also within each chapter. The method I used for each chapter was to think about the presentation of the topic along with non-trivial examples beforehand, without touching the computer. I then wrote the chapter fairly quickly without much attention to detail, put it aside for a couple of days, then came back to review it from the point of view of a reader. I repeated that process perhaps five to ten times for each chapter until I thought it was fairly acceptable. Likely as a result of that process, a common piece of feedback I receive is about how to-the-point my writing style is.

Based on feedback from the Turkish community and encouragement from Andrei Alexandrescu, I started translating the book to English in early 2011. The translation continued along with new chapter additions, many corrections, and some chapter rewrites.

I made a PDF version available in January 2012 and the translation was finally completed in July 2014. Not only had I achieved my initial goal of providing fresh Turkish documentation for D, this book might have been the first software resource that was translated in the other direction.

I readily agreed with the suggestion that the book should be available in paper form as well. That decision brought many different challenges related to self-publishing like layout, cover design, pricing, the printing company, etc. The first print publication was in August 2015. Surprisingly, producing an ebook version turned out to be even more challenging. In addition to different kinds of layout issues, all ebook formats require special attention.

I am awestruck that my humble idea of a humble tutorial turned into a well known resource in the D ecosystem. It makes me very happy that people actually find the book useful. I am also happy that, periodically, people express interest in translating it to other languages. As of this writing, in addition to the completed Turkish and English versions, there are ongoing translations by volunteers to French and Chinese (German, Korean, Portuguese, and Russian translations were started but not continued).

As for future directions, I would like to add more chapters; definitely one on allocators once they’re added to the standard library (they currently live in the std.experimental.allocator package).

One thing that bothers me about the book is that most code samples don’t take full advantage of D’s universal function call syntax (UFCS), mainly because that feature was added to the language only after most of the book was already written. I would like to move the UFCS chapter to an earlier point in the book so that more code samples can be in the idiomatic D style.
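To show what the difference looks like in practice, here is a small sketch, not taken from the book, of the same computation written twice: once with nested calls and once as a UFCS chain. The function names are invented for the example.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : iota;
import std.stdio : writeln;

/// Squares of the even numbers below `limit`, written as nested calls.
int[] evenSquaresNested(int limit)
{
    return array(map!(n => n * n)(filter!(n => n % 2 == 0)(iota(limit))));
}

/// The same computation as a UFCS chain, read left to right.
int[] evenSquaresUfcs(int limit)
{
    return iota(limit).filter!(n => n % 2 == 0).map!(n => n * n).array;
}

void main()
{
    writeln(evenSquaresNested(7));
    writeln(evenSquaresUfcs(7));
}
```

Both functions do exactly the same work; UFCS only changes the order in which the steps appear on the page, which is why it matters for the readability of sample code.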

The book will always be freely available online, allowing me to make frequent updates and corrections. Fortunately, my Inglish leaves a lot to improve on, so there will always be grammar and typo corrections as well.

Making Of: LDC 1.0

This is a guest post from Kai Nacke. A long-time contributor to the D community, Kai is the author of D Web Development and the maintainer of LDC, the LLVM D Compiler.


LDC has been under development for more than 10 years. From release to release, the software has gotten better and better, but the version number has always implied that LDC was still the new kid on the block. Who would use a version 0.xx compiler for production code?

These were my thoughts when I raised the question, “Version number: Are we ready for 1.0?” in the forum about a year ago. At that time, the current LDC compiler was 0.15.1. In several discussions, the idea was born that the first version of LDC based on the frontend written in D should be version 1.0, because this would really be a major milestone. Version 0.18.0 should become 1.0!

Was LDC really as mature as I thought? Looking back, this was an optimistic view. At DConf 2015, Liran Zvibel from Weka.IO mentioned in his talk about large scale primary storage systems that he couldn’t use LDC because of bugs! Additionally, the beta version of 0.15.2 had some serious issues and was finally abandoned in favor of 0.16.0. And did I mention that I was busy writing a book about vibe.d?

Fortunately, over the past two years, more and more people began contributing to LDC. The number of active committers grew. Suddenly, the progress of LDC was very impressive: Johan added DMD-style code coverage and worked on merging the new frontend. Dan worked on an iOS version and Joakim on an Android version. Together, they made ARM a first class target of LDC. Martin and Rainer worked on the Windows version. David went ahead and fixed a lot of the errors which had occurred with the Weka code base. I spent some time working on the ports to PowerPC and AArch64. Uncounted small fixes from other contributors improved the overall quality.

Now it was obvious that a 1.x version was overdue. Shortly after DMD made the transition to the D-based frontend, LDC was able to use it. After the usual alpha and beta versions, I built the final release version on Sunday, June 5, and officially announced it the next day. Version 1.0 is shipping now!

Creating a release is always a major effort. I would like to say “Thank you!” to everybody who made this special release happen. And a big thanks to our users; your feedback is always a motivation to make the next LDC release even better.

Onward to 1.1!

Find Was Too Damn Slow, So We Fixed It

This is a guest post from Andreas Zwinkau, a problem solving thinker, working as a doctoral researcher at the IPD Snelting within the InvasIC project on compiler and language perfection. He manages and teaches students at the KIT. With a beautiful wife and two jolly kids, he lives in Karlsruhe, Germany.


“Please throw this hat into the ring as well,” Andrei wrote when he submitted the winning algorithm. Let me tell you about this ring and how we made string search in D’s standard library, Phobos, faster. You might learn something about performance engineering and benchmarking. Maybe you’ll want to contribute to Phobos yourself after you read this.

The story started for me when I decided to help out D for some recreational programming. I went to the Get Involved wiki page. There was a link to the issue tracker for preapproved stuff and there I found issue 9646, which was about splitter being too slow in a specific case. It turned out that it was actually find, called inside splitter, which was slow. The find algorithm searches for one string (the needle) inside another (the haystack). It returns a substring of the haystack, starting with the first occurrence of the needle and ending with the end of the haystack. Find being slow meant that a naively implemented find (two nested for loops) was much faster than what the standard library provided.
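For reference, a naive find with two nested loops can be sketched as follows. This is an illustration of the idea, not the code attached to issue 9646, and the name naiveFind is made up for the example.

```d
import std.stdio : writeln;

/// Naive search: try every start position and compare element by element.
T[] naiveFind(T)(T[] haystack, T[] needle)
{
    if (needle.length > haystack.length)
        return haystack[$ .. $];

    for (size_t start = 0; start + needle.length <= haystack.length; ++start)
    {
        size_t i = 0;
        for (; i < needle.length; ++i)
        {
            if (haystack[start + i] != needle[i])
                break;
        }
        if (i == needle.length)
            return haystack[start .. $]; // match: the rest of the haystack from here
    }
    return haystack[$ .. $]; // no match: an empty slice at the end
}

void main()
{
    writeln(naiveFind("hello world", "o w"));
    writeln(naiveFind([1, 2, 3, 4], [3]));
}
```

There is no precomputation at all, which is exactly why this version is hard to beat on short inputs.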

So, let’s fix find!

Before I started coding, the crucial question was: How will I know I am done? At that moment, I only had one test case from the splitter issue. I had to ensure that any solution was fast in the general case. I wanted to fix issue 9646 without making find worse for everybody else. I needed a benchmark.

As a first step, I created a repository. It initially contained a simple program which compared Phobos’s find, a naive implementation from issue 9646, and a copy from Phobos I could modify and tune. My first change: insert the same naive algorithm into my Phobos copy. This was the Hello World of bugfixing and proved two things:

  1. I was working on the correct code. Phobos contained multiple overloads of find and I wanted to work on the right one. For example, there was an indirection which cast string into a ubyte array to avoid auto decoding.
  2. It was not an issue of meta programming. Phobos code is generic and uses D’s capabilities for meta programming. This means the compiler is responsible for specializing the generic code to every specific case. Fixing that would have required changing the compiler, but not the standard library.

At this point I knew which specific lines I needed to speed up and I had a benchmark to quickly see the effects of my changes. It was time to get creative, try things, and find a better find.

For a start, I tried the good old classic Boyer-Moore, which the standard library provides but wasn’t using for find. I quickly discarded it, as it was orders of magnitude slower in my benchmark. Gigabytes of data are probably needed to make that worthwhile with a modern processor.

I considered simply inserting the naive algorithm. It would have fixed the problem. On the other hand, Phobos contained a slightly more advanced algorithm which tried to skip elements. It first checks the end of the needle and, on a mismatch, we can advance the needle its whole length if the end element does not appear twice in the needle. This requires one pass over the needle initially to determine the step size. That algorithm looked nice. Someone had probably put some thought into it. Maybe my benchmark was biased? To be safe, I decided to fix the performance problem without changing the algorithm (too much).

How? Did the original code have any stupid mistakes? How else could you fix a performance problem without changing the whole algorithm?

One trick could be to use D’s meta programming. The code was generic, but in certain cases we could use a more efficient version. D has static-if, which means we could switch between the versions at compile time without any runtime overhead.

static if (isRandomAccessRange!Needle) {
   // new optimized algorithm
} else {
   // old algorithm
}

The main difference from the old algorithm was that we could avoid creating a temporary slice to use startsWith on. Instead, a simple for-loop was enough. The requirement was that the needle must be a random access range.
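To make the pattern concrete, here is a self-contained sketch of a compile-time switch between an indexed loop and a generic fallback. It is not the Phobos code; hasPrefix is a hypothetical helper that only shows how static if picks a branch per instantiation.

```d
import std.algorithm.searching : startsWith;
import std.range.primitives : hasLength, isRandomAccessRange;
import std.stdio : writeln;

/// Reports whether `haystack` begins with `needle`.
bool hasPrefix(Haystack, Needle)(Haystack haystack, Needle needle)
{
    static if (isRandomAccessRange!Haystack && hasLength!Haystack &&
               isRandomAccessRange!Needle && hasLength!Needle)
    {
        // Indexed comparison: a plain loop, no temporary range to build.
        if (needle.length > haystack.length)
            return false;
        foreach (i; 0 .. needle.length)
        {
            if (haystack[i] != needle[i])
                return false;
        }
        return true;
    }
    else
    {
        // Generic fallback for ranges that cannot be indexed.
        return startsWith(haystack, needle);
    }
}

void main()
{
    writeln(hasPrefix([1, 2, 3], [1, 2])); // int[] takes the indexed branch
    writeln(hasPrefix("hello", "he"));     // string takes the fallback branch
}
```

Note that string lands in the fallback branch: because of auto decoding, Phobos does not treat narrow strings as random access ranges, which is the reason for the cast to a ubyte array mentioned earlier.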

When I had a good version, the time was ripe for a pull request. Of course, I had to fix issues like style guide formatting before the pull request was acceptable. The D community wants high-quality code, so the autotester checked my pull request by running tests on different platforms. Reviewers checked it manually.

Meanwhile in the forum, others chimed in. Chris and Andrei proposed more algorithms. Since we had a benchmark now, it was easy to include them. Here are some numbers:

DMD:                       LDC:
std find:    178 ±32       std find:    156 ±33
manual find: 140 ±28       manual find: 117 ±24
qznc find:   102 ±4        qznc find:   114 ±14
Chris find:  165 ±31       Chris find:  136 ±25
Andrei find: 130 ±25       Andrei find: 112 ±26

You see the five mentioned algorithms. The first number is the mean slowdown compared to the fastest one on each single run. The annotated ± number is the mean absolute deviation. I considered LDC’s performance more relevant than DMD’s. You see manual, qznc, and Andrei find report nearly the same slowdown (117, 114, 112), and the deviation was much larger than the differences. This meant they all had roughly the same speed. Which algorithm would you choose?

We certainly needed to pick one of the three top algorithms and we had to base the decision on this benchmark. Since the numbers were not clear, we needed to improve the benchmark. When we ran it on different computers (usually an Intel i5 or i7) the numbers changed a lot.

So far, the benchmark had been generating a random scenario and then measuring each algorithm against it. The fastest algorithm got a speed of 100 and the others got higher numbers which measured their slowdown. Now we could generate a lot of different scenarios and measure the mean across them for each algorithm. This design placed a big responsibility on the scenario generator. For example, it chose the length of the haystack and the needle from a uniform distribution within a certain range. Was the uniform distribution realistic? Were the boundaries of the range realistic?
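The scoring described here boils down to a little arithmetic, sketched below. The function names are invented and the benchmark repository may compute things differently; this only illustrates scaling one run to the fastest algorithm and then summarizing many runs with a mean and a mean absolute deviation.

```d
import std.algorithm : map, minElement, sum;
import std.array : array;
import std.math : abs;
import std.stdio : writeln;

/// Scales the timings of one run so that the fastest algorithm scores 100.
double[] relativeSlowdown(const double[] timings)
{
    immutable fastest = timings.minElement;
    return timings.map!(t => 100.0 * t / fastest).array;
}

/// Arithmetic mean; expects at least one value.
double average(const double[] values)
{
    return values.sum / values.length;
}

/// Mean absolute deviation: the average distance from the mean.
double meanAbsoluteDeviation(const double[] values)
{
    immutable m = average(values);
    return values.map!(v => abs(v - m)).sum / values.length;
}

void main()
{
    // Made-up timings for three algorithms in a single run.
    writeln(relativeSlowdown([2.0, 4.0, 3.0]));

    // Made-up scores of one algorithm across two runs.
    writeln(average([100.0, 120.0]), " ±", meanAbsoluteDeviation([100.0, 120.0]));
}
```

A score near 100 with a small deviation means the algorithm was the fastest, or close to it, in nearly every scenario.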

After discussion in the forum, it came down to three basic use cases:

  1. Searching for a few words within English text. The benchmark has a copy of ‘Alice in Wonderland’ and the task is to search for a part of the last sentence.
  2. Searching for a short needle in a short haystack. This corresponds to something like finding line breaks as in the initial splitter use case. This favors naive algorithms which do not require any precomputation or other overhead.
  3. Searching in a long haystack for a needle which it doesn’t contain. This favors more clever algorithms which can skip over elements. To guarantee a mismatch, the generator inserts a special character into the needle, which we do not use to generate the haystack.
  4. Just for comparison, the previous random scenario is still measured.
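The guaranteed mismatch in the third use case can be sketched like this. The alphabet, the '#' marker, the sizes, and all names are assumptions for the example rather than details of the actual generator.

```d
import std.algorithm.searching : canFind;
import std.random : Random, uniform;
import std.stdio : writeln;

struct Scenario
{
    string haystack;
    string needle;
}

/// Draws a random lowercase letter.
char randomLetter(ref Random rng)
{
    return cast(char)('a' + uniform(0, 26, rng));
}

/// Builds a haystack over 'a'..'z' and a needle that cannot occur in it.
Scenario mismatchScenario(ref Random rng, size_t haystackLength, size_t needleLength)
{
    assert(needleLength > 0, "the needle needs room for the marker");

    auto haystack = new char[haystackLength];
    foreach (ref c; haystack)
        c = randomLetter(rng);

    auto needle = new char[needleLength];
    foreach (ref c; needle)
        c = randomLetter(rng);

    // '#' is outside the haystack alphabet, so the needle can never match.
    needle[uniform(size_t(0), needleLength, rng)] = '#';

    return Scenario(haystack.idup, needle.idup);
}

void main()
{
    auto rng = Random(42); // fixed seed so runs are repeatable
    auto s = mismatchScenario(rng, 1_000, 8);
    writeln(s.needle, " occurs in haystack: ", s.haystack.canFind(s.needle));
}
```

Placing the marker at a random position keeps the scenario from favoring algorithms that happen to inspect one particular end of the needle first.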

At this point, we had a good benchmark on which we could base a decision.

“Please throw this hat into the ring as well.”

Andrei found another algorithm in his magic optimization bag. He knew the algorithm was good in some cases, but how would it fare in our benchmark? What were the numbers with this new contender?

In short: Andrei’s second algorithm completely dominated the ring. It has two names in the plot: Andrei2 as he posted it and A2Phobos as I generalized it and integrated it into Phobos. In the plots you see those two algorithms always close to the optimal result 100.

It was interesting that the naive algorithm still won in the ‘Alice’ benchmark, but the new Phobos was really close. The old Phobos std was roughly twice as slow for short strings, which we already knew from issue 9646.

What did this new algorithm look like? It used the same nested loop design as Andrei’s first one, but it computed the skip length only on demand. This meant one more conditional branch, but modern branch predictors seem to handle that easily.

Here is the final winning algorithm. The version in Phobos is only slightly more generic.

T[] find(T)(T[] haystack, T[] needle) {
  if (needle.length == 0) return haystack;
  immutable lastIndex = needle.length - 1;
  auto last = needle[lastIndex];
  size_t j = lastIndex, skip = 0;
  while (j < haystack.length) {
    if (haystack[j] != last) {
      ++j;
      continue;
    }
    immutable k = j - lastIndex;
    // last elements match, check rest of needle
    for (size_t i = 0; ; ++i) {
      if (i == lastIndex)
        return haystack[k..$]; // needle found
      if (needle[i] != haystack[k + i])
        break;
    }
    if (skip == 0) { // compute skip length
      skip = 1;
      while (skip < needle.length &&
             needle[$-1-skip] != needle[$-1]) {
        ++skip;
      }
    }
    j += skip;
  }
  return haystack[$ .. $];
}
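To see what the on-demand skip computation produces, the inner while loop of the function above can be pulled out into a helper and tried on a few needles. The helper name skipLength is made up for this illustration.

```d
import std.stdio : writeln;

/// Distance from the last element of `needle` back to its previous occurrence,
/// or the full needle length if the last element does not repeat.
size_t skipLength(T)(T[] needle)
{
    assert(needle.length > 0, "find returns early for empty needles");
    size_t skip = 1;
    while (skip < needle.length && needle[$ - 1 - skip] != needle[$ - 1])
        ++skip;
    return skip;
}

void main()
{
    writeln(skipLength("abcab")); // 3: the previous 'b' sits three positions back
    writeln(skipLength("abc"));   // 3: 'c' is unique, so skip the whole needle
    writeln(skipLength("aa"));    // 1: the last element repeats immediately
}
```

The larger the skip, the more of the haystack the search can jump over after a failed comparison, and since the value depends only on the needle, it needs to be computed at most once per call.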

Now you might want to run the benchmark yourself on your specific architecture. Get it from GitHub and run it with make dmd or make ldc. We are still interested in results from a wide range of architectures.

For me personally, this was my biggest contribution to D’s standard library so far. I’m pleased with the community. I deserved all criticism and it was professionally expressed. Now we can celebrate a faster find and fix the next issue. If you want to help, the D community will welcome you!