Big Performance Improvement for std.regex

Posted on

Dmitry Olshansky has been a frequent contributor to the D programming language. Perhaps his best known work is his overhaul of the std.regex module, which he architected as part of Google Summer of Code 2011. In this post, he describes an algorithmic optimization he implemented this past summer that resulted in a big performance win.


Optimizing std.regex has been my favorite pastime, but it has gotten harder over the years. It eventually became clear that micro-optimizing the engine’s state copy routine, or trying to avoid that extra write, wasn’t going to cut it anymore. To move further, I needed a new algorithmic improvement. This is how the so-called “Bit-NFA” came to be implemented. Developed in May of this year, it has come a long way to land in the main repository.

Before going into details, a short overview of the engine is called for. A user-specified pattern given to the engine first goes through a compilation process, where it gets transformed into a bytecode program, along with a bunch of lookup tables and auxiliary data-structures. Bytecode implies a VM, not unlike, say, the Java VM, but far simpler and more specific. In fact, there are two of them in std.regex, one that evaluates execution of threads in a backtracking manner and another one which evaluates all threads in lock-step, resolving any duplicates along the way.

Now, running a full blown VM, even a tiny one, on each character of input doesn’t sound all that high-performance. That’s why there is an extra trick, a kickstart engine (it should probably be called a “sidekick engine”), which is a dumb approximation of the full engine. It is run over the input first. When it spots something that looks like a match, the full engine is run to check it. The only requirement is that it can have no false negatives. That is, it has to detect as positive all matches of the regex pattern. This kickstart engine is the central piece of today’s post.

Historically, I intended for there to be a lot of different kickstart engines, ranging from a simple ‘memchr the first byte of the pattern’ to a Boyer-Moore search on the prefix of a pattern. But during the gory days of GSOC 2011, a simple solution came first and out-shadowed all others: the Shift Or algorithm.

Basically, it is an NFA (Nondeterministic Finite Automation), where each state is a bit in a word. Shifting this word advances all the states. Masking removes those that don’t match the current character. Importantly, shifting also places 0 as the first bit, indicating the active state.

With these two insights, the whole process of searching for a string becomes shifting + OR-masking the bits. The last point is checking for the successful match – one of the bits is in the finish state.

Looking at this marvelous construction, it’s tempting to try and overcome its limitation – the straight-forward execution of states. So let’s introduce some control flow by denoting some bits as jumps. To carry out a jump, we just need to map every combination of jump bits to the mask of the resulting positions. A basic hashmap could serve us well in this regard. Then the cycle becomes:

  1. Shift the word
  2. Capture control flow bits
  3. Lookup control flow table
  4. Mask AND with control flow bits
  5. Check for finish state(s) bits
  6. Mask OR with match filter table

In the end, we execute the whole engine with nothing more than a hash-map lookup, a table lookup, and a bit of bitwise operations. This is the essence of what I call the Bit-NFA engine.

Of course, there are some tricky bits, such as properly mapping bytecodes to bits. Then there comes Unicode.. oh gosh. The trick to Unicode, though, is having a fast path for ASCII < 0x80 and the rest. For ASCII, we just go with a simple table. Unicode is a two-staged variation of it. Two-staging the table let’s us coalesce identical pages, saving space for the whole 21 bits of the code point range.

Overall, the picture really is worth a thousand words. Here is how the new kickstart engine stacks up.

This now only leaves us to optimize the VMs further. The proven technique is JIT-ing the bytecode and is what top engines are doing. Still, I’m glad there are notable tricks to speed up regex execution in general without pulling out this heavy handed weapon.

GSoC Report: std.experimental.xml

Posted on

Lodovico Giaretta is currently pursuing a Bachelor Degree in Computer Science at the University of Trento, Italy. He participated in Google Summer of Code 2016, working on a new XML module for D’s standard library, Phobos.


GSoC-icon-192I started coding in high school with Pascal. I immediately fell in love with programming, so I started studying it by myself and learned both Java and C++. But when I was using Java, I was missing the powerful metaprogramming facilities and the low level features of C++. When I was using C++, I was missing the simplicity and usability of Java. So I started looking for a language that “filled the gap” between these two worlds. After looking into many languages, I finally found D. Despite being more geared towards C++, D provides a very high level of productivity, as correct code is easier to read and write. As an example, I was programming in D for several months before I was bitten by a segfault for the first time. It easily became one of my favorite languages.

The apparent lack of libraries, my lack of time, and the need to use other languages for university projects made me forget D for some time, at least until someone told me about Google Summer of Code. When I discovered that the D Foundation was participating, I immediately decided to take part and found that there was the need for a new XML library. So I contacted Craig Dillabaugh and Robert Schadek and started to plan my adventure. I want to take this occasion to thank them for their great continuous support, and the entire community for their feedback and help.

This was my first public codebase and my first contribution to a big open source project, so I didn’t really know anything about project management. The advice about this field from my mentor Robert has been fundamental for my success; he helped me improve my workflow, keep my efforts focused towards the goal, and set up correctness tests and performance benchmarks. Without his help, I would never have been able to reach this point.

The first thing to do when writing a library is to pick a set of principles that will guide development. This choice is what will give the library its peculiar shape, and by having a look around one finds that there are XML libraries that want to be minimal in terms of codebase size, or very small in terms of binary size, or fully featured and 101% adherent to the specification. For std.experimental.xml, I decided to focus on genericity and extensibility. The processing is divided in many small, quite simple stages with well-defined interfaces implemented by templated components. The result is a pipeline that is fully customizable; you can add or substitute components anywhere, and add custom validation steps and custom error handlers.

From an XML library, a programmer expects different high level constructs: a SAX parser, a DOM parser, a DOM writer and maybe some extensions like XPath. He also expects to be able to process different kinds of input and, for std.experimental.xml, to “hack in” his own logic in the process. This requires a simple, yet very flexible, intermediate representation, which is produced by the parsing stage and can be easily manipulated, validated, and transformed into whatever high-level construct is needed. For this, I chose a concept called Cursor, a pointer inside an XML document, which can be queried for properties of a given XML node or advanced to a subsequent one. It’s akin to Java’s StAX (Streaming API for XML), from which I took inspiration. In std.experimental.xml, all validations and transformations are implemented as chains of Cursors, which are then usually processed by a SAX parser or a DOM builder, but can also be used directly in user code, providing more control and speed.

Talking about speed, which in XML processing can be very important, I have to admit that I didn’t spend much time on optimization, leaving a lot of space for future performance improvements. Yet, the library is fast enough to guarantee that, for big files (where performance matters), an SSD (Solid State Drive) is needed to move the bottleneck from the fetching to the processing of the data. Being this is an extensible and configurable library, the user can choose his tradeoffs with fine granularity, trading input validation and higher level constructs for speed at will.

To conclude, the GSoC is finished, but the library is not. Although most parts are there, some bits are still missing. As a new university semester has started, time is becoming a rare and valuable resource, but I’ll do my best to finish the work in a short time so that Phobos can finally have a modern XML library to be proud of. I also have a plan to add more advanced functionality, like XML Schemas and XPath, but I don’t know when I’ll manage to work on that, as it is quite a lot to do.

How to Write @trusted Code in D

Posted on

Steven Schveighoffer is the creator and maintainer of the dcollections and iopipe libraries. He was the primary instigator of D’s inout feature and the architect of a major rewrite of the language’s built-in arrays. He also authored the oft-recommended introductory article on the latter.


d6In computer programming, there is a concept of memory-safe code, which is guaranteed at some level not to cause memory corruption issues. The ultimate holy grail of memory safety is to be able to mechanically verify you will not corrupt memory no matter what. This would provide immunity from attacks via buffer overflows and the like. The D language provides a definition of memory safety that allows quite a bit of useful code, but conservatively forbids things that are sketchy. In practice, the compiler is not omnipotent, and it lacks the context that we humans are so good at seeing (most of the time), so there is often the need to allow otherwise risky behavior. Because the compiler is very rigid on memory safety, we need the equivalent of a cast to say “yes, I know this is normally forbidden, but I’m guaranteeing that it is fine”. That tool is called @trusted.

Because it’s very difficult to explain why @trusted code might be incorrect without first discussing memory safety and D’s @safe mechanism, I’ll go over that first.

What is Memory Safe Code?

The easiest way to explain what is safe, is to examine what results in unsafe code. There are generally 3 main ways to create a safety violation in a statically-typed language:

  1. Write or read from a buffer outside the valid segment of memory that you have access to.
  2. Cast some value to a type that allows you to treat a piece of memory that is not a pointer as a pointer.
  3. Use a pointer that is dangling, or no longer valid.

The first item is quite simple to achieve in D:

auto buf = new int[1]; 
buf[2] = 1;

With default bounds checks on, this results in an exception at runtime, even in code that is not checked for safety. But D allows circumventing this by accessing the pointer of the array:

buf.ptr[2] = 1;

For an example of the second, all that is needed is a cast:

*cast(int*)(0xdeadbeef) = 5;

And the third is relatively simple as well:

auto buf = new int[1];
auto buf2 = buf;
delete buf;  // sets buf to null
buf2[0] = 5; // but not buf2.

Dangling pointers also frequently manifest by pointing at stack data that is no longer in use (or is being used for a different reason). It’s very simple to achieve:

int[] foo()
{
    int[4] buf;
    int[] result = buf[];
    return result;
}

So simply put, safe code avoids doing things that could potentially result in memory corruption. To that end, we must follow some rules that prohibit such behavior.

Note: dereferencing a null pointer in user-space is not considered a memory safety issue in D. Why not? Because this triggers a hardware exception, and generally does not leave the program in an undefined state with corrupted memory. It simply aborts the program. This may seem undesirable to the user or the programmer, but it’s perfectly fine in terms of preventing exploits. There are potential memory issues possible with null pointers, if one has a null pointer to a very large memory space. But for safe D, this requires an unusually large struct to even begin to worry about it. In the eyes of the D language, instrumenting all pointer dereferences to check for null is not worth the performance degradation for these rare situations.

D’s @safe rules

D provides the @safe attribute that tags a function to be mechanically checked by the compiler to follow rules that should prevent all possible memory safety problems. Of course, there are cases where developers need to make exceptions in order to get some meaningful work done.

The following rules are geared to prevent issues like the ones discussed above (listed in the spec here).

  1. Changing a raw pointer value is not allowed. If @safe D code has a pointer, it has access only to the value pointed at, no others. This includes indexing a pointer.
  2. Casting pointers to any type other than void* is not allowed. Casting from any non-pointer type to a pointer type is not allowed. All other casts are OK (e.g. casting from float to int) as long as they are valid. Casting a dynamic array to a void[] is also allowed.
  3. Unions that have pointer types that overlap other types cannot be accessed. This is similar to rules 1 and 2 above.
  4. Accessing an element in or taking a slice from a dynamic array must be either proven safe by the compiler, or incur a bounds check during runtime. This even happens in release mode, when bounds checks are normally omitted (note: dmd’s option -boundscheck=off will override this, so use with extreme caution).
  5. In normal D, you can create a dynamic array from a pointer by slicing the pointer. In @safe D, this is not allowed, since the compiler has no idea how much space you actually have available via that pointer.
  6. Taking a pointer to a local variable or function parameter (variables that are stored on the stack) or taking a pointer to a reference parameter are forbidden. An exception is slicing a local static array, including the function foo above. This is a known issue.
  7. Explicit casting between immutable and mutable types that are or contain references is not allowed. Casting value-types between immutable and mutable can be done implicitly and is perfectly fine.
  8. Explicit casting between thread-local and shared types that are or contain references is not allowed. Again, casting value-types is fine (and can be done implicitly).
  9. The inline assembler feature of D is not allowed in @safe code.
  10. Catching thrown objects that are not derived from class Exception is not allowed.
  11. In D, all variables are default initialized. However, this can be changed to uninitialized by using a void initializer:
    int *s = void;

    Such usage is not allowed in @safe D. The above pointer would point to random memory and create an obvious dangling pointer.

  12. __gshared variables are static variables that are not properly typed as shared, but are still in global space. Often these are used when interfacing with C code. Accessing such variables is not allowed in @safe D.
  13. Using the ptr property of a dynamic array is forbidden (a new rule that will be released in version 2.072 of the compiler).
  14. Writing to void[] data by means of slice-assigning from another void[] is not allowed (this rule is also new, and will be released in 2.072).
  15. Only @safe functions or those inferred to be @safe can be called.

The need for @trusted

The above rules work well to prevent memory corruption, but they prevent a lot of valid, and actually safe, code. For example, consider a function that wants to use the system call read, which is prototyped like this:

ssize_t read(int fd, void* ptr, size_t nBytes);

For those unfamiliar with this function, it reads data from the given file descriptor, and puts it into the buffer pointed at by ptr and expected to be nBytes bytes long. It returns the number of bytes actually read, or a negative value if an error occurs.

Using this function to read data into a stack-allocated buffer might look like this:

ubyte[128] buf;
auto nread = read(fd, buf.ptr, buf.length);

How is this done inside a @safe function? The main issue with using read in @safe code is that pointers can only pass a single value, in this case a single ubyte. read expects to store more bytes of the buffer. In D, we would normally pass the data to be read as a dynamic array. However, read is not D code, and uses a common C idiom of passing the buffer and length separately, so it cannot be marked @safe. Consider the following call from @safe code:

auto nread = read(fd, buf.ptr, 10_000);

This call is definitely not safe. What is safe in the above read example is only the one call, where the understanding of the read function and calling context assures memory outside the buffer will not be written.

To solve this situation, D provides the @trusted  attribute, which tells the compiler that the code inside the function is assumed to be @safe, but will not be mechanically checked. It’s on you, the developer, to make sure the code is actually @safe.

A function that solves the problem might look like this in D:

auto safeRead(int fd, ubyte[] buf) @trusted
{
    return read(fd, buf.ptr, buf.length);
}

Whenever marking an entire function @trusted, consider if code could call this function from any context that would compromise memory safety. If so, this function should not be marked @trusted under any circumstances. Even if the intention is to only call it in safe ways, the compiler will not prevent unsafe usage by others. safeRead should be fine to call from any @safe context, so it’s a great candidate to mark @trusted.

A more liberal API for the safeRead function might take a void[] array as the buffer. However, recall that in @safe code, one can cast any dynamic array to a void[] array — including an array of pointers. Reading file data into an array of pointers could result in an array of dangling pointers. This is why ubyte[] is used instead.

@trusted escapes

A @trusted escape is a single expression that allows @system (the unsafe default in D) calls such as read without exposing the potentially unsafe call to any other part of the program. Instead of writing the safeRead function, the same feat can be accomplished inline within a @safe function:

auto nread = ( () @trusted => read(fd, buf.ptr, buf.length) )();

Let’s take a closer look at this escape to see what is actually happening. D allows declaring a lambda function that evaluates and returns a single expression, with the () => expr syntax. In order to call the lambda function, parentheses are appended to the lambda. However, operator precedence will apply those parentheses to the expression and not the lambda, so the entire lambda must be wrapped in parentheses to clarify the call. And finally, the lambda can be tagged @trusted as shown, so the call is now usable from the @safe context that contains it.

In addition to simple lambdas, whole nested functions or multi-statement lambdas can be used. However, remember that adding a trusted nested function or saving a lambda to a variable exposes the rest of the function to potential safety concerns! Take care not to expose the escape too much because this risks having to manually verify code that should just be mechanically checked.

Rules of Thumb for @trusted

The previous examples show that tagging something as @trusted has huge implications. If you are disabling memory safety checks, but allowing any @safe code to call it, then you must be sure that it cannot result in memory corruption. These rules should give guidance on where to put @trusted marks and avoid getting into trouble:

Keep @trusted code small

@trusted code is never mechanically checked for safety, so every line must be reviewed for correctness. For this reason, it’s always advisable to keep the code that is @trusted as small as possible.

Apply @trusted to entire functions when the unsafe calls are leaky

Code that modifies or uses data that @safe code also uses creates the potential for unsafe calls to leak into the mechanically checked portion of a @safe function. This means that portion of the code must be manually reviewed for safety issues. It’s better to mark the whole thing @trusted, as that’s more in line with the truth. This is not a hard and fast rule; for example, the read call from the earlier example is perfectly safe, even though it will affect data that is used later by the function in @safe mode.

A pointer allocated with C’s malloc in the beginning of the function, and free‘d later, could have been copied somewhere in between. In this case, the dangling pointer may violate @safe, even in the mechanically checked part. Instead, try wrapping the entire portion that uses the pointer as @trusted, or even the entire function. Alternatively, use scope guards to guarantee the lifetime of the data until the end of the function.

Never use @trusted on template functions that accept arbitrary types

D is smart enough to infer @safe for template functions that follow the rules. This includes member functions of templated types. Just let the compiler do its job here. To ensure the function is actually @safe in the right contexts, create an @safe unittest  to call it. Marking the function @trusted allows any operator overloads or members that might violate memory safety to be ignored by the safety checker! Some tricky ones to remember are postblit and opCast.

It’s still OK to use @trusted escapes here, but be very careful. Consider especially possible types that contain pointers when thinking about how such a function could be abused. A common mistake is to mark a range function or range usage @trusted. Remember that most ranges are templates, and can be easily inferred as @system when the type being iterated has a @system postblit or constructor/destructor, or is generated from a user-provided lambda.

Use @safe to find the parts you need to mark as @trusted

Sometimes, a template intended to be @safe may not be inferred @safe, and it’s not clear why. In this case, try temporarily marking the template function @safe to see where the compiler complains. That’s where @trusted escapes should be inserted if appropriate.

In some cases, a template is used pervasively, and tagging it as @safe may make too many parts break. Make a copy of the template under a different name that you mark @safe, and change the calls that are to be checked so that they call the alternative template instead.

Consider how the function may be edited in the future

When writing a trusted function, always think about how it could be called with the given API, and ensure that it should be @safe. A good example from above is making sure safeRead cannot accept an array of pointers. However, another possibility for unsafe code to creep in is when someone edits a part of the function later, invalidating the previous verification, and the whole function needs to be rechecked. Insert comments to explain the danger of changing something that would then violate safety. Remember, pull request diffs don’t always show the entire context, including that a long function being edited is @trusted!

Use types to encapsulate @trusted operations with defined lifetimes

Sometimes, a resource is only dangerous to create and/or destroy, but not to use during its lifetime. The dangerous operations can be encapsulated into a type’s constructor and destructor, marked @trusted, which allows @safe code to use the resource in between those calls. This takes a lot of planning and care. At no time can you allow @safe code to ferret out the actual resource so that it can keep a copy past the lifetime of the managing struct! It is essential to make sure the resource is alive as long as @safe code has a reference to it.

For example, a reference-counted type can be perfectly safe, as long as a raw pointer to the payload data is never available. D’s std.typecons.RefCounted cannot be marked @safe, since it uses alias this to devolve to the protected allocated struct in order to function, and any calls into this struct are unaware of the reference counting. One copy of that payload pointer, and then when the struct is free‘d, a dangling pointer is present.

This can’t be @safe!

Sometimes, the compiler allows a function to be @safe, or is inferred @safe, and it’s obvious that shouldn’t be allowed. This is caused by one of two things: either a function that is called by the @safe function (or some deeper function) is marked @trusted but allows unsafe calls, or there is a bug or hole in the @safe system. Most of the time, it is the former. @trusted is a very tricky attribute to get correct, as is shown by most of this post. Frequently, developers will mark a function @trusted only thinking of some uses of their function, not realizing the dangers it allows. Even core D developers make this mistake! There can be template functions that are inferred safe because of this, and sometimes it’s difficult to even find the culprit. Even after the root cause is discovered, it’s often difficult to remove the @trusted tag as it will break many users of the function. However, it’s better to break code that is expecting a promise of memory safety than subject it to possible memory exploits. The sooner you can deprecate and remove the tag, the better. Then insert trusted escapes for cases that can be proven safe.

If it does happen to be a hole in the system, please report the issue, or ask questions on the D forums. The D community is generally happy to help, and memory safety is a particular focus for Walter Bright, the creator of the language.

GSoC Report: DStep

Posted on

Wojciech Szęszoł is a Computer Science major at the University of Wrocław. As part of Google Summer of Code 2016, he chose to make improvements to Jacob Carlborg’s DStep, a tool to generate D bindings from C and Objective-C header files.


GSoC-icon-192It was December of last year and I was writing an image processing project for a course at my university. I would normally use Python, but the project required some custom processing, so I wasn’t able to use numpy. And writing the inner loops of image processing algorithms in plain Python isn’t the best idea. So I decided to use D.

I’ve been conscious about the existence of the D language for as long as I can remember, but I’d never convinced myself before to try it out. The first thing I needed to do was to load an image. At the time, I didn’t know that there is a DUB repository containing bindings to image loading libraries, so I started writing bindings to libjpeg by myself. It didn’t end very well, so I thought there should be a tool that will do the job for me. That’s when I found DStep and htod.

Unluckily, the capabilities of DStep weren’t satisfying (mostly the lack of any kind of support for the preprocessor) and htod didn’t run on Linux. I ended up coding my project in C++, but as GSoC (Google Summer of Code) was lurking on the horizon, I decided that I should give it a try with DStep. I began by contacting Craig Dillabaugh (Ed. Note: Craig volunteers to organize GSoC projects using D) to learn if there was any need for developing such a project. It sparked some discussion on the forum, the idea was accepted, and, more importantly, Russel Winder agreed to be the mentor of the project. After some time I needed to prepare and submit an official proposal. There was an interview and fortunately I was accepted.

The first commit I made for DStep is dated to February, 1. It was a proof of concept that C preprocessor definitions can be translated to D using libclang. Then I improved the testing framework by replacing the old Cucumber-based tests with some written in D. I made a few more improvements before the actual GSoC coding period began.

During GSoC, I added support for translation of preprocessor macros (constants and functions). It required implementing a parser for a small part of the C language as the information from libclang was insufficient. I implemented translation of comments, improved formatting of the output code (e.g. made DStep keep the spacing from C sources), fixed most of the issues from the GitHub issue list and ported DStep to Windows. While I was coding I was getting support from Jacob Carlborg. He did a great job by reviewing all of the commits I made. When I didn’t know how to accomplish something with D, I could always count on help on forum.dlang.org.

DStep was the first project of such a size that I coded in D. I enjoyed its modern features, notably the module system, garbage collector, built-in arrays, and powerful templates. I used unittest blocks a lot. It would be nice to have named unit tests, so that they can be run selectively. From the perspective of a newcomer, the lack of consistency and symmetry in some features is troubling, at least before getting used to it. For example there is a built-in hash map and no hash set, some identifiers that should be keywords starts with @ (Ed. Note: see the Attributes documentation), etc. I was very sad when I read that the in keyword is not yet fully implemented. Despite those little issues, the language served me very well overall. I suppose I will use D for my personal toy projects in the future. And for DStep of course. I have some unfinished business with it :).

I would like to encourage all students to take part in future editions of GSoC. And I must say the D Language Foundation is a very good place to do this. I learned a lot during this past summer and it was a very exciting experience.

Inside D Version Manager

Posted on

In his day job, Jacob Carlborg is a Ruby backend developer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and the topic of this post, DVM. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.


D Version Manager (DVM), is a cross-platform tool that allows you to easily download, install and manage multiple D compiler versions. With DVM, you can select a specific version of the compiler to use without having to manually modify the PATH environment variable. A selected compiler is unique in each shell session, and it’s possible to configure a default compiler.

The main advantage of DVM is the easy downloading and installation of different compiler versions. Specify the version of the compiler you would like to install, e.g. dvm install 2.071.1, and it will automatically download and install that version. Then you can tell DVM to use that version by executing dvm use 2.071.1. After that, you can invoke the compiler as usual with dmd. The selected compiler version will persist until the end of the shell session.

DVM makes it possible for the user to select a specific compiler version without having to modify any makefiles or build scripts. It’s enough for any build script to refer to the compiler by name, i.e. dmd, as long as the user selects the compiler version with DVM before invoking the script.

History

DVM was created in the beginning of 2011. That was a different time for D. No proper installers existed, D1 was still a viable option, and each new release of DMD brought with it a number of regressions. Because of all the regressions, it was basically impossible to always use the latest compiler, and often even older compilers, for all of your projects. Taking into consideration projects from other developers, some were written in D1 and some in D2, making it inconvenient to have only one compiler version installed.

It was for these reasons I created DVM. Being able to have different versions of the compiler active in different shell sessions makes it easy to work on different projects requiring different versions of the compiler. For example, it was possible to open one tab for a D1 compiler and another for a D2 compiler.

The concept of DVM comes directly from the Ruby tool RVM. Where DVM installs D compilers, RVM installs Ruby interpreters. RVM can do everything DVM can do and a lot more. One of the major things I did not want to copy from RVM is that it’s completely written in shell script (bash). I wanted DVM to be written in D. Because it’s written in shell script, RVM enables some really useful features that DVM does not support, but some of them are questionable (some might call them hacks). For example, when navigating to an RVM-enabled project, RVM will automatically select the correct Ruby interpreter. However, it accomplishes this by overriding the built-in cd command. When the command is invoked, RVM will look in the target directory for one of the files .rvmrc or .ruby-version. If either is present, it will read that file to determine which Ruby interpreter to select.

Implementation and Usage

One of the goals of DVM was that it should be implemented in D. In the end, it was mostly written in D with a few bits of shell script. Note that the following implementation details are specific to the platforms that fall under D’s Posix umbrella, i.e. version(Posix), but DVM is certainly available for Windows with the same functionality.

Structure of the DVM Installation

Before DVM can be used, it needs to install itself. This is accomplished with the command, dvm install dvm. This will create the ~/.dvm directory. It contains the following subdirectories: archives, bin, compilers, env and scripts.

archives contains a cache of downloaded zip archives of D compilers.

bin contains shell scripts, acting as symbolic links, to all installed D compilers. The name of each contains the version of the compiler, e.g. dmd-2.071.1, making it possible to invoke a specific compiler without first having to invoke the use command. This directory also contains one shell script, dvm-current-dc, pointing to the currently active D compiler. This allows the currently active D compiler to be invoked without knowing which version has been set. This can be useful for executing the compiler from within an editor or IDE, for example. A shell script for the default compiler exists as well. Finally, this directory also contains the binary dvm itself.

The compilers directory contains all installed compilers. All of the downloaded compilers are unpacked here. Due to the varying quality of the D compiler archives throughout the years, the install command will also make a few adjustments if necessary. In the old days, there was only one archive for all platforms. This command will only include binaries and libraries for the current platform. Another adjustment is to make sure all executables have the executable permission set.

The env directory contains helper shell scripts for the use command. There’s one script for each installed compiler and one for the default selected compiler.

The scripts directory currently only contains one file, dvm. It’s a shell script which wraps the dvm binary in the bin directory. The purpose of this wrapper is to aid the use command.

The use Command

The most interesting part of the implementation is the use command, which selects a specific compiler, e.g. dvm use 2.071.1. The selection of a compiler will persist for the duration of the shell session (window, tab, script file).

The command works by prepending the path of the specified compiler to the PATH environment variable. This can be ~/.dvm/compilers/dmd-2.071.1/{platform}/bin for example, where {platform} is the currently running platform. By prepending the path to the environment variable, it guarantees the selected compiler takes precedence over any other possible compilers in the PATH. The reason the {platform} section of the path exists is related to the structure of the downloaded archive. Keeping this structure avoids having to modify the compiler’s configuration file, dmd.conf.

The interesting part here is that it’s not possible to modify the environment variables of the parent process, which in this case is the shell. The magic behind the use command is that the dvm command that you’re actually invoking is not the D binary; it’s the shell script in the ~/.dvm/scripts path. This shell script contains a function called dvm. This can be verified by invoking type dvm | head -n 1, which should print dvm is a function if everything is installed correctly.

The installation of DVM adds a line to the shell initialization file, .bashrc, .bash_profile or similar. This line will load/source the DVM shell script in the ~/.dvm/scripts path which will make the dvm command available. When the dvm function is invoked, it will forward the call to the dvm binary located in ~/.dvm/bin/dvm. The dvm binary contains all of the command logic. When the use command is invoked, the dvm binary will write a new file to ~/.dvm/tmp/result and exit. This file contains a command for loading/sourcing the environment file available in ~/.dvm/env that corresponds to the version that was specified when the use command was invoked. After the dvm binary has exited, the shell script function takes over again and loads/sources the result file if it exists. Since the shell script is loaded/sourced instead of executed, the code will be evaluated in the current shell instead of a sub-shell. This is what makes it possible to modify the PATH environment variable. After the result file is loaded/sourced, it’s removed.

If you find yourself with the need to build your D project(s) with multiple compiler versions, such as the current release of DMD, one or more previous releases, and/or the latest beta, then DVM will allow you to do so in a hassle-free manner. Pull up a shell, execute use on the version you want, and away you go.

The Origins of the D Cookbook

Posted on

Adam Ruppe is the author of D Cookbook and maintainer of This Week in D. Modules from his freely available arsd package are used throughout the D community. He is also known for his legendary DConf 2014 presentation.


dcookbookIn 2013, Packt Publishing approached me to write D Cookbook. Over the next several months, I was tasked with collecting over 100 D programming language “recipes” and writing a couple of pages about each one.

I wanted it to be a mix of practical and fun examples of what D can do over the wide variety of fields in which I have used it, each one illustrative of some language concept that the reader could use in different contexts. As such, for the majority of the book, I avoided the use of libraries, even keeping the presence of Phobos or my own modules to a minimum, allowing me to focus on the implementations of the ideas.

That said, I didn’t come up with a hundred recipes out of thin air! They came from three main sources: my arsd libraries, my support efforts on the D forums, the #D IRC channel, and Stack Overflow (it was Stack Overflow that got Packt’s attention in the first place), and a few other side projects, like my minimal D runtime (zip file).

I first saw D way back in 2002. At the time, I was still fairly new to programming and was going to download another copy of Digital Mars C++ (which I used to compile 16-bit DOS programs!) when I saw a link on the Digital Mars website about this D programming language. My first impression – upon seeing the import keyword – was that it was too Java-like and I wasn’t interested. Oh, how I regret that naive snap decision! I didn’t take my next look at D until 2006, and that time, with far more real world experience and theoretical knowledge under my belt, I quickly fell in love.

Over the next year, I wrote a game library based on SDL and OpenGL and made a few little games for myself. My D code at this point wasn’t terribly unique. Much of it was based on my existing C++ codebases. That fairly uninteresting fact would later become one of my most recurring D tips: a great deal of your C, C++, and Java knowledge can carry over directly to D! Thanks to the similarities between the languages, it’s easy to learn. While your code is unlikely to work perfectly if simply copy/pasted into a .d file, porting it probably isn’t hard. I found the D language very comfortable right off the bat. Even today, I tell people with D questions to simply consider how they would do it in C++ and apply that existing knowledge to D. This can also work with C and C++ solutions gleaned from Internet resources like Stack Overflow.

I started working as a full-time web developer in 2008, which put a time constraint on my hobby game development efforts, but didn’t ice my love of D. Indeed, it wasn’t long at all before I worked D into my professional life by writing web libraries and eventually transitioning my work projects away from PHP and onto D!

At the same time, the D language was going through a series of rapid changes. Templates and compile time function evaluation got overhauled, immutable was introduced… Most interestingly to me, compile time reflection got massively expanded and a few features like opDispatch got added. Compile time reflection in older D was limited to the is expression and template tricks, but new D had __traits, making it competitive even with dynamic languages, without sacrificing D’s other strengths.

I was a very early adopter of these features and set out to discover how to combine them in ways to make my work easier, to compete with the other web languages, and to just show off a little 🙂 If someone came on the forum and said D can’t do this, then I’d reply Challenge accepted.

In the following years, I wrote: a DOM and JSON library, taking the most interesting ideas from Javascript into D; a web framework, realizing some dreams I had in rapidly churning out prototypes that could also survive the change process of real world development; OAuth and database libraries to support the needs of the projects; and more. One of the most interesting modules was my web.d, which takes a D class definition and builds web infrastructure around it: an automatically generated web site with CRUD forms from the static type information as well as Javascript and PHP code for API access to that functionality. This stretched D’s reflection capabilities and hit quite a few bugs on the bleeding edge, necessitating creative workarounds or alternative approaches. If I was really desperate, I’d fix the bug in DMD myself!

At the same time, I was regularly seen in the D community, helping other people with their problems, and, of course, reading insights from other members of the community. Every few days, I had another tip, and was also building up a mental picture of people’s common difficulties with D.

By 2013, I had years of experience with almost every corner and combination of D. Now, you can get a good slice of that in just a few days by reading the book!

Programming in D: A Happy Accident

Posted on

This is a guest post from Ali Çehreli, who not only uses D as a hobby, but also gets to use it as an employee of Weka.io. He is the author of Programming in D and is frequently found in the D Learn forum with ready answers to questions on using the language. He also is an officer of the D Foundation.


progindI consider the book Programming in D a happy accident, because initially it was not intended to be a book.

I used to be a frequent contributor to Turkish C and C++ forums. Although there were many smart and motivated members on those forums, most of them were not well-versed enough to follow programming resources in English. If they were patient enough to wait about ten years and if a publisher decided to have them translated, then they might get their hands on Turkish versions of their favorite books.

In 2009, around the time when my interest in C++ had started to diminish, I read with great excitement Andrei Alexandrescu’s The Case for D article in ACCU’s C Vu magazine (also available at Dr. Dobb’s). To a person coming from a C++ background, D was a fresh breath of air, removing some of C++’s warts and bringing many new features, some unique, some borrowed from other languages.

I was instantly hooked. I immediately created the Turkish D site ddili.org, translated Andrei’s article to Turkish, and published it there. One of the reasons for my excitement was the potential that D could be one software technology that Turkish programmers would not be left in the dark about. Since D was still being designed and implemented, there was time to write fresh Turkish documentation for it. I translated other D articles and started writing an HTML tutorial that would later become the book.

I knew very well that attempting to teach a topic is one of the best ways of learning that topic. I knew that I would be learning D myself. Little did I know then that this project would make me a better software engineer in general as well.

Teaching programming is a notoriously difficult task. According to some academic papers I found when I started the tutorial, one of the difficulties comes from the fact that different people model new concepts in their minds in different ways, rendering particular teaching methods inefficient at least for some students. Encouraged by the lack of one correct way of teaching programming, I picked one that was the easiest for me: introducing concepts in linear fashion with as few forward references as possible, starting with the most basic concepts like the = character confusingly meaning something different than is equal to.

Starting from the basics made it necessary for me to introduce lower-level concepts before higher-level concepts. For example, although the foreach statement is much more commonly used in practice, while, for, and foreach statements are introduced in the book in that order. I think that choice created a better foundation for the reader.

It took me two years to finish writing a flow of chapters from the assignment operator all the way to the garbage collector. It was very challenging and very rewarding to find a natural flow of presentation not only throughout the book but also within each chapter. The method I used for each chapter was to think about the presentation of the topic along with non-trivial examples beforehand, without touching the computer. I then wrote the chapter fairly quickly without much attention to detail, put it aside for a couple of days, then came back to review it from the point of view of a reader. I repeated that process perhaps five to ten times for each chapter until I thought it was fairly acceptable. Likely as a result of that process, a common feedback I receive is about how to-the-point my writing style has been.

Based on feedback from the Turkish community and encouragement from Andrei Alexandrescu, I started translating the book to English in early 2011. The translation continued along with new chapter additions, many corrections, and some chapter rewrites.

I made a PDF version available in January 2012 and the translation was finally completed in July 2014. Not only had I achieved my initial goal of providing fresh Turkish documentation for D, this book might have been the first software resource that was translated in the other direction.

I readily agreed with the suggestion that the book should be available in paper form as well. That decision brought many different challenges related to self-publishing like layout, cover design, pricing, the printing company, etc. The first print publication was in August 2015. Surprisingly, producing an ebook version turned out to be even more challenging. In addition to different kinds of layout issues, all ebook formats require special attention.

I am awestruck that my humble idea of a humble tutorial turned into a well known resource in the D ecosystem. It makes me very happy that people actually find the book useful. I am also happy that, periodically, people express interest in translating it to other languages. As of this writing, in addition to the completed Turkish and English versions, there are ongoing translations by volunteers to French and Chinese (German, Korean, Portuguese, and Russian translations were started but not continued).

As for future directions, I would like to add more chapters; definitely one on allocators once they’re added to the standard library (they currently live in the std.experimental.allocator package).

One thing that bothers me about the book is that most code samples don’t take full advantage of D’s universal function call syntax (UFCS), mainly because that feature was added to the language only after most of the book was already written. I would like to move the UFCS chapter to an earlier point in the book so that more code samples can be in the idiomatic D style.

The book will always be freely available online, allowing me to make frequent updates and corrections. Fortunately, my Inglish leaves a lot to improve on, so there will always be grammar and typo corrections as well.

Making Of: LDC 1.0

Posted on

This is a guest post from Kai Nacke. A long-time contributor to the D community, Kai is the author of D Web Development and the maintainer of LDC, the LLVM D Compiler.


LDC has been under development for more than 10 years. From release to release, the software has gotten better and better, but the version number has always implied that LDC was still the new kid on block. Who would use a version 0.xx compiler for production code?

These were my thoughts when I raised the question, “Version number: Are we ready for 1.0?” in the forum about a year ago. At that time, the current LDC compiler was 0.15.1. In several discussions, the idea was born that the first version of LDC based on the frontend written in D should be version 1.0, because this would really be a major milestone. Version 0.18.0 should become 1.0!

Was LDC really as mature as I thought? Looking back, this was an optimistic view. At DConf 2015, Liran Zvibel from Weka.IO mentioned in his talk about large scale primary storage systems that he couldn’t use LDC because of bugs! Additionally, the beta version of 0.15.2 had some serious issues and was finally abandoned in favor of 0.16.0. And did I mention that I was busy writing a book about vibe.d?

Fortunately, over the past two years, more and more people began contributing to LDC. The number of active committers grew. Suddenly, the progress of LDC was very impressive: Johan added DMD-style code coverage and worked on merging the new frontend. Dan worked on an iOS version and Joakim on an Android version. Together, they made ARM a first class target of LDC. Martin and Rainer worked on the Windows version. David went ahead and fixed a lot of the errors which had occurred with the Weka code base. I spent some time working on the ports to PowerPC and AArch64. Uncounted small fixes from other contributors improved the overall quality.

Now it was obvious that a 1.x version was overdue. Shortly after DMD made the transition to the D-based frontend, LDC was able to use it. After the usual alpha and beta versions, I built the final release version on Sunday, June 5, and officially announced it the next day. Version 1.0 is shipping now!

Creating a release is always a major effort. I would like to say “Thank you!” to everybody who made this special release happen. And a big thanks to our users; your feedback is always a motivation to make the next LDC release even better.

Onward to 1.1!

Find Was Too Damn Slow, So We Fixed It

Posted on

This is a guest post from Andreas Zwinkau, a problem solving thinker, working as a doctoral researcher at the IPD Snelting within the InvasIC project on compiler and language perfection. He manages and teaches students at the KIT. With a beautiful wife and two jolly kids, he lives in Karlsruhe, Germany.


Please throw this hat into the ring as well, Andrei wrote when he submitted the winning algorithm. Let me tell you about this ring and how we made string search in D’s standard library, Phobos, faster. You might learn something about performance engineering and benchmarking. Maybe you’ll want to contribute to Phobos yourself after you read this.

The story started for me when I decided to help out D for some recreational programming. I went to the Get Involved wiki page. There was a link to the issue tracker for preapproved stuff and there I found issue 9646, which was about splitter being too slow in a specific case. It turned out that it was actually find, called inside splitter, which was slow. The find algorithm searches for one string (the needle) inside another (the haystack). It returns a substring of the haystack, starting with the first occurrence of the needle and ending with the end of the haystack. Find being slow meant that a naively implemented find (two nested for loops) was much faster than what the standard library provided.

So, let’s fix find!

Before I started coding, the crucial question was: How will I know I am done? At that moment, I only had one test case from the splitter issue. I had to ensure that any solution was fast in the general case. I wanted to fix issue 9646 without making find worse for everybody else. I needed a benchmark.

As a first step, I created a repository. It initially contained a simple program which compared Phobos’s find, a naive implementation from issue 9646, and a copy from Phobos I could modify and tune. My first change: insert the same naive algorithm into my Phobos copy. This was the Hello World of bugfixing and proved two things:

  1. I was working on the correct code. Phobos contained multiple overloads of find and I wanted to work on the right one. For example, there was an indirection which cast string into a ubyte array to avoid auto decoding.
  2. It was not an issue of meta programming. Phobos code is generic and uses D’s capabilities for meta programming. This means the compiler is responsible for specializing the generic code to every specific case. Fixing that would have required changing the compiler, but not the standard library.

At this point I knew which specific lines I needed to speed up and I had a benchmark to quickly see the effects of my changes. It was time to get creative, try things, and find a better find.

For a start, I tried the good old classic Boyer-Moore, which the standard library provides but wasn’t using for find. I quickly discarded it, as it was orders of magnitude slower in my benchmark. Gigabytes of data are probably needed to make that worthwhile with a modern processor.

I considered simply inserting the naive algorithm. It would have fixed the problem. On the other hand, Phobos contained a slightly more advanced algorithm which tried to skip elements. It first checks the end of the needle and, on a mismatch, we can advance the needle its whole length if the end element does not appear twice in the needle. This requires one pass over the needle initially to determine the step size. That algorithm looked nice. Someone had probably put some thought into it. Maybe my benchmark was biased? To be safe, I decided to fix the performance problem without changing the algorithm (too much).

How? Did the original code have any stupid mistakes? How else could you fix a performance problem without changing the whole algorithm?

One trick could be to use D’s meta programming. The code was generic, but in certain cases we could use a more efficient version. D has static-if, which means we could switch between the versions at compile time without any runtime overhead.

static if (isRandomAccessRange!Needle) {
   // new optimized algorithm
} else {
   // old algorithm
}

The main difference from the old algorithm was that we could avoid creating a temporary slice to use startsWith on. Instead, a simple for-loop was enough. The requirement was that the needle must be a random access range.

When I had a good version, the time was ripe for a pull request. Of course, I had to fix issues like style guide formatting before the pull request was acceptable. The D community wants high-quality code, so the autotester checked my pull request by running tests on different platforms. Reviewers checked it manually.

Meanwhile in the forum, others chimed in. Chris and Andrei proposed more algorithms. Since we had a benchmark now, it was easy to include them. Here are some numbers:

DMD:                       LDC:
std find:    178 ±32       std find:    156 ±33
manual find: 140 ±28       manual find: 117 ±24
qznc find:   102 ±4        qznc find:   114 ±14
Chris find:  165 ±31       Chris find:  136 ±25
Andrei find: 130 ±25       Andrei find: 112 ±26

You see the five mentioned algorithms. The first number is the mean slowdown compared to the fastest one on each single run. The annotated ± number is the mean absolute deviation. I considered LDC’s performance more relevant than DMD’s. You see manual, qznc, and Andrei find report nearly the same slowdown (117, 114, 112), and the deviation was much larger than the differences. This meant they all had roughly the same speed. Which algorithm would you choose?

We certainly needed to pick one of the three top algorithms and we had to base the decision on this benchmark. Since the numbers were not clear, we needed to improve the benchmark. When we ran it on different computers (usually an Intel i5 or i7) the numbers changed a lot.

So far, the benchmark had been generating a random scenario and then measuring each algorithm against it. The fastest algorithm got a speed of 100 and the others got higher numbers which measured their slowdown. Now we could generate a lot of different scenarios and measure the mean across them for each algorithm. This design placed a big responsibility on the scenario generator. For example, it chose the length of the haystack and the needle from a uniform distribution within a certain range. Was the uniform distribution realistic? Were the boundaries of the range realistic?

After discussion in the forum, it came down to three basic use cases:

  1. Searching for a few words within english text. The benchmark has a copy of ‘Alice in Wonderland’ and the task is to search for a part of the last sentence.
  2. Searching for a short needle in a short haystack. This corresponds to something like finding line breaks as in the initial splitter use case. This favors naive algorithms which do not require any precomputation or other overhead.
  3. Searching in a long haystack for a needle which it doesn’t contain. This favors more clever algorithms which can skip over elements. To guarantee a mismatch, the generator inserts a special character into the needle, which we do not use to generate the haystack.
  4. Just for comparison, the previous random scenario is still measured.

At this point, we had a good benchmark on which we could base a decision.

Please throw this hat into the ring as well.

Andrei found another algorithm in his magic optimization bag. He knew the algorithm was good in some cases, but how would it fare in our benchmark? What were the numbers with this new contender?

In short: Andrei’s second algorithm completely dominated the ring. It has two names in the plot: Andrei2 as he posted it and A2Phobos as I generalized it and integrated it into Phobos. In the plots you see those two algorithms always close to the optimal result 100.

It was interesting that the naive algorithm still won in the ‘Alice’ benchmark, but the new Phobos was really close. The old Phobos std was roughly twice as slow for short strings, which we already knew from issue 9646.

What did this new algorithm look like? It used the same nested loop design as Andrei’s first one, but it computed the skip length only on demand. This meant one more conditional branch, but modern branch predictors seem to handle that easily.

Here is the final winning algorithm. The version in Phobos is only slightly more generic.

T[] find(T)(T[] haystack, T[] needle) {
  if (needle.length == 0) return haystack;
  immutable lastIndex = needle.length - 1;
  auto last = needle[lastIndex];
  size_t j = lastIndex, skip = 0;
  while (j < haystack.length) {
    if (haystack[j] != last) {
      ++j;
      continue;
    }
    immutable k = j - lastIndex;
    // last elements match, check rest of needle
    for (size_t i = 0; ; ++i) {
      if (i == lastIndex)
        return haystack[k..$]; // needle found
      if (needle[i] != haystack[k + i])
        break;
    }
    if (skip == 0) { // compute skip length
      skip = 1;
      while (skip < needle.length &&
             needle[$-1-skip] != needle[$-1]) {
        ++skip;
      }
    }
    j += skip;
  }
  return haystack[$ .. $];
}

Now you might want to run the benchmark yourself on your specific architecture. Get it from Github and run it with make dmd or make ldc. We are still interested in results from a wide range of architectures.

For me personally, this was my biggest contribution to D’s standard library so far. I’m pleased with the community. I deserved all criticism and it was professionally expressed. Now we can celebrate a faster find and fix the next issue. If you want to help, the D community will welcome you!