Author Archives: Jacob Carlborg

DStep 1.0.0

DStep is a tool for automatically generating D bindings for C and Objective-C libraries. This is implemented by processing C or Objective-C header files and outputting D modules. DStep uses the Clang compiler as a library (libclang) to process the header files.

Background

The first version of DStep was released on the 7th of July, 2012. There have been four subsequent releases, the last of which was on the 16th of January, 2016. Quite a lot has happened in the D world and with DStep since then.

After the release of DStep 0.2.1 in January 2016, there wasn’t much progress on DStep. I had a limited amount of time and chose to spend it on other projects. Fortunately, in 2016, DStep got picked as one of four D-related projects for Google Summer of Code (GSoC). The student who chose to work on DStep was Wojciech Szęszoł. He did a tremendous amount of work and pushed DStep forward by years compared to the time it would have taken me. In fact, I was often a blocker because I couldn’t keep up with reviewing all the changes he made.

New Release

The latest release of DStep contains a huge number of new features and bug fixes. A lot of the new features add support for translating C preprocessor macros in various forms. DStep also gained support for one more platform: Windows. Here follow some of the new features available in DStep 1.0.0:

Support for Simple Defines

This feature adds support for translating a simple form of #define to a manifest constant in D. Example:

#define FOO 1

The above C code is translated to the following D code:

enum FOO = 1;

DStep will try to translate the C code so that the D code looks as much as possible like the original C code. If a #define contains an expression instead of a single literal, DStep will try to preserve the original expression:

#define FOO 1 + 3

Instead of translating this to a manifest constant with the value of 4 (which would be semantically correct), DStep will preserve the original expression and translate it to:

enum FOO = 1 + 3;

This also goes for other types of literals, like hexadecimal literals:

#define FOO 0x1

Here DStep will preserve the hexadecimal literal and translate it to:

enum FOO = 0x1;
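As a small sketch of why preserving the expression is safe: a D manifest constant is evaluated at compile time, so `1 + 3` and `4` are interchangeable to the compiler. The names `BAR` and `buffer` below are made up for illustration and are not DStep output.

```d
// Manifest constants as DStep would emit them for the #defines above.
enum FOO = 1 + 3;
enum BAR = 0x1; // hypothetical second name, to avoid redefining FOO

// The expressions are folded at compile time.
static assert(FOO == 4);
static assert(BAR == 1);
static assert(is(typeof(FOO) == int));

// A manifest constant can be used wherever a compile-time value is needed.
int[FOO] buffer;

unittest
{
    assert(buffer.length == 4);
}
```

The example can be checked with `dmd -unittest -main -run example.d`.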

Function-Like Macros

DStep is now able to translate function-like macros. This is a pretty advanced feature that requires a small parser for the macros. DStep uses libclang to tokenize the macros and then parses them to be able to do the proper translations. The most basic example looks like:

#define FOO() 0 + 1

The above macro will translate to the following D code:

extern (D) int FOO()
{
    return 0 + 1;
}

Although not shown here (to minimize the examples), DStep will output extern (C): at the top of each file. Therefore, for macros translated to functions, DStep will add extern (D) to give the functions D linkage and mangling.

Here’s an example of a C macro containing parameters:

#define FOO(a, b) a + b

Unfortunately, in C, a and b can be basically anything. D doesn’t have an exact corresponding feature. DStep will translate this as accurately as possible by outputting a templated function:

extern (D) auto FOO(T0, T1)(auto ref T0 a, auto ref T1 b)
{
    return a + b;
}

The assumption in this translation is that a and b will each be a value of some type. They can either be of the same type or of different types. To avoid copying any of the values, ref parameters are used. Since an rvalue cannot be passed to a ref parameter, auto ref is used instead to properly handle both rvalues and lvalues.
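The following sketch calls the translated function with lvalues, rvalues, and mixed types. The variables `x` and `y` are made up for illustration. The last assertion shows one way the function can differ from the C macro: the macro is expanded textually, while the function evaluates `a + b` as a unit.

```d
// The translation shown above.
extern (D) auto FOO(T0, T1)(auto ref T0 a, auto ref T1 b)
{
    return a + b;
}

unittest
{
    int x = 1;
    double y = 2.5;

    // Lvalues of different types: bound by reference, no copies.
    assert(FOO(x, y) == 3.5);

    // Rvalues: a plain ref parameter would reject these literals.
    assert(FOO(3, 4) == 7);

    // In C, FOO(1, 2) * 3 expands to 1 + 2 * 3, which is 7.
    // The D function adds first, so the result is 9.
    assert(FOO(1, 2) * 3 == 9);
}
```

Bindings for macros that lack parentheses around their body may therefore need a manual check when the macro is used inside larger expressions.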

More advanced expressions are supported as well:

#define BAR 4
#define FOO(a, b) a + 3 + (b + BAR) - sizeof(b)

In the above example there’s a combination of parameters, literals, parenthesized expressions, usages of other macros, and built-in operators. DStep handles all of those and translates them to:

enum BAR = 4;

extern (D) auto FOO(T0, T1)(auto ref T0 a, auto ref T1 b)
{
    return a + 3 + (b + BAR) - b.sizeof;
}

Again, the expression is preserved as closely as possible to the original source code. The parentheses, the reference to the BAR macro, all are preserved.

Token Concatenation

This feature adds support for translating the token concatenation, or token pasting, operator to a D string concatenation:

#define CONCAT(prefix, name) prefix ## name

The above function-like macro concatenates the two given tokens. DStep translates that to a function that converts the arguments to strings and concatenates the two resulting strings. This can then be used together with the string mixin statement to give the same behavior as in C.

extern (D) string CONCAT(T0, T1)(auto ref T0 prefix, auto ref T1 name)
{
    import std.conv : to;

    return to!string(prefix) ~ to!string(name);
}
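A minimal sketch of how the translated function might be combined with a string mixin. It assumes the caller passes the tokens as strings, and the variable `fooBar` is a made-up symbol for illustration.

```d
import std.conv : to;

// The translation shown above, with the import moved to module scope.
extern (D) string CONCAT(T0, T1)(auto ref T0 prefix, auto ref T1 name)
{
    return to!string(prefix) ~ to!string(name);
}

// Hypothetical symbol whose name is assembled from two tokens.
int fooBar = 42;

// The call can be evaluated at compile time.
static assert(CONCAT("foo", "Bar") == "fooBar");

unittest
{
    // The mixin turns the resulting string back into code,
    // here the identifier fooBar.
    assert(mixin(CONCAT("foo", "Bar")) == 42);
}
```

The difference from C is that the arguments are strings rather than bare tokens, and the pasting only takes effect once the result is passed to `mixin`.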

Another example is parameters combined with tokens:

#define CONCAT(prefix) prefix ## name

This translates similarly to the previous example, but since name is not a parameter this will be translated to a string literal:

extern (D) string CONCAT(T)(auto ref T prefix)
{
    import std.conv : to;

    return to!string(prefix) ~ "name";
}

Preprocessor Constants in Array Sizes

DStep will now preserve preprocessor constants for the size of arrays:

#define Foo 3
int a[Foo];

In previous versions of DStep the translation would just output the size of the array a as 3. This would be semantically accurate but the generated source code would look less like the original C source code. Now DStep is able to translate preprocessor constants and can, therefore, use the preprocessor constant as the size of the array:

enum Foo = 3;
extern __gshared int[Foo] a;

In the above example, the manifest constant Foo is used as the size of a instead of a plain 3. This more closely matches the original C source code.

Preserving Comments

In previous versions of DStep comments were completely stripped out. With this release DStep is able to preserve comments in the D code from the original C code:

// This comment describes this whole file

// Documentation for the symbol `foo`
void foo();

/* Loose comment */ /* Loose comment */

/*
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
*/ /* Loose comment */

int a; // this is `a`

In the above example there are four types of comments:

  • A header comment for the whole file
  • A preceding comment for the symbol foo
  • Loose comments not belonging to any symbol
  • A trailing comment for the symbol a

All of these comments are now properly preserved:

// This comment describes this whole file

extern (C):

// Documentation for the symbol `foo`
void foo ();

/* Loose comment */ /* Loose comment */

/*
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
*/ /* Loose comment */

extern __gshared int a; // this is `a`

In the above example, notice how the header comment is placed above the extern (C): line. If a module declaration is output in the D file (when the --package flag is used), the header comment will be placed above that as well:

// This comment describes this whole file

module bar.foo;

extern (C):

// Documentation for the symbol `foo`
void foo ();

/* Loose comment */ /* Loose comment */

/*
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
    Multi-line loose comment.
*/ /* Loose comment */

extern __gshared int a; // this is `a`

It’s also possible to disable the preservation of comments using the --comments=false flag.

Package Prefix

To better help organize bindings, DStep supports the --package <name.of.package> flag. When this flag is enabled DStep will put all the translated modules in the <name.of.package> package. Note, this will only add the package prefix to the module declaration of the translated file. It will not place the output file in a directory corresponding to the package.

Removing Excessive Newlines

DStep will now remove excessive newlines but still preserve spacing for the original C code. This is best illustrated with an example:

int a;


int b;

In the above example there are two newlines between the declarations of a and b. DStep will remove the excessive newline and only output one to still preserve the spacing:

extern __gshared int a;

extern __gshared int b;

But if there is no spacing between the declarations, that is respected as well:

int a;
int b;

In the above example there is no newline between the declarations and DStep will preserve that:

extern __gshared int a;
extern __gshared int b;

Preserving Order of Declarations

In previous versions of DStep the declarations in the translated D code would follow a certain order defined by DStep: aliases first, then constants, then types, and finally functions. With this release, DStep preserves the order of the declarations in the original C code:

void bar();

struct Foo
{
    int a;
};

Previous versions would translate the above to:

struct Foo
{
    int a;
}

void bar ();

with the struct first and then the function declaration. With this release, the order is preserved:

void bar ();

struct Foo
{
    int a;
}

Multiple Input Files

Previous versions of DStep only allowed a single header file as input. With this release, multiple files can be passed to DStep at once. Each input file will produce one D source file as output. To pass multiple input files to DStep, just pass the filenames when invoking DStep.

$ dstep foo.h bar.h

Running the above command will produce two D source files: foo.d and bar.d.

If multiple input files and the -o flag are given, the -o flag specifies the output directory where the D source files will be placed. When multiple input files are given it’s not possible to specify the names of the D source files.

$ dstep foo.h bar.h -o foobar
$ ls foobar
bar.d foo.d

The above command will place the two D source files in the directory foobar.

--reduce-aliases Flag

Normally when DStep translates a header file to a D module it will reduce aliases if possible. DStep contains a set of common typedefs that can be reduced to native D types. That means that code like this:

#include <stdint.h>
int32_t a = 3;

Will be translated to the following D code:

extern __gshared int a;

In this release of DStep, there’s a new flag, --reduce-aliases. This flag allows the reduce aliases feature to be enabled or disabled. By default it’s enabled, but can be disabled by invoking DStep with the following command: dstep --reduce-aliases=false. When this feature is disabled, it will translate the above example to the following D code:

import core.stdc.stdint;
extern __gshared int32_t a;

It will keep the name int32_t as the type of the variable declaration and add the import for the module that contains the declaration of int32_t.
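As a quick illustration of why the reduction does not change the meaning of the binding: in D's `core.stdc.stdint`, `int32_t` is an alias and not a distinct type. The variables below are made up for the example.

```d
import core.stdc.stdint : int32_t;

// int32_t and int name the same type, so either spelling
// produces an identical declaration.
static assert(is(int32_t == int));

unittest
{
    int32_t a = 3;
    int b = a; // no conversion involved
    assert(b == 3);
}
```

Keeping the alias is mostly a matter of readability and of staying close to the C header.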

--alias-enum-members Flag

In C, enum members are accessible directly in the global scope. Example:

enum Foo
{
    foo,
    bar
};

enum Foo a = foo;

In D however, enum members need to be qualified with the enum name. The correct translation of the above would be:

enum Foo
{
    foo = 0,
    bar = 1
}


Foo a = Foo.foo;

In this release of DStep a new flag has been added, --alias-enum-members, that enables the generation of aliases to enum members at module scope. This keeps the translation closer to the original C code:

enum Foo
{
    foo = 0,
    bar = 1
}

alias foo = Foo.foo;
alias bar = Foo.bar;

By default this feature is not enabled for the translated D code to more closely follow the D conventions.
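A short sketch of what the generated aliases make possible. The variable `a` mirrors the C example above and is not part of the generated output.

```d
enum Foo
{
    foo = 0,
    bar = 1
}

// Aliases as generated with --alias-enum-members.
alias foo = Foo.foo;
alias bar = Foo.bar;

unittest
{
    // Reads like the C code: enum Foo a = foo;
    Foo a = foo;
    assert(a == Foo.foo);

    // The qualified and unqualified names refer to the same member.
    assert(bar == Foo.bar);
}
```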

--translate-macros Flag

In this release, DStep can now translate several kinds of C macros to their equivalent in D. This might not always be desirable because the translations are not bulletproof. Therefore, there’s a new flag, --translate-macros, which will enable or disable the translation of macros. By default translation of macros is enabled.

libclang Bindings

DStep uses Clang as a library to process the C code. There are two main ways of using Clang as a library. One is to use the C++ APIs directly. This gives full access to what Clang can do. The problem with this API is that it is not stable. It’s also a C++ API and when DStep was first implemented, the C++ integration in D was quite lacking. One of my requirements when implementing DStep was to implement it in D. Therefore, the natural choice was to use the C API, called “libclang”, which is also provided. This library is the main interface intended to be used by editors and IDEs that want to leverage Clang as a library. It’s also recommended to use libclang when accessing Clang from a language other than C++, which is D in my case.

Since libclang is a C library only exposing C header files, I needed bindings to be able to use it from within D. Up until now these bindings were handmade. Fortunately, these headers are the most forgiving I have ever seen when it comes to translating into D. Only a single header file was needed, containing hardly any macros at all. It was quite a quick job to translate these headers using search-and-replace.

With this release of DStep, since DStep has improved so much, the bindings are now self-hosted. That is, DStep has been used to generate the bindings. In addition to that, generation of the bindings has now been added to the test suite to make sure it doesn’t break.

Support for Windows

DStep was originally developed on macOS since that is my main development platform. Thanks to the Posix standard it was easy to port to Linux as well. The first release of DStep was available for both macOS and Linux. Back in 2011 when the development of DStep first started, Clang did not support Windows. There seems to have been some support for MinGW, but that was not a target supported by DMD. The LLVM and Clang team has made huge progress since then and in 2016 when DStep got picked as a GSoC project, support for Windows was available, targeting compatibility with the Visual Studio compiler.

In this release, thanks to Wojciech Szęszoł, DStep is now available on Windows. Due to Clang being compatible with Visual Studio, DStep needs to be built to use the same object format. When compiling with DMD, that means compiling for 64-bit (via the -m64 flag) or using the -m32mscoff flag when compiling for 32-bit. The Dub package automatically takes care of this.

Continuous Integration/Deployment

In the area of CI/CD quite a few things have happened. Originally the test suite of DStep was implemented using Cucumber and Ruby. These tests were a form of end-to-end test and failed to take advantage of the reasons to use Cucumber in the first place. These tests have now been replaced with a combination of unit tests and end-to-end tests, all implemented in D.

LDC

LDC has been added as a supported compiler. That means that DStep is compiled with LDC in addition to DMD as part of the CI pipelines. Every commit and every pull request is now tested with LDC as well.

Upgrade of Compilers

Both DMD and LDC have been upgraded to their latest versions. In addition, beta and nightly releases are being tested in the CI pipelines. A scheduled job has been added to the CI pipelines as well, which will run once every day to make sure new releases of the compilers won’t break DStep even if no changes have been made to DStep. This also means that only the latest versions of LDC and DMD are supported for building DStep.

Testing Windows using AppVeyor

Since DStep now supports Windows as an additional platform a new CI pipeline has been added in the form of AppVeyor. This is a CI service that provides Windows as a platform to run builds on. This build run compiles DStep both using DMD and LDC and it also builds both 32-bit and 64-bit versions.

Complex Floating-Point Types

Another feature that is new in this version of DStep is that complex floating-point types are now supported. There are three complex types that are supported: float _Complex, double _Complex and long double _Complex.

float _Complex a;
double _Complex b;
long double _Complex c;

The above code snippet in C is translated to the following D code:

extern __gshared cfloat a;
extern __gshared cdouble b;
extern __gshared creal c;

New Alias Syntax

Typedefs in C header files are translated to alias declarations in the D code. Up until this release they used the old alias syntax: alias oldName newName. Since the previous release of DStep the D language has improved and gained new features. One of them is a new (now considered the standard) alias syntax: alias newName = oldName. It’s easier to follow which name is the alias and which is the original when it’s using a more familiar syntax similar to variable declarations. Here’s an example of how the C code is translated into D code with the new alias syntax:

typedef int foo;

The above C code is translated to the following D code:

alias foo = int;

Custom Global Attributes

By default DStep doesn’t add any attributes like @nogc or nothrow to the translated code. In this release of DStep, support for attributes has been added. Custom global attributes can be enabled with the --global-attribute flag. For a C header file with the following content:

int a;

And invoking DStep with the following command:

$ dstep foo.h --global-attribute @nogc --global-attribute nothrow

Will output the following D code:

@nogc:
nothrow:

extern __gshared int a;
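The attributes have no visible effect on a variable like `a`, but they do matter for function declarations. The sketch below uses a made-up function `add` with a body so that it can be run. In real bindings it would be a declaration without a body.

```d
import std.traits : functionAttributes, FunctionAttribute;

@nogc:
nothrow:

// Hypothetical function standing in for a translated declaration.
int add(int a, int b)
{
    return a + b;
}

// Both attributes were applied to add without being written on it.
static assert(functionAttributes!add & FunctionAttribute.nogc);
static assert(functionAttributes!add & FunctionAttribute.nothrow_);

unittest
{
    assert(add(1, 2) == 3);
}
```

Callers marked `@nogc` or `nothrow` can then use the bindings without extra wrappers.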

Rename Enums

Unlike in D, enums in C don’t create a new scope for their members, even if a name is given to the enum. Example:

enum
{
    a,
    b
};

enum Foo
{
    c,
    d
};

int e = a;
int f = c;

In the above example it’s possible to access the enum members from both of the enums without qualifying the type. In D, this is not the case for named enums. They require qualifying the enum member with the type name:

enum
{
    a,
    b
}

enum Foo
{
    c,
    d
}

int e = a; // ok since the first enum is anonymous
int f = Foo.c; // need to qualify the enum member with the type name

To reduce the risk of symbol conflict it’s quite common for C libraries to prefix enum members with the name of the type:

enum Foo
{
    FooC,
    FooD
};

While the following is perfectly fine in D as well, it gets a bit redundant and verbose to have to specify Foo twice:

enum Foo
{
    FooC,
    FooD
}

int c = Foo.FooC;
int d = Foo.FooD;

For this reason DStep now supports a new flag, --rename-enum-members, which when enabled will try to remove any prefix of the enum member names. Given the following C header file:

enum Foo
{
    FooA,
    FooB
};

And running DStep as follows:

$ dstep foo.h --rename-enum-members

It will produce the following D code:

enum Foo
{
    a = 0,
    b = 1
}

DStep identified the Foo prefix and removed it from the enum member names. It also converted the names to lowercase to better match the standard D naming conventions.

By default this feature is not enabled to more closely match the original C code.

Normalize Modules

When the --package flag is specified DStep will add a module declaration to all D modules. By default, it will use the name of the input file as the name of the module. In the C world there’s no direct file-naming convention. Some libraries will use all lowercase letters, some will use snake case, some will use camel case, some will use Pascal case, and so on.

The standard D naming convention for modules (and therefore files) is to use only lowercase letters and underscores, i.e. snake case. To help with following this convention DStep now supports the new flag --normalize-modules. When this flag is enabled (and the --package flag is used) DStep will try to convert the name of the input file to a name matching the D conventions.

Given a C header file named Foo.h and only invoking DStep with the --package flag:

$ dstep Foo.h --package bar

DStep will produce the following D code:

module bar.Foo;

When the --normalize-modules flag is used as well:

$ dstep Foo.h --package bar --normalize-modules

DStep will output the following D code:

module bar.foo;

Note that Foo has been converted to foo.

Another example with a file using the Pascal naming convention:

$ dstep NSString.h --package bar --normalize-modules

And the result:

module bar.ns_string;

By default this feature is not enabled to more closely match the original C code.

Bit Fields

Another new feature that has been added in this release of DStep is support for bit fields. The bit field is a built-in language construct in C, but there’s no language support for it in D. Fortunately, with the help of D’s metaprogramming capabilities, the bit field has been implemented as a library construct and is available in the standard library [3]. The library construct will generate getters and setters that perform the same bit manipulation that the C compiler would have generated.

The following snippet in C:

struct Foo
{
    unsigned int a : 1;
    unsigned int b : 2;
    unsigned int c : 5;
};

Is translated to D:

struct Foo
{
    import std.bitmanip : bitfields;

    mixin(bitfields!(
        uint, "a", 1,
        uint, "b", 2,
        uint, "c", 5));
}

The D translation makes use of the bitfields template from the standard library. It’s automatically imported, directly inside the struct, to minimize the scope of where the symbol is available.
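A brief sketch of how the generated getters and setters might be used from D code. The values assigned are arbitrary and chosen to fit the declared widths.

```d
struct Foo
{
    import std.bitmanip : bitfields;

    mixin(bitfields!(
        uint, "a", 1,
        uint, "b", 2,
        uint, "c", 5));
}

unittest
{
    Foo f;

    // Each field is accessed through a generated property.
    f.a = 1;  // 1 bit:  0 .. 1
    f.b = 3;  // 2 bits: 0 .. 3
    f.c = 31; // 5 bits: 0 .. 31

    assert(f.a == 1);
    assert(f.b == 3);
    assert(f.c == 31);

    // Writing one field leaves the others untouched.
    f.b = 0;
    assert(f.a == 1 && f.c == 31);
}
```

Note that `bitfields` requires the widths in one instantiation to add up to 8, 16, 32, or 64 bits, which the example above satisfies (1 + 2 + 5 = 8).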


In his day job, Jacob Carlborg is a DevOps engineer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and DVM, the topic of another post on this blog. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.

A DUB Case Study: Compiling DMD as a Library

In his day job, Jacob Carlborg is a Ruby backend developer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and DVM, the topic of another post on this blog. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.


DUB is the official build tool and package manager for the D programming language. Originally written and currently maintained by Sönke Ludwig as part of the vibe.d web framework, its acceptance as an official part of the D toolchain means it is now shipping with the most recent DMD and LDC compilers.

A Quick Introduction to DUB

If you have the latest DMD or LDC installed, you already have DUB installed as well. If not, or if you want to check for a more recent version, you can get the very latest release, beta or release candidate from the DUB download page.

You can create a new DUB project by executing the dub init command. This will start an interactive setup that guides you through project creation.

  1. First decide the format of the package recipe. Two formats are supported: JSON and SDLang. Here we picked SDLang.
  2. Then specify the name of the project. Press enter to use the default name, which is displayed in brackets and is inferred from the directory.
  3. Do the same for the description, author, license, copyright, and dependencies to select the default values.

$ dub init foo
Package recipe format (sdl/json) [json]: sdl
Name [foo]:
Description [A minimal D application.]:
Author name [Jacob Carlborg]:
License [proprietary]:
Copyright string [Copyright © 2017, Jacob Carlborg]:
Add dependency (leave empty to skip) []:
Successfully created an empty project in '/Users/jacob/tmp/foo'.
Package successfully created in foo

After the setup has completed, the following files and directories will have been created:

$ tree foo
foo
├── dub.sdl
└── source
    └── app.d

1 directory, 2 files

  • dub.sdl is the package recipe file, which provides instructions telling DUB how to build the package
  • source is the default path where DUB looks for D source files
  • app.d contains the main function and is an example Hello World generated by DUB with the following content:

import std.stdio;

void main()
{
	writeln("Edit source/app.d to start your project.");
}

The content of the dub.sdl file is the following:

name "foo"
description "A minimal D application."
authors "Jacob Carlborg"
copyright "Copyright © 2017, Jacob Carlborg"
license "proprietary"

All of this was taken from what we specified during project creation. By default, DUB looks for D source files in either the source or src directory and compiles all files it finds there and in any subdirectories.

To build and run the application, navigate to the project’s root directory, foo in this case, and invoke dub:

$ dub
Performing "debug" build using dmd for x86_64.
foo ~master: building configuration "application"...
Linking...
Running ./foo
Edit source/app.d to start your project.

To build without running, invoke dub build:

$ dub build
Performing "debug" build using dmd for x86_64.
foo ~master: building configuration "application"...
Linking...

Case Study: DMD as a Library

Recently there has been some progress in making the D compiler (DMD) available as a library. Razvan Nitu has been working on it as part of his D Foundation scholarship at the University Politehnica of Bucharest. He gave a presentation at DConf 2017 (a video of the talk is available, as well as examples in the DMD repository). So I had the idea that as part of the DConf 2017 hackathon I could create a simple DUB package for DMD to make only the lexer and the parser available as a library, something his work has made possible.

Currently DMD is built using make. There are three Makefiles, one for Posix, one for 32-bit Windows and one for 64-bit Windows (which is only a wrapper of the 32-bit one). I don’t intend to try to completely replicate the Makefiles as a DUB package (they contain some additional tasks besides building the compiler), but instead will start out fresh and only include what’s necessary to build the lexer and parser.

DMD already has all the source code in the src directory, which is one of the directories DUB searches by default. If we left it as is, DUB would include the entirety of DMD, including the backend and other parts we don’t want to include at this point.

The first step is to create the DUB package recipe file. We start simple with only the metadata (here using the SDLang format):

name "dmd"
description "The DMD compiler"
authors "Walter Bright"
copyright "Copyright © 1999-2017, Digital Mars"
license "BSL-1.0"

When we have this we need to figure out which files to include in the package. We can do this by invoking DMD with the -deps flag to generate the imports of a module. A good start is the lexer, which is located in src/ddmd/lexer.d. We run the following command to output the imports that lexer.d is using:

$ dmd -deps=deps.txt -o- -Isrc src/ddmd/lexer.d

This will write a file named deps.txt containing all the imports used by lexer.d. The -o- flag is used to tell the compiler not to generate any code. The -I flag is used to add an import path where the compiler will look for additional modules to import (but not compile). An example of the output looks like this (the long path names have been reduced to save space):

core.attribute (druntime/import/core/attribute.d) : private : object (druntime/import/object.d)
object (druntime/import/object.d) : public : core.attribute (druntime/import/core/attribute.d):selector
ddmd.lexer (ddmd/lexer.d) : private : object (druntime/import/object.d)
core.stdc.ctype (druntime/import/core/stdc/ctype.d) : private : object (druntime/import/object.d)
ddmd.root.array (ddmd/root/array.d) : private : object (druntime/import/object.d)
ddmd.root.array (ddmd/root/array.d) : private : core.stdc.string (druntime/import/core/stdc/string.d)

The most interesting part of this output, in this case, is the first column, which consists of a long list of module names. What we are interested in here is a unique list of modules that are located in the ddmd package. All modules in the core package are part of the D runtime and are already precompiled as a library and automatically linked when compiling a D executable, so these modules don’t need to be compiled. The modules from the ddmd package can be extracted with some search-and-replace in your favorite text editor or using some standard Unix command-line tools:

$ cat deps.txt | cut -d ' ' -f 1 | grep ddmd | sort | uniq
ddmd.console
ddmd.entity
ddmd.errors
ddmd.globals
ddmd.id
ddmd.identifier
ddmd.lexer
ddmd.root.array
ddmd.root.ctfloat
ddmd.root.file
ddmd.root.filename
ddmd.root.hash
ddmd.root.outbuffer
ddmd.root.port
ddmd.root.rmem
ddmd.root.rootobject
ddmd.root.stringtable
ddmd.tokens
ddmd.utf
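For readers who prefer to stay in D, the same extraction could be sketched with Phobos ranges. The function name `ddmdModules` and the shortened sample lines are made up for illustration. Each step is annotated with the shell command it stands in for.

```d
import std.algorithm.iteration : filter, map, uniq;
import std.algorithm.searching : canFind, findSplit;
import std.algorithm.sorting : sort;
import std.array : array;
import std.string : splitLines;

// Returns the sorted, unique module names from the first column of
// the -deps output that contain "ddmd".
string[] ddmdModules(string depsText)
{
    auto names = depsText
        .splitLines
        .map!(line => line.findSplit(" ")[0])   // cut -d ' ' -f 1
        .filter!(name => name.canFind("ddmd"))  // grep ddmd
        .array;

    names.sort();             // sort
    return names.uniq.array;  // uniq
}

unittest
{
    // Shortened sample in the format shown earlier.
    enum sample =
        "ddmd.lexer (ddmd/lexer.d) : private : object (object.d)\n" ~
        "core.stdc.ctype (ctype.d) : private : object (object.d)\n" ~
        "ddmd.root.array (ddmd/root/array.d) : private : object (object.d)\n" ~
        "ddmd.root.array (ddmd/root/array.d) : private : core.stdc.string (string.d)";

    assert(ddmdModules(sample) == ["ddmd.lexer", "ddmd.root.array"]);
}
```

In practice the text would be read from deps.txt, for example with `std.file.readText`.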

Here we can see that a set of modules is located in the nested package ddmd.root. This package contains common functionality used throughout the DMD source code. Since it doesn’t have any dependencies on any code outside the package it’s a good fit to place in a DUB subpackage. This can be done using the subPackage directive, as follows:

subPackage {
  name "root"
  targetType "library"
  sourcePaths "src/ddmd/root"
}

We specify the name of the subpackage, root. The targetType directive is used to tell DUB whether it should build an executable or a library (though it’s optional: DUB will build an executable if it finds an app.d in the root of the source directory and a library if it doesn’t). Finally, sourcePaths can be used to specify the paths where DUB should look for the D source files if neither of the default directories is used. Fortunately, we want to include all the files in the src/ddmd/root directory, so using sourcePaths works perfectly fine.

We can verify that the subpackage works and builds by invoking:

$ dub build :root
Building package dmd:root in /Users/jacob/development/d/dlang/dmd/
Performing "debug" build using dmd for x86_64.
dmd:root ~master: building configuration "library"...

:package-name is shorthand that tells DUB to build the package-name subpackage of the current package, in our case the root subpackage.

After removing all the modules from the root package from the initial list of dependencies, the following modules remain:

ddmd.console
ddmd.entity
ddmd.errors
ddmd.globals
ddmd.id
ddmd.identifier
ddmd.lexer
ddmd.tokens
ddmd.utf

The next step is to create a subpackage for the lexer containing the remaining modules.

subPackage {
  name "lexer"
  targetType "library"
  sourcePaths

Again we start by specifying the name of the subpackage and that the target type is a library. Specifying sourcePaths without any value will set it to an empty list, i.e. no source paths. This is done because there are more files than we want to include in this subpackage in the source directory.

sourceFiles \
    "src/ddmd/console.d" \
    "src/ddmd/entity.d" \
    "src/ddmd/errors.d" \
    "src/ddmd/globals.d" \
    "src/ddmd/id.d" \
    "src/ddmd/identifier.d" \
    "src/ddmd/lexer.d" \
    "src/ddmd/tokens.d" \
    "src/ddmd/utf.d"

The above specifies all source files that should be included in this subpackage. The difference between sourcePaths and sourceFiles is that sourcePaths expects a whole directory of source files that should be included, whereas sourceFiles lists only the individual files that should be included. A list in SDLang is written by separating the items with a space. The backslash (\) is used for line continuation, making it possible to spread the list across multiple lines.

The final step of the lexer subpackage is to add a dependency on the root subpackage. This is done with the dependency directive:

dependency "dmd:root" version="*"
}

The first parameter for the dependency directive is the name of another DUB package. The colon is used to separate the package name from the subpackage name. The version attribute is used to specify which version the package should depend on. The * is used to indicate that any version of the dependency matches, i.e. the latest version should always be used. When implementing subpackages in any given package, this is generally what should be used. External projects that depend on any DUB package should specify a SemVer version number corresponding to a known release version.

If we build the lexer subpackage now it will result in an error:

$ dub build :lexer
Building package dmd:lexer in /Users/jacob/development/d/dlang/dmd/
Performing "debug" build using dmd for x86_64.
dmd:lexer ~master: building configuration "library"...
src/ddmd/globals.d(339,21): Error: need -Jpath switch to import text file VERSION
dmd failed with exit code 1.

Looking at the file and line of the error shows that it contains the following code:

_version = (import("VERSION") ~ '\0').ptr;

This code contains an import expression. Import expressions differ from import statements (e.g. import std.stdio;) in that they take a file from the file system and insert its contents into the current module. It’s just as if you copied and pasted the contents yourself. As a security mechanism, the compiler requires that the directory the file is imported from be passed to it explicitly. This can be done using the -J flag. In this case, we want to use the package root, where we are executing DUB, so we can use a single dot: “.”. Passing arbitrary flags to the compiler can be done with the dflags build setting, as follows:

dflags "-J."

Add that to the lexer subpackage configuration and it will compile correctly:

$ dub build :lexer
Building package dmd:lexer in /Users/jacob/development/d/dlang/dmd/
Performing "debug" build using dmd for x86_64.
dmd:lexer ~master: building configuration "library"...

For the final subpackage, we have the parser. The parser is located in src/ddmd/parse.d. To get its dependencies, we can use the same approach we used for the lexer, but this time we filter out all files that are part of the other subpackages:

$ dmd -deps=deps.txt -Isrc -J. -o- src/ddmd/parse.d
$ cat deps.txt | cut -d ' ' -f 1 | grep ddmd | grep -E -v '(root|console|entity|errors|globals|id|identifier|lexer|tokens|utf)' | sort | uniq
ddmd.parse

Here, we’re supplying the -v flag to grep to invert the match, so that matching lines are removed from the results, and the -E flag to enable extended regular expressions. All modules from the root subpackage and all modules from the lexer subpackage are filtered out, and the only remaining module is the ddmd.parse module.
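To see the filter in isolation, here is a minimal sketch that runs the same pipeline against a small hand-written sample instead of a real deps.txt. The sample lines are an assumption: they only imitate the general shape of the compiler's dependency output (module name first, followed by a space), and the paths in them are made up.

```shell
#!/bin/sh
# Hypothetical sample imitating the shape of a deps.txt file:
# the importing module's name is the first space-separated field.
sample_deps='ddmd.parse (src/ddmd/parse.d) : private : ddmd.globals (src/ddmd/globals.d)
ddmd.parse (src/ddmd/parse.d) : private : ddmd.lexer (src/ddmd/lexer.d)
ddmd.lexer (src/ddmd/lexer.d) : private : ddmd.tokens (src/ddmd/tokens.d)
ddmd.root.outbuffer (src/ddmd/root/outbuffer.d) : private : core.stdc.string (druntime/core/stdc/string.d)
core.stdc.string (druntime/core/stdc/string.d) : private : object (druntime/object.d)'

# Same pipeline as above: keep the first field, keep only ddmd modules,
# drop everything that belongs to the root or lexer subpackages, deduplicate.
remaining_modules=$(printf '%s\n' "$sample_deps" \
    | cut -d ' ' -f 1 \
    | grep ddmd \
    | grep -E -v '(root|console|entity|errors|globals|id|identifier|lexer|tokens|utf)' \
    | sort | uniq)

echo "$remaining_modules"
```

Note that the pattern is not anchored, so each alternative matches anywhere in the module name. For example, id would also drop any module whose name merely contains those two letters. That is harmless here, but worth keeping in mind when reusing the filter on a larger set of modules.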

The subpackage for the parser will look similar to the other subpackages:

subPackage {
  name "parser"
  targetType "library"
  sourcePaths

  sourceFiles "src/ddmd/parse.d"

  dependency "dmd:lexer" version="*"
}

Again, we can verify that it’s working by building the subpackage:

$ dub build :parser
Building package dmd:parser in /Users/jacob/development/d/dlang/dmd/
Performing "debug" build using dmd for x86_64.
dmd:parser ~master: building configuration "library"...

Currently we have three subpackages in the DUB recipe file, but no way to use the main package as a whole. To fix this we add the parser subpackage as a dependency of the main package. We pick the parser subpackage as a dependency because it will include the other two subpackages through its own dependencies.

license "BSL-1.0"

targetType "none"
dependency ":parser" version="*"

subPackage {
  name "root"

In addition to specifying parser as a dependency, we also specify the target type to be none. This will avoid building an empty library out of the main package, since it doesn’t contain any source files of its own.

As a final step, we’ll verify that the whole library is working by creating a separate project that uses the DMD DUB package as a dependency. We create a new DUB project in the test directory, called dub_package:

$ cd test
$ mkdir dub_package
$ cd dub_package
$ cat > dub.sdl <<EOF
> name "dmd-dub-test"
> description "Test of the DMD Dub package"
> license "BSL-1.0"
>
> dependency "dmd" path="../../"
> EOF
$ mkdir source

We create a new file, source/app.d, with the following content:

void main()
{
}

// lexer
unittest
{
    import ddmd.lexer;
    import ddmd.tokens;

    immutable expected = [
        TOKvoid,
        TOKidentifier,
        TOKlparen,
        TOKrparen,
        TOKlcurly,
        TOKrcurly
    ];

    immutable sourceCode = "void test() {} // foobar";
    scope lexer = new Lexer("test", sourceCode.ptr, 0, sourceCode.length, 0, 0);
    lexer.nextToken;

    TOK[] result;

    do
    {
        result ~= lexer.token.value;
    } while (lexer.nextToken != TOKeof);

    assert(result == expected);
}

// parser
unittest
{
    import ddmd.astbase;
    import ddmd.parse;

    scope parser = new Parser!ASTBase(null, null, false);
    assert(parser !is null);
}

The above file contains two unit tests, one for the lexer and one for the parser. We can run dub test to run the unit tests for this package:

$ dub test
No source files found in configuration 'library'. Falling back to "dub -b unittest".
Performing "unittest" build using dmd for x86_64.
dmd:root ~issue-17392-dub: building configuration "library"...
dmd:lexer ~issue-17392-dub: building configuration "library"...
../../src/ddmd/globals.d(339,21): Error: file "VERSION" cannot be found or not in a path specified with -J
dmd failed with exit code 1.

This gives us an error saying that the VERSION file cannot be found in any string import path, even though we added the correct directory to the string import paths. If we run the tests with verbose output enabled, using the --verbose flag, we get a hint (the output has been reduced to save space):

dmd:lexer ~issue-17392-dub: building configuration "library"...
dmd -J. -lib

Here we see that the compiler is invoked with the -J. flag, which is what we previously specified in the lexer subpackage. The problem is that the current directory is now the directory of the dmd-dub-test DUB package instead of that of the dmd DUB package, so the single dot no longer points at the directory containing the VERSION file. Looking at the documentation of DUB, we can see there’s an environment variable, $PACKAGE_DIR, that we can use as the string import path instead of hardcoding it to a single dot. We update the dflags setting of the lexer subpackage to use the $PACKAGE_DIR environment variable:

dflags "-J$PACKAGE_DIR"
}
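The underlying issue is the ordinary difference between a relative and an absolute path. The following sketch illustrates it with plain file lookups in a temporary directory; it does not invoke dmd or DUB, and the directory layout and the lookup helper are hypothetical stand-ins for the real packages.

```shell
#!/bin/sh
# Hypothetical layout: a "dmd" package root holding VERSION, and a
# dependent test package nested two levels below it.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/dmd/test/dub_package"
echo "demo-version" > "$demo_root/dmd/VERSION"

# Reports whether VERSION can be found in the given search directory,
# resolved against the current working directory like a relative -J path.
lookup() {
    if [ -f "$1/VERSION" ]; then echo found; else echo missing; fi
}

# Building from the package root: "." is the right directory.
cd "$demo_root/dmd"
from_package_root=$(lookup .)

# Building from the dependent package: "." now means a different directory,
# while an absolute path to the package root still works.
cd "$demo_root/dmd/test/dub_package"
relative_lookup=$(lookup .)
absolute_lookup=$(lookup "$demo_root/dmd")

echo "root: $from_package_root, relative: $relative_lookup, absolute: $absolute_lookup"
```

The absolute path plays the role that $PACKAGE_DIR plays in the recipe: it keeps pointing at the package that owns the file, regardless of where the build is started.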

Running the tests again shows that the error is fixed, but now we get a new error, a long list of undefined symbols (shortened here):

$ dub test
No source files found in configuration 'library'. Falling back to "dub -b unittest".
Performing "unittest" build using dmd for x86_64.
dmd:root ~issue-17392-dub: building configuration "library"...
dmd:lexer ~issue-17392-dub: building configuration "library"...
dmd:parser ~issue-17392-dub: building configuration "library"...
dmd-dub-test ~master: building configuration "application"...
Linking...
Undefined symbols for architecture x86_64:
  "_D4ddmd7astbase12__ModuleInfoZ", referenced from:
      _D3app12__ModuleInfoZ in dmd-dub-test.o

The reason for this is that we’re importing the ddmd.astbase module in the test of the parser, but it’s never compiled. We can solve that problem by adding it to the parser subpackage in the dmd DUB package. Running dmd again to list the dependencies of ddmd.astbase shows that it also depends on the ddmd.astbasevisitor module. We add these two modules as follows:

sourceFiles \
  "src/ddmd/astbase.d" \
  "src/ddmd/astbasevisitor.d" \
  "src/ddmd/parse.d"

Finally, running the tests again shows that everything is working correctly:

$ dub test
No source files found in configuration 'library'. Falling back to "dub -b unittest".
Performing "unittest" build using dmd for x86_64.
dmd:root ~issue-17392-dub: building configuration "library"...
dmd:lexer ~issue-17392-dub: building configuration "library"...
dmd:parser ~issue-17392-dub: building configuration "library"...
dmd-dub-test ~master: building configuration "application"...
Linking...
Running ./dmd-dub-test

After verifying that both the lexer and parser are working in a separate DUB package, this is the final result of the package recipe for the dmd DUB package:

name "dmd"
description "The DMD compiler"
authors "Walter Bright"
copyright "Copyright © 1999-2017, Digital Mars"
license "BSL-1.0"

targetType "none"
dependency ":parser" version="*"

subPackage {
  name "root"
  targetType "library"
  sourcePaths "src/ddmd/root"
}

subPackage {
  name "lexer"
  targetType "library"
  sourcePaths

  sourceFiles \
    "src/ddmd/console.d" \
    "src/ddmd/entity.d" \
    "src/ddmd/errors.d" \
    "src/ddmd/globals.d" \
    "src/ddmd/id.d" \
    "src/ddmd/identifier.d" \
    "src/ddmd/lexer.d" \
    "src/ddmd/tokens.d" \
    "src/ddmd/utf.d"

  dflags "-J$PACKAGE_DIR"

  dependency "dmd:root" version="*"
}

subPackage {
  name "parser"
  targetType "library"
  sourcePaths

  sourceFiles \
    "src/ddmd/astbase.d" \
    "src/ddmd/astbasevisitor.d" \
    "src/ddmd/parse.d"

  dependency "dmd:lexer" version="*"
}

All this has now been merged into master and the DUB package is available here: http://code.dlang.org/packages/dmd. Happy hacking!

Inside D Version Manager

In his day job, Jacob Carlborg is a Ruby backend developer for Derivco Sweden, but he’s been using D on his own time since 2006. He is the maintainer of numerous open source projects, including DStep, a utility that generates D bindings from C and Objective-C headers, DWT, a port of the Java GUI library SWT, and the topic of this post, DVM. He implemented native Thread Local Storage support for DMD on OS X and contributed, along with Michel Fortin, to the integration of Objective-C in D.


D Version Manager (DVM) is a cross-platform tool that allows you to easily download, install and manage multiple D compiler versions. With DVM, you can select a specific version of the compiler to use without having to manually modify the PATH environment variable. A selected compiler is unique in each shell session, and it’s possible to configure a default compiler.

The main advantage of DVM is the easy downloading and installation of different compiler versions. Specify the version of the compiler you would like to install, e.g. dvm install 2.071.1, and it will automatically download and install that version. Then you can tell DVM to use that version by executing dvm use 2.071.1. After that, you can invoke the compiler as usual with dmd. The selected compiler version will persist until the end of the shell session.

DVM makes it possible for the user to select a specific compiler version without having to modify any makefiles or build scripts. It’s enough for any build script to refer to the compiler by name, i.e. dmd, as long as the user selects the compiler version with DVM before invoking the script.

History

DVM was created at the beginning of 2011. That was a different time for D. No proper installers existed, D1 was still a viable option, and each new release of DMD brought with it a number of regressions. Because of all the regressions, it was basically impossible to use the latest compiler, or often even any single older compiler, for all of your projects. Taking into consideration projects from other developers, some were written in D1 and some in D2, making it inconvenient to have only one compiler version installed.

It was for these reasons I created DVM. Being able to have different versions of the compiler active in different shell sessions makes it easy to work on different projects requiring different versions of the compiler. For example, it was possible to open one tab for a D1 compiler and another for a D2 compiler.

The concept of DVM comes directly from the Ruby tool RVM. Where DVM installs D compilers, RVM installs Ruby interpreters. RVM can do everything DVM can do and a lot more. One of the major things I did not want to copy from RVM is that it’s completely written in shell script (bash). I wanted DVM to be written in D. Because it’s written in shell script, RVM enables some really useful features that DVM does not support, but some of them are questionable (some might call them hacks). For example, when navigating to an RVM-enabled project, RVM will automatically select the correct Ruby interpreter. However, it accomplishes this by overriding the built-in cd command. When the command is invoked, RVM will look in the target directory for one of the files .rvmrc or .ruby-version. If either is present, it will read that file to determine which Ruby interpreter to select.

Implementation and Usage

One of the goals of DVM was that it should be implemented in D. In the end, it was mostly written in D with a few bits of shell script. Note that the following implementation details are specific to the platforms that fall under D’s Posix umbrella, i.e. version(Posix), but DVM is also available for Windows with the same functionality.

Structure of the DVM Installation

Before DVM can be used, it needs to install itself. This is accomplished with the command dvm install dvm. This will create the ~/.dvm directory, which contains the following subdirectories: archives, bin, compilers, env and scripts.

archives contains a cache of downloaded zip archives of D compilers.

bin contains shell scripts, acting as symbolic links, to all installed D compilers. The name of each script contains the version of the compiler, e.g. dmd-2.071.1, making it possible to invoke a specific compiler without first having to invoke the use command. This directory also contains one shell script, dvm-current-dc, pointing to the currently active D compiler. This allows the currently active D compiler to be invoked without knowing which version has been set. This can be useful for executing the compiler from within an editor or IDE, for example. A shell script for the default compiler exists as well. Finally, this directory also contains the dvm binary itself.

The compilers directory contains all installed compilers. All of the downloaded compilers are unpacked here. Due to the varying quality of the D compiler archives throughout the years, the install command will also make a few adjustments if necessary. In the old days, there was only one archive for all platforms. This command will only include binaries and libraries for the current platform. Another adjustment is to make sure all executables have the executable permission set.

The env directory contains helper shell scripts for the use command. There’s one script for each installed compiler and one for the default selected compiler.

The scripts directory currently only contains one file, dvm. It’s a shell script which wraps the dvm binary in the bin directory. The purpose of this wrapper is to aid the use command.
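To give an idea of how a shell script can act as a symbolic link, here is a minimal sketch of a forwarding script. It is not DVM's actual script: the temporary directory, the fake compiler and the file contents are all assumptions made for illustration. Only the idea of a versioned script name that forwards to the unpacked compiler comes from the description above.

```shell
#!/bin/sh
# Hypothetical layout in a temporary directory, standing in for ~/.dvm.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/bin" "$demo_root/compilers/dmd-2.071.1/bin"

# Stand-in for the real compiler: it only reports how it was called.
cat > "$demo_root/compilers/dmd-2.071.1/bin/dmd" <<'EOF'
#!/bin/sh
echo "fake dmd 2.071.1 called with: $*"
EOF

# Versioned forwarding script: replaces itself with the target compiler
# and passes all arguments through unchanged.
cat > "$demo_root/bin/dmd-2.071.1" <<EOF
#!/bin/sh
exec "$demo_root/compilers/dmd-2.071.1/bin/dmd" "\$@"
EOF

chmod +x "$demo_root/compilers/dmd-2.071.1/bin/dmd" "$demo_root/bin/dmd-2.071.1"

# Invoking the versioned name reaches the compiler without any "use" step.
forwarded_output=$("$demo_root/bin/dmd-2.071.1" -o- app.d)
echo "$forwarded_output"
```

Using exec means no extra shell process stays around while the compiler runs, and quoting "$@" keeps arguments containing spaces intact.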

The use Command

The most interesting part of the implementation is the use command, which selects a specific compiler, e.g. dvm use 2.071.1. The selection of a compiler will persist for the duration of the shell session (window, tab, script file).

The command works by prepending the path of the specified compiler to the PATH environment variable. This can be ~/.dvm/compilers/dmd-2.071.1/{platform}/bin for example, where {platform} is the currently running platform. By prepending the path to the environment variable, it guarantees the selected compiler takes precedence over any other possible compilers in the PATH. The reason the {platform} section of the path exists is related to the structure of the downloaded archive. Keeping this structure avoids having to modify the compiler’s configuration file, dmd.conf.
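The effect of prepending can be demonstrated with a small sketch. The two directories and the demo-compiler command are hypothetical; they stand in for a system-wide compiler and a compiler selected with DVM.

```shell
#!/bin/sh
# Two hypothetical installations of the same command in a temp directory.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/system/bin" "$demo_dir/selected/bin"
for name in system selected; do
    # Each fake compiler just prints which installation it belongs to.
    printf '#!/bin/sh\necho "%s"\n' "$name" > "$demo_dir/$name/bin/demo-compiler"
    chmod +x "$demo_dir/$name/bin/demo-compiler"
done

# Only the "system" installation is on PATH at first.
PATH="$demo_dir/system/bin:$PATH"
before=$(demo-compiler)

# Prepending the "selected" installation makes it win the lookup,
# because PATH entries are searched from left to right.
PATH="$demo_dir/selected/bin:$PATH"
after=$(demo-compiler)

echo "before: $before, after: $after"
```

The other installation is still on the PATH, it is simply shadowed by the entry in front of it.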

The interesting part here is that it’s not possible to modify the environment variables of the parent process, which in this case is the shell. The magic behind the use command is that the dvm command that you’re actually invoking is not the D binary; it’s the shell script in the ~/.dvm/scripts path. This shell script contains a function called dvm. This can be verified by invoking type dvm | head -n 1, which should print dvm is a function if everything is installed correctly.

The installation of DVM adds a line to the shell initialization file, .bashrc, .bash_profile or similar. This line will load/source the DVM shell script in the ~/.dvm/scripts path which will make the dvm command available. When the dvm function is invoked, it will forward the call to the dvm binary located in ~/.dvm/bin/dvm. The dvm binary contains all of the command logic. When the use command is invoked, the dvm binary will write a new file to ~/.dvm/tmp/result and exit. This file contains a command for loading/sourcing the environment file available in ~/.dvm/env that corresponds to the version that was specified when the use command was invoked. After the dvm binary has exited, the shell script function takes over again and loads/sources the result file if it exists. Since the shell script is loaded/sourced instead of executed, the code will be evaluated in the current shell instead of a sub-shell. This is what makes it possible to modify the PATH environment variable. After the result file is loaded/sourced, it’s removed.
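The flow described above can be sketched in a few lines of shell. This is a simplified illustration and not DVM's actual code: a temporary directory stands in for ~/.dvm, a tiny script named demo-dvm stands in for the dvm binary, and the function is called demo_dvm to make clear that all of these names are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for ~/.dvm, created in a temporary directory.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/bin" "$demo_home/env" "$demo_home/tmp" \
         "$demo_home/compilers/dmd-2.071.1/bin"

# Environment file for one compiler: prepends its bin directory to PATH.
cat > "$demo_home/env/dmd-2.071.1" <<EOF
export PATH="$demo_home/compilers/dmd-2.071.1/bin:\$PATH"
EOF

# Stand-in for the binary: on "use <version>" it cannot change the caller's
# environment, so it writes a result file that sources the environment file.
cat > "$demo_home/bin/demo-dvm" <<EOF
#!/bin/sh
if [ "\$1" = "use" ]; then
    echo ". '$demo_home/env/dmd-\$2'" > "$demo_home/tmp/result"
fi
EOF
chmod +x "$demo_home/bin/demo-dvm"

# Wrapper function, defined in the current shell: forwards the call to the
# binary, then sources and removes the result file if one was written.
demo_dvm() {
    "$demo_home/bin/demo-dvm" "$@"
    if [ -f "$demo_home/tmp/result" ]; then
        . "$demo_home/tmp/result"
        rm -f "$demo_home/tmp/result"
    fi
}

demo_dvm use 2.071.1

# The first PATH entry of the current shell is now the selected compiler.
echo "${PATH%%:*}"
```

Because demo_dvm is a function, the sourcing happens in the shell that called it. Had the wrapper been an executable script instead, the PATH change would have been lost as soon as the script exited.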

If you find yourself with the need to build your D project(s) with multiple compiler versions, such as the current release of DMD, one or more previous releases, and/or the latest beta, then DVM will allow you to do so in a hassle-free manner. Pull up a shell, execute use on the version you want, and away you go.