Don’t Fear the Reaper

Posted on

D, like many other programming languages in active use today, comes with a garbage collector out of the box. There are many types of software that can be written without worrying at all about the GC, taking full advantage of its benefits. But the GC does have drawbacks, and there are certainly scenarios in which garbage collection is undesirable. For those situations, the language provides the means to temporarily disable it, or even avoid it completely.

In order to maximize the positive impacts of garbage collection and minimize any negative, it’s necessary to have a good grounding in how the GC works in D. A good place to start is the Garbage Collection page at dlang.org, which outlines the rationale for D’s GC and provides some tips on working with it. This post is the first in a series intended to expand on the information provided on that page.

This time, we’ll look at the very basics, focusing on the language features that can trigger GC allocations. Future posts will introduce ways to disable the GC when necessary, as well as idioms useful in dealing with its nondeterministic nature (e.g. managing resources in the destructors of GC-managed objects).

The first thing to understand about D’s garbage collector is that it only runs during allocation, and only if there is no memory available to allocate. It isn’t sitting in the background, periodically scanning the heap and collecting garbage. This knowledge is fundamental to writing code that makes efficient use of GC-managed memory. Take the following example:

void main() {
    int[] ints;
    foreach(i; 0..100) {
        ints ~= i;
    }
}

This declares a dynamic array of int, then uses D’s append operator to append the numbers 0 to 99 in a foreach range loop. What isn’t obvious to the untrained eye is that the append operator makes use of the GC heap to allocate space for the values it adds to the array.

DRuntime’s array implementation isn’t a dumb one. In the example, there aren’t going to be one hundred allocations, one for each value. When more memory is needed, the implementation will allocate more space than is requested. In this particular case, it’s possible to determine how many allocations are actually made by making use of the capacity property of D’s dynamic arrays and slices. This returns the total number of elements the array can hold before an allocation is necessary.

void main() {
    import std.stdio : writefln;
    int[] ints;
    size_t before, after;
    foreach(i; 0..100) {
        before = ints.capacity;
        ints ~= i;
        after = ints.capacity;
        if(before != after) {
            writefln("Before: %s After: %s",
                before, after);
        }
    }
}

Executing this when compiled with DMD 2.073.2 shows the message is printed six times, meaning there were six total GC allocations in the loop. That means there were six opportunities for the GC to collect garbage. In this small example, it almost certainly didn’t. If this loop were part of a larger program, with GC allocations throughout, it very well might.

On a side note, it’s informative to examine the values of of before and after. Doing so shows a sequence of 0, 3, 7, 15, 31, 63, and 127. So by the end, ints contains 100 values and has space for appending 27 more before the next allocation, which, extrapolating from the values in the sequence, should be 255. That’s an implementation detail of DRuntime, though, and could be tweaked or changed in any release. For more details on how arrays and slices are managed by the GC, take a look at Steven Schveighoffer’s excellent article on the topic.

So, six allocations, six opportunities for the GC to initiate one of its pauses of unpredictable length even in that simple little loop. In the general case, that could become an issue depending on if the loop is in a hot part of code and how much total memory is allocated from the GC heap. But even then, it’s not necessarily a reason to disable the GC in that part of the code.

Even with languages that don’t come with a stock GC out of the box, like C and C++, most programmers learn at some point that it’s better for overall performance to allocate as much as possible up front and minimize allocations in the inner loops. It’s one of the many types of premature optimization that are not actually the root of all evil, something we tend to call best practice. Given that D’s GC only runs when memory is allocated, the same strategy can be applied as a simple way to mitigate any potential impact it could have on performance. Here’s one way to rewrite the example:

void main() {
    int[] ints = new int[](100);
    foreach(i; 0..100) {
        ints[i] = i;
    }
}

Now we’ve gone from six allocations down to one. The only opportunity the GC has to run is before the inner loop. This actually allocates space for at least 100 values and initializes them all to 0 before entering the loop. The array will have a length of 100 after new, but will almost certainly have additional capacity.

There’s an alternative to new for arrays: the reserve function:

void main() {
    int[] ints;
    ints.reserve(100);
    foreach(i; 0..100) {
        ints ~= i;
    }
}

This allocates memory for at least 100 values, but the array is still empty (its length property will return 0) when it returns, so nothing is default initialized. Given that the loop only appends 100 values, it’s guaranteed not to allocate.

In addition to new and reserve, it’s possible to call GC.malloc directly for explicit allocation.

import core.memory;
void* intsPtr = GC.malloc(int.sizeof * 100);
auto ints = (cast(int*)intsPtr)[0 .. 100];

Array literals will usually allocate.

auto ints = [0, 1, 2];

This is also true when an array literal enum is used.

enum intsLiteral = [0, 1, 2];
auto ints1 = intsLiteral;
auto ints2 = intsLiteral;

An enum exists only at compile time and has no memory address. Its name is a synonym for its value. Everywhere you use one, it’s like copying and pasting its value in place of its name. Both ints1 and ints2 trigger allocations exactly as if they were declared like so:

auto ints1 = [0, 1, 2];
auto ints2 = [0, 1, 2];

Array literals do not allocate when the target is a static array. Also, string literals (strings in D are arrays under the hood) are an exception to the rule.

int[3] noAlloc1 = [0, 1, 2];
auto noAlloc2 = "No Allocation!";

The concatenate operator will always allocate:

auto a1 = [0, 1, 2];
auto a2 = [3, 4, 5];
auto a3 = a1 ~ a2;

D’s associative arrays have their own allocation strategy, but you can expect them to allocate when items are added and potentially when removed. They also expose two properties, keys and values, which both allocate arrays and fill them with copies of the respective items. When its desirable to modify the underlying associative array during iteration, or when the items need to be sorted or otherwise manipulated independently of the associative array, these properties are just what the doctor ordered. Otherwise, they’re needless allocations that put undue pressure on the GC.

When the GC does run, the total amount of memory it needs to scan is going to determine how long it takes. The smaller, the better. Avoiding unnecessary allocations isn’t going to hurt anyone and is another good mitigation strategy. D’s associative arrays provide three properties that help do just that: byKey, byValue, and byKeyValue. These each return forward ranges that can be iterated lazily. They do not allocate because they actually refer to the items in the associative array, so it should not be modified while iterating them. For more on ranges, see the chapters titled Ranges and More Ranges in Ali Çehreli’s Programming in D.

Closures, which are delegates or function literals that need to carry around a pointer to the local stack frame, may also allocate. The last allocating language feature listed on the Garbage Collection page is the assertion. An assertion will allocate when it fails because it needs to throw an AssertError, which is part of D’s class-based exception hierarchy (we’ll look at how classes interact with the GC in a future post).

Then there’s Phobos, D’s standard library. Once upon a time, much of Phobos was implemented with little concern for GC allocations, making it difficult to use in situations where they were undesirable. However, a massive effort was initiated
to make it more conservative in its GC usage. Some functions were made to work with lazy ranges, others were rewritten to take buffers supplied by the caller, and some were reimplemented to avoid unnecessary allocations internally. The result is a standard library more amenable to GC-free code (though there are still probably some corners of the library that have not yet been renovated — PRs welcome).

Now that we’ve seen the basics of using the GC, the next post in this series will look at the tools the language and the compiler provide for turning the GC off and making sure specific sections of code are GC-free.

Thanks to Guillaume Piolat and Steven Schveighoffer for their help with this article.

Project Highlight: vibe.d

Posted on

Since the day Sönke Ludwig first announced vibe.d on the D Forums, the project has been a big hit in the D community. It’s the exclusive subject of one book, has a chapter of its own in another, and has been proven in production both commercially and otherwise. As so many projects do, it all started out of frustration.

I was dissatisfied with existing network web libraries (in particular with Node.js, which was the new big thing back then, because it was also built on an asynchronous I/O model). D 2.0 gained cross platform fiber support through the integration of DRuntime, which seemed like a perfect opportunity to avoid the shortcomings of Node.js’s programming model (“callback hell”). Together with D’s strong type checking and the high performance of natively compiled applications this had the ideal preconditions for creating a network framework.

From the initial release, work progressed on adding web and REST interface generators (vibe.web.web and vibe.web.rest, respectively).

This was made possible by D’s advanced meta programming facilities, string mixins and compile-time reflection in particular. The eventual addition of user-defined attributes to the language enabled some important advances later on, such as the recently added authorization framework.

Vibe.d is at its core an I/O and concurrency framework that makes heavy use of fibers which run in a quasi-parallel framework.

Every time an operation (e.g. reading from a socket) needs to wait, the fiber yields execution, so another fiber can run instead. Each fiber uses up very little memory compared to a full thread and switching between fibers is very cheap. This enables highly scalable applications that behave like normal multithreaded applications (save for the low-level issues associated with real multithreading).

At a higher level, it can serve as a web framework for backend development and provides functionality for protocols like HTTP and SMTP, database connectivity, and the parsing of data formats. A number of third-party packages that extend or complement vibe.d can be found in the DUB repository (Sönke is also the creator and maintainer of DUB, the D build tool and package manager).

Big changes are currently afoot with the project. Beginning with the release of vibe.d 0.7.27 in February 2016, work began on splitting the monolithic project into independent DUB packages. One goal is to make it possible to use one vibe.d component without pulling them all in, reducing build times in the process.

Another goal is to employ modern D idioms where possible and to improve memory usage and performance as far as possible. It is surprising how much D evolved in just the short amount of time that vibe.d has been alive!

Diet-NG, vibe.d’s template engine based on Jade, was the first to be granted independence. It went through a complete rewrite that adds a strong test suite, makes use of D’s ranges where possible, provides a more flexible API, and eliminates dependencies on other vibe.d packages. Now he’s working on the core package.

The vibe-core package encapsulates the whole event and fiber logic, including I/O, tasks, concurrency primitives and general operating system abstraction. The original design was based heavily on classes and interfaces and had a very high level operating system abstraction layer, resulting in several downsides. For example, there was a dependence on the GC and virtual function calls could be an issue on certain platforms. One of the main goals was to minimize performance overhead in the new implementation.

As part of his experimentation with different API idioms and slimming down the code base, he produced the eventcore library.

The API follows a proactor pattern, meaning that a callback gets invoked whenever a certain asynchronous operation finishes. This is in contrast to the reactor pattern that is exposed by the non-blocking Posix APIs. It was chosen mainly so that asynchronous I/O APIs, such as Windows overlapped I/O and POSIX AIO, could be supported.

On top of eventcore, the fiber logic was implemented in a completely general way, using a central fiber scheduler and a generic asyncAwait function. This means that lots of corner cases are handled in a much more robust way now and that improvements in all areas can be made much faster and with fewer chances of breaking anything.

Next on the list for independence is the HTTP package. Sönke plans to completely rebuild the package from the ground up, adding HTTP/2 support and making it possible to enable allocation-free request/response processing.

Other obvious candidates are MongoDB and Redis clients, JSON and BSON support, the serialization framework, and the Markdown parser. The goal for each of these packages is always to take a look at the code and employ modern D idioms during the process.

If you’ve never taken D for a spin, vibe.d is a fun playground in which to do so. It’s not difficult to get up and running, and easier still if you have experience with such frameworks in other languages. It’s also ready to put into production in its current state, despite the leading zero in the version number. As Sönke makes progress on breaking it up into separate packages, it will almost certainly become an even more integral part of the growing D community.

Project Highlight: DPaste

Posted on

DPaste is an online compiler and collaboration tool for the D Programming Language. Type in some D code, click run, and see the results. Code can also be saved so that it can be shared with others. Since it was first announced in the forums back in 2012, it has become a frequently used tool in facilitating online discussions in the D community. But Damian Ziemba, the creator and maintainer of DPaste, didn’t set out with that goal in mind.

Actually it was quite spontaneous and random. I was hanging out on the #D IRC channel at freenode. I was quite amazed at how active this channel was. People were dropping by asking questions, lots of code snippets were floating around. One of the members created an IRC bot that was able to compile code snippets, but it was for his own language that he created with D. Someone else followed and created the same kind of bot, but with the ability to compile code in D, though it didn’t last long as it was run on his own PC. So I wrote my own, purely in D, that was compiling D snippets. It took me maybe 4-5 hours to write both an IRC support lib and the logic itself. Then some server hardening where the bot was running and voila, we had nazbot @ #D, which was able to evaluate statements like ^stmt import std.stdio; writeln("hello world"); and would respond with, "hello world".

Nazbot became popular and people started floating new ideas. That ultimately led Damian to take a CMS he had already written in PHP and repurpose it to use as a frontend for what then became DPaste.

The frontend is written in PHP and uses MySQL for storage. It acts as a web interface (using a Bootstrap HTML template and jQuery) and API provider for 3rd Parties. The backend is responsible for actual compilation and execution. It’s possible to use multiple backends. The frontend is a kind of load-balancer when it comes to choosing a backend. The frontend and the backend may live on different machines.

DPaste is primarily used through the web interface, but it’s also used by dlang.org.

Once dpaste.dzfl.pl was well received, the idea popped up that maybe we could provide runnable examples on the main site. So it was implemented. The next idea, proposed by Andrei Alexandrescu, was to enable runnable examples on all of the Phobos documentation. I got swallowed by real life and couldn’t contribute at the time, but eventually Sebastian Wilzbach took it up and finished the implementation. So today we have interactive examples in the Phobos documentation.

When Damian first started work on DPaste in 2011, the D ecosystem looked a bit different than it does today.

There weren’t as many 3rd party libraries as we have now; there was no DUB, there was no vibe.d, etc. I wish I’d had vibe.d back then. I would have implemented the frontend in D instead of PHP.

What I enjoy the most about D is just how “nice” to the eye the language is (compared to C and C++, which I work with on a daily basis) and how easy it is to express what’s in your mind. I’ve never had to stop and think, “how the hell can I implement this”, which is quite common with C++ in my case. In the current state, what is also amazing is how D is becoming a “batteries-included” package. Whatever you need, you just dub fetch it.

He’s implemented DPaste such that it requires very little in terms of maintenance costs. It automatically updates itself to the latest compiler release and also knows how to restart itself if the backend hangs for some reason. He says the only real issue he’s had to deal with over the past five years is spam, which has forced him to reimplement the captcha mechanism several times.

As for the future? He has a few things in mind.

I plan to rewrite the backend from scratch, open source it and use a docker image so anybody can easily pick up development or host his own backend (which is almost done). Functionally, I want to maintain different compiler versions like DMD 2.061.0, DMD 2.062.1, DMD 2.063.0, LDC 0.xx, GDC x.xx.xx, etc., and connect more architectures as backends (currently x86, arm and aarch64 are planned).

I also want to rewrite the frontend in D using vibe.d, websockets, and angular.js. In general, I would like to make the created applications more interactive. So, for example, you could use the output from your code snippet in realtime as it is produced. I would like also to split a middle end off from the frontend. The middle end would provide communication with backends and offer both a REST API and websockets. Then the frontend would be responsible purely for user interaction and nothing else.

He would also like to see DPaste become more official, perhaps through making it a part of dlang.org. And for a point further down the road, Damian has an even grander plan.

I hope to make a full blown online IDE for dlang.org with workspaces, compilers to chose, and so on.

That would be cool to see!

The D Blog in 2016: Seven Months of Page Views

Posted on

The D Blog was born at DConf 2016 and the first post was published on June 3rd. There were 27 more posts between then and the end of the year, most of which were shared on the usual social media sites. In case some of you in DLand are curious about such things, a year-end stats post is a fun way to kick off the new year.

First, we welcomed 39,471 visitors who viewed a total of 53,013 pages. The top five referrers in terms of page views:

  1. 16,604 — Reddit
  2.  3,698 — The D Forums
  3.  3,123 — Hacker News
  4.  2,847 — Twitter
  5.  1,759 — Facebook

The top five countries in terms of page views:

  1. 17,244 — United States
  2.  4,427 — Germany
  3.  3,349 — United Kingdom
  4.  2,251 — Canada
  5.  1,598 — France

Several posts included links to D projects at GitHub. Counting both projects and profiles, the top five most-clicked were:

  1. dlangui
  2. atrium
  3. Timur Gafurov
  4. voxelman
  5. dlib

The single most-clicked page was the DLangUI screenshot page.

Finally, the top six posts in terms of page views:

  1. 5,865 — Find Was Too Damn Slow, So We Fixed It
  2. 5,602 — Ruminations on D: An Interview with Walter Bright
  3. 4,267 — Project Highlight: DLangUI
  4. 2,704 — Programming in D: A Happy Accident
  5. 2,579 — Project Highlight: Timur Gafarov
  6. 2,257 — Project Highlight: Voxelman

The list of posts was intended to be a top-five, but it was interesting that Voxelman was posted only on December 30th and managed to become the sixth most-viewed post on the site.

2016 was the time for the blog to find its sea legs. The coming year will see more Project Highlights and more guest posters (including Andrei and Walter). We’re also looking to expand the scope somewhat, so keep your eyes open for new types of content.

If you would like to write for the D blog, please go and contact the fellow who owns this GitHub profile, where he’s showing his email address for the world to see. He would be happy to discuss posts about your D projects, idioms you like to use, tutorials you’d like to share, or anything related to the D Programming Language.

Thanks for tuning in, and Happy 2017!

Project Highlight: Voxelman

Posted on

If you spend any time over at r/VoxelGameDev, you may have seen posts about Voxelman, the plugin-driven game engine MrSmith33 is developing with D. His real name is Andrey Penechko, and he started work on Voxelman after he was inspired by Minecraft to think about all the cool things he could do with a voxel engine, particularly the low-level optimization tricks he could use in implementing one. Then he jumped in and started figuring things out.

I started the project somewhere in 2011 or 2012. It began with creating an SDL window and getting some triangles on the screen. Then I did cubes, then a single chunk. It was a simple, single-threaded thing. I did it all with a fixed camera and only had rudimentary camera controls.

For that initial version of the project, he was using C++, but he found himself stuck from a lack of knowledge about the language. So he started searching to see what else was out there. That led him to D.

I don’t really remember how I found D. I was in need of some statically typed compiled language other than C++. I was frustrated about all the source file organisation, the need of forward declarations, header separation and the include system. In D, it was as simple as writing code. I bought a cheap 10 inch tablet just to read Andrei’s book, because my 3.2″ PPC was too small to read the whole thing. I enjoyed reading every single bit of it.

His ultimate goal with the project is to provide a platform for which people can create and share plugins and game worlds.

Ideally a complete project build should have the engine source and tools (launcher, source editor, compiler). Players should be able to initiate a connection to any server in the server list, then the launcher will download any missing plugins, compile a new executable and start the engine with the list of plugins. Currently, a build of Voxelman is less than 3MB in size. I think that this is a good property to have.

The major sticking point he sees with this approach is the dependency DMD has on the Microsoft tools for 64-bit (and 32-bit COFF) support on Windows (specifically the Windows SDK and the Microsoft linker). Even though the MS linker is considered the system linker, it’s not uncommon to see Cygwin and or one of the various distributions of MinGW installed instead of the MS tools. In a perfect world, he could tell people to download the D compiler and they would have everything they need. But it’s not a deal-breaker, so he’s not letting it stop him.

Voxelman uses a client-server architecture, where the server can be launched in a dedicated process or as part of the client’s. This is managed by a launcher which, in addition to launching the game, can be used to compile projects, manage the world, and find servers to connect with.

World and mesh generation is multi-threaded and, as in most such engines, the model is chunk-based. The chunk management implementation is informed by the concept of entity component systems, with a chunk’s world position serving as its entity ID and layers functioning as components.

Each dimension is broken into chunks. A chunk is a 32³ array of blocks. Each chunk can have a set of data layers (currently blocks and block entities). Each layer is essentially an immutable snapshot. It can be of different storage types (uniform, where all blocks are the same,  or a compressed or full array, where the layer stores an array of data). Those layers then can be freely transmitted between threads, with reference counting done in the main thread. When a layer is no longer needed it’s deleted.

Immutable chunk data makes for fast auto saves of chunk snapshots in a separate IO thread.

When a chunk is received on the client side, it can be sent to a worker thread and the geometry will be generated. Snapshots are sent to the IO thread when save points occur, and they can still be used in the main thread, sent to the client, or processed by other worker threads. One can easily use an old snapshot while several new ones are in use. Whenever a layer is being modified, data is copied into a write buffer, changes are made, and at a commit point at the end of the frame, all write buffers are committed to chunk storage.

Andrey calls his plugin system “semi-hackish”.

All plugins inherit from an IPlugin interface. Then, each plugin registers itself in a global table of plugins from a shared static constructor. The global table has lists for server and client plugins. The engine adds those plugins to the plugin manager based on a provided plugin pack. The plugin manager implements the initialization sequence. When starting initialization, you have lots of dependencies, so you need to run things in a specific order.

He has found a lot of things to like about D. As major pros, he cites the module system (“no forward declarations”), foreach loops (“99% of loops in my code are these guys”), associative arrays, delegates, and templates (“They’re beautiful; you simply add another set of parentheses and you’re done”). He also loves D’s dynamic arrays (slices).

They are a perfect design, with the pointer and the length bundled together. You can append to them, concatenate them, and change their length.

As minor pros, he lists D’s Compile-Time Function Execution and its code generation and compile-time introspection features. Unlike some D users, he also counts the garbage collector in that group. He has implemented a mix of GC-ed and non-GCed memory in Voxelman.

High-level stuff is fully in GC memory. I call something high-level if it has only one instance, so I use interfaces/classes for the high-level parts. Low-level things are mostly stack allocated, using structs (which are POD in D), and the most performance sensitive and memory consuming parts use manual memory management (via Mallocator). This includes chunk storage and chunk meshes.

He also has a list of rough corners. He doesn’t like that support for DLLs is not yet fully functional and reliable. He has found problems when trying to use shared (for example, the Mutex class cannot be used with it). He also finds all the use cases of the is expression confusing, saying the syntax “feels like regular expressions for templates; very powerful and concise, but hard to understand.”

His difficulties with shared actually took him down an interesting path that ultimately had a positive impact on performance.

I started my multi-threading by using the send and receive functions from std.concurrency. I found that I needed to send messages of variable length. For example, when loading or saving chunks, you need to send all the layers to another thread. This involved allocating arrays for all the layers and also required the use of shared.

This situation led me to the implementation of a lock-free message queue, where each message is just a stream of bytes. You write variables on one end and read them from the other. This is obviously a single producer, single consumer queue.

A disadvantage was the use of a fixed-size circular array. You need to make sure that the queue doesn’t fill up. This was a point where I found a good book that explains how atomics work: C++ Concurency in Action: Practical Multithreading. This is one of the places in D’s documentation where you feel a lack of pointers on where to find relevant information on a specific topic.

So the new solution doesn’t require any allocations and is actually faster than the built-in one. Later I added a notification system via Semaphore, so that worker threads wait when out of work.

If you’re looking for an open source D game to contribute to, Voxelman is waiting for you. You can read more about some of its internals on reddit, check out some images on imgur, and watch some videos on YouTube. I’ll leave you with this example of it in action:

The D Language Foundation’s Scholarship Program

Posted on

d6The D Language Foundation recently announced a new scholarship program aimed at EE and CS majors attending University “Politehnica” Bucharest (UPB). I contacted Andrei Alexandrescu for a few details on how the initiative came together, hoping for just enough tidbits of backstory to craft a blog post around. He obliged in a big way, turning my one question and “a few details” into an informative conversation.

Mike: I assume quite a lot of work went into this. Could you share a few details about how it came about?

Andrei: Gladly! The story starts back in 2012, when I gave a talk at the How to Web conference in Bucharest, my native city. It was a great event and I got to meet many great people. Except for one whose name kept coming up all over the Romanian IT space, Andrei Pitis.

I heard he was an instructor in the CS department at UPB (the best IT school in Romania, also noted internationally). He’s been directly involved in a number of IT-related foundations and professional organizations, and he created and led the immensely successful Vector Smart Watch startup. So, having heard he’d be around, I went to the conference speakers’ dinner hoping to bump into him.

Not knowing what he looked like, I was just craning my neck in search of someone who seemed popular. Meanwhile, I was passing time by making chit chat with a nice fellow who introduced himself to me. Now, you know how these group parties go. There’s always loud music and conversation, so I didn’t even hear his name and assumed he hadn’t heard mine.

As the evening progressed, I figured Andrei Pitis wasn’t going to show, so I had more time to chat with that fine gentleman. And I noticed two things. First, he was incredibly insightful. Second, he seemed equally excited about meeting me as I was about meeting Andrei Pitis. After a long while, the coin dropped: they were one and the same.

Thus started a great friendship. Andrei gave me great tips about how to start and conduct The D Language Foundation. Recently, he introduced me to two UPB CS systems professors, Razvan Deaconescu and Razvan Rughinis (together, the three had created the Tech Lounge nonprofit organization dedicated to helping graduating CS students start their careers).

Razvan Rughinis came up with the scholarship idea while we were chatting over beers in the quaint old town of Bucharest. In great part the idea was motivated by the strong interest UPB systems graduate students had in participating in a high-impact open source project such as the D language as part of their MSc thesis. In systems research (unlike e.g. CS theory), actual system building is a key part of the research project; therefore, a visible OSS project makes for a much stronger dissertation than the usual throwaway experimental code.

Clearly a strong opportunity had presented itself, and the DLang UPB scholarship is its realization.

Mike: How does the selection process work?

Andrei: The two professors introduce a few candidates, which I pass through the rigors of the typical Facebook interview. We also ask for the usual suspects – proof of enrollment, transcripts, motivation letter, and references.

Of all components, the most important are (in order) the interview, the quality of the BSc projects, and the recommendation letters from their professors. The four current scholarship recipients passed the interview with flying colors and have very strong BSc projects and references. Some of them returned from summer internships at prestigious companies such as Bloomberg, others won CS awards. I have no doubt any company in the Bay Area or elsewhere would be happy to work with them. Once they finish their MSc, of course :o).

And I should mention here that the two professors aren’t only involved in the selection process. They will make themselves available to help manage the students on an ongoing basis. We’re very fortunate to have them.

Mike: Can you provide any info on the current recipients and their projects?

Andrei: The current recipients are Alexandru Razvan Caciulescu, Lucia Cojocaru, Eduard Staniloiu, and Razvan Nitu. I have posted an introduction to each on the D forums and, now that you mention it, I told them to create a wiki page with a blurb for each. They are hosted in a nice shared office kindly donated by Tech-Lounge.ro and… we’re in the process of getting a coffee machine up there :o).

They are all obviously interested in taking large systems projects that benefit their research interests and have an impact on the D language. To get them started, I took a page from Facebook’s practice and defined a “bootcamp” program. Bootcamp is a month-long process (six weeks at Facebook) during which the so-called n00bs get familiar with the technologies used in the organization: the language proper; the core runtime and standard library; the build process; the way code changes are created, reviewed, accepted, and committed; and, last but not least, the community ethos and the kind of problems we are facing that are fit for ingenious solutions.

To kickstart the bootcamp program, I defined a “bootcamp” label in our Bugzilla and applied it to a bunch of existing bugs, with an eye for the kind of bug that simultaneously has low surface (you don’t need to know a lot of internal details to get into it) and offers a good learning experience. Right now each student is busy fixing a couple of such bugs.

Long-term we are looking at high-impact libraries and tools. I do have a few ideas, but I have no doubt the students will come up with their own. Just give them time.

Mike: Speaking of time… is there any room here for an update on the D Foundation’s finances?

Andrei: Of course. To be honest, right now we’re in better shape than ever before (and than I would have hoped). Thanks to Sociomantic, who footed a large part of DConf 2016’s bills, we have quite a bit of change left from conference registration fees. I have also personally carried a number of high-profile appearances at public tech events and private corporate training events, with proceeds flowing to the Foundation.

So we have accumulated a little war chest – not much, but definitely not negligible. With our current funds and operational costs, we are covered for over two years. Of course, the situation is fluid and I am working on expanding both income and (useful) expenditures.

We’re running a very tight operation, and I want to keep it that way. By the Foundation bylaws, its officers (Walter Bright, Ali Çehreli, and myself) cannot get income from the Foundation, which preempts a variety of conflicts of interest. We are a public charity, which reduces and simplifies our taxation. We use modern, low-overhead money transfer methods such as transferwise.com and constantly scan for better ones. Anyone who considers donating should know that about every five dollars donated goes straight to pay for one hour of an exceptional graduate student’s time.

Mike: Are there more applications in the queue? Do you plan to extend scholarships to other universities?

Andrei: UPB seems to be off to a great start, but it’s also a happy case for many reasons: it’s my undergrad alma mater, we know professors there, and we don’t need to pay tuition. If we wanted to extend a scholarship to another university we’d need to avail ourselves of similar strategic advantages. Needless to say, if anyone who reads this has ideas on the matter, please contact me.

Anyhow, for the time being, we got one more strong DLang UPB scholarship application literally today.

Mike: To close out, is there anything you’d like to say to people who’d like to help out?

Andrei: I’m very excited about this scholarship program and possible extensions to it. The reason for my excitement is that this is but a part of a larger strategy. Allow me to explain.

Up until now, we had no idea what to do with money even if we had it. A while ago, I met this potential donor who said, “OK, say I gave the Foundation half a million dollars over two years, no strings attached. What would you do with it?” To my own surprise, I had only vague answers. I asked Walter the same question, and he had even less of a clue than me.

So then I figured it’s essential for the Foundation to have a strong response to that. I’m a big believer in the adage “luck helps the prepared”, of which the converse is “luck is wasted on the unprepared”. By that paradigm, not knowing what we’d do with money was a definite way to ensure we’d never be big. Now that we have the scholarship program, there exists a powerful reason for people to donate to the Foundation: donations help us find and support good students to work on high-impact D-related projects that push the state of CS systems research forward.

Another thing that would be great to have “donations” of is contributor time. Receiving more students starts pushing against our management capacity. Currently, and somewhat to my surprise, I am effectively a manager, seeing that all of these things I just gave you an earful of (bringing money in to the Foundation, managing bootcamp, finances, operations) take enough time to be a full-time job that leaves little time for coding. At some point, I won’t be able to help everyone with their research, so I’ll need to delegate some of that work to other folks. I’m talking any capacity here – from code reviews to managing to co-authoring papers to co-advising.

There are more things I have in mind, but it’s early to share those. In brief, we need to organize ourselves for further growth. What’s clear to me is we’re no longer a seat-of-the-pants operation in a (virtual) basement. The D Language is exiting its adolescence.

Project Highlight: The New CTFE Engine

Posted on

CTFE (Compile-Time Function Execution) is today a core feature of the D Programming Language. D creator Walter Bright first implemented it in DMD as an extension of the constant folding logic that was already there. Don Clugston (of FastDelegate fame) made a pass at improving it and, according to Walter, “took it much further“. Since that time, usage of CTFE has shown up in one D project after another, including in D’s standard library. For example, Dmitry Olshansky employed it in his overhaul of std.regex to great effect.

On the last day of DConf 2016, Stefan Koch gave a lightning talk on his thoughts about CTFE in D. At the end of the talk, in response to a question from Andrei Alexandrescu on how D’s implementation could be improved, he said the following:

CTFE is really a hack. You can see that it’s a hack. It’s implemented as a hack. It is the most useful hack that I’ve ever seen, and it is definitely a hacker’s tool to do stuff that are like magic. But to be fast, it would need to be heavily redesigned, reimplemented, possibly executed in multiple threads, because it is used for stuff that we could never have envisioned when it was invented.

Not long after that, Stefan opened a discussion on the fourms and took up the torch to improve the CTFE engine. As to why he got started on this journey in the first place, Stefan says, “I started work on the CTFE engine because I said so at DConf.” But, of course, there’s more to it than that.

I have pretty heavy-weight CTFE needs (I worked on a compile-time trans-compiler). Also my CTFE SQLite reader is failing if you want to read a database bigger then 2MB at ctfe.

His investigations into the performance of the CTFE interpreter shed light on its problems.

The current interpreter interprets every AST-Node it sees directly. This leaves very little space to collect information about the code that is being interpreted. It doesn’t know when something will be used as a reference, so it needs to copy every variable on every mutation. It has to do a deep-copy for this. That means it copies the whole chain of mutations every time.

To clarify, he offers the following example.

Imagine foreach(i;0 .. 10) { a = i; }. On the first iteration we save a` = 0 and set a`` to 1. On the second iteration we save a``` = 1 and a````= 0 and we set a````` to 2 , then a`````` = 1 and a``````` = 0 and so on. As you can see, the memory requirements just shoot up. It’s basically a factorial function with a very small coefficient. That is why for very small workloads this extreme overhead is not noticeable.

That flaw looked unfixable. Indeed the whole architecture in dinterpret.d is very convoluted and hard to understand. I did a few experiments on improving memory-management of the interpreter but it proved fruitless.

Once he realized there was going to be no quick fix, Stefan sat down and drew up a plan to avoid digging himself into the same hole the current interpreter was in. The result of his planning led him down a road he hadn’t expected to travel.

Direct Interpretation was out of the question since it would give the new engine too little time to analyze data-flow and decided whether a copy was really needed or not. I had to implement an Intermediate Representation. It had to be portable to different evaluation back-ends. I ended up with a solution, inspired by OpenGL, of defining my interface in the form of function calls an evaluation back end had to implement. That meant I would not be able to simply modify the current interpreter. This made the start very steep, but it is a decision I do not regret.

His implementation consists of a front end and a back end.

The front end walks the AST and issues calls to the back end. And the back end transforms those calls into actual bytecode. This bytecode is interperted by the back end as soon as the front end requires it.

In terms of functionality, he likens the current implementation to an immediate mode graphics API, and his revamp to retained mode. In this case, though, it’s the immediate mode that’s the memory hog.

You can read about his progress in the CTFE Status thread, where he has been posting frequent updates. His updates include problems he encounters, features he implements, and performance statistics. Eventually, every compiler that uses the DMD front end will benefit from his improvements.

Project Highlight: libasync

Posted on

d6libasync is a cross-platform event loop library written completely in D.  It was created, and continues to be maintained, by Etienne Cimon, who started it as a native driver for vibe.d, a modular asynchronous I/O framework most often used for web app development in D.

In 2014 or so, I was looking for a framework to power my future web development projects. I wasn’t going to use an interpreted language, as binary executables were too attractive. I found vibe.d appealing because, coming from C++, it was relatively simple and featureful. So I studied it, along with the D programming language and the Phobos standard library.

vibe.d has always used libevent under the hood by default. This is where Etienne ran into a problem that bothered him.

I stumbled on some workflow issues when deploying vibe.d apps to other operating systems which may or may not have the right version of libevent in the package repository. I didn’t want to package a DLL with my server, or have to go through dependency hell with my software, and I wanted everything to be consistently written in D to reduce the mental complexity of switching programming languages or to debug other issues.

So he decided to study up on the system APIs across the platforms supported by DMD (Windows, Linux, *BSD and OS X) and create his own event loop library in D. Now he, and anyone using libasync, can issue a single command with DUB to compile and execute a web application without needing to worry about external event loop dependencies.

libasync takes advantage of D’s delegates to provide a very intuitive interface.

void testDNS() {
	auto dns = new shared AsyncDNS(g_evl);
	dns.handler((NetworkAddress addr) {
		writeln("Resolved to: ", addr.toString(), ", it took: ", g_swDns.peek().usecs, " usecs");
	}).resolveHost("127.0.0.1");
}

Etienne says of the code snippet above:

The D garbage collector will keep the AsyncDNS object in dns alive for as long as the delegate used in the parameter of dns.handler is alive in the heap, which is in this object. The delegate syntax is more simple to declare than Javascript, and it is also type-safe. This DNS resolver will work on any platform thrown at it, thanks to D’s compile-time version conditions.

libasync makes use of the asynchronous I/O facilities available on each supported platform and provides a number of event-handlers out of the box.

Cross-platform event handlers have been defined for DNS resolution, UDP Messages, (Buffered/Unbuffered) TCP Connections, TCP Listeners, File Operations, Thread-local (Notifiers) and Cross-thread Signals, Timers and File Watchers. The intrinsics involve EPoll for Linux, KQueue for OS X and BSD, and overlapped I/O for Windows. With all of these features thoroughly tested through a vibe.d driver, libasync has become a very fast and reliable library which I use in all of my projects. My benchmarks show it as being a little slower than the libevent driver in vibe.d, though its self-explanatory code base makes it seamless to understand, maintain, and deploy.

A libasync driver has been added to vibe.d and work is going on to improve the library’s performance.

The stability of the underlying OS features makes for very little need for changes, although there is a big improvement involving the proactor pattern in the works for libasync and a new architecture for vibe.d. Together, those two developments are likely to increase the library’s performance significantly.

If you find yourself needing an event loop in D and want to give libasync a spin, you can visit the library’s page at the DUB repository for information on how to add it as a dependency to your own DUB-managed projects. libasync, in turn, has only one dependency itself, another library maintained by Etienne that provides a set of allocators and allocator-friendly containers called memutils.

It wasn’t so long ago that anyone using D who wanted something like libasync or memutils would need to either roll their own or bind to a C library. The ever-expanding list of libraries in the DUB repository, created and made available by members of the D community like Etienne, make it much easier to jump into D today than ever before.

Project Highlight: DlangUI

Posted on

Vadim Lopatin is an active D user who, like many in the D community, comes from a Java and C++ background.

My current job is writing a Java backend for a virtual call center . I’ve also worked as a C++ developer on IP PBX devices. Programming is my hobby as well. My biggest hobby project, which I’ve rewritten from scratch twice in the last 15 years, is CoolReader, a cross platform e-book reader written in C++.

He kept hearing news about D and, over time, became more interested in its “cool features”, like CTFE and code generation. So, three years ago, he decided to initiate a couple of projects to learn its features.

DDBC is a database connector similar to Java’s JDBC, with an API close to the original. HibernateD is an ORM library, similar to the Java-based Hibernate. Unlike Java, D allows the use of compile-time code introspection and code generation. It was interesting work, and I was impressed by the power of D.

Both projects proved to be no more than learning exercises, however, as he never used either himself and neither became popular in the community. Now they are largely abandoned, but he has since found another area where he could apply his talents and, as it turns out, where community interest has been much higher.

The new idea came about as he surveyed the state of available GUI libraries in D. While there were several options to choose from, he wasn’t satisfied by the fact that they were all either non-native wrappers or not cross-platform. He had already written a cross-platform GUI in C++ for CoolReader GL, a version of his ebook reader that uses the same GUI on all supported platforms. Why not implement another one in D?

He has a long list of items he thinks are important for a GUI library to check off. A few of them are:

  • Cross-platform — the same code should work on all platforms with simple recompilation.
  • Internationalization — it should be easy to write multilingual apps. Unicode everywhere. Strings externalized to resources.
  • Hardware acceleration — take advantage of DirectX or OpenGL where available, but it should be possible to use software rendering where they aren’t.
  • Resolution independence — flexible layouts must be used instead of fixed pixel-by-pixel positioning of controls.

A markup language for describing layouts, touch screen support, 3D rendering, customizable look-and-feel, easy event handling, and several other items complete the list. A big set of requirements for one person to work on alone, but he already had a good deal of experience with the GUI he wrote for CoolReader. So when he got going with his DlangUI project, his previous work is where he started.

Part of DlangUI is a direct port of the CoolReader GL GUI. It was easy to reuse big parts of C++ code thanks to the similarity of D and C++ syntax.

So he set about checking items off of his list. Such as support for hardware acceleration via an interface that easily supports different rendering backends, one of which is implemented using OpenGL. But as things got under way, he discovered that there is one particular issue with porting C++ to D that arises in the parts that can’t be directly reused.

The D GC does not bring any help for resource management, since object destructors may be called in any thread, in any order, or never at all. If an object owns some resources, it ought to be destroyed in a predictable way. Therefore, widgets and other objects holding resources must be destroyed manually by their owners

DlangUI uses reference counting for easy freeing of owned objects. Widgets remove their children on destroy. Windows remove their widgets when closing. I had to add debug mode instance counts for various objects, and corresponding messages in the log, to make sure all resources are freed gracefully.

Some resources (e.g. images) are cached. Their references may be taken from the cache, used, and then released often. To allow cleanup of caches, all such resources have usage flags. The cache provides a checkpoint method which removes the usage flag from all items, and a cleanup method which frees all cache items which have not been used since the last checkpoint.

He has worked on a number of items from his list, such as theme customization.

DlangUI themes are inspired by the Android API. It borrows Android’s state drawables (they may be even used as is), nine-patch PNGs, and resource versions for different screen sizes or resolutions. Usually widgets don’t use a hardcoded look and feel or layout properties. Instead, they use a style ID referencing to currently selected theme. If the theme is changed in runtime, all widgets receive a corresponding notification so that they can reload any cached values from the new theme. Simply providing a new theme changes the  look and feel significally.

Currently, two standard themes are provided in DlangUI: default (light) and dark. Applications may specify a standard theme as a parent, and override only the styles it needs. Standard theme resources are usually embedded into the application executable using the cool D feature import("filename"). Applications may embed their own resources as well. This allows creating a single file app withoug any additional resource files needing to be shipped with the executable.

Another check mark can  be place next to layouts. Here, he again looked to Android.

To support multiple screen resolutions and sizes, widgets must be placed and resized using layouts instead of direct pixel-based positioning. DlangUI uses Android API-like layouts for grouping, placing and resizing widgets, based on a two-phase measure/layout scheme.

And, while a GUI can be assembled entirely in code, he took inspiration from elsewhere for ideas to knock the markup item off his list.

Manually writing code to create a widget hierarchy and setting their properties is a bit boring. DlangUI offers the possibility to create widgets using a JSON-based description similar to Qt QML. I call it DML. Currently, only the creation of widgets and the setting of their properties are supported. In future, I hope to add the ability to describe signal handlers in DML, and automatically assign signals to handlers, and widget instances to variables. There is a GUI app, dlangui:dmledit, which helps to write DML. It combines a text editor for DML and a preview window to see the results.

When it comes to being cross-platform, a lot has been done so far, thanks to different backend implementations: Win32, SDL2, DSFML, X11, and Android. Not long ago, Vadim even announced a text-based interface which works in the Linux terminal or Windows console.

It was a real surprise for me how few changes were required to implement text-mode support. Besides the backend code and the text-mode drawing buffer implementation, most of the changes came in the form of a Console theme. Only a few fixes were required in the widgets, removing several hardcoded margins and sizes. Even DlangIDE, a DlangUI-based IDE for the D programming language, is now usable in terminals.

Here’s what DlangUI’s components normally look like on Windows.
screenshot-example1-windows

And this is what DlangIDE looks like running in the Windows console.

dlangide

When compared to screenshots of programs running with different DlangUI backends, seeing it in a terminal like that is pretty darn cool.

DlangUI manages event handling via signals and has built-in support for 3D graphics, including a 3D scene package. And work still continues on making Vadim’s list smaller, as well as addressing the problems with the library.

The most mentioned issue the non-native look and feel of widgets. Although it’s possible to make a theme looking exactly like native one, it would not track system theme changes anyway. There’s no system menu support on OS X and in Gnome (where a common menu is used for all apps). The documentation is poor. There is some DDOX-generated documentation, but it’s not detailed enough and I seldom update. I need more tutorials and examples. And some advanced controls are missing, e.g. an HTML view.

He also says that there are too few developers working on the project. While some users have submitted PRs, the majority of the work has been done by Vadim alone. Given what he has produced so far, that’s a pretty impressive achievement. But, in addition to solving the problems above, he’s got a lot more he wants to implement, such as:

  • An XML+CSS rendering widget to show/edit HTML or rich text
  • Refactoring DlangUI to extract window creation, OpenGL context creation, drawing, font support, input events code from widget set – for cases when no widgets are needed
  •  Mobile platforms support improvements – add iOS backend, improve android support, improve touch mode support
  •  Native system menu support on OS X and Gnome
  •  Support for fallback fonts in font engines, from which to get missing symbols
  •  A native OSX backend based on Cocoa instead of libSDL2
  • Improvements in Scene3D to make it suitable for writing 3D games

If you need a GUI for your D app, DlangUI is a viable option today. More importantly, if you’re able and willing to help out here and there, Vadim sure could use a few more hands on a few more keyboards!

 

 

 

The Origins of Learning D

Posted on

In ealearningdrly 2015, Adam Ruppe asked in the D forums if anyone was interested in authoring a new book for Packt Publishing, the company that published his D Cookbook. I had submitted a book proposal to Packt a few months before, one with a different concept, and had heard nothing back. So I began to mull over the idea of putting my name forward, but I was concerned about the time investment. Before I could decide, Packt contacted me with some details and an offer. With a target publication date of November 2015, I figured I had plenty of time to get it done, so I accepted. It wasn’t long before I learned how my concept of “plenty of time” was quite a bit off base.

Learning D was not the first programming book I had worked on. I coauthored Learn to Tango with D, which was published in 2008 by Apress, with three other D users. It was a book that simply aimed to introduce the language (version 1) and Tango, the community-driven alternative standard library for D1. I wrote two chapters and had nothing to do with the outline. It was a relatively easy experience that left me completely unprepared for the process of authoring an in-depth programming language book on my own. Tango was a bit controversial at the time and, though it’s no longer actively developed, a fork is still maintained and usable with modern D for those so inclined. The recently released Ocean library from Sociomantic Labs is derived from Tango.

When all the legalities were out of the way, work on the book began in earnest. Packt proposed an outline, with a handful of topics they specifically wanted me to cover. I also had to decide how to approach the book and who my target audience would be. Andrei Alexandrescu’s The D Programming Language already served as the language reference and Ali Çehreli’s Programming in D was targeting programming novices. I wanted to hit somewhere in the middle. I envisioned a young college student or recent undergraduate who, having already picked up some level of proficiency with a C-family language, was not a complete beginner, but also did not yet have the level of experience required to easily think in different languages.

Once I had an imaginary reader to talk to, I had a good idea of what I wanted to say. It was fairly easy to create two groups of features: those fundamental to become proficient enough with the language and the standard library to be productive, and those that are not essential or could be more aptly considered advanced. The former group would comprise the major focus of the book. But in addition to teaching the language, I had a general message I wanted to drive home.

Through all my years of visiting the D forums and maintaining a popular set of bindings to C libraries, I had encountered more than one new D user who was trying to program C++ or Java in D. While that approach will certainly enable some progress, it is bound to lead to compiler errors or unexpected results sooner rather than later. I explicitly pushed this message early in the book, and reiterated it each time I discussed one of the features that look like C++ or Java, but behave differently. Thinking in one language while learning another is a mistake that requires experience–it’s not the sort of thing novices do–and, as such, it’s a hard habit to break. My goal was to help minimize the pain.

I also wanted to create a sample program that demonstrated the language and library features I was discussing. My original plan was to create a new version of the program whenever a new paradigm was discussed. After gaining a false sense of confidence from completing Chapter 1 well before the deadline, Chapter 2 quickly disabused me of any thoughts that the whole book would go that way. It went well over the page budget and took longer than I had estimated, which meant the program for Chapter 2 was written over the space of a few hours during an all-night marathon. It was rightly lambasted by the technical reviewers. In the end, I scrapped the sample programs during the revision process and created a single program that evolves with new features as the book progresses. It didn’t turn out the way I had hoped, but it does show working examples of different paradigms in one program. The web app version developed in Chapter 10, which demonstrates the use of vibe.d, went more smoothly. Since it was the focus of the chapter, I was able to develop it in tandem with the text.

Speaking of the reviewers, the book would have been a mess had it not been for their valuable feedback. In the past, I had always believed authors were simply being kind when they said such a thing, but I can now attest that it is absolutely true. Packt asked me to recommend some technical reviewers, so I gave them a shortlist of people whom I knew to have strengths in areas where I was weak. John Colvin, Jonathan M. Davis, David Nadlinger, and Steven Schveighoffer came in early on, and were later joined by Kingsley Hendrickse and Ilya Yaroshenko. I was determined to be discriminating in implementing their suggestions, but they were so good that I agreed with and implemented most of them. And their criticisms were spot-on, catching coding errors, Big-O crimes, incorrect statements, and so much more. They taught me a few things I hadn’t known about D, cleared up a few misconceptions, and even spotted a couple of compiler bugs.

In the Foreword, Walter Bright says that this book, like D itself, is a labor of love. He is entirely correct. I first encountered D in 2003 and have been using it ever since. It’s just such a fun language. Though it’s a cliché to any long-time D user, I will tell anyone who cares to listen that this language is very much the sum of its parts; it’s not any one specific feature that makes D such a pleasure to use, but all of them taken together. Yes, it has its warts. Yes, it’s possible to encounter frustrating scenarios that require less-than-attractive workarounds. But, for me, my list of cons is nowhere near long enough to detract from the overall experience. I simply enjoy it. I want to see the cons list shrink and hope that more people see in the language what I see in it, so that one day ads looking for a D programmer are as common as those for other major languages. I accepted the offer to write this book as a way to contribute to that process. Most gratifyingly, I have received personal messages from a few readers letting me know that it helped them learn D. As a long time teacher, that’s a feeling I will never get tired of.