The D Language Foundation Google Summer of Code 2016 Postmortem

Craig Dillabaugh was first drawn to D by its attractive syntax and Walter Bright’s statement that D is “a programming language, not a religion”. He maintains bindings to the geospatial libraries shapelib and gdal, volunteered to manage the GSoC 2015 & 2016 efforts for D, and has taken it on again for 2017. He lives near Ottawa, Canada, and works for a network monitoring/security company called Solana Networks.


The 2016 Google Summer of Code (GSoC) proved to be a great success for the D Language Foundation. Not only were we allotted what was, for us, a record number of slots (four), with all projects completed successfully, but, perhaps most important of all, we attracted four excellent students who will hopefully be long-time contributors to the D language and its community. This report serves as a review for the community of our GSoC efforts this past summer and tries to identify some ways we can make 2017 an equal, or better, success.

Background

Back in 2011 and 2012, Digital Mars applied to participate in, and was accepted to, Google Summer of Code. In each of those years we were awarded three slots and had successful projects. Additionally, a number of long-time D contributors, including David Nadlinger, Alex Bothe, and Dmitry Olshansky, were involved as students. Sadly, in the succeeding two years we were not awarded any slots. After 2014’s unsuccessful bid, Andrei asked on the forums if anyone wanted to take the lead for the 2015 GSoC, as he had too many things on his plate. This is when I decided to volunteer for the job.

I prepared for the 2015 GSoC and worked on getting some solid items for our Ideas page. I even prepared what I thought was a beautifully typeset document in LaTeX for our final submission. Needless to say, I was very disappointed when I had to copy/paste each section into the simple web form that Google provided for submissions. Sadly, that year we were rejected once more, though I felt our list of ideas was solid.

We applied again in 2016 for the first time as The D Language Foundation. Again, the community contributed lots of solid suggestions for the Ideas page and we were accepted for the first time in four years. I think that perhaps getting accepted involves a bit of luck, as our ideas were similar to, or repeated from, those that were not accepted in 2015. However, more effort was put into polishing up the page, so perhaps that helped.

The Selection Process

Once we were accepted as a mentoring organization, the process of receiving student proposals began. We received interest from a large number of students from all over the world (about 35). In the end, a total of 23 proposals were officially submitted, ranging from very short, obviously last-minute pieces to several excellent efforts, including Sebastian Wilzbach’s 20-page document.

Our selection process was, I felt, very rigorous. We had seven of our potential admins/mentors screen the initial proposals. This involved reading all 23 proposals, which was a significant amount of work. From this initial screening we identified eight students/proposals that we thought could become successful projects. We then had all mentors individually rank each of the shortlisted proposals, another significant time commitment on their part.

Finally, interviews were arranged with all eight students. In most cases, two mentors interviewed each student, and the interviews were fairly intense, job-style interviews that involved coding exercises. A number of our mentors were involved in this process, but I think Amaury Sechet interviewed all of the students. It is no small feat to arrange and then conduct interviews with students in so many different time zones, so a huge thanks to all the mentors, but Amaury in particular. Those involved in the screening/interview process included Andrei Alexandrescu, Ilya Yaroshenko, Adam Ruppe, Adam Wilson, Dragos Carp, Russel Winder, Robert Schadek, Amaury, and myself.

Awarding of Slots

The next step for our organization was to decide how many slots we would request from Google. I really had no idea what to expect, but I was hoping we might get two slots awarded to us, as there were many good organizations vying for a limited number of slots. We felt that most of the short-listed projects could have been successful, but decided to not be too greedy and requested just four slots. As it turned out, perhaps we should have asked for more; we were awarded all four. We then selected our top four ranked students from the interview process. They were, in no particular order:

  • Sebastian Wilzbach: Science for D – a non-uniform RNG (Ilya Yaroshenko mentor)
  • Lodovico Giaretta: Phobos: std.xml (Robert Schadek mentor)
  • Wojciech Szeszol: Improvements to DStep (Russel Winder mentor)
  • Jeremy DeHaan: Precise Garbage Collector (Adam Wilson mentor)

Summer of Code

Once the projects were awarded, I must say that most of my work was done. From there on the mentors and students got down to work. I tried to keep tabs on progress and asked for regular updates from both the mentors and the students. These were, in most cases, promptly provided.

While there were some challenges, and a few projects had to be modified slightly in some instances, everyone progressed steadily throughout the summer, so there were no emergencies to deal with. All of our students passed their mid-term evaluations and by the end of the summer all four projects were completed (although Jeremy has some on-going work on his precise GC). As a result, everyone got paid and, I presume, everyone was happy.

In addition to our original mentors, thanks are due to Jacob Carlborg (DStep) and Joseph Rushton Wakeling (RNG) for providing additional expertise.

Mentor Summit

Google offered money for students to attend academic conferences and present results based on their GSoC work. Google also offered to pay travel costs for two mentors to travel to the mentor summit in California. Regrettably, none of our students had the time to take advantage of the conference money, but Robert Schadek was able to attend the Mentor Summit from Oct 28th to 30th in Sunnyvale, California. There he was able to mingle with, and learn from, mentors from the other organizations that participated.

Looking Forward

It is hard to believe, but the process starts all over again in a few short months. The success of this past year will create expectations for 2017, and I hope that we can replicate that success. A number of lessons were learned from this past year that we can carry forward into the next round. So in this section, I will try to distill some of what we learned to help guide our efforts in the coming year.

The Ideas Page and Advertising

Most of the work of identifying projects was carried out through the D Forums, with the odd email to past mentors. This was generally successful, but a number of proposals from previous years ended up being recycled. While it may be inevitable, it seemed that many of the proposal ideas were added at the last minute. Since a number of our best ideas from the 2016 page are now completed projects, we will need to replenish the Ideas page for 2017.
Recommendations

  1. We should post a PDF version of one of the successful proposals on our Ideas page to give students an example of what we expect. Although it was excellent, we likely shouldn’t use Sebastian Wilzbach’s treatise, as that may scare some people off.
  2. Try to get a decent set of solid proposals with committed mentors earlier in the process. In 2016 a number of the mentors were signed up at the last minute. The earlier the proposals are posted the more time we have to polish them and make them look more attractive.

Interview and Selection Process

The selection process went well, but was a lot of work. Having input from a number of different mentors/individuals was invaluable.
Recommendations

  1. Streamline the selection process, but reuse much of what was done last year. Having a rigorous selection process was a key contributor to 2016’s success.
  2. Start the interview portion of the selection process earlier so that we have more time to set up and carry out the interviews.

Project Progress and Mentoring

Much of the success of an individual project involves having a good relationship and work plan between the student and mentor. From this perspective, the organization isn’t heavily involved. Since all of our students worked well with their mentors, even less organizational administration was required. This is a byproduct of good screening and a solid set of ideas, and being fortunate enough to get good students.

However, there are areas where we could have run things a bit better. Students and mentors were asked to regularly provide updates on their progress, and they generally did this well, but there was no formal reporting process. Also, it would be worthwhile to have a centralized collection of project timelines/milestones where administrators and others involved in the projects (we had a few individuals working in advisory roles) can keep an eye on project progress.

Recommendations

  1. We should keep a centralized version of project timelines somewhere (e.g., a Google Docs spreadsheet) where we can check on project milestones. This should be shared with all individuals involved in a project (student/mentors/advisors/admins).
  2. Have a more formalized process for students and mentors reporting on their progress. This would involve weekly student updates and biweekly mentor updates.

Summary

The 2016 GSoC was a great success, and with any luck will be a good foundation for our successful participation in the year to come. We were fortunate that everything seemed to fall nicely into place, from our being awarded all four projects, to having all of our students complete their projects. Perhaps Sebastian, Lodovico, Wojciech or Jeremy will be involved again as students (or even mentors), and in any case continue to contribute to the D Language.

The D Blog in 2016: Seven Months of Page Views

The D Blog was born at DConf 2016 and the first post was published on June 3rd. There were 27 more posts between then and the end of the year, most of which were shared on the usual social media sites. In case some of you in DLand are curious about such things, a year-end stats post is a fun way to kick off the new year.

First, we welcomed 39,471 visitors who viewed a total of 53,013 pages. The top five referrers in terms of page views:

  1. 16,604 — Reddit
  2.  3,698 — The D Forums
  3.  3,123 — Hacker News
  4.  2,847 — Twitter
  5.  1,759 — Facebook

The top five countries in terms of page views:

  1. 17,244 — United States
  2.  4,427 — Germany
  3.  3,349 — United Kingdom
  4.  2,251 — Canada
  5.  1,598 — France

Several posts included links to D projects at GitHub. Counting both projects and profiles, the top five most-clicked were:

  1. dlangui
  2. atrium
  3. Timur Gafarov
  4. voxelman
  5. dlib

The single most-clicked page was the DLangUI screenshot page.

Finally, the top six posts in terms of page views:

  1. 5,865 — Find Was Too Damn Slow, So We Fixed It
  2. 5,602 — Ruminations on D: An Interview with Walter Bright
  3. 4,267 — Project Highlight: DLangUI
  4. 2,704 — Programming in D: A Happy Accident
  5. 2,579 — Project Highlight: Timur Gafarov
  6. 2,257 — Project Highlight: Voxelman

The list of posts was intended to be a top-five, but it was interesting that Voxelman was posted only on December 30th and managed to become the sixth most-viewed post on the site.

2016 was the time for the blog to find its sea legs. The coming year will see more Project Highlights and more guest posters (including Andrei and Walter). We’re also looking to expand the scope somewhat, so keep your eyes open for new types of content.

If you would like to write for the D blog, please go and contact the fellow who owns this GitHub profile, where he’s showing his email address for the world to see. He would be happy to discuss posts about your D projects, idioms you like to use, tutorials you’d like to share, or anything related to the D Programming Language.

Thanks for tuning in, and Happy 2017!

Happy New Year from the D Language Foundation

Ali Çehreli uses D professionally at Weka.io, is the author of Programming in D, and is frequently found in the D Learn forum with ready answers to questions on using the language. He also is an officer of the D Language Foundation.


Happy 2017!

2016 was filled with many great things happening for the D community:

All of that was achieved by you through your direct contributions or the donations that you’ve made.

We look forward to another great year filled with many cool things happening in the D world. We can’t wait to see your work on D in 2017, some of which we hope to hear about at DConf 2017. 😀

Project Highlight: Voxelman

If you spend any time over at r/VoxelGameDev, you may have seen posts about Voxelman, the plugin-driven game engine MrSmith33 is developing with D. His real name is Andrey Penechko, and he started work on Voxelman after he was inspired by Minecraft to think about all the cool things he could do with a voxel engine, particularly the low-level optimization tricks he could use in implementing one. Then he jumped in and started figuring things out.

I started the project somewhere in 2011 or 2012. It began with creating an SDL window and getting some triangles on the screen. Then I did cubes, then a single chunk. It was a simple, single-threaded thing. I did it all with a fixed camera and only had rudimentary camera controls.

For that initial version of the project, he was using C++, but he found himself stuck from a lack of knowledge about the language. So he started searching to see what else was out there. That led him to D.

I don’t really remember how I found D. I was in need of some statically typed compiled language other than C++. I was frustrated with all the source file organisation, the need for forward declarations, header separation, and the include system. In D, it was as simple as writing code. I bought a cheap 10 inch tablet just to read Andrei’s book, because my 3.2″ PPC was too small to read the whole thing. I enjoyed reading every single bit of it.

His ultimate goal with the project is to provide a platform for which people can create and share plugins and game worlds.

Ideally a complete project build should have the engine source and tools (launcher, source editor, compiler). Players should be able to initiate a connection to any server in the server list, then the launcher will download any missing plugins, compile a new executable and start the engine with the list of plugins. Currently, a build of Voxelman is less than 3MB in size. I think that this is a good property to have.

The major sticking point he sees with this approach is the dependency DMD has on the Microsoft tools for 64-bit (and 32-bit COFF) support on Windows (specifically the Windows SDK and the Microsoft linker). Even though the MS linker is considered the system linker, it’s not uncommon to see Cygwin and/or one of the various distributions of MinGW installed instead of the MS tools. In a perfect world, he could tell people to download the D compiler and they would have everything they need. But it’s not a deal-breaker, so he’s not letting it stop him.

Voxelman uses a client-server architecture, where the server can be launched in a dedicated process or as part of the client’s. This is managed by a launcher which, in addition to launching the game, can be used to compile projects, manage the world, and find servers to connect with.

World and mesh generation is multi-threaded and, as in most such engines, the model is chunk-based. The chunk management implementation is informed by the concept of entity component systems, with a chunk’s world position serving as its entity ID and layers functioning as components.

Each dimension is broken into chunks. A chunk is a 32³ array of blocks. Each chunk can have a set of data layers (currently blocks and block entities). Each layer is essentially an immutable snapshot. It can be of different storage types (uniform, where all blocks are the same, or a compressed or full array, where the layer stores an array of data). Those layers can then be freely transmitted between threads, with reference counting done in the main thread. When a layer is no longer needed it’s deleted.

Immutable chunk data makes for fast auto saves of chunk snapshots in a separate IO thread.

When a chunk is received on the client side, it can be sent to a worker thread and the geometry will be generated. Snapshots are sent to the IO thread when save points occur, and they can still be used in the main thread, sent to the client, or processed by other worker threads. One can easily use an old snapshot while several new ones are in use. Whenever a layer is being modified, data is copied into a write buffer, changes are made, and at a commit point at the end of the frame, all write buffers are committed to chunk storage.
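
To make that concrete, here is a rough sketch in D of what such layered chunk storage could look like. The names and layout are guesses for illustration only, not Voxelman’s actual declarations.

enum StorageType { uniform, compressedArray, fullArray }

// One immutable snapshot of a chunk layer; safe to hand to other threads.
struct ChunkLayer
{
    StorageType type;
    ulong uniformData;          // used when every block in the chunk is identical
    immutable(ubyte)[] data;    // compressed or full array of block data
}

// A chunk is addressed by its world position and carries one layer per kind of data.
struct Chunk
{
    int x, y, z;                // world position, doubling as the entity ID
    ChunkLayer blocks;          // committed block snapshot
    ChunkLayer blockEntities;   // committed block-entity snapshot
    ubyte[] writeBuffer;        // pending mutations, committed at the end of the frame
}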

Andrey calls his plugin system “semi-hackish”.

All plugins inherit from an IPlugin interface. Then, each plugin registers itself in a global table of plugins from a shared static constructor. The global table has lists for server and client plugins. The engine adds those plugins to the plugin manager based on a provided plugin pack. The plugin manager implements the initialization sequence. When starting initialization, you have lots of dependencies, so you need to run things in a specific order.
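
As a rough illustration of that scheme (the names here are hypothetical; this is not Voxelman’s code), a plugin module might register itself like this:

interface IPlugin
{
    string id();
    void init();
}

// Global tables, one per side; filled before main() runs.
__gshared IPlugin[] clientPlugins;
__gshared IPlugin[] serverPlugins;

final class GraphicsPlugin : IPlugin
{
    string id() { return "voxelman.graphics"; }
    void init() { /* set up renderer, subscribe to events, ... */ }

    shared static this()
    {
        // Each plugin module adds itself to the global table; the plugin
        // manager later selects from it based on the provided plugin pack.
        clientPlugins ~= new GraphicsPlugin;
    }
}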

He has found a lot of things to like about D. As major pros, he cites the module system (“no forward declarations”), foreach loops (“99% of loops in my code are these guys”), associative arrays, delegates, and templates (“They’re beautiful; you simply add another set of parentheses and you’re done”). He also loves D’s dynamic arrays (slices).

They are a perfect design, with the pointer and the length bundled together. You can append to them, concatenate them, and change their length.

As minor pros, he lists D’s Compile-Time Function Execution and its code generation and compile-time introspection features. Unlike some D users, he also counts the garbage collector in that group. He has implemented a mix of GC-ed and non-GCed memory in Voxelman.

High-level stuff is fully in GC memory. I call something high-level if it has only one instance, so I use interfaces/classes for the high-level parts. Low-level things are mostly stack allocated, using structs (which are POD in D), and the most performance sensitive and memory consuming parts use manual memory management (via Mallocator). This includes chunk storage and chunk meshes.
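
A minimal sketch of that mix, assuming Phobos’s Mallocator from std.experimental.allocator for the manually managed side (illustrative only, not Voxelman’s code):

import std.experimental.allocator.mallocator : Mallocator;

struct ChunkMesh              // performance-sensitive: manually managed
{
    float[] vertices;         // backed by malloc, invisible to the GC

    void allocate(size_t count)
    {
        vertices = cast(float[]) Mallocator.instance.allocate(count * float.sizeof);
    }

    void free()
    {
        Mallocator.instance.deallocate(vertices);
        vertices = null;
    }
}

class Renderer                // high-level, single-instance: left to the GC
{
    ChunkMesh[] meshes;
}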

He also has a list of rough corners. He doesn’t like that support for DLLs is not yet fully functional and reliable. He has found problems when trying to use shared (for example, the Mutex class cannot be used with it). He also finds all the use cases of the is expression confusing, saying the syntax “feels like regular expressions for templates; very powerful and concise, but hard to understand.”

His difficulties with shared actually took him down an interesting path that ultimately had a positive impact on performance.

I started my multi-threading by using the send and receive functions from std.concurrency. I found that I needed to send messages of variable length. For example, when loading or saving chunks, you need to send all the layers to another thread. This involved allocating arrays for all the layers and also required the use of shared.

This situation led me to the implementation of a lock-free message queue, where each message is just a stream of bytes. You write variables on one end and read them from the other. This is obviously a single producer, single consumer queue.

A disadvantage was the use of a fixed-size circular array. You need to make sure that the queue doesn’t fill up. This was a point where I found a good book that explains how atomics work: C++ Concurrency in Action: Practical Multithreading. This is one of the places in D’s documentation where you feel a lack of pointers on where to find relevant information on a specific topic.

So the new solution doesn’t require any allocations and is actually faster than the built-in one. Later I added a notification system via Semaphore, so that worker threads wait when out of work.
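
Here is a minimal sketch in that spirit, built on core.atomic: a fixed-size, single-producer/single-consumer byte queue. It only illustrates the idea and is not Voxelman’s implementation.

import core.atomic;

struct SpscByteQueue(size_t N)
{
    static assert((N & (N - 1)) == 0, "N must be a power of two");

    private ubyte[N] buf;
    private shared size_t head;   // advanced by the consumer only
    private shared size_t tail;   // advanced by the producer only

    // Producer side: copy data in, or report "full".
    bool push(const(ubyte)[] data)
    {
        immutable t = atomicLoad!(MemoryOrder.raw)(tail);
        immutable h = atomicLoad!(MemoryOrder.acq)(head);
        if (N - (t - h) < data.length)
            return false;
        foreach (i, b; data)
            buf[(t + i) & (N - 1)] = b;
        atomicStore!(MemoryOrder.rel)(tail, t + data.length); // publish the bytes
        return true;
    }

    // Consumer side: copy data out, or report "not enough data yet".
    bool pop(ubyte[] dest)
    {
        immutable h = atomicLoad!(MemoryOrder.raw)(head);
        immutable t = atomicLoad!(MemoryOrder.acq)(tail);
        if (t - h < dest.length)
            return false;
        foreach (i; 0 .. dest.length)
            dest[i] = buf[(h + i) & (N - 1)];
        atomicStore!(MemoryOrder.rel)(head, h + dest.length); // free the space
        return true;
    }
}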

If you’re looking for an open source D game to contribute to, Voxelman is waiting for you. You can read more about some of its internals on reddit, check out some images on imgur, and watch some videos on YouTube. I’ll leave you with this example of it in action:

Perspectives on D: Mihails Strasuns

Joakim is the resident interviewer for the D Blog. He has also interviewed members of the D community for This Week in D and is responsible for the Android port of LDC.


Mihails Strasuns, known as Dicebot on the D newsgroup, is a well-known community member who works for Sociomantic, one of the largest commercial users of D and host of the previous and upcoming DConfs in Berlin. He gave a talk on declarative programming at DConf 2014 and one on the process of transitioning from D1 (D 1.0) to D2 (D 2.0) at DConf 2015; he has acted as review manager for several additions to the standard library, Phobos, and is the current manager for DIPs (D Improvement Proposals), a process for suggesting changes to the D language. He also maintains the D packages for Arch Linux.

Joakim: Please tell us about yourself: who you are and where you’re from, what programming languages you used before D, and take us from your experience first discovering and using D to getting involved with its development.

Mihails: This is quite a long story to tell, but I will try to keep the details to a bare minimum.

My real name is Михаил Страшун, age 27, and I come from Latvia. I have been into programming since early primary school, initially with Pascal courses for kids, continuing with informatics competitions and small pet projects in Delphi. After finishing secondary school I got my first job, which was also about Delphi, but by that time I had already understood that it isn’t the most practical specialization. So next was C++, and the next few years were spent moving between small Latvian companies doing VoIP and CCTV server software. I ended up in a local outsourcing company doing part of a huge LTE project for Nokia Siemens Networks. That was also my introduction to the world of barebone programming and plain C.

Shortly before that (in something like 2010) I stumbled upon Andrei Alexandrescu’s article The Case For D and immediately got hooked. With fresh memories of learning C++ the hard way, it just felt like a breakthrough. There wasn’t any practical application I could use D for at that point, so it remained a purely theoretical interest for a long time. Back then, the best thing about D was reading the newsgroup and studying the papers and articles linked from there, which also sparked my interest in programming language design in general.

It is quite telling that it took me about 30 minutes from trying "Hello, World" to finding my first Phobos bug, and one day to find my first DMD bug. D toolchain stability has really improved since 2011. 🙂 Because of that, I didn’t initially have the courage to try D even for a pet project. To be honest, I still don’t have any, preferring to contribute to projects of others I have an interest in. The resulting contrast between my work activities in C and my spare-time contributions in D started a series of events that resulted in me being hired by Sociomantic Labs in 2013.

Regarding involvement in D development, I don’t feel like I am really part of it, even if the perception is sometimes different. I simply do stuff that feels necessary and that no one else seems to work on. Phobos contributions, compiler features, even review manager activity: it all happened simply because no one else was doing the things I wanted to get done. Stepping up was simply the fastest way to make it happen. I can’t even remember when I created my first Phobos/DMD pull request; it was a very casual and natural thing to do. Same with Arch Linux packaging.

I think this is one of the most commonly underrated things about how D development works: one doesn’t need any outstanding expertise or authority to make an impact. No permission from a benevolent dictator is needed either, just patience and the desire to work on things you want to happen.

Joakim: Sociomantic was started with D1 and has been moving to D2, a transition that you helped set up. You didn’t code much in D2 at Sociomantic initially; what are your impressions of D2 now that you’re using it more?

Mihails: I started with D2 and have used D1 for the first time in my life only in Sociomantic. 🙂

Most of the code I write these days is D2-compatible. But it isn’t what one may expect from idiomatic D code because D1 compatibility is preserved too. The Ocean library is quite a typical example of that kind of code and I am one of its maintainers.

Though there is also a bunch of small tools/scripts I write occasionally; those are pure (and maybe even idiomatic) D2. Our migration helper tool, d1to2fix, is one such example and we will probably open-source a bit more in the near future.

But most importantly, starting this month I will be spending part of my work time (1-2 days a week) helping D upstream; this is the first step in a planned Sociomantic contribution to the D Foundation. 🙂 And that definitely means using some bleeding-edge D2!

Joakim: Have you written much in D2 outside of Sociomantic? What projects and how was your experience?

Mihails: Sadly, not much. My main point of interest was vibe.d, specifically its MongoDB driver and REST interface generator. The latter has become my personal "playground" for stressing the limits of D’s meta-programming capabilities while still trying to maintain code readability (but the initial idea and implementation are 100% by Sönke Ludwig). I used it any time some personal web service was necessary, but that didn’t result in anything persistent. There were some minor contributions to tools like DStep or dub, but most often it was just trying out various concepts and throwing them away.

There is also some amount of D2 activity that is directly related to my job as our upgrade process has been slowly moving forward, but that is more about the compiler itself, like adding more permissive deprecation paths during the recent beta release cycle to ensure that we will be able to smoothly go through versions later. Sadly, it is very hard for me to find the motivation to work with D both at work and in my spare time; my mind urges for more diversity.

Joakim: You forked the Volt programming language repository on github a couple years ago, Rust last year. How do you feel those languages compare to D2? What do you think D2 has done right and wrong?

Mihails: Volt caught my interest about three years ago. Just as D tries to improve on C’s mistakes, Volt is an attempt to rethink D’s design mistakes. It is hard to really compare it with D as a language, because Volt is a hobbyist thing that is more of a prototype than a finished design. That was one of the best things about my (very short) involvement: all those refreshing design discussions in IRC, with no concerns about backwards compatibility and a strong desire to get things right. 🙂 At some point I was seriously considering dropping D and joining the Volt development team, but joining Sociomantic changed that. It feels more pragmatic to work on small improvements to a language you will actually use than on fundamental things that are likely to remain a hobby.

My attitude to Rust is quite different. Right now I consider it to have a serious advantage over D in the embedded/barebone domain, at least when thinking about the types of applications I have worked on earlier with C and C++. Last year, I wrote a blog post that compared D vs Rust from my personal point of view; it should give more detailed explanations of the language features. At the same time, I don’t feel tempted to start any personal hobby projects in Rust. It is a very well-designed, strict, purist language – exactly the kind of tool you want to have to manage big, complicated projects, but not that fun to use for small dirty experiments.

These days my main grudge with D is more about process than the language itself. It just happened that many D2 features were added in a quick burst when the split from D1 happened, and since then people keep trying to work with those mostly theoretical designs even though practice has shown that some choices were sub-optimal. A commonly mentioned example is the choice of making attributes like pure or @safe permissive by default. I believe having regular (once in ~5 years) major language revisions could be a better approach to moving forward, and this was one of the themes of my DConf talk last year. 🙂

Joakim: Please expand on some of these “D design mistakes:” what are the “theoretical designs” that have proven sub-optimal? Not making pure and @safe the default sounds more pragmatic, not theoretical.

Mihails: By "theoretical" I meant that certain decisions simply didn’t have any prolonged field-trial period before being set in stone. It felt right to add purity and safety enforcement, but only after some years of trying to adjust Phobos to actually use them did we start to realize that the other way around for the defaults could have been a better approach. Another example is the D module system. It felt perfectly reasonable and elegant when I first read the spec, but with more D project maintenance experience my opinion has changed. The main issue with it is that there is no way to add new public symbols to libraries in a backwards-compatible way without risking the breakage of user code (I have explained it in a bit more detail in my Rust vs D blog post). Some other aspects we have been discussing in the Volt IRC channel are the relation between symbol visibility and internal linkage, and the introduction of more structured template constraints for better error feedback. All kinds of stuff that is simply hard to foresee until you actually try it in practice and see how it fails.

Joakim: You certainly have a lot of criticism for D: what do you feel it got right?

Mihails: Just want to make it clear: I don’t have any bad feelings for D, it’s just the way my naturally grumpy perception works. If I don’t criticize something, that usually means that I am simply not familiar enough with the topic. 🙂

Despite all my complaints, D remains one of the most pleasant and practical languages I have used. It has a very rewarding learning curve: easy to start with for anyone familiar with C-style languages, easy to get your job done using only the subset of the language you are comfortable with, easy to slowly adopt the more advanced concepts of the language one by one. The documentation can be lacking, but the language itself is very well designed in that regard. One example of such decisions is the choice of string mixins over macros as the primary meta-programming facility. The latter is "cleaner", but the former is much easier to jump into, being a very intuitive concept.

It is not about getting any specific feature right, but about an overall taste for pragmatism that implies small, tough trade-offs here and there. And Walter seems to have pretty good taste. 🙂

Joakim: You’ve been review manager for some Phobos modules over the years: what was good or bad about the experience? Phobos has a reputation for interminable review, what are your thoughts on the current review process?

Mihails: That was a good experience: actually moving on with Phobos proposals instead of letting them rot for years in the review queue. 🙂 Even rejecting is better than keeping good work completely abandoned with no feedback at all. That was exactly how I started in this role; there were several interesting proposals in the review queue and no one wanted to step up, even though the required effort was trivial.

Most of the bad experience comes from an imbalance of attention. Proposals that target a smaller audience and/or have a complicated implementation can’t gather enough reviewers to be reliably accepted (as happened with the new std.signal). Proposals that are widely demanded and have a lot of natural subjectivity (like std.logger) get debated to death over and over again.

In my opinion there isn’t anything inherently wrong with the review process itself (it is quite simple and flexible). It is a natural consequence of wanting to get useful things into Phobos while maintaining strict backwards compatibility at the same time. We simply can’t risk accepting anything with a debatable API into Phobos, because it will be impossible to fix if issues are found later. And some packages are just so naturally opinionated that making the "correct" decision is simply impossible; it is a matter of taste!

In the end, it all comes down to an argument between two camps: those who prefer an all-powerful standard library and those who prefer endorsing dub, the D package manager. The actual review process is hardly that important here. When I understood that Phobos is following the kitchen-sink path and this is not going to change, I lost interest in its development.

Joakim: How is the new DIP process you initiated going? Lay out any changes you’ve had to make to the process and how you feel the proposal queue is now.

Mihails: I am quite satisfied with it. There are still small tweaks happening to the process as I gather more feedback from Andrei and Walter, of course. For example, for the first submitted DIPs I only checked the most formal acceptance criteria, and Andrei clearly indicated that the bar has to be much higher. But the core process seems to be working as intended right now.

In The Why and Wherefore of the New D Improvement Proposal Process, I outlined three key goals for the new process:

1) introduce some preliminary quality control
2) ensure formal response from language authors
3) transparent DIP status maintenance

(1) is probably the most lacking bit, as I am very alien to the academic world myself and can’t review proposals with the level of scrutiny that is desired. I could really use some help from other community members with experience in this domain.

But on (2) and (3) there was huge success in my opinion. The responses provided by Andrei (DIP 1001 and 1002) explain all the issues of the proposals in great detail and provide great insight into the decision rationale. And switching to a GitHub repository for managing the documents naturally helped a lot with (3).

Joakim: You’ve mentioned taste a couple times, including that Walter has “pretty good taste.” What stands out in D as exemplars?

Mihails: I think the decision to stick to the C syntax family was a big success and remains one of the big selling points for D in the language market. C syntax is often criticized for bad grammar decisions (for example, with variable declarations), but in practice that proves to not be too big of a deal. Providing some familiar ground for new devs, however, is definitely a big deal.

Slices come to mind too. When I was first learning D, it seemed awkward to separate the actual dynamic array from its view like that. But eventually I figured out that slices can be used as a view on any kind of contiguous data and started to appreciate how convenient that can be. Like the fact that one can make a D string from a C string by simply slicing the pointer. That makes you feel good.

Those examples may feel artificial, though, because "pretty good taste" is not about any specific feature or decision. It just happens that you start using the language and find yourself much more comfortable with it, as opposed to thinking about any of its design aspects in theory. To me, D feels like a language that was designed by someone with huge programming experience, even if I can’t truly explain why.

The D Language Foundation’s Scholarship Program

The D Language Foundation recently announced a new scholarship program aimed at EE and CS majors attending University "Politehnica" Bucharest (UPB). I contacted Andrei Alexandrescu for a few details on how the initiative came together, hoping for just enough tidbits of backstory to craft a blog post around. He obliged in a big way, turning my one question and "a few details" into an informative conversation.

Mike: I assume quite a lot of work went into this. Could you share a few details about how it came about?

Andrei: Gladly! The story starts back in 2012, when I gave a talk at the How to Web conference in Bucharest, my native city. It was a great event and I got to meet many great people. Except for one whose name kept coming up all over the Romanian IT space, Andrei Pitis.

I heard he was an instructor in the CS department at UPB (the best IT school in Romania, also noted internationally). He’s been directly involved in a number of IT-related foundations and professional organizations, and he created and led the immensely successful Vector Smart Watch startup. So, having heard he’d be around, I went to the conference speakers’ dinner hoping to bump into him.

Not knowing what he looked like, I was just craning my neck in search of someone who seemed popular. Meanwhile, I was passing time by making chit chat with a nice fellow who introduced himself to me. Now, you know how these group parties go. There’s always loud music and conversation, so I didn’t even hear his name and assumed he hadn’t heard mine.

As the evening progressed, I figured Andrei Pitis wasn’t going to show, so I had more time to chat with that fine gentleman. And I noticed two things. First, he was incredibly insightful. Second, he seemed equally excited about meeting me as I was about meeting Andrei Pitis. After a long while, the coin dropped: they were one and the same.

Thus started a great friendship. Andrei gave me great tips about how to start and conduct The D Language Foundation. Recently, he introduced me to two UPB CS systems professors, Razvan Deaconescu and Razvan Rughinis (together, the three had created the Tech Lounge nonprofit organization dedicated to helping graduating CS students start their careers).

Razvan Rughinis came up with the scholarship idea while we were chatting over beers in the quaint old town of Bucharest. In great part the idea was motivated by the strong interest UPB systems graduate students had in participating in a high-impact open source project such as the D language as part of their MSc thesis. In systems research (unlike e.g. CS theory), actual system building is a key part of the research project; therefore, a visible OSS project makes for a much stronger dissertation than the usual throwaway experimental code.

Clearly a strong opportunity had presented itself, and the DLang UPB scholarship is its realization.

Mike: How does the selection process work?

Andrei: The two professors introduce a few candidates, whom I pass through the rigors of the typical Facebook interview. We also ask for the usual suspects – proof of enrollment, transcripts, motivation letter, and references.

Of all components, the most important are (in order) the interview, the quality of the BSc projects, and the recommendation letters from their professors. The four current scholarship recipients passed the interview with flying colors and have very strong BSc projects and references. Some of them returned from summer internships at prestigious companies such as Bloomberg, others won CS awards. I have no doubt any company in the Bay Area or elsewhere would be happy to work with them. Once they finish their MSc, of course :o).

And I should mention here that the two professors aren’t only involved in the selection process. They will make themselves available to help manage the students on an ongoing basis. We’re very fortunate to have them.

Mike: Can you provide any info on the current recipients and their projects?

Andrei: The current recipients are Alexandru Razvan Caciulescu, Lucia Cojocaru, Eduard Staniloiu, and Razvan Nitu. I have posted an introduction to each on the D forums and, now that you mention it, I told them to create a wiki page with a blurb for each. They are hosted in a nice shared office kindly donated by Tech-Lounge.ro and… we’re in the process of getting a coffee machine up there :o).

They are all obviously interested in taking large systems projects that benefit their research interests and have an impact on the D language. To get them started, I took a page from Facebook’s practice and defined a “bootcamp” program. Bootcamp is a month-long process (six weeks at Facebook) during which the so-called n00bs get familiar with the technologies used in the organization: the language proper; the core runtime and standard library; the build process; the way code changes are created, reviewed, accepted, and committed; and, last but not least, the community ethos and the kind of problems we are facing that are fit for ingenious solutions.

To kickstart the bootcamp program, I defined a “bootcamp” label in our Bugzilla and applied it to a bunch of existing bugs, with an eye for the kind of bug that simultaneously has low surface (you don’t need to know a lot of internal details to get into it) and offers a good learning experience. Right now each student is busy fixing a couple of such bugs.

Long-term we are looking at high-impact libraries and tools. I do have a few ideas, but I have no doubt the students will come up with their own. Just give them time.

Mike: Speaking of time… is there any room here for an update on the D Foundation’s finances?

Andrei: Of course. To be honest, right now we’re in better shape than ever before (and than I would have hoped). Thanks to Sociomantic, who footed a large part of DConf 2016’s bills, we have quite a bit of change left from conference registration fees. I have also personally carried a number of high-profile appearances at public tech events and private corporate training events, with proceeds flowing to the Foundation.

So we have accumulated a little war chest – not much, but definitely not negligible. With our current funds and operational costs, we are covered for over two years. Of course, the situation is fluid and I am working on expanding both income and (useful) expenditures.

We’re running a very tight operation, and I want to keep it that way. By the Foundation bylaws, its officers (Walter Bright, Ali Çehreli, and myself) cannot get income from the Foundation, which preempts a variety of conflicts of interest. We are a public charity, which reduces and simplifies our taxation. We use modern, low-overhead money transfer methods such as transferwise.com and constantly scan for better ones. Anyone who considers donating should know that about every five dollars donated goes straight to pay for one hour of an exceptional graduate student’s time.

Mike: Are there more applications in the queue? Do you plan to extend scholarships to other universities?

Andrei: UPB seems to be off to a great start, but it’s also a happy case for many reasons: it’s my undergrad alma mater, we know professors there, and we don’t need to pay tuition. If we wanted to extend a scholarship to another university we’d need to avail ourselves of similar strategic advantages. Needless to say, if anyone who reads this has ideas on the matter, please contact me.

Anyhow, for the time being, we got one more strong DLang UPB scholarship application literally today.

Mike: To close out, is there anything you’d like to say to people who’d like to help out?

Andrei: I’m very excited about this scholarship program and possible extensions to it. The reason for my excitement is that this is but a part of a larger strategy. Allow me to explain.

Up until now, we had no idea what to do with money even if we had it. A while ago, I met this potential donor who said, “OK, say I gave the Foundation half a million dollars over two years, no strings attached. What would you do with it?” To my own surprise, I had only vague answers. I asked Walter the same question, and he had even less of a clue than me.

So then I figured it’s essential for the Foundation to have a strong response to that. I’m a big believer in the adage “luck helps the prepared”, of which the converse is “luck is wasted on the unprepared”. By that paradigm, not knowing what we’d do with money was a definite way to ensure we’d never be big. Now that we have the scholarship program, there exists a powerful reason for people to donate to the Foundation: donations help us find and support good students to work on high-impact D-related projects that push the state of CS systems research forward.

Another thing that would be great to have “donations” of is contributor time. Receiving more students starts pushing against our management capacity. Currently, and somewhat to my surprise, I am effectively a manager, seeing that all of these things I just gave you an earful of (bringing money in to the Foundation, managing bootcamp, finances, operations) take enough time to be a full-time job that leaves little time for coding. At some point, I won’t be able to help everyone with their research, so I’ll need to delegate some of that work to other folks. I’m talking any capacity here – from code reviews to managing to co-authoring papers to co-advising.

There are more things I have in mind, but it’s early to share those. In brief, we need to organize ourselves for further growth. What’s clear to me is we’re no longer a seat-of-the-pants operation in a (virtual) basement. The D Language is exiting its adolescence.

Project Highlight: The New CTFE Engine

CTFE (Compile-Time Function Execution) is today a core feature of the D Programming Language. D creator Walter Bright first implemented it in DMD as an extension of the constant folding logic that was already there. Don Clugston (of FastDelegate fame) made a pass at improving it and, according to Walter, "took it much further". Since that time, usage of CTFE has shown up in one D project after another, including in D’s standard library. For example, Dmitry Olshansky employed it in his overhaul of std.regex to great effect.
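
For readers unfamiliar with the feature, the idea is that an ordinary function can be evaluated by the compiler whenever its result is required at compile time:

int fib(int n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

enum precomputed = fib(10);      // evaluated through CTFE; becomes the constant 55
static assert(precomputed == 55);

void main()
{
    int atRunTime = fib(10);     // the very same function, run normally
}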

On the last day of DConf 2016, Stefan Koch gave a lightning talk on his thoughts about CTFE in D. At the end of the talk, in response to a question from Andrei Alexandrescu on how D’s implementation could be improved, he said the following:

CTFE is really a hack. You can see that it’s a hack. It’s implemented as a hack. It is the most useful hack that I’ve ever seen, and it is definitely a hacker’s tool to do stuff that are like magic. But to be fast, it would need to be heavily redesigned, reimplemented, possibly executed in multiple threads, because it is used for stuff that we could never have envisioned when it was invented.

Not long after that, Stefan opened a discussion on the forums and took up the torch to improve the CTFE engine. As to why he got started on this journey in the first place, Stefan says, "I started work on the CTFE engine because I said so at DConf." But, of course, there’s more to it than that.

I have pretty heavy-weight CTFE needs (I worked on a compile-time trans-compiler). Also my CTFE SQLite reader is failing if you want to read a database bigger than 2MB at CTFE.

His investigations into the performance of the CTFE interpreter shed light on its problems.

The current interpreter interprets every AST-Node it sees directly. This leaves very little space to collect information about the code that is being interpreted. It doesn’t know when something will be used as a reference, so it needs to copy every variable on every mutation. It has to do a deep-copy for this. That means it copies the whole chain of mutations every time.

To clarify, he offers the following example.

Imagine foreach(i; 0 .. 10) { a = i; }. On the first iteration we save a` = 0 and set a`` to 1. On the second iteration we save a``` = 1 and a```` = 0 and we set a````` to 2, then a`````` = 1 and a``````` = 0 and so on. As you can see, the memory requirements just shoot up. It’s basically a factorial function with a very small coefficient. That is why for very small workloads this extreme overhead is not noticeable.

That flaw looked unfixable. Indeed, the whole architecture in dinterpret.d is very convoluted and hard to understand. I did a few experiments on improving the memory management of the interpreter, but they proved fruitless.

Once he realized there was going to be no quick fix, Stefan sat down and drew up a plan to avoid digging himself into the same hole the current interpreter was in. The result of his planning led him down a road he hadn’t expected to travel.

Direct Interpretation was out of the question since it would give the new engine too little time to analyze data flow and decide whether a copy was really needed or not. I had to implement an Intermediate Representation. It had to be portable to different evaluation back-ends. I ended up with a solution, inspired by OpenGL, of defining my interface in the form of function calls an evaluation back end had to implement. That meant I would not be able to simply modify the current interpreter. This made the start very steep, but it is a decision I do not regret.

His implementation consists of a front end and a back end.

The front end walks the AST and issues calls to the back end. And the back end transforms those calls into actual bytecode. This bytecode is interpreted by the back end as soon as the front end requires it.

In terms of functionality, he likens the current implementation to an immediate mode graphics API, and his revamp to retained mode. In this case, though, it’s the immediate mode that’s the memory hog.
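
To illustrate the shape of such a design, here is a purely hypothetical sketch in D of a call-based back-end interface; the names are invented for this post and are not the actual newCTFE interfaces.

interface BytecodeGen
{
    void beginFunction();
    int  genLocal();                       // allocate a new virtual register/local
    void genConst(int dest, long value);   // dest = value
    void genAdd(int dest, int a, int b);   // dest = a + b
    void genRet(int local);                // return local
    long run();                            // interpret the recorded bytecode
}

// The front end walks the AST and issues calls such as:
long evalAdd(BytecodeGen gen, long x, long y)
{
    gen.beginFunction();
    auto a = gen.genLocal();
    auto b = gen.genLocal();
    auto r = gen.genLocal();
    gen.genConst(a, x);
    gen.genConst(b, y);
    gen.genAdd(r, a, b);
    gen.genRet(r);
    return gen.run();
}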

You can read about his progress in the CTFE Status thread, where he has been posting frequent updates. His updates include problems he encounters, features he implements, and performance statistics. Eventually, every compiler that uses the DMD front end will benefit from his improvements.

Big Performance Improvement for std.regex

Dmitry Olshansky has been a frequent contributor to the D programming language. Perhaps his best known work is his overhaul of the std.regex module, which he architected as part of Google Summer of Code 2011. In this post, he describes an algorithmic optimization he implemented this past summer that resulted in a big performance win.


Optimizing std.regex has been my favorite pastime, but it has gotten harder over the years. It eventually became clear that micro-optimizing the engine’s state copy routine, or trying to avoid that extra write, wasn’t going to cut it anymore. To move further, I needed a new algorithmic improvement. This is how the so-called “Bit-NFA” came to be implemented. Developed in May of this year, it has come a long way to land in the main repository.

Before going into details, a short overview of the engine is called for. A user-specified pattern given to the engine first goes through a compilation process, where it gets transformed into a bytecode program, along with a bunch of lookup tables and auxiliary data-structures. Bytecode implies a VM, not unlike, say, the Java VM, but far simpler and more specific. In fact, there are two of them in std.regex, one that evaluates execution of threads in a backtracking manner and another one which evaluates all threads in lock-step, resolving any duplicates along the way.

Now, running a full blown VM, even a tiny one, on each character of input doesn’t sound all that high-performance. That’s why there is an extra trick, a kickstart engine (it should probably be called a “sidekick engine”), which is a dumb approximation of the full engine. It is run over the input first. When it spots something that looks like a match, the full engine is run to check it. The only requirement is that it can have no false negatives. That is, it has to detect as positive all matches of the regex pattern. This kickstart engine is the central piece of today’s post.

Historically, I intended for there to be a lot of different kickstart engines, ranging from a simple 'memchr the first byte of the pattern' to a Boyer-Moore search on the prefix of a pattern. But during the gory days of GSoC 2011, a simple solution came first and overshadowed all others: the Shift Or algorithm.

Basically, it is an NFA (Nondeterministic Finite Automaton), where each state is a bit in a word. Shifting this word advances all the states. Masking removes those that don’t match the current character. Importantly, shifting also places 0 as the first bit, indicating the active state.

With these two insights, the whole process of searching for a string becomes shifting + OR-masking the bits. The last point is checking for the successful match – one of the bits is in the finish state.
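
To make the idea concrete, here is a small standalone Shift-Or substring search in D, a sketch of the textbook algorithm for patterns up to 64 characters; it is not the std.regex code.

size_t shiftOrFind(const(char)[] haystack, const(char)[] needle)
{
    assert(needle.length > 0 && needle.length <= 64);
    ulong[256] mask;
    mask[] = ~0UL;                           // 1 = "does not match at this position"
    foreach (i, c; needle)
        mask[cast(ubyte) c] &= ~(1UL << i);  // clear bit i wherever the pattern has c

    ulong state = ~0UL;                      // all states inactive
    immutable finish = 1UL << (needle.length - 1);
    foreach (i, c; haystack)
    {
        state = (state << 1) | mask[cast(ubyte) c];  // advance all states at once
        if ((state & finish) == 0)           // finish bit cleared => full match
            return i + 1 - needle.length;
    }
    return size_t.max;                       // not found
}

unittest
{
    assert(shiftOrFind("find the needle in the haystack", "needle") == 9);
}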

Looking at this marvelous construction, it’s tempting to try and overcome its limitation – the straight-forward execution of states. So let’s introduce some control flow by denoting some bits as jumps. To carry out a jump, we just need to map every combination of jump bits to the mask of the resulting positions. A basic hashmap could serve us well in this regard. Then the cycle becomes:

  1. Shift the word
  2. Capture control flow bits
  3. Lookup control flow table
  4. Mask AND with control flow bits
  5. Check for finish state(s) bits
  6. Mask OR with match filter table

In the end, we execute the whole engine with nothing more than a hash-map lookup, a table lookup, and a bit of bitwise operations. This is the essence of what I call the Bit-NFA engine.

Of course, there are some tricky bits, such as properly mapping bytecodes to bits. Then there comes Unicode… oh gosh. The trick to Unicode, though, is having a fast path for ASCII (< 0x80) and a separate one for the rest. For ASCII, we just go with a simple table. Unicode is a two-staged variation of it. Two-staging the table lets us coalesce identical pages, saving space for the whole 21 bits of the code point range.

Overall, the picture really is worth a thousand words. Here is how the new kickstart engine stacks up.

This now only leaves us to optimize the VMs further. The proven technique is JIT-ing the bytecode, which is what top engines are doing. Still, I’m glad there are notable tricks to speed up regex execution in general without pulling out this heavy-handed weapon.

Project Highlight: libasync

libasync is a cross-platform event loop library written completely in D. It was created, and continues to be maintained, by Etienne Cimon, who started it as a native driver for vibe.d, a modular asynchronous I/O framework most often used for web app development in D.

In 2014 or so, I was looking for a framework to power my future web development projects. I wasn’t going to use an interpreted language, as binary executables were too attractive. I found vibe.d appealing because, coming from C++, it was relatively simple and featureful. So I studied it, along with the D programming language and the Phobos standard library.

vibe.d has always used libevent under the hood by default. This is where Etienne ran into a problem that bothered him.

I stumbled on some workflow issues when deploying vibe.d apps to other operating systems which may or may not have the right version of libevent in the package repository. I didn’t want to package a DLL with my server, or have to go through dependency hell with my software, and I wanted everything to be consistently written in D to reduce the mental complexity of switching programming languages or to debug other issues.

So he decided to study up on the system APIs across the platforms supported by DMD (Windows, Linux, *BSD and OS X) and create his own event loop library in D. Now he, and anyone using libasync, can issue a single command with DUB to compile and execute a web application without needing to worry about external event loop dependencies.

libasync takes advantage of D’s delegates to provide a very intuitive interface.

// g_evl is the event loop and g_swDns a stopwatch started before the request;
// both are defined elsewhere in the surrounding test code.
void testDNS() {
	auto dns = new shared AsyncDNS(g_evl);
	dns.handler((NetworkAddress addr) {
		writeln("Resolved to: ", addr.toString(), ", it took: ", g_swDns.peek().usecs, " usecs");
	}).resolveHost("127.0.0.1");
}

Etienne says of the code snippet above:

The D garbage collector will keep the AsyncDNS object in dns alive for as long as the delegate passed to dns.handler is alive on the heap, which is in this object. The delegate syntax is simpler to declare than JavaScript’s, and it is also type-safe. This DNS resolver will work on any platform thrown at it, thanks to D’s compile-time version conditions.

libasync makes use of the asynchronous I/O facilities available on each supported platform and provides a number of event-handlers out of the box.

Cross-platform event handlers have been defined for DNS resolution, UDP Messages, (Buffered/Unbuffered) TCP Connections, TCP Listeners, File Operations, Thread-local (Notifiers) and Cross-thread Signals, Timers and File Watchers. The intrinsics involve EPoll for Linux, KQueue for OS X and BSD, and overlapped I/O for Windows. With all of these features thoroughly tested through a vibe.d driver, libasync has become a very fast and reliable library which I use in all of my projects. My benchmarks show it as being a little slower than the libevent driver in vibe.d, though its self-explanatory code base makes it seamless to understand, maintain, and deploy.

A libasync driver has been added to vibe.d and work is going on to improve the library’s performance.

The stability of the underlying OS features makes for very little need for changes, although there is a big improvement involving the proactor pattern in the works for libasync and a new architecture for vibe.d. Together, those two developments are likely to increase the library’s performance significantly.

If you find yourself needing an event loop in D and want to give libasync a spin, you can visit the library’s page at the DUB repository for information on how to add it as a dependency to your own DUB-managed projects. libasync, in turn, has only one dependency itself, another library maintained by Etienne that provides a set of allocators and allocator-friendly containers called memutils.

It wasn’t so long ago that anyone using D who wanted something like libasync or memutils would need to either roll their own or bind to a C library. The ever-expanding list of libraries in the DUB repository, created and made available by members of the D community like Etienne, makes it much easier to jump into D today than ever before.

GSoC Report: std.experimental.xml

Lodovico Giaretta is currently pursuing a Bachelor Degree in Computer Science at the University of Trento, Italy. He participated in Google Summer of Code 2016, working on a new XML module for D’s standard library, Phobos.


I started coding in high school with Pascal. I immediately fell in love with programming, so I started studying it by myself and learned both Java and C++. But when I was using Java, I was missing the powerful metaprogramming facilities and the low level features of C++. When I was using C++, I was missing the simplicity and usability of Java. So I started looking for a language that "filled the gap" between these two worlds. After looking into many languages, I finally found D. Despite being more geared towards C++, D provides a very high level of productivity, as correct code is easier to read and write. As an example, I was programming in D for several months before I was bitten by a segfault for the first time. It easily became one of my favorite languages.

The apparent lack of libraries, my lack of time, and the need to use other languages for university projects made me forget D for some time, at least until someone told me about Google Summer of Code. When I discovered that the D Foundation was participating, I immediately decided to take part and found that there was the need for a new XML library. So I contacted Craig Dillabaugh and Robert Schadek and started to plan my adventure. I want to take this occasion to thank them for their great continuous support, and the entire community for their feedback and help.

This was my first public codebase and my first contribution to a big open source project, so I didn’t really know anything about project management. The advice about this field from my mentor Robert has been fundamental for my success; he helped me improve my workflow, keep my efforts focused towards the goal, and set up correctness tests and performance benchmarks. Without his help, I would never have been able to reach this point.

The first thing to do when writing a library is to pick a set of principles that will guide development. This choice is what will give the library its peculiar shape, and by having a look around one finds that there are XML libraries that want to be minimal in terms of codebase size, or very small in terms of binary size, or fully featured and 101% adherent to the specification. For std.experimental.xml, I decided to focus on genericity and extensibility. The processing is divided in many small, quite simple stages with well-defined interfaces implemented by templated components. The result is a pipeline that is fully customizable; you can add or substitute components anywhere, and add custom validation steps and custom error handlers.

From an XML library, a programmer expects different high level constructs: a SAX parser, a DOM parser, a DOM writer and maybe some extensions like XPath. He also expects to be able to process different kinds of input and, for std.experimental.xml, to “hack in” his own logic in the process. This requires a simple, yet very flexible, intermediate representation, which is produced by the parsing stage and can be easily manipulated, validated, and transformed into whatever high-level construct is needed. For this, I chose a concept called Cursor, a pointer inside an XML document, which can be queried for properties of a given XML node or advanced to a subsequent one. It’s akin to Java’s StAX (Streaming API for XML), from which I took inspiration. In std.experimental.xml, all validations and transformations are implemented as chains of Cursors, which are then usually processed by a SAX parser or a DOM builder, but can also be used directly in user code, providing more control and speed.
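
As a rough illustration of the idea (the names below are hypothetical and simplified, not the final std.experimental.xml API), cursor-style processing looks something like this:

enum NodeKind { document, element, text, comment }

interface Cursor
{
    NodeKind kind();      // what the cursor is currently pointing at
    string name();        // element name, if any
    string content();     // text content, if any
    bool enter();         // descend to the first child, if there is one
    void exit();          // move back up to the parent
    bool next();          // advance to the next sibling
}

// Report every element name in document order.
void walkElements(Cursor c, void delegate(string) sink)
{
    do
    {
        if (c.kind == NodeKind.element)
            sink(c.name);
        if (c.enter())
        {
            walkElements(c, sink);  // visit the children...
            c.exit();               // ...then come back to this level
        }
    } while (c.next());
}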

Talking about speed, which in XML processing can be very important, I have to admit that I didn’t spend much time on optimization, leaving a lot of space for future performance improvements. Yet, the library is fast enough to guarantee that, for big files (where performance matters), an SSD (Solid State Drive) is needed to move the bottleneck from the fetching to the processing of the data. As this is an extensible and configurable library, the user can choose his tradeoffs with fine granularity, trading input validation and higher-level constructs for speed at will.

To conclude, the GSoC is finished, but the library is not. Although most parts are there, some bits are still missing. As a new university semester has started, time is becoming a rare and valuable resource, but I’ll do my best to finish the work in a short time so that Phobos can finally have a modern XML library to be proud of. I also have a plan to add more advanced functionality, like XML Schemas and XPath, but I don’t know when I’ll manage to work on that, as it is quite a lot to do.