The D Language Foundation Google Summer of Code 2016 Postmortem

Posted on

Craig Dillabaugh was first drawn to D by its attractive syntax and Walter Bright’s statement that D is “a programming language, not a religion”. He maintains bindings to the geospatial libraries shapelib and gdal, volunteered to manage the GSoC 2015 & 2016 efforts for D, and has taken it on again for 2017. He lives near Ottawa, Canada, and works for a network monitoring/security company called Solana Networks.


The 2016 Google Summer of Code (GSoC) proved to be a great success for the D Language Foundation. Not only did we have, for us, a record number of slots allotted (four) and all projects completed successfully, perhaps most important of all we attracted four excellent students who will hopefully be long time contributors to the D Language and its community. This report serves as a review for the community of our GSoC efforts this past summer, and tries to identify some ways we can make 2017 an equal, or better, success.

Background

Back in 2011 and 2012, Digital Mars applied to participate in, and was accepted to, Google Summer of Code. In each of those years we were awarded three slots and had successful projects. Additionally, a number of long time D contributors, including David Nadlinger, Alex Bothe, and Dmitry Olshansky, were involved as students. Sadly, in the succeeding two years we were not awarded any slots. After 2014’s unsuccessful bid, Andrei asked on the forums if anyone wanted to take the lead for the 2015 GSoC, as he had too many things on his plate. This is when I decided to volunteer for the job.

I prepared for the 2015 GSoC and worked on getting some solid items for our Ideas page. I even prepared what I thought was a beautifully typeset document in LaTeX for our final submission. Needless to say, I was very disappointed when I had to copy/paste each section into the simple web form that Google provided for submissions. Sadly, that year we were rejected once more, though I felt our list of ideas was solid.

We applied again in 2016 for the first time as The D Language Foundation. Again, the community contributed lots of solid suggestions for the Ideas page and we were accepted for the first time in four years. I think that perhaps getting accepted involves a bit of luck, as our ideas were similar to, or repeated from, those that were not accepted in 2015. However, more effort was put into polishing up the page, so perhaps that helped.

The Selection Process

Once we were accepted as a mentoring organization, the process of receiving student proposals began. We received interest from a large number of students from all over the world (about 35). In the end, a total of 23 proposals were officially submitted, ranging from very short–obviously last minute–pieces, to several excellent efforts, including Sebastian Wilzbach’s 20-page document.

Our selection process was, I felt, very rigorous. We had seven of our potential admins/mentors screen the initial proposals. This involved reading all 23 proposals, which was a significant amount of work. From this initial screening we identified eight students/proposals that we thought could become successful projects. We then had all mentors individually rank each of the shortlisted proposals, another significant time commitment on their part.

Finally, interviews were arranged with all eight students. In most cases, two mentors interviewed each student, and the interviews were fairly intense, job-style interviews that involved coding exercises. A number of our mentors were involved in this process, but I think Amaury Sechet interviewed all of the students. It is no small feat to arrange and then conduct interviews with students in so many different time zones, so a huge thanks to all the mentors, but Amaury in particular. Those involved in the screening/interview process included Andrei Alexandrescu, Ilya Yaroshenko, Adam Ruppe, Adam Wilson, Dragos Carp, Russel Winder, Robert Schadek, Amaury, and myself.

Awarding of Slots

The next step for our organization was to decide how many slots we would request from Google. I really had no idea what to expect, but I was hoping we might get two slots awarded to us, as there were many good organizations vying for a limited number of slots. We felt that most of the short-listed projects could have been successful, but decided to not be too greedy and requested just four slots. As it turned out, perhaps we should have asked for more; we were awarded all four. We then selected our top four ranked students from the interview process. They were, in no particular order:

  • Sebastian Wilzbach: Science for D – a non-uniform RNG (Ilya Yaroshenko mentor)
  • Lodovico Giaretta: Phobos: std.xml (Robert Schadek mentor)
  • Wojciech Szeszol: Improvements to DStep (Russel Winder mentor)
  • Jeremy DeHaan: Precise Garbage Collector (Adam Wilson mentor)

Summer of Code

Once the projects were awarded, I must say that most of my work was done. From there on the mentors and students got down to work. I tried to keep tabs on progress and asked for regular updates from both the mentors and the students. These were, in most cases, promptly provided.

While there were some challenges, and a few projects had to be modified slightly in some instances, everyone progressed steadily throughout the summer, so there were no emergencies to deal with. All of our students passed their mid-term evaluations and by the end of the summer all four projects were completed (although Jeremy has some on-going work on his precise GC). As a result, everyone got paid and, I presume, everyone was happy.

In addition to our original mentors, thanks are due to Jacob Carlborg (DStep) and Joseph Rushton Wakeling (RNG) for providing additional expertise.

Mentor Summit

Google offered money for students to attend academic conferences and present results based on their GSoC work. Google also offered to pay travel costs for two mentors to travel to the mentor summit in California. Regrettably, none of our students had the time to take advantage of the conference money, but Robert Schadek was able to attend the Mentor Summit from Oct 28th to 30th in Sunnyvale, California. There he was able mingle with, and learn from, mentors from the other organizations that participated.

Looking Forward

It is hard to believe, but the process starts all over again in a few short months. The success of this past year will create expectations for 2017, and I hope that we can replicate that success. A number of lessons were learned from this past year that we can carry forward into the next round. So in this section, I will try to distill some of what we learned to help guide our efforts in the coming year.

The Ideas Page and Advertising

Most of the work of identifying projects was carried out through the D Forums, with the odd email to past mentors. This was generally successful, but a number of proposals from previous years ended up being recycled. While it may be inevitable, it seemed that many of the proposal ideas were added at the last minute. Since a number of our best ideas from the 2016 page are now completed projects, we will need to replenish the Ideas page for 2017.
Recommendations

  1. We should post a PDF version of one of the successful proposals on our Ideas page to give students an example of what we expect. Although it was excellent, we likely shouldn’t use Sebastian Wilzbach’s treatise, as that may scare some people off.
  2. Try to get a decent set of solid proposals with committed mentors earlier in the process. In 2016 a number of the mentors were signed up at the last minute. The earlier the proposals are posted the more time we have to polish them and make them look more attractive.

Interview and Selection Process

The selection process went well, but was a lot of work. Having input from a number of different mentors/individuals was invaluable.
Recommendations

  1. Streamline the selection process, but reuse much of what was done last year. Having a rigorous selection process was a key contributor to 2016’s success.
  2. Start the interview portion of the selection process earlier so that we have more time to set up and carry out the interviews.

Project Progress and Mentoring

Much of the success of an individual project involves having a good relationship and work plan between the student and mentor. From this perspective, the organization isn’t heavily involved. Since all of our students worked well with their mentors, even less organizational administration was required. This is a byproduct of good screening and a solid set of ideas, and being fortunate enough to get good students.

However, there are areas where we could have run things a bit better. Students and mentors were asked to regularly provide updates on their progress, and they generally did this well, but there was no formal reporting process. Also, it would be worthwhile to have a centralized collection of project timelines/milestones where administrators and others involved in the projects (we had a few individuals working in advisory roles) can keep an eye on project progress.

Recommendations

  1. We should keep a centralized version of project timelines somewhere (ie. Google Docs Spreadsheet) where we can check on project milestones. This should be shared with all individuals involved in a project (student/mentors/advisors/admins).
  2. Have a more formalized process for students and mentors reporting on their progress. This would involve weekly student updates and biweekly mentor updates.

Summary

The 2016 GSoC was a great success, and with any luck will be a good foundation for our successful participation in the year to come. We were fortunate that everything seemed to fall nicely into place, from our being awarded all four projects, to having all of our students complete their projects. Perhaps Sebastian, Lodovico, Wojciech or Jeremy will be involved again as students (or even mentors), and in any case continue to contribute to the D Language.

GSoC Report: std.experimental.xml

Posted on

Lodovico Giaretta is currently pursuing a Bachelor Degree in Computer Science at the University of Trento, Italy. He participated in Google Summer of Code 2016, working on a new XML module for D’s standard library, Phobos.


GSoC-icon-192I started coding in high school with Pascal. I immediately fell in love with programming, so I started studying it by myself and learned both Java and C++. But when I was using Java, I was missing the powerful metaprogramming facilities and the low level features of C++. When I was using C++, I was missing the simplicity and usability of Java. So I started looking for a language that “filled the gap” between these two worlds. After looking into many languages, I finally found D. Despite being more geared towards C++, D provides a very high level of productivity, as correct code is easier to read and write. As an example, I was programming in D for several months before I was bitten by a segfault for the first time. It easily became one of my favorite languages.

The apparent lack of libraries, my lack of time, and the need to use other languages for university projects made me forget D for some time, at least until someone told me about Google Summer of Code. When I discovered that the D Foundation was participating, I immediately decided to take part and found that there was the need for a new XML library. So I contacted Craig Dillabaugh and Robert Schadek and started to plan my adventure. I want to take this occasion to thank them for their great continuous support, and the entire community for their feedback and help.

This was my first public codebase and my first contribution to a big open source project, so I didn’t really know anything about project management. The advice about this field from my mentor Robert has been fundamental for my success; he helped me improve my workflow, keep my efforts focused towards the goal, and set up correctness tests and performance benchmarks. Without his help, I would never have been able to reach this point.

The first thing to do when writing a library is to pick a set of principles that will guide development. This choice is what will give the library its peculiar shape, and by having a look around one finds that there are XML libraries that want to be minimal in terms of codebase size, or very small in terms of binary size, or fully featured and 101% adherent to the specification. For std.experimental.xml, I decided to focus on genericity and extensibility. The processing is divided in many small, quite simple stages with well-defined interfaces implemented by templated components. The result is a pipeline that is fully customizable; you can add or substitute components anywhere, and add custom validation steps and custom error handlers.

From an XML library, a programmer expects different high level constructs: a SAX parser, a DOM parser, a DOM writer and maybe some extensions like XPath. He also expects to be able to process different kinds of input and, for std.experimental.xml, to “hack in” his own logic in the process. This requires a simple, yet very flexible, intermediate representation, which is produced by the parsing stage and can be easily manipulated, validated, and transformed into whatever high-level construct is needed. For this, I chose a concept called Cursor, a pointer inside an XML document, which can be queried for properties of a given XML node or advanced to a subsequent one. It’s akin to Java’s StAX (Streaming API for XML), from which I took inspiration. In std.experimental.xml, all validations and transformations are implemented as chains of Cursors, which are then usually processed by a SAX parser or a DOM builder, but can also be used directly in user code, providing more control and speed.

Talking about speed, which in XML processing can be very important, I have to admit that I didn’t spend much time on optimization, leaving a lot of space for future performance improvements. Yet, the library is fast enough to guarantee that, for big files (where performance matters), an SSD (Solid State Drive) is needed to move the bottleneck from the fetching to the processing of the data. Being this is an extensible and configurable library, the user can choose his tradeoffs with fine granularity, trading input validation and higher level constructs for speed at will.

To conclude, the GSoC is finished, but the library is not. Although most parts are there, some bits are still missing. As a new university semester has started, time is becoming a rare and valuable resource, but I’ll do my best to finish the work in a short time so that Phobos can finally have a modern XML library to be proud of. I also have a plan to add more advanced functionality, like XML Schemas and XPath, but I don’t know when I’ll manage to work on that, as it is quite a lot to do.

GSoC Report: DStep

Posted on

Wojciech Szęszoł is a Computer Science major at the University of Wrocław. As part of Google Summer of Code 2016, he chose to make improvements to Jacob Carlborg’s DStep, a tool to generate D bindings from C and Objective-C header files.


GSoC-icon-192It was December of last year and I was writing an image processing project for a course at my university. I would normally use Python, but the project required some custom processing, so I wasn’t able to use numpy. And writing the inner loops of image processing algorithms in plain Python isn’t the best idea. So I decided to use D.

I’ve been conscious about the existence of the D language for as long as I can remember, but I’d never convinced myself before to try it out. The first thing I needed to do was to load an image. At the time, I didn’t know that there is a DUB repository containing bindings to image loading libraries, so I started writing bindings to libjpeg by myself. It didn’t end very well, so I thought there should be a tool that will do the job for me. That’s when I found DStep and htod.

Unluckily, the capabilities of DStep weren’t satisfying (mostly the lack of any kind of support for the preprocessor) and htod didn’t run on Linux. I ended up coding my project in C++, but as GSoC (Google Summer of Code) was lurking on the horizon, I decided that I should give it a try with DStep. I began by contacting Craig Dillabaugh (Ed. Note: Craig volunteers to organize GSoC projects using D) to learn if there was any need for developing such a project. It sparked some discussion on the forum, the idea was accepted, and, more importantly, Russel Winder agreed to be the mentor of the project. After some time I needed to prepare and submit an official proposal. There was an interview and fortunately I was accepted.

The first commit I made for DStep is dated to February, 1. It was a proof of concept that C preprocessor definitions can be translated to D using libclang. Then I improved the testing framework by replacing the old Cucumber-based tests with some written in D. I made a few more improvements before the actual GSoC coding period began.

During GSoC, I added support for translation of preprocessor macros (constants and functions). It required implementing a parser for a small part of the C language as the information from libclang was insufficient. I implemented translation of comments, improved formatting of the output code (e.g. made DStep keep the spacing from C sources), fixed most of the issues from the GitHub issue list and ported DStep to Windows. While I was coding I was getting support from Jacob Carlborg. He did a great job by reviewing all of the commits I made. When I didn’t know how to accomplish something with D, I could always count on help on forum.dlang.org.

DStep was the first project of such a size that I coded in D. I enjoyed its modern features, notably the module system, garbage collector, built-in arrays, and powerful templates. I used unittest blocks a lot. It would be nice to have named unit tests, so that they can be run selectively. From the perspective of a newcomer, the lack of consistency and symmetry in some features is troubling, at least before getting used to it. For example there is a built-in hash map and no hash set, some identifiers that should be keywords starts with @ (Ed. Note: see the Attributes documentation), etc. I was very sad when I read that the in keyword is not yet fully implemented. Despite those little issues, the language served me very well overall. I suppose I will use D for my personal toy projects in the future. And for DStep of course. I have some unfinished business with it :).

I would like to encourage all students to take part in future editions of GSoC. And I must say the D Language Foundation is a very good place to do this. I learned a lot during this past summer and it was a very exciting experience.