Author Archives: Razvan Nitu

DMD Compiler as a Library: A Call to Arms

Having a flexible and powerful compiler library has been one of the stated goals of the D Language Foundation for some time now. This makes sense, as a proper compiler library will channel the efforts of contributors into building developer tools, which, in turn, will increase the adoption rate of the language. However, progress on this topic has been slow, mainly due to two aspects: (1) the lack of a clear direction, and (2) the intimidating complexity of the DMD frontend, which requires significant work on the compiler codebase.

The good news is that we now have a plan, which I will outline in this blog post. The bad news is that implementing this plan requires significant effort, and we need more contributors. However, the silver lining is that the work, while extensive, mostly involves refactoring the code. This provides an excellent opportunity for contributors to familiarize themselves with the compiler codebase while delivering real value. Before delving into the specifics, let me give you some background.

Current Status And How We Got Here

To fully understand the work done so far on the compiler-as-a-library project, I highly recommend watching my talk on this subject.

In summary:

  • Several years ago, we began packaging the compiler as a library.
  • Our goal was to clearly separate compilation phases: lexing, parsing, semantic analysis, optimizations, and code generation.
  • The parsing and semantic analysis modules were interdependent, necessitating a method for separation.
  • We opted to template the parser with an ASTFamily template parameter, defining the AST nodes required for parsing (see the sketch after this list).
  • We created ASTBase (containing AST nodes essential for parsing) and ASTCodegen (containing AST nodes needed for code generation).
  • ASTBase, as it stands, is code duplicated from ASTCodegen.
  • We started extracting semantic routines and fields from AST nodes to eliminate ASTBase’s code duplication by importing a subset of modules used by ASTCodegen.
  • Additionally, we began replacing third-party libraries (like libdparse) with the DMD-as-a-library package.
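
To make the templating point concrete, below is a minimal, hypothetical sketch of the idea. The struct, class, and member names are stand-ins and only loosely mirror the real dmd declarations: the parser is instantiated with an AST family and creates nodes exclusively through it, so the parsing code never needs to know about semantic-analysis state.

```d
// A minimal sketch of a parser templated on an AST family.
// All names here are stand-ins and only loosely mirror the real dmd code.
struct ASTBase
{
    // Parse-only nodes: no semantic state, no semantic imports.
    class Expression { }

    class IntegerExp : Expression
    {
        long value;
        this(long value) { this.value = value; }
    }
}

class Parser(AST)
{
    // The parser creates nodes exclusively through the AST family it was
    // instantiated with, so the same parsing code can be reused with a
    // richer, codegen-capable family without modification.
    AST.Expression parseInteger(long token)
    {
        return new AST.IntegerExp(token);
    }
}

void main()
{
    auto parser = new Parser!ASTBase;
    auto node = parser.parseInteger(42);
    assert((cast(ASTBase.IntegerExp) node).value == 42);
}
```

In the real codebase, ASTCodegen is the second family, carrying everything semantic analysis and code generation need; once that state is pulled out of the nodes themselves, ASTBase can become a thin selection of the same modules instead of a duplicated copy.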

For more detailed information on each of these points, I recommend watching the talk I referenced.

Recently, I proposed to Walter a modification to the codebase that would significantly enhance the flexibility of the compiler library, allowing any AST node to be overridden. Walter was hesitant to accept my proposal, concerned about the potential “ugliness” it would introduce to the codebase. He cited the addition of ASTBase and the resulting code duplication as a precedent. He then suggested that if we eliminate ASTBase, he would reconsider my proposal.

What You Can Do To Help

We are now focused on eliminating the duplication in ASTBase. To achieve this, we need to extract all information related to semantic analysis from the existing AST nodes. The challenge is the sheer number of AST nodes and the multitude of functions associated with each. I have been working on this sporadically over the past few months, and progress is slow due to the nature of the work: it mostly involves moving code, creating visitors, breaking dependencies, etc. While not overly complex, it isn’t particularly creative work either. However, for someone interested in understanding a real-life compiler codebase, it’s an ideal starting point.
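
To give a concrete feel for what “extracting semantic routines” means, here is a hedged, simplified sketch. The node and visitor names are made up, and the real dmd nodes and visitors are considerably larger; the point is only the shape of the change: a semantic member function and its state move off the node and into a visitor that lives with the semantic analysis code.

```d
// A hedged, simplified sketch of the refactoring with made-up names;
// it is not the actual dmd code, only the shape of the change.

// Before: the node itself carries semantic state and logic, which drags
// semantic-analysis dependencies into every build that parses.
class IntegerExpBefore
{
    long value;
    bool typeChecked;                      // semantic-only state
    void runSemantic() { typeChecked = true; }
}

// After: the node keeps only what parsing produces, plus an accept hook...
class IntegerExp
{
    long value;
    void accept(Visitor v) { v.visit(this); }
}

abstract class Visitor
{
    void visit(IntegerExp e) { }
}

// ...and the semantic routine lives in a visitor owned by the semantic
// module, so a parse-only library build never has to import it.
class SemanticVisitor : Visitor
{
    bool[IntegerExp] typeChecked;          // semantic state kept out of the node
    override void visit(IntegerExp e) { typeChecked[e] = true; }
}

void main()
{
    auto e = new IntegerExp;
    auto sema = new SemanticVisitor;
    e.accept(sema);
    assert(sema.typeChecked[e]);
}
```

What matters is the direction of the dependency: after the move, the module defining the node compiles without importing any semantic machinery, which is what lets ASTBase shrink to an import of shared, parse-only modules instead of a duplicated copy.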

If you’re willing to support this initiative, I’ve put together a guide on where to start and what you can do. Feel free to contact me on Slack (razvan.nitu), Discord, or email (razvan.nitu1305@gmail.com) for more details or to request a review of your PR.

I see this as an excellent opportunity to onboard new people into compiler development in a way that benefits both the language and the contributor. So, if you have some spare time, please join us in getting this work done!

Bugzilla Reward System

The Dlang bot has been updated to track Bugzilla issues that have been fixed. It went live for testing on the 2nd of July. Each GitHub user who fixes a bug via a merged pull request is awarded a number of points depending on the severity of the issue. The current results can always be seen on the contributor stats page. This blog post covers all of the details regarding the implementation, rules, and prizes of the reward system.

Raison d’être

I want to start by saying that the motivation behind this system is not to spark a fierce competition between contributors to fix as many issues as possible. Primarily, we see it as a means for the D Language Foundation to reward committed contributors and to channel their efforts toward more important bugs. If, as a side effect, the system motivates people to fix more bugs, that’s great! We won’t complain.

There are some negative side effects that are possible with any sort of gamification system, and we’ll be keeping an eye out for them. We think we have one of the best online communities out there. Our members are generally friendly and helpful, and we don’t want to do anything that causes tension or proves negatively disruptive. We think this will be a fun way to reward our contributors, but we will pull the plug if it proves otherwise.

Scoring system

The scoring is designed to reward contributors based on the importance of the issues they fix, rather than the total number fixed. As such, issues are awarded points based on severity:

  • enhancement: 10
  • trivial: 10
  • minor: 15
  • normal: 20
  • major: 50
  • critical/blocker: 75
  • regression: 100

Of course, the severity of an issue does not necessarily reflect the complexity of the solution. There might be regressions that are trivial to solve, and enhancements that require an extremely complicated fix. The message that we are trying to send is that complexity is secondary to need. That is why regressions are given top priority and critical/blocker/major issues aren’t far behind.

Rules

The following rules will guide how points are awarded from the initial launch of the reward system. They are not set in stone and are open to revision over time.

Rule #1: The severity of an issue will be decided by the reviewers of a proposed patch.

Severity levels are not always accurately set when issues are first reported and may not have been updated since. The reviewer of a pull request that closes a Bugzilla issue will evaluate the issue’s severity level and may change it if he or she determines it is inaccurate. I will moderate any disagreements that may arise about severity levels.

Rule #2: A PR fixing a bug may not be merged by the same person that proposed the patch.

This is already an unwritten rule that applies to the DLang repositories, so it should not surprise anyone.

Rule #3: Anyone who adopts an orphaned PR that fixes a bug may be awarded its associated points.

To avoid any authorship conflicts, it is best if the adopter contacts the original author to ask if it is okay to adopt the PR. Rule #3 will apply if there is no response or if the response is affirmative. Otherwise, no points will be awarded.

Rule #4: Only one person may receive points per fixed issue.

This rule is specifically designed for reverted PRs. Imagine that a PR that presumably fixes an issue is merged and the author gets points for it. Later on, it is decided that the fix is incorrect and the PR is reverted. If someone else proposes the correct fix, the points will be subtracted from the original contributor and awarded to the new author. Hopefully, this will motivate the original contributor to propose the correct fix after the reversion.

Rule #5: Incomplete fixes still get points.

A Bugzilla report usually includes a snippet of code that reproduces the issue. A frequent pattern is that the bug is correctly fixed for the provided snippet, then someone comes up with a slightly modified example that does not work and reopens the issue. Since the original fix was correct, but not complete, the procedure here is that the original issue should be left closed and a new one should be opened. The original author keeps the points awarded for the original issue.

Implementation

Since most of you are die-hard geeks and are eagerly awaiting the code, here’s the database implementation hosted on the dlang-bot, and here’s the web page implementation. You will notice that the web page is extremely minimal. That is because I am a total n00b when it comes to web programming, so if anyone has the skills and the time to make a cooler web page, feel free to make a PR :D.

In short, for each of the issues that are fixed, the database stores the Bugzilla issue number, the GitHub ID of the person who fixed it, the date when the fix was merged, and the severity of the issue. Every time the leaderboard page is accessed, a query is issued to the database to compute the total points for all of the contributors and sort them in descending order. Easy peasy.
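
For a rough picture of that computation, here is a hypothetical, in-memory sketch only; the actual dlang-bot code, schema, and column names differ. The leaderboard boils down to mapping each stored severity to its point value from the table in the “Scoring system” section, summing per GitHub ID, and sorting the totals in descending order.

```d
// A hypothetical sketch of the aggregation; the real dlang-bot does this
// with a database query, and its schema and column names differ.
import std.algorithm : map, sort;
import std.array : array;

// Roughly the record described above: issue number, GitHub ID, severity
// (the merge date is omitted here since it does not affect the total).
struct FixedIssue
{
    int bugzillaId;
    string githubId;
    string severity;
}

// Points per severity, matching the table in the "Scoring system" section.
immutable int[string] pointsFor;
shared static this()
{
    pointsFor = [
        "enhancement": 10, "trivial": 10, "minor": 15, "normal": 20,
        "major": 50, "critical": 75, "blocker": 75, "regression": 100
    ];
}

struct Row { string githubId; int points; }

// Sum each contributor's points and sort in descending order.
Row[] leaderboard(const FixedIssue[] fixes)
{
    int[string] totals;
    foreach (f; fixes)
    {
        auto pts = f.severity in pointsFor;
        totals[f.githubId] += pts ? *pts : 0;
    }
    auto rows = totals.byKeyValue.map!(kv => Row(kv.key, kv.value)).array;
    rows.sort!((a, b) => a.points > b.points);
    return rows;
}

void main()
{
    // Made-up example data: alice scores 100 + 15 = 115, bob scores 20.
    auto fixes = [
        FixedIssue(10_001, "alice", "regression"),
        FixedIssue(10_002, "bob", "normal"),
        FixedIssue(10_003, "alice", "minor")
    ];
    assert(leaderboard(fixes)[0].githubId == "alice");
}
```

The real implementation pushes this aggregation into the database query rather than doing it in application code, but the arithmetic is the same.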

I would like to thank Vladimir Pantaleev for his continued support and assistance throughout the period that I implemented the system.

Prizes

As Mike briefly described in this forum post, we are going to have quarterly competitions. The quarterly prizes will vary. At the end of the year, the person who has acquired the most points will be awarded a bigger prize.

For the inaugural competition, which will officially start on the 20th of September 2021 and will last until the start of DConf Online (the 20th of November), the prizes will be:

  • First Place: a $300 Amazon eGift Card
  • Second Place: a $200 Amazon eGift Card
  • Third Place: a $100 Amazon eGift Card

The next set of prizes will be announced at DConf Online, so stay tuned!

That’s all folks!

If there are any questions or suggestions regarding any aspect of the bugfix reward system, please contact me at razvan.nitu1305@gmail.com. Also, feel free to directly propose changes to the existing infrastructure.

Happy coding everyone!

A Pull Request Manager’s Perspective

Since January of this year, I have been working as a part-time PR (Pull Request) manager. During this time, I have mostly been reviewing PRs and going through issues on the D Bugzilla. I have also been trying to come up with ways of creating organizational structures and procedures that will ultimately aid the D leadership in motivating and focusing community effort. This blog post presents a few insights I’ve had regarding the PR queues of the dmd, druntime, and phobos repositories, and a couple of proposals that, in my opinion, could benefit the D contribution process.

PR rounds

As a PR manager, I spend most of my time reviewing PRs. Since I started on the job, I have been involved in the merger of more than 400 PRs across our repositories. From this experience I have extracted a few insights:

  • If a new PR is not reviewed within the first 3 days after it was opened, chances are that it will get abandoned.
  • If a PR is not merged during the first 2 weeks after it was opened, chances are that it will be abandoned.
  • Contributions, in terms of PRs per month, are as follows: phobos (130), dmd (85), druntime (30).
  • Although phobos benefits from more contributions, dmd has a larger contributor base.
  • Druntime needs more love.
  • Veteran contributors are more likely to abandon PRs than new/first-time contributors.

Given the first 2 points, I try to make contact as fast as possible with PR authors. It often happens that I do not have the necessary expertise to technically review a PR. In that case, I try to find people who are willing to take a look. However, since we do not have a concrete community hierarchy, it is sometimes difficult to find the needed reviewers. A solution to this problem is proposed later in the blog post.

Regarding the ratio of contributions per repository, it is noteworthy that phobos and dmd get a lot of attention, whereas druntime is by far the least attractive repository. Another interesting aspect is the diversity of the contributor base: in the last month, there have been ten contributors who opened more than one PR for dmd, five for phobos, and four for druntime. This emphasizes the fact that druntime needs more love.

Lastly, I noticed that veteran contributors tend to abandon their PRs more often than newcomers. This can be explained by the fact that veteran contributors usually tackle multiple PRs at the same time, whereas newcomers usually focus on a single PR. I want to take this opportunity to urge all contributors not to abandon their PRs. It is disappointing for reviewers such as myself to put in the time to properly investigate a patch and offer advice, only to then see it go to waste. I know that it is much more appealing to start working on new things, but it is highly important not to let any work go to waste.

Upcoming projects

From my perspective, D has come a long way from its early days: language features are maturing, adoption is steadily growing, and the community is expanding around a nucleus of veteran contributors. But given that growth, it is surprising that, from an organizational standpoint, we are basically in the same spot: if a critical issue appears (a critical bug report, a CI failure, an expired certificate, etc.), the solution is to make a forum post or a comment on Slack and hope that someone who can fix it, or can get it fixed, notices it soon; non-critical issues depend on someone taking an interest: an issue might eventually be fixed, or we might be stuck with it indefinitely. The problem is not manpower or skill; our community has a lot of talent. Unfortunately, we fail to utilize it to its full potential.

If we want certain things to be done, it is the leadership’s responsibility to:

  1. specifically state what work needs to be done,
  2. organize the community, and
  3. incentivize contributors to do the needed work.

Although there is room for improvement, (1) has usually taken place in the form of forum discussions, DIPs, and blog posts. (2) is difficult to implement, given that people contribute in their own free time. As for (3), the mantra has been “fix it if you need it”, which works well for interesting topics but not so well for important, hard-to-fix bugs or for high-impact, boring tasks.

Implementing points (2) and (3) in an open-source community and with limited financial resources is difficult. However, there are alternative approaches that have not been explored in the DLang ecosystem. I will outline them below.

Creating strike teams

One way of organizing the community is to create dedicated groups of people, or strike teams, that can be called upon for specific tasks. One team will be assigned to each repository (dmd, druntime, phobos, dlang.org). The idea is to add people to these groups who either have expertise but lack the time to contribute, or lack expertise but have the time and willingness to contribute actively. This way, if you do not have time to contribute code, you can still help the community by offering implementation advice, whereas if you do have time to offer, you can contribute and develop expertise. The strike teams will be populated by a limited number of people who are trusted members of the community. These teams will be approached directly by the leadership (Walter, Atila, Mike, PR Managers) to fix issues or implement work defined in point (1). The members of the strike teams will receive recognition by having their names listed on the dlang.org site (thus satisfying point (3)).

Of course, this will work as long as there are folks out there willing to dedicate their time. If you want to contribute in some form to any of the strike teams, please contact me directly on Slack or via email.

Bugzilla Gamification

The D compiler has around 3000 reported bugs, druntime around 300, and phobos around 900. These numbers have grown over time. Although some issues are fixed, we have had no means to incentivize people to work on the critical ones. To that end, we propose a simple gamification scheme: each issue has an associated severity; once a PR that closes an issue is merged, the GitHub author of the PR is awarded points according to the issue’s severity level; and a leaderboard, updated in real time, is presented on dlang.org, where anyone can see who the top contributors are. At the end of each “season”, contributors will be awarded prizes and recognition based on different criteria, such as overall point total, number of total contributions, and so on (we have yet to finalize the kinds of prizes that will be awarded).

By implementing this scoring scheme, we offer some incentive for more experienced contributors to prioritize blocker/critical/major/regression issues over the more trivial or simpler ones, and encourage new contributors to try their hand at a level with which they’re comfortable. We are already working on implementing this and will announce the rules and prize categories once everything is up and running.

Conclusion

We are at a point in the evolution of the D programming language and its ecosystem where motivating community effort towards a common goal is crucial. This is a long-term, complicated task, but we need to start somewhere. I hope that with this initiative we can pave the way to a more sophisticated and better-organized contribution process that is a more satisfying and rewarding experience for our contributors.