Monthly Archives: November 2018

The New Fundraising Campaign

On January 23, 2011, Walter announced that development of the core D projects had moved to GitHub. It’s somewhat entertaining to go through the thread and realize that it’s almost entirely about people coming to terms with using git, a reminder that there was a time when git was still relatively young and had yet to dominate the source control market.

The move to GitHub has since been cited by both Walter and Andrei as a big win, thanks to a subsequent increase in contributions. It was smooth sailing for quite some time, but eventually some grumbling began to be heard below the surface. Pull Requests (PRs) were being ignored. Reviews weren’t happening fast enough. The grumbling grew louder. There were long communication delays. PRs were sometimes closed by frustrated contributors, demotivated and unwilling to contribute further.

The oldest open PR in the DMD queue as I write is dated June 10, 2013. If we dig into it, we see that it was ultimately done in by a break in communication. The contributor was asking for more feedback, then there was silence for nine months. Six months later, when asked to rebase, the contributor no longer had time for it. A year and a half.

There are other reasons why PRs can stall, but in the end many of them have one thing in common: there’s no one pushing all parties involved toward a resolution. There’s no one following up every few days to make sure communication hasn’t broken down, or that any action that must be taken is followed through. Everyone involved in maintaining the PR queue has other roles and responsibilities both inside and outside of D development, but none of them have the bandwidth to devote to regular PR management.

We have had people step up and try to revive old PRs. We have seen some of them closed or merged. But there are some really challenging or time-consuming PRs in the list. The one linked above, for example. The last comment is from Sebastian Wilzbach in March of this year, who points out that it will be difficult to revive because it’s based on the old C++ code base. So who has the motivation to get it done?

I promised in the forums that I would soon be launching targeted funding campaigns. This seems like an excellent place to start.

Fostering an environment that encourages more contributions benefits everyone who uses D. The pool of people in the D community who have the skill and knowledge necessary to manage those contributions is small. If they had the time, we wouldn’t have this problem. A community effort to compensate one of them to make more time for it seems a just cause.

The D Language Foundation is asking the community as a whole to contribute to a fund to pay a part-time PR manager $1,000 per month for a period of three months, from November 15 to February 14. The manager will be paid the full amount after February 14, in one lump sum. At that time, we’ll evaluate our progress and decide whether to proceed with another three-month campaign.

We’ve already roped someone in who’s willing to do the job. Nicholas Wilson has been rummaging around the PR queue lately, trying to get merged those he has an immediate interest in. He’s also shown an interest in improving the D development process as a whole. That, and the fact that he said yes, makes him an ideal candidate for the role.

He’ll have two primary goals: preventing active PRs from becoming stale, and reviving PRs that have gone dormant. He’ll also be looking into open Bugzilla issues when he’s got some time to fill. He’ll have the weight of the Foundation behind his finger when he pokes it in the shoulder of anyone who is being unresponsive (and that includes everyone on the core team). Where he is unable to merge a PR himself, he’ll contact the people he needs to make it happen.

Click on the campaign card below and you’ll be taken to the campaign page. Our target is $3,000 by February, 14. If we exceed that goal, the overage will go toward the next three-month cycle should we continue. If we decide not to continue, overage will go to the General Fund.

Note that there are options for recurring donations that go beyond the campaign period. All donations through any campaign we launch through Flipcause go to the D Language Foundation’s account. In other words, they aren’t tied specifically to a single campaign. If you choose to initiate a recurring donation through any of our campaigns, you’ll be helping us meet our goal for that campaign and, once the campaign ends, your donations will go toward our general fund. If you do set up a monthly or quarterly donation, leave a note if you want to continue putting it toward paying the PR manager and we’ll credit it toward future PR manager campaigns.

When you’re viewing the blog on a desktop system, you’ll be able to see all of our active campaigns by clicking on the Donate Now button that I’ve added to the sidebar. Of course, the other donation options that have always been supported are still available from the donation page, accessible through the menu bar throughout dlang.org.

Thanks to Nicholas for agreeing to take on this job, and thanks in advance to everyone who supports us in getting it done!

Lost in Translation: Encapsulation

I first learned programming in BASIC. Outgrew it, and switched to Fortran. Amusingly, my early Fortran code looked just like BASIC. My early C code looked like Fortran. My early C++ code looked like C. – Walter Bright, the creator of D

Programming in a language is not the same as thinking in that language. A natural side effect of experience with one programming language is that we view other languages through the prism of its features and idioms. Languages in the same family may look and feel similar, but there are guaranteed to be subtle differences that, when not accounted for, can lead to compiler errors, bugs, and missed opportunities. Even when good docs, books, and other materials are available, most misunderstandings are only going to be solved through trial-and-error.

D programmers come from a variety of programming backgrounds, C-family languages perhaps being the most common among them. Understanding the differences and how familiar features are tailored to D can open the door to more possibilities for organizing a code base, and designing and implementing an API. This article is the first of a few that will examine D features that can be overlooked or misunderstood by those experienced in similar languages.

We’re starting with a look at a particular feature that’s common among languages that support Object-Oriented Programming (OOP). There’s one aspect in particular of the D implementation that experienced programmers are sure they already fully understand and are often surprised to later learn they don’t.

Encapsulation

Most readers will already be familiar with the concept of encapsulation, but I want to make sure we’re on the same page. For the purpose of this article, I’m talking about encapsulation in the form of separating interface from implementation. Some people tend to think of it strictly as it relates to object-oriented programming, but it’s a concept that’s more broad than that. Consider this C code:

#include <stdio.h>
static size_t s_count;

void print_message(const char* msg) {
    puts(msg);
    s_count++;
}

size_t num_prints() { return s_count; }

In C, functions and global variables decorated with static become private to the translation unit (i.e. the source file along with any headers brought in via #include) in which they are declared. Non-static declarations are publicly accessible, usually provided in header files that lay out the public API for clients to use. Static functions and variables are used to hide implementation details from the public API.

Encapsulation in C is a minimal approach. C++ supports the same feature, but it also has anonymous namespaces that can encapsulate type definitions in addition to declarations. Like Java, C#, and other languages that support OOP, C++ also has access modifiers (alternatively known as access specifiers, protection attributes, visibility attributes) which can be applied to class and struct member declarations.

C++ supports the following three access modifiers, common among OOP languages:

  • public – accessible to the world
  • private – accessible only within the class
  • protected – accessible only within the class and its derived classes

An experienced Java programmer might raise a hand to say, “Um, excuse me. That’s not a complete definition of protected.” That’s because in Java, it looks like this:

  • protected – accessible within the class, its derived classes, and classes in the same package.

Every class in Java belongs to a package, so it makes sense to factor packages into the equation. Then there’s this:

  • package-private (not a keyword) – accessible within the class and classes in the same package.

This is the default access level in Java when no access modifier is specified. This combined with protected make packages a tool for encapsulation beyond classes in Java.

Similarly, C# has assemblies, which MSDN defines as “a collection of types and resources that forms a logical unit of functionality”. In C#, the meaning of protected is identical to that of C++, but the language has two additional forms of protection that relate to assemblies and that are analogous to Java’s protected and package-private.

  • internal – accessible within the class and classes in the same assembly.
  • protected internal – accessible within the class, its derived classes, and classes in the same assembly.

Examining encapsulation in other programming languages will continue to turn up similarities and differences. Common encapsulation idioms are generally adapted to language-specific features. The fundamental concept remains the same, but the scope and implementation vary. So it should come as no surprise that D also approaches encapsulation in its own, language-specific manner.

Modules

The foundation of D’s approach to encapsulation is the module. Consider this D version of the C snippet from above:

module mymod;

private size_t _count;

void printMessage(string msg) {
    import std.stdio : writeln;

    writeln(msg);
    _count++;
}

size_t numPrints() { return _count; }

In D, access modifiers can apply to module-scope declarations, not just class and struct members. _count is private, meaning it is not visible outside of the module. printMessage and numPrints have no access modifiers; they are public by default, making them visible and accessible outside of the module. Both functions could have been annotated with the keyword public.

Note that imports in module scope are private by default, meaning the symbols in the imported modules are not visible outside the module, and local imports, as in the example, are never visible outside of their parent scope.

Alternative syntaxes are supported, giving more flexibility to the layout of a module. For example, there’s C++ style:

module mymod;

// Everything below this is private until either another
// protection attribute or the end of file is encountered.
private:
    size_t _count;

// Turn public back on
public:
    void printMessage(string msg) {
        import std.stdio : writeln;

        writeln(msg);
        _count++;
    }

    size_t numPrints() { return _count; }

And this:

module mymod;

private {
    // Everything declared within these braces is private.
    size_t _count;
}

// The functions are still public by default
void printMessage(string msg) {
    import std.stdio : writeln;

    writeln(msg);
    _count++;
}

size_t numPrints() { return _count; }

Modules can belong to packages. A package is a way to group related modules together. In practice, the source files corresponding to each module should be grouped together in the same directory on disk. Then, in the source file, each directory becomes part of the module declaration:

// mypack/amodule.d
mypack.amodule;

// mypack/subpack/anothermodule.d
mypack.subpack.anothermodule;

Note that it’s possible to have package names that don’t correspond to directories and module names that don’t correspond to files, but it’s bad practice to do so. A deep dive into packages and modules will have to wait for a future post.

mymod does not belong to a package, as no packages were included in the module declaration. Inside printMessage, the function writeln is imported from the stdio module, which belongs to the std package. Packages have no special properties in D and primarily serve as namespaces, but they are a common part of the codescape.

In addition to public and private, the package access modifier can be applied to module-scope declarations to make them visible only within modules in the same package.

Consider the following example. There are three modules in three files (only one module per file is allowed), each belonging to the same root package.

// src/rootpack/subpack1/mod2.d
module rootpack.subpack1.mod2;
import std.stdio;

package void sayHello() {
    writeln("Hello!");
}

// src/rootpack/subpack1/mod1.d
module rootpack.subpack1.mod1;
import rootpack.subpack1.mod2;

class Speaker {
    this() { sayHello(); }
}

// src/rootpack/app.d
module rootpack.app;
import rootpack.subpack1.mod1;

void main() {
    auto speaker = new Speaker;
}

Compile this with the following command line:

cd src
dmd -i rootpack/app.d

The -i switch tells the compiler to automatically compile and link imported modules (excluding those in the standard library namespaces core and std). Without it, each module would have to be passed on the command line, else they wouldn’t be compiled and linked.

The class Speaker has access to sayHello because they belong to modules that are in the same package. Now imagine we do a refactor and we decide that it could be useful to have access to sayHello throughout the rootpack package. D provides the means to make that happen by allowing the package attribute to be parameterized with the fully-qualified name (FQN) of a package. So we can change the declaration of sayHello like so:

package(rootpack) void sayHello() {
    writeln("Hello!");
}

Now all modules in rootpack and all modules in packages that descend from rootpack will have access to sayHello. Don’t overlook that last part. A parameter to the package attribute is saying that a package and all of its descendants can access this symbol. It may sound overly broad, but it isn’t.

For one thing, only a package that is a direct ancestor of the module’s parent package can be used as a parameter. Consider a module rootpack.subpack.subsub.mymod. That name contains all of the packages that are legal parameters to the package attribute in mymod.d, namely rootpack, subpack, and subsub. So we can say the following about symbols declared in mymod:

  • package – visible only to modules in the parent package of mymod, i.e. the subsub package.
  • package(subsub) – visible to modules in the subsub package and modules in all packages descending from subsub.
  • package(subpack) – visible to modules in the subpack package and modules in all packages descending from subpack.
  • package(rootpack) – visible to modules in the rootpack package and modules in all packages descending from rootpack.

This feature makes packages another tool for encapsulation, allowing symbols to be hidden from the outside world but visible and accessible in specific subtrees of a package hierarchy. In practice, there are probably few cases where expanding access to a broad range of packages in an entire subtree is desirable.

It’s common to see parameterized package protection in situations where a package exposes a common public interface and hides implementations in one or more subpackages, such as a graphics package with subpackages containing implementations for DirectX, Metal, OpenGL, and Vulkan. Here, D’s access modifiers allow for three levels of encapsulation:

  • the graphics package as a whole
  • each subpackage containing the implementations
  • individual modules in each package

Notice that I didn’t include class or struct types as a fourth level. The next section explains why.

Classes and structs

Now we come to the motivation for this article. I can’t recall ever seeing anyone come to the D forums professing surprise about package protection, but the behavior of access modifiers in classes and structs is something that pops up now and then, largely because of expectations derived from experience in other languages.

Classes and structs use the same access modifiers as modules: public, package, package(some.pack), and private. The protected attribute can only be used in classes, as inheritance is not supported for structs (nor for modules, which aren’t even objects). public, package, and package(some.pack) behave exactly as they do at the module level. The thing that surprises some people is that private also behaves the same way.

import std.stdio;

class C {
    private int x;
}

void main() {
    C c = new C();
    c.x = 10;
    writeln(c.x);
}

Run this example online

Snippets like this are posted in the forums now and again by people exploring D, accompanying a question along the lines of, “Why does this compile?” (and sometimes, “I think I’ve found a bug!”). This is an example of where experience can cloud expectations. Everyone knows what private means, so it’s not something most people bother to look up in the language docs. However, those who do would find this:

Symbols with private visibility can only be accessed from within the same module.

private in D always means private to the module. The module is the lowest level of encapsulation. It’s easy to understand why some experience an initial resistance to this, that it breaks encapsulation, but the intent behind the design is to strengthen encapsulation. It’s inspired by the C++ friend feature.

Having implemented and maintained a C++ compiler for many years, Walter understood the need for a feature like friend, but felt that it wasn’t the best way to go about it.

Being able to declare a “friend” that is somewhere in some other file runs against notions of encapsulation.

An alternative is to take a Java-like approach of one class per module, but he felt that was too restrictive.

One may desire a set of closely interrelated classes that encapsulate a concept, and those should go into a module.

So the way to view a module in D is not just as a single source file, but as a unit of encapsulation. It can contain free functions, classes, and structs, all operating on the same data declared in module scope and class scope. The public interface is still protected from changes to the private implementation inside the module. Along those same lines, protected class members are accessible not just in derived classes, but also in the module.

Sometimes though, there really is a benefit to denying access to private members in a module. The bigger a module becomes, the more of a burden it is to maintain, especially when it’s being maintained by a team. Every place a private member of a class is accessed in a module means more places to update when a change is made, thereby increasing the maintenance burden. The language provides the means to alleviate the burden in the form of the special package module.

In some cases, we don’t want to require the user to import multiple modules individually. Splitting a large module into smaller ones is one of those cases. Consider the following file tree:

-- mypack
---- mod1.d
---- mod2.d

We have two modules in a package called mypack. Let’s say that mod1.d has grown extremely large and we’re starting to worry about maintaining it. For one, we want to ensure that private members aren’t manipulated outside of class declarations with hundreds or thousands of lines in between. We want to split the module into smaller ones, but at the same time we don’t want to break user code. Currently, users can get at the module’s symbols by importing it with import mypack.mod1. We want that to continue to work. Here’s how we do it:

-- mypack
---- mod1
------ package.d
------ split1.d
------ split2.d
---- mod2.d

We’ve split mod1.d into two new modules and put them in a package named mod1. We’ve also created a special package.d file, which looks like this:

module mypack.mod1;

public import mypack.mod1.split1,
              mypack.mod1.split2;

When the compiler sees package.d, it knows to treat it specially. Users will be able to continue using import mypack.mod1 without ever caring that it’s now split into two modules in a new package. The key is the module declaration at the top of package.d. It’s telling the compiler to treat this package as the module mod1. And instead of automatically importing all modules in the package, the requirement to list them as public imports in package.d allows more freedom in implementing the package. Sometimes, you might want to require the user to explicitly import a module even when a package.d is present.

Now users will continue seeing mod1 as a single module and can continue to import it as such. Meanwhile, encapsulation is now more stringently enforced internally. Because split1 and split2 are now separate modules, they can’t touch each other’s private parts. Any part of the API that needs to be shared by both modules can be annotated with package protection. Despite the internal transformation, the public interface remains unchanged, and encapsulation is maintained.

Wrapping up

The full list of access modifiers in D can be defined as such:

  • public – accessible everywhere.
  • package – accessible to modules in the same package.
  • package(some.pack) – accessible to modules in the package some.pack and to the modules in all of its descendant packages.
  • private – accessible only in the module.
  • protected (classes only) – accessible in the module and in derived classes.

Hopefully, this article has provided you with the perspective to think in D instead of your “native” language when thinking about encapsulation in D.

Thanks to Ali Çehreli, Joakim Noah, and Nicholas Wilson for reviewing and providing feedback on this article.

DMD 2.083.0 Released

Version 2.083.0 of DMD, the D reference compiler, is ready for download. The changelog lists 47 fixes and enhancements across all components of the release package. Notable among them are some C++ compatibility enhancements and some compile-time love.

C++ compatibility

D’s support for linking to C++ binaries has been evolving and improving with nearly every release. This time, the new things aren’t very dramatic, but still very welcome to those who work with both C++ and D in the same code base.

What’s my runtime?

For a while now, D has had predefined version identifiers for user code to detect the C runtime implementation at compile time. These are:

  • CRuntime_Bionic
  • CRuntime_DigitalMars
  • CRuntime_Glibc
  • CRuntime_Microsoft
  • CRuntime_Musl
  • CRuntime_UClibc

These aren’t reliable when linking against C++ code. Where the C runtime in use often depends on the system, the C++ runtime is compiler-specific. To remedy that, 2.083.0 introduces a few new predefined versions:

  • CppRuntime_Clang
  • CppRuntime_DigitalMars
  • CppRuntime_Gcc
  • CppRuntime_Microsoft
  • CppRuntime_Sun

Why so much conflict?

C++ support also gets a new syntax for declaring C++ linkage, which affects how a symbol is mangled. Consider a C++ library, MyLib, which uses the namespace mylib. The original syntax for binding a function in that namespace looks like this:

/*
 The original C++:
 namespace mylib { void cppFunc(); }
*/
// The D declaration
extern(C++, mylib) void cppFunc();

This declares that cppFunc has C++ linkage (the symbol is mangled in a manner specific to the C++ compiler) and that the symbol belongs to the C++ namespace mylib . On the D side, the function can be referred to either as cppFunc or as mylib.cppFunc.

In practice, this approach creates opportunities for conflict when a namespace is the same as a D keyword. It also has an impact on how one approaches the organization of a binding.  It’s natural to want to name the root package in D mylib, as it matches the library name and it is a D convention to name modules and packages using lowercase. In that case, extern(C++, mylib) declarations will not be compilable anywhere in the mylib package because the symbols conflict.

To alleviate the problem, an alternative syntax was proposed using strings to declare the namespaces in the linkage attribute, rather than identifiers:

/*
 The original C++:
 namespace foo { void cppFunc(); }
*/
// The D declaration
extern(C++, "foo") void cppFunc();

With this syntax, no mylib symbol is created on the D side; it is used solely for name mangling. No more conflicts with keywords, and D packages can be used to match the C++ namespaces on the D side. The old syntax isn’t going away anytime soon, though.

New compile-time things

This release provides two new built-in traits for more compile-time reflection options. Like all built-in traits, they are accessible via the __traits expression. There’s also a new pragma that lets you bring some linker options into the source code in a very specific circumstance.

Are you a zero?

isZeroInit can be used to determine if the default initializer of a given type is 0, or more specifically, it evaluates to true if all of the init value’s bits are zero. The example below uses compile-time asserts to verify the zeroness and nonzeroness of a few default init values, but I’ve saved a version that prints the results at runtime, for more immediate feedback, and can be compiled and run from the browser.

struct ImaZero {
    int x;
}

struct ImaNonZero {
    int x = 10;
}

// double.init == double.nan
static assert(!__traits(isZeroInit, double));

// int.init == 0
static assert(__traits(isZeroInit, int));

// ImaZero.init == ImaZero(0)
static assert(__traits(isZeroInit, ImaZero));

// ImaNonZeror.init == ImaZero(10)
static assert(!__traits(isZeroInit, ImaNonZero));

Computer, query target.

The second new trait is getTargetInfo, which allows compile-time queries about the target platform. The argument is a string that serves as a key, and the result is “an expression describing the requested target information”. Currently supported strings are “cppRuntimeLibrary”, “floatAbi”, and “ObjectFormat”.

The following prints all three at compile time.

pragma(msg, __traits(getTargetInfo, "cppRuntimeLibrary"));
pragma(msg, __traits(getTargetInfo, "floatAbi"));
pragma(msg, __traits(getTargetInfo, "objectFormat"));

On Windows, using the default (-m32) DigitalMars toolchain, I see this:

snn
hard
omf

With the Microsoft Build Tools (via VS 2017), compiling with -m64 and -m32mscoff, I see this:

libcmt
hard
coff

Yo! Linker! Take care of this, will ya?

D has long supported a lib pragma, allowing programmers to tell the compiler to pass a given library to the linker in source code rather than on the command line. Now, there’s a new pragma in town that let’s the programmer specify specific linker commands in source code and behaves rather differently. Meet the linkerDirective pragma:

pragma(linkerDirective, "/FAILIFMISMATCH:_ITERATOR_DEBUG_LEVEL=2");

The behavior is specified as “Implementation Defined”. The current implementation is specced to behave like so:

  • The string literal specifies a linker directive to be embedded in the generated object file.
  • Linker directives are only supported for MS-COFF output.

Just to make sure you didn’t gloss over the first list item, look at it again. The linker directive is not passed by the compiler to the linker, but emitted to the object file. Since it is only supported for MS-COFF, that means its only a thing for folks on Windows when they are compiling with -m64 or -m32mscoff. And some say the D community doesn’t care about Windows!

Of course there’s more!

The above are just a few cherries I picked from the list. For a deeper dive, see the full changelog. And head over to the Downloads page to get the installer for your platform. It looks a lot nicer than the boring list of files linked in the changelog.