Testing In The D Standard Library

Jack Stouffer is a member of the Phobos team and contributor to dlang.org. You can check out more of his writing on his blog.


In the D standard library, colloquially named Phobos, we take a multi-pronged approach to testing and code review. Currently, there are five different services any addition has to go through:

  1. The whole compiler chain of tests: DMD’s and DRuntime’s test suites, and Phobos’s unit tests
  2. A documentation builder
  3. Coverage analysis
  4. A style checker
  5. And a community project builder/test runner

Using these, we’re able to automatically catch the vast majority of common problems that we see popping up in PRs. And we make regressions much less likely using the full test suite and examining coverage reports.

Hopefully this will provide some insight into how a project like a standard library can use testing to increase stability. It may also spark some ideas on how to improve your own testing and review process.

Unit Tests

In D, unit tests are an integrated part of the language rather than a library feature:

size_t sum(int[] a)
{
    size_t result;

    foreach (e; a)
    {
        result += e;
    }

    return result;
}

unittest
{
    assert(sum([1, 2, 3]) == 6);
    assert(sum([0, 50, 100]) == 150);
}

void main() {}

Save this as test.d and run dmd -unittest -run test.d. Before your main function is run, all of the unittest blocks will be executed. If any of the asserts fail, execution is terminated and an error is printed to stderr.

The effect of putting unit tests in the language has been enormous. One of the main effects we’ve seen is that tests no longer have the “out of sight, out of mind” problem. Comprehensive tests in D projects are the rule and not the exception. Phobos dogfoods inline unittest blocks and uses them as its complete test suite. There are no tests for Phobos other than the inline tests, meaning that a reviewer who wants to check a change can just run dmd -main -unittest -run std/algorithm/searching.d (this is just for quick and dirty tests; full tests are done via make).
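As a minimal sketch of that quick check, the file below has no main function of its own; the -main switch asks dmd to supply an empty one. The file name mymath.d and the function larger are made up for illustration and are not part of Phobos.

```d
// mymath.d (hypothetical file, for illustration only)

/// Returns the larger of two ints.
int larger(int a, int b)
{
    return a > b ? a : b;
}

// Inline tests live right next to the code they exercise.
unittest
{
    assert(larger(1, 2) == 2);
    assert(larger(-3, -7) == -3);
    assert(larger(4, 4) == 4);
}
```

Running dmd -main -unittest -run mymath.d should compile the file, execute the unittest block, and report a failure if any assert does not hold.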

Every PR onto Phobos runs the inline unit tests, DMD’s tests, and the DRuntime tests on the following platforms:

  • Windows 32 and 64 bit
  • MacOS 32 and 64 bit
  • Linux 32 and 64 bit
  • FreeBSD 32 and 64 bit

This is done by Brad Roberts’s auto-tester. As a quick aside, work is currently progressing to bring D to iOS and Android.

Idiot Proof

In order to avoid pulling untested PRs, we have three mechanisms in place. First, only PRs which have at least one GitHub review by someone with pull rights can be merged.

Second, we don’t use the normal button for merging PRs. Instead, once a reviewer is satisfied with the code, we tell the auto-tester to merge the PR if and only if all tests have passed on all platforms.

Third, every single change to any of the official repositories has to go through the PR review process. This includes changes made by the BDFL Walter Bright and the Language Architect Andrei Alexandrescu. We have even turned off pushing directly to the master branch in GitHub to make sure that nothing gets around this.

Unit Tests and Examples

Unit tests in D solve the perennial problem of out-of-date docs by using the unit test code itself as the example code in the documentation. This way, documentation examples are part of the test suite rather than just some comment which will go out of date.

With this format, if the unit test goes out of date, then the test suite fails. When the tests are updated, the docs change automatically.

Here’s an example:

/**
 * Sums an array of `int`s.
 * 
 * Params:
 *      a = the array to sum
 * Returns:
 *      The sum of the array.
 */
size_t sum(int[] a)
{
    size_t result;

    foreach (e; a)
    {
        result += e;
    }

    return result;
}

///
unittest
{
    assert(sum([1, 2, 3]) == 6);
    assert(sum([0, 50, 100]) == 150);
}

// only tests with a doc string above them get included in the
// docs
unittest
{
    assert(sum([100, 100, 100]) == 300);
}

void main() {}

Run dmd -D test.d and it generates un-styled HTML documentation in which the body of the documented unittest block appears as an example under sum.

Phobos uses this to great effect. The vast majority of examples in the Phobos documentation are from unittest blocks. For example, here is the documentation for std.algorithm.find and here is the unit test that generates that example.

This is not a catch-all approach. Wholesale example programs, which are very useful when introducing a complex module or function, still have to be in comments.

Protecting Against Old Bugs

Despite our best efforts, bugs do find their way into released code. When they do, we require the person who’s patching the code to add in an extra unit test underneath the buggy function in order to protect against future regressions.
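A sketch of what such a regression test can look like. The function mean, the bug, and the issue number are all hypothetical and invented for this example; the point is only the shape: the fix goes into the function, and a unittest block that reproduces the original report sits directly underneath it.

```d
/// Returns the integer mean of the array, or 0 for an empty array.
int mean(int[] a)
{
    // Hypothetical fix: without this guard, the division below would
    // divide by zero when the array is empty.
    if (a.length == 0)
        return 0;

    long total;
    foreach (e; a)
        total += e;

    return cast(int)(total / cast(long) a.length);
}

// Regression test for (made-up) issue 12345: empty input must not crash.
unittest
{
    int[] empty;
    assert(mean(empty) == 0);
}

unittest
{
    assert(mean([2, 4, 6]) == 4);
    assert(mean([-2, -4]) == -3);
}

void main() {}
```

If a later change removes the guard, the first unittest block fails and the test suite catches the regression before it is merged.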

Docs

For Phobos, the documentation pages which were changed are generated on a test server for every PR. Developed by Vladimir Panteleev, the DAutoTest allows reviewers to compare the old page and the new page from one location.

For example, this PR changed the docs for two structs and their member functions. This page on DAutoTest shows a summary of the changed pages with links to view the final result.

Coverage

Perfectly measuring the effectiveness of a test suite is impossible, but we can get a good rough approximation with test coverage. For those unaware, coverage is the ratio of the lines of code that were executed during a test run to the total number of executable lines.

DMD has built-in coverage analysis to work in tandem with the built-in unit tests. Instead of dmd -unittest -run main.d, do dmd -unittest -cov -run main.d and a file will be generated showing a report of how many times each line of code was executed with a final coverage ratio at the end.
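A small sketch of what the report can reveal. The function clampToUnit and its test are invented for this example. The unittest block never exercises the upper branch, so the listing that dmd -unittest -cov -run main.d generates should show a zero execution count for that line and a coverage figure below 100%.

```d
/// Clamps x to the range 0 .. 1 (inclusive).
int clampToUnit(int x)
{
    if (x < 0)
        return 0;
    if (x > 1)
        return 1; // not reached by the test below, so reported as uncovered
    return x;
}

unittest
{
    assert(clampToUnit(-5) == 0);
    assert(clampToUnit(1) == 1);
}

void main() {}
```

Adding an assert such as clampToUnit(7) == 1 to the test would execute the remaining line and raise the reported ratio.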

We generate this report for each PR. Also, we use codecov in order to get details on how well the new code is covered, as well as how coverage for the whole project has changed. If coverage for the patch is lower than 80%, then codecov marks the PR as failed.

At the time of writing, of the 77,601 lines of code (not counting docs or whitespace) in Phobos, 68,549 were covered during testing and 9,052 lines were not. This gives Phobos a test coverage of 88.3%, which is increasing all of the time. This is all achieved with the built-in unittest blocks.

Project Tester

Because test coverage doesn’t necessarily “cover” all real world use cases and combinations of features, D uses a Jenkins server to download, build, and run the tests for a select number of popular D projects with the master branches of Phobos, DRuntime, and DMD. If any of the tests fail, the reviewers are notified.

Style And Anti-Pattern Checker

Having a code style set from on high stops a lot of pointless Internet flame wars (tabs vs spaces anyone?) dead in their tracks. D has had such a style guide for a couple of years now, but its enforcement in official code repos was spotty at best, and was mostly limited to brace style.

Now, we use CircleCI in order to run a series of bash scripts and the fantastically helpful dscanner which automatically checks for all sorts of things you shouldn’t be doing in your code. For example, CircleCI will give an error if it finds:

  • bad brace style
  • trailing whitespace
  • using whole module imports inside of functions
  • redundant parentheses

And so on.
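To illustrate one item from the list, here is a sketch of a whole-module import inside a function next to the selective form that avoids it. The function names are invented, and the exact diagnostic dscanner prints is not reproduced here.

```d
// Style that the checker is described as rejecting: the whole module is
// imported inside the function body.
bool hasThreeLoose(int[] a)
{
    import std.algorithm;
    return a.canFind(3);
}

// Selective import: only the symbol that is actually used is brought in.
bool hasThree(int[] a)
{
    import std.algorithm.searching : canFind;
    return a.canFind(3);
}

unittest
{
    assert(hasThree([1, 2, 3]));
    assert(!hasThree([1, 2]));
    assert(hasThreeLoose([3]));
}

void main() {}
```

Both functions behave the same; the difference is only in how much of the imported module becomes visible inside the function.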

The automation of the style checker and coverage reports was done by Sebastian Wilzbach. dscanner was written by Brian Schott.

Closing Thoughts

We’re still working to improve some things. Currently, Sebastian is writing a script to automatically check the documentation of every function for at least one example. Plus, the D Style Guide can be expanded to end arguments over the formatting of template constraints and other contested topics.

Practically speaking, other than getting the coverage of Phobos up to >= 95%, there’s not too much more to do. Phobos is one of the most thoroughly tested projects I’ve ever worked on, and it shows. Just recently, Phobos hit under 1000 open bugs, and that’s including enhancement requests.

Perspectives on D: Mihails Strasuns

Joakim is the resident interviewer for the D Blog. He has also interviewed members of the D community for This Week in D and is responsible for the Android port of LDC.


Mihails Strasuns, known as Dicebot on the D newsgroup, is a well-known community member who works for Sociomantic, one of the largest commercial users of D and host of the previous and upcoming DConfs in Berlin. He has given talks about declarative programming at DConf 2014 and about the process of transitioning from D1 (D 1.0) to D2 (D 2.0) at DConf 2015; acted as review manager for several additions to the standard library, Phobos; and is the current manager for DIPs (D Improvement Proposals), a process for suggesting changes to the D language. He also maintains the D packages for Arch Linux.

Joakim: Please tell us about yourself: who you are and where you’re from, what programming languages you used before D, and take us from your experience first discovering and using D to getting involved with its development.

Mihails: This is quite long story to tell but I will try to keep details to bare minimum.

My real name is Михаил Страшун, age 27, coming from Latvia. Have been into programming since early primary school – initially started with Pascal courses for kids and continued with informatics competitions and small pet projects in Delphi. After ending secondary school got my first job which was also about Delphi but by that time I have already understood that it isn’t most practical specialization. So next was C++ and next few years have been spent moving between small Latvian companies doing VoIP and CCTV server software. Ended up in local outsourcing company doing part of a huge LTE project for Nokia Siemens Networks. That was also my introduction to the world of barebone programming and plain C.

Shortly before that (in something like 2010) I have stumbled upon Andrei Alexandrescu article The Case For D and immediately got hooked. With fresh memories of learning C++ the hard way, it just felt like a breakthrough. There wasn’t any practical application I could use D for at that point so it remained purely theoretical interest for a long time. At that point, best thing about D was reading the newsgroup and studying papers and articles linked from there – which also sparked my interest about programming language design in general.

It is quite telling that it took me about 30 minutes from trying “Hello, World” to finding first Phobos bug. And 1 day to find first DMD bug. D toolchain stability has really improved since 2011. 🙂 Because of that I didn’t initially have the courage to try D even for pet project. To be honest, I still don’t have any, preferring to contribute to projects of others I have interest in. Resulting contrast between my work activities in C and spare time contributions in D started a series of events that resulted in me being hired by Sociomantic Labs in 2013.

Regarding D development involvement – I don’t feel like I am really part of it, even if perception is sometimes different. I simply do stuff that feels necessary and that no one else seems to work on. Phobos contributions, compiler features, even review manager activity – it all has happened simply because no one else was doing things I wanted to get done. Stepping up was simply fastest way to make it happen. Can’t even remember when I have created first Phobos/DMD pull request – it was a very casual and natural thing to do. Same with Arch Linux packaging.

I think this is one of the most commonly underrated things about how D development works – one doesn’t need any outstanding expertise or authority to make an impact. No permission of benevolent dictator is needed either – just patience and desire to work on things you want to happen.

Joakim: Sociomantic was started with D1 and has been moving to D2, a transition that you helped set up. You didn’t code much in D2 at Sociomantic initially, what are your impressions of D2 now that you’re using it more?

Mihails: I started with D2 and have used D1 for the first time in my life only in Sociomantic. 🙂

Most of the code I write these days is D2-compatible. But it isn’t what one may expect from idiomatic D code because D1 compatibility is preserved too. The Ocean library is quite a typical example of that kind of code and I am one of its maintainers.

Though there is also bunch of small tools/scripts I write occasionally – those are pure (and maybe even idiomatic) D2. Our migration helper tool, d1to2fix, is one such example and we will probably open-source a bit more in the near future.

But most importantly, since this month I will be spending part of my work time (1-2 days / week) helping D upstream – this is the first step in planned Sociomantic contribution to D Foundation. 🙂 And that definitely means using some bleeding edge D2!

Joakim: Have you written much in D2 outside of Sociomantic? What projects and how was your experience?

Mihails: Sadly, not much. My main point of interest was vibe.d, specifically its MongoDB driver and REST interface generator. The latter has become my personal “playground” for stressing limits of D meta-programming capabilities while still trying to maintain code readability (but initial idea and implementation is 100% by Sönke Ludwig). I used it any time some personal web service was necessary but that didn’t result in anything persistent. There were some minor contributions to tools like DStep or dub but most often it was just trying out various concept and throwing them away.

There is also some amount of D2 activity that is directly related to my job as our upgrade process has been slowly moving forward, but that is more about compiler itself. Like adding more permissive deprecation paths during recent beta release cycle to ensure that we will be able to smoothly go through versions later. Sadly, it is very hard for me to find motivation to work with D both at work and in spare time – my mind urges for more diversity.

Joakim: You forked the Volt programming language repository on github a couple years ago, Rust last year. How do you feel those languages compare to D2? What do you think D2 has done right and wrong?

Mihails: Volt has caught my interest about three years ago. Same as D tries to improve on C mistakes, Volt is an attempt to rethink D design mistakes. It is hard to really compare it with D as a language, because Volt is more of a hobbyist thing that is more of a prototype than finished design. That was one of the best things about my (very short) involvement – all those refreshing design discussions in IRC with no concerns about backwards compatibility and strong desire to get things right. 🙂 At some point I have been seriously considering dropping D and joining Volt development team but joining Sociomantic has changed that. It feels more pragmatic to work on small improvement of language you will actually use than on fundamental things that are likely to remain as hobby.

My attitude to Rust is quite different. Right now I consider it to have a serious advantage over D in embedded/barebone domain, at least when thinking about types of applications I have worked on earlier with C and C++. Last year, I wrote a blog post that compared D vs Rust from my personal point of view, this should give a more detailed explanations about language features. At the same time, I don’t feel tempted to start any personal hobby projects in Rust. It is a very well-designed strict purist language – exactly the kind of tool you want to have to manage big, complicated projects but not that fun to use for small dirty experiments.

These days my main grudge at D is more about process than language itself. It just happened that many of D2 features were added in quick burst when the split from D1 has happened and since then people keep trying to work with that mostly theoretical designs even if practice has shown that some choice were sub-optimal. Commonly mentioned example is choice of attributes like pure or @safe to be permissive by default. I believe having regular (once in ~5 years) major language revisions could be a better approach to move forward and this was one of the themes for my DConf talk last year. 🙂

Joakim: Please expand on some of these “D design mistakes:” what are the “theoretical designs” that have proven sub-optimal? Not making pure and @safe the default sounds more pragmatic, not theoretical.

Mihails: By “theoretical” I have meant that certain decisions simply didn’t have any prolonged field-trial period before being set in stone. It felt right to add purity and safety enforcements but only after some years of trying to adjust Phobos to actually use those we started to realize that other way around for defaults could have been better approach. Another example is D module system – it felt perfectly reasonable and elegant when I have first read the spec, but with more D project maintenance experience my opinion has changed. Main issue with it is that there is no way to add new public symbols to libraries in backwards compatible way without risking the breakage of user code (I have explained it in a bit more details in my Rust vs D blog post). Some other aspects we have been discussing in Volt IRC channel is relation between symbol visibility and internal linkage and introduction of more structured template constraints for better error feedback. All kind of stuff that is simply hard to foresee until you actually try it in practice and see how it fails.

Joakim: You certainly have a lot of criticism for D: what do you feel it got right?

Mihails: Just want to make it clear – I don’t have any bad feelings for D, it just the way my naturally grumpy perception works. If I don’t criticize something, that usually means that I am simply not familiar enough with the topic. 🙂

Despite all my complaints D remains one of most pleasant and practical languages I have used. It has a very rewarding learning curve – easy to start with for anyone familiar with C-style languages, easy to get your job done using only subset of language you are comfortable with, easy to slowly adopt more advanced concepts of language one by one. Documentation can be lacking but language itself is very well-designed in that regard. One example of such decisions is choice of string mixins vs macros as primary meta-programming facility. Latter is “cleaner” but former is much easier to jump in, being a very intuitive concept.

It is not about getting any specific feature right but about overall taste of pragmatism that implies small tough trade-offs here and there. And Walter seems to have a pretty good taste. 🙂

Joakim: You’ve been review manager for some Phobos modules over the years: what was good or bad about the experience? Phobos has a reputation for interminable review, what are your thoughts on the current review process?

Mihails: That was a good experience – actually moving on with Phobos proposals instead of them rotting for years in review queue. 🙂 Even rejecting is better than keeping good work completely abandoned with no feedback at all. That was exactly how I have started with this role – there were several interesting proposals in review queue and no one wanted to step up even if required effort was trivial.

Most bad experience comes from attention disbalance. Proposals that target smaller audience and/or have complicated implementation can’t gather enough reviewers to be reliably accepted (like it has happened with new std.signal). Proposals that are widely demanded and have lot of natural subjectivity (like std.logger) get debated to death over and over again.

In my opinion there isn’t anything inherently wrong with review process itself (it is quite simple and flexible). It is natural consequence of wanting to get useful things in Phobos and maintaining strict backwards compatibility at the same time. We simply can’t risk accepting anything with debatable API into Phobos because it will be impossible to fix if issues will be found later. And some packages are just so naturally opinionated that making “correct” decision is simply impossible – it is matter of taste!

In the end, it all comes to argument between two camps – those who prefer all-powerful standard library and those who prefer endorsing dub, the D package manager. Actual review process is hardly that important here. When I understood that Phobos is following kitchen sink path and this is not going to change, I have lost any interest in its development.

Joakim: How is the new DIP process you initiated going? Lay out any changes you’ve had to make to the process and how you feel the proposal queue is now.

Mihails: I am quite satisfied with it. There are still small tweaks happening to the process as I gather more feedback from Andrei and Walter of course. For example, for first submitted DIPs I only checked most formal acceptance criteria and Andrei has clearly indicated the bar has to be much higher. But the core process seems to be working as intended right now.

In The Why and Wherefore of the New D Improvement Proposal Process, I have outlined three key goals for new process:

1) introduce some preliminary quality control
2) ensure formal response from language authors
3) transparent DIP status maintenance

(1) is probably the most lacking bit as I am very alien to academical world myself and can’t review proposals with the level of scrutiny that is desired. I could really use some help from other community members with experience in this domain.

But on (2) and (3) there was a huge success in my opinion. Responses provided by Andrei (DIP 1001 and 1002) explain all issues of the proposal in greatest details and provide great insight on decision rationale. And switching to GitHub repository for managing documents naturally helped a lot with (3).

Joakim: You’ve mentioned taste a couple times, including that Walter has “pretty good taste.” What stands out in D as exemplars?

Mihails: I think decision to stick to C syntax family was a big success and remains one of big selling points for D in the language market. C syntax is often criticized for bad grammar decision (for example, with variable declarations) but in practice it proves to not be too big of a deal. But providing some familiar ground for new devs is definitely a big deal.

Slices come to mind too. When I was only learning D it seemed awkward to separate actual dynamic array from its view like that. But eventually I figured out those can be used as view on any kind of contiguous data and started to appreciate how convenient it can be. Like the fact that one can make D string from C string by simply slicing the pointer. That makes you feel good.

Those examples may feel artificial though because “pretty good taste” is not about any specific feature and decision. It just happens that you start using the language and find yourself much more comfortable with it, as opposed to thinking about any of its design aspects in theory. For me D feels like a language which was designed by someone with huge programming experience, even if I can’t truly reflect why.

How to Write @trusted Code in D

Steven Schveighoffer is the creator and maintainer of the dcollections and iopipe libraries. He was the primary instigator of D’s inout feature and the architect of a major rewrite of the language’s built-in arrays. He also authored the oft-recommended introductory article on the latter.


In computer programming, there is a concept of memory-safe code, which is guaranteed at some level not to cause memory corruption issues. The ultimate holy grail of memory safety is to be able to mechanically verify you will not corrupt memory no matter what. This would provide immunity from attacks via buffer overflows and the like. The D language provides a definition of memory safety that allows quite a bit of useful code, but conservatively forbids things that are sketchy. In practice, the compiler is not omnipotent, and it lacks the context that we humans are so good at seeing (most of the time), so there is often the need to allow otherwise risky behavior. Because the compiler is very rigid on memory safety, we need the equivalent of a cast to say “yes, I know this is normally forbidden, but I’m guaranteeing that it is fine”. That tool is called @trusted.

Because it’s very difficult to explain why @trusted code might be incorrect without first discussing memory safety and D’s @safe mechanism, I’ll go over that first.

What is Memory Safe Code?

The easiest way to explain what is safe is to examine what results in unsafe code. There are generally three main ways to create a safety violation in a statically-typed language:

  1. Write or read from a buffer outside the valid segment of memory that you have access to.
  2. Cast some value to a type that allows you to treat a piece of memory that is not a pointer as a pointer.
  3. Use a pointer that is dangling, or no longer valid.

The first item is quite simple to achieve in D:

auto buf = new int[1]; 
buf[2] = 1;

With default bounds checks on, this results in an exception at runtime, even in code that is not checked for safety. But D allows circumventing this by accessing the pointer of the array:

buf.ptr[2] = 1;

For an example of the second, all that is needed is a cast:

*cast(int*)(0xdeadbeef) = 5;

And the third is relatively simple as well:

auto buf = new int[1];
auto buf2 = buf;
delete buf;  // sets buf to null
buf2[0] = 5; // but not buf2.

Dangling pointers also frequently manifest by pointing at stack data that is no longer in use (or is being used for a different reason). It’s very simple to achieve:

int[] foo()
{
    int[4] buf;
    int[] result = buf[];
    return result;
}

So simply put, safe code avoids doing things that could potentially result in memory corruption. To that end, we must follow some rules that prohibit such behavior.

Note: dereferencing a null pointer in user-space is not considered a memory safety issue in D. Why not? Because this triggers a hardware exception, and generally does not leave the program in an undefined state with corrupted memory. It simply aborts the program. This may seem undesirable to the user or the programmer, but it’s perfectly fine in terms of preventing exploits. There are potential memory issues possible with null pointers, if one has a null pointer to a very large memory space. But for safe D, this requires an unusually large struct to even begin to worry about it. In the eyes of the D language, instrumenting all pointer dereferences to check for null is not worth the performance degradation for these rare situations.

D’s @safe rules

D provides the @safe attribute that tags a function to be mechanically checked by the compiler to follow rules that should prevent all possible memory safety problems. Of course, there are cases where developers need to make exceptions in order to get some meaningful work done.

The following rules are geared to prevent issues like the ones discussed above (listed in the spec here).

  1. Changing a raw pointer value is not allowed. If @safe D code has a pointer, it has access only to the value pointed at, no others. This includes indexing a pointer.
  2. Casting pointers to any type other than void* is not allowed. Casting from any non-pointer type to a pointer type is not allowed. All other casts are OK (e.g. casting from float to int) as long as they are valid. Casting a dynamic array to a void[] is also allowed.
  3. Unions that have pointer types that overlap other types cannot be accessed. This is similar to rules 1 and 2 above.
  4. Accessing an element in or taking a slice from a dynamic array must be either proven safe by the compiler, or incur a bounds check during runtime. This even happens in release mode, when bounds checks are normally omitted (note: dmd’s option -boundscheck=off will override this, so use with extreme caution).
  5. In normal D, you can create a dynamic array from a pointer by slicing the pointer. In @safe D, this is not allowed, since the compiler has no idea how much space you actually have available via that pointer.
  6. Taking a pointer to a local variable or function parameter (variables that are stored on the stack) or taking a pointer to a reference parameter are forbidden. An exception is slicing a local static array, including the function foo above. This is a known issue.
  7. Explicit casting between immutable and mutable types that are or contain references is not allowed. Casting value-types between immutable and mutable can be done implicitly and is perfectly fine.
  8. Explicit casting between thread-local and shared types that are or contain references is not allowed. Again, casting value-types is fine (and can be done implicitly).
  9. The inline assembler feature of D is not allowed in @safe code.
  10. Catching thrown objects that are not derived from class Exception is not allowed.
  11. In D, all variables are default initialized. However, this can be changed to uninitialized by using a void initializer:
    int *s = void;

    Such usage is not allowed in @safe D. The above pointer would point to random memory and create an obvious dangling pointer.

  12. __gshared variables are static variables that are not properly typed as shared, but are still in global space. Often these are used when interfacing with C code. Accessing such variables is not allowed in @safe D.
  13. Using the ptr property of a dynamic array is forbidden (a new rule that will be released in version 2.072 of the compiler).
  14. Writing to void[] data by means of slice-assigning from another void[] is not allowed (this rule is also new, and will be released in 2.072).
  15. Only @safe functions or those inferred to be @safe can be called.
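A few of these rules can be probed directly. The sketch below asks the compiler, via __traits(compiles, ...), whether a @safe lambda containing each operation is accepted. The helper names systemFunc, safeFirst, and safeRuleChecks are invented for this example, and the checks assume a compiler recent enough to include the .ptr rule described above.

```d
// A function that is explicitly unchecked.
void systemFunc() @system {}

// Control case: indexing a dynamic array is bounds-checked (rule 4),
// so it is accepted in @safe code.
int safeFirst(int[] a) @safe
{
    return a[0];
}

void safeRuleChecks()
{
    // Rule 1: changing or indexing a raw pointer.
    static assert(!__traits(compiles, () @safe { int* p; p = p + 1; }));
    static assert(!__traits(compiles, () @safe { int* p; return p[1]; }));

    // Rule 2: casting a non-pointer value to a pointer.
    static assert(!__traits(compiles,
        () @safe { auto p = cast(int*) 0xdeadbeef; }));

    // Rule 11: void-initializing a pointer.
    static assert(!__traits(compiles, () @safe { int* s = void; }));

    // Rule 13: using the ptr property of a dynamic array.
    static assert(!__traits(compiles,
        () @safe { int[] a; auto p = a.ptr; }));

    // Rule 15: calling a function that is not @safe.
    static assert(!__traits(compiles, () @safe { systemFunc(); }));

    // The control case compiles as @safe.
    static assert(__traits(compiles, () @safe { return safeFirst([1]); }));
}

void main() {}
```

All of the checks run at compile time, so the file building at all is the result: every rejected lambda corresponds to one of the rules listed above.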

The need for @trusted

The above rules work well to prevent memory corruption, but they prevent a lot of valid, and actually safe, code. For example, consider a function that wants to use the system call read, which is prototyped like this:

ssize_t read(int fd, void* ptr, size_t nBytes);

For those unfamiliar with this function, it reads data from the given file descriptor, and puts it into the buffer pointed at by ptr and expected to be nBytes bytes long. It returns the number of bytes actually read, or a negative value if an error occurs.

Using this function to read data into a stack-allocated buffer might look like this:

ubyte[128] buf;
auto nread = read(fd, buf.ptr, buf.length);

How is this done inside a @safe function? The main issue with using read in @safe code is that pointers can only pass a single value, in this case a single ubyte. read expects to store more bytes of the buffer. In D, we would normally pass the data to be read as a dynamic array. However, read is not D code, and uses a common C idiom of passing the buffer and length separately, so it cannot be marked @safe. Consider the following call from @safe code:

auto nread = read(fd, buf.ptr, 10_000);

This call is definitely not safe. What is safe in the above read example is only the one call, where the understanding of the read function and calling context assures memory outside the buffer will not be written.

To solve this situation, D provides the @trusted attribute, which tells the compiler that the code inside the function is assumed to be @safe, but will not be mechanically checked. It’s on you, the developer, to make sure the code is actually @safe.

A function that solves the problem might look like this in D:

auto safeRead(int fd, ubyte[] buf) @trusted
{
    return read(fd, buf.ptr, buf.length);
}

Whenever marking an entire function @trusted, consider if code could call this function from any context that would compromise memory safety. If so, this function should not be marked @trusted under any circumstances. Even if the intention is to only call it in safe ways, the compiler will not prevent unsafe usage by others. safeRead should be fine to call from any @safe context, so it’s a great candidate to mark @trusted.

A more liberal API for the safeRead function might take a void[] array as the buffer. However, recall that in @safe code, any dynamic array converts implicitly to a void[] array, including an array of pointers. Reading file data into an array of pointers could result in an array of dangling pointers. This is why ubyte[] is used instead.
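The difference between the two signatures can be sketched without touching any file. The functions liberal and strict below are hypothetical stand-ins that only report the buffer size; the point is which arguments the compiler accepts in @safe code.

```d
// Hypothetical stand-ins for the two candidate signatures.
size_t liberal(void[] buf) @safe { return buf.length; }
size_t strict(ubyte[] buf) @safe { return buf.length; }

@safe unittest
{
    int*[] ptrs = new int*[2];

    // Accepted silently: the array of pointers converts to void[],
    // and its length is now measured in bytes.
    assert(liberal(ptrs) == 2 * (int*).sizeof);

    // Rejected at compile time: int*[] does not convert to ubyte[].
    static assert(!__traits(compiles, strict(ptrs)));

    ubyte[] bytes = new ubyte[4];
    assert(strict(bytes) == 4);
}
```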

@trusted escapes

A @trusted escape is a single expression that allows @system (the unsafe default in D) calls such as read without exposing the potentially unsafe call to any other part of the program. Instead of writing the safeRead function, the same feat can be accomplished inline within a @safe function:

auto nread = ( () @trusted => read(fd, buf.ptr, buf.length) )();

Let’s take a closer look at this escape to see what is actually happening. D allows declaring a lambda function that evaluates and returns a single expression, with the () => expr syntax. In order to call the lambda function, parentheses are appended to the lambda. However, operator precedence will apply those parentheses to the expression and not the lambda, so the entire lambda must be wrapped in parentheses to clarify the call. And finally, the lambda can be tagged @trusted as shown, so the call is now usable from the @safe context that contains it.
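To show the escape in context, here is a small sketch of a complete @safe function built around it. It assumes a POSIX system; countNonZero is a hypothetical name, and the pipe in the test is only there to supply a descriptor with known data.

```d
import core.sys.posix.unistd : close, pipe, read, write;

// Hypothetical @safe function: counts the nonzero bytes read from fd.
size_t countNonZero(int fd) @safe
{
    ubyte[128] buf;

    // The escape: a lambda tagged @trusted, wrapped in parentheses, then called.
    immutable nread = (() @trusted => read(fd, buf.ptr, buf.length))();
    if (nread <= 0)
        return 0;

    // Everything from here on is mechanically checked again.
    size_t count;
    foreach (b; buf[0 .. nread])
        if (b != 0)
            ++count;
    return count;
}

unittest
{
    int[2] fds;
    assert(pipe(fds) == 0);
    scope (exit) close(fds[0]);

    immutable ubyte[5] data = [0, 7, 0, 9, 1];
    write(fds[1], data.ptr, data.length);
    close(fds[1]);

    assert(countNonZero(fds[0]) == 3);
}
```

Only the single read expression has to be reviewed by hand; the loop that consumes the data stays under the compiler’s checks.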

In addition to simple lambdas, whole nested functions or multi-statement lambdas can be used. However, remember that adding a trusted nested function or saving a lambda to a variable exposes the rest of the function to potential safety concerns! Take care not to expose the escape too much because this risks having to manually verify code that should just be mechanically checked.
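The risk of a saved lambda can be sketched as follows. The names fillAll and fill are hypothetical; the commented-out line is the kind of call the compiler would no longer catch.

```d
// Hypothetical example: a @trusted lambda stored in a variable.
ubyte[8] fillAll() @safe
{
    ubyte[8] buf;

    // Saved escape: every later call to `fill` bypasses the checker,
    // so each call site has to be reviewed by hand.
    auto fill = (size_t n) @trusted {
        foreach (i; 0 .. n)
            buf.ptr[i] = 0xFF;
    };

    fill(buf.length);   // fine: stays inside the buffer
    // fill(10_000);    // would also compile in @safe code, and corrupt memory

    return buf;
}

@safe unittest
{
    auto result = fillAll();
    foreach (b; result)
        assert(b == 0xFF);
}
```

An immediately invoked escape fixes its arguments at the one place it appears; a named one can be called again with anything.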

Rules of Thumb for @trusted

The previous examples show that tagging something as @trusted has huge implications. If you disable memory safety checks for a piece of code but allow any @safe code to call it, then you must be sure that it cannot result in memory corruption. These rules should give guidance on where to put @trusted marks and avoid getting into trouble:

Keep @trusted code small

@trusted code is never mechanically checked for safety, so every line must be reviewed for correctness. For this reason, it’s always advisable to keep the code that is @trusted as small as possible.

Apply @trusted to entire functions when the unsafe calls are leaky

Code that modifies or uses data that @safe code also uses creates the potential for unsafe calls to leak into the mechanically checked portion of a @safe function. This means that portion of the code must be manually reviewed for safety issues. It’s better to mark the whole thing @trusted, as that’s more in line with the truth. This is not a hard and fast rule; for example, the read call from the earlier example is perfectly safe, even though it will affect data that is used later by the function in @safe mode.

A pointer allocated with C’s malloc in the beginning of the function, and free’d later, could have been copied somewhere in between. In this case, the dangling pointer may violate @safe, even in the mechanically checked part. Instead, try wrapping the entire portion that uses the pointer as @trusted, or even the entire function. Alternatively, use scope guards to guarantee the lifetime of the data until the end of the function.
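One way this could look is sketched below: the whole function is @trusted because the malloc’d pointer lives across most of its body, and a scope guard ties the free to every exit path. The function sumOfSquares is a hypothetical example, not taken from any library.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical example: the pointer never leaves this function, so the
// whole body is reviewed by hand and marked @trusted.
long sumOfSquares(size_t n) @trusted
{
    // Refuse sizes whose byte count would overflow; otherwise the slice
    // below could be longer than the allocation.
    if (n > size_t.max / int.sizeof)
        return 0;

    auto p = cast(int*) malloc(n * int.sizeof);
    if (p is null)
        return 0;
    scope (exit) free(p); // runs on every path out of the function

    int[] values = p[0 .. n];
    foreach (i, ref v; values)
        v = cast(int) (i * i);

    long total;
    foreach (v; values)
        total += v;
    return total; // only a plain number escapes, never the pointer
}

@safe unittest
{
    assert(sumOfSquares(4) == 14); // 0 + 1 + 4 + 9
}
```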

Never use @trusted on template functions that accept arbitrary types

D is smart enough to infer @safe for template functions that follow the rules. This includes member functions of templated types. Just let the compiler do its job here. To ensure the function is actually @safe in the right contexts, create an @safe unittest to call it. Marking the function @trusted allows any operator overloads or members that might violate memory safety to be ignored by the safety checker! Some tricky ones to remember are postblit and opCast.

It’s still OK to use @trusted escapes here, but be very careful. Consider especially possible types that contain pointers when thinking about how such a function could be abused. A common mistake is to mark a range function or range usage @trusted. Remember that most ranges are templates, and can be easily inferred as @system when the type being iterated has a @system postblit or constructor/destructor, or is generated from a user-provided lambda.
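Here is a minimal sketch of inference at work. The template firstOr and the struct Risky are hypothetical; Risky stands in for any user type whose postblit is @system.

```d
// No safety attribute: the compiler infers one per instantiation.
T firstOr(T)(T[] items, T fallback)
{
    return items.length ? items[0] : fallback;
}

// Hypothetical type whose copy hook is not memory safe.
struct Risky
{
    int* p;
    this(this) @system { /* imagine unchecked pointer work here */ }
}

@safe unittest
{
    // Compiles only because firstOr!int was inferred @safe.
    assert(firstOr([4, 5, 6], 0) == 4);
    assert(firstOr((int[]).init, -1) == -1);

    // firstOr!Risky has to copy a Risky, so it is inferred @system
    // and cannot be called from @safe code.
    static assert(!__traits(compiles, () @safe {
        Risky[] rs;
        auto r = firstOr(rs, Risky.init);
    }));
}
```

Had firstOr been marked @trusted, the Risky instantiation would have been accepted too, with its @system postblit running unchecked.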

Use @safe to find the parts you need to mark as @trusted

Sometimes, a template intended to be @safe may not be inferred @safe, and it’s not clear why. In this case, try temporarily marking the template function @safe to see where the compiler complains. That’s where @trusted escapes should be inserted if appropriate.

In some cases, a template is used pervasively, and tagging it as @safe may make too many parts break. Make a copy of the template under a different name that you mark @safe, and change the calls that are to be checked so that they call the alternative template instead.

Consider how the function may be edited in the future

When writing a @trusted function, always think about every way it could be called with the given API, and ensure that each of those calls is memory safe. A good example from above is making sure safeRead cannot accept an array of pointers. However, another possibility for unsafe code to creep in is when someone edits a part of the function later, invalidating the previous verification, and the whole function needs to be rechecked. Insert comments to explain the danger of changing something that would then violate safety. Remember, pull request diffs don’t always show the entire context, including that a long function being edited is @trusted!

Use types to encapsulate @trusted operations with defined lifetimes

Sometimes, a resource is only dangerous to create and/or destroy, but not to use during its lifetime. The dangerous operations can be encapsulated into a type’s constructor and destructor, marked @trusted, which allows @safe code to use the resource in between those calls. This takes a lot of planning and care. At no time can you allow @safe code to ferret out the actual resource so that it can keep a copy past the lifetime of the managing struct! It is essential to make sure the resource is alive as long as @safe code has a reference to it.

For example, a reference-counted type can be perfectly safe, as long as a raw pointer to the payload data is never available. D’s std.typecons.RefCounted cannot be marked @safe, since it uses alias this to devolve to the protected allocated struct in order to function, and any calls into this struct are unaware of the reference counting. All it takes is one copy of that payload pointer: when the struct is free’d, that copy becomes a dangling pointer.
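A small sketch of the idea, using a hypothetical ByteBuffer type that owns a block of C heap memory. It is deliberately minimal: copying is disabled so there is exactly one owner, and no member ever hands out the raw pointer or a slice of the memory.

```d
import core.stdc.stdlib : calloc, free;

// Hypothetical owning type: creation and destruction are @trusted,
// everything in between is usable from @safe code.
struct ByteBuffer
{
    private ubyte* ptr;
    private size_t len;

    @disable this(this); // no copies, so exactly one owner frees the memory

    this(size_t n) @trusted
    {
        ptr = cast(ubyte*) calloc(n, 1);
        len = ptr is null ? 0 : n;
    }

    ~this() @trusted
    {
        free(ptr);
        ptr = null;
        len = 0;
    }

    size_t length() const @safe { return len; }

    // Access is by value only; the explicit check stays active in release builds.
    ubyte opIndex(size_t i) const @trusted
    {
        if (i >= len)
            assert(0, "ByteBuffer index out of bounds");
        return ptr[i];
    }

    void opIndexAssign(ubyte value, size_t i) @trusted
    {
        if (i >= len)
            assert(0, "ByteBuffer index out of bounds");
        ptr[i] = value;
    }
}

@safe unittest
{
    auto buf = ByteBuffer(16);
    assert(buf.length == 16);
    buf[3] = 42;
    assert(buf[3] == 42);
    assert(buf[4] == 0); // calloc zero-initializes
}
```

Note that private in D is module-level, so a type like this would need to live in its own module for the guarantee to hold against the rest of the code base.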

This can’t be @safe!

Sometimes, the compiler allows a function to be marked @safe, or infers it as @safe, when it’s obvious that shouldn’t be allowed. This is caused by one of two things: either a function that is called by the @safe function (or some deeper function) is marked @trusted but allows unsafe calls, or there is a bug or hole in the @safe system.

Most of the time, it is the former. @trusted is a very tricky attribute to get correct, as is shown by most of this post. Frequently, developers will mark a function @trusted only thinking of some uses of their function, not realizing the dangers it allows. Even core D developers make this mistake! There can be template functions that are inferred safe because of this, and sometimes it’s difficult to even find the culprit.

Even after the root cause is discovered, it’s often difficult to remove the @trusted tag as it will break many users of the function. However, it’s better to break code that is expecting a promise of memory safety than subject it to possible memory exploits. The sooner you can deprecate and remove the tag, the better. Then insert trusted escapes for cases that can be proven safe.
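To make the first cause concrete, here is a deliberately flawed sketch. All three functions are hypothetical; peek is the kind of overly broad @trusted function described above, and third shows how it lets unsafe behavior pass as @safe.

```d
// Anti-pattern: @trusted promises safety for EVERY caller, yet nothing
// here ties `offset` to the size of the memory behind `p`.
int peek(const(int)* p, size_t offset) @trusted
{
    return p[offset];
}

// Accepted as @safe, but reads out of bounds whenever items.length < 3.
int third(const(int)[] items) @safe
{
    return peek(&items[0], 2);
}

// One possible repair: take the slice itself, so the bounds travel with
// the data and the check is mechanical again.
int thirdChecked(const(int)[] items) @safe
{
    return items[2];
}

@safe unittest
{
    assert(third([1, 2, 3]) == 3);
    assert(thirdChecked([1, 2, 3]) == 3);
}
```

With a three-element array both versions agree. With a shorter one, thirdChecked fails its bounds check while third quietly reads past the end of the array.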

If it does happen to be a hole in the system, please report the issue, or ask questions on the D forums. The D community is generally happy to help, and memory safety is a particular focus for Walter Bright, the creator of the language.