{"id":283,"date":"2016-09-28T13:59:14","date_gmt":"2016-09-28T13:59:14","guid":{"rendered":"http:\/\/dlang.org\/blog\/?p=283"},"modified":"2021-10-08T11:09:59","modified_gmt":"2021-10-08T11:09:59","slug":"how-to-write-trusted-code-in-d","status":"publish","type":"post","link":"https:\/\/dlang.org\/blog\/2016\/09\/28\/how-to-write-trusted-code-in-d\/","title":{"rendered":"How to Write @trusted Code in D"},"content":{"rendered":"<p><em>Steven Schveighoffer is the creator and maintainer of the <a href=\"https:\/\/github.com\/schveiguy\/dcollections\">dcollections<\/a> and <a href=\"https:\/\/github.com\/schveiguy\/iopipe\">iopipe<\/a> libraries. He was the primary instigator of D&#8217;s <a href=\"https:\/\/dlang.org\/spec\/function.html#inout-functions\">inout<\/a> feature and the architect of a major rewrite of the language&#8217;s built-in arrays. He also authored the oft-recommended <a href=\"https:\/\/dlang.org\/d-array-article.html\">introductory article<\/a> on the latter.<br \/>\n<\/em><\/p>\n<hr \/>\n<p><img loading=\"lazy\" class=\"alignleft size-full wp-image-181\" src=\"http:\/\/dlang.org\/blog\/wp-content\/uploads\/2016\/08\/d6.png\" alt=\"d6\" width=\"200\" height=\"200\" \/>In computer programming, there is a concept of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Memory_safety\"><strong>memory-safe<\/strong> code<\/a>, which is guaranteed at some level not to cause memory corruption issues. The ultimate holy grail of memory safety is to be able to mechanically verify you will not corrupt memory no matter what. This would provide immunity from attacks via buffer overflows and the like. The D language provides a definition of memory safety that allows quite a bit of useful code, but conservatively forbids things that are sketchy. In practice, the compiler is not omnipotent, and it lacks the context that we humans are so good at seeing (most of the time), so there is often the need to allow otherwise risky behavior. 
Because the compiler is very rigid on memory safety, we need the equivalent of a cast to say &#8220;yes, I know this is normally forbidden, but I&#8217;m guaranteeing that it is fine&#8221;. That tool is called <code>@trusted<\/code>.<\/p>\n<p>Because it&#8217;s very difficult to explain why <code>@trusted<\/code> code might be incorrect without first discussing memory safety and D&#8217;s <code>@safe<\/code> mechanism, I&#8217;ll go over that first.<\/p>\n<h3>What is Memory Safe Code?<\/h3>\n<p>The easiest way to explain what is safe is to examine what results in unsafe code. There are generally three main ways to create a safety violation in a statically-typed language:<\/p>\n<ol>\n<li>Write or read from a buffer outside the valid segment of memory that you have access to.<\/li>\n<li>Cast some value to a type that allows you to treat a piece of memory that is not a pointer as a pointer.<\/li>\n<li>Use a pointer that is dangling, or no longer valid.<\/li>\n<\/ol>\n<p>The first item is quite simple to achieve in D:<\/p>\n<pre class=\"prettyprint lang-d\">auto buf = new int[1];\r\nbuf[2] = 1; \/\/ index 2 is past the end of a 1-element array<\/pre>\n<p>With default bounds checks on, this results in a <code>RangeError<\/code> being thrown at runtime, even in code that is not checked for safety. But D allows circumventing this by accessing the pointer of the array:<\/p>\n<pre class=\"prettyprint lang-d\">buf.ptr[2] = 1;<\/pre>\n<p>For an example of the second, all that is needed is a cast:<\/p>\n<pre class=\"prettyprint lang-d\">*cast(int*)(0xdeadbeef) = 5;<\/pre>\n<p>And the third is relatively simple as well:<\/p>\n<pre class=\"prettyprint lang-d\">auto buf = new int[1];\r\nauto buf2 = buf;\r\ndelete buf;  \/\/ frees the array and sets buf to null\r\nbuf2[0] = 5; \/\/ but buf2 still refers to the freed memory<\/pre>\n<p>Dangling pointers also frequently manifest by pointing at stack data that is no longer in use (or is being used for a different reason). 
It&#8217;s very simple to achieve:<\/p>\n<pre class=\"prettyprint lang-d\">int[] foo()\r\n{\r\n    int[4] buf;\r\n    int[] result = buf[];\r\n    return result;\r\n}<\/pre>\n<p>So simply put, safe code avoids doing things that could potentially result in memory corruption. To that end, we must follow some rules that prohibit such behavior.<\/p>\n<p>Note: dereferencing a <code>null<\/code> pointer in user-space is <em>not<\/em> considered a memory safety issue in D. Why not? Because this triggers a hardware exception, and generally does not leave the program in an undefined state with corrupted memory. It simply aborts the program. This may seem undesirable to the user or the programmer, but it&#8217;s perfectly fine in terms of preventing exploits. There are potential memory issues possible with <code>null<\/code> pointers, if one has a <code>null<\/code> pointer to a very large memory space. But for safe D, this requires an unusually large struct to even begin to worry about it. In the eyes of the D language, instrumenting all pointer dereferences to check for <code>null<\/code> is not worth the performance degradation for these rare situations.<\/p>\n<h3>D&#8217;s @safe rules<\/h3>\n<p>D provides the <a href=\"http:\/\/dlang.org\/spec\/function.html#safe-functions\"><strong><code>@safe<\/code><\/strong> attribute<\/a> that tags a function to be mechanically checked by the compiler to follow rules that should prevent all possible memory safety problems. Of course, there are cases where developers need to make exceptions in order to get some meaningful work done.<\/p>\n<p>The following rules are geared to prevent issues like the ones discussed above (listed in the spec <a href=\"http:\/\/dlang.org\/spec\/function.html#function-safety\">here<\/a>).<\/p>\n<ol>\n<li>Changing a raw pointer value is not allowed. If <code>@safe<\/code> D code has a pointer, it has access <em>only<\/em> to the value pointed at, no others. 
This includes indexing a pointer.<\/li>\n<li>Casting pointers to any type other than <code>void*<\/code> is not allowed. Casting from any non-pointer type to a pointer type is not allowed. All other casts are OK (e.g. casting from <code>float<\/code> to <code>int<\/code>) as long as they are valid. Casting a dynamic array to a <code>void[]<\/code> is also allowed.<\/li>\n<li>Unions that have pointer types that overlap other types cannot be accessed. This is similar to rules 1 and 2 above.<\/li>\n<li>Accessing an element in or taking a slice from a dynamic array <em>must<\/em> either be proven safe by the compiler or incur a bounds check at runtime. This even happens in release mode, when bounds checks are normally omitted (note: dmd&#8217;s option <strong>-boundscheck=off<\/strong> will override this, so use with extreme caution).<\/li>\n<li>In normal D, you can create a dynamic array from a pointer by <a href=\"http:\/\/dlang.org\/spec\/expression.html#SliceExpression\">slicing<\/a> the pointer. In <code>@safe<\/code> D, this is not allowed, since the compiler has no idea how much space you actually have available via that pointer.<\/li>\n<li>Taking a pointer to a local variable or function parameter (variables that are stored on the stack) or taking a pointer to a reference parameter is forbidden. An exception is slicing a local static array, as in the function <code>foo<\/code> above. This is a <a href=\"https:\/\/issues.dlang.org\/show_bug.cgi?id=8838\">known issue<\/a>.<\/li>\n<li>Explicit casting between immutable and mutable types that are or contain references is not allowed. Casting value-types between immutable and mutable can be done implicitly and is perfectly fine.<\/li>\n<li>Explicit casting between thread-local and shared types that are or contain references is not allowed. 
Again, casting value-types is fine (and can be done implicitly).<\/li>\n<li>The <a href=\"http:\/\/dlang.org\/spec\/iasm.html\">inline assembler<\/a> feature of D is not allowed in <code>@safe<\/code> code.<\/li>\n<li>Catching thrown objects that are not derived from <code>class Exception<\/code> is not allowed.<\/li>\n<li>In D, all variables are default initialized. However, this can be changed to uninitialized by using a <a href=\"http:\/\/dlang.org\/spec\/declaration.html#void_init\">void initializer<\/a>:\n<pre class=\"prettyprint lang-d\">int *s = void;<\/pre>\n<p>Such usage is not allowed in <code>@safe<\/code> D. The above pointer would point to random memory and create an obvious dangling pointer.<\/li>\n<li><code>__gshared<\/code> variables are static variables that are not properly typed as <code>shared<\/code>, but are still in global space. Often these are used when interfacing with C code. Accessing such variables is not allowed in <code>@safe<\/code> D.<\/li>\n<li>Using the <code>ptr<\/code> property of a dynamic array is forbidden (a new rule that will be released in version 2.072 of the compiler).<\/li>\n<li>Writing to <code>void[]<\/code> data by means of slice-assigning from another <code>void[]<\/code> is not allowed (this rule is also new, and will be released in 2.072).<\/li>\n<li>Only <code>@safe<\/code> functions or those inferred to be <code>@safe<\/code> can be called.<\/li>\n<\/ol>\n<h3>The need for @trusted<\/h3>\n<p>The above rules work well to prevent memory corruption, but they prevent a lot of valid, and actually safe, code. 
For example, consider a function that wants to use the system call <a href=\"http:\/\/pubs.opengroup.org\/onlinepubs\/009695399\/functions\/read.html\"><code>read<\/code><\/a>, which is prototyped like this:<\/p>\n<pre class=\"prettyprint lang-d\">ssize_t read(int fd, void* ptr, size_t nBytes);<\/pre>\n<p>For those unfamiliar with this function, it reads data from the given file descriptor, and puts it into the buffer pointed at by <code>ptr<\/code> and expected to be <code>nBytes<\/code> bytes long. It returns the number of bytes actually read, or a negative value if an error occurs.<\/p>\n<p>Using this function to read data into a stack-allocated buffer might look like this:<\/p>\n<pre class=\"prettyprint lang-d\">ubyte[128] buf;\r\nauto nread = read(fd, buf.ptr, buf.length);<\/pre>\n<p>How is this done inside a <code>@safe<\/code> function? The main issue with using <code>read<\/code> in <code>@safe<\/code> code is that pointers can only pass a single value, in this case a single <code>ubyte<\/code>. <code>read<\/code> expects to store more bytes of the buffer. In D, we would normally pass the data to be read as a dynamic array. However, <code>read<\/code> is not D code, and uses a common C idiom of passing the buffer and length separately, so it cannot be marked <code>@safe<\/code>. Consider the following call from <code>@safe<\/code> code:<\/p>\n<pre class=\"prettyprint lang-d\">auto nread = read(fd, buf.ptr, 10_000);<\/pre>\n<p>This call is definitely <em>not<\/em> safe. 
Only the first call in the <code>read<\/code> example above is safe: there, our understanding of the <code>read<\/code> function and the calling context assures us that memory outside the buffer will not be written.<\/p>\n<p>To handle this situation, D provides the <a href=\"http:\/\/dlang.org\/spec\/function.html#trusted-functions\"><code>@trusted<\/code> attribute<\/a>, which tells the compiler that the code inside the function is assumed to be <code>@safe<\/code>, but will not be mechanically checked. It&#8217;s on you, the developer, to make sure the code is actually <code>@safe<\/code>.<\/p>\n<p>A function that solves the problem might look like this in D:<\/p>\n<pre class=\"prettyprint lang-d\">auto safeRead(int fd, ubyte[] buf) @trusted\r\n{\r\n    \/\/ ptr and length come from the same slice, so read cannot write past its end\r\n    return read(fd, buf.ptr, buf.length);\r\n}<\/pre>\n<p>Whenever marking an entire function <code>@trusted<\/code>, consider whether code could call this function from <em>any context<\/em> that would compromise memory safety. If so, this function should not be marked <code>@trusted<\/code> <strong>under any circumstances<\/strong>. Even if the intention is to only call it in safe ways, the compiler will not prevent unsafe usage by others. <code>safeRead<\/code> should be fine to call from any <code>@safe<\/code> context, so it&#8217;s a great candidate to mark <code>@trusted<\/code>.<\/p>\n<p>A more liberal API for the <code>safeRead<\/code> function might take a <code>void[]<\/code> array as the buffer. However, recall that in <code>@safe<\/code> code, one can cast any dynamic array to a <code>void[]<\/code> array, including an array of pointers. Reading file data into an array of pointers could result in an array of dangling pointers. 
This is why <code>ubyte[]<\/code> is used instead.<\/p>\n<h3>@trusted escapes<\/h3>\n<p>A <code>@trusted<\/code> escape is a single expression that allows <code>@system<\/code> (the unsafe default in D) calls such as <code>read<\/code> without exposing the potentially unsafe call to any other part of the program. Instead of writing the <code>safeRead<\/code> function, the same feat can be accomplished inline within a <code>@safe<\/code> function:<\/p>\n<pre class=\"prettyprint lang-d\">auto nread = ( () @trusted =&gt; read(fd, buf.ptr, buf.length) )();<\/pre>\n<p>Let&#8217;s take a closer look at this escape to see what is actually happening. D allows declaring a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Anonymous_function\">lambda function<\/a> that evaluates and returns a single expression, with the <a href=\"http:\/\/dlang.org\/spec\/expression.html#Lambda\"><code>() =&gt; expr<\/code> syntax<\/a>. In order to call the lambda function, parentheses are appended to the lambda. However, operator precedence will apply those parentheses to the expression and not the lambda, so the entire lambda must be wrapped in parentheses to clarify the call. And finally, the lambda can be tagged <code>@trusted<\/code> as shown, so the call is now usable from the <code>@safe<\/code> context that contains it.<\/p>\n<p>In addition to simple lambdas, whole <a href=\"http:\/\/dlang.org\/spec\/function.html#nested\">nested functions<\/a> or multi-statement lambdas can be used. However, remember that adding a trusted nested function or saving a lambda to a variable exposes the rest of the function to potential safety concerns! Take care not to expose the escape too much because this risks having to manually verify code that should just be mechanically checked.<\/p>\n<h3>Rules of Thumb for @trusted<\/h3>\n<p>The previous examples show that tagging something as <code>@trusted<\/code> has huge implications. 
If you are disabling memory safety checks, but allowing any <code>@safe<\/code> code to call it, then you must be sure that it cannot result in memory corruption. These rules should give guidance on where to put <code>@trusted<\/code> marks and avoid getting into trouble:<\/p>\n<h4>Keep @trusted code small<\/h4>\n<p><code>@trusted<\/code> code is never mechanically checked for safety, so every line must be reviewed for correctness. For this reason, it&#8217;s always advisable to keep the code that is <code>@trusted<\/code> as small as possible.<\/p>\n<h4>Apply @trusted to entire functions when the unsafe calls are leaky<\/h4>\n<p>Code that modifies or uses data that <code>@safe<\/code> code also uses creates the potential for unsafe calls to leak into the mechanically checked portion of a <code>@safe<\/code> function. This means that portion of the code must be manually reviewed for safety issues. It&#8217;s better to mark the whole thing <code>@trusted<\/code>, as that&#8217;s more in line with the truth. This is not a hard and fast rule; for example, the <code>read<\/code> call from the earlier example is perfectly safe, even though it will affect data that is used later by the function in <code>@safe<\/code> mode.<\/p>\n<p>A pointer allocated with C&#8217;s <code>malloc<\/code> in the beginning of the function, and <code>free<\/code>&#8216;d later, could have been copied somewhere in between. In this case, the dangling pointer may violate <code>@safe<\/code>, even in the mechanically checked part. Instead, try wrapping the entire portion that uses the pointer as <code>@trusted<\/code>, or even the entire function. 
Alternatively, use <a href=\"http:\/\/dlang.org\/spec\/statement.html#scope-guard-statement\">scope guards<\/a> to guarantee the lifetime of the data until the end of the function.<\/p>\n<h4>Never use @trusted on template functions that accept arbitrary types<\/h4>\n<p>D is smart enough to <a href=\"http:\/\/dlang.org\/spec\/function.html#function-attribute-inference\">infer<\/a> <code>@safe<\/code> for template functions that follow the rules. This includes member functions of templated types. Just let the compiler do its job here. To ensure the function is actually <code>@safe<\/code> in the right contexts, create an <code>@safe unittest<\/code>\u00a0 to call it. Marking the function <code>@trusted<\/code> allows any operator overloads or members that might violate memory safety to be ignored by the safety checker! Some tricky ones to remember are <a href=\"http:\/\/dlang.org\/spec\/struct.html#struct-postblit\">postblit<\/a> and <a href=\"http:\/\/dlang.org\/spec\/operatoroverloading.html#cast\">opCast<\/a>.<\/p>\n<p>It&#8217;s still OK to use <code>@trusted<\/code> escapes here, but be very careful. Consider especially possible types that contain pointers when thinking about how such a function could be abused. A common mistake is to mark a range function or range usage <code>@trusted<\/code>. Remember that most ranges are templates, and can be easily inferred as <code>@system<\/code> when the type being iterated has a <code>@system<\/code> postblit or constructor\/destructor, or is generated from a user-provided lambda.<\/p>\n<h4>Use @safe to find the parts you need to mark as @trusted<\/h4>\n<p>Sometimes, a template intended to be <code>@safe<\/code> may not be inferred <code>@safe<\/code>, and it&#8217;s not clear why. In this case, try temporarily marking the template function <code>@safe<\/code> to see where the compiler complains. 
That&#8217;s where <code>@trusted<\/code> escapes should be inserted if appropriate.<\/p>\n<p>In some cases, a template is used pervasively, and tagging it as <code>@safe<\/code> may make too many parts break. Make a copy of the template under a different name that you mark <code>@safe<\/code>, and change the calls that are to be checked so that they call the alternative template instead.<\/p>\n<h4>Consider how the function may be edited in the future<\/h4>\n<p>When writing a trusted function, always think about how it could be called with the given API, and ensure that it should be <code>@safe<\/code>. A good example from above is making sure <code>safeRead<\/code> cannot accept an array of pointers. However, another possibility for unsafe code to creep in is when someone edits a part of the function later, invalidating the previous verification, and the whole function needs to be rechecked. Insert comments to explain the danger of changing something that would then violate safety. Remember, pull request diffs don&#8217;t always show the entire context, including that a long function being edited is <code>@trusted<\/code>!<\/p>\n<h4>Use types to encapsulate @trusted operations with defined lifetimes<\/h4>\n<p>Sometimes, a resource is only dangerous to create and\/or destroy, but not to use during its lifetime. The dangerous operations can be encapsulated into a type&#8217;s constructor and destructor, marked <code>@trusted<\/code>, which allows <code>@safe<\/code> code to use the resource in between those calls. This takes a lot of planning and care. At no time can you allow <code>@safe<\/code> code to ferret out the actual resource so that it can keep a copy past the lifetime of the managing struct! It is essential to make sure the resource is alive as long as <code>@safe<\/code> code has a reference to it.<\/p>\n<p>For example, a reference-counted type can be perfectly safe, as long as a raw pointer to the payload data is never available. 
D&#8217;s <a href=\"https:\/\/dlang.org\/phobos\/std_typecons.html#.RefCounted\"><code>std.typecons.RefCounted<\/code><\/a> cannot be marked <code>@safe<\/code>, since it uses <a href=\"https:\/\/dlang.org\/spec\/class.html#alias-this\"><code>alias this<\/code><\/a> to devolve to the protected allocated struct in order to function, and any calls into this struct are unaware of the reference counting. If even one copy of that payload pointer escapes, it becomes a dangling pointer once the struct is <code>free<\/code>&#8217;d.<\/p>\n<h3>This can&#8217;t be @safe!<\/h3>\n<p>Sometimes, the compiler allows a function to be marked <code>@safe<\/code>, or infers it to be <code>@safe<\/code>, when it&#8217;s obvious that this shouldn&#8217;t be allowed. This is caused by one of two things: either a function that is called by the <code>@safe<\/code> function (or some deeper function) is marked <code>@trusted<\/code> but allows unsafe calls, or there is a bug or hole in the <code>@safe<\/code> system. Most of the time, it is the former. <code>@trusted<\/code> is a very tricky attribute to get correct, as is shown by most of this post. Frequently, developers will mark a function <code>@trusted<\/code> while thinking of only some uses of their function, not realizing the dangers it allows. Even core D developers make this mistake! There can be template functions that are inferred safe because of this, and sometimes it&#8217;s difficult to even find the culprit. Even after the root cause is discovered, it&#8217;s often difficult to remove the <code>@trusted<\/code> tag, as doing so will break many users of the function. However, it&#8217;s better to break code that is expecting a promise of memory safety than to subject it to possible memory exploits. The sooner you can deprecate and remove the tag, the better. 
Then insert trusted escapes for cases that can be proven safe.<\/p>\n<p>If it does happen to be a hole in the system, please <a href=\"https:\/\/issues.dlang.org\/enter_bug.cgi\">report the issue<\/a>, or ask questions on the <a href=\"http:\/\/forum.dlang.org\/group\/learn\">D forums<\/a>. The D community is generally happy to help, and memory safety is a particular focus for Walter Bright, the creator of the language.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Steven Schveighoffer is the creator and maintainer of the dcollections and iopipe libraries. He was the primary instigator of D&#8217;s inout feature and the architect of a major rewrite of the language&#8217;s built-in arrays. He also authored the oft-recommended introductory article on the latter. In computer programming, there is a concept of memory-safe code, which [&hellip;]<\/p>\n","protected":false},"author":9,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[9,20],"tags":[],"_links":{"self":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/283"}],"collection":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/users\/9"}],"replies":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/comments?post=283"}],"version-history":[{"count":19,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/283\/revisions"}],"predecessor-version":[{"id":302,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/283\/revisions\/302"}],"wp:attachment":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/media?parent=283"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/categories?post=283"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dlang.org\/bl
og\/wp-json\/wp\/v2\/tags?post=283"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}