{"id":888,"date":"2017-06-16T13:37:55","date_gmt":"2017-06-16T13:37:55","guid":{"rendered":"http:\/\/dlang.org\/blog\/?p=888"},"modified":"2021-10-08T11:12:01","modified_gmt":"2021-10-08T11:12:01","slug":"life-in-the-fast-lane","status":"publish","type":"post","link":"https:\/\/dlang.org\/blog\/2017\/06\/16\/life-in-the-fast-lane\/","title":{"rendered":"Life in the Fast Lane"},"content":{"rendered":"<p><a href=\"https:\/\/dlang.org\/blog\/2017\/03\/20\/dont-fear-the-reaper\/\"><img loading=\"lazy\" class=\"size-full wp-image-181 alignright\" src=\"http:\/\/dlang.org\/blog\/wp-content\/uploads\/2016\/08\/d6.png\" alt=\"\" width=\"200\" height=\"200\" \/>The first post<\/a> I wrote in the <a href=\"http:\/\/dlang.org\/blog\/the-gc-series\/\">GC series<\/a> introduced the D garbage collector and the language features that use it. Two key points that I tried to get across in the article were:<\/p>\n<ol>\n<li><strong>The GC can only run when memory allocations are requested<\/strong>. Contrary to popular misconception, the D GC isn\u2019t generally going to decide to pause your Minecraft clone in the middle of the hot path. It will only run when memory from the GC heap is requested, and then only if it needs to.<\/li>\n<li><strong>Simple C and C++ allocation strategies can mitigate GC pressure<\/strong>. Don\u2019t allocate memory in inner loops \u2013 preallocate as much as possible, or fetch it from the stack instead. Minimize the total number of heap allocations. These strategies work because of point #1 above. The programmer can dictate when it is possible for a collection to occur simply by being smart about when GC heap allocations are made.<\/li>\n<\/ol>\n<p>The strategies in point #2 are fine for code that a programmer writes herself, but\u00a0they aren&#8217;t going to help at all with third-party libraries. For those situations, D provides built-in mechanisms to guarantee that no GC allocations can occur, both in the language and the runtime. 
There are also command-line options that can help make sure the GC stays out of the way.<\/p>\n<p>Let\u2019s imagine a hypothetical programmer named J.P. who, for reasons he considers valid, has decided he would like to avoid garbage collection completely in his D program. He has two immediate options.<\/p>\n<h4>The GC chill pill<\/h4>\n<p>One option is to make a call to <code>GC.disable<\/code> when the program is starting up. This doesn&#8217;t stop allocations, but puts a hold on collections. That means <em>all<\/em> collections, including any that may result from allocations in other threads.<\/p>\n<pre class=\"prettyprint lang-d\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">void main() {\r\n    import core.memory;\r\n    import std.stdio;\r\n    GC.disable;\r\n    writeln(\"Goodbye, GC!\");\r\n}<\/pre>\n<p>Output:<\/p>\n<pre class=\"prettyprint lang-text\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">Goodbye, GC!<\/pre>\n<p>This has the benefit that all language features making use of the GC heap will still work as expected. But, considering that allocations are still going without any cleanup, when you do the math you&#8217;ll realize this might be problematic. If allocations start to get out of hand, something&#8217;s gotta give. From <a href=\"http:\/\/dlang.org\/phobos\/core_memory.html#.GC.disable\">the documentation<\/a>:<\/p>\n<blockquote><p>Collections may continue to occur in instances where the implementation deems necessary for correct program behavior, such as during an out of memory condition.<\/p><\/blockquote>\n<p>Depending on J.P.&#8217;s perspective, this might not be a good thing. But if this constraint is acceptable, there are some additional steps that can help keep things under control. J.P. can make calls to <code>GC.enable<\/code> or <code>GC.collect<\/code> as necessary. 
This provides greater control over collection cycles than the simple C and C++ allocation strategies.<\/p>\n<h4>The GC wall<\/h4>\n<p>When the GC is simply intolerable, J.P. can turn to the <code>@nogc<\/code> attribute. Slap it at the front of the <code>main<\/code> function and thou shalt suffer no collections.<\/p>\n<pre class=\"prettyprint lang-d\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">@nogc\r\nvoid main() { ... }<\/pre>\n<p>This is the ultimate GC mitigation strategy. <code>@nogc<\/code> applied to <code>main<\/code> will guarantee that the garbage collector will never run anywhere further along the callstack. No more caveats about collecting \u201cwhere the implementation deems necessary\u201d.<\/p>\n<p>At first blush, this may appear to be a much better option than <code>GC.disable<\/code>. Let\u2019s try it out.<\/p>\n<pre class=\"prettyprint lang-d\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">@nogc\r\nvoid main() {\r\n    import std.stdio;\r\n    writeln(\"GC be gone!\");\r\n}\r\n<\/pre>\n<p>This time, we aren\u2019t going to get past compilation:<\/p>\n<pre class=\"prettyprint lang-sh\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">Error: @nogc function 'D main' cannot call non-@nogc function 'std.stdio.writeln!string.writeln'<\/pre>\n<p>What makes <code>@nogc<\/code> tick is the compiler\u2019s ability to enforce it. It\u2019s a very blunt approach. If a function is annotated with <code>@nogc<\/code>, then any function called from inside it must also be annotated with <code>@nogc<\/code>. 
As may be obvious, <code>writeln<\/code> is not.<\/p>\n<p>That\u2019s not all:<\/p>\n<pre class=\"prettyprint lang-d\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">@nogc\r\nvoid main() {\r\n    auto ints = new int[](100);\r\n}<\/pre>\n<p>The compiler isn\u2019t going to let you get away with that one either.<\/p>\n<pre class=\"prettyprint lang-sh\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">Error: cannot use 'new' in @nogc function 'D main'<\/pre>\n<p>Any language feature that allocates from the GC heap is out of reach inside a function marked <code>@nogc<\/code> (refer to <a href=\"https:\/\/dlang.org\/blog\/2017\/03\/20\/dont-fear-the-reaper\/\">the first post in this series<\/a> for an overview of those features). It\u2019s turtles all the way down. The big benefit here is that it guarantees that third-party code can\u2019t use those features either, so it can\u2019t be allocating GC memory behind your back. The downside is that any third-party library that is not <code>@nogc<\/code>-aware is not going to be available in your program.<\/p>\n<p>Using this approach requires a number of workarounds to make up for non-<code>@nogc<\/code> language features and library functions, including several in the standard library. Some are trivial, some are not, and others can\u2019t be worked around at all (we\u2019ll dive into the details in a future post). One example that might not be obvious is throwing an exception. The idiomatic way is:<\/p>\n<pre class=\"prettyprint lang-d\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">throw new Exception(\"Blah\");<\/pre>\n<p>Because of the <code>new<\/code> in that line, this isn\u2019t possible in <code>@nogc<\/code> functions. 
Getting around this requires preallocating any exceptions that will be thrown, which in turn runs into the issue that any exception memory allocated from the regular heap still needs to be deallocated, which leads to ideas of reference counting or stack allocation\u2026 In other words, it\u2019s a big can of worms. There\u2019s currently <a href=\"https:\/\/github.com\/dlang\/DIPs\/blob\/master\/DIPs\/DIP1008.md\">a D Improvement Proposal<\/a> from Walter Bright intended to stuff all the worms back into the can by making <code>throw new Exception<\/code> work without the GC when it needs to.<\/p>\n<p>It\u2019s not an insurmountable task to get around the limitations of <code>@nogc main<\/code>; it just requires a good bit of motivation and dedication.<\/p>\n<p>One more thing to note about <code>@nogc main<\/code> is that it doesn\u2019t banish the GC from the program completely. D has support for <a href=\"https:\/\/dlang.org\/spec\/class.html#static-constructor\">static constructors and destructors<\/a>. The former are executed by the runtime before entering <code>main<\/code> and the latter upon exiting. If any of these exist in the program and are not annotated with <code>@nogc<\/code>, then GC allocations and collections can technically be present in the program. Still, <code>@nogc<\/code> applied to <code>main<\/code> means there won&#8217;t be any collections running once <code>main<\/code> is entered, so it&#8217;s effectively the same as having no GC at all.<\/p>\n<h4>Working it out<\/h4>\n<p>Here\u2019s where I\u2019m going to offer an opinion. There\u2019s a wide range of programs that can be written in D without disabling or cutting the GC off completely. The simple strategies of minimizing GC allocations and keeping them out of the hot path will get a lot of mileage and <em>should be preferred<\/em>. 
It can\u2019t be repeated enough given how often it\u2019s misunderstood: D\u2019s GC will only have a chance to run when the programmer allocates GC memory and it will only run if it needs to. Use that knowledge to your advantage by keeping the allocations small, infrequent, and isolated outside your inner loops.<\/p>\n<p>For those programs where more control is actually needed, it probably isn\u2019t going to be necessary to avoid the GC entirely. Judicious use of <code>@nogc<\/code> and\/or the <code>core.memory.GC<\/code> API can often serve to avoid any performance issues that may arise. Don\u2019t put <code>@nogc<\/code> on <code>main<\/code>, put it on the functions where you really want to disallow GC allocations. Don\u2019t call <code>GC.disable<\/code> at the beginning of the program. Call it instead before entering a critical path, then call <code>GC.enable<\/code> when leaving that path. Force collections at strategic points, such as between game levels, with <code>GC.collect<\/code>.<\/p>\n<p>As with any performance tuning strategy in software development, it pays to understand as fully as possible what\u2019s actually happening under the hood. Adding calls to the <code>core.memory.GC<\/code> API in places where you <em>think<\/em> they make sense could potentially make the GC do needless work, or have no impact at all. Better understanding can be achieved with a little help from the toolchain.<\/p>\n<p>The DRuntime <a href=\"https:\/\/dlang.org\/spec\/garbage.html#gc_config\">GC option<\/a> <code>--DRT-gcopt=profile:1<\/code> can be passed to a compiled program (not to the compiler!) for some tune-up assistance. 
This will report some useful GC profiling data, such as the total number of collections and the total collection time.<\/p>\n<p>To demonstrate, <strong>gcstat.d<\/strong> appends twenty values to a dynamic array of integers.<\/p>\n<pre class=\"prettyprint lang-d\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">void main() {\r\n    import std.stdio;\r\n    int[] ints;\r\n    foreach(i; 0 .. 20) {\r\n        ints ~= i;\r\n    }\r\n    writeln(ints);\r\n}<\/pre>\n<p>Compiling and running with the GC profile switch:<\/p>\n<pre class=\"prettyprint lang-sh\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">dmd gcstat.d\r\ngcstat --DRT-gcopt=profile:1\r\n[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]\r\n        Number of collections:  1\r\n        Total GC prep time:  0 milliseconds\r\n        Total mark time:  0 milliseconds\r\n        Total sweep time:  0 milliseconds\r\n        Total page recovery time:  0 milliseconds\r\n        Max Pause Time:  0 milliseconds\r\n        Grand total GC time:  0 milliseconds\r\nGC summary:    1 MB,    1 GC    0 ms, Pauses    0 ms &lt;    0 ms<\/pre>\n<p>This reports one collection, which almost certainly happened as the program was shutting down. The runtime terminates the GC as it exits which, in the current implementation, will generally trigger a collection. 
This is done primarily to run destructors on collected objects, even though D does not require destructors of GC-allocated objects to ever be run (a topic for a future post).<\/p>\n<p>DMD supports a command-line option, <code>-vgc<\/code>, that will display every GC allocation in a program, including those that are hidden behind language features like the array append operator.<\/p>\n<p>To demonstrate, take a look at <strong>inner.d<\/strong>:<\/p>\n<pre class=\"prettyprint lang-d\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">void printInts(int[] delegate() dg)\r\n{\r\n    import std.stdio;\r\n    foreach(i; dg()) writeln(i);\r\n} \r\n\r\nvoid main() {\r\n    int[] ints;\r\n    auto makeInts() {\r\n        foreach(i; 0 .. 20) {\r\n            ints ~= i;\r\n        }\r\n        return ints;\r\n    }\r\n\r\n    printInts(&amp;makeInts);\r\n}<\/pre>\n<p>Here, <code>makeInts<\/code> is an inner function. A pointer to a non-static inner function is not a function pointer, but a <code>delegate<\/code> (a context pointer\/function pointer pair; if an inner function is <code>static<\/code>, a pointer of type <code>function<\/code> is produced instead). In this particular case, the delegate makes use of a variable in its parent scope. Here\u2019s the output of compiling with <code>-vgc<\/code>:<\/p>\n<pre class=\"prettyprint lang-sh\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">dmd -vgc inner.d\r\ninner.d(11): vgc: operator ~= may cause GC allocation\r\ninner.d(7): vgc: using closure causes GC allocation<\/pre>\n<p>What we\u2019re seeing here is that memory needs to be allocated so that the delegate can carry the state of <code>ints<\/code>, making it a <em>closure<\/em> (which is not itself a type \u2013 the type is still <code>delegate<\/code>). Move the declaration of <code>ints<\/code> inside the scope of <code>makeInts<\/code> and recompile. 
You\u2019ll find that the closure allocation goes away. A better option is to change the declaration of <code>printInts<\/code> to look like this:<\/p>\n<pre class=\"prettyprint lang-d\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">void printInts(scope int[] delegate() dg)<\/pre>\n<p>Adding <code>scope<\/code> to any function parameter ensures that any references in the parameter cannot be escaped. In other words, it now becomes impossible to do something like assign <code>dg<\/code> to a global variable, or return it from the function. The effect is that there is no longer a need to create a closure, so there will be no allocation. See the documentation for more on <a href=\"https:\/\/dlang.org\/spec\/function.html#closures\">function pointers, delegates and closures<\/a>, and <a href=\"https:\/\/dlang.org\/spec\/function.html#parameters\">function parameter storage classes<\/a>.<\/p>\n<h4>The gist<\/h4>\n<p>Given that the D GC is very different from those in languages like Java and C#, it\u2019s certain to have different performance characteristics. Moreover, D programs tend to produce far less garbage than those written in a language like Java, where almost everything is a reference type. It helps to understand this when embarking on a D project for the first time. The strategies an experienced Java programmer uses to mitigate the impact of collections aren&#8217;t likely to apply here.<\/p>\n<p>While there is certainly a class of software in which no GC pauses are ever acceptable, that is arguably a small set. Most D projects can, and should, start out with the simple mitigation strategies from point #2 at the top of this article, then adapt the code to use <code>@nogc<\/code> or <code>core.memory.GC<\/code> as and when performance dictates. 
The command-line options demonstrated here can help ferret out the areas where that may be necessary.<\/p>\n<p>As time goes by, it\u2019s going to become easier to micromanage garbage collection in D programs. There\u2019s a concerted effort underway to make Phobos, D\u2019s standard library, as <code>@nogc<\/code>-friendly as possible. Language improvements such as Walter\u2019s proposal to modify how exceptions are allocated should speed that work considerably.<\/p>\n<p>Future posts in <a href=\"http:\/\/dlang.org\/blog\/the-gc-series\/\">this series<\/a> will look at how to allocate memory outside of the GC heap and use it alongside GC allocations in the same program, how to compensate for disabled language features in <code>@nogc<\/code> code, strategies for handling the interaction of the GC with object destructors, and more.<\/p>\n<p><em>Thanks to Vladimir Panteleev, Guillaume Piolat, and Steven Schveighoffer for their valuable feedback on drafts of this article.<\/em><\/p>\n<p><em>The article has been amended to remove a misleading line about Java and C#, and to add some information about multiple threads.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The first post I wrote in the GC series introduced the D garbage collector and the language features that use it. Two key points that I tried to get across in the article were: The GC can only run when memory allocations are requested. 
Contrary to popular misconception, the D GC isn\u2019t generally going to [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[26,24],"tags":[],"_links":{"self":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/888"}],"collection":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/comments?post=888"}],"version-history":[{"count":30,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/888\/revisions"}],"predecessor-version":[{"id":1003,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/888\/revisions\/1003"}],"wp:attachment":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/media?parent=888"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/categories?post=888"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/tags?post=888"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}