{"id":723,"date":"2017-04-10T12:59:16","date_gmt":"2017-04-10T12:59:16","guid":{"rendered":"http:\/\/dlang.org\/blog\/?p=723"},"modified":"2021-10-08T11:11:09","modified_gmt":"2021-10-08T11:11:09","slug":"the-new-ctfe-engine","status":"publish","type":"post","link":"https:\/\/dlang.org\/blog\/2017\/04\/10\/the-new-ctfe-engine\/","title":{"rendered":"The New CTFE Engine"},"content":{"rendered":"<p><em>Stefan Koch is the maintainer of <a href=\"https:\/\/github.com\/UplinkCoder\/sqlite-d\">sqlite-d<\/a>, a native D <a href=\"https:\/\/www.sqlite.org\">sqlite<\/a> reader, and has contributed to projects like <a href=\"https:\/\/github.com\/SDC-Developers\/SDC\">SDC<\/a> (the\u00a0Stupid D Compiler) and <a href=\"http:\/\/vibed.org\">vibe.d<\/a>. He was also responsible for a 10% performance improvement in D&#8217;s current CTFE implementation and is currently writing a new CTFE engine, the subject of this post.<\/em><\/p>\n<hr \/>\n<p>For the past nine months, I&#8217;ve been working on a project called NewCTFE, a reimplementation of the <a href=\"https:\/\/dlang.org\/spec\/function.html#interpretation\">Compile-Time Function Evaluation (CTFE)<\/a> feature of the D compiler front-end. CTFE is considered one of the game-changing features of D.<\/p>\n<p>As the name implies, CTFE allows certain functions to be evaluated by the compiler while it is compiling the source code in which the functions are implemented. As long as all arguments to a function are available at compile time and the function is pure (has no side effects), then the function qualifies for CTFE. The compiler will replace the function call with the result.<\/p>\n<p>Since this is an integral part of the language, pure functions can be evaluated anywhere a compile-time constant may go. A simple example can be found in the <a href=\"https:\/\/dlang.org\/phobos\/index.html\">standard library<\/a> module, <a href=\"https:\/\/dlang.org\/phobos\/std_uri.html\"><code>std.uri<\/code><\/a>, where CTFE is used to compute a lookup table. It <a href=\"https:\/\/github.com\/dlang\/Phobos\/blob\/master\/std\/uri.d\">looks like<\/a> this:<\/p>\n<pre class=\"prettyprint lang-d\">private immutable ubyte[128] uri_flags = \/\/ indexed by character\r\n({\r\n\r\n    ubyte[128] uflags;\r\n\r\n    \/\/ Compile time initialize\r\n    uflags['#'] |= URI_Hash;\r\n\r\n    foreach (c; 'A' .. 'Z' + 1)\r\n    {\r\n        uflags[c] |= URI_Alpha;\r\n        uflags[c + 0x20] |= URI_Alpha; \/\/ lowercase letters\r\n\r\n    }\r\n\r\n    foreach (c; '0' .. '9' + 1) uflags[c] |= URI_Digit;\r\n\r\n    foreach (c; \";\/?:@&amp;=+$,\") uflags[c] |= URI_Reserved;\r\n\r\n    foreach (c; \"-_.!~*'()\") uflags[c] |= URI_Mark;\r\n\r\n    return uflags;\r\n\r\n})();<\/pre>\n<p>Instead of populating the table with magic values, a simple expressive <a href=\"https:\/\/dlang.org\/spec\/expression.html#FunctionLiteral\">function literal<\/a> is used. This is much easier to understand and debug than some opaque static array. The <code>({<\/code> starts a function-literal, the <code>})<\/code> closes it. The <code>()<\/code> at the end tells the compiler to immediately evoke that literal such that <code>uri_flags<\/code> becomes the result of the literal.<\/p>\n<p>Functions are only evaluated at compile time if they need to be. <code>uri_flags<\/code> in the snippet above is declared in module scope. When a module-scope variable is initialized in this manner, the initializer must be available at compile time. In this case, since the initializer is a function literal, an attempt will be made to perform CTFE. This particular literal has no arguments and is pure, so the attempt succeeds.<\/p>\n<p>For a more in-depth discussion of CTFE, see <a href=\"https:\/\/wiki.dlang.org\/User:Quickfur\/Compile-time_vs._compile-time\">this article<\/a> by H. S. Teoh at the D Wiki.<\/p>\n<p>Of course, the same technique can be applied to more complicated problems as well; <code>std.regex<\/code>, for example, can build a specialized automaton for a regex at compile time using CTFE. However, as soon as <a href=\"https:\/\/dlang.org\/phobos\/std_regex.html\"><code>std.regex<\/code><\/a> is used with CTFE for non-trivial patterns, compile times can become extremely high (in D everything that takes longer than a second to compile is bloat-ware :)). Eventually, as patterns get more complex, the compiler will run out of memory and probably take the whole system down with it.<\/p>\n<p>The blame for this can be laid at the feet of the current CTFE interpreter&#8217;s architecture. It&#8217;s an AST interpreter, which means that it interprets the AST while traversing it. To represent the result of interpreted expressions, it uses DMD&#8217;s AST node classes. This means that every expression encountered will allocate one or more AST nodes. Within a tight loop, the interpreter can easily generate over <code>100_000_000<\/code> nodes and eat a few gigabytes of RAM. That can exhaust memory quite quickly.<\/p>\n<p><a href=\"https:\/\/issues.dlang.org\/show_bug.cgi?id=12844\">Issue 12844<\/a> complains about <code>std.regex<\/code> taking more than 16GB of RAM. For one pattern. Then there&#8217;s <a href=\"https:\/\/issues.dlang.org\/show_bug.cgi?id=6498\">issue 6498<\/a>, which executes a simple <code>0<\/code> to <code>10_000_000<\/code> while loop via CTFE and runs out of memory.<\/p>\n<p>Simply freeing nodes doesn&#8217;t fix the problem; we don\u2019t know which nodes to free and enabling the garbage collector makes the whole compiler brutally slow. Luckily there is another approach which doesn&#8217;t allocate for every expression encountered. It involves compiling the function to a virtual ISA (Instruction Set Architecture). This virtual ISA, also known as bytecode, is then given to a dedicated interpreter for that ISA (in the case in which a virtual ISA happens to be the same as the ISA of the host, we call it a JIT (Just in Time) interpreter).<\/p>\n<p>The NewCTFE project concerns itself with implementing such a bytecode interpreter. Writing the actual interpreter (a CPU emulator for a virtual CPU\/ISA) is reasonably simple. However, compiling code to a virtual ISA is exactly as much work as compiling it to a real ISA (though, a virtual ISA has the added benefit that it can be extended for customized needs, but that makes it harder to do JIT later). That&#8217;s why it took a month just to get the first simple examples running on the new CTFE engine, and why slightly more complicated ones still aren&#8217;t running even after 9 months of development. At the end of the post, you&#8217;ll find an approximate timeline of the work done so far.<\/p>\n<p>I&#8217;ll be giving <a href=\"http:\/\/dconf.org\/2017\/talks\/koch.html\">a presentation<\/a> at <a href=\"http:\/\/dconf.org\/2017\/index.html\">DConf 2017<\/a>, where I&#8217;ll discuss my experience implementing the engine and explain some of the technical details, particularly regarding the trade-offs and design choices I&#8217;ve made. The current estimation is that the 1.0 goals will not be met by then, but I&#8217;ll keep coding away until it&#8217;s done.<\/p>\n<p>Those interested in keeping up with my progress can follow my status updates <a href=\"http:\/\/forum.dlang.org\/thread\/btqjnieachntljobzrho@forum.dlang.org\">in the D forums<\/a>. At some point in the future, I will write another article on some of the technical details of the implementation. In the meantime, I hope the following listing does shed some light on how much work it is to implement NewCTFE \ud83d\ude42<\/p>\n<ul>\n<li><strong>May 9th 2016<\/strong><br \/>\nAnnouncement of the plan to improve CTFE.<\/li>\n<li><strong>May 27th 2016<\/strong><br \/>\nAnnouncement that work on the new engine has begun.<\/li>\n<li><strong>May 28th 2016<\/strong><br \/>\nSimple memory management change failed.<\/li>\n<li><strong>June 3rd 2016<\/strong><br \/>\nDecision to implement a bytecode interpreter.<\/li>\n<li><strong>June 30th 2016<\/strong><br \/>\nFirst code (taken from <a href=\"https:\/\/issues.dlang.org\/show_bug.cgi?id=6498\">issue 6498<\/a>) consisting of simple integer arithmetic runs.<\/li>\n<li><strong>July 14th 2016<\/strong><br \/>\nASCII string indexing works.<\/li>\n<li><strong>July 15th 2016<\/strong><br \/>\nInitial <code>struct<\/code> support<\/li>\n<li><strong>Sometime between July and August<\/strong><br \/>\nFirst switches work.<\/li>\n<li><strong>August 17th 2016<\/strong><br \/>\nSupport for the special cases <code>if(__ctfe)<\/code> and <code>if(!__ctfe)<\/code><\/li>\n<li><strong>Sometime between August and September<\/strong><br \/>\nTernary expressions are supported<\/li>\n<li><strong>September 08th 2016<\/strong><br \/>\nFirst Phobos unit tests pass.<\/li>\n<li><strong>September 25th 2016<\/strong><br \/>\nSupport for returning strings and ternary expressions.<\/li>\n<li><strong>October 16th 2016<\/strong><br \/>\nFirst (almost working) version of the LLVM backend.<\/li>\n<li><strong>October 30th 2016<\/strong><br \/>\nFirst failed attempts to support function calls.<\/li>\n<li><strong>November 01st<\/strong><br \/>\nDRuntime unit tests pass for the first time.<\/li>\n<li><strong>November 10th 2016<\/strong><br \/>\nFailed attempt to implement string concatenation.<\/li>\n<li><strong>November 14th 2016<\/strong><br \/>\nArray expansion, e.g. assignment to the <code>length<\/code> property, is supported.<\/li>\n<li><strong>November 14th 2016<\/strong><br \/>\nAssignment of array indexes is supported.<\/li>\n<li><strong>November 18th 2016<\/strong><br \/>\nSupport for\u00a0arrays as function parameters.<\/li>\n<li><strong>November 19th 2016<\/strong><br \/>\nPerformance fixes.<\/li>\n<li><strong>November 20th 2016<\/strong><br \/>\nFixing the broken <code>while(true) \/ for (;;)<\/code> loops; they can now be broken out of \ud83d\ude42<\/li>\n<li><strong>November 25th 2016<\/strong><br \/>\nFixes to <code>goto<\/code> and switch handling.<\/li>\n<li><strong>November 29th 2016<\/strong><br \/>\nFixes to <code>continue<\/code> and <code>break<\/code> handling.<\/li>\n<li><strong>November 30th 2016<\/strong><br \/>\nInitial support for <code>assert<\/code><\/li>\n<li><strong>December 02nd 2016<\/strong><br \/>\nBailout on <code>void<\/code>-initialized values (since they can lead to undefined behavior).<\/li>\n<li><strong>December 03rd 2016<\/strong><br \/>\nInitial support for returning <code>struct<\/code> literals.<\/li>\n<li><strong>December 05th 2016<\/strong><br \/>\nPerformance fix to the bytecode generator.<\/li>\n<li><strong>December 07th 2016<\/strong><br \/>\nFixes to <code>continue<\/code> and <code>break<\/code> in <code>for<\/code> statements (<code>continue<\/code> <em>must not skip<\/em> the increment step)<\/li>\n<li><strong>December 08th 2016<\/strong><br \/>\nArray literals with variables inside are now supported: <code>[1, n, 3]<\/code><\/li>\n<li><strong>December 08th 2016<\/strong><br \/>\nFixed a bug in <code>switch<\/code> statements.<\/li>\n<li><strong>December 10th 2016<\/strong><br \/>\nFixed a nasty evaluation order bug.<\/li>\n<li><strong>December 13th 2016<\/strong><br \/>\nSome progress on function calls.<\/li>\n<li><strong>December 14th 2016<\/strong><br \/>\nInitial support for strings\u00a0in\u00a0switches.<\/li>\n<li><strong>December 15th 2016<\/strong><br \/>\nAssignment of static arrays is now supported.<\/li>\n<li><strong>December 17th 2016<\/strong><br \/>\nFixing <code>goto<\/code> statements (we were ignoring the last <code>goto<\/code> to any label :)).<\/li>\n<li><strong>December 17th 2016<\/strong><br \/>\nDe-macrofied string-equals.<\/li>\n<li><strong>December 20th 2016<\/strong><br \/>\nImplement check to guard against dereferencing null pointers (yes\u2026 that one was oh so fun).<\/li>\n<li><strong>December 22ed 2016<\/strong><br \/>\nInitial support for pointers.<\/li>\n<li><strong>December 25th 2016<\/strong><br \/>\n<code>static immutable<\/code> variables can now be accessed (yes the result is recomputed \u2026 who cares).<\/li>\n<li><strong>January 02nd 2017<\/strong><br \/>\nFirst Function calls are supported !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!<\/li>\n<li><strong>January 17th 2017<\/strong><br \/>\nRecursive function calls work now \ud83d\ude42<\/li>\n<li><strong>January 23rd 2017<\/strong><br \/>\nThe <strong>interpret3.d<\/strong> unit-test passes.<\/li>\n<li><strong>January 24th 2017<\/strong><br \/>\nWe are green on 64bit!<\/li>\n<li><strong>January 25th 2017<\/strong><br \/>\nGreen on all platforms !!!!! (with blacklisting though)<\/li>\n<li><strong>January 26th 2017<\/strong><br \/>\nFixed special case <code>cast(void*) size_t.max<\/code> (this one cannot go through the normal pointer support, which assumes that you have something valid to dereference).<\/li>\n<li><strong>January 26th 2017<\/strong><br \/>\nMember function calls are supported!<\/li>\n<li><strong>January 31st 2017<\/strong><br \/>\nFixed a bug in <code>switch<\/code> handling.<\/li>\n<li><strong>February 03rd 2017<\/strong><br \/>\nInitial function pointer support.<\/li>\n<li><strong>Rest of Feburary 2017<\/strong><br \/>\nWild goose chase for <a href=\"https:\/\/issues.dlang.org\/show_bug.cgi?id=17220\">issue #17220<\/a><\/li>\n<li><strong>March 11th 2017<\/strong><br \/>\nInitial support for slices.<\/li>\n<li><strong>March 15th 2017<\/strong><br \/>\nString slicing works.<\/li>\n<li><strong>March 18th 2017<\/strong><br \/>\n<code>$<\/code> in slice expressions is now supported.<\/li>\n<li><strong>March 19th 2017<\/strong><br \/>\nThe concatenation operator (<code>c = a ~ b<\/code>) works.<\/li>\n<li><strong>March 22ed 2017<\/strong><br \/>\nFixed a <code>switch<\/code> fallthrough bug.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Stefan Koch is the maintainer of sqlite-d, a native D sqlite reader, and has contributed to projects like SDC (the\u00a0Stupid D Compiler) and vibe.d. He was also responsible for a 10% performance improvement in D&#8217;s current CTFE implementation and is currently writing a new CTFE engine, the subject of this post. For the past nine [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[12,8,9],"tags":[],"_links":{"self":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/723"}],"collection":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/comments?post=723"}],"version-history":[{"count":7,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/723\/revisions"}],"predecessor-version":[{"id":730,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/723\/revisions\/730"}],"wp:attachment":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/media?parent=723"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/categories?post=723"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/tags?post=723"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}