{"id":1044,"date":"2017-08-23T13:01:52","date_gmt":"2017-08-23T13:01:52","guid":{"rendered":"http:\/\/dlang.org\/blog\/?p=1044"},"modified":"2021-10-08T11:08:19","modified_gmt":"2021-10-08T11:08:19","slug":"d-as-a-better-c","status":"publish","type":"post","link":"https:\/\/dlang.org\/blog\/2017\/08\/23\/d-as-a-better-c\/","title":{"rendered":"D as a Better C"},"content":{"rendered":"<p><em><a href=\"http:\/\/walterbright.com\/\">Walter Bright<\/a> is the BDFL of the D Programming Language and founder of <a href=\"http:\/\/digitalmars.com\/\">Digital Mars<\/a>. He has decades of experience implementing compilers and interpreters for multiple languages, including Zortech C++, the first native C++ compiler. He also created <a href=\"http:\/\/www.classicempire.com\/\">Empire, the Wargame of the Century<\/a>. This post is <a href=\"https:\/\/dlang.org\/blog\/the-d-and-c-series\/#betterC\">the first in a series<\/a>\u00a0about\u00a0<a href=\"https:\/\/dlang.org\/blog\/2017\/08\/23\/d-as-a-better-c\/\">D\u2019s BetterC mode<\/a><\/em><\/p>\n<hr \/>\n<p><img loading=\"lazy\" class=\"size-full wp-image-181 alignleft\" src=\"http:\/\/dlang.org\/blog\/wp-content\/uploads\/2016\/08\/d6.png\" alt=\"\" width=\"200\" height=\"200\" \/>D was designed from the ground up to interface directly and easily to C, and to a lesser extent C++. This provides access to endless C libraries, the Standard C runtime library, and of course the operating system APIs, which are usually C APIs.<\/p>\n<p>But there&#8217;s much more to C than that. There are large and immensely useful programs written in C, such as the Linux operating system and a very large chunk of the programs written for it. While D programs can interface with C libraries, the reverse isn&#8217;t true. C programs cannot interface with D ones. It&#8217;s not possible (at least not without considerable effort) to compile a couple of D files and link them in to a C program. 
The trouble is that compiled D files refer to things that only exist in the D runtime library, and linking that in (it&#8217;s a bit large) tends to be impractical.<\/p>\n<p>D code also can&#8217;t exist in a program unless D controls the <code>main()<\/code> function, which is how the startup code in the D runtime library is managed. Hence D libraries remain inaccessible to C programs, and chimera programs (a mix of C and D) are not practical. One cannot pragmatically &#8220;try out&#8221; D by adding D modules to an existing C program.<\/p>\n<p>That is, until Better C came along.<\/p>\n<p>It&#8217;s been done before; it&#8217;s an old idea. Bjarne Stroustrup wrote a paper in 1988 entitled &#8220;<a href=\"http:\/\/www.drdobbs.com\/open-source\/a-better-c\/223000087\">A Better C<\/a>&#8221;. His early C++ compiler was able to compile C code pretty much unchanged, and then one could start using C++ features here and there as they made sense, all without disturbing the existing investment in C. This was a brilliant strategy, and drove the early success of C++.<\/p>\n<p>A more modern example is Kotlin, which uses a different method. Kotlin syntax is not compatible with Java, but it is fully interoperable with Java, relies on the existing Java libraries, and allows a gradual migration of Java code to <a href=\"https:\/\/en.wikipedia.org\/wiki\/Kotlin_(programming_language)\">Kotlin<\/a>. Kotlin is indeed a &#8220;Better Java&#8221;, and this shows in its success.<\/p>\n<h3>D as Better C<\/h3>\n<p>D takes a radically different approach to making a better C. It is not an extension of C, it is not a superset of C, and does not bring along C&#8217;s longstanding issues (such as the preprocessor, array overflows, etc.). D&#8217;s solution is to subset the D language, removing or altering features that require the D startup code and runtime library. 
This is, simply, the charter of the <code>-betterC<\/code> compiler switch.<\/p>\n<p>Doesn&#8217;t removing things from D make it no longer D? That&#8217;s a hard question to answer, and it&#8217;s really a matter of individual preference. The vast bulk of the core language remains. Certainly the D characteristics that are analogous to C remain. The result is a language somewhere in between C and D, but that is fully upward compatible with D.<\/p>\n<h3>Removed Things<\/h3>\n<p>Most obviously, the garbage collector is removed, along with the features that depend on the garbage collector. Memory can still be allocated the same way as in C &#8211; using <code>malloc()<\/code> or some custom allocator.<\/p>\n<p>Although C++ classes and COM classes will still work, D polymorphic classes will not, as they rely on the garbage collector.<\/p>\n<p>Exceptions, typeid, static construction\/destruction, RAII, and unittests are removed. But it is possible we can find ways to add them back in.<\/p>\n<p>Asserts are altered to call the C runtime library assert fail functions rather than the D runtime library ones.<\/p>\n<p>(This isn&#8217;t a complete list, for that see <a href=\"http:\/\/dlang.org\/dmd-windows.html#switch-betterC\">http:\/\/dlang.org\/dmd-windows.html#switch-betterC<\/a>.)<\/p>\n<h3>Retained Things<\/h3>\n<p>More importantly, what remains?<\/p>\n<p>What may be initially most important to C programmers is memory safety in the form of array overflow checking, no more stray pointers into expired stack frames, and guaranteed initialization of locals. 
This is followed by what is expected in a modern language &#8212; modules, function overloading, constructors, member functions, Unicode, nested functions, dynamic closures, Compile Time Function Execution, automated documentation generation, highly advanced metaprogramming, and Design by Introspection.<\/p>\n<h3>Footprint<\/h3>\n<p>Consider a C program:<\/p>\n<pre class=\"prettyprint lang-c_cpp\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">#include &lt;stdio.h&gt;\r\n\r\nint main(int argc, char** argv) {\r\n    printf(\"hello world\\n\");\r\n    return 0;\r\n}\r\n<\/pre>\n<p>It compiles to:<\/p>\n<pre class=\"prettyprint lang-assembly_x86\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">_main:\r\npush EAX\r\nmov [ESP],offset FLAT:_DATA\r\ncall near ptr _printf\r\nxor EAX,EAX\r\npop ECX\r\nret\r\n<\/pre>\n<p>The executable size is 23,068 bytes.<\/p>\n<p>Translate it to D:<\/p>\n<pre class=\"prettyprint lang-d\" data-caption=\"\" data-highlight=\"\" data-visibility=\"visible\" data-start-line=\"1\">import core.stdc.stdio;\r\n\r\nextern (C) int main(int argc, char** argv) {\r\n    printf(\"hello world\\n\");\r\n    return 0;\r\n}\r\n<\/pre>\n<p>The executable size is the same, 23,068 bytes. This is unsurprising because the C compiler and D compiler generate the same code, as they share the same code generator. (The equivalent full D program would clock in at 194Kb.) In other words, nothing extra is paid for using D rather than C for the same code.<\/p>\n<p>The <code>Hello World<\/code> program is a little too trivial. Let&#8217;s step up in complexity to the infamous sieve benchmark program:<\/p>\n<pre class=\"prettyprint lang-c_cpp\" data-start-line=\"1\" data-visibility=\"visible\" data-highlight=\"\" data-caption=\"\">#include &lt;stdio.h&gt;\r\n\r\n\/* Eratosthenes Sieve prime number calculation. 
*\/\r\n\r\n#define true    1\r\n#define false   0\r\n#define size    8190\r\n#define sizepl  8191\r\n\r\nchar flags[sizepl];\r\n\r\nint main() {\r\n    int i, prime, k, count, iter;\r\n\r\n    printf (\"10 iterations\\n\");\r\n    for (iter = 1; iter &lt;= 10; iter++) {\r\n        count = 0;\r\n        for (i = 0; i &lt;= size; i++)\r\n            flags[i] = true;\r\n        for (i = 0; i &lt;= size; i++) {\r\n            if (flags[i]) {\r\n                prime = i + i + 3;\r\n                k = i + prime;\r\n                while (k &lt;= size) {\r\n                    flags[k] = false;\r\n                    k += prime;\r\n                }\r\n                count += 1;\r\n            }\r\n        }\r\n    }\r\n    printf (\"\\n%d primes\", count);\r\n    return 0;\r\n}\r\n<\/pre>\n<p>Rewriting it in Better C:<\/p>\n<pre class=\"prettyprint lang-d\" data-caption=\"\" data-highlight=\"\" data-visibility=\"visible\" data-start-line=\"1\">import core.stdc.stdio;\r\n\r\nextern (C):\r\n\r\n__gshared bool[8191] flags;\r\n\r\nint main() {\r\n    int count;\r\n\r\n    printf(\"10 iterations\\n\");\r\n    foreach (iter; 1 .. 11) {\r\n        count = 0;\r\n        flags[] = true;\r\n        foreach (i; 0 .. flags.length) {\r\n            if (flags[i]) {\r\n                const prime = i + i + 3;\r\n                auto k = i + prime;\r\n                while (k &lt; flags.length) {\r\n                    flags[k] = false;\r\n                    k += prime;\r\n                }\r\n                count += 1;\r\n            }\r\n        }\r\n    }\r\n    printf(\"%d primes\\n\", count);\r\n    return 0;\r\n}<\/pre>\n<p>It looks much the same, but some things are worthy of note:<\/p>\n<ul>\n<li><code>extern (C):<\/code> means use the C calling convention.<\/li>\n<li>D normally puts static data into thread local storage. C sticks them in global storage. 
<code>__gshared<\/code> does the same in D.<\/li>\n<li><code>foreach<\/code> is a simpler way of doing for loops over known endpoints.<\/li>\n<li><code>flags[] = true;<\/code> sets all the elements in <code>flags<\/code> to <code>true<\/code> in one go.<\/li>\n<li>Using <code>const<\/code> tells the reader that <code>prime<\/code> never changes once it is initialized.<\/li>\n<li>The types of <code>iter<\/code>, <code>i<\/code>, <code>prime<\/code> and <code>k<\/code> are inferred, preventing inadvertent type coercion errors.<\/li>\n<li>The number of elements in <code>flags<\/code> is given by <code>flags.length<\/code>, not some independent variable.<\/li>\n<\/ul>\n<p>And the last item leads to a very important hidden advantage: accesses to the <code>flags<\/code> array are bounds checked. No more overflow errors! We didn&#8217;t have to do anything in particular to get that, either.<\/p>\n<p>This is only the beginning of how D as Better C can improve the expressivity, readability, and safety of your existing C programs. For example, D has nested functions, which in my experience work very well at prying goto&#8217;s from my cold, dead fingers.<\/p>\n<p>On a more personal note, ever since <code>-betterC<\/code> started working, I&#8217;ve been converting many of my old C programs still in use into D, one function at a time. Doing it one function at a time, and running the test suite after each change, keeps the program in a correctly working state at all times. If the program doesn&#8217;t work, I only have one function to look at to see where it went wrong. 
I don&#8217;t particularly care to maintain C programs anymore, and with <code>-betterC<\/code> there&#8217;s no longer any reason to.<\/p>\n<p>The Better C ability of D is available in the 2.076.0 beta: <a href=\"http:\/\/dlang.org\/download.html#dmd_beta\">download it<\/a> and <a href=\"http:\/\/dlang.org\/changelog\/2.076.0.html\">read the changelog<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are large and immensely useful programs written in C, such as the Linux operating system and a very large chunk of the programs written for it. While D programs can interface with C libraries, the reverse isn&#8217;t true. C programs cannot interface with D ones. It&#8217;s not possible (at least not without considerable effort) to compile a couple of D files and link them in to a C program. The trouble is that compiled D files refer to things that only exist in the D runtime library, and linking that in (it&#8217;s a bit large) tends to be impractical.<\/p>\n<p>That is, until Better C came 
along.<\/p>\n","protected":false},"author":15,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[31,26,8,20],"tags":[],"_links":{"self":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1044"}],"collection":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/users\/15"}],"replies":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/comments?post=1044"}],"version-history":[{"count":12,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1044\/revisions"}],"predecessor-version":[{"id":1597,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1044\/revisions\/1597"}],"wp:attachment":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/media?parent=1044"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/categories?post=1044"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/tags?post=1044"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}