{"id":1604,"date":"2018-06-20T13:13:23","date_gmt":"2018-06-20T13:13:23","guid":{"rendered":"http:\/\/dlang.org\/blog\/?p=1604"},"modified":"2021-10-08T11:01:39","modified_gmt":"2021-10-08T11:01:39","slug":"how-an-engineering-company-chose-to-migrate-to-d","status":"publish","type":"post","link":"https:\/\/dlang.org\/blog\/2018\/06\/20\/how-an-engineering-company-chose-to-migrate-to-d\/","title":{"rendered":"How an Engineering Company Chose to Migrate to D"},"content":{"rendered":"<p><em>Bastiaan Veelo is the lead developer of a specialised program for the<br \/>\ncomputer aided geometric design of ship hulls called Fairway, for the<br \/>\ncompany SARC in the Netherlands.<\/em><\/p>\n<hr \/>\n<p><img loading=\"lazy\" class=\"alignleft size-full wp-image-181\" src=\"http:\/\/dlang.org\/blog\/wp-content\/uploads\/2016\/08\/d6.png\" alt=\"\" width=\"200\" height=\"200\" \/>Imagine there is this little-known programming language in which you enjoy programming in your free time. You know it is ready for prime time and you dream about using it at work every day. This is the story about how I made a dream like that come true.<\/p>\n<h2 id=\"myearlyacquaintancewithd\">My early acquaintance with D<\/h2>\n<p>Back when \u201cgoogle\u201d was not yet a common verb, I was doing a web search for \u201cparsing C++\u201d. The reason was that writing a report for an assignment had derailed into writing a syntax highlighter <a href=\"https:\/\/en.wikipedia.org\/wiki\/Noweb\">for noweb<\/a> <a href=\"https:\/\/en.wikipedia.org\/wiki\/GNU_bison\">using bison<\/a> <a href=\"https:\/\/en.wikipedia.org\/wiki\/Flex_(lexical_analyser_generator)\">and flex<\/a>, and I found out firsthand that C++ is not easy to parse. 
That web search <a href=\"https:\/\/web.archive.org\/web\/20011005113647\/http:\/\/www.digitalmars.com:80\/d\/overview.html\">brought up this page<\/a> (<a href=\"https:\/\/dlang.org\/overview.html#for_d\">present version<\/a>) with an overview of the D Programming Language, and the following statement has had me hooked ever since:<\/p>\n<blockquote><p>D\u2019s lexical analyzer and parser are totally independent of each other and of the semantic analyzer. This means it is easy to write simple tools to manipulate D source perfectly without having to build a full compiler. It also means that source code can be transmitted in tokenized form for specialized applications.<\/p><\/blockquote>\n<p>\u201cGenius,\u201d I thought, \u201chere we have someone who knows what he\u2019s doing.\u201d This is representative of the pragmatic professionalism that still radiates from the D community, and it combines with an unpretentious flair that makes it pleasant to be around. This funny quote <a href=\"https:\/\/web.archive.org\/web\/20011015103847\/http:\/\/www.digitalmars.com:80\/d\/\">decorated its homepage<\/a> for many years:<\/p>\n<blockquote><p>\u201cGreat, just what I need.. another D in programming.\u201d \u2013 Segfault<\/p><\/blockquote>\n<p>Nevertheless, I didn\u2019t have many opportunities to use the language and I largely remained sitting on the fence, observing its development.<\/p>\n<h2 id=\"programmingprofessionally\">Programming professionally<\/h2>\n<p>With mostly academic programming experience, I started programming professionally in 2006 for <a href=\"https:\/\/www.sarc.nl\/\">SARC, a Dutch engineering company<\/a> serving the maritime industry. Since the early \u201980s they have been developing software for ship design and onboard loading calculations, which today amounts to roughly half a million lines of code. 
I think their success can partly be attributed to their choice of programming language: Extended Pascal (<a href=\"http:\/\/pascal-central.com\/docs\/iso10206.pdf\">the ISO 10206 standard<\/a>, not one of the many proprietary extensions of Pascal).<\/p>\n<p>Extended Pascal was a great improvement over ISO 7185 Pascal. Its compiler, by Prospero Software from England, was fast and well documented. The language is small enough and its syntax appropriately verbose to make engineering professionals quickly productive in programming. Personally though, I spent most of my time programming in C++, modernizing their <a href=\"https:\/\/www.sarc.nl\/fairway\/\">system for computer aided design of ship hulls<\/a> <a href=\"https:\/\/www.qt.io\/\">using Qt<\/a> <a href=\"https:\/\/bitbucket.org\/Coin3D\/coin\/wiki\/Home\">and Coin3D<\/a>.<\/p>\n<h2 id=\"whenyourcompanyoutlivesaprogramminglanguage\">When your company outlives a programming language<\/h2>\n<p>Although selecting an ISO standard in favor of a proprietary Pascal dialect seemed wise at the time, it is apparent now that the company has outlived the language. Prospero Development Software Ltd was officially dissolved 15 years ago. Still, its former director, Tony Hetherington, continued giving support many years after, but he\u2019d be close to 86 years old now and can no longer be reached. Its website is gone, <a href=\"https:\/\/web.archive.org\/web\/20131023234615\/http:\/\/www.prosperosoftware.com:80\/\">last archived in 2013<\/a>. <a href=\"http:\/\/www.gnu-pascal.de\/gpc\/h-index.html\">There\u2019s GNU Pascal<\/a>, which also supports ISO 10206, but that project has stopped moving and long ago lost synchrony with gcc. 
Although there is no immediate crisis, it is clear that something needs to happen sometime if the company wants to continue its activities in the coming decades.<\/p>\n<h2 id=\"changingtheodds\">Changing the odds<\/h2>\n<p>A couple of years ago, I secretly started playing with the fantasy of replacing Extended Pascal with D. Even though D\u2019s syntax is somewhat different from Pascal, it shares at least four important similarities: support for nested functions, boundary checking, modules, and compilation speed. In addition, it has many traits that make the language attractive to engineers: good focus on performance and numerics, garbage collection, dynamic arrays, easy parallelization, understandable templates, contract programming, memory safety, unit tests, and even wysiwyg strings and formatted numerals. D\u2019s language features encourage experimentation, which resonates well with engineers.<\/p>\n<p>So I wondered what I could do to highlight D\u2019s significance to my employer and show it\u2019s an attractive language to switch to. I thought I could make a compelling case if I could write a parser in D that would take Extended Pascal source and transpile it to D source. At least I would have fun trying!<\/p>\n<p>So I went over to <a href=\"https:\/\/code.dlang.org\/\">code.dlang.org<\/a> to see if there were any D alternatives to flex and bison. There, <a href=\"https:\/\/code.dlang.org\/packages\/pegged\">I found Pegged<\/a>, and instantly the fun began. Pegged combines the functionality of flex and bison in one incredibly easy to use package, for which its creator Philippe Sigaud obviously <a href=\"https:\/\/github.com\/PhilippeSigaud\/Pegged\/wiki\/Pegged-Tutorial\">enjoyed writing excellent documentation<\/a>. Nowadays, Pegged is part of <a href=\"https:\/\/tour.dlang.org\/\">the D language tour<\/a> and you can <a href=\"https:\/\/tour.dlang.org\/tour\/en\/dub\/pegged\">try it out on-line<\/a> without having to install a thing. 
The beauty is that the grammar from the Extended Pascal <a href=\"http:\/\/pascal-central.com\/docs\/iso10206.pdf\">language specification<\/a> maps almost linearly to <a href=\"https:\/\/github.com\/veelo\/Pascal2D\/blob\/master\/source\/epgrammar.d\">the PEG from which<\/a> Pegged generates the parser. For this it makes heavy use of D\u2019s generic programming capabilities and compile-time function evaluation: it can generate a parser at compile time if you want it to!<\/p>\n<p>However, it wasn\u2019t smooth sailing all along. As I was testing D, I suddenly found <em>myself<\/em> being tested as well. I learned the hard way that there is <a href=\"https:\/\/en.wikipedia.org\/wiki\/Left_recursion\">a phenomenon called left-recursion<\/a>, from which a PEG parser typically cannot break out. And the Extended Pascal grammar is left-recursive in several ways. Consequently, I spent many evenings and weekends researching parsing theory, until eventually I managed to extend Pegged with <a href=\"https:\/\/github.com\/PhilippeSigaud\/Pegged\/wiki\/Left-Recursion\">support for all kinds of left-recursion<\/a>! From one thing came another, and I added <a href=\"https:\/\/github.com\/PhilippeSigaud\/Pegged\/wiki\/Extended-PEG-Syntax#longest-match-alternation\">longest match alternation<\/a>, <a href=\"https:\/\/github.com\/PhilippeSigaud\/Pegged\/wiki\/Grammar-Debugging\">case insensitive literals<\/a>, the <a href=\"https:\/\/github.com\/PhilippeSigaud\/Pegged\/wiki\/Parse-Result\"><code>toHTML()<\/code> method<\/a> for dynamically <a href=\"https:\/\/cdn.rawgit.com\/PhilippeSigaud\/Pegged\/ade2aa5d\/pegged\/examples\/extended_pascal\/example.html\">browsing the syntax tree<\/a>, and a tracer for <a href=\"https:\/\/github.com\/PhilippeSigaud\/Pegged\/wiki\/Grammar-Debugging\">logging the parsing process<\/a>.<\/p>\n<p>Obviously, I was having fun. 
But more importantly, I was demonstrating that the D programming language is accessible enough that a naval architect can understand other people\u2019s code and expand it in non-trivial ways. The icing on the cake came when I was asked to present my experiences at <a href=\"http:\/\/dconf.org\/2017\/talks\/veelo.html\">DConf 2017 in Berlin<\/a>, which <a href=\"https:\/\/youtu.be\/t5y9dVMdI7I\">you can watch here<\/a> (and <a href=\"https:\/\/youtu.be\/3ugQ1FFGkLY\">here\u2019s the extra bit I presented at lunch time<\/a> for the livestream audience).<\/p>\n<p>At this time, I was able to automatically translate the following trivial example:<\/p>\n<pre class=\"prettyprint lang-pascal\">program hello(output);\r\n\r\nbegin\r\n    writeln('Hello D''s \"World\"!');\r\nend.<\/pre>\n<p>into D:<\/p>\n<pre class=\"prettyprint lang-d\">import std.stdio;\r\n\r\n\/\/ Program name: hello\r\nvoid main(string[] args)\r\n{\r\n    writeln(\"Hello D's \\\"World\\\"!\");\r\n}<\/pre>\n<h1 id=\"languagecompetition\">Language competition<\/h1>\n<p>Having come this far, <a href=\"https:\/\/www.sarc.nl\/\">the founder of SARC<\/a> agreed that it was time to investigate the merits of various alternative programming languages. We would do a thorough and objective comparison based on trial translations of a comprehensive set of language features. Due to the amount of manual labor that this requires, we had to drastically prune the space of programming languages in an initial review round. Note that what I am about to present does <em>not<\/em> declare which programming language is the best in our industry. What <em>we<\/em> are looking for is a language that allows an efficient transition from Extended Pascal without interrupting our business, and which enables us to take advantage of modern insights and tools.<\/p>\n<p>In the initial review round we looked at general language characteristics. 
Here I\u2019ll just highlight what fell through the sieve and why.<\/p>\n<p>Performance is important to us, which is why we did not consider interpreted languages. C++ is in use for one component of our software, but that was written from the ground up. We feel that the <a href=\"http:\/\/bulletin.iis.nsk.su\/files\/article\/markin.pdf\">options for translation<\/a> are not favorable, that its long compile times are a serious hindrance to productivity, and that there are too many ways in which one can shoot one\u2019s self in the foot. We cannot require our expert naval architects to also become experts in C++.<\/p>\n<p>Nowadays, whenever D is publicly evaluated, the younger languages Go and Rust are often brought up as alternatives. Here, we need not go into an in-depth comparison of these languages because both Rust and Go lack one feature that we rely on heavily: nested functions with access to variables in their enclosing scope. Solutions for eliminating nested functions, like bringing them into global scope and passing extra variables, or breaking files up into smaller modules, we find unattractive because it would complicate automated translation, and we\u2019d like to preserve the structure and style of our current code. <a href=\"https:\/\/gcc.gnu.org\/onlinedocs\/gcc\/Nested-Functions.html\">GNU C does offer nested functions<\/a>, but it is a non-standard extension and it has been predicted that many will <a href=\"https:\/\/www.youtube.com\/watch?v=Lo6Q2vB9AAg&amp;feature=youtu.be&amp;t=24m37s\">move away from C due to its unsafe features<\/a>. After this initial pruning, three languages remained on our shortlist: <strong>Free Pascal<\/strong>, <strong>Ada<\/strong> and <strong>D<\/strong>.<\/p>\n<p>As a basis for our detailed comparison, we wrote fifteen small programs that each used a specific feature of Extended Pascal that is important in our current code base. We then translated those programs into each language on our shortlist. 
We kept a simple score board on how well these features were represented in each language: +1 if the feature is supported or can be implemented, 0 if the lack of the feature can be worked around, and -1 if it can\u2019t. This is what came out of that evaluation:<\/p>\n<table>\n<colgroup>\n<col \/>\n<col style=\"text-align: center;\" \/>\n<col style=\"text-align: center;\" \/>\n<col style=\"text-align: center;\" \/> <\/colgroup>\n<thead>\n<tr>\n<th>Test<\/th>\n<th style=\"text-align: center;\">Free Pascal<\/th>\n<th style=\"text-align: center;\">Ada<\/th>\n<th style=\"text-align: center;\">D<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Arrays beginning at arbitrary indices<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Sets<\/td>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Schema types<\/td>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Types with custom initial values<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #ff0000;\">-1<\/span><\/td>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Classes<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Casts<\/td>\n<td style=\"text-align: center;\"><span style=\"color: 
#008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Protection against use of dangling pointers<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #ff0000;\">-1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Thread safe memory [de]allocation<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Calling into Windows API<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Forwarding Windows callbacks to nested functions<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Speed of calculations<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Calling procedures written in assembly<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\"><span 
style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Calling procedures in a DLL<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Binary compatibility of strings<\/td>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<td style=\"text-align: center;\"><span style=\"color: #008000;\">+1<\/span><\/td>\n<\/tr>\n<tr>\n<td>Binary compatible file i\/o<\/td>\n<td style=\"text-align: center;\"><span style=\"color: #ff0000;\">-1<\/span><\/td>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\">0<\/td>\n<\/tr>\n<tr>\n<td><strong>Score<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>6<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>10<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>14<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>So, Free Pascal is the only candidate with negative scores, Ada positions itself in the middle, and D achieves an almost perfect score. Not effortlessly, though; we\u2019ll talk about some of the technical challenges later. Because Free Pascal is, like D, fully Open Source and written in itself, extending the language and filling in the gaps is theoretically possible. Although some of its deficiencies could certainly be resolved that way, others would be quite complicated and\/or unlikely to be accepted upstream.<\/p>\n<p>We also estimated the productivity of the languages. Free Pascal scored high because it is closest to what we are used to. Despite its dissimilar syntax, D scored high because of its expressiveness and flexibility. Ada scored lowest because of its rigidity and because of the extra work the programmer has to put in (most importantly casts and conversions). 
Ada is more verbose than Pascal which was disliked by some of us because it can somewhat obscure the essence of what a piece of code tries to express, and frequently the code became not only verbose but cryptic, which was unanimously disliked.<\/p>\n<p>Third, we estimated the future prospects and the advantages each language could bring to the table. Although Free Pascal has a more active community than we expected it to have, we do not see great potential for growth. Ada is renowned for its support for writing reliable code (although it has no monopoly in that field) but it does come at a cost and requires real effort. D has a dynamic and open community, supports both script-like productivity and high performance, includes various features for writing reliable software (approaching Ada but at a much lower cost), and offers some unique advanced features with which wonders can be accomplished.<\/p>\n<p>Finally, we estimated the effort of translation. Although Free Pascal is very similar to Extended Pascal, missing features pose a real problem and would require a high degree of manual translation and rewriting. <a href=\"http:\/\/p2ada.sourceforge.net\/\">Although <code>p2ada<\/code> exists<\/a>, it only works partially in our case and does not fully support Extended Pascal. Because Ada frequently requires additional code (casting to the correct type, pulling in a package, instantiating a generic type, adding a pragma, splitting up <code>Put_Line<\/code>s etc.), writing or extending a reliable transpiler into Ada would be more difficult than doing the same into D.<\/p>\n<h1 id=\"selectingawinner\">Selecting a winner<\/h1>\n<p>I gave away the winner in the title, but we landed at that conclusion as follows. Ada was the first language to be dropped. We really felt that the extra work that the programmer has to put in is a brake on productivity and creativity. 
Although it barely played a role in our evaluation, the difference between the Ada and D equivalents to the <a href=\"https:\/\/www.fluentcpp.com\/2017\/09\/25\/expressive-cpp17-coding-challenge\/\">Expressive C++17 Challenge<\/a> is illustrative. The <a href=\"https:\/\/seb.wilzba.ch\/b\/2018\/02\/the-expressive-c17-coding-challenge-in-d\/\">D solution<\/a> is both concise and expressive; the <a href=\"https:\/\/two-wrongs.com\/expressive-ada-2012-challenge\">Ada solution<\/a> is hardly expressive and consists of more lines than I want to write or read. Also of secondary importance, but difficult to ignore, is the difference between the communities surrounding the languages, which in Ada\u2019s case is AdaCore Support, who has no problems demanding annual five-figure subscription fees.<\/p>\n<p>Although akin to our current language, Free Pascal was mainly dropped due to its porting challenges and our estimation that its potential is lower and its future outlook is less optimistic than that of D. If we were to choose Free Pascal, we would basically invest a lot of effort only to arrive at a technological solution that we felt would be of lower quality than we currently have.<\/p>\n<p>And that\u2019s where I saw a dream come true: A clap on the table by the company founder and it was decided to commit to the effort of bringing twenty-five years\u2019 worth of Extended Pascal code to D!<\/p>\n<h1 id=\"whatmakesadifference\">What makes a difference<\/h1>\n<p>In short, my experience is that if a feature is not present in the language, D is powerful enough that the feature can be implemented in a library. Translating each sample program by hand has really helped to focus on replicating functionality, leaving the translation process for later concern. This has led to <a href=\"https:\/\/veelo.github.io\/Pascal2D\/\">writing a compatibility library<\/a> with types and functions that are vital for the conversion. 
Now that equivalents are known and the parser is done, I just have to implement code generation.<\/p>\n<p>Below follows another example that currently translates automatically and executes identically. It iterates over a fixed-length array running from <code>2<\/code> to <code>20<\/code> inclusive, fills it with values, prints the memory footprint and writes it to a binary file:<\/p>\n<pre class=\"prettyprint lang-pascal\">program arraybase(input,output);\r\n\r\ntype t = array[2..20] of integer;\r\nvar a : t;\r\n    n : integer;\r\n    f : bindable file of t;\r\n\r\nbegin\r\n  for n := 2 to 20 do\r\n    a[n] := n;\r\n  writeln('Size of t in bytes is ',sizeof(a):1); { 76 }\r\n  if openwrite(f,'array.dat') then\r\n    begin\r\n      write(f,a);\r\n      close(f);\r\n    end;\r\nend.<\/pre>\n<p>Transpiled to D (or should I say Dascal?) and <a href=\"https:\/\/code.dlang.org\/packages\/dfmt\">post-processed by dfmt<\/a> to fix up formatting:<\/p>\n<pre class=\"prettyprint lang-d\">import epcompat;\r\nimport std.stdio;\r\n\r\n\/\/ Program name: arraybase\r\nalias t = StaticArray!(int, 2, 20);\r\n\r\nt a;\r\nint n;\r\nBindable!t f;\r\n\r\nvoid main(string[] args)\r\n{\r\n    for (n = 2; n &lt;= 20; n++)\r\n        a[n] = n;\r\n    writeln(\"Size of t in bytes is \", a.sizeof); \/\/ 76\r\n    if (openwrite(f, \"array.dat\"))\r\n    {\r\n        epcompat.write(f, a);\r\n        close(f);\r\n    }\r\n}<\/pre>\n<p>Of course this is by no means idiomatic D, but the fact that it is recognizable and readable is nice, especially for my colleagues who will have to go through an unusual transition. By the way, did you notice that code comments are preserved?<\/p>\n<p>One <em>very-nice-to-have<\/em> feature is binary file compatibility; in fact, it may have been the killer feature, without which D might not have been so victorious. 
The case is that whenever a persistent data structure is extended in our software, we make sure that we can still read and convert that structure from its prior format. That way, if a client pulls out an old design from its archives and runs it through our current software, it will still work without the user even being aware that conversion occurs, possibly in multiple steps. Not having to give up that ability is very attractive.<\/p>\n<p>But it wasn\u2019t easy to get there. The main difficulty is the difference in how strings are represented in D and the Prospero implementation of Extended Pascal, in memory and on file. This presented the challenge of how to preserve binary compatibility in file I\/O with data structures that contain string members.<\/p>\n<h2 id=\"strings\">Strings<\/h2>\n<p>In Prospero Extended Pascal, strings are implemented as a schema type, which is a parameterized type that can be used in the following ways:<\/p>\n<pre class=\"prettyprint lang-pascal\">type string80 = string(80);\r\nvar str1 : string80;\r\n    str2 : string(60);\r\nprocedure foo(s : string);<\/pre>\n<p>This defines <code>string80<\/code> to be an alias for a string type discriminated to have a capacity of 80 characters. Discriminated string variables, like <code>str1<\/code> and <code>str2<\/code>, can be passed to functions and procedures that take undiscriminated strings as arguments, like <code>foo<\/code>, which thereby work on strings of any capacity. In memory, <code>str1<\/code> is laid out as a sequence of 80 <code>char<\/code>s, followed by a <code>ushort<\/code> that encodes the length of the string. I say <em>encodes<\/em> because a shorter string is padded with <code>\\0<\/code>s up to the capacity and the <code>ushort<\/code> actually contains the length of that padding. 
This way, when a pointer to the string is passed to a C function and the contents of the string occupy its full capacity, the <code>0<\/code> in the padding length doubles as the terminating <code>\\0<\/code> of the C string.<\/p>\n<p>My first thought was to mimic this data representation with a D template. But that would require procedures like <code>foo<\/code> to be turned into templates as well, which would escalate horribly into template bloat, a problem with multiple string arguments and argument ordering, and would complicate translation. Besides, schema types can also be discriminated at run time, which does not translate to a template.<\/p>\n<p>Could some sort of inheritance scheme be the solution? Not really, because instances of D classes live on the heap, so a string embedded in a <code>struct<\/code> would just be a pointer instead of the <code>char<\/code> array and <code>ushort<\/code>.<\/p>\n<p>But binary layout is actually only relevant in files, and in a stroke of insight I realized that this must be why <a href=\"https:\/\/dlang.org\/spec\/attribute.html#uda\">user-defined attributes, or UDAs,<\/a> exist. If I annotate the string with the correct capacity for file I\/O, then I can just use native D <code>string<\/code>s everywhere, which genuinely must be the best possible translation and solves the function argument issue. 
Annotation can be done with an instance of a <code>struct<\/code> like<\/p>\n<pre class=\"prettyprint lang-d\">struct EPString\r\n{\r\n    ushort capacity;\r\n}<\/pre>\n<p>The above Pascal snippet then translates to D like so:<\/p>\n<pre class=\"prettyprint lang-d\">@EPString(80) struct string80 { string _; alias _ this; }\r\nstring80 str1;\r\n@EPString(60) string str2;\r\nvoid foo(string s);<\/pre>\n<p>Notice how the <code>string80<\/code> alias is translated into the slightly convoluted <code>struct<\/code> instead of a normal D <code>alias<\/code>, which would have looked like<\/p>\n<pre class=\"prettyprint lang-d\">@EPString(80) alias string80 = string;<\/pre>\n<p>Although that compiles, there is no way to retrieve the UDA in that case because plain <code>alias<\/code> does not introduce a symbol. Then <code>hasUDA!(typeof(str1), EPString)<\/code> would have been equivalent to <code>hasUDA!(string, EPString)<\/code> which evaluates to <code>false<\/code>. By using the <code>struct<\/code>, <code>string80<\/code> is a symbol so <code>typeof(str1)<\/code> gives <code>string80<\/code>, and <code>hasUDA!(string80, EPString)<\/code> evaluates to <code>true<\/code> in this example.<\/p>\n<p>There is one side effect that we will have to learn to accept, and that is that taking a slice of a string does not produce the same result in D as it does in Extended Pascal. That is because string indices start at 1 in Extended Pascal and at 0 in D. My strategy is to eliminate slices from the source and replace them with a call to the standard <code>substr<\/code> function, which I can implement with index correction. Finding all string slices can be accomplished with a switch in the transpiler that makes it insert a <code>static if<\/code> to test if the slice is being taken on a <code>string<\/code>, and abort compilation if it is. 
(<code>Array<\/code>s are transpiled into a custom array type that handles slices and indices compatibly with Extended Pascal.)<\/p>\n<h2 id=\"binarycompatiblefileio\">Binary compatible file I\/O<\/h2>\n<p>Now, to write <code>struct<\/code>s to file and handle any embedded <code>@EPString()<\/code>-annotated strings specially, we can use compile-time introspection in an overload to <code>toFile<\/code> that acts on <code>struct<\/code>s as shown below. I have left out handling of aliased strings for clarity, as well as <code>shortstring<\/code>, which is a legacy string type with yet a different binary format.<\/p>\n<pre class=\"prettyprint lang-d\">void toFile(S)(S s, File f) if (is(S == struct))\r\n{\r\n    import std.traits;\r\n    static if (!hasIndirections!S)\r\n        f.lockingBinaryWriter.put(s);\r\n    else\r\n        \/\/ TODO unions\r\n        foreach(field; FieldNameTuple!S)\r\n        {\r\n            \/\/ If the member has itself a toFile method, call it.\r\n            static if (hasMember!(typeof(__traits(getMember, s, field)), \"toFile\") &amp;&amp;\r\n                       __traits(compiles, __traits(getMember, s, field).toFile(f)))\r\n                __traits(getMember, s, field).toFile(f);\r\n            \/\/ If the member is a struct, recurse.\r\n            else static if (is(typeof(__traits(getMember, s, field)) == struct))\r\n                toFile(__traits(getMember, s, field), f);\r\n            \/\/ Treat strings specially.\r\n            else static if (is(typeof(__traits(getMember, s, field)) == string))\r\n            {\r\n                \/\/ Look for a UDA on the member string.\r\n                static if (hasUDA!(__traits(getMember, s, field), EPString))\r\n                {\r\n                    enum capacity = getUDAs!(__traits(getMember, s, field), EPString)[0].capacity;\r\n                    static assert(capacity &gt; 0);\r\n                    writeAsEPString(__traits(getMember, s, field), capacity, f);\r\n                
}\r\n                else static assert(false, `Need an @EPString(n) in front of ` ~ fullyQualifiedName!S ~ `.` ~ field );\r\n            }\r\n            \/\/ Just write other data members.\r\n            else static if(!isFunction!(__traits(getMember, s, field)))\r\n                f.lockingBinaryWriter.put(__traits(getMember, s, field));\r\n        }\r\n}\r\n<\/pre>\n<p>At the time of writing, I still have work to do for <code>union<\/code>s, which are used in the translation of variant records (including considering the use of one of the seven existing library solutions <a href=\"https:\/\/dlang.org\/phobos\/std_variant.html#.Algebraic\">1<\/a>, <a href=\"https:\/\/code.dlang.org\/packages\/tag\">2<\/a>, <a href=\"https:\/\/code.dlang.org\/packages\/tagged_union\">3<\/a>, <a href=\"https:\/\/code.dlang.org\/packages\/taggedalgebraic\">4<\/a>, <a href=\"https:\/\/github.com\/nordlow\/phobos-next\/blob\/master\/src\/vary.d#L30\">5<\/a>, <a href=\"https:\/\/code.dlang.org\/packages\/minivariant\">6<\/a>, <a href=\"https:\/\/code.dlang.org\/packages\/sumtype\">7<\/a>).<\/p>\n<p>Currently, <a href=\"https:\/\/forum.dlang.org\/post\/zwpctoccawmkwfoqkoyf@forum.dlang.org\">detecting <code>union<\/code>s is a bit involved <\/a>. Also, there is a complication in the determination of the size of a union when the largest variant contains strings: the D version of that variant may <em>not<\/em> be the largest because D <code>string<\/code>s are just slices. I\u2019ll probably work around this by adding a dummy variant that is a fixed size array of bytes to force the size of the <code>union<\/code> to be compatible with Extended Pascal. This is the reason why D scored a mere <code>0<\/code> in file format compatibility. It is amazing what D allows you to do though, so I may be able to do all of that automatically and award D a perfect score retroactively. 
On the other hand, it is probably easiest to just add the dummy variant in the Pascal source at the few places where it matters and be done with it.<\/p>\n<h1 id=\"thewayforward\">The way forward<\/h1>\n<p>Obviously, this is long-term planning. It has taken years to grow into D; it will possibly take a year, and probably longer, to migrate to D. Unless others turn up who are in the same boat as us (please contribute!), it\u2019ll be me who has to row this ship to D-land, and I still have my regular duties to attend to. My colleagues will continue to develop in Extended Pascal as usual, and once <a href=\"https:\/\/github.com\/veelo\/Pascal2D\">my transpiler<\/a> is able to translate all or almost all of it, we will make the switch to D overnight. From then on, we\u2019ll be in it for the long run. We trust that we will be with D, and D with us, for decades to come!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Imagine there is this little-known programming language in which you enjoy programming in your free time. You know it is ready for prime time and you dream about using it at work every day. 
This is the story about how I made a dream like that come true.<\/p>\n","protected":false},"author":30,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[26,28,12,9,34],"tags":[],"_links":{"self":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1604"}],"collection":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/users\/30"}],"replies":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/comments?post=1604"}],"version-history":[{"count":5,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1604\/revisions"}],"predecessor-version":[{"id":1609,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1604\/revisions\/1609"}],"wp:attachment":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/media?parent=1604"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/categories?post=1604"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/tags?post=1604"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}