{"id":1772,"date":"2018-11-06T15:07:53","date_gmt":"2018-11-06T15:07:53","guid":{"rendered":"http:\/\/dlang.org\/blog\/?p=1772"},"modified":"2021-10-08T11:02:26","modified_gmt":"2021-10-08T11:02:26","slug":"lost-in-translation-encapsulation","status":"publish","type":"post","link":"https:\/\/dlang.org\/blog\/2018\/11\/06\/lost-in-translation-encapsulation\/","title":{"rendered":"Lost in Translation: Encapsulation"},"content":{"rendered":"<blockquote><p>I first learned programming in BASIC. Outgrew it, and switched to Fortran. Amusingly, my early Fortran code looked just like BASIC. My early C code looked like Fortran. My early C++ code looked like C. \u2013 <a href=\"http:\/\/walterbright.com\/\">Walter Bright, the creator of D<\/a><\/p><\/blockquote>\n<p>Programming in a language is not the same as <em>thinking<\/em> in that language. A natural side effect of experience with one programming language is that we view other languages through the prism of its features and idioms. Languages in the same family may look and feel similar, but there are guaranteed to be subtle differences that, when not accounted for, can lead to compiler errors, bugs, and missed opportunities. Even when good docs, books, and other materials are available, most misunderstandings are only going to be solved through trial-and-error.<\/p>\n<p>D programmers come from a variety of programming backgrounds, C-family languages perhaps being the most common among them. Understanding the differences and how familiar features are tailored to D can open the door to more possibilities for organizing a code base, and designing and implementing an API. This article is the first of a few that will examine D features that can be overlooked or misunderstood by those experienced in similar languages.<\/p>\n<p>We\u2019re starting with a look at a particular feature that\u2019s common among languages that support Object-Oriented Programming (OOP). 
There&#8217;s one aspect in particular of the D implementation that experienced programmers are sure they already fully understand and are often surprised to later learn they don&#8217;t.<\/p>\n<h2 id=\"encapsulation\">Encapsulation<\/h2>\n<p>Most readers will already be familiar with the concept of encapsulation, but I want to make sure we\u2019re on the same page. For the purpose of this article, I\u2019m talking about encapsulation in the form of separating interface from implementation. Some people tend to think of it strictly as it relates to object-oriented programming, but it\u2019s a broader concept than that. Consider this C code:<\/p>\n<pre class=\"prettyprint lang-c_cpp\">#include &lt;stdio.h&gt;\r\nstatic size_t s_count;\r\n\r\nvoid print_message(const char* msg) {\r\n    puts(msg);\r\n    s_count++;\r\n}\r\n\r\nsize_t num_prints(void) { return s_count; }<\/pre>\n<p>In C, functions and global variables decorated with <code>static<\/code> become private to the <em>translation unit<\/em> (i.e. the source file along with any headers brought in via <code>#include<\/code>) in which they are declared. Non-static declarations are publicly accessible, usually provided in header files that lay out the public API for clients to use. Static functions and variables are used to hide implementation details from the public API.<\/p>\n<p>Encapsulation in C is a minimal approach. C++ supports the same feature, but it also has anonymous namespaces that can encapsulate type definitions in addition to declarations. 
Like Java, C#, and other languages that support OOP, C++ also has <em>access modifiers<\/em> (alternatively known as access specifiers, protection attributes, visibility attributes) which can be applied to <code>class<\/code> and <code>struct<\/code> member declarations.<\/p>\n<p>C++ supports the following three access modifiers, common among OOP languages:<\/p>\n<ul>\n<li><code>public<\/code> &#8211; accessible to the world<\/li>\n<li><code>private<\/code> &#8211; accessible only within the class<\/li>\n<li><code>protected<\/code> &#8211; accessible only within the class and its derived classes<\/li>\n<\/ul>\n<p>An experienced Java programmer might raise a hand to say, \u201cUm, excuse me. That\u2019s not a complete definition of <code>protected<\/code>.\u201d That\u2019s because in Java, it looks like this:<\/p>\n<ul>\n<li><code>protected<\/code> &#8211; accessible within the class, its derived classes, and classes in the same package.<\/li>\n<\/ul>\n<p>Every class in Java belongs to a package, so it makes sense to factor packages into the equation. Then there\u2019s this:<\/p>\n<ul>\n<li><em>package-private<\/em> (not a keyword) &#8211; accessible within the class and classes in the same package.<\/li>\n<\/ul>\n<p>This is the default access level in Java when no access modifier is specified. Combined with <code>protected<\/code>, it makes packages a tool for encapsulation beyond classes in Java.<\/p>\n<p>Similarly, C# has assemblies, <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/ms973231.aspx\">which MSDN defines as<\/a> \u201ca collection of types and resources that forms a logical unit of functionality\u201d. 
In C#, the meaning of <code>protected<\/code> is identical to that of C++, but the language has two additional forms of protection that relate to assemblies and that are analogous to Java\u2019s <code>protected<\/code> and package-private.<\/p>\n<ul>\n<li><code>internal<\/code> &#8211; accessible within the class and classes in the same assembly.<\/li>\n<li><code>protected internal<\/code> &#8211; accessible within the class, its derived classes, and classes in the same assembly.<\/li>\n<\/ul>\n<p>Examining encapsulation in other programming languages will continue to turn up similarities and differences. Common encapsulation idioms are generally adapted to language-specific features. The fundamental concept remains the same, but the scope and implementation vary. So it should come as no surprise that D also approaches encapsulation in its own, language-specific manner.<\/p>\n<h2 id=\"modules\">Modules<\/h2>\n<p>The foundation of D\u2019s approach to encapsulation <a href=\"https:\/\/dlang.org\/spec\/module.html\">is the module<\/a>. Consider this D version of the C snippet from above:<\/p>\n<pre class=\"prettyprint lang-d\">module mymod;\r\n\r\nprivate size_t _count;\r\n\r\nvoid printMessage(string msg) {\r\n    import std.stdio : writeln;\r\n\r\n    writeln(msg);\r\n    _count++;\r\n}\r\n\r\nsize_t numPrints() { return _count; }<\/pre>\n<p>In D, access modifiers can apply to module-scope declarations, not just <code>class<\/code> and <code>struct<\/code> members. <code>_count<\/code> is <code>private<\/code>, meaning it is not visible outside of the module. <code>printMessage<\/code> and <code>numPrints<\/code> have no access modifiers; they are <code>public<\/code> by default, making them visible and accessible outside of the module. 
Both functions could have been annotated with the keyword <code>public<\/code>.<\/p>\n<p><em>Note that imports in module scope are <code>private<\/code> by default, meaning the symbols in the imported modules are not visible outside the module, and local imports, as in the example, are never visible outside of their parent scope.<\/em><\/p>\n<p>Alternative syntaxes are supported, giving more flexibility to the layout of a module. For example, there\u2019s C++ style:<\/p>\n<pre class=\"prettyprint lang-d\">module mymod;\r\n\r\n\/\/ Everything below this is private until either another\r\n\/\/ protection attribute or the end of file is encountered.\r\nprivate:\r\n    size_t _count;\r\n\r\n\/\/ Turn public back on\r\npublic:\r\n    void printMessage(string msg) {\r\n        import std.stdio : writeln;\r\n\r\n        writeln(msg);\r\n        _count++;\r\n    }\r\n\r\n    size_t numPrints() { return _count; }<\/pre>\n<p>And this:<\/p>\n<pre class=\"prettyprint lang-d\">module mymod;\r\n\r\nprivate {\r\n    \/\/ Everything declared within these braces is private.\r\n    size_t _count;\r\n}\r\n\r\n\/\/ The functions are still public by default\r\nvoid printMessage(string msg) {\r\n    import std.stdio : writeln;\r\n\r\n    writeln(msg);\r\n    _count++;\r\n}\r\n\r\nsize_t numPrints() { return _count; }<\/pre>\n<p>Modules can belong to packages. A package is a way to group related modules together. In practice, the source files corresponding to each module should be grouped together in the same directory on disk. Then, in the source file, each directory becomes part of the module declaration:<\/p>\n<pre class=\"prettyprint lang-d\">\/\/ mypack\/amodule.d\r\nmodule mypack.amodule;\r\n\r\n\/\/ mypack\/subpack\/anothermodule.d\r\nmodule mypack.subpack.anothermodule;<\/pre>\n<p><em>Note that it\u2019s possible to have package names that don\u2019t correspond to directories and module names that don\u2019t correspond to files, but it\u2019s bad practice to do so. 
A deep dive into packages and modules will have to wait for a future post.<\/em><\/p>\n<p><code>mymod<\/code> does not belong to a package, as no packages were included in the module declaration. Inside <code>printMessage<\/code>, the function <code>writeln<\/code> is imported from the <code>stdio<\/code> module, which belongs to the <code>std<\/code> package. Packages have no special properties in D and primarily serve as namespaces, but they are a common part of the codescape.<\/p>\n<p>In addition to <code>public<\/code> and <code>private<\/code>, the <code>package<\/code> access modifier can be applied to module-scope declarations to make them visible only within modules in the same package.<\/p>\n<p>Consider the following example. There are three modules in three files (only one module per file is allowed), each belonging to the same root package.<\/p>\n<pre class=\"prettyprint lang-d\">\/\/ src\/rootpack\/subpack1\/mod2.d\r\nmodule rootpack.subpack1.mod2;\r\nimport std.stdio;\r\n\r\npackage void sayHello() {\r\n    writeln(\"Hello!\");\r\n}\r\n\r\n\/\/ src\/rootpack\/subpack1\/mod1.d\r\nmodule rootpack.subpack1.mod1;\r\nimport rootpack.subpack1.mod2;\r\n\r\nclass Speaker {\r\n    this() { sayHello(); }\r\n}\r\n\r\n\/\/ src\/rootpack\/app.d\r\nmodule rootpack.app;\r\nimport rootpack.subpack1.mod1;\r\n\r\nvoid main() {\r\n    auto speaker = new Speaker;\r\n}<\/pre>\n<p>Compile this with the following command line:<\/p>\n<pre>cd src\r\ndmd -i rootpack\/app.d<\/pre>\n<p><em>The <code>-i<\/code> switch tells the compiler to automatically compile and link imported modules (excluding those in the standard library namespaces <code>core<\/code> and <code>std<\/code>). Without it, each module would have to be passed on the command line, else they wouldn\u2019t be compiled and linked.<\/em><\/p>\n<p>The class <code>Speaker<\/code> has access to <code>sayHello<\/code> because they belong to modules that are in the same package. 
Now imagine we do a refactor and we decide that it could be useful to have access to <code>sayHello<\/code> throughout the <code>rootpack<\/code> package. D provides the means to make that happen by allowing the <code>package<\/code> attribute to be parameterized with the fully-qualified name (FQN) of a package. So we can change the declaration of <code>sayHello<\/code> like so:<\/p>\n<pre class=\"prettyprint lang-d\">package(rootpack) void sayHello() {\r\n    writeln(\"Hello!\");\r\n}<\/pre>\n<p>Now all modules in <code>rootpack<\/code> and <em>all modules in packages that descend from <code>rootpack<\/code><\/em> will have access to <code>sayHello<\/code>. Don\u2019t overlook that last part. A parameter to the <code>package<\/code> attribute is saying that a package and all of its descendants can access this symbol. It may sound overly broad, but it isn\u2019t.<\/p>\n<p>For one thing, only a package that is a direct ancestor of the module\u2019s parent package can be used as a parameter. Consider a module <code>rootpack.subpack.subsub.mymod<\/code>. That name contains all of the packages that are legal parameters to the <code>package<\/code> attribute in <code>mymod.d<\/code>, namely <code>rootpack<\/code>, <code>subpack<\/code>, and <code>subsub<\/code>. So we can say the following about symbols declared in <code>mymod<\/code>:<\/p>\n<ul>\n<li><code>package<\/code> &#8211; visible only to modules in the parent package of <code>mymod<\/code>, i.e. 
the <code>subsub<\/code> package.<\/li>\n<li><code>package(subsub)<\/code> &#8211; visible to modules in the <code>subsub<\/code> package and modules in all packages descending from <code>subsub<\/code>.<\/li>\n<li><code>package(subpack)<\/code> &#8211; visible to modules in the <code>subpack<\/code> package and modules in all packages descending from <code>subpack<\/code>.<\/li>\n<li><code>package(rootpack)<\/code> &#8211; visible to modules in the <code>rootpack<\/code> package and modules in all packages descending from <code>rootpack<\/code>.<\/li>\n<\/ul>\n<p>This feature makes packages another tool for encapsulation, allowing symbols to be hidden from the outside world but visible and accessible in specific subtrees of a package hierarchy. In practice, there are probably few cases where expanding access to a broad range of packages in an entire subtree is desirable.<\/p>\n<p>It\u2019s common to see parameterized package protection in situations where a package exposes a common public interface and hides implementations in one or more subpackages, such as a <code>graphics<\/code> package with subpackages containing implementations for DirectX, Metal, OpenGL, and Vulkan. Here, D\u2019s access modifiers allow for three levels of encapsulation:<\/p>\n<ul>\n<li>the <code>graphics<\/code> package as a whole<\/li>\n<li>each subpackage containing the implementations<\/li>\n<li>individual modules in each package<\/li>\n<\/ul>\n<p>Notice that I didn\u2019t include <code>class<\/code> or <code>struct<\/code> types as a fourth level. The next section explains why.<\/p>\n<h2 id=\"classesandstructs\">Classes and structs<\/h2>\n<p>Now we come to the motivation for this article. 
I can\u2019t recall ever seeing anyone <a href=\"https:\/\/forum.dlang.org\/\">come to the D forums<\/a> professing surprise about package protection, but the behavior of access modifiers in classes and structs is something that pops up now and then, largely because of expectations derived from experience in other languages.<\/p>\n<p>Classes and structs use the same access modifiers as modules: <code>public<\/code>, <code>package<\/code>, <code>package(some.pack)<\/code>, and <code>private<\/code>. The <code>protected<\/code> attribute can only be used in classes, as inheritance is not supported for structs (nor for modules, which aren\u2019t even objects). <code>public<\/code>, <code>package<\/code>, and <code>package(some.pack)<\/code> behave exactly as they do at the module level. The thing that surprises some people is that <code>private<\/code> also behaves the same way.<\/p>\n<pre class=\"prettyprint lang-d\">import std.stdio;\r\n\r\nclass C {\r\n    private int x;\r\n}\r\n\r\nvoid main() {\r\n    C c = new C();\r\n    c.x = 10;\r\n    writeln(c.x);\r\n}<\/pre>\n<p><em><a href=\"https:\/\/run.dlang.io\/is\/L7geN6\">Run this example online<\/a><\/em><\/p>\n<p>Snippets like this are posted in the forums now and again by people exploring D, accompanying a question along the lines of, \u201cWhy does this compile?\u201d (and sometimes, \u201cI think I\u2019ve found a bug!\u201d). This is an example of where experience can cloud expectations. Everyone knows what <code>private<\/code> means, so it\u2019s not something most people bother to look up in the language docs. However, <a href=\"https:\/\/dlang.org\/spec\/attribute.html#visibility_attributes\">those who do would find this<\/a>:<\/p>\n<blockquote><p>Symbols with private visibility can only be accessed from within the same module.<\/p><\/blockquote>\n<p><code>private<\/code> in D always means <em>private to the module<\/em>. The module is the lowest level of encapsulation. 
It\u2019s easy to understand why some experience an initial resistance to this, that it breaks encapsulation, but the intent behind the design is to <em>strengthen<\/em> encapsulation. It\u2019s inspired by the C++ <code>friend<\/code> feature.<\/p>\n<p>Having implemented and maintained a C++ compiler for many years, Walter understood the need for a feature like <code>friend<\/code>, but felt that it wasn\u2019t the best way to go about it.<\/p>\n<blockquote><p>Being able to declare a \u201cfriend\u201d that is somewhere in some other file runs against notions of encapsulation.<\/p><\/blockquote>\n<p>An alternative is to take a Java-like approach of one class per module, but he felt that was too restrictive.<\/p>\n<blockquote><p>One may desire a set of closely interrelated classes that encapsulate a concept, and those should go into a module.<\/p><\/blockquote>\n<p>So the way to view a module in D is not just as a single source file, but as a unit of encapsulation. It can contain free functions, classes, and structs, all operating on the same data declared in module scope and class scope. The public interface is still protected from changes to the private implementation inside the module. Along those same lines, <code>protected<\/code> class members are accessible not just in derived classes, but also in the module.<\/p>\n<p>Sometimes though, there really is a benefit to denying access to private members in a module. The bigger a module becomes, the more of a burden it is to maintain, especially when it\u2019s being maintained by a team. Every place a <code>private<\/code> member of a class is accessed in a module means more places to update when a change is made, thereby increasing the maintenance burden. 
The language provides the means to alleviate the burden in the form of <a href=\"https:\/\/dlang.org\/spec\/module.html#package-module\">the special <em>package module<\/em><\/a>.<\/p>\n<p>In some cases, we don\u2019t want to require the user to import multiple modules individually. Splitting a large module into smaller ones is one of those cases. Consider the following file tree:<\/p>\n<pre>-- mypack\r\n---- mod1.d\r\n---- mod2.d<\/pre>\n<p>We have two modules in a package called <code>mypack<\/code>. Let\u2019s say that <code>mod1.d<\/code> has grown extremely large and we\u2019re starting to worry about maintaining it. For one, we want to ensure that private members aren\u2019t manipulated outside of class declarations with hundreds or thousands of lines in between. We want to split the module into smaller ones, but at the same time we don\u2019t want to break user code. Currently, users can get at the module\u2019s symbols by importing it with <code>import mypack.mod1<\/code>. We want that to continue to work. Here\u2019s how we do it:<\/p>\n<pre>-- mypack\r\n---- mod1\r\n------ package.d\r\n------ split1.d\r\n------ split2.d\r\n---- mod2.d<\/pre>\n<p>We\u2019ve split <code>mod1.d<\/code> into two new modules and put them in a package named <code>mod1<\/code>. We\u2019ve also created a special <code>package.d<\/code> file, which looks like this:<\/p>\n<pre class=\"prettyprint lang-d\">module mypack.mod1;\r\n\r\npublic import mypack.mod1.split1,\r\n              mypack.mod1.split2;<\/pre>\n<p>When the compiler sees <code>package.d<\/code>, it knows to treat it specially. Users will be able to continue using <code>import mypack.mod1<\/code> without ever caring that it\u2019s now split into two modules in a new package. The key is the module declaration at the top of <code>package.d<\/code>. It\u2019s telling the compiler to treat this package as the module <code>mod1<\/code>. 
And instead of automatically importing all modules in the package, the requirement to list them as public imports in <code>package.d<\/code> allows more freedom in implementing the package. Sometimes, you might want to require the user to explicitly import a module even when a <code>package.d<\/code> is present.<\/p>\n<p>Now users will continue seeing <code>mod1<\/code> as a single module and can continue to import it as such. Meanwhile, encapsulation is now more stringently enforced internally. Because <code>split1<\/code> and <code>split2<\/code> are now separate modules, they can\u2019t touch each other\u2019s private parts. Any part of the API that needs to be shared by both modules can be annotated with <code>package<\/code> protection. Despite the internal transformation, the public interface remains unchanged, and encapsulation is maintained.<\/p>\n<h2 id=\"wrappingup\">Wrapping up<\/h2>\n<p>The full list of access modifiers in D can be defined as such:<\/p>\n<ul>\n<li><code>public<\/code> &#8211; accessible everywhere.<\/li>\n<li><code>package<\/code> &#8211; accessible to modules in the same package.<\/li>\n<li><code>package(some.pack)<\/code> &#8211; accessible to modules in the package <code>some.pack<\/code> and to the modules in all of its descendant packages.<\/li>\n<li><code>private<\/code> &#8211; accessible only in the module.<\/li>\n<li><code>protected<\/code> (classes only) &#8211; accessible in the module and in derived classes.<\/li>\n<\/ul>\n<p>Hopefully, this article has provided you with the perspective to think in D instead of your \u201cnative\u201d language when thinking about encapsulation in D.<\/p>\n<p><em>Thanks to Ali \u00c7ehreli, Joakim Noah, and Nicholas Wilson for reviewing and providing feedback on this article.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>D programmers come from a variety of programming backgrounds, C-family languages perhaps being the most common among them. 
Understanding the differences and how familiar features are tailored to D can open the door to more possibilities for organizing a code base, and designing and implementing an API. This article is the first of a few that will examine D features that can be overlooked or misunderstood by those experienced in similar languages.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[26,20],"tags":[],"_links":{"self":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1772"}],"collection":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/comments?post=1772"}],"version-history":[{"count":21,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1772\/revisions"}],"predecessor-version":[{"id":2229,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1772\/revisions\/2229"}],"wp:attachment":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/media?parent=1772"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/categories?post=1772"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/tags?post=1772"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}