{"id":1143,"date":"2017-10-20T13:57:12","date_gmt":"2017-10-20T13:57:12","guid":{"rendered":"http:\/\/dlang.org\/blog\/?p=1143"},"modified":"2021-10-08T11:07:38","modified_gmt":"2021-10-08T11:07:38","slug":"unit-testing-in-action","status":"publish","type":"post","link":"https:\/\/dlang.org\/blog\/2017\/10\/20\/unit-testing-in-action\/","title":{"rendered":"Unit Testing In Action"},"content":{"rendered":"<p><em><a href=\"http:\/\/www.funkwerk.com\"><img loading=\"lazy\" class=\"alignleft size-medium\" src=\"http:\/\/www.funkwerk.com\/wp-content\/themes\/funkwerk\/library\/images\/logo.png\" width=\"250\" height=\"81\" \/><\/a>Mario Kr\u00f6plin is a developer at <a href=\"http:\/\/www.funkwerk.com\/\">Funkwerk AG<\/a>, a German company whose passenger information system is developed in D and was <a href=\"https:\/\/dlang.org\/blog\/2017\/07\/28\/project-highlight-funkwerk\/\">recently highlighted on this blog<\/a>. That post describes Funkwerk\u2019s use of third-party unit testing frameworks and says, \u201cthe team recently discovered a way to combine xUnit testing with D\u2019s built-in <code>unittest<\/code>, which may lead to another transition in their unit testing.\u201d That\u2019s Mario\u2019s subject in this post.<\/em><\/p>\n<hr \/>\n<h3 id=\"thereandbackagain\">There and Back Again<\/h3>\n<p>Ten years ago, programming in D was like starting over in our company. And, of course, unit testing was part of it right from the beginning. D\u2019s built-in simple support made it easy to quickly write lots of unit tests. Until some of them failed. And soon, the failure became the rule. There\u2019s always someone else to blame: D\u2019s simple unit-test support is too simple. A look at Python reveals that the modules <code>doctest<\/code> and <code>unittest<\/code> live side by side in the standard library. 
We concluded that D\u2019s unit test support corresponds to Python\u2019s <code>doctest<\/code>, which means that there must be something else for the real unit testing.<\/p>\n<p>Even back then, we immediately found such a unit testing framework in DUnit [<em>An old D1 unit testing framework that you can read about at the old <a href=\"http:\/\/www.dsource.org\/projects\/dmocks\/wiki\/DUnit\">dsource.org<\/a> \u2013 Ed.<\/em>]. Thanks to good advice for <a href=\"http:\/\/xunitpatterns.com\/\">xUnit testing<\/a>, we were happy and content with this approach. At the end of life of D1, a replacement library for D2 was soon found. After a bumpy start, I found myself in the role of the maintainer of <a href=\"https:\/\/github.com\/linkrope\/dunit\">dunit<\/a> [<em>A D2 unit-testing framework that is separate from DUnit \u2013 Ed<\/em>].<\/p>\n<p>During <a href=\"http:\/\/dconf.org\/2013\/\">DConf 2013<\/a>, I copied a first example use of user-defined attributes to dunit. This allowed imitating <a href=\"http:\/\/junit.org\/junit4\/\">JUnit 4<\/a>, where, for example, test methods are annotated with <code>@Test<\/code>. By now, dunit imitates <a href=\"http:\/\/junit.org\/junit5\/\">JUnit 5<\/a>. So if you want to write unit tests in Java style, dunit is a good choice. But which D programmers would want to do that?<\/p>\n<p>Recently, we reconsidered the weaknesses of D\u2019s unit test support. Various solutions have been found to bypass the blockers (described in the following). On the other hand, good guidelines are added, for example, to use attributes even for <code>unittest<\/code> functions. So we decided to return to making use of D\u2019s built-in unit test support. From our detour we retain some ideas to keep the test implementation maintainable.<\/p>\n<h3 id=\"expectations\">Expectations<\/h3>\n<p>Whenever a unit test fails at run time, the question is, why? 
The error message refers to a line number, where you find something like <code>assert(answer == 42)<\/code>. But what is the value of <code>answer<\/code> if it isn&#8217;t <code>42<\/code>? The irony is that this need is well understood. If you use a static assert instead, the error message reads like: <code>static assert 54 == 42 is false<\/code>. The fear of code bloat is the reason why you don\u2019t get this automatically at run time.<\/p>\n<p>If you look at the <a href=\"https:\/\/dlang.org\/spec\/spec.html\">Language Reference<\/a>, you will notice that the chapter <a href=\"https:\/\/dlang.org\/spec\/unittest.html\">Unit Tests<\/a> covers primarily the special <code>unittest<\/code> function. It is assumed that <code>assert<\/code> is used for test verification, which is introduced in the chapter <a href=\"https:\/\/dlang.org\/spec\/contracts.html\">Contract Programming<\/a>. In theory, it\u2019s completely OK to reuse <code>assert<\/code> for test verification. Any failure reveals a programming error that must be fixed. In practice, however, test expectations are quite different from preconditions, postconditions, and invariants. While expectations are usually specific (<code>actual == expected<\/code>), contracts tend to exclude specific values (<code>value != 0<\/code> or <code>value !is null<\/code>).<\/p>\n<p>So there are lots of implementations of templates like <code>assertEquals<\/code> or <code>test!\"==\"<\/code>. The problem shows up if you want to have the most helpful error messages: <code>expected 42 but got 54<\/code>. For this, <code>assertEquals<\/code> is too symmetrical. In fact, JUnit\u2019s <code>assertEquals(expected, actual)<\/code> was turned into <a href=\"http:\/\/testng.org\/doc\/\">TestNG\u2019s<\/a> <code>assertEquals(actual, expected)<\/code>. 
Even with UFCS (<a href=\"https:\/\/tour.dlang.org\/tour\/en\/gems\/uniform-function-call-syntax-ufcs\">Uniform Function Call Syntax<\/a>), it is not clear how <code>a.assertEquals(b)<\/code> should be used. From time to time, programmers don\u2019t write the arguments in the intended order. Then the error messages are the opposite of helpful. They are misleading: <code>expected 54 but got 42<\/code>.<\/p>\n<p>Fluent assertions avoid this symmetry problem: <code>actual.should.eq(expected)<\/code> or <code>expect(actual).to.eq(expected)<\/code> are harder to use incorrectly. Thanks to UFCS and lazy parameters, the implementation in D is no problem. The common criticism is \u201cthe natural language formulation is too verbose\u201d, or just \u201ctoo many dots\u201d. Currently, however, this seems to be the only way to get the most helpful error messages.<\/p>\n<p>The next problem is that string comparisons are seldom as simple as: <code>expected foo but got bar<\/code>. Non-printable characters or lengthy texts, such as XML or JSON representations, sabotage error messages that were meant to be helpful. This can be avoided by escaping special characters and by showing differences. This is what <a href=\"https:\/\/github.com\/gedaiu\/fluent-asserts\">the fluent-asserts library<\/a> does.<\/p>\n<h3 id=\"testexecution\">Test Execution<\/h3>\n<p>In general, we want to get as much information as possible from a failed test run. How many test cases fail? Which test cases fail? Does the happy path fail, or just the edge cases? Is it worth addressing the failures, or is it better to undo the change? The approach of stopping on the first error is contrary to these needs. The original idea was to run the unit tests before the start of the actual program. By now, however, separate test runners are often used, which continue in case of a failure. To emphasize this, test expectations usually throw their own exceptions, instead of the unrecoverable <code>AssertError<\/code>. 
This change already shows how many test cases fail.<\/p>\n<p>Finding out what\u2019s tested in the failing test cases is more difficult. At best, there are corresponding comments for documented unit tests. But an empty <a href=\"https:\/\/tour.dlang.org\/tour\/en\/gems\/documentation\">DDoc<\/a> comment, <code>\/\/\/<\/code>, is all that\u2019s needed to include the body of the <code>unittest<\/code> function as an example in the documentation. In the worst case, the unit test goes on and on verifying this and that.<\/p>\n<p>The idea of the <a href=\"http:\/\/wiki.c2.com\/?SentenceStyleForNamingUnitTests\">Sentence Style For Naming Unit Tests<\/a> is that the name of the test function describes the test case. In D, however, the <code>unittest<\/code> functions are anonymous. On the other hand, D has <a href=\"https:\/\/dlang.org\/spec\/attribute.html#uda\">user-defined attributes<\/a>, so you can even use strings for the test description instead of CamelCase names. <a href=\"https:\/\/github.com\/atilaneves\/unit-threaded\">unit-threaded<\/a>, for example, shows these string attributes so that you get a good impression of the extent of the problem in case of a failure. In addition, unit-threaded satisfies the requirement to execute test cases selectively: for example, only the one problematic test case, or all tests except those tagged as \u201cslow\u201d. It\u2019s promising to use unit-threaded as needed. You let D run the <code>unittest<\/code> functions as long as they pass. Only for troubleshooting should you switch to unit-threaded. You have to be careful, however, to only use compatible features.<\/p>\n<p>By the way: the parallel test execution (judging from its name, the main goal of unit-threaded) was quite problematic with the first test suite we converted. And the speedup was just 10%.<\/p>\n<h3 id=\"coverage\">Coverage<\/h3>\n<p>The D compiler has <a href=\"https:\/\/dlang.org\/code_coverage.html\">built-in code-coverage analysis<\/a>. 
The ratio of the lines executed in the test is often used as an indicator for the quality of the tests. (See <a href=\"https:\/\/dlang.org\/blog\/2017\/01\/20\/testing-in-the-d-standard-library\/\">Testing in the D Standard Library<\/a>.) A coverage of 100% cannot be achieved, for example, if you have an <code>assert(0)<\/code> that marks code as unreachable. Lower thresholds for the coverage can always be achieved by cheating. The fact that the <code>unittest<\/code> functions are also incorporated in the coverage is questionable. Imagine that a single line that has not yet been executed requires a lengthy unit test. As a consequence, this new unit test could significantly raise the coverage, simply because its own lines count as executed.<\/p>\n<p>In order to avoid such measurement errors, we decided from the beginning to extract non-trivial unit tests to separate modules. We place these in parallel to the <code>src<\/code> tree in a <code>unittest<\/code> directory. Test utilities are also placed in the <code>unittest<\/code> directory, so that reading the actual code is not encumbered by large <code>version (unittest)<\/code> sections. (We also have test directories for <a href=\"http:\/\/xunitpatterns.com\/customer%20test.html\">customer tests<\/a>.) For the coverage, we only count the modules under <code>src<\/code>. Code-coverage analysis creates a report file for each module. For a summary, which we output at the end of each successful test run, we have written a simple script. By now, <a href=\"https:\/\/github.com\/ohdatboi\/covered\">covered<\/a> is a ready-made solution.<\/p>\n<p>In order to fully exploit the code-coverage analysis, unusual formatting is required, for example, for the short-circuit evaluation of expressions with <code>&amp;&amp;<\/code>, <code>||<\/code>, and <code>?:<\/code>. 
We hope that <a href=\"https:\/\/github.com\/dlang-community\/dfmt\">dfmt<\/a> can be changed to reformat the code temporarily.<\/p>\n<h3 id=\"fixtures\">Fixtures<\/h3>\n<p>What can you do to prevent the test implementation from getting out of control? After all, test code is also code that needs to be maintained. Sometimes the test implementation is more obscure than the code being tested.<\/p>\n<p>As a solution the xUnit patterns suggest a structuring of the test implementation as a <a href=\"http:\/\/xunitpatterns.com\/Four%20Phase%20Test.html\">Four-Phase Test<\/a>: fixture setup, exercise system under test, result verification, fixture teardown. The term <em>fixture<\/em> refers to the test context. For JUnit, this is the test class with attributes that are available to all test methods. A method with the annotation <code>@BeforeEach<\/code> initializes the attributes. This is the fixture setup. Another method with the annotation <code>@AfterEach<\/code> implements the fixture teardown. All methods annotated with <code>@Test<\/code> focus on exercise and verification.<\/p>\n<p>At first glance, this approach seems to be incompatible with D\u2019s <code>unittest<\/code> functions. The <code>unittest<\/code> functions do not get automatic access to the attributes of a class, even if they are defined in the context of a class. 
On the other hand, one can mimic the approach, for example, by implementing the fixtures next to the <code>unittest<\/code> functions as a <code>struct<\/code>:<\/p>\n<pre class=\"prettyprint lang-d\">unittest\r\n{\r\n    Fixture fixture;\r\n    fixture.setup;                  \/\/ fixture setup\r\n    scope (exit) fixture.teardown;  \/\/ fixture teardown, runs even if the check below fails\r\n    (fixture.x * fixture.y).should.eq(42);  \/\/ exercise and verify\r\n}<\/pre>\n<p>The test implementation can be improved by executing the fixture setup in the constructor (or in <a href=\"https:\/\/dlang.org\/spec\/operatoroverloading.html#function-call\"><code>opCall()<\/code><\/a>, since default constructors are disallowed in <code>struct<\/code>s) and the fixture teardown in the destructor:<\/p>\n<pre class=\"prettyprint lang-d\">unittest\r\n{\r\n    \/\/ setup happens when Fixture() is evaluated, teardown in the destructor\r\n    with (Fixture())\r\n        (x * y).should.eq(42);\r\n}<\/pre>\n<p>The <code>with (Fixture())<\/code> statement explicitly pulls into the <code>unittest<\/code> function the context in which JUnit executes test methods implicitly. With this simple pattern you can structure unit tests in a tried and trusted way without having to use a framework for test classes ever again.<\/p>\n<h3 id=\"parameterizedtests\">Parameterized Tests<\/h3>\n<p>A parameterized test is a means to reuse a test implementation with different values or with different types. Within a <code>unittest<\/code> function, this would be no problem. Our goal, however, is to get as much information as possible from a failing test run. For which values or which types does the test fail? unit-threaded provides support for parameterized tests with <code>@Values<\/code> and <code>@Types<\/code>. 
If unit-threaded is not used to run the <code>unittest<\/code> functions, these test cases do not work at all.<\/p>\n<p>With the new <a href=\"https:\/\/dlang.org\/spec\/version.html#staticforeach\"><code>static foreach<\/code> feature<\/a>, however, it is easy to implement parameterized tests without the support of a framework:<\/p>\n<pre class=\"prettyprint lang-d\">import std.format : format;  \/\/ builds the test descriptions at compile time\r\n\r\nstatic foreach (i; 0 .. 2)\r\n    static foreach (j; 0 .. 2)\r\n        @(format!\"%s + %s == 1\"(i, j))\r\n        unittest\r\n        {\r\n            (i + j).should.eq(1);\r\n        }<\/pre>\n<p>And if you run the failing test with unit-threaded, the descriptions of the failing test cases reveal the problem without the need to take a look at the test implementation:<\/p>\n<pre class=\"prettyprint lang-d\">0 + 0 == 1: expected 1 but got 0\r\n1 + 1 == 1: expected 1 but got 2<\/pre>\n<h3 id=\"conclusion\">Conclusion<\/h3>\n<p>D\u2019s built-in unit test support works best when there are no failures. As shown, however, you do not need to change too much to be able to work properly in situations where you rely on helpful error messages. The imitation of a solution from another programming language is often easy in D. Nevertheless, one should reconsider such solutions from time to time.<\/p>\n<p>If we had a wish, we would want separate libraries for expectations and for test execution. Currently, you get frameworks where not all features are great, or they are overloaded with alternative solutions. Such a separation should probably be supported by the <a href=\"https:\/\/dlang.org\/phobos\/index.html\">Phobos standard library<\/a>. Currently, each framework defines expectations with its own unit test exceptions. In order to combine them, ugly interdependencies are required to match the exceptions thrown in one library to the exceptions caught in another library. 
A unit test exception in Phobos could avoid this problem.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ten years ago, programming in D was like starting over in our company. And, of course, unit testing was part of it right from the beginning. D\u2019s built-in simple support made it easy to quickly write lots of unit tests. Until some of them failed. And soon, the failure became the rule. There\u2019s always someone else to blame: D\u2019s simple unit-test support is too simple. A look at Python reveals that the modules doctest and unittest live side by side in the standard library. We concluded that D\u2019s unit test support corresponds to Python\u2019s doctest, which means that there must be something else for the real unit testing.<\/p>\n","protected":false},"author":25,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[26,28,9],"tags":[],"_links":{"self":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1143"}],"collection":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/users\/25"}],"replies":[{"embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/comments?post=1143"}],"version-history":[{"count":4,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1143\/revisions"}],"predecessor-version":[{"id":1373,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/posts\/1143\/revisions\/1373"}],"wp:attachment":[{"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/media?parent=1143"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/categories?post=1143"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dlang.org\/blog\/wp-json\/wp\/v2\/tags?post=1143"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{r
el}","templated":true}]}}