Software Development | 11 Dec 2013 10:05 pm

The Problem

Recently, I have been doing some in-browser automated testing with Testem and Mocha. For continuous integration purposes we are using the PhantomJS headless browser.

All was fine until one day when—boom—a bunch of new tests, which I had written and debugged in Chrome and Firefox, were failing in CI under Phantom. It turns out that Phantom 1.9 has some date parsing issues and cannot parse dates of the form “2011 Feb 09 12:39:09.”

We had previously used an ISO-8601 polyfill to fix a similar problem with legacy Firefox browsers. This problem was a little different. Our date format, though understood by most browsers, isn’t ISO-8601 compliant. More importantly, the polyfill only fixed the “static” Date.parse method. In the current case, the date was being parsed by the Date constructor.

The ISO-8601 polyfill explicitly avoided dealing with the constructor and for good reason. Though I did not know it at the time, JavaScript never intended for anyone to subclass its internal Date type. (More on this later.) Changing our code over to use Date.parse and modifying the polyfill would mean touching a lot of code, and making it more complex, just for CI—something I didn’t want to do. So I set out to write my own polyfill that would handle the constructor as well as Date.parse.
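As a rough idea of what such a string fixer might do, here is a minimal, hypothetical sketch (not our production implementation; the regex and month table are illustrative) that rewrites our one troublesome format into ISO-8601:

```javascript
// Hypothetical sketch: rewrite "2011 Feb 09 12:39:09" into the
// ISO-8601 form "2011-02-09T12:39:09" that PhantomJS can parse.
var MONTHS = {
  Jan: "01", Feb: "02", Mar: "03", Apr: "04", May: "05", Jun: "06",
  Jul: "07", Aug: "08", Sep: "09", Oct: "10", Nov: "11", Dec: "12"
};

function fixStringDate(sDate) {
  var m = /^(\d{4}) (\w{3}) (\d{2}) (\d{2}:\d{2}:\d{2})$/.exec(sDate);
  if (!m || !MONTHS[m[2]]) {
    return sDate; // not our format: pass it through untouched
  }
  return m[1] + "-" + MONTHS[m[2]] + "-" + m[3] + "T" + m[4];
}
```

Anything the browser already understands is left alone; only the problematic format is rewritten.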

Initial Attempts: Subclassing Date

My initial simple-minded approach was to just subclass Date and override its constructor, like so:

(function () {
 
    function fixStringDate (sDate) {
        //Implementation omitted for brevity
        return sDate;
    }
 
    Date = (function (JSDate) {
 
        function ctor() {
            this.constructor = newDate;
        }
        ctor.prototype = JSDate.prototype;
        newDate.prototype = new ctor();
 
        function newDate() {
            if (arguments.length === 1 && typeof arguments[0] === "string") {
                JSDate.prototype.constructor.call(null, fixStringDate(arguments[0]));
            } else {
                JSDate.prototype.constructor.apply(null, arguments);
            }
        }
 
        newDate.parse = function (sDate) {
            return JSDate.parse(fixStringDate(sDate));
        }
 
        return newDate;
 
    })(Date)
 
})();

As you can see, this is pretty much boilerplate code for creating a subclass in JavaScript. And it seemed to work. After running the code, calls like new Date(‘2011 Feb 09 12:39:09’) no longer threw errors in PhantomJS. That’s the good part.

The bad part is that calling any instance method of the object so created results in a TypeError with a message to the effect of “not a Date object”. This Stack Overflow question outlines the issue very well. In short, Date isn’t really so much a class as a collection of static methods that only allow themselves to be called with an object whose immediate type is Date. Mere sub-types of Date don’t pass muster.
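The failure is easy to reproduce in isolation. This sketch, independent of the polyfill above, shows that an object merely inheriting from Date.prototype cannot use Date’s instance methods:

```javascript
// Date's instance methods check for an internal time value (ES5's
// [[PrimitiveValue]]) that only genuine Date instances carry; an
// object that merely inherits from Date.prototype does not have it.
function MyDate() {}
MyDate.prototype = Object.create(Date.prototype);

var d = new MyDate();
var threw = false;
try {
  d.getTime(); // TypeError: this is not a Date object.
} catch (e) {
  threw = e instanceof TypeError;
}
```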

So the obvious thing to do was to create a real date object internal to the new class and delegate all method calls down to it. Thus:

function newDate() {
    if (arguments.length === 1 && typeof arguments[0] === "string") {
        JSDate.prototype.constructor.call(null, fixStringDate(arguments[0]));
        this.__realDateObject = new JSDate(fixStringDate(arguments[0]))
    } else {
        JSDate.prototype.constructor.apply(null, arguments);
        this.__realDateObject = JSDate.apply(null, arguments);
    }
}
 
var functions = ["getDate", "getDay", ... "valueOf"]
for (var i = 0; i < functions.length; i++) {
    (function (funcName) {
        newDate.prototype[funcName] = function () {
            return JSDate.prototype[funcName].apply(this.__realDateObject, arguments);
        };
    })(functions[i]);
}

This solved the problem for our specific cases and I could have stopped here. We fortunately never invoked the constructor with more than one argument.

Since we might do otherwise in the future, I wrote some tests to cover all the cases. In doing so, I found out that calling apply on JSDate (a reference to the original Date class) does not work. It runs, but the object returned is “not a Date object.”
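The failure with apply can be seen without any of the surrounding polyfill code. In ES5, Function.prototype.apply can only invoke Date as a plain function, never as a constructor, and the spec says that calling Date without new ignores the arguments and returns a string:

```javascript
// Invoking Date without `new` (which is all that apply can do in ES5)
// returns a string representation of the current time, ignoring the
// arguments entirely -- it never constructs a Date object.
var result = Date.apply(null, [2011, 1, 9]);
// typeof result is "string", not "object"
```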

I messed around with a lot of ways to invoke apply on the JSDate constructor, including several approaches worth mentioning. None worked in this case. Ultimately I had to resort to brute force, relying on the fact that the Date constructor accepts a reasonably finite number of arguments:

function newDate() {
    if (arguments.length === 1 && typeof arguments[0] === "string") {
        JSDate.prototype.constructor.call(null, fixStringDate(arguments[0]));
        this.__realDateObject = new JSDate(fixStringDate(arguments[0]))
    } else {
        JSDate.prototype.constructor.apply(null, arguments);
        if (arguments.length == 1)
            this.__realDateObject = new JSDate(arguments[0]);
        else if (arguments.length == 2)
            this.__realDateObject = new JSDate(arguments[0], arguments[1]);
        //etc...
    }
}

At this point it became obvious that newDate isn’t really acting as a subclass at all. It is acting more like a decorator around the Date type. So all the subclassing code can be removed, which gets us to:

(function () {
 
    function fixStringDate (sDate) {
        //Implementation omitted for brevity
        return sDate;
    }
 
    Date = (function (JSDate) {
 
        function newDate() {
            if (arguments.length === 1 && typeof arguments[0] === "string") {
                this.__realDateObject = new JSDate(fixStringDate(arguments[0]))
            } else {
                if (arguments.length == 1)
                    this.__realDateObject = new JSDate(arguments[0]);
                else if (arguments.length == 2)
                    this.__realDateObject = new JSDate(arguments[0], arguments[1]);
                //etc...
             }
        }
 
        var functions = ["getDate", "getDay", ... "valueOf"]
        for (var i = 0; i < functions.length; i++) {
            (function (funcName) {
                newDate.prototype[funcName] = function () {
                    return JSDate.prototype[funcName].apply(this.__realDateObject, arguments);
                };
            })(functions[i]);
        }
 
        newDate.parse = function (sDate) {
            return JSDate.parse(fixStringDate(sDate));
        }
 
        return newDate;
 
    })(Date)
 
})();

The Final Solution

Having done this, I realized I was making everything much harder than it needed to be. If I am not actually subclassing Date and just decorating it, all I really need to do is decorate the constructor and return a real Date object from it. Doing so lets me drop all that messy delegation code for the instance methods. Finally my code becomes simple and clean:

(function () {
 
    function fixStringDate (sDate) {
        //Implementation omitted for brevity
        return sDate;
    }
 
    Date = (function (JSDate) {
 
        function newDate() {
            var theDate;
            if (arguments.length === 1 && typeof arguments[0] === "string") {
                theDate = new JSDate(fixStringDate(arguments[0]))
            } else {
                if (arguments.length == 1)
                    theDate = new JSDate(arguments[0]);
                else if (arguments.length == 2)
                    theDate = new JSDate(arguments[0], arguments[1]);
                //etc...
             }
             return theDate;
        }
 
        newDate.parse = function (sDate) {
            return JSDate.parse(fixStringDate(sDate));
        }
 
        return newDate;
 
    })(Date)
 
})();

Intuition tells me this code has some drawbacks that could merit a return to the subclassing solution for certain edge cases. So I am glad to have gone through the whole learning experience. Yet for now, the simpler solution is enough. YAGNI.
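One such drawback can be sketched without the full polyfill: because the decorator returns a genuine Date built from the saved original constructor, instanceof checks against the replacement function fail unless its prototype is also patched. (The names JSDate and newDate below mirror the code above; this is a stripped-down illustration, not the polyfill itself.)

```javascript
// Stripped-down version of the decorator, showing the instanceof wrinkle.
var JSDate = Date;
function newDate(sDate) {
  return new JSDate(sDate); // the returned object replaces `this`
}

var d = new newDate("2011-02-09T12:39:09Z");
// d instanceof JSDate  -> true: it is a genuine Date
// d instanceof newDate -> false, unless we also set
//                         newDate.prototype = JSDate.prototype;
```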

Full code with tests is available on GitHub.

Automated Testing | 21 Nov 2013 09:46 pm

Author’s Note: I was very edified when none other than David Heinemeier Hansson, the creator of Rails, wrote a blog post expressing sentiments similar to, albeit more general than, those I present here with respect to the inappropriateness of low-level unit testing. I feel he has said more boldly what I have here intimated timidly.

I am a strong proponent of automated testing and test driven development. But if asked whether I do unit testing versus integration or some higher level of testing, I will usually ask the questioner to define “unit.”

To some, this may seem like questioning the definition of “is” but I don’t think so. Consider a set of unit tests that test a single class, the stereotypical case. Let us assume the class was somewhat complex and there were fifty tests to exercise all of the code in the class. Later the class gets refactored into a facade class, supporting the interface of the original class, and a small set of simpler classes behind it that work together to do the work of the original class.

Should we now write unit tests for each of these new classes? Why? What is the ROI? If the original set sufficiently tested the original class, and the refactoring was just that, a change in code structure that did not modify its behavior, do they not now sufficiently test the classes as a group?

Indeed, in my experience, even if we started out with a set of classes that work together to perform a business function, the best value in testing is still to write tests that test that the business function is performed correctly. Such tests will always be valid as long as the business function, i.e. the functional specification of the code, does not change. Tests below this level, in my experience, are fragile and break upon refactoring because they are too closely tied to the implementation of the code under test. If the value of testing is to enable refactoring with confidence, then the tests must survive the refactoring.

How does this fit in with TDD? In TDD we should never write code unless we have a failing test. If each test expresses a detail of the functional requirements of the software (as opposed to a detail of its implementation), then as each test passes, a functional requirement is met, we refactor and move on. It should not matter if we wrote one line, one class or a dozen classes to make the test pass.

Some may argue that writing comprehensive tests at this higher level of abstraction is too difficult. Rather, one should write general tests at that level. One might, for example, assert that a value is returned. But lower level tests should be written to assert that the correct value is returned for every edge case.

This can sometimes be true. Sometimes tests for edge cases at higher levels of abstraction are harder to set up than the effort is worth, and a lower level test, even if fragile, gives better ROI. In my experience, however, what generally makes testing the edge cases difficult at the higher level is the same thing that makes any testing difficult: bad design, inexperience with testing, bad tooling, or a combination thereof. Even granting that higher level tests are objectively harder to write, if one practices writing harder tests, it eventually gets easy and one becomes a better tester than one otherwise would have been.

In the end, for each business function, there needs to be an API whose implementation is responsible for performing that function, and that implementation needs to be testable within one or more contexts (a given system state that can be mocked or otherwise simulated). Whether the implementation is a single function, a class, or an entire module is not the relevant concern. The concern is what the inputs and outputs are, and testing that the anticipated inputs all lead to their correct outputs.

Yes, we want to write our tests at the lowest level possible within this context. But we do not want to go below this level. We do not want to be testing components that are simply implementation details of the software’s functionality. Such tests break under refactoring, lead to higher maintenance costs, rarely add value and hence have poor ROI.

There is an exception. For teams or developers new to TDD and writing well designed code in general, lower level tests can provide value. Writing lower level tests is easier. More importantly, being forced to make the lower level components testable helps one to learn good design. It enforces loose coupling, proper abstractions and the like. However once these skills are internalized, they can be exercised without needing to write tests to enforce them. These tests are a learning tool that can and should be discarded.

There is a corollary to this. If a developer doesn’t stop writing these low level tests once he no longer needs them, if he doesn’t instead start writing tests at the business functional level, it is entirely possible to develop a system that is fully “tested” but fails to do the right thing. Every low level unit can work as intended but in aggregate fail to work together as intended. One needs tests that assert that the system as a whole, or meaningful segments of it, perform as intended.

I will conclude by admitting that I have not truly answered the question I started out with. How do we correctly define a unit? I have asserted that the best definition of a unit is “the code that implements the lowest level business function.” In short, we need to be able to discern the boundary between business function and implementation detail. Pointers on how to do this shall perhaps be the topic of another post. For now I will only say that finding the level of test abstraction that will maximize ROI is as much an art, learned from experience, as it is anything else. But one will never develop the art unless one first realizes it is to be sought after. And challenging those who have not already done so to start looking is the real point of this post.

† Throughout this discussion, I am using the term “business functionality” loosely to refer to what the software is supposed to do conceptually, the details of its functional specification as distinct from details of the implementation of that specification. The term “business” itself may not be properly applicable to all real world cases.

Automated Testing & Web Development | 17 Nov 2013 09:28 pm

Background

I have been working with our UI team recently to help them do better testing of their Backbone.js based single-page web application. We found it useful to bring in Squire.js to assist us in doing dependency injection into our many Require.js modules. Squire works quite well for this but invariably when writing these sorts of apps, you need to pull in libraries that are not AMD compliant at all or are simply “AMD aware.” When these sorts of modules enter the mix, Squire needs a little help.

jQuery is a great example of this sort of library. Recent versions are AMD aware, and include a define() call. Unlike a true AMD module, though, jQuery’s functionality is not fully encapsulated within the factory function provided to define. Indeed, none of jQuery’s initialization is handled in its factory function. Rather jQuery initializes upon load, just like any legacy JavaScript module. jQuery must do this in order to remain compatible with the millions of lines of non-AMD code that use it.

The Problem

This presents a problem when using Squire. In order to supply alternate versions of AMD modules to the module under test, Squire creates a new Require.js context in which to load the module under test and its dependencies. Each new Require.js context in turn loads afresh all the JavaScript files that are needed by that context. If all of these files are AMD modules, whose state is fully encapsulated within their factory functions and which only initialize when told to do so, then everything is fine. In the case of jQuery, or other non-AMD modules, which initialize upon load and store state in the global space, this can be a problem.

Consider this simple example. Two separate tests use Squire to load jQuery and the jQuery.BlockUI plug-in. Depending on timing details between your browser and your web server, both jQuery instances may load first, followed by both plugin instances, or they may load interleaved: jQuery, plugin, jQuery, plugin. The latter works out fine; the former (and, in our experience, the most typical case) does not. In the former case, because of the shared global namespace, the second jQuery module loaded is the one that the global jQuery and $ variables point to when both of the BlockUI plug-ins load. Because of this, they both plug into the second jQuery instance, leaving the first one plug-in free. For non-AMD modules that access jQuery from the global $ variable, this is not a problem: the instance they get has the plugin. For AMD modules that are handed a jQuery instance as an argument to their factory function, the context that loaded the first jQuery instance is stuck with that instance, which never got its plug-in. This leads to a lot of failing tests.
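The ordering hazard can be modeled with a toy loader (loadJQuery and loadBlockUI below are illustrative stand-ins, not real loader code): each “jQuery” load overwrites the shared global, so plug-ins loading afterwards all attach to whichever copy loaded last:

```javascript
// Toy model of the shared-global hazard: two jQuery loads, then two
// plug-in loads (the "former" ordering described above).
var globals = {};

function loadJQuery() {            // each load overwrites globals.$
  globals.$ = { plugins: [] };
  return globals.$;
}

function loadBlockUI() {           // plugs into whatever globals.$ is *now*
  globals.$.plugins.push("blockUI");
}

var first = loadJQuery();          // context A's jQuery
var second = loadJQuery();         // context B's jQuery takes over globals.$
loadBlockUI();
loadBlockUI();
// second.plugins is ["blockUI", "blockUI"]; first.plugins is []
```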

The Solution

At first pass, it may seem that the solution is to somehow ensure the load order, or otherwise ensure that both jQuery instances get their plug-in. That may be a theoretical ideal, but most non-AMD libraries were never designed to have multiple instances loaded and running, and doing so can cause all kinds of problems. jQuery, because it supports loading multiple versions of itself at the same time, actually handles this better than most. Nonetheless, the best solution is to simply avoid loading multiple instances of non-AMD libraries. The question is how to do this in Squire.

As best we can tell, Squire does not explicitly support this. However, there is a simple workaround that can be put in place to enable it. The trick is to require jQuery, any plug-ins, and any other non-AMD modules that may be loading twice, at the same time as you require Squire itself. Then, for each of these libraries, tell Squire to mock the module and provide Squire with the initial instance as the mock. For modules that don’t return anything when invoked by Require (the BlockUI plug-in in our case), Squire must still be told to mock them, but null can be provided as the value for the mock.

Here is some example code taken from a complete working example on GitHub.

define([
  'vendor/squire/Squire', 
  'data/mock-data', 
  'vendor/jquery', 
  'vendor/jquery.blockui'], function(Squire, mock_data, $) {
  var injector = new Squire();
  injector.mock('data/real-data', mock_data);
 
  //Our fix to avoid loading jQuery and BlockUI twice
  injector.mock('jquery', function() { return $; });
  injector.mock('vendor/jquery.blockui', null);
 
  injector.require(['app/example-view'], function(View) {
    describe('Testing with Squire only', function() {
      var view = null;
      before(function() {
        view = new View();
      });
      it('$.blockUI should be defined', function() {
        assert.isDefined(view.getBlockUI(), '$.blockUI was undefined in example-view');
      });
      it('the data should be mocked', function() {
        view.getDataType().should.equal('mock');
      });
    });
  });
});

This approach works because, by requiring the modules up front using Require and its default context, we rely on the standard Require logic to ensure the modules only load once. By telling Squire to mock the modules, it will not try to load them but will instead use the mocks provided: the common instances already loaded by Require.

In the case of non-AMD libraries that return nothing from the factory function, such as the BlockUI plug-in above, simply requiring the library causes Require to load it. Upon load, the library does its thing (registers itself with jQuery) and that is all that is needed from it. Telling Squire to mock it keeps it from being loaded again in the new context, and because the library doesn’t provide a value, providing null as its mock value to Squire works just fine.

One final item to note is that in defining the mock for jQuery we cannot simply write

injector.mock('jquery', $ );

rather we must do

injector.mock('jquery', function() { return $; });

The reason for this is that, contrary to the Squire documentation, the second argument to mock is not always “the mock itself.” The second argument to mock works just like the final argument to define in Require. It may be an object or a function. If it is a function, then Require presumes it to be a factory function that it will invoke in order to get the mock. Since both jQuery and classes (i.e. constructors) are functions, they must be wrapped in a factory function so that they are not themselves invoked as factories.
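The factory-vs-value rule can be illustrated with a stand-in resolver (resolveMockValue is hypothetical, not Squire’s or Require’s actual internals) that treats module values the way define() does:

```javascript
// Hypothetical stand-in for how a define()-style loader treats a
// module value: functions are factories, everything else is used as-is.
function resolveMockValue(value) {
  return typeof value === "function" ? value() : value;
}

var $ = function jQueryStub() {};            // jQuery itself is a function...
resolveMockValue($);                         // ...so it gets *invoked*: undefined!
resolveMockValue(function () { return $; }); // wrapped: yields the jQuery stub
```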
