SmallTestingGuide

What are Small Tests?

Small Tests are the tests that most developers read, write, and run with the greatest frequency. Small Tests are bundled with the source code, can be executed in any environment, and run extremely fast. Small tests cover the codebase at as fine a granularity as possible so that problems are easy to locate when tests fail.

A Basic Example

Consider the following example, adapted from Martin Fowler [1]:

import unittest

TALISKER = "Talisker"
HIGHLAND_PARK = "Highland Park"

class OrderTests(unittest.TestCase):

    def setUp(self):
        self.warehouse = Warehouse()
        self.warehouse.add(TALISKER, 50) 
        self.warehouse.add(HIGHLAND_PARK, 25) 

    def test_order_is_filled_if_enough_in_warehouse(self):
        order = Order(TALISKER, 50) 
        order.fill(self.warehouse)
        self.assertTrue(order.is_filled())
        self.assertEqual(self.warehouse.get_inventory(TALISKER), 0)

    def test_order_does_not_remove_if_not_enough(self):
        order = Order(TALISKER, 51) 
        order.fill(self.warehouse)
        self.assertFalse(order.is_filled())
        self.assertEqual(self.warehouse.get_inventory(TALISKER), 50)

This example shows the basic structure of a small test in Python. It should look familiar to anyone who has experience with xUnit-style tests in any language. The test case [[OrderTests]] inherits from unittest.TestCase in order to fit into the normal testing frameworks such as python-nosetests. The setUp method prepares the initial state common to all of the tests in this test case. The name of each test method begins with "test" so that unittest knows which methods to run. Assertions are made using self.assert*()--more thorough documentation of the available assertion methods can be found at [2]. It is not shown above, but an optional tearDown method can take care of any cleanup actions that must take place after each test method.
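
The example above does not show Warehouse or Order themselves. As a minimal sketch, assuming an in-memory Warehouse and an Order that checks inventory before removing anything, the two classes might look like the following. Both implementations are hypothetical and exist only so that a test case can run on its own; the test case here also includes the optional tearDown method.

```python
import unittest

TALISKER = "Talisker"


class Warehouse(object):
    """Hypothetical in-memory warehouse (an assumption, not the real class)."""

    def __init__(self):
        self._inventory = {}

    def add(self, item, qty):
        self._inventory[item] = self.get_inventory(item) + qty

    def remove(self, item, qty):
        # Assumption: removing more than is in stock is an error.
        if qty > self.get_inventory(item):
            raise ValueError("not enough %s in stock" % item)
        self._inventory[item] = self.get_inventory(item) - qty

    def get_inventory(self, item):
        return self._inventory.get(item, 0)


class Order(object):
    """Hypothetical order that is filled only if enough is in stock."""

    def __init__(self, item, quantity):
        self._item = item
        self._quantity = quantity
        self._filled = False

    def fill(self, warehouse):
        if warehouse.get_inventory(self._item) >= self._quantity:
            warehouse.remove(self._item, self._quantity)
            self._filled = True

    def is_filled(self):
        return self._filled


class OrderTestsWithTearDown(unittest.TestCase):

    def setUp(self):
        self.warehouse = Warehouse()
        self.warehouse.add(TALISKER, 50)

    def tearDown(self):
        # Runs after each test method, provided setUp succeeded.
        self.warehouse = None

    def test_order_is_filled_if_enough_in_warehouse(self):
        order = Order(TALISKER, 50)
        order.fill(self.warehouse)
        self.assertTrue(order.is_filled())
        self.assertEqual(self.warehouse.get_inventory(TALISKER), 0)
```

Dropping a reference in tearDown is only a placeholder here. In a real test case, tearDown would typically release resources such as temporary files.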

Isolation

The basic example above may have a small problem. The system under test (SUT), Order, is tested alongside its collaborator, Warehouse. This means that if there is a problem with the code in Warehouse, it is likely that some of the [[OrderTests]] will fail. This may create a confusing situation when another developer encounters this failure and has to figure out where the problem is.

First, we must acknowledge that it is not certain that this is a real problem. This particular example is very simple and hopefully easy to understand. As such, it is probably sufficient to add truly isolated tests for Warehouse and not worry about a lack of isolation in the [[OrderTests]].

class WarehouseTests(unittest.TestCase):

    def setUp(self):
        self.warehouse = Warehouse()
        self.warehouse.add('Glenlivit', 10) 

    def test_warehouse_shows_new_inventory(self):
        self.assertEqual(self.warehouse.get_inventory('Glenlivit'), 10)

    def test_warehouse_shows_added_inventory(self):
        self.warehouse.add('Glenlivit', 15) 
        self.assertEqual(self.warehouse.get_inventory('Glenlivit'), 25) 

    def test_warehouse_shows_removed_inventory(self):
        self.warehouse.remove('Glenlivit', 10) 
        self.assertEqual(self.warehouse.get_inventory('Glenlivit'), 0)

With this added coverage, a developer will at least be able to infer a problem with Warehouse if both [[WarehouseTests]] and [[OrderTests]] are failing. This is bending the rule of maximum isolation for small tests, but it is a practical approach.

Test Doubles

In our example, a Warehouse is very easy to create, doesn't introduce any other dependencies, and presumably runs very fast. However, if this were not the case, it would be imperative to test Order in isolation. But how could we accomplish this isolation?

The principal approach to isolating the SUT from its dependencies is to introduce a Test Double (like a stunt double) that fills in for each dependency for the purposes of that test. There are a variety of more specific types of Test Doubles--and an even wider range of vocabulary used to describe those types. For the purposes of internal consistency, this document will use the vocabulary defined by Gerard Meszaros in his book xUnit Test Patterns. More detail about this particular terminology can be found at [3]. Not all of the approaches Meszaros describes are relevant here, so we won't cover them all, but it is useful to be familiar with all of them.

Test Stub

A Test Stub is a Test Double that provides a canned response to method calls, irrespective of the inputs provided to the method call. It helps when you are trying to set up a particular situation in which to exercise the SUT. After the exercise, the state of the SUT is verified in the normal way with assertions.

class OrderTestsWithStub(unittest.TestCase):

    def test_order_is_filled_if_enough_in_warehouse(self):

        class StubWarehouse(object):

            def get_inventory(self, item):
                return 50

            def remove(self, item, qty):
                pass

        warehouse = StubWarehouse()
        order = Order(TALISKER, 50)
        order.fill(warehouse)
        self.assertTrue(order.is_filled())

In this example, the [[StubWarehouse]] pays no attention to the item in question--it always returns 50 for how much is available. In addition, only the Warehouse methods that are required for this test to run are defined. Because of the limited applicability of this stub, it is defined directly in the test. A more configurable stub might make more sense living at a higher level in the code.
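
As a rough sketch of what a more configurable stub could look like, the version below takes its canned inventory as a constructor argument so that several tests can share it. The class name ConfigurableStubWarehouse and the Order implementation are assumptions made for illustration only.

```python
import unittest

TALISKER = "Talisker"


class Order(object):
    """Hypothetical stand-in for the Order class under test."""

    def __init__(self, item, quantity):
        self._item = item
        self._quantity = quantity
        self._filled = False

    def fill(self, warehouse):
        if warehouse.get_inventory(self._item) >= self._quantity:
            warehouse.remove(self._item, self._quantity)
            self._filled = True

    def is_filled(self):
        return self._filled


class ConfigurableStubWarehouse(object):
    """Stub whose canned inventory is chosen by each test."""

    def __init__(self, inventory):
        self._inventory = inventory

    def get_inventory(self, item):
        # Still ignores the item: every item reports the same quantity.
        return self._inventory

    def remove(self, item, qty):
        pass


class OrderTestsWithConfigurableStub(unittest.TestCase):

    def test_order_is_filled_if_enough_in_warehouse(self):
        order = Order(TALISKER, 50)
        order.fill(ConfigurableStubWarehouse(inventory=50))
        self.assertTrue(order.is_filled())

    def test_order_is_not_filled_if_not_enough(self):
        order = Order(TALISKER, 51)
        order.fill(ConfigurableStubWarehouse(inventory=50))
        self.assertFalse(order.is_filled())
```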

Mock Object

Mock Objects take a different approach from classical unit testing in order to verify the correctness of code. Where the classical approach verifies the end state after the SUT is exercised, Mock Objects verify the behavior of the SUT during the exercise.

Mocks can be coded directly, but most often they are created with the help of a separate library. One such library for Python is Mox [4]. With Mox, when you first create a mock object, it is in "record" mode. The test then goes through the motions with the mock object, teaching it what to expect from the SUT and what to return. When this is complete, the mock object is placed in "replay" mode and the SUT is exercised. During the verification phase, the mock object confirms that its methods were called in order with the appropriate arguments.

# The comments in this example are included only for the sake of
# showing the subtleties of mox--they should not be included in a
# real test. 
class OrderTestsWithMox(unittest.TestCase):

    def test_order_is_filled_if_enough_in_warehouse(self):
        # Create the Order as usual
        order = Order(TALISKER, 50) 

        # Create the mock warehouse object in record mode
        mocker = mox.Mox()
        warehouse = mocker.CreateMockAnything()

        # Record the sequence of actions expected from the Order object
        warehouse.get_inventory(TALISKER).AndReturn(50)
        warehouse.remove(TALISKER, 50) 

        # Put all mock objects in replay mode
        mocker.ReplayAll()

        # Exercise the Order object
        order.fill(warehouse)

        # Verify that the order is filled and that the warehouse saw
        # the correct behavior
        self.assertTrue(order.is_filled())
        mocker.VerifyAll()
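
For comparison, a mock can also be coded directly, without a library. The sketch below is a hand-written illustration, not part of Mox: the class HandCodedMockWarehouse, its expect and verify methods, and the Order implementation are all assumptions. Expectations are registered first, each call is checked against the next expectation, and verify confirms that nothing expected was left uncalled.

```python
import unittest

TALISKER = "Talisker"


class Order(object):
    """Hypothetical Order that checks inventory before removing."""

    def __init__(self, item, quantity):
        self._item = item
        self._quantity = quantity
        self._filled = False

    def fill(self, warehouse):
        if warehouse.get_inventory(self._item) >= self._quantity:
            warehouse.remove(self._item, self._quantity)
            self._filled = True

    def is_filled(self):
        return self._filled


class HandCodedMockWarehouse(object):
    """Hand-written mock: expectations are set first, then checked."""

    def __init__(self):
        self._expected = []

    def expect(self, method, args, returns=None):
        self._expected.append((method, args, returns))

    def _called(self, method, args):
        # Each call must match the next expectation, in order.
        if not self._expected:
            raise AssertionError("unexpected call: %s%r" % (method, args))
        expected_method, expected_args, returns = self._expected.pop(0)
        if (method, args) != (expected_method, expected_args):
            raise AssertionError(
                "expected %s%r, got %s%r"
                % (expected_method, expected_args, method, args))
        return returns

    def get_inventory(self, item):
        return self._called("get_inventory", (item,))

    def remove(self, item, qty):
        return self._called("remove", (item, qty))

    def verify(self):
        if self._expected:
            raise AssertionError("calls never made: %r" % (self._expected,))


class OrderTestsWithHandCodedMock(unittest.TestCase):

    def test_order_is_filled_if_enough_in_warehouse(self):
        order = Order(TALISKER, 50)
        warehouse = HandCodedMockWarehouse()
        warehouse.expect("get_inventory", (TALISKER,), returns=50)
        warehouse.expect("remove", (TALISKER, 50))

        order.fill(warehouse)

        self.assertTrue(order.is_filled())
        warehouse.verify()
```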

Behavior verification, however, is risky business. Often we are not concerned with the particular behavior of the SUT; we just want to make sure that it correctly implements its interface.

For example, suppose we changed the implementation of Order to the following.

class Order(object):
    # ...
    def fill(self, warehouse):
        # Assumes warehouse.remove raises when there is not enough inventory.
        try:
            warehouse.remove(self._item, self._quantity)
            self._filled = True
        except Exception:
            pass
    # ...

This is a perfectly valid approach, yet it would break the mock object test above, because Order no longer calls get_inventory before calling remove. This might create confusion and it would certainly require making modifications to the test. Because mock objects tend to overspecify the requirements of the software, it is recommended that developers avoid mock objects and behavior verification unless it is truly necessary.

A good example of a case where behavior verification is preferred is when testing a cache. A cache is specifically intended to show no noticeable difference in interface behavior from its underlying backend. In this case, behavior verification would be the simplest and best way to verify the intended functionality.
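
To make the cache case concrete, here is a minimal sketch using a hand-written recording double rather than a mocking library. The Cache and RecordingBackend classes are hypothetical; the point is only that the returned values cannot reveal whether caching happened, while the recorded backend calls can.

```python
import unittest


class RecordingBackend(object):
    """Hand-written double that records every key fetched from it."""

    def __init__(self, data):
        self._data = data
        self.get_calls = []

    def get(self, key):
        self.get_calls.append(key)
        return self._data[key]


class Cache(object):
    """Hypothetical read-through cache exposing the same get() interface."""

    def __init__(self, backend):
        self._backend = backend
        self._values = {}

    def get(self, key):
        if key not in self._values:
            self._values[key] = self._backend.get(key)
        return self._values[key]


class CacheTests(unittest.TestCase):

    def test_repeated_get_hits_backend_only_once(self):
        backend = RecordingBackend({"Talisker": 50})
        cache = Cache(backend)

        self.assertEqual(cache.get("Talisker"), 50)
        self.assertEqual(cache.get("Talisker"), 50)

        # The returned values look the same with or without caching;
        # only the interaction with the backend shows the cache working.
        self.assertEqual(backend.get_calls, ["Talisker"])
```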

Fake Object

A Fake Object provides a working implementation of the object it is standing in for, but usually with simplifications that make it suitable for testing and unsuitable for production. It is often desirable to create Fake Objects to stand in for databases that would normally cause a test to run too slowly to qualify as a small test.

class FakePersonGateway(object):

    def __init__(self):
        self._person_data = {}

    def insert(self, person):
        person.id = len(self._person_data)
        self._person_data[person.id] = person

    def find_by_name(self, name):
        for person in self._person_data.values():
            if person.name == name:
                return person

    def find_by_parent(self, parent_id):
        people = []
        for person in self._person_data.values():
            if person.mother_id == parent_id or person.father_id == parent_id:
                people.append(person)
        return people


class FamilyTreeTests(unittest.TestCase):

    def setUp(self):
        self.gateway = FakePersonGateway()
        bob = Person('Bob Smith')
        self.gateway.insert(bob)
        alice = Person('Alice Smith')
        self.gateway.insert(alice)
        james = Person('James Smith', father_id=bob.id, mother_id=alice.id)
        self.gateway.insert(james)

    def test_child_descends_from_mother(self):
        tree = FamilyTree(self.gateway)
        self.assertTrue(tree.descends_from('Alice Smith', 'James Smith'))

    def test_father_does_not_descend_from_mother(self):
        tree = FamilyTree(self.gateway)
        self.assertFalse(tree.descends_from('Alice Smith', 'Bob Smith'))

In this example, the SUT is [[FamilyTree]], which depends on a [[PersonGateway]] to look up a given person's children. Because small tests are not allowed to talk to a database (to prevent slow tests), we substitute in a FakePersonGateway which operates out of a local in-memory dictionary.
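
Person and [[FamilyTree]] are not shown above. The following is one possible sketch of them, written only to illustrate how the SUT might use the gateway; the descends_from algorithm and the Person fields are assumptions. A condensed copy of the fake gateway is repeated so that the block runs on its own.

```python
class Person(object):
    """Hypothetical record type; id is assigned by the gateway on insert."""

    def __init__(self, name, father_id=None, mother_id=None):
        self.id = None
        self.name = name
        self.father_id = father_id
        self.mother_id = mother_id


class FakePersonGateway(object):
    """Condensed copy of the in-memory fake shown above."""

    def __init__(self):
        self._person_data = {}

    def insert(self, person):
        person.id = len(self._person_data)
        self._person_data[person.id] = person

    def find_by_name(self, name):
        for person in self._person_data.values():
            if person.name == name:
                return person

    def find_by_parent(self, parent_id):
        return [p for p in self._person_data.values()
                if p.mother_id == parent_id or p.father_id == parent_id]


class FamilyTree(object):
    """Hypothetical SUT: answers ancestry questions through a gateway."""

    def __init__(self, person_gateway):
        self._person_gateway = person_gateway

    def descends_from(self, ancestor_name, descendant_name):
        ancestor = self._person_gateway.find_by_name(ancestor_name)
        descendant = self._person_gateway.find_by_name(descendant_name)
        if ancestor is None or descendant is None:
            return False
        # Walk down the tree from the ancestor, one generation at a time.
        to_visit = [ancestor.id]
        seen = set()
        while to_visit:
            parent_id = to_visit.pop()
            if parent_id in seen:
                continue
            seen.add(parent_id)
            for child in self._person_gateway.find_by_parent(parent_id):
                if child.id == descendant.id:
                    return True
                to_visit.append(child.id)
        return False
```

Because [[FamilyTree]] only calls find_by_name and find_by_parent, it cannot tell whether it is talking to the fake or to a database-backed gateway.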

Injecting Test Doubles

In the Warehouse and Order examples above, there is no difficulty in substituting in the Test Double warehouse for the real thing. However, in some cases the dependency you want to replace is hard-coded into the SUT. In this case there are a few approaches to injecting the Test Double.

Inversion of Control

One approach to solving this problem is to adopt the architectural pattern of Inversion of Control [5]. In short, this approach can be summed up as "Don't use hard-coded dependencies."

Inversion of control was used in the [[FamilyTree]] example above. Since the initialization of [[FamilyTree]] was written as

class FamilyTree(object):

    def __init__(self, person_gateway):
        self._person_gateway = person_gateway

it was possible to directly inject the [[FakePersonGateway]] into the [[FamilyTree]] object during construction. If the dependency had been hard-coded, as

class FamilyTree(object):

    def __init__(self):
        self._person_gateway = mylibrary.dataaccess.person_gateway()

then we would have been tempted to write a less useful, slow test that operated directly on a test database.

Stubout

Another approach to managing hard-coded dependencies is shown by the python library stubout (which is bundled with python mox).
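
The stubout API itself is not reproduced here. The underlying idea can be sketched with the standard library alone: replace the hard-coded attribute for the duration of a test and restore it afterward. This is a hand-rolled illustration of the technique, not stubout; the dataaccess namespace, the gateway accessor, and the test names are hypothetical.

```python
import types
import unittest

# Hypothetical stand-in for a module such as mylibrary.dataaccess.
dataaccess = types.SimpleNamespace(
    person_gateway=lambda: "real gateway (slow, talks to a database)")


class FamilyTree(object):

    def __init__(self):
        # Hard-coded dependency, looked up at construction time.
        self._person_gateway = dataaccess.person_gateway()

    def gateway(self):
        return self._person_gateway


class FakePersonGateway(object):
    """Placeholder fake; a real one would implement the gateway methods."""


class FamilyTreeStubbingTests(unittest.TestCase):

    def setUp(self):
        # Remember the real factory, then swap in one that returns a fake.
        self._original_factory = dataaccess.person_gateway
        dataaccess.person_gateway = FakePersonGateway

    def tearDown(self):
        # Restore the original so that other tests are unaffected.
        dataaccess.person_gateway = self._original_factory

    def test_family_tree_uses_fake_gateway(self):
        tree = FamilyTree()
        self.assertIsInstance(tree.gateway(), FakePersonGateway)
```

The swap works only because [[FamilyTree]] looks up dataaccess.person_gateway when it is constructed. A reference captured earlier, for example through a from-import, would not see the replacement.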