

Why Write Automated Tests?

Often when a developer tests code, the goal is to verify that a recently written piece of code works correctly. But good automated tests accomplish much more: they primarily serve other people, acting as living documentation and as a continuous troubleshooting aid while the system evolves and grows. Making tests easy for others to run and understand is just as important as testing the production code itself.

Types of Tests

There are many different types of tests. You will very often hear developers use terminology like unit test, integration test, system test, black box test, functional test, characterization test, and more. Often, the same term means something different when used by different people. To avoid semantic arguments, this guide only distinguishes three types of tests: small, medium and large tests. The hope is that these terms are general enough to mean something similar to everyone. This guide will be focusing on small tests.

All Tests

Every test should be stable, independent, and readable.


Stable tests need to be changed less frequently than unstable tests. Change isn't always avoidable, but stability can be maximized by writing tests that only depend on the public interfaces of the code under test. In this way only interface changes require tests to be modified. For small tests this requirement means only calling public functions and methods. In large tests, this may mean different things depending on the context. In a web application, for example, stability might require only depending on the public HTTP API.
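The point about depending only on public interfaces can be sketched in a small Python test. The `PriceCalculator` class, its tax rate, and the test are all invented for illustration:

```python
import unittest


class PriceCalculator:
    """Hypothetical production class with one public method and a private helper."""

    def total(self, prices):
        return self._apply_tax(sum(prices))

    def _apply_tax(self, amount):
        # Private helper: tests should never call this directly.
        return round(amount * 1.08, 2)


class PriceCalculatorTest(unittest.TestCase):
    def test_total_applies_tax_to_sum(self):
        # Depends only on the public method, so refactoring or renaming
        # _apply_tax does not force this test to change.
        self.assertEqual(PriceCalculator().total([10.0, 5.0]), 16.2)
```

If `_apply_tax` is later split into several helpers, this test keeps passing unchanged, which is exactly the stability property described above.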


A test should stand on its own two feet and be self-contained. No test should depend on another test to run.


Readability is as important in test code as it is in production code. A test is most likely the first thing a developer will read when they make a change that causes the test to fail. The test itself is then the first and best opportunity to correct a misunderstanding about how the production code works. This observation argues not just for cleanliness of the test code but also for maximum clarity and significance in the test name and its assertion messages.
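As an illustration of descriptive test names and assertion messages, consider this sketch (the leap-year function is a made-up example, not part of any system described here):

```python
import unittest


def is_leap_year(year):
    """A year is a leap year if divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    # The test name states the behavior, so a failure report alone reads
    # like documentation of the rule being violated.
    def test_century_years_are_not_leap_unless_divisible_by_400(self):
        self.assertFalse(
            is_leap_year(1900),
            "1900 is divisible by 100 but not by 400, so it must not be a leap year")
        self.assertTrue(
            is_leap_year(2000),
            "2000 is divisible by 400, so it must be a leap year")
```

A developer who breaks this test sees both the rule (in the name) and the specific counterexample (in the message) without opening the production code.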

Small Tests

AKA: Unit tests

Most developers in their day-to-day work will be reading, writing, and running small tests. The small-test workflow has a critical impact on a developer's productivity, so all small tests should be extremely fast, maximally isolated, and portable. The goal is that any developer can run the tests after each small change and that any failing test is as useful as possible in diagnosing the problem.

Example: Unit tests that test application logic of each class fall under this category.


Small tests should not take longer than 1/10th of a second to run--and on average they should run much faster than that [Feathers 2005]. The idea is that a developer can get feedback on the effects of a code change in a few seconds. How do we make sure small tests run this fast? There are a few rules to follow. [Feathers 2005]

  1. Small tests do not talk to a database.
  2. Small tests do not touch the file system.
  3. Small tests do not communicate across a network.

It's not that these are bad things to do in tests. It's just that tests that cover these areas take too long to provide feedback. Hence, they are the province of medium and large tests and are run as part of a larger development cycle.
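One common way to keep a small test off the network is to hand the code under test a fake collaborator. A minimal sketch, assuming a hypothetical `UserDirectory` class whose real client would make HTTP calls:

```python
import unittest


class UserDirectory:
    """Hypothetical class that depends on a client object, not a live service."""

    def __init__(self, client):
        self._client = client

    def display_name(self, user_id):
        record = self._client.get_user(user_id)
        return record.get("name", "<unknown>")


class FakeClient:
    # Stands in for the real network client, keeping the test fast and hermetic.
    def get_user(self, user_id):
        return {"id": user_id, "name": "alice"}


class UserDirectoryTest(unittest.TestCase):
    def test_display_name_reads_name_field(self):
        directory = UserDirectory(FakeClient())
        self.assertEqual(directory.display_name(42), "alice")
```

The test exercises the logic of `display_name` in milliseconds; the real client's behavior against a live service is left to medium and large tests.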


Small tests should test at the finest granularity of all sizes of tests. This ensures that it is very easy to find the exact source of a problem when tests fail--perhaps just from the name of the test itself. Isolation can be achieved by having only one concept per test. Often, this means only one function in only one class should be tested at a time. In practice, if a small test restricts itself to one feature of a few collaborating classes from the same package, it may be sufficiently isolated. Again, sufficient isolation means that it is very easy for a developer to find the source of the problem when a test fails.
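The "one concept per test" idea can be sketched with a trivial stack (a hypothetical example class, not from any particular codebase):

```python
import unittest


class Stack:
    """Minimal stack used only to illustrate test isolation."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class StackTest(unittest.TestCase):
    # One concept per test: a failure in pop ordering does not mask a
    # failure in basic push behavior, and the failing test's name alone
    # points at the broken behavior.
    def test_push_then_pop_returns_pushed_item(self):
        stack = Stack()
        stack.push("a")
        self.assertEqual(stack.pop(), "a")

    def test_pop_returns_items_in_reverse_push_order(self):
        stack = Stack()
        stack.push("first")
        stack.push("second")
        self.assertEqual(stack.pop(), "second")
        self.assertEqual(stack.pop(), "first")
```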


Small tests should be runnable by any developer without requiring more hardware resources than ordinary development work does.

Medium Tests

AKA: Functional Tests

Medium-sized tests are less isolated and slower to run than small tests, but just as portable. The medium testing level is where we can bring in real dependencies such as database, file system, or hypervisor interaction. It is also acceptable to employ fake dependencies to isolate a subsystem for testing. Medium tests should not require a full deployment of the system under test. So, for example, medium tests might require a local SQLite database, but must not require a MySQL server installation. In this way, any developer can run medium tests locally, though not as frequently as small tests because they are not expected to run as fast.
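A medium test along these lines might exercise real SQL against an in-memory SQLite database using Python's standard sqlite3 module. The table and queries here are invented for illustration:

```python
import sqlite3
import unittest


class MediumUserStoreTest(unittest.TestCase):
    """Runs real SQL against SQLite; no separate database server is required."""

    def setUp(self):
        # An in-memory database is created fresh for each test,
        # keeping tests independent of one another.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

    def tearDown(self):
        self.db.close()

    def test_insert_then_query_returns_row(self):
        self.db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
        row = self.db.execute(
            "SELECT name FROM users WHERE id = 1").fetchone()
        self.assertEqual(row[0], "alice")
```

This still runs on any developer's machine with no installation step, yet covers real database behavior that a small test with fakes cannot.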

There are a lot of different approaches that fit under the category of medium testing. But too much testing at this level risks making the system hard to change and duplicating effort. Therefore, a project-specific strategy is required to make medium tests most effective.

Example: Functional tests, where the app is broken down into functional components and each component is tested for its behavior and interactions, are good candidates for medium tests.

Large Tests

AKA: Integration Tests

Large tests focus on end-to-end functionality of the entire system in a realistic deployment. In general, no test fakes are allowed in large tests. Large tests are the costliest tests to run, both in terms of time and hardware resources. As such, it is not expected that every developer can run large tests directly. However, large tests should be run against trunk as frequently as possible in an automated fashion, and, if possible, developers should be given appropriate tools to trigger large test runs on development branches. Large tests are highly dependent upon the installation against which they're running. Large tests are expected to be capable of testing multiple systems functioning together (e.g. Dashboard, Keystone, Glance, and Nova) under a predefined configuration.

Unlike small and medium tests, it is not required that large tests live in the same source repository as the system under test. However, all developers should have access to the large tests so that they can be added and modified along with changes to the system. Indeed all developers share the responsibility of keeping large tests up-to-date.

Example: Scenario and complete-system tests, where user interaction with the system under test is exercised, are examples of large tests.


Feathers, Michael C. (2005). Working Effectively with Legacy Code. Prentice Hall. ISBN 0-13-117705-2.