Generalities about Testing (automake)

15.1 Generalities about Testing

The purpose of testing is to determine whether a program or system behaves as expected (e.g., known inputs produce the expected outputs, error conditions are correctly handled or reported, and older bugs do not resurface).

The minimal unit of testing is usually called a test case, or simply a test. How a test case is defined or delimited, and even what exactly constitutes a test case, depends heavily on the testing paradigm and/or framework in use, so we won’t attempt any more precise definition. The set of test cases for a given program or system constitutes its testsuite.

A test harness (also testsuite harness) is a program or software component that executes all (or part of) the defined test cases, analyzes their outcomes, and reports or registers these outcomes appropriately. Again, the details of how this is accomplished (and how the developer and user can influence it or interface with it) vary widely, and we’ll attempt no precise definition.
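
As a rough illustration of the harness role described above, the following Python sketch runs a set of test cases, records each outcome, and reports a summary. It assumes a deliberately simple, hypothetical convention (a test case is a callable that returns True on success); the names run_testsuite, test_addition and test_upper are invented for this example and do not correspond to any Automake interface.

```python
from collections import Counter


def run_testsuite(test_cases):
    """Run every test case and return (per-test results, outcome counts)."""
    results = {}
    for name, test in test_cases.items():
        # Assumed convention: a test is a callable returning True on success.
        results[name] = "PASS" if test() else "FAIL"
    return results, Counter(results.values())


def test_addition():
    return 1 + 1 == 2


def test_upper():
    # Deliberately wrong expectation, so the sketch shows a failure too.
    return "abc".upper() == "abc"


results, summary = run_testsuite({"addition": test_addition, "upper": test_upper})
for name, outcome in results.items():
    print(f"{outcome}: {name}")
print(f"passed: {summary['PASS']}, failed: {summary['FAIL']}")
```

Real harnesses differ mainly in how test cases are discovered, how outcomes are communicated (return values, exit statuses, structured output), and how the summary is presented.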

A test is said to pass when it can determine that the condition or behaviour it means to verify holds, and is said to fail when it can determine that that condition or behaviour does not hold.

Sometimes, tests can rely on non-portable tools or prerequisites, or simply make no sense on a given system (for example, a test checking a Windows-specific feature makes no sense on a GNU/Linux system). In this case, according to the definition above, the tests can be considered neither passed nor failed; instead, they are skipped, that is, they are not run, or their results are in any case ignored when counting failures and successes. Skips are usually explicitly reported though, so that the user will be aware that not all of the testsuite has really run.
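
The skipping behaviour can be sketched in Python as follows. This is only an assumed, minimal model: the functions run_test and summarize, the required_platform parameter, and the placeholder test are all hypothetical names chosen for the example. A skipped test is never executed, and it is reported separately rather than being counted as a pass or a failure.

```python
import sys


def run_test(test, required_platform=None, current_platform=sys.platform):
    """Return "SKIP" if the test makes no sense here, else "PASS" or "FAIL"."""
    if required_platform is not None and not current_platform.startswith(required_platform):
        return "SKIP"  # the test body is never executed
    return "PASS" if test() else "FAIL"


def summarize(outcomes):
    """Count each kind of outcome; skips count as neither pass nor fail."""
    return {
        "pass": outcomes.count("PASS"),
        "fail": outcomes.count("FAIL"),
        "skip": outcomes.count("SKIP"),
    }


def test_portable():
    return "a,b".split(",") == ["a", "b"]


def test_windows_feature():
    # Placeholder for a check that is only meaningful on Windows.
    return True


# Pretend we are on GNU/Linux, whatever the real host is.
outcomes = [
    run_test(test_portable, current_platform="linux"),
    run_test(test_windows_feature, required_platform="win32", current_platform="linux"),
]
print(summarize(outcomes))
```

Keeping the skip count in the summary is what lets the user notice that part of the testsuite did not really run.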

It’s not uncommon, especially during early development stages, that some tests fail for known reasons, and that the developer doesn’t want to tackle these failures immediately (this is especially true when the failing tests deal with corner cases). In this situation, the better policy is to declare that each of those failures is an expected failure (or xfail). If a test that is expected to fail ends up passing instead, many testing environments will flag the result as a special kind of failure called an unexpected pass (or xpass).
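
The relationship between a raw result and an expectation can be written down as a small mapping. The Python sketch below is an assumption-laden illustration, not the behaviour of any particular harness: classify and suite_succeeded are hypothetical helpers, and treating an unexpected pass as a failure of the whole run follows the description above of what many testing environments do.

```python
def classify(passed, expected_to_fail=False):
    """Combine the raw result with the declared expectation."""
    if expected_to_fail:
        return "XPASS" if passed else "XFAIL"
    return "PASS" if passed else "FAIL"


def suite_succeeded(outcomes):
    """In this sketch, FAIL and XPASS sink the run; XFAIL does not."""
    return not any(outcome in ("FAIL", "XPASS") for outcome in outcomes)


# A known corner-case bug: the test fails, as the developer declared it would.
known_bug = classify(passed=False, expected_to_fail=True)

# The same test after the bug is fixed but before the declaration is removed.
stale_declaration = classify(passed=True, expected_to_fail=True)

print(known_bug, suite_succeeded(["PASS", known_bug]))
print(stale_declaration, suite_succeeded(["PASS", stale_declaration]))
```

Flagging the unexpected pass is useful because it signals that the expected-failure declaration is out of date and should be removed.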

Many testing environments and frameworks distinguish between test failures and hard errors. As we’ve seen, a test failure happens when some invariant or expected behaviour of the software under test is not met. A hard error happens when, e.g., the set-up of a test case scenario fails, or when some other unexpected or highly undesirable condition is encountered (for example, the program under test experiences a segmentation fault).
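
One common way to draw this line, shown here as a hedged Python sketch, is to treat a failed assertion as a test failure and anything else that goes wrong (a broken set-up, an unexpected exception) as a hard error. The run_test function and the example set-up and check functions are hypothetical; an unexpected exception stands in for conditions such as a crash of the program under test.

```python
def run_test(setup, check):
    """Return "PASS", "FAIL" or "ERROR" for one test case."""
    try:
        fixture = setup()
    except Exception:
        return "ERROR"  # the scenario could not even be set up
    try:
        check(fixture)
    except AssertionError:
        return "FAIL"  # an expected behaviour was not met
    except Exception:
        return "ERROR"  # something unexpected went wrong while checking
    return "PASS"


def good_setup():
    return [3, 1, 2]


def broken_setup():
    raise OSError("cannot create scratch directory")


def check_sorted(data):
    assert sorted(data) == [1, 2, 3]


def check_wrong_order(data):
    assert sorted(data) == [3, 2, 1]


print(run_test(good_setup, check_sorted))
print(run_test(good_setup, check_wrong_order))
print(run_test(broken_setup, check_sorted))
```

The distinction matters when reading results: a failure says something about the software under test, whereas a hard error usually says that the test itself could not do its job.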

TODO: Links to other test harnesses (esp. those sharing our terminology)?