Fix Test Code Smells

A test smell (or anti-pattern) is a poor code pattern that appears repeatedly in test code and has documented, well-known improvements.


A poorly built automated test suite quickly becomes a maintenance burden.

Test code can suffer from the same issues as production code. This page doesn't cover such issues for the most part. Instead, this page focuses on major, frequently occurring, test-specific anti-patterns with examples and suggested solutions.

For an arguably exhaustive list, see this Test Smells Catalog [1].

Anti-patterns are grouped according to the graphic below. Assertions are part of the Test Body, but they deserve a separate section.

(Diagram: anti-patterns grouped into Test Name, Test Body, Assertions, and Setup/Cleanup.)

1. Poor Test Name

Test names are extremely important: quick to write, hard to write well. Poor test names hinder understanding, troubleshooting, and refactoring.


Useless Test prefix/suffix

Some testing frameworks require the test function (method, class, file) to include "test" in its name to be discoverable. If it's not mandatory, or the framework marks tests with annotations such as @Test, remove "test" from the name: it carries no informational value.
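A small sketch of the suffix case (hypothetical names; pytest-style discovery assumed, where the test_ prefix itself is mandatory): repeating "test" anywhere else in the name is pure noise.

```python
# Hypothetical names; pytest-style discovery assumed (the test_ prefix is required).

def test_user_login_test():  # the extra "test" suffix adds no information
    ...

def test_user_can_log_in_with_valid_credentials():  # the rest of the name describes behavior
    ...
```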

Name that lies

  • Test name claims to do one thing, but actually does another

  • Test name claims to do one thing, but actually does multiple checks

    • Unit tests should almost always test one thing

    • Complex integration and E2E tests may include several fail-fast checks along the way, but their final, main assertion must match the test name.

Ambiguous

We cannot tell from the test name what exactly is being tested. This is subjective and open to discussion, but it's worth making the effort up front.

test_for_bug_xyz

If a bug is discovered in later stages of development (say, in QA or even production) and then fixed, a test should be written for it. But just like every other test, its name must describe the behavior, and it should live with the relevant group of tests, not be called "test_for_bug_ticket_1234" and dumped into a "bugs" folder.
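A hypothetical sketch (the calculate_discount function is illustrative; the ticket number comes from the example above): the regression test is named after the behavior it protects and sits next to the other discount tests, while the ticket goes into a comment.

```python
def calculate_discount(price: float, percent: float) -> float:
    """Production code under test (stub for illustration)."""
    return price * (1 - percent / 100)

# Smell: named after the ticket, placed in a "bugs" folder, says nothing about behavior.
def test_for_bug_ticket_1234():
    assert calculate_discount(price=100, percent=0) == 100

# Better: lives next to the other discount tests, name describes the expected behavior.
def test_zero_percent_discount_keeps_original_price():
    # Regression test for ticket 1234.
    assert calculate_discount(price=100, percent=0) == 100
```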

Inconsistent

Names within a test suite have an inconsistent style.


Decide on a naming style and convention and keep it as consistent as possible, but bear in mind that in a large suite of hundreds or thousands of tests, not every test name will fit a simple template.


2. Poor Test Body

Prod logic duplication

A test provides input and checks output. Transforming input into output is the job of production code. Tests must not duplicate production logic - otherwise, the code exists in two places and the test adds no value.
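A minimal sketch of the smell (apply_discount is a hypothetical function): the first test recomputes the result with the very formula it is supposed to verify, so it can only prove that the formula equals itself; the second asserts an independently known value.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Production code under test (stub for illustration)."""
    return price_cents - price_cents * percent // 100

# Smell: the test duplicates the production formula.
def test_discount_duplicates_production_logic():
    price, percent = 1000, 20
    assert apply_discount(price, percent) == price - price * percent // 100

# Better: assert a hand-calculated expected value.
def test_20_percent_off_1000_cents_is_800_cents():
    assert apply_discount(1000, 20) == 800
```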

Non-parameterized tests

Two or more separate tests have been written that could have been a single parameterized test.
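A minimal pytest sketch (is_valid_email is a hypothetical function): three copy-pasted tests collapse into one parameterized test covering the same cases.

```python
import re

import pytest


def is_valid_email(value: str) -> bool:
    """Production code under test (stub for illustration)."""
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value) is not None

# Smell: the same body repeated three times.
def test_rejects_missing_at_sign():
    assert not is_valid_email("user.example.com")

def test_rejects_missing_domain():
    assert not is_valid_email("user@")

def test_rejects_empty_string():
    assert not is_valid_email("")

# Better: one parameterized test covering the same cases.
@pytest.mark.parametrize("value", ["user.example.com", "user@", ""])
def test_rejects_malformed_email(value):
    assert not is_valid_email(value)
```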

Over-parameterized tests

The opposite of non-parameterized tests. Logically different tests merged into one for convenience or conciseness, reducing clarity.

Signs when tests may be over-parameterized:

  • Both the inputs and the expected outcomes are parameters (e.g., valid and invalid scenarios mixed together)

  • Test covers logically distinct scenarios in a single function

  • Difficulty in reasoning about what the test actually verifies (it verifies this AND that AND this)

  • if-else branching in the test body (see the sketch below)
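A hypothetical sketch of this smell: valid and error scenarios are forced into one parameterized test, so the body needs an if-else; splitting them keeps each body flat.

```python
import pytest


def parse_age(value: str) -> int:
    """Production code under test (stub for illustration)."""
    age = int(value)
    if age < 0:
        raise ValueError("age must be non-negative")
    return age

# Smell: happy-path and error scenarios merged, so the test body branches.
@pytest.mark.parametrize("value, expected", [("42", 42), ("-1", ValueError)])
def test_parse_age(value, expected):
    if expected is ValueError:
        with pytest.raises(ValueError):
            parse_age(value)
    else:
        assert parse_age(value) == expected

# Better: two focused tests, each with a straight-line body.
@pytest.mark.parametrize("value, expected", [("42", 42), ("0", 0)])
def test_parses_valid_age(value, expected):
    assert parse_age(value) == expected

def test_rejects_negative_age():
    with pytest.raises(ValueError):
        parse_age("-1")
```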

Complex / Large Test

If a test fails, we want to understand and troubleshoot it very quickly. That is why test code should be very simple and mostly flat (non-nested).

Loops, branching, and similar constructs generally make tests unnecessarily complex.

Possible solutions: improve variable names, break a large test into several smaller ones, and parameterize where appropriate.


Complex integration or system tests are inevitably... complex. But their bodies should still be mostly flat and easy to read, with necessary complexities refactored or abstracted away into the framework or helper functions.
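A hypothetical sketch of that idea (every name here is an illustrative helper, not a real framework API): the client-building and polling details are pushed into helpers, so the test body stays a short, flat list of steps.

```python
def create_api_client():
    """Builds and authenticates a test client (stubbed for illustration)."""
    return object()

def submit_order(client, items):
    """Submits an order and returns its id (stubbed for illustration)."""
    return "order-1"

def wait_until_order_processed(client, order_id, timeout_s=30):
    """Polls the order until it is processed or the timeout expires (stubbed)."""
    return "processed"

def test_submitted_order_is_eventually_processed():
    client = create_api_client()
    order_id = submit_order(client, items=["book"])
    status = wait_until_order_processed(client, order_id)
    assert status == "processed"
```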


3. Assertions

Commented out or no assertions

  1. No assertions: the test "passes" without verifying anything. Even worse, it gives false confidence.

  2. Commented out assertions: suppose a test fails, and the assertion is "temporarily" commented out or removed. The test runs and shows green. This may be acceptable for troubleshooting on a local machine, but such a test should never be committed.


Alternative actions:

  • Disable (skip) the test and add a link to the ticket created to track the fix (see the sketch after this list)

  • If a failed test doesn't block the build (e.g., separately run E2E tests), keep the test failing to keep the problem visible until it is resolved. Ultimately, it may be OK to just delete the test.
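A minimal pytest sketch of the first option (the ticket id and test name are placeholders): the whole test is skipped with a reason that stays visible in the report, instead of silently losing its assertion.

```python
import pytest


# Placeholder ticket reference; point it at your real tracker.
@pytest.mark.skip(reason="Known failure, tracked in TICKET-123: <link to ticket>")
def test_export_includes_archived_items():
    ...
```

Where the failure is expected to be fixed, pytest's xfail marker is another option: the test keeps running, and the report flags it when it unexpectedly starts passing.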

Too Many Assertions

More than one assertion means that if the first one fails, the second one will not run, and potentially important information will be missed.

Possible solutions:

  • If assertions are logically similar - parameterize the test

  • If assertions test different scenarios - split into several tests


Some frameworks offer both soft assertions (which don't fail the test immediately) and hard assertions (which fail right away).

A careful combination of a few soft assertions and hard assertions might be justified in some situations.
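One way to get soft-assertion-like behavior with just the standard library is unittest's subTest: each failing block is recorded, but the test keeps running to the next check. (Dedicated soft-assertion helpers exist in some frameworks; this is only an illustration.)

```python
import unittest


def build_user():
    """Production code under test (stub for illustration)."""
    return {"name": "Ada", "age": 36, "active": True}

class TestNewUserDefaults(unittest.TestCase):
    def test_new_user_defaults(self):
        user = build_user()
        # Soft checks: a failure inside a subTest is reported,
        # but execution continues to the next block.
        with self.subTest("name"):
            self.assertEqual(user["name"], "Ada")
        with self.subTest("age"):
            self.assertEqual(user["age"], 36)
        # Hard check: a plain assertion stops the test immediately on failure.
        self.assertTrue(user["active"])
```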


The "one assertion" rule usually applies to unit tests.

It may be OK to have several "fail-fast" checks in the middle of complex integration tests before the main assertion.

Wrong assertion used

There is usually more than one way to assert the same thing, and the difference shows up when the test fails. An assertion poorly matched to the situation produces a vague failure message and diverts you from the actual problem. Ask yourself: is the error message as meaningful as it could be?

Example 1: Asserting a boolean when a direct comparison of the two values is possible

Example 2: Using a boolean assertion on a list, when a more specific check is possible
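A sketch of both examples using Python's unittest assertion methods (values are hypothetical); the comments describe the failure messages each assertion would produce.

```python
import unittest


class TestAssertionChoice(unittest.TestCase):
    def test_example_1_comparing_two_values(self):
        actual, expected = 42, 42
        # Smell: on failure this reports only "False is not true".
        self.assertTrue(actual == expected)
        # Better: on failure this reports both values, e.g. "41 != 42".
        self.assertEqual(actual, expected)

    def test_example_2_checking_a_list(self):
        items = ["alice", "bob"]
        # Smell: on failure this reports only "False is not true".
        self.assertTrue("bob" in items)
        # Better: on failure this reports the missing item and the list contents.
        self.assertIn("bob", items)
```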


4. Setup / Cleanup (Teardown)

Interdependent tests

In rare and very complex scenarios, there may be justifiable reasons to construct a chain of interdependent tests.

But much more often than not, tests should be independent of each other. You should be able to run them in any order, and they must produce the same result.

Example:

  1. Test 1 creates {Object} and asserts creation was successful. Leaves it for Test 2.

  2. Test 2 updates {Object} and asserts update was successful.

Test 2 depends on Test 1. If Test 1 fails, it causes a cascade of failures, usually false positives: the "update" functionality works fine, but it is flagged as failed.

Refactored example:

  1. Test 1 creates {Object}, asserts something, and deletes the {Object} as part of cleanup (teardown).

  2. Test 2 creates an independent {Object} as part of setup, asserts something, and again deletes the {Object} as part of cleanup (teardown).

The refactored version generally runs slower (more time is spent creating and deleting independent test data), but the stability and deterministic behavior are well worth it compared to troubleshooting (and rerunning) large cascading failures.
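A hypothetical pytest-style sketch of the refactored example (FakeApi stands in for whatever system is actually under test): each test creates and deletes its own object, so the two can run in any order.

```python
import uuid


class FakeApi:
    """Illustrative stand-in for the system under test."""
    def __init__(self):
        self._objects = {}

    def create(self, name):
        obj_id = str(uuid.uuid4())
        self._objects[obj_id] = {"name": name}
        return obj_id

    def update(self, obj_id, name):
        self._objects[obj_id]["name"] = name
        return self._objects[obj_id]

    def delete(self, obj_id):
        self._objects.pop(obj_id, None)

    def exists(self, obj_id):
        return obj_id in self._objects

api = FakeApi()

def test_object_can_be_created():
    obj_id = api.create("first")      # independent setup
    try:
        assert api.exists(obj_id)
    finally:
        api.delete(obj_id)            # teardown: leave no state behind

def test_object_can_be_updated():
    obj_id = api.create("before")     # does not rely on any other test
    try:
        assert api.update(obj_id, "after")["name"] == "after"
    finally:
        api.delete(obj_id)
```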


There are many other ways tests can depend on each other, mostly unnecessarily:

1) Test 1 invoking Test 2 in its body, when they could be written independently.

2) Tests (over)sharing a database or another common resource. Can a separate instance be created? Sometimes not, for example when it requires fighting through insurmountable corporate bureaucracy.

Non-extracted Setup/Cleanup

The general rule of thumb is that tests should be "self-contained", including their setup and teardown. Such tests are generally easier to understand and troubleshoot.

But dozens or even hundreds of tests may share the same setup/teardown, which leads to large-scale code duplication and a growing maintenance cost.

In such cases, applying DRY (Don't Repeat Yourself) and refactoring setup/teardown into reusable components is beneficial.

After all, most mature test runners already provide fixtures to centralize setup and teardown code.
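A minimal pytest sketch of that refactoring (FakeDatabase and the seeded user are illustrative): the repeated setup and teardown move into a single fixture that any number of tests can request.

```python
import pytest


class FakeDatabase:
    """Illustrative stand-in for a shared resource."""
    def __init__(self):
        self.rows = {}

db = FakeDatabase()

@pytest.fixture
def seeded_user():
    # Setup shared by every test that requests this fixture.
    db.rows["user-1"] = {"name": "Ada"}
    yield "user-1"
    # Teardown runs after each test, even if it fails.
    db.rows.pop("user-1", None)

def test_user_can_be_read(seeded_user):
    assert db.rows[seeded_user]["name"] == "Ada"

def test_user_name_can_be_changed(seeded_user):
    db.rows[seeded_user]["name"] = "Grace"
    assert db.rows[seeded_user]["name"] == "Grace"
```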

Other

Life is complex and messy. So is software development, including test automation. A golden rule in one context can be inadequate in another.

This is why giving prescriptive advice can be hard and, at times, misleading.

Nevertheless, here are some other high-level occurrences in test automation that are likely to be considered anti-patterns.

  1. Tests that depend on environment/configuration that they do not set up themselves.

    1. "It works on my machine because I manually configured something, but it won't work on your machine."

    2. "You must wait for hour X to launch these tests for them to work."

  2. Hidden assertions and/or test data

    1. "The test failed, I'm trying to look at the data and the assertion, but they're hidden deep in the test framework behind multiple layers of abstraction."

References

The downsides of the Test Smells Catalog [1]:

  • Overwhelming: it is very large.

  • Hard to remember: test smells are often given custom names by their authors (e.g., Parasite, Local Hero), which is fun but hard to remember.

  • Contains many duplicates and overlaps: a non-coordinated effort results in a list where items overlap partially or fully. E.g., the catalog contains "The Flickering Test", "Flaky test", "Flaky locator", "Having Flaky or Slow Tests", "Erratic Test", "Nondeterministic Test".

  • Few code examples.

  • Contains general anti-patterns applicable to production code (subjective): messy naming, unused code, useless comments, basic code duplication, poor abstractions, inheritance used instead of composition, etc. All of these issues are relevant to both production and test code, but their inclusion makes it hard to discern test-specific smells for readers who are looking only for those.
