Fix Test Code Smells
A test smell, or anti-pattern, is a poor code pattern that is written repeatedly and has documented possible improvements.
Improve your test code by learning these common anti-patterns
A poorly built automated test suite quickly becomes a maintenance burden.
Test code can suffer from the same issues as production code. For the most part, this page doesn't cover those. Instead, it focuses on major, frequently occurring, test-specific anti-patterns, with examples and suggested solutions.
For an arguably exhaustive list, see the Test Smells Catalog [1].
Anti-patterns are grouped into the four areas covered below: test name, test body, assertions, and setup/cleanup. Assertions are part of the test body, but they deserve a separate section.
1. Poor Test Name
Test names are extremely important. They are quick to write but hard to write well. Poor test names hinder understanding, troubleshooting, and refactoring.
Invest time and thought into informative yet succinct names.
Improve a test name when you encounter a poor one.
Useless Test prefix/suffix
Some testing frameworks require the test function (method, class, file) to include "test" in its name to be discoverable. If that is not mandatory, or the framework provides @Test annotations, remove "test": it carries no informational value.
Name that lies
Test name claims to do one thing, but actually does another
Test name claims to do one thing, but actually does multiple checks
Unit tests should almost always test one thing
Complex integration and E2E tests may include several checks along the way for fail-fast purposes, but their last, main assertion must match the test name.
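A minimal sketch of the mismatch, using hypothetical `make_user` and `deactivate` helpers:

```python
# Smell: the name promises one check, but the body verifies something else
def test_user_is_deactivated():
    user = deactivate(make_user())
    assert user.email is not None  # says nothing about deactivation

# Better: the main assertion matches the name
def test_user_is_inactive_after_deactivation():
    user = deactivate(make_user())
    assert user.active is False
```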
Ambiguous
When we cannot tell from the test name what exactly is being tested. This is subjective and open to discussion, but it's worth making an initial effort.
test_for_bug_xyz
If a bug is discovered in later stages of development (say, in QA or even production) and then fixed, a test should be written for it. But just like all other tests, its name must describe the behavior, and it must live in the relevant group of tests, not be named "test_for_bug_ticket_1234" and placed into a "bugs" folder.
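A sketch of the rename; the behavior-describing name is a hypothetical example:

```python
# Smell: the name references the ticket, not the behavior,
# and the test lives in a "bugs" folder
def test_for_bug_ticket_1234():
    ...

# Better: the name describes the fixed behavior, and the test
# sits next to the other discount tests (hypothetical example)
def test_discount_is_not_applied_twice_on_retry():
    ...
```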
Inconsistent
Names within a test suite follow inconsistent styles (e.g., mixing should_return_error, testReturnsError, and returns_error_when_invalid).
2. Poor Test Body
Prod logic duplication
A test provides input and checks output. Transforming input into output is the job of production code. Tests must not duplicate production logic; otherwise, the same logic exists in two places, and the test adds no value.
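A minimal sketch, assuming a hypothetical `price_with_vat` function and a 20% VAT rate:

```python
# Smell: the test re-implements the production formula,
# so a bug in the formula would pass in both places
def test_price_with_vat_duplicated_logic():
    price = 100.0
    assert price_with_vat(price) == price * 1.20  # duplicated production logic

# Better: compare against an independently precomputed value
def test_price_with_vat():
    assert price_with_vat(100.0) == 120.0
```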
Non-parameterized tests
Two or more separate tests have been written, but they could have been a single parameterized test.
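A sketch using pytest's built-in parameterization, assuming a hypothetical `sign_label` function:

```python
import pytest

# One parameterized test instead of three near-identical test functions
@pytest.mark.parametrize(
    ("value", "expected"),
    [
        (1, "positive"),
        (-1, "negative"),
        (0, "zero"),
    ],
)
def test_sign_label(value, expected):
    assert sign_label(value) == expected
```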
Over-parameterized tests
The opposite of non-parameterized tests. Logically different tests merged into one for convenience or conciseness, reducing clarity.
Signs that tests may be over-parameterized:
Inputs and outputs are both parameters (e.g., valid and invalid scenarios)
Test covers logically distinct scenarios in a single function
Difficulty in reasoning about what the test actually verifies (it verifies this AND that AND the other)
if-else branching in the test body (see the sketch below)
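A sketch of the smell and one possible split, assuming hypothetical `validate_email` and `ValidationError` names:

```python
import pytest

# Smell: valid and invalid inputs share one test, forcing if-else branching
@pytest.mark.parametrize(
    ("email", "is_valid"),
    [("a@b.com", True), ("not-an-email", False)],
)
def test_email(email, is_valid):
    if is_valid:
        assert validate_email(email) is True
    else:
        with pytest.raises(ValidationError):
            validate_email(email)

# Better: two focused tests, each verifying one scenario
@pytest.mark.parametrize("email", ["a@b.com", "team@example.org"])
def test_valid_email_is_accepted(email):
    assert validate_email(email) is True

@pytest.mark.parametrize("email", ["not-an-email", ""])
def test_invalid_email_is_rejected(email):
    with pytest.raises(ValidationError):
        validate_email(email)
```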
Complex / Large Test
If a test fails, we want to understand and troubleshoot it very quickly. That is why test code should be very simple, mostly non-nested.
Loops, branching, and similar constructs generally make tests unnecessarily complex.
Possible solutions: improve variable names, break up a large test into several smaller tests, and parameterize where appropriate.
3. Assertions
Commented out or no assertions
No assertions: the test "passes" without verifying anything. Even worse, it gives false confidence.
Commented-out assertions: suppose a test fails, and the assertion is "temporarily" commented out or removed. The test now runs and shows green. This may be OK for troubleshooting on a local machine.
NEVER COMMIT SUCH CODE INTO THE SHARED REPOSITORY
Alternative actions:
Disable (skip) the test and add a link to a ticket created to resolve the issue (see the pytest sketch below)
If a failed test doesn't block the build (e.g., separately run E2E tests), keep the test failing so the problem stays visible until it is resolved. Ultimately, it may be OK to simply delete the test.
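With pytest, for example, a test can be skipped while staying visible in every report; the ticket ID here is a hypothetical placeholder:

```python
import pytest

# The skip reason keeps the problem visible in the test report
@pytest.mark.skip(reason="Fails after pricing change; see TICKET-1234")
def test_discount_applies_once():
    ...
```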
Too Many Assertions
With most frameworks, a test stops at the first failed assertion: if the first one fails, the rest will not run, and potentially important information will be missed.
Possible solution:
If assertions are logically similar - parameterize the test
If assertions test different scenarios - split into several tests (see the sketch below)
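A sketch of the split, with hypothetical `create_user` and `send_welcome_email` helpers:

```python
# Smell: two behaviors in one test; if the first assertion fails,
# we learn nothing about the second behavior
def test_user():
    user = create_user("alice")
    assert user.name == "alice"
    assert send_welcome_email(user) is True

# Better: one behavior per test
def test_created_user_has_given_name():
    assert create_user("alice").name == "alice"

def test_welcome_email_is_sent_to_new_user():
    assert send_welcome_email(create_user("alice")) is True
```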
Wrong assertion used
There is more than one way to assert the same thing. The problem shows up when the test fails: an assertion mismatched to the situation diverts you from the actual problem. Ask yourself: is the error message as meaningful as it could be?
Example 1: Asserting a boolean when comparing two values is possible
Example 2: Using boolean assertions on a list, when a more specific check is possible
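Sketches of both examples in pytest, with hypothetical `cart_total` and `load_roles` helpers; the gain is in the failure message:

```python
# Example 1: comparing two values
def test_total_boolean_style():
    total = cart_total([10, 20])
    assert (total == 30) is True  # on failure: "assert False is True" - values are lost

def test_total_equality_style():
    total = cart_total([10, 20])
    assert total == 30            # on failure pytest shows both values, e.g. "assert 25 == 30"

# Example 2: checking a list
def test_roles_boolean_style():
    roles = load_roles("alice")
    assert any(r == "admin" for r in roles)  # on failure: just "assert False"

def test_roles_specific_style():
    roles = load_roles("alice")
    assert "admin" in roles       # on failure pytest prints the full list contents
```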
4. Setup / Cleanup (Teardown)
Interdependent tests
In rare, very complex scenarios, it may be justifiable to construct a chain of interdependent tests.
But far more often, tests should be independent of each other: you should be able to run them in any order, and they must produce the same result.
Example:
Test 1 creates {Object} and asserts creation was successful. Leaves it for Test 2.
Test 2 updates {Object} and asserts update was successful.
Test 2 depends on Test 1. If the first one fails, it causes a cascade of failures, usually false positives: the "update" functionality works fine, but it is flagged as "failed".
Refactored example:
Test 1 creates {Object}, asserts something, and deletes the {Object} as part of cleanup (teardown).
Test 2 creates an independent {Object} as part of setup, asserts something, and again deletes the {Object} as part of cleanup (teardown).
The refactored example generally runs slower (creating and deleting independent test data takes time), but stability and deterministic behavior are worth far more than the time lost to troubleshooting (and rerunning) large cascading failures.
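A sketch of the refactored shape in pytest, assuming a hypothetical `client` module for the system under test; the fixture gives each test its own object and guarantees cleanup even when the test fails:

```python
import pytest
import client  # hypothetical API client for the system under test

@pytest.fixture
def obj():
    created = client.create_object(name="sample")  # setup: independent test data
    yield created
    client.delete_object(created.id)               # teardown: runs even on failure

def test_object_creation():
    created = client.create_object(name="sample")
    assert created.id is not None
    client.delete_object(created.id)

def test_object_update(obj):
    updated = client.update_object(obj.id, name="renamed")
    assert updated.name == "renamed"
```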
Non-extracted Setup/Cleanup
A general rule of thumb says that tests should be "self-contained", including their setup and teardown. Such tests are generally easier to understand and troubleshoot.
But dozens or even hundreds of tests may share the same setup/teardown. This then leads to large code duplications and an increasing maintenance cost.
In such cases, applying DRY (Don't Repeat Yourself) and refactoring setup/teardown into reusable components is beneficial.
After all, most mature test runners already provide fixtures to centralize setup and teardown code.
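In pytest, for instance, shared fixtures can live in conftest.py; `connect_test_db` is a hypothetical setup helper:

```python
# conftest.py - fixtures defined here are available to every test
# in the directory, so shared setup/teardown is written once
import pytest

@pytest.fixture
def db_session():
    session = connect_test_db()  # hypothetical: open a connection to a test DB
    yield session
    session.rollback()           # teardown shared by all tests using this fixture
    session.close()
```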
Other
Life is complex and messy. So is software development, including test automation. A golden rule in one context can be inadequate in another.
This is why giving prescriptive advice can be hard and, at times, misleading.
Nevertheless, here are some other high-level occurrences in test automation that are likely to be considered anti-patterns.
Tests that depend on an environment/configuration they do not set up themselves.
"It works on my machine because I manually configured something, but it won't work on your machine."
"You must wait for hour X to launch these tests for them to work."
Hidden assertions and/or test data
"The test failed, I'm trying to look at the data and the assertion, but they're hidden deep in the test framework behind multiple layers of abstraction."
References
[1] Open Catalog of Test Smells: https://test-smell-catalog.readthedocs.io/en/latest/
The downsides of this catalog:
Overwhelming: it is very large.
Hard to remember: test smells are often given custom names by their authors (e.g., Parasite, Local Hero), which is fun but hard to remember.
Contains many duplicates and overlaps: a non-coordinated effort results in a list where items overlap partially or fully. E.g., the catalog contains "The Flickering Test", "Flaky test", "Flaky locator", "Having Flaky or Slow Tests", "Erratic Test", "Nondeterministic Test".
Few code examples.
Contains general anti-patterns that also apply to production code (subjective): messy naming, unused code, useless comments, basic code duplication, poor abstractions, inheritance used instead of composition, etc. All of these issues are relevant to both production and test code, but their inclusion makes it hard to discern the test-specific smells for those who seek just them.