State of AI in Testing in 2026
The group page title is "AI-Assisted Testing". This implies a human in the loop at all times.
AI and Black-box and Exploratory Testing
In 2026, I spent dozens of hours experimenting with and fine-tuning prompts (which can be lightly adapted into embedded AI instructions or "skills") across some of the best models available.
I tried the following to improve the output:
Usage of structure (e.g., Role - Context - Task - Format, or their close equivalents)
Usage of LLM-friendly markdown (headers, lists, examples, tables)
Experiments with personas (skilled roles) with wide, narrow, and very narrow skill scopes (very narrow works best)
Giving specific frameworks to work within (SFDIPOT to analyze, Techniques and Heuristics with concrete examples to create tests)
Iteratively accepting (and later removing) LLMs' suggestions to improve the output
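The structured approach above can be sketched as a template builder. This is an illustrative example only: the persona, the SFDIPOT framing, the field wording, and the function name are my assumptions, not a prescription; adapt them to your own prompt templates.

```python
# A minimal sketch of a Role - Context - Task - Format prompt template,
# using LLM-friendly markdown headers and a narrowly scoped persona.

def build_test_design_prompt(feature_description: str) -> str:
    """Compose a test-design prompt using the Role - Context - Task - Format structure."""
    role = (
        "You are an exploratory tester with a very narrow specialty: "
        "designing black-box test ideas for web forms."  # very narrow scope works best
    )
    context = (
        "Feature under test:\n"
        f"{feature_description}\n"
        "Analyze it with the SFDIPOT heuristic "
        "(Structure, Function, Data, Interfaces, Platform, Operations, Time)."
    )
    task = (
        "Produce 10 test ideas. For each, name the SFDIPOT dimension it covers "
        "and the test technique used (e.g., boundary value analysis)."
    )
    output_format = (
        "Return a markdown table with columns: "
        "# | Test idea | SFDIPOT dimension | Technique"
    )
    return "\n\n".join([
        f"## Role\n{role}",
        f"## Context\n{context}",
        f"## Task\n{task}",
        f"## Format\n{output_format}",
    ])

prompt = build_test_design_prompt("A login form with email and password fields")
print(prompt.splitlines()[0])  # → "## Role"
```

Keeping each section under its own header makes the template easy to diff, iterate on, and later convert into embedded instructions or "skills".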
The conclusion is clear: LLMs are capable of producing "strong junior"-level work, at best. And the depth and breadth of the testing plateau quickly. Given more internal, project-specific context, results might be of higher quality. Or not.
The core benefit is fast, structured creation of many simple and sometimes intermediate tests, provided enough effort went into the prompt templates. This may be beneficial because:
Reviewing and correcting generated scenarios may be faster than writing and typing them from scratch
Generated tests may serve as a base for more sophisticated testing by a human
AI and Test Automation
AI is only as good as the context it receives. Without custom, local context (existing code, skills, instructions) of very high quality that you can personally vouch for based on years of experience, expect LLMs to produce code of the "average" quality found on the Internet. You can imagine what that looks like.