Critique #6 Combinatorial Explosion in Unit Testing
How do we handle combinatorial explosion? When there are many classes, many dependencies, many pathways? How do we decide the right entry points for our tests?
In previous articles, I explained the benefits of testing behavior rather than structure: we test behavioral outcomes rather than the implementation of the behavior.
But there was one problem I didn’t tackle - how do we handle combinatorial explosion when testing behavioral outcomes?
I received an excellent question from David Franck:
Thank you for this interesting article. I am interested in your point of view on combinatorial explosion. When behind an API and its behavior there are many possible cases, a large volume of code with many collaborators and logic, what is your strategy? Are you looking for other "entry points" to test?
I found this question to be quite insightful because it is the “missing piece” in the puzzle of testing behavioral outcomes. The challenge is the following: suppose that the behavior we’re modeling exhibits a combinatorial explosion, then by testing behavioral outcomes, we’ll face a combinatorial explosion in our tests!
Up to now, the “obvious” mainstream solution has been to test classes in isolation. Yes, that solves the problem of combinatorial explosion… BUT it causes fragile tests.
Uncle Bob described the “fragile test” problem in TDD Harms Architecture, where he explained the structural coupling between tests and classes: a one-to-one correspondence between test classes and production classes (a test class for each production class, a test method for each production method). This STRUCTURAL COUPLING causes fragile tests because our tests are directly coupled to the UML class diagram; hence any refactoring of the class diagram causes tests to break (see my article Critique #5 Unit Testing Class Design?).
To summarize, we have two approaches for Combinatorial Explosion:
One approach is to test the overall externally visible behavioral outcome. This means our tests are robust; we can refactor our UML class diagram without breaking them. But there’s a PROBLEM: our tests will exhibit combinatorial explosion! We have to write too many tests, and they will be expensive to maintain.
The second approach is to test the classes in isolation, so we’d write a test per class. This is generally a straightforward way to solve the problem of combinatorial explosion: even though the overall behavior exhibits combinatorial explosion, the individual classes do not. But there’s a PROBLEM: the one-to-one coupling between tests and classes causes fragile tests, which are expensive to maintain!
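To make the trade-off between these two approaches concrete, here is a minimal sketch in Python. The factor names and values are hypothetical, chosen only for illustration, and it assumes the simplest possible case: each factor is handled by exactly one class that does not depend on the others. Under that assumption, exhaustive tests of the overall behavior grow as the product of the factor sizes, while isolated class tests grow only as their sum.

```python
from itertools import product

# Hypothetical input factors for some behavior. The names and values are
# illustrative assumptions, not taken from any real system.
factors = {
    "customer_type": ["new", "regular", "premium"],
    "region": ["domestic", "eu", "international"],
    "payment_method": ["card", "transfer", "invoice"],
    "order_size": ["small", "medium", "large"],
}

# Approach 1: test the externally visible behavior.
# Exhaustive coverage needs one test per combination of factor values,
# so the count is the PRODUCT of the factor sizes.
behavior_cases = list(product(*factors.values()))
behavior_test_count = len(behavior_cases)

# Approach 2: test each class in isolation.
# Assumption: one class per factor, each seeing only its own factor,
# so the count is the SUM of the factor sizes.
isolated_test_count = sum(len(values) for values in factors.values())

print(behavior_test_count)  # 81 (3 * 3 * 3 * 3)
print(isolated_test_count)  # 12 (3 + 3 + 3 + 3)
```

Adding a fifth factor with three values would triple the first number but add only three to the second, which is why isolation looks so attractive despite the fragility it brings.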
Hmm…
Both of these approaches are SUBOPTIMAL!
Is there a solution at all? What is the solution?
In this article:
We’ll first review the problems caused by one-to-one coupling between tests and classes, because this is currently the mainstream solution to combinatorial explosion.
We’ll then provide a more adaptive approach to solving combinatorial explosion. We’ll go back to the mathematical roots of combinatorial explosion and solve the problem of choosing “entry points” at a more abstract level (independent of OOP or other paradigms, indeed independent of software development).
We’ll go back to the “real world” and see how this translates into software development, including a real example of applying it to the Credit Score problem (an example of combinatorial explosion).
Lastly, at the end of the article, I’ll answer the original question: what is our strategy for choosing entry points for our tests?