TDD: 1 unit test ≠ 1 class
TDD misconception: The class is the unit of isolation. That's wrong! Instead, the behavior is the unit of isolation.
I thought that: class = unit of isolation
When I first started practicing TDD, I thought that “unit” in Unit Test meant “class.” I’d write a test class for every production class and a test method for every method. I’d isolate the class under test by mocking everything it touched - even things that didn’t make sense to mock:
Write a test class for each production class
Write a test method for each production method
Isolate the class under test by mocking out all its dependencies
Indeed, this is the same thing that Uncle Bob said many people try to do:
“Most people who are new to TDD… create a kind of one-to-one correspondence between the production code and the test code. For example, they may create a test class for every production code class. They may create test methods for every production code method.
“Of course this makes sense, at first. After all, the goal of any test suite is to test the elements of the system. Why wouldn’t you create tests that had a one-to-one correspondence with those elements? Why wouldn’t you create a test class for each class, and a set of test methods for each method? Wouldn’t that be the correct solution?
“And, indeed, most of the books, articles, and demonstrations of TDD show precisely that approach. They show tests that have a strong structural correlation to the system being tested. So, of course, developers trying to adopt TDD will follow that advice.”
– Uncle Bob (TDD Harms Architecture)
⚠️ But “class = unit of isolation” is problematic!
Let’s see what happens when you write a Unit Test Class for each Production Class, and when you write a Unit Test Method for each Production Method.
Let’s say you refactor your code. Recall that refactoring means changing structure, without changing observable behaviors:
“Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior”
– Martin Fowler (Definition of Refactoring)
Examples of refactoring:
Changing the signature of a public method
Moving a public method from one class to another
Splitting a class into multiple classes
In these cases of refactoring, you’ll end up with broken unit tests! Why? Because the unit tests are coupled to the structure of the code! The tight coupling leads to fragile tests:
“The problem is – and I want you to think carefully about this next statement – a one-to-one correspondence implies extremely tight coupling.
Think of it! If the structure of the tests follows the structure of the production code, then the tests are inextricably coupled to the production code…
It, frankly, took me many years to realize this. If you look at the structure of FitNesse, which we began writing in 2001, you will see a strong one-to-one correspondence between the test classes and the production code classes…
And, of course, we experienced some of the problems that you would expect with such a sinister design. We had fragile tests. We had structures made rigid by the tests. We felt the pain of TDD. And, after several years, we started to understand that the cause of that pain was that we were not designing our tests to be decoupled.”
– Uncle Bob (TDD Harms Architecture)
So what if the tests are fragile? Well, we end up with increased maintenance costs. When we refactor, the tests break, so every change costs us double the effort:
Effort of making the change in the code itself: this is necessary
Effort of “fixing” the broken tests: this is a waste of time
So then we don’t want to refactor! This leads to code rot. Unmaintainable code makes future code changes even more expensive!
To summarize, the belief that “class = unit of isolation” leads to higher maintenance costs and slower delivery.
The alternative: “behavior = unit of isolation”
In the above, we’ve seen the problem of structural coupling, where tests are coupled to the structure of code; for every class, there’s one test. This leads to fragile tests, causing higher maintenance costs (and slower delivery).
The alternative is behavioral coupling, where tests are coupled to behavior of the code; for every behavior, there’s one test. This leads to robust tests, causing reduced maintenance costs (and faster delivery).
This is exactly what Kent Beck remarked:
“Tests should be coupled to the behavior of code and decoupled from the structure of code.”
– Kent Beck (Twitter)
Suppose that we have a class BankAccount that exposes the withdraw() method. Internally, it calls the classes BalanceCalculator and FeeCalculator. In that case, we could say that BankAccount is a “public” class, whereas BalanceCalculator and FeeCalculator are “internal” classes.
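To make the example concrete, here is a minimal sketch of what such a design could look like. Everything beyond the class names and withdraw() is an assumption made for illustration: the flat fee of 1, the method names fee_for() and after_withdrawal(), the balance property, and the InsufficientFunds exception are all hypothetical.

```python
class FeeCalculator:
    """Internal helper (hypothetical rule: every withdrawal costs a flat fee of 1)."""

    def fee_for(self, amount: int) -> int:
        return 1


class BalanceCalculator:
    """Internal helper: balance that remains after a withdrawal and its fee."""

    def after_withdrawal(self, balance: int, amount: int, fee: int) -> int:
        return balance - amount - fee


class InsufficientFunds(Exception):
    """Raised when a withdrawal plus its fee exceeds the balance (hypothetical)."""


class BankAccount:
    """Public class: the only entry point that callers need to know about."""

    def __init__(self, opening_balance: int) -> None:
        self._balance = opening_balance
        # The helpers are created internally; callers never see them.
        self._fee_calculator = FeeCalculator()
        self._balance_calculator = BalanceCalculator()

    @property
    def balance(self) -> int:
        return self._balance

    def withdraw(self, amount: int) -> None:
        fee = self._fee_calculator.fee_for(amount)
        new_balance = self._balance_calculator.after_withdrawal(
            self._balance, amount, fee
        )
        if new_balance < 0:
            raise InsufficientFunds("withdrawal plus fee exceeds balance")
        self._balance = new_balance
```

From the outside, only BankAccount, withdraw() and the balance are visible; how the work is split between the two calculators is an implementation detail.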
In cases of structural coupling, we write 1 Test Class for 1 Production Class:
BankAccountTest class for BankAccount class
BalanceCalculatorTest class for BalanceCalculator class
FeeCalculatorTest class for FeeCalculator class
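As a rough illustration of what a structurally coupled test tends to look like, here is a sketch of one test from a BankAccountTest that mocks both collaborators. It assumes a hypothetical version of BankAccount that receives its calculators through the constructor; the method names fee_for() and after_withdrawal() and the numbers are made up for the example.

```python
from unittest.mock import Mock


class BankAccount:
    """Hypothetical variant that takes its collaborators via the constructor."""

    def __init__(self, balance, fee_calculator, balance_calculator):
        self._balance = balance
        self._fee_calculator = fee_calculator
        self._balance_calculator = balance_calculator

    @property
    def balance(self):
        return self._balance

    def withdraw(self, amount):
        fee = self._fee_calculator.fee_for(amount)
        self._balance = self._balance_calculator.after_withdrawal(
            self._balance, amount, fee
        )


def test_withdraw_calls_both_calculators():
    # Both internal classes are replaced by mocks with canned answers.
    fee_calculator = Mock()
    fee_calculator.fee_for.return_value = 1
    balance_calculator = Mock()
    balance_calculator.after_withdrawal.return_value = 49

    account = BankAccount(100, fee_calculator, balance_calculator)
    account.withdraw(50)

    # These assertions pin down HOW withdraw() does its job:
    # which helper is called, with which arguments.
    fee_calculator.fee_for.assert_called_once_with(50)
    balance_calculator.after_withdrawal.assert_called_once_with(100, 50, 1)
    assert account.balance == 49
```

Rename fee_for(), change the arguments of after_withdrawal(), or merge the two calculators into one, and this test fails even though the resulting balance is still correct.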
In the case of behavioral coupling, we write a Test Class only for the “Public” Production Class (the one that exposes behavior) and we do NOT write any Test Classes for the “Internal” Production Classes (because they are just a structural implementation detail). So we just have:
BankAccountTest class for BankAccount class (because that’s the “public” class)
Please note that both approaches can reach the same code coverage, because the BankAccount tests also execute the code in BalanceCalculator and FeeCalculator.
In the case of behavioral coupling, refactoring the “internal” classes does not break the tests, because the tests are coupled only to the interface of the “public” class. In this way, we reduce maintenance costs.
This means we are testing BankAccount, which traverses BalanceCalculator and FeeCalculator. We do NOT mock BalanceCalculator and FeeCalculator.
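Here is a minimal sketch of that idea. The same behavioral check runs against two hypothetical implementations: one that delegates to BalanceCalculator and FeeCalculator, and a refactored one that inlines them. The flat fee of 1, the balance property, and the helper method names are assumptions made for the example.

```python
class FeeCalculator:
    def fee_for(self, amount):
        return 1  # hypothetical flat fee


class BalanceCalculator:
    def after_withdrawal(self, balance, amount, fee):
        return balance - amount - fee


class BankAccount:
    """Original structure: delegates to two internal classes."""

    def __init__(self, balance):
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    def withdraw(self, amount):
        fee = FeeCalculator().fee_for(amount)
        self._balance = BalanceCalculator().after_withdrawal(
            self._balance, amount, fee
        )


class RefactoredBankAccount:
    """Same behavior after a refactoring that inlines both helpers."""

    FLAT_FEE = 1

    def __init__(self, balance):
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    def withdraw(self, amount):
        self._balance -= amount + self.FLAT_FEE


def check_withdrawal_reduces_balance_by_amount_plus_fee(account_class):
    # The test only talks to the public interface: constructor,
    # withdraw() and balance. No mocks, no knowledge of the helpers.
    account = account_class(100)
    account.withdraw(50)
    assert account.balance == 49


# The same test passes before and after the refactoring.
check_withdrawal_reduces_balance_by_amount_plus_fee(BankAccount)
check_withdrawal_reduces_balance_by_amount_plus_fee(RefactoredBankAccount)
```

The test describes the behavior (withdrawing 50 with a fee of 1 leaves 49), so the internal structure is free to change underneath it.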
The only things we’ll mock out are external dependencies that involve I/O (e.g. file system, database, network, etc.) or non-determinism (e.g. system clock).
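As a sketch of how that could look for the system clock, suppose (purely hypothetically) that BankAccount records when the last withdrawal happened. The clock is passed in, so the test can substitute a hand-written fake that always returns the same instant. A mocking library would work too; SystemClock, FixedClock and last_withdrawal_at are names invented for this example.

```python
from datetime import datetime, timezone


class SystemClock:
    """Production clock: non-deterministic, so tests should not use it."""

    def now(self):
        return datetime.now(timezone.utc)


class FixedClock:
    """Test double: always returns the instant it was created with."""

    def __init__(self, instant):
        self._instant = instant

    def now(self):
        return self._instant


class BankAccount:
    def __init__(self, balance, clock):
        self._balance = balance
        self._clock = clock  # external dependency, injected
        self.last_withdrawal_at = None

    @property
    def balance(self):
        return self._balance

    def withdraw(self, amount):
        self._balance -= amount
        self.last_withdrawal_at = self._clock.now()


def test_withdraw_records_the_time_of_the_withdrawal():
    instant = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
    account = BankAccount(100, FixedClock(instant))

    account.withdraw(30)

    assert account.balance == 70
    assert account.last_withdrawal_at == instant
```

In production code the account would be created with SystemClock(); only the test swaps in the fixed one.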