7 Comments
Jun 19, 2023 (edited)

Isn't that a bad unit test? It doesn't assert anything. From my understanding, it will pass for any function that accepts two parameters?

author

As you've observed, the test will keep on passing. This means that even when we introduce regression bugs into the source code, the test still passes.

author

Yes, it is a bad unit test.

That approach - writing unit tests without assertions - is called "Assertion Free Testing" https://martinfowler.com/bliki/AssertionFreeTesting.html and unfortunately it happens in reality. A common manifestation is when management imposes code coverage targets on teams who have never written unit tests before or don't know how to write them well. Developers can then simply call every method of every class, assert nothing, and get 100% coverage.
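As an illustration (a minimal sketch in Python/pytest; the add() function and test names here are hypothetical, not the article's actual code), such a test executes the code and earns full line coverage without verifying anything:

```python
# calculator.py -- hypothetical production code
def add(a, b):
    return a + b


# test_calculator.py -- an "assertion-free" test: it calls the code,
# so every line counts as covered, but it checks nothing and will
# pass for any implementation that accepts two arguments.
def test_add():
    add(2, 3)
```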


While I agree with the idea that high coverage is a weak metric, this article still presents a straw man of testing.

The code was written without proper test-first development. The test naming was also poor: we know it's a test, so it doesn't need to be called "testX".

Before adding mutation testing, we need to use TDD correctly. This means two things:

1. Red, Green, Refactor

2. Triangulation - adding multiple use cases to the tests so that the code can't work by accident (see the sketch below)

If we do those things, writing each line of production code only to satisfy a test, then we'll get 100% code coverage as a by-product. Moreover, wherever there is NOT coverage, we most likely have an unexpected bug in our code or tests.
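As a rough sketch (hypothetical pytest code, not the article's), triangulation means pinning the behaviour with several concrete cases so a hard-coded return value cannot survive:

```python
from calculator import add  # hypothetical module under test

# A fake implementation such as "return 5" passes the first test
# but fails the other two, forcing the general solution a + b.
def test_adding_two_positive_numbers_returns_their_sum():
    assert add(2, 3) == 5

def test_adding_a_negative_number_subtracts_it():
    assert add(10, -4) == 6

def test_adding_zero_changes_nothing():
    assert add(7, 0) == 7
```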

That said, mutation testing can prove other stuff. I'm yet to find a good tool for doing it at scale.

author

TOPIC 3: MAINSTREAM DEVELOPMENT

The challenge is that most mainstream development is test-last, with teams having low code coverage - until the company mandates code coverage targets. That's what I've observed in multiple cases, and that's what motivated me to write the series.

So to get to TDD (the ultimate conclusion you already summarized), this is the approach I use:

Step 1. First, I focus on removing the illusion of code coverage. The reason is that many teams feel 80/90/100% code coverage is an achievement in itself. So I use Mutation Testing to show that their tests are not achieving their purpose - they aren't protecting against regression bugs.
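To sketch what that demonstration looks like (hypothetical code; real tools such as PIT for Java or mutmut for Python generate the mutants automatically), a mutant is a small change to the production code that a good test suite should catch:

```python
# Original, hypothetical production code:
def add(a, b):
    return a + b

# A typical mutant: the mutation tool flips the arithmetic operator.
def add_mutated(a, b):
    return a - b

# The assertion-free test passes against BOTH versions, so the mutant
# "survives" and the mutation score exposes the gap that 100% line
# coverage was hiding.
def test_add():
    add_mutated(2, 3)  # no assertion, so nothing can fail
```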

Step 2. Then I move on to showing how to use the Mutation Testing results so that the team retroactively adds assertions. This is a painful process because the team has to edit tests that are not understandable. We do reach a 100% Mutation Score for a subset of the codebase - and see how painful and time-consuming it was.
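For example (continuing the hypothetical add() test), killing a surviving mutant is often just a matter of adding the assertion that should have been there from the start:

```python
from calculator import add  # hypothetical module under test

# Before (assertion-free, so every mutant of add() survives):
# def test_add():
#     add(2, 3)

# After: the retroactively added assertion kills the "return a - b"
# mutant, which returns -1 where 5 is expected.
def test_add():
    assert add(2, 3) == 5
```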

Step 3. Then I drill into the qualitative aspects, like test understandability (test names, setup, assertions), and we see that neither metric can detect those kinds of problems, even though they affect maintainability. So then we refactor the tests.
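A small sketch of that refactoring step (hypothetical names): the before and after have identical coverage and mutation score, yet only one of them is readable:

```python
from calculator import add  # hypothetical module under test

# Before: metric-friendly, but the name says nothing about behaviour.
def test1():
    assert add(199, 1) == 200

# After: a behaviour-revealing name and explicit given/when/then
# sections -- qualities that no coverage or mutation metric can see.
def test_adding_two_amounts_returns_their_sum():
    # given
    first_amount, second_amount = 199, 1
    # when
    result = add(first_amount, second_amount)
    # then
    assert result == 200
```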

Step 4. After the whole time-consuming, painful process above, I show TDD as something that would have helped us achieve the same result, but in a much more natural and convenient way. And because TDD naturally leads to high metrics as a by-product, we no longer chase the metrics.

author

TOPIC 2: RED-GREEN-REFACTOR & TRIANGULATION

The RED part of the TDD cycle would have forced us to write an appropriate assertion.

Triangulation would help us evolve our implementation towards higher generality incrementally, backed up by tests along the way.

As a side effect, it naturally leads to 100% Mutation Coverage.
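A minimal sketch of that cycle (hypothetical code): a test without an assertion can never go red, so the first TDD step already rules out assertion-free tests, and triangulation then keeps adding cases until only the general implementation passes.

```python
# RED: written before the production code exists, this test can only
# fail (go red) because it asserts an expected result.
def test_adding_two_positive_numbers_returns_their_sum():
    assert add(2, 3) == 5

# GREEN: an implementation that makes the test pass. In strict
# triangulation we might even start from "return 5" and let further
# cases force the general form, which is why operator mutants get
# caught along the way.
def add(a, b):
    return a + b
```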

author

Yes, that is correct.

TOPIC 1: BEYOND METRICS

This article only deals with the metrics Code Coverage and Mutation Coverage. As you've observed, the test naming is poor (it does not indicate behavior), and various other things could be done badly (I didn't illustrate it here, but the test variables might be unreadable, or the arrange/given and assert/then sections too long).

All those qualitative aspects are crucial for test meaningfulness, but are not covered by metrics at all.

If we had been practicing TDD, then we would have been compelled to think about behavior, and so at least the test name would have reflected behavioural expectations.
