28 December 2025

Quality and Metrics

Yes, it is important to pay attention to high-quality software development. One can debate at length about how to define the term “quality,” but there is little doubt that any software product without sufficient quality is doomed to fail. Over time, an entire industry has grown up around software quality, aiming to make life easier for product owners, and by extension developers, through automated code scans, metrics, and analysis tools. These tools promise transparency, predictability, and a sense of control in a world where software systems grow more complex every year.

However, the more we rely on these tools, the more we risk confusing the measurement with the thing being measured. Quality is not a number, and it certainly isn’t a dashboard full of green checkmarks. It is a combination of maintainability, usability, performance, security, and—perhaps most importantly—the craftsmanship and mindset of the people building the software. 

But do the tools and their numbers really capture that?

Let’s look at this from the perspective of Goodhart’s Law. This law states that as soon as you set a measurable target—such as a specific test coverage percentage within an application—people will strive to achieve that number, often without regard for the actual goal behind the metric. The moment a metric becomes a target, it stops being a good metric. Suddenly, the number becomes more important than the underlying quality it was supposed to represent. 

A Practical Example 

Suppose you are leading an IT project and are tasked by your company with improving or maintaining the quality of the software. Your first instinct might be to require 80 percent test coverage, enforced by an analysis tool. No code gets into the software without meeting this 80 percent threshold. This seems like a smart and sensible measure, but what happens next? Developers, who previously wrote only a few, meaningful tests, are now required to write many more. They might get a bit more development time, but in practice, it’s nowhere near enough to deliver the required coverage.
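As a sketch of how such a gate is typically wired up, here is a hypothetical pytest configuration using the pytest-cov plugin (the package name `myapp` is a placeholder): the build fails whenever line coverage drops below the threshold.

```toml
# pyproject.toml -- hypothetical project; requires the pytest-cov plugin
[tool.pytest.ini_options]
addopts = "--cov=myapp --cov-fail-under=80"
```

One line of configuration, and the 80 percent number is now a hard rule for every merge.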

And test coverage is only one example. Similar patterns emerge with static code analysis scores, cyclomatic complexity thresholds, or performance benchmarks. Once a number is defined, teams begin optimizing that number, sometimes at the expense of readability, maintainability, or even common sense. A class might be split into several smaller ones just to lower a complexity metric, even if the result becomes harder to understand. A performance test might be tuned to pass a benchmark, while real-world performance remains unchanged.
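To make the splitting pattern concrete, here is a hypothetical Python sketch: a small pricing function scattered across one-line helpers purely so that no single function exceeds a complexity threshold. The behavior is identical; only the metric improves.

```python
# Original: one readable function, cyclomatic complexity around 4.
def shipping_cost(weight_kg: float, express: bool) -> float:
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    cost = 5.0 if weight_kg < 1 else 5.0 + 2.0 * weight_kg
    if express:
        cost *= 1.5
    return cost


# "Improved" for the metric: the same logic spread over helpers so each
# function scores a complexity of 1 or 2. The scanner is happy; readers
# now have to chase the logic through four functions instead of one.
def _check(weight_kg: float) -> None:
    if weight_kg <= 0:
        raise ValueError("weight must be positive")


def _base(weight_kg: float) -> float:
    return 5.0 if weight_kg < 1 else 5.0 + 2.0 * weight_kg


def _surcharge(cost: float, express: bool) -> float:
    return cost * 1.5 if express else cost


def shipping_cost_split(weight_kg: float, express: bool) -> float:
    _check(weight_kg)
    return _surcharge(_base(weight_kg), express)
```

Both versions compute exactly the same result; the second merely moves the conditionals out of the scanner’s line of sight.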

What now? 

Software developers are generally very creative people, so they start coming up with creative solutions to this problem. They identify parts of the code that are easy to test and provide a lot of coverage, but unfortunately, these tests add little value for the actual use case. Tests are written that technically work—sometimes even with the help of artificial intelligence—just to reach those last annoying percentage points of the required metric. What often doesn’t happen anymore, due to lack of time, is writing a few really good tests that actually simulate the user’s real-world scenarios. 
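A hypothetical illustration of the difference: both tests below execute the same lines and count equally toward coverage, but only the second one pins down behavior a user actually relies on.

```python
def apply_discount(price: float, code: str) -> float:
    """Apply a discount code; unknown codes change nothing."""
    if code == "SAVE10":
        return round(price * 0.9, 2)
    return price


# Coverage padding: runs every line, asserts nothing about the result.
def test_apply_discount_runs():
    apply_discount(100.0, "SAVE10")
    apply_discount(100.0, "anything")


# A meaningful test: documents the real-world scenario and fails if the
# discount logic ever breaks.
def test_save10_takes_ten_percent_off():
    assert apply_discount(100.0, "SAVE10") == 90.0
    assert apply_discount(100.0, "TYPO") == 100.0
```

A coverage tool reports the same percentage for both; only a human can tell which one protects the product.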

The same applies to other metrics: developers might silence warnings instead of addressing them, restructure code purely to satisfy a linter, or introduce unnecessary abstractions to please a complexity scanner. The result is software that looks good on paper but is harder to maintain, harder to understand, and ultimately less valuable to the people who use it. 
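Silencing a warning is often a single comment. As a hypothetical Python example, a `# noqa` marker tells linters such as flake8 to skip the check (here E401, multiple imports on one line) without anything in the code actually improving:

```python
import os, sys  # noqa: E401  -- warning silenced, import style unchanged


def read_config(path):
    # Fixing the warning would mean splitting the imports; silencing it
    # keeps the dashboard green while the code stays exactly as it was.
    return os.path.basename(path)
```

The metric goes back to zero findings, and the underlying habit it was meant to discourage survives untouched.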

What’s the Moral of the Story? 

Metrics are important, but setting standards too high can be counterproductive. In the example above, a lower threshold, somewhere between 65 and 75 percent, could already achieve the desired improvements in quality without pushing developers toward burnout through excessive testing. The same principle applies to other quality indicators: they should guide, not dictate. Metrics should highlight potential issues, not become rigid rules that stifle creativity and motivation.

The focus should always be on the people who write the code; ultimately, only they can truly improve quality. A motivated developer who understands the product, the users, and the long-term vision will always produce better software than a developer who is forced to chase arbitrary numbers. Automation and artificial intelligence are great for support, but they are no substitute for motivated and well-trained project team members. Quality emerges from thoughtful design, meaningful collaboration, and a shared understanding of what “good” means—not from blindly optimizing metrics. 

If we want truly high-quality software, we must empower developers, not overwhelm them with dashboards. Metrics should illuminate the path, not become the destination.