Wilson Funkhouser

The Joyful Art of Decluttering: A Marie Kondo Approach to Test Coverage

In 2019, Marie Kondo sparked a pop culture revolution in home living with the famous words: “Does it spark joy?” According to her KonMari method, if an object doesn’t bring you joy regularly, you should remove it from your life. In essence, the KonMari method is a human-centered method. It treats objects as objects only. The philosophy behind the KonMari method is that the buck stops with human beings. Indeed, it is only human enjoyment of objects in their lives that makes these objects truly meaningful.

Just as Marie Kondo sees a home as a living space with humans as the focus, I see every company’s quality assurance (QA) as a dynamic ecosystem with humans at the center. Tests are just a part of this ecosystem. Humans–whether developers, stakeholders, or end users–are its true inhabitants: the ones who write, consume, and benefit from these tests. Therefore, applying KonMari to test coverage fundamentally requires that we write both our code and tests from the perspective of the humans involved. 

The stakes of cluttering our QA ecosystems are higher than those of cluttering our homes. Not only do we take up space in our test suites, but we also inevitably burn out our developers by burdening their time, talent, and energy. Whether or not we agree with voices that say unit tests don’t improve code quality, or that the best code to write is no code at all, these perspectives remind us that at the end of the day, tests don’t improve code quality; humans do. The more tests of any kind that human developers need to write and maintain (unit, integration, end-to-end/regression, manual), the harder it will be for our QA ecosystems as a whole to grow and flourish.

Defining Joy: Perception Matters  

How do we determine if a test sparks joy for the humans in our ecosystems? Since we are discussing burnout and the effect of testing on humans in our ecosystem, we must first define joy in a way that accounts for the subjective perceptions–and cognitive biases–inherent to the human experience. For the purposes of this article, we define joy as perceived human value. 

Marie Kondo advises us that an important step of the decluttering process involves holding and touching each item in our household and asking: Does this item spark joy? Likewise, as we begin to declutter our test suites, we as engineering leaders must figuratively hold each of our tests, look at it from the perspective of our developers, stakeholders, and users, and ask: Is the perceived value of this test a net positive? Every team must begin its decluttering process by weighing the perceived positive value of each test (i.e., the debugging and protection it provides you, your engineers, and other members of your QA ecosystem) against its perceived negative value (i.e., the amount of time, effort, and potential burnout it could cause these same humans).

Defining net value requires assessing value from the perspective of all humans involved. For example, a stakeholder may derive sufficient positive value from just one or two smoke tests for a feature, while an engineering team may need a full suite to feel comfortable relying on automated testing. At the same time, while a test may provide minor negative value to a stakeholder, it may have substantial negative value for engineers. 
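The weighing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Perception` record, the 0 to 10 scoring scale, the example scores, and the choice to sum every group's score with equal weight are all hypothetical and are not part of the KonMari method or any established QA practice.

```python
from dataclasses import dataclass


@dataclass
class Perception:
    """One group's view of a single test (hypothetical 0-10 scores)."""
    group: str        # e.g. "stakeholders", "engineers", "users"
    positive: float   # perceived protection and debugging help
    negative: float   # perceived time, effort, and burnout cost


def net_value(perceptions: list[Perception]) -> float:
    """Sum positive minus negative value across every group (equal weights assumed)."""
    return sum(p.positive - p.negative for p in perceptions)


# Made-up scores for one checkout smoke test.
checkout_smoke_test = [
    Perception("stakeholders", positive=8, negative=1),
    Perception("engineers", positive=4, negative=6),
    Perception("users", positive=3, negative=0),
]

for p in checkout_smoke_test:
    print(f"{p.group}: {p.positive - p.negative:+}")
print("net:", net_value(checkout_smoke_test))
```

In this made-up case the test is a net positive overall even though it is a net negative for engineers, which is exactly the kind of disagreement a team would need to talk through rather than settle by arithmetic alone.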

Why Decluttering is a Challenge for Teams

  1. Many teams struggle to remove tests in general. Many still believe that there is no such thing as “too much testing,” or simply that tests already written should never be removed, especially if they never fail. This belief reflects the classic sunk cost fallacy: people tend to commit to behaviors because of previously invested resources. Because of this fallacy, developers often assume that because they paid for a test in time and money in the past, it is worth keeping even several years later. However, our justification for keeping a test must be rooted not in previous effort but in the present value the test delivers for the humans in our ecosystem.
  2. Different humans get different amounts of value from the same test. The perceived value of a test often matters more than its actual value, and this perceived value will always be different for different people in different companies. For some people, value is knowing that a majority of users aren’t going to encounter a bug in your application. Others might only care about the right users not encountering a bug. In an e-commerce company, business stakeholders might place great value on tests that cover checkout flows, even if broken flows don’t actually take users out of the conversion funnel. Users, on the other hand, may not perceive a broken checkout as especially costly, particularly if the failure is an agreeable experience (e.g., when a 404 page on Amazon shows cute animals, or when Chrome’s offline error page offers an interactive dinosaur game). Your developers get value from knowing that they aren’t deploying bugs, even though it takes effort to write and maintain each test. For some tests, the value to your developers may be negative. And if a test reduces the hours of active feature development your team has available, it may have negative value for your customers too. Depending on the needs of your ecosystem, developers, stakeholders, and users will all perceive different amounts of value from the same test, as well as from different types of tests (whether end-to-end tests, functional tests, unit and feature tests, or manual tests). Decluttering requires that all the humans in your ecosystem work together to determine which tests are valuable enough to keep, and which should be removed.

Guidelines for Decluttering

To determine a test’s perceived net value–to weigh its positive value against its negative value–we must evaluate a few key considerations: 

  • Do you trust the test to do its job? How reliable, stable, and well-written is the test? How often does the test break because of brittleness? Beyond test quality, trusting a test also means ensuring that it addresses your most important priorities now. Since tests need to reflect the changing needs of your business, your team must periodically re-evaluate the tests in your test suite to make sure they are aligned with your most important business objectives and account for changes in your understanding of how people use your application. For example, it is commonplace to have less complete testing around parts of your application that are seldom used or that handle edge cases. You may be satisfied with some bare minimum set of tests. But if that functionality has become a normal part of your business operations, do those tests still cover it adequately? What parts of the site does a test currently protect, and are these parts high-priority for you, your users, and your business’s key performance indicators (KPIs)? If the test does not properly and reliably cover important business priorities, it is likely not providing perceived positive value for the stakeholders in your ecosystem.

 

  • Does the test actually do its job? In other words, does your test expedite the resolution of bugs? In practice, we cannot evaluate this question solely based on whether the application breaks or not: no test suite is perfect, and bugs will always make it to production. But is the test written to assist the developer in identifying and fixing the issue? Or does this test bring developers more pain than effective help with bug resolution? In general, provided you don’t have too many tests in your test suite failing simultaneously, the more granular a test (e.g., a unit test), the faster your team can identify bugs. A well-written test suite makes it clear what the underlying issue is. But greater granularity comes with extra cost. If your test catches bugs but costs developers excessive time and energy, and still requires lengthy investigation to identify and resolve each bug, it is likely not providing sufficient positive value for your developers. Conversely, a test that optimally aids bug resolution may require more effort to maintain, yet provide enough perceived value in resolution that it is a net positive. Depending on the needs of your business, it will be important to balance different levels of testing (for example, unit and web regression tests) to achieve a faster overall response time and reduce developer burnout.
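One way to put a rough number on the first question ("How often does the test break because of brittleness?") is to look at a test's run history and count how many of its failures were false alarms. The sketch below is a minimal illustration only: the `Run` record and its `real_bug` flag are hypothetical, and it assumes someone on the team has triaged each failure as either a genuine defect or not.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One recorded execution of a test (hypothetical record shape)."""
    passed: bool
    real_bug: bool = False  # True only when a failure pointed at a genuine defect


def brittleness(runs: list[Run]) -> float:
    """Fraction of failures that were false alarms; 0.0 if the test never failed."""
    failures = [r for r in runs if not r.passed]
    if not failures:
        return 0.0
    false_alarms = sum(1 for r in failures if not r.real_bug)
    return false_alarms / len(failures)


# Made-up history: five runs, three failures, one of which found a real bug.
history = [
    Run(passed=True),
    Run(passed=False, real_bug=True),
    Run(passed=False),
    Run(passed=False),
    Run(passed=True),
]

print(f"false-alarm share of failures: {brittleness(history):.0%}")
```

A high share suggests that the test is costing developers investigation time without protecting much, which makes it a candidate for refactoring or removal. A test that never fails scores 0.0 here, so this number says nothing about whether that test still covers a current priority.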

 

Towards a Healthier QA Ecosystem  

In the end, decluttering your test suite does not always require binary decisions between keeping or killing your tests. There are plenty of opportunities to re-purpose or refactor certain tests. It may be appropriate to declutter your regression tests and add to your unit tests if you need your tests to be more lightweight and bring joy. At other times, just having one web regression test provides sufficient value to make you feel confident that your team is preventing bugs and taking burden away from your developers. Or you may need to replace some of your end-to-end (E2E) testing with a level of feature or unit testing to provide stability and maximize value and joy for your humans. Some tests may not bring sufficient joy when automated, but depending on your team size and how frequently you deploy, may be worth taking time out of your developers’ day to perform manually. There is always a gradient of options to choose from.

Regardless of what kind of company you are and what the humans in your ecosystem need, each business has a responsibility to ask itself what the true cost of joy is. Ultimately, the key to having a healthy QA ecosystem is that your teams actively and consistently assess which tests to keep, re-purpose, and remove, and examine the net impact of each of these decisions on your developers, stakeholders, and end users.