How to Build E2E Test Cases

Developing end-to-end (E2E) test cases poses a substantial challenge compared to writing unit and API test cases. Blocks of code and APIs can be tested against a well-defined, limited, and predetermined set of business rules. Test-driven development (TDD) techniques can empower developers to write relevant tests alongside their code.

Developing E2E tests requires an entirely different approach. This level of testing is meant to replicate the behavior of a user who is interacting with many blocks of code and multiple APIs simultaneously. Below we recommend a process that will help you build accurate, effective test cases for your E2E testing regimen. Note that we will not cover test scripting here, only test case development.

There are four considerations, each of which will be explored in turn:

  1. How to Scope End-to-End Testing
  2. What Bugs to Target
  3. Which User Flows to Follow
  4. How to Design Test Cases

How to Scope E2E Testing

The goal of E2E testing is to make sure that users can use your application without running into trouble. Usually, this is done by running automated E2E regression tests against said application. One approach to choosing your scope could be to test every possible way users could use an application. This would certainly represent true 100% coverage. Unfortunately, it would also yield a testing codebase even larger than the product codebase, and a test runtime likely as long as it took to write the build being tested.

Senior QA engineers are also often tasked with determining scope. Combining experience, knowledge of the codebase, and knowledge of the web app’s business metrics, a QA engineer can propose tests that should stop your users from encountering bugs when performing high-value actions. Unfortunately, “should” is the weakness of this approach: a biased understanding of the web app, cost, and reliance on one individual inevitably lead to bugs making their way into production.

The team should therefore instead test only how users are actually using the application. Doing so yields the optimal balance of achieving thorough test coverage without expending excessive resources or runtime, and without relying on an expert to predict how customers use the website. This approach is driven by user data rather than by an expansive exploration of the application’s feature options. To mine user data, you’ll need some form of product analytics to understand how your users currently use your application.
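As a rough illustration of mining that data, the sketch below assumes you can export analytics events as one ordered action list per user session; the Session shape and rankFlows helper are invented for the example and not tied to any particular analytics product. It simply counts how often each distinct flow occurs so that the most common flows can be scoped into the suite first.

```typescript
// Hypothetical export from a product-analytics tool: one record per user
// session, listing the ordered actions the user performed.
interface Session {
  sessionId: string;
  actions: string[]; // e.g. ["visit:/", "click:add-to-cart", "submit:checkout"]
}

// Count how often each distinct action sequence ("flow") occurs, then rank
// flows by frequency so the most common ones become candidate core test cases.
function rankFlows(sessions: Session[]): Array<{ flow: string; count: number }> {
  const counts = new Map<string, number>();
  for (const session of sessions) {
    const flow = session.actions.join(" -> ");
    counts.set(flow, (counts.get(flow) ?? 0) + 1);
  }
  return [...counts.entries()]
    .map(([flow, count]) => ({ flow, count }))
    .sort((a, b) => b.count - a.count);
}

// Example with made-up data: the top-ranked flows are scoped first.
const exportedSessions: Session[] = [
  { sessionId: "a1", actions: ["visit:/", "click:add-to-cart", "submit:checkout"] },
  { sessionId: "b2", actions: ["visit:/", "click:add-to-cart", "submit:checkout"] },
  { sessionId: "c3", actions: ["visit:/", "click:help"] },
];
console.log(rankFlows(exportedSessions).slice(0, 10));
```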

What Bugs to Target

E2E testing should not replace or substantially repeat the efforts of unit and API testing. Unit and API testing should test business logic. Generally, a unit test ensures that a block of code always results in the correct output variable(s) for given input variable(s). An API test ensures that for a given call, the correct response occurs. 
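For contrast, here is a minimal sketch of those two lower layers using Node’s built-in test runner; the applyDiscount function and the endpoint URL are placeholders invented for the example, not part of any real codebase.

```typescript
import { test } from "node:test";
import assert from "node:assert/strict";

// Hypothetical piece of business logic covered by a unit test.
function applyDiscount(total: number, percent: number): number {
  return Math.round(total * (1 - percent / 100) * 100) / 100;
}

// Unit test: a block of code yields the correct output for given inputs.
test("applyDiscount returns the correct total for given inputs", () => {
  assert.equal(applyDiscount(200, 10), 180);
});

// API test: a given call produces the correct response.
// The URL is a placeholder; point it at one of your own endpoints.
test("endpoint responds with 200", async () => {
  const res = await fetch("https://example.com/");
  assert.equal(res.status, 200);
});
```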

E2E testing is meant to ensure that user interactions always work and that a user can complete a workflow successfully. E2E test validations should therefore make certain that an interaction point (button, form, page, etc.) exists and can be used. Then, they should verify that a user can move through all of these interactions and that, at the end, the application returns what is expected, both in the individual elements and in the results of user-initiated data transformations. Well-built tests will also look for JavaScript or browser errors. If tests are written in this way, the relevant blocks of code and APIs will all be tested for functionality during the test execution.
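To make that concrete, here is a minimal Playwright-style sketch of those validation layers; the URL, element names, and expected text are placeholders rather than a real application.

```typescript
import { test, expect } from "@playwright/test";

test("user can add an item and reach checkout", async ({ page }) => {
  // Surface JavaScript/browser errors: collect anything the page throws.
  const pageErrors: Error[] = [];
  page.on("pageerror", (err) => pageErrors.push(err));

  await page.goto("https://shop.example.com"); // placeholder URL

  // Each interaction point must exist and be usable before it is used.
  const addToCart = page.getByRole("button", { name: "Add to cart" });
  await expect(addToCart).toBeVisible();
  await expect(addToCart).toBeEnabled();
  await addToCart.click();

  // At the end of the flow, the application returns what is expected.
  await page.getByRole("link", { name: "Checkout" }).click();
  await expect(page.getByText("Order summary")).toBeVisible();

  // No uncaught browser errors occurred anywhere along the way.
  expect(pageErrors).toHaveLength(0);
});
```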

Which User Flows to Follow

The risk of a bloated test suite, beyond high maintenance cost, is runtime that grows too long for tests to be run in the deployment process for each build. If you keep runtimes to only a few minutes, you can test every build and provide developers immediate feedback about what they may have broken, so they can rapidly fix the bug. 

To prevent test suite bloat, we suggest splitting your test cases into two groups: core and edge. Core test cases are meant to reflect your core features—what people are doing repeatedly. These are usually associated with revenue or bulk usability; a significant number of users are doing them, so if they fail you’re in trouble. Edge cases are the ways that people use the application that are unexpected, unintended, or rare, but might still break the application in an important way. The testing team will need to pick and choose which of these cases to include based on business value. Be careful not to write an edge-case test for every edge bug that occurs: endlessly playing “whack-a-mole” can cause the edge test suite to become bloated and excessively resource-intensive to maintain.

If runtime allows, we recommend running your core and edge tests with every build. Failing that, we recommend running core feature tests with every build, and running the longer-runtime edge case tests occasionally, in order to provide feedback on edge case bugs at a reasonable frequency.
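One way to operationalize the core/edge split, assuming a Playwright suite and tag-style test titles (both assumptions, not a prescription), is to define two projects and run them on different cadences:

```typescript
// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  projects: [
    // Core flows: tests whose titles contain @core run on every build.
    { name: "core", grep: /@core/ },
    // Edge flows: tagged @edge, run on a schedule (e.g. nightly) if runtime is tight.
    { name: "edge", grep: /@edge/ },
  ],
});
```

CI would then run npx playwright test --project=core on every build and npx playwright test --project=edge on a schedule, so edge-case feedback still arrives at a reasonable frequency.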

How to Design Test Cases

Every test case, whether it is core or edge, should focus on a full and substantial user experience. At the end of a passing test, you should be certain that the user will have completed a given task to their satisfaction.

Each test will be a series of interactions with elements on the page: a link, a button, a form, a drawing element, etc. For each element, the test should validate that it exists and that it can be interacted with. Between interactions, the test writer should look for meaningful changes in the DOM that indicate whether or not the application has responded in the expected way. Finally, the data in the test (an address, a product selected, some other string or variable entered into the test) should be used to ensure that the application transforms or returns that data in the way it’s expected to.
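Continuing the hypothetical storefront example from earlier, a test case designed this way might look like the following sketch; the selectors and data are illustrative only, and the @core tag follows the tagging convention suggested above.

```typescript
import { test, expect } from "@playwright/test";

test("shipping address entered at checkout appears on the confirmation @core", async ({ page }) => {
  const address = "42 Example Street"; // test data the application should return unchanged

  await page.goto("https://shop.example.com/checkout"); // placeholder URL

  // Interaction: the address field exists, can be interacted with, and the form submits.
  const addressField = page.getByLabel("Shipping address");
  await expect(addressField).toBeVisible();
  await addressField.fill(address);
  await page.getByRole("button", { name: "Continue" }).click();

  // Between interactions, assert a meaningful DOM change: the review step rendered.
  await expect(page.getByRole("heading", { name: "Review order" })).toBeVisible();
  await page.getByRole("button", { name: "Place order" }).click();

  // Finally, the data entered earlier is returned as expected.
  await expect(page.getByText(address)).toBeVisible();
});
```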

If you build E2E test cases with this process in mind, you will achieve high-fidelity continuous testing of your application without pouring unnecessary or marginally valuable hours into maintaining the suite. You will be able to affordably ensure that users can use your application in the way they intend.

Beyond Whack-A-Mole: Shifting From a Reactive to Proactive E2E Testing Strategy

Of all of the classic arcade games, Whack-A-Mole just might be the most frustrating. You can’t win the game of Whack-A-Mole. Every time you think you’ve hit the mole, the little scoundrel always finds a way to pop up again somewhere else, and you’re always one step behind. 

In the world of end-to-end (E2E) testing, we can get stuck playing Whack-A-Mole when we reactively write tests for bugs that pop up in production in order to prevent them from appearing again. I like to call this practice Whack-A-Mole testing because it’s a common approach, yet can easily become an endless cycle in which testers are always one step behind. As leaders in our organizations, we must instead figure out how to be one step ahead, committing to developing more courageous, forward-thinking E2E testing strategies.

WHY WHACK-A-MOLE TESTING IS SO COMMON

Quality Assurance (QA) teams live in an unenviable world: their work is largely unnoticed until something goes wrong. They’re then faced with explaining their decision not to test something, especially if the results cost the organization money. We’ve heard so many times, “Why didn’t this get tested?” Whatever the reason for the revenue-impacting bug, if QA “fails” to assure quality, they can be chastised by other leaders in the company. QA leaders and engineers are thus under immense pressure to prevent mistakes and ensure the same bugs don’t pop up in production more than once.

Prior to co-founding ProdPerfect, I used to do some consulting work with factories. Operations such as oil refineries are able to clearly track metrics related directly to revenue and cost, and can thus fairly easily make decisions related to return on investment (ROI). In a refinery, you can easily measure how many barrels of oil were produced each day, and thus prioritize what investments to make to increase this daily number. Many factories lose sight of the ROI on their actions and become reactive to the most recent problem that occurred in the plant. But when they do make the commitment to prioritize their work on high-ROI improvements, it’s fairly easy to measure the results. 

When you measure performance of software QA within an organization, you can’t measure six man-hours of code in barrels. Test code isn’t uniformly valuable, and each update doesn’t have an equal impact (good or bad) on the business. Because of this variance, it can be hard to resolve multiple competing priorities: How do we deliver value and quality without sacrificing speed? The lack of clarity in measuring speed, value, and quality prevents engineers from having clear ROI agreement and unified productivity KPIs.

Despite the challenging and often organizationally unclear work of measuring performance in engineering, the pressure to perform still exists for engineers. And since QA teams often feel the brunt of this pressure, QA leaders are often forced into operating from a reactive stance in their E2E testing strategy. In other words, due to a lack of clear prioritization metrics, many teams resort to playing Whack-A-Mole with their testing to appease the organization at large. Sticking to high-ROI initiatives in testing is difficult, but just as important as in a refinery.

THE FALLACY OF WHACK-A-MOLE TESTING

To challenge this practice, we as leaders first need to ask ourselves: Does a bug slipping through in one part of an application increase the likelihood that it’s going to happen again in the same place more than anywhere else? I like to think of a comparable and common fallacy in gambling: if you pull the slot machine six times and don’t hit the jackpot, does that mean the next pull has a higher likelihood of hitting the jackpot? Though our guts may tell us otherwise, the answer to both questions is ultimately no.

The fallacy of Whack-A-Mole testing is assuming that if we create another test where we previously saw a bug, then we’re more secure than if we wrote that test elsewhere. It’s simply not the case that because a bug happened in a certain area, then it follows that there’s an increased likelihood that the bug will happen in that area again. Whack-A-Mole testing is NOT a proactive testing strategy: a Whack-A-Mole test doesn’t add a test to the area of the application with the greatest need for tests. Instead, it is a passive strategy: we’re testing an area as a reaction to seeing a bug there. The fact that a bug came up last week shouldn’t change our organizational focus on writing tests for high-priority areas: areas that are more likely to produce bugs, that are important for customer use, or that directly impact revenue.

WHY WHACK-A-MOLE TESTING HURTS MORE THAN IT HELPS

Whack-A-Mole testing hurts engineering organizations for several reasons:

  1. It distracts us from writing well-prioritized tests. Before any bug shows up in production, a QA team has developed a strategy for what to test, based on a certain mechanism of prioritization. When we write Whack-A-Mole tests, we’re pivoting our test-writing resources away from whatever prioritization mechanism we otherwise had and towards the Whack-A-Mole test. As a result, we delay writing future high-priority tests.
  2. It adds maintenance burden to your team. Whack-A-Mole testing decreases your team’s capacity to write future high-priority tests. Every time you write an E2E test, you’re committing to maintain that test. This creates a fixed, unavoidable ongoing level of work that decreases your capacity to write more tests with the same number of engineers. Most teams I’ve seen attempt to make up for this by simply hiring more.  
  3. It slows down developer productivity and velocity. In a continuous delivery process, each new E2E test adds to the test suite run-time, which lengthens your regression cycle. If your test suite takes half an hour to run and you’re running with each commit, this means either that 1) your developers aren’t producing anything during that time or 2) your developers are checking in code less often because they don’t want to wait for the tests to run (or both). In both cases, you lose developer productivity and provide less-frequent quality feedback for each build, meaning each incremental Whack-a-Mole test costs developer velocity. 
  4. It bloats your test suite and increases instability. At some point, your test suite will have grown large enough that there’s a high probability that it will fail on a given run due to instability. Once instability reaches a certain critical mass, the test suite fails so frequently that developers stop paying attention. When that happens, it starts providing negative value: adding deployment runtime without contributing to quality. 

REPAIRING THE DAMAGE OF WHACK-A-MOLE TESTING

How do we reverse the damage of Whack-A-Mole testing? First, our organizations need to re-examine our testing choices through a blank-slate exercise. We must ask ourselves: If we were to build this strategy from the ground up once again, what would be our testing priorities? What would we test to balance test coverage with speed, maintenance burden, and stability? Once we’ve defined what’s ideal for us to test, we need to overlay that outlook on what’s currently being tested today. And here’s the hard part: we need to have the courage to shut down tests that don’t align with this strategy. And we need to move on.

One helpful aid in this process is to evaluate tests that have been in the suite for 6 months or longer and review: Have they caught any bugs in the last 6 months? If they haven’t, your team should strongly consider retiring them, as they’re likely not worth prioritizing. Unless you’re committed to testing every possible permutation of behavior, it’s essential to re-prioritize. What we learn by doing this exercise is that most Whack-A-Mole tests never actually catch a bug. Whack-A-Mole tests may give us short-term comfort in the face of organizational political pressure. But when we let the data speak instead of human impulse, we see that the vast majority of the time, writing Whack-A-Mole tests provides little to no real business value. It’s when we stare this harsh reality in the face that we find the courage that we need to reprioritize our test suites, drastically reducing the number of unnecessary tests.
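One low-tech way to run that review, assuming you can export per-test history from your CI or test-management system (the record shape below is invented for the sketch), is to flag tests that have been in the suite at least six months and have not caught a bug in that time:

```typescript
// Hypothetical export of suite history: one record per E2E test.
interface TestRecord {
  name: string;
  addedOn: string; // ISO date the test entered the suite
  lastCaughtBugOn?: string; // ISO date of the last failure that was a real bug
}

const SIX_MONTHS_MS = 1000 * 60 * 60 * 24 * 182;

// Retirement candidates: in the suite 6+ months, no bug caught in the last 6 months.
function retirementCandidates(records: TestRecord[], now = Date.now()): TestRecord[] {
  return records.filter((r) => {
    const oldEnough = now - Date.parse(r.addedOn) >= SIX_MONTHS_MS;
    const lastCatch = r.lastCaughtBugOn ? Date.parse(r.lastCaughtBugOn) : 0;
    return oldEnough && now - lastCatch >= SIX_MONTHS_MS;
  });
}

// Example with made-up data: only the first test would be flagged for review.
console.log(
  retirementCandidates([
    { name: "edge: coupon stacking", addedOn: "2019-01-10" },
    { name: "core: checkout", addedOn: "2019-01-10", lastCaughtBugOn: new Date().toISOString() },
  ]).map((r) => r.name)
);
```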

Ultimately, developing a new testing strategy from the ground up that highlights your most important priorities will free up your developers and QA resources to properly cover what’s truly important to your business. The benefit is threefold: 1) better quality, 2) better speed and cost, and 3) higher trust from your developers in the test suite.

SHARING A COMMITMENT TO BETTER QA

Many QA teams resort to Whack-A-Mole testing because they’re under pressure to respond to quality problems in prod.  A better QA practice is only possible when all leaders across our organizations share and fulfill the commitment to stick to the team’s QA strategy, rather than muck with it every time something seems to go wrong.

First, it’s crucial for engineering and product leaders to recognize alongside QA leaders that Whack-A-Mole testing does not necessarily improve quality assurance. We must understand that a bug appearing in a certain place shouldn’t necessarily change priorities for what to test moving forward. Our leaders must keep their commitment to a well-defined trade-off mechanism between speed, productivity, and QA. Each organization’s trade-off will be different and change over time, but it needs to be sacred at any given time. When we see a bug in our code, we need to commit to asking: “Do we need to rethink our strategy, or possibly rethink our trade-off point? Are we prioritizing the right way? Are we making the right commitment to what an acceptable level of testing looks like? How might our testing approach be unduly burdening our team?” This discipline helps us resist the knee-jerk reaction to build a new Whack-a-Mole test.

At ProdPerfect, our commitment to each other is that we will not be reactive in determining QA priorities. We invite you to join us in this commitment: We will not act upon knee-jerk reactions to immediately build tests for every bug. Rather, we will learn from bugs. We will evaluate them over a period of time by observing where bugs are slipping out and what damage they’re producing. Then, we’ll make data-driven decisions to inform our testing strategy. We’ll collectively own the consequences and costs of making changes to our testing strategy. But we will not knee-jerk respond by playing Whack-A-Mole with our testing approach.

Making this commitment requires organizational discipline and courageous leadership from all. All our leaders must agree to see quality assurance as a partnership in which organizations need to effectively balance their testing priorities and determine the best level of coverage for the business. Each of us has something to gain from making such a commitment, and I invite you to share in this commitment with me. Instead of burdening our processes and teams with reactive Whack-A-Mole testing, let’s care for them well by thinking proactively and letting data alone drive our testing strategy.

Preventing Human Burnout: A Meaningful Approach to Measuring E2E Test Coverage

TEST COVERAGE IN A LIVING ECOSYSTEM

I like to see any company’s Quality Assurance (QA) as a living, breathing ecosystem. The ecosystem is defined by your business needs, the complexity of your application, and the innumerable ways in which you QA your system. Together, you, your developers, stakeholders, and customers all live in this ecosystem and vie for the free energy therein. In order to maintain ecosystem equilibrium, your company must balance each of these moving parts. Every ecosystem has a rate of churn among its constituents. If you push your developers or customers too hard, they will burn out and leave the ecosystem. In this way, your QA ecosystem is as much about maintaining a stable application as it is about maintaining stable humans.

The QA ecosystem is what makes designing metrics for test coverage uniquely challenging. Each business has different needs for different types of tests, just as each business has a distinct web application and a unique user base. And because there are so many types of tests and ways to test, there are countless ways to measure adequate coverage. Despite this complexity, the basis for making measurement-related decisions remains largely unspoken: Should we prioritize mere statistical coverage of code, features that are more important to our company’s needs, or areas that the internal team deems more likely to break?

There is no single solution. If we want to measure end-to-end test coverage successfully, we must first identify metrics that are truly meaningful to our individual businesses and which reflect that individual humans are part of this system. In distinguishing proper QA coverage metrics for your business, here are three key considerations to keep in mind: 1) moving fast, 2) having acceptable coverage, and 3) covering business priorities.

  1. Moving fast: Are you impeding your developers unduly? To move fast, we need to balance the cost of adding tests with the cost of repairing bugs and the cost of runtime in deployment. The more tests you write, the more stressful and time-consuming writing code becomes. The fewer tests you write, the more stressful and time-consuming maintaining code becomes.
  2. Having acceptable coverage: Since no QA system will realistically have 100% coverage for any application or cover every single use case, we must focus on the level of coverage we actually need, accounting for where the limited resources of our ecosystem are best allocated.
  3. Covering business priorities: As we measure coverage, we need to balance business-level metrics or key performance indicators (KPIs) with the ways that users are actually using the web app. If your sign-up breaks for a few hours but your business model focuses on another conversion metric (such as adding items to a cart), this may not be the worst possible bug for you. It’s up to each business to determine which test efforts directly yield ROI.

WHY DESIGNING QA METRICS IS PARTICULARLY HARD

  1. Engineers have to manage complex product interactions in a changing ecosystem. Worse yet, different types of testing often have different owners with different stakeholders who have different needs. This means managing variations in developer expectations, management needs, and test suites. In the QA realm, since there is so much variation in what applications do, there are many ways to test: Do you test at the unit and the controller level? Do you stub out or do you test calls to the database? Do you need to test your load balancing? And since the test ecosystem needs to remain dynamic, the test suite will change every time the product changes. Even if nothing changes other than an increase in the total number of users, engineers need to make sure that there are an adequate number of tests for load testing the site.

  2. There are no tools nor individuals fully available and equipped to effectively measure coverage. Product Owners tend to think of high-level features and development processes, then delegate responsibilities. Product Managers and QA Managers tend to focus on regression testing and manual testing. DevOps tend to own building test environments. And Product Engineers (and most engineers) tend to be the ones writing unit, integration, and API tests. There’s no singular vantage point for fully understanding what has been covered. It gets even more complicated when thinking of feature complexity. Take the example of a login: there are innumerable ways in which the simple concept of logging into a website occurs. You can log in from the homepage, from the checkout, or even through your email provider. Today, it’s impossible for a human to navigate this complexity. When I was a Product Owner, I remember reviewing the design of my web applications and realizing, “Oh my. No one is using this.” Looking back, I’m ecstatic that I screwed up. It showed me that there is an incredible number of biases and assumptions in how we think about applications we’ve built that simply don’t align with how they’re actually used on the ground.

  3. Product Owners often don’t have an effective means of managing incident response. Most of the time, when something breaks, the go-to response is, “let’s write a new test for the bug we found and make sure it doesn’t break again.” This put-out-every-fire-as-it-comes approach to writing tests leads to a bloated test suite with hundreds of web regression tests and thousands of unit tests. This tactic may sound sensible, but in reality, regression testing must consider speed—not just of test suite runtime, but of human developer productivity. In the industry, we talk about burnout because of too many bugs, but what we don’t talk about as often is burnout from too much testing. We still lack an organized ethos for how companies can balance test coverage with other variables, including speed and fighting employee burnout. To fully solve this problem requires fundamental shifts in how we assess our QA ecosystems.

TWO CONSIDERATIONS FOR MEANINGFUL DESIGN

  1. Set a defined framework for meaningfulness. Determine what should be tested in a way key stakeholders agree yields intrinsic value for your company. In addition to building an ecosystem which prioritizes preventing developer burnout, what’s meaningful to us at ProdPerfect is focusing on a data-driven, user-based framework: we test what happens most—based on how users are actually using a site. We’ve defined a framework for covering a critical amount of everything that happens on a given application. It ensures that the things people do most frequently on an app aren’t going to break, while avoiding overburdening developers with maintaining bloated test suites. As an organization, we’re devoted to minimizing the time needed to keep up with QA in order to prevent developer burnout, and to prevent customers from burning out after exposure to too many bugs. Our principle is that just as we need to account for the burnout of having too many bugs, we likewise need to account for the burnout of having too many tests.

  2. Determine an acceptable level of coverage when it comes to bugs. Once the framework is defined, set expectations for what level of coverage is acceptable with respect to the impact of a bug when it reaches production. Is there a critical mass of users on certain parts of the web app? Is some functionality hypercritical for customers getting value out of the product? For your internal KPIs? If you can answer the question of coverage acceptability in a way that allows you to have a sustainable business model and is backed by quantitative analysis, your employees will be more likely to maintain their zeal for their work, directly impacting the development of your product and the satisfaction of your customers.

A RUBRIC FOR COVERAGE

QA is a living, breathing ecosystem inhabited by humans. These humans are limited resources who can burn out from any number of factors. For this reason, the rubric should never be to have 100% coverage. The rubric always needs to consider the humans at the company and the humans using the app when deciding how much test coverage is meaningful.

When it comes to QA metrics, there are no right or wrong answers. There is simply the question: “Is your ecosystem stable and sustainable?” For some companies, stability means: Yes, we will burn out developers. They will spend a year here and leave, because we are writing so many friggin’ tests. For ProdPerfect, because we prioritize maintaining a balance between the three considerations of 1) moving fast, 2) having acceptable coverage, and 3) covering business priorities, we’re empowering both ourselves and customers to let the right things break and stop the right things from breaking. And we’re going to keep building on whatever parts we can in order to meet changing needs in the changing ecosystem of QA, humans included.

ProdPerfect Removes the Burden of QA Testing

There are typically three levels of quality assurance testing maturity. One is the classic waterfall approach where it takes weeks to get a deploy ready. Then, there is the continuous development and continuous delivery approach where QA engineers are put in place to handle automation. The most mature way of tackling QA is removing QA engineering as a separate practice, and making all your engineers responsible for the quality of features.

The problem is that none of these levels of maturity seem to be able to get QA right.

“No one has a good answer. Enterprises are failing in waterfall structures. Agile teams are failing or running into difficulty hiring and maintaining QA engineers. Silicon Valley is having to hire only the most senior folks, and even then it is through force of will and pain they are able to keep test suites to a point they are happy with,” said Dan Widing, founder and CEO of the automated QA testing provider ProdPerfect.

Automating QA
There is a better way. ProdPerfect removes the struggle it takes to set up a QA engineering department, and automates QA testing using live user data. This is “dramatically cheaper, dramatically faster, gets you a result faster, [and] is going to nearly guarantee that you catch bugs as part of your process,” Widing explained.

ProdPerfect is able to obtain live user data by analyzing web traffic and creating flows of common user behavior. That behavior is then built into an end-to-end testing suite that ProdPerfect maintains, expands and updates based on actual user data over time.

According to Widing, QA testing is “incredibly difficult, painstaking work that almost tends to be underappreciated by the organization itself,” and the folks who are having to deal with this are just overburdened with work. “We have a mechanism that lets us shake out the environment the customer needs us to test against… and then we are using a testing framework that lets us plug in our learnings from these steps to produce an automatically updated test suite,” he continued. “The experience the customer gets is a black box QA engineering department… What you get at the end is an auto-updated test suite that can run continuously in your CI system that just tests your application.”

ProdPerfect covers every core workflow within applications, provides 95 percent or greater test stability, regenerates broken tests in less than four hours, and covers new feature sets in less than 48 hours.

“You don’t need to do anything to build, maintain, or expand the testing suite. We got it. You need to respond to bug reports, of course, and keep a stable testing environment up and running for us, but that’s all. Very frequently people call this ‘magic’ or ‘too good to be true,’” the company stated on its website.

Getting the right metrics
ProdPerfect not only works to ensure QA testing is covered, but also helps teams understand which metrics are the right ones for quantifying success.

“That is something we put into our service every step of the way. What your browser automation should be doing is catching as many significant bugs as possible whatever stage it is testing at and then otherwise staying as much out of the way,” said Widing.

You will know you have a solid testing foundation in place when you don’t ship a fire drill-style bug and have to wake up in the middle of the night and figure out how to deal with it or who is on top of it, Widing explained.

Since ProdPerfect is already analyzing what users are doing, it can project how things should be working and make sure they stay working. The solution tests features continuously, detects any significant bugs and verifies the feature set is actually working.

“We aim to stay out of the way by crafting what are the other metrics that are important to make sure you are not slowing down the software team,” said Widing.

Additionally, the solution will measure against minimum-frequency thresholds to confirm its performance.

“If you don’t set up your design and data strategy or set up the right tooling, everything falls apart and you have to work particularly hard to make sure all the pieces work together otherwise any singular improvement is not going to help you at all,” Widing said.

This article was first published on SDTimes.com.

Who Should Determine End-to-End Test Cases?

“A knife has the purpose of cutting things, so to perform its function well it must have a sharp cutting edge. Man, too, has a function…”

-Aristotle

In the distant (in software-years, which are much like dog years) past, a company’s development team would focus on new product code, and then a dedicated quality assurance (QA) team would write corresponding test code (including any unit tests). One of the pitfalls of this practice was that developers might get “lazy” about code quality, and might throw quality concerns “over the wall” to QA. This slowed down development and led to an ultimately antagonistic relationship between developers and QA teams, so it fell out of favor.

The “QA does QA” practice has mostly given way to moving testing into the hands of the developers themselves. Most of the time, developers now write their own unit tests and API tests. This makes sure developers take ownership of quality and thereby incentivizes them to put more focus on writing high quality code in the first place. How this is implemented varies: some teams use test-driven development (TDD) to write tests first and then build code to pass those tests. Some teams add peer code review. Some teams embed QA within dev teams to help them plan for quality at the onset. These practices are similarly meant to keep developers from building tests that are easy to pass.

The swing from QA-driven test-writing to developer-driven test-writing has, for some teams, crept into browser or end-to-end (E2E) testing. Contemporary dev teams either assign E2E test-writing to developers or to QA automation engineers, and different leaders can have strong opinions on who should really be taking point, us included.

At ProdPerfect, we believe that developers are the right choice to take point on writing unit and API tests, but making the right tradeoffs about what should be a core E2E test is nearly impossible for them. Developers have a strong sense (through the context of developing them) of the intent of unit-level and API-level code, so they know best how to reliably test their own code. But it’s a stretch to expect developers to bear the burden of comprehensive end-to-end testing themselves. Adequately testing the full application for the myriad probable user journeys through it involves monitoring, analyzing, and accounting for complex interactions between many code modules. Then, from that set of possibilities, developers must accurately choose the right subset of tests, balancing developer time, server resources, runtime, and intended outcomes against business objectives. And they must re-evaluate those choices on a regular basis. Developers typically focus on small slices of an application at a time. To expect developers to fully bear the burden of comprehensive E2E testing is to ask them to understand the entire universe of the application’s development and usage, forwards and backwards in time. Truly no one is positioned to do so.

Developers are good at doing what they’re hired to do: developing code to innovate product—and even testing that code—and should remain primarily focused on doing so. It’s a waste of resources to task developers with end-to-end testing, and they’re not positioned to do it best.

Instead, due to the complexity of effective end-to-end testing, the ideal person to determine and execute end-to-end user tests is someone whose core expertise and focus is in understanding the entire user journey and the outcomes thereof, not someone who is asked to tack on end-to-end testing as an afterthought. E2E testing should be driven by an independent group with a mandate to focus on it and the time invested to maintain it: this can be the product team or it can be QA as a whole (a QA analyst, QA automation engineering team, etc.). These groups can, with the help of tools and data, wrap their arms around the different user journeys, develop test cases for them, write tests designed to catch bugs at the user-journey level, and maintain those tests over time. This level of testing doesn’t require intimate understanding of the underlying modules of code behind the application; it’s instead meant to ensure that users can always use the application as they want to. Software teams should leave testing of lower levels of the application to those lower levels of testing—unit and API/integration testing.

Ideally, QA teams should not simply be tasked with guessing at how users are using their applications. They can and should employ advanced product analytics to understand these user journeys and how they are evolving with time. In this way, focused testers are then able to fully understand which test cases are most relevant and write the best corresponding tests to ensure quality without bloating the testing suite.

In any successful business, different roles are designed to allow talented individuals to specialize and focus. Whether it’s specializing in sales operations vs. selling and closing, marketing content vs. advertising strategy, or development and testing, specialization allows teams to operate with focus and excellence. With E2E, it follows that a specialized and complex need should be filled by a designated individual with a specialized focus and toolset in order to get the highest quality result without wasting resources.