Decision guide for browser test tooling

Marcus Noll
New Work Development
16 min read · Dec 6, 2022

At NEW WORK SE, every team is free to decide on the browser test automation tooling for their project. However, information from tooling vendors and other parties (e.g. blog posts) is often not detailed enough and sometimes even misleading. To fill the gaps, I decided to compile the knowledge that I gathered during daily business and Hackweek research.

This document is for you if you have to decide which tooling to use for running automated browser tests in your project. But this is not your usual browser test tooling shootout, discussing which tool provides the best API or the coolest new feature. No, this guide is different. Alongside the tools, it also discusses idiosyncrasies of browsers, the validity of the tools’ test results and, last but not least, the requirements of your project.

These are the three main sections of the document:

  • Browsers, which discusses basics, compatibility and common gotchas related to testing
  • Browser test tools, which provides an overview and discusses important differences and takeaways in more depth
  • Your project’s characteristics, giving some hints on which project characteristics may have an influence on your browser test tool decision

In case you do not want to read through everything in all detail, just skip to the conclusion at the end of the document.

Browsers

First, we need to talk about browsers, because a) that is what your customers use to interact with your web application, and b) it is browsers that you aim to automate your tests in.

The basics

At time of writing, the following browsers are the most well-known: Chrome, Edge, Firefox, Safari. There are also a lot of others, like Opera, Brave, Samsung Internet, or UC Browser.

One can say that the majority of websites is rendered by one of these three rendering engines:

  • Blink, which powers Chrome, Edge, Samsung Internet Browser, Opera and others
  • Gecko, which powers Firefox
  • WebKit, which powers Safari (macOS/iOS) and any browser running on iOS devices

Browser compatibility and interoperability

Each browser should follow web standards. Some browsers adhere well to those standards, some not so well.

For Chrome, Edge, Firefox and Safari, the Web Platform Tests project provides empirical evidence of how well they do. Beyond that, there is also a lot of (albeit anecdotal, still large-scale) evidence out there.

Common gotchas related to testing

Diverging behavior between browsers is (sadly) pretty normal. But apart from diverging behavior between browser A and browser B, there may even be differences between browsers of the same name (e.g. Safari macOS and Safari iOS) or even between browsers that are using the exact same build (e.g. headless and headed instances). Some of these differences are well-known, some of them not.

Safari macOS ain’t Safari iOS

Even when the Safari version is identical on macOS and iOS, you may encounter differences between both. One possible reason for this is that the WebKit version used in Safari is not guaranteed to be the same across iOS and macOS (see Safari version history on Wikipedia).

There could be other reasons that are hard to pinpoint, probably due to macOS/iOS specifics. Whatever the reason, developers experience these differences in the wild. The MDN Browser Compatibility Report 2020 contains a number of developer quotes revolving around the topic¹.

Chrome iOS ain’t Chrome

Any web browser app running on iOS (be it Chrome iOS, Firefox iOS, Edge iOS or any other) has to use the WebKit rendering engine, because Apple mandates it. Apple’s App Store Review Guidelines as of September 2022:

“Apps that browse the web must use the appropriate WebKit framework and WebKit Javascript.”

But even when the same rendering engine is used, we have seen differences in browser behavior. For example, the browser chrome (the UI elements of the browser, e.g. the address field, bookmark buttons) can behave differently, i.e. collapse/expand in different ways or take up more space on the screen, reducing the viewport size.

WebKit ain’t Safari

It is not safe to say that a WebKit build behaves the same as the Safari binary shipped with iOS or macOS. Here is an anecdotal example of some tests passing in a WebKit build, but failing in Safari. However, this example does not provide enough information to make a clear statement: It neither mentions the WebKit version nor the Safari version, and there are no statements by WebKit or Apple engineers. Maybe there are no differences if you compare a WebKit build’s behavior with a Safari that uses the exact same WebKit version.

What is safe to say, though, is that you could get different results if the versions of your WebKit build and the WebKit version in your customer’s Safari browser do not match. Imagine for example you are testing your application in the latest WebKit build that contains a new feature. Your customer’s Safari browser may not use the latest WebKit version, so it lacks that new WebKit feature. In this scenario, your test passes in WebKit, but fails in Safari used by your customer.

This of course applies to other combinations of “open-source builds vs. commercial builds” as well, e.g. Chromium vs. Chrome, and even to nightly vs. stable channels (mentioned further down below).

Headless ain’t headed

Headless browsers show different behavior in certain areas when compared to a headed instance, so do not assume that a test passing in a headless browser will also pass in a headed one, or vice versa.

Nightly/beta channel ain’t stable channel

This should be obvious. A pre-release channel may contain bugs, but it is also probable that a feature shipped in a pre-release channel is not available in the stable release channel yet. In other words: If you run your tests in a pre-release channel and see them pass, it could be that the same tests fail when run in the stable channel, or vice versa.

Mobile browser emulation ain’t an actual mobile browser

Similar to the previous section, this should be obvious as well. Sadly, browser test tool vendors’ marketing is often misleading and we observed people falling for it.

To make it clear: If you emulate an iPhone, e.g. in Chrome’s DevTools running on a desktop device, this only sets certain parameters (such as device pixel ratio, viewport size, touch events and more). This setup cannot magically reproduce a proper Safari browser on a proper iPhone. After all, the iPhone browser is emulated in a Chrome browser powered by the Blink engine, not by WebKit.
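To make the point concrete, here is a minimal sketch in plain JavaScript of what such an emulation profile actually is: just a bag of parameters. The values are illustrative assumptions loosely modeled on the iPhone device descriptors tools like Playwright ship, not official data.

```javascript
// Mobile "emulation" is just a bundle of parameters. The values below are
// illustrative assumptions, not an official device descriptor.
const iphoneEmulation = {
  viewport: { width: 390, height: 664 },
  deviceScaleFactor: 3,
  isMobile: true,
  hasTouch: true,
  userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 15_0 like Mac OS X) ...',
};

// Crucially, there is no "rendering engine" knob in such a profile: the page
// is still rendered by the host browser's engine (Blink in desktop Chrome),
// not by WebKit as on a real iPhone.
console.log('renderingEngine' in iphoneEmulation); // false
```

Everything in that profile can be emulated; the rendering engine cannot.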

Browser test tools

Every browser test tool has its purpose, its own strengths and deficits. Thus, comparing them is hard and in some aspects unfair. But for the sake of an overview, we provide comparison tables, clustered in separate sections to make the information digestible. After that, we discuss notable differences in more detail.

Browser test tool overview

All information is split as follows:

  • Basic information: Purpose, licence, year of release, architecture and more
  • Browser support: Which (mobile) browsers the tool supports
  • Validity of test results: How close the test results are to real life conditions
  • Feature set: Every tool we discuss in this document is able to do what you can expect from a browser automation tool: Navigate to URLs, find elements, click, type, drag/drop, read/create cookies, wait for conditions, start headless browsers, create screenshots, run tests/browsers in parallel etc. This section lists features that are not supported equally well across tools.
  • Developer experience: How easy it is to set up, to write and debug tests, how good the documentation is, etc.
  • Key strengths and limitations: What the tools are exceptionally good or bad at

One word about WebDriver: The comparison tables do not list specific WebDriver-based tools, because the overview would become too cluttered.

Basic information

A table showing basic information for each tool. A markdown version of the table is available at https://gist.github.com/systemboogie/7ce00ac06b933d12521a0281b005551c

Browser support

A table comparing browser support of each tool. A markdown version of the table is available at https://gist.github.com/systemboogie/5a59f7293cf6372339fcb0f5b5536e6d

Validity of test results

A table comparing the tools in terms of validity of test results. A markdown version of the table is available at https://gist.github.com/systemboogie/41fd47cd373db31e58a1c81dd3249ab6

Feature set

A table comparing the tools in terms of feature set. A markdown version of the table is available at https://gist.github.com/systemboogie/e7a90eb953b80f9669ef5f06c7eb87c6
  • Automatic wait on elements: The tool tries to automatically wait for an element before interacting with it. This eliminates a lot of explicit wait commands in the test code
  • Remote execution: The tool supports running browsers on a remote machine (not on the machine that executes the test code)
  • Video recordings: The tool can create a video recording of what happened in the browser during the test run
  • Switch tabs: The tool supports creating tabs in a browser instance
  • iFrame support: The tool supports switching between iFrames
  • Shadow DOM support: The tool works with Shadow DOM elements without extra effort
  • Network inspection/interception/mocking: The tool is able to “look” into what is happening in the network. Therefore, blocking or mocking network requests is also possible
  • Browser console inspection: The tool is able to “look” into the browser console
  • Test recorder: The tool is able to record interaction with the UI of an app and create test code from that
  • Test script debugger: The tool supports replaying a test in the browser, allowing the engineer to inspect what exactly was going on
  • Component testing: The tool supports testing of UI components in isolation (like it is often done with JSDOM)
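As an aside, the “automatic wait on elements” feature from the list above boils down to a polling loop. The sketch below (plain JavaScript; all names are invented for illustration) shows the pattern such tools implement internally, here simulating an element that appears after 200 ms:

```javascript
// Minimal sketch of the "automatic wait" pattern: poll a condition until it
// holds or a timeout expires, instead of sprinkling explicit sleeps into tests.
async function waitFor(condition, { timeout = 1000, interval = 50 } = {}) {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    if (await condition()) return true;
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
  throw new Error(`condition not met within ${timeout} ms`);
}

// Simulate an element that "appears" in the page after 200 ms.
let elementVisible = false;
setTimeout(() => { elementVisible = true; }, 200);

waitFor(() => elementVisible).then((found) => console.log('element found:', found));
```

Real tools of course poll the DOM (and check actionability, e.g. visibility and stability) rather than a boolean flag, but the retry-until-deadline shape is the same.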

Developer experience

A table comparing the developer experience of each tool. A markdown version of the table is available at https://gist.github.com/systemboogie/5aeb41c9f872451fd15d865890103634

Key strengths and limitations

A table comparing key strengths and limitations. A markdown version of the table is available at https://gist.github.com/systemboogie/3dc07a64a849507ac296ce6f07926379

Important differences and takeaways

Certain differences between the tools are worth discussing in more detail.

Different interpretations of “cross-browser”

All of the above-listed tools except for Puppeteer can claim that they support “cross-browser” testing. This is because they support running tests in more than one browser. Right? Well… technically, yes. But there are some things to keep in mind.

  • Cypress and Playwright do not support Safari: Instead, they support WebKit. We already discussed above that WebKit ain’t Safari
  • Cypress and Playwright do not offer testing on real mobile browsers: Instead, they rely on mobile emulation in the browser’s dev tools. We already discussed that mobile browser emulation ain’t an actual mobile browser
  • Puppeteer and Playwright couple browser versions to the version of the tool. The browsers installed by Playwright and Puppeteer are always a little ahead of the current stable browser versions. For example, if you upgrade Playwright, you get a new set of browser versions. If you do not upgrade Playwright for a long time, the browser versions covered in your tests lag behind the real world

Native vs. synthetic events

Let’s talk about user events emulated by the browser test tools, for example mouse events and keyboard events.

Playwright, Puppeteer and all WebDriver-based tools emulate “native events”. This is the type of events a real user creates when they interact with the browser. Native events are received by the browser from the operating system (OS).

Cypress and TestCafé, on the other hand, produce “synthetic events”. These events are produced by executing JavaScript directly in the browser; no OS-level events are involved.

It is important to understand that an application-under-test may behave differently, depending on whether it receives a native event or a synthetic event. To illustrate the difference by example of a mouse click, imagine a button that is covered by a transparent element. For a real user, the button would be visible, but not clickable — the transparent element would receive the click instead of the button. An automation tool producing a native click event would yield the same result. However, with a synthetic click event, the button would still receive the click.
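The overlay scenario can be illustrated with a small toy model (plain JavaScript, not a real DOM; all names are invented for illustration). A native click is hit-tested by the browser, so the topmost element at the click point receives it; a synthetic click is dispatched straight at the target element, skipping hit testing:

```javascript
const button  = { name: 'button',  zIndex: 1, clicks: 0 };
const overlay = { name: 'overlay', zIndex: 2, clicks: 0 }; // transparent, stacked on top

function nativeClick(elementsAtPoint) {
  // A native event is hit-tested by the browser: the topmost element
  // at the pointer position receives the click.
  const topmost = [...elementsAtPoint].sort((a, b) => b.zIndex - a.zIndex)[0];
  topmost.clicks += 1;
  return topmost.name;
}

function syntheticClick(target) {
  // A synthetic event (think element.dispatchEvent(new MouseEvent('click')))
  // is delivered straight to the target element; no hit testing happens.
  target.clicks += 1;
  return target.name;
}

console.log(nativeClick([button, overlay])); // "overlay" -- the user cannot reach the button
console.log(syntheticClick(button));         // "button"  -- the test "clicks" it anyway
```

In other words, the synthetic test passes while the real user is stuck.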

For context, resorting to synthetic click events in WebDriver has been considered a bad practice for a very long time. You can find discussions about the topic dating back to the pre-Cypress, pre-TestCafé era (examples from 2014 and 2016).

But there is more to it. Certain browser features just cannot be automated at all without native events. Some examples are: sending keys to the page (e.g. the tab key for accessibility testing), file upload, or full-screen mode.
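The tab-key case can also be illustrated with a toy model (plain JavaScript, not a real DOM; names invented for illustration). Moving focus on Tab is a default browser action: the browser performs it only when it receives the native key event from the OS, while a synthetic KeyboardEvent merely runs the page’s event listeners:

```javascript
const focusOrder = ['input#name', 'input#email', 'button#submit'];
let focusIndex = 0; // 'input#name' currently has focus

function nativeTab() {
  // For a native Tab key event, the browser performs its default action:
  // focus moves to the next focusable element.
  focusIndex = (focusIndex + 1) % focusOrder.length;
  return focusOrder[focusIndex];
}

function syntheticTab() {
  // dispatchEvent(new KeyboardEvent('keydown', { key: 'Tab' })) runs the
  // page's event listeners, but the browser performs no default action:
  // focus stays exactly where it was.
  return focusOrder[focusIndex];
}

console.log(nativeTab());    // focus moved on to "input#email"
console.log(syntheticTab()); // focus is still on "input#email"
```

This is why keyboard-navigation (accessibility) tests need a tool that produces native events.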

Feature set comparison

Cypress, Playwright and TestCafé set the benchmark when it comes to the feature set. For example, all three tools allow for…

  • … network inspection/interception/mocking
  • … browser console inspection
  • … automatic waiting on elements
  • … point-and-click recording of test steps
  • … debugging test scripts
  • … recording video and screenshots of the executed test

Anything based on WebDriver starts from a lower baseline. Some WebDriver-based tools implement certain features where possible (e.g. automatic waiting for elements), but they cannot implement features that the WebDriver protocol does not cover (e.g. browser console inspection). Third-party services like Browserstack and Sauce Labs can fill that gap, but not across all browsers.

Ease of use and developer experience

Cypress, Playwright and TestCafé lead the pack in terms of ease of use. They come bundled with everything required to write browser tests, are easy to learn, offer great features for debugging tests and offer comprehensive documentation.

Puppeteer and WebDriver-based tools do not shine here. However, to be fair with regards to WebDriver, there are differences between tools created by the Selenium project versus tools created by other maintainers. For example, WebdriverIO does not have that steep of a learning curve, has debugging functionality and comprehensive documentation.

Ownership, maintenance and governance

WebDriver is a W3C standard whose reference implementation is part of the Selenium project, which is open-source with the Software Freedom Conservancy as copyright holder. There are tools built upon WebDriver that follow a similar approach. One example is WebdriverIO, which is backed by the OpenJS Foundation.

Cypress, Playwright, Puppeteer and TestCafé are all in some way backed or maintained by a company. This may also apply to WebDriver-based tools. There is, for example, Nightwatch.js with Browserstack as copyright holder. Keep in mind that such companies may discontinue working on their tool at any time. One famous example is Angular/Google announcing in 2021 that it would pull the plug on Protractor (a WebDriver-based tool).

Your project’s characteristics

All of the above information should help you get a decent understanding of browsers and browser automation tools. However, you cannot make a decision without taking some of your project’s characteristics into account. The most important seem to be browser support and certain team characteristics.

Which browsers do you have to support?

This is informed by several aspects. Here are some examples.

  • The type of application you are building: Desktop-only or both, desktop and mobile
  • The targeted region(s): At NEW WORK SE, this is usually DACH. If you target more/other regions, keep in mind that browser/device market share can be quite different there
  • The size of your userbase: More users means more diversity in browsers and devices. Also, if you have many users on your page, a small relative browser usage number may translate into a quite large absolute number
  • The targeted customers: In a B2B project, your customers may work on administered desktop machines that only allow certain browsers or browser versions. For example, some larger companies only allow their employees to use the Firefox Extended Support Release (ESR) channel instead of the latest stable Firefox channel

If you need inspiration with regards to browser share numbers, have a look at the reports of your tracking suite. Alternatively, explore the data available on the Statcounter browser market share overview.

What are your team’s characteristics?

Think about the characteristics of your team that may have an influence on your tool decision. Here are some examples.

  • Who will write the browser tests most of the time: Our suggestion is that frontend engineers do it, directly when working on the app. This most probably makes JavaScript a default as the programming language, but maybe you have someone “full-stack” on your team who prefers a different language
  • Expertise in browser automation tools: If you have never written automated browser tests (or not even any other code) before, the learning curve can be quite steep and you may want to start off with a tool that is easier to set up and learn

Conclusion

Trade-offs everywhere

It should be no surprise with all the things discussed above: None of the browser test tools mentioned in this document is a universal fit for any kind of project. There is no clear winner, no one tool to rule ’em all:

  • Playwright and Puppeteer do not offer great/any cross-browser support
  • Cypress, Playwright and Puppeteer do not offer mobile browser support
  • Cypress and TestCafé do not provide 100% valid test results
  • WebDriver-based solutions lack ease of use and good developer experience

This means that your path to a browser test tool decision will be paved with trade-offs. And after you arrived at a decision, brace yourself to also feel the trade-offs in your daily business.

Two types of tools

As of 2022, you can basically decide between two types of tools: The first type offers great features and developer experience (they satisfy engineers’ needs). The second type provides great support for off-the-shelf browsers and validity of test results (they satisfy your customers’ needs).

  • First type: Cypress, TestCafé
  • Both first and second type, to a degree: Playwright
  • Second type: WebDriver-based tools

Cypress and TestCafé are at home in the first category. Cypress has defined what can be expected from a browser test tool in terms of feature set, debuggability and developer experience. TestCafé is also a great tool in these categories. However, the lack of native events in both tools and some more limitations in Cypress (no iFrame support, no tab support, …) make them less attractive.

Playwright is somehow at home in both categories. It provides a great developer experience and also validity of test results, but it comes with one caveat: It has very limited browser support. It could be a good fit if your application is desktop-only and only supports Chrome and Firefox. If you are able to manage browser versions (i.e. always stay a little behind Playwright’s official releases), then this could work for you.

WebDriver-based tools are at home in the second category. WebDriver is boring technology in the best sense of the word. And it does a very good job at being where your customers are. Some tools built upon WebDriver offer an okay-ish developer experience and they are quick in picking up the latest WebDriver features. Good examples are WebdriverIO and Nightwatch.js.

A nuanced comparison

Of course it is a bit unfair to paint a black and white picture. So let’s compare the discussed tools in a nuanced manner, focusing on the following properties:

  • Browser support: How broad the browser support is and how close to real life conditions
  • Validity of test results: How close the test results are to real life conditions
  • Feature set: Number and type of features
  • Developer experience: Ease of use, debuggability, etc
  • Ownership, maintenance and governance: How independent the tool is from the favor of a company, how many maintainers it has, how independent it is in terms of governance
A radar chart comparing Cypress, Playwright, Puppeteer, TestCafé and WebDriver at a glance

Our recommendation

As mentioned in the introduction, this comparison of browser test tools is different, and here is why: We want to bring the customer perspective to the table. That may sound strange at first, since your customers do not pick your test tool and they do not know which tool you are using. But it’s them who feel the consequences if you miss bugs.

Ideally, your testing happens in a stable-channel off-the-shelf browser, because that is what your customers are using. And ideally, your testing happens with native OS-level events, because that is what your customers are producing when working with your app. So, in light of that, the two best options are currently:

  • Any WebDriver-based tool that provides good developer experience
  • Playwright in case your app is desktop-only and Chrome/Firefox-only

But in the end, of course, you have to decide for yourself. We would be glad if this document helps you on your way to an informed decision. Have fun building great apps for your customers. Good luck!

Thanks to Björn Brauer, Markus Wolf, Amilcar Gomes and Daniel Lauschke for reviewing this document!

Remarks

[1]: Developer quotes from the MDN Browser Compatibility Report 2020:

“Safari desktop does not support type=”date” on the `<input>` element on desktop though Safari iOS has support for this attribute.”

“So the scroll worked fine in Chrome in iOS, in Chrome and Safari on macOS. But in Safari for iOS there was somewhere where that scroll was interfering with our scroll.”

“With iOS 13 they changed something and now scrolling works like garbage on iOS, especially on older devices such as an iPhone SE or older iPads. All the animations are broken and it doesn’t feel natural if you’re swiping through this gallery. Now I’ve used this in 5 or 6 projects, and something that used to work fine is now broken.”

[2]: Tools built on top of WebDriver may have a different license

[3]: Chromium is also possible

[4]: Every given Playwright version supports: Chromium (current Chrome version + 1), Chrome/Edge current stable and beta, Firefox current stable, WebKit current trunk build

[5]: Depends on which tool you are using. For example, WebdriverIO ships with “automatic waits”

[6]: Depends on which tool you are using. WebdriverIO features a plugin that converts Chrome DevTools Recorder scripts to WebdriverIO test code

[7]: Depends on which tool you are using. Nightwatch.js supports component tests for Vue and React components

[8]: Depends on which tool you are using. Time to first test and debuggability is quite good with WebdriverIO, but not so much with the bare selenium-webdriver NPM package


Senior Software Engineer at NEW WORK SE; focused on topics related to testing and QA