Playwright for visual regression testing in Storybook is stupid easy
Playwright is shaping up to be an invaluable tool for testing. Imagine if Puppeteer could also run Firefox, Safari, and more, and imagine it had an even better API (which is saying something, because its API is pretty good to begin with!).
It sounds too good to be true, but that's what makes Playwright so amazing. Since we implemented it on Snyk's design system, it's done a great job of not only providing visual regression testing (VRT) but also catching browser-specific regressions, which has been invaluable.
In this example we'll show how to get Playwright to take screenshots of Storybook stories to catch regressions. But nothing here is Storybook-dependent; any other rendering of your components or pages will work (even your live production site, if you'd like).
Here's the basic premise from a technical standpoint:
- The components and/or pages we'd like to screenshot must be easily viewable in some way (e.g. Storybook)
- We'll create a static build of our components/pages both to test a "production" env as well as improve test speed
- To load that static build, we'll start up a local static file server
- Playwright will then use our static file server to take screenshots
And from a workflow standpoint, here's how you and your team would use VRT:
- Whenever a component and/or page is added, a VRT test is written and a screenshot is taken locally.
- Screenshots are committed to git locally as part of the PR.
- Visual changes are thus part of the PR review process (and GitHub has an incredible image diff tool built in to PR reviews!)
- Regressions are then caught in the offending PR, and must be resolved in the PR to get it passing again.
Step one: setup
Install Playwright just like you would any other test runner, via npm:
npm i -D @playwright/test servor
Note: this setup uses the super minimal servor as a static file server, but you can swap in any other static file server instead.
Next, add a playwright.config.ts file in your project root (this example uses Chrome, Firefox, and Safari, but even more are possible):
import { type PlaywrightTestConfig, devices } from '@playwright/test';

// settings
// note: env vars are strings, but webServer.port must be a number
const PORT = Number(process.env.PORT) || 5678;
const viewport = { width: 1280, height: 800 };
const deviceScaleFactor = 2;
const locale = 'en-US';

const config: PlaywrightTestConfig = {
  // note: full options are specified because `devices`
  // have inconsistent viewports, etc.
  projects: [
    {
      name: 'chromium',
      use: {
        userAgent: devices['Desktop Chrome'].userAgent,
        viewport,
        deviceScaleFactor,
        isMobile: false,
        hasTouch: false,
        defaultBrowserType: devices['Desktop Chrome'].defaultBrowserType,
        locale,
      },
    },
    {
      name: 'firefox',
      use: {
        userAgent: devices['Desktop Firefox'].userAgent,
        viewport,
        deviceScaleFactor,
        isMobile: false,
        hasTouch: false,
        defaultBrowserType: devices['Desktop Firefox'].defaultBrowserType,
        locale,
      },
    },
    {
      name: 'webkit',
      use: {
        userAgent: devices['Desktop Safari'].userAgent,
        viewport,
        deviceScaleFactor,
        isMobile: false,
        hasTouch: false,
        defaultBrowserType: devices['Desktop Safari'].defaultBrowserType,
        locale,
      },
    },
  ],
  webServer: {
    // servor takes <root> <fallback> <port>; the root is the static Storybook build
    command: `npx servor storybook-static index.html ${PORT}`,
    // the port also becomes the baseURL (http://localhost:PORT), so tests can use relative URLs
    port: PORT,
  },
};

export default config;
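The three project entries in this config repeat the same block of settings. One possible way to trim the repetition, sketched here on the assumption that every desktop project shares the same viewport, scale factor, and locale, is a small factory function. `desktopProject` and `DeviceLike` are hypothetical names for illustration, not part of Playwright.

```typescript
// Shared settings, using the same values as the config
const viewport = { width: 1280, height: 800 };
const deviceScaleFactor = 2;
const locale = 'en-US';

/** The two fields borrowed from a Playwright device descriptor (hypothetical type) */
interface DeviceLike {
  userAgent: string;
  defaultBrowserType: 'chromium' | 'firefox' | 'webkit';
}

/** Build one project entry with identical viewport settings (hypothetical helper) */
function desktopProject(name: string, device: DeviceLike) {
  return {
    name,
    use: {
      userAgent: device.userAgent,
      viewport,
      deviceScaleFactor,
      isMobile: false,
      hasTouch: false,
      defaultBrowserType: device.defaultBrowserType,
      locale,
    },
  };
}
```

In playwright.config.ts the projects array could then shrink to something like `projects: [desktopProject('chromium', devices['Desktop Chrome']), …]`, with one call per browser. Whether that is clearer than the spelled-out version is a matter of taste.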
Then we'll set up our test:vrt script in package.json:
  "scripts": {
+   "build:sb": "npx build-storybook && npx sb extract",
+   "test:vrt": "npm run build:sb && npx playwright test"
  }
This shows the most basic setup where a fresh Storybook build is done before every test run. This can get pretty slow, so adjust to taste if you find yourself wanting to run VRT on its own. You can (and should!) go beyond this post and configure your test setup to whatever works best for you.
Lastly, you'll want to install all the browser binaries necessary to run your tests:
npx playwright install
From time to time you'll need to run npx playwright install again as Playwright updates and newer browser binaries become available. But it's not something you have to think about; Playwright will give you a friendly reminder in its test output when it's necessary to do so.
After you've read this blog post and kicked the tires a bit, you may come back to add more setup here or creature comforts like the VS Code Extension. Playwright has built a wonderful little ecosystem of fun things to explore that may be huge timesavers for you and/or your team (and for as hard as test writing can be, every little thing helps!).
Step two: writing a test
If you've written tests for Puppeteer, Cypress, or some other browser-based test runner, you'll find lots of similarities in Playwright's general load browser → navigate to URL → write assertions workflow. But even if you haven't, Playwright is the friendliest introduction to browser-based testing I've ever seen.
Since we're using Storybook, we'll write a loadStory function to make Storybook loading easier. You'll definitely want to move this into a common utilities file.
// src/test/utils.ts
import * as Playwright from '@playwright/test';
import fs from 'fs';

// config
const PROJECT_ROOT = new URL('../../', import.meta.url);
const STORYBOOK_DIR = new URL('./storybook-static/', PROJECT_ROOT);

/** Load Storybook story for Playwright testing */
export async function loadStory(page: Playwright.Page, storyID: string) {
  // load stories.json
  const storiesManifestLoc = new URL('stories.json', STORYBOOK_DIR);
  if (!fs.existsSync(storiesManifestLoc)) {
    console.error('Could not find storybook-static/stories.json. Try rebuilding with `npm run build:sb`');
    process.exit(1);
  }
  const storiesManifest = JSON.parse(fs.readFileSync(storiesManifestLoc, 'utf8'));

  // load specific story
  const storyMeta = storiesManifest.stories[storyID];
  if (!storyMeta) {
    console.error(`Could not find story ID "${storyID}". Try rebuilding with \`npm run build:sb\``);
    process.exit(1);
  }
  const search = new URLSearchParams({ viewMode: 'story', id: storyMeta.id });
  await page.goto(`iframe.html?${search.toString()}`, { waitUntil: 'networkidle' });

  // wait for page to finish rendering before starting test
  await page.waitForSelector('#root');
}
Here's the main gist of what this is doing:
- stories.json was made from the npx sb extract command earlier. We use this to load stories.
- await page.goto(…) tells Playwright to visit a specific story URL (in the isolated iframe mode)
- { waitUntil: 'networkidle' } is a handy little option to tell Playwright to wait for all assets to load before beginning the test
- await page.waitForSelector('#root') is the final piece needed to ensure Playwright doesn't get too antsy and start taking screenshots of pages before any components have rendered
- There are a few friendly console.error messages in places to help debugging
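To make the URL step concrete, here is a minimal sketch of just the query-string construction, pulled out into a hypothetical `storyUrl` helper (the name is made up for illustration and is not part of Storybook or Playwright). It assumes the story ID has already been checked against stories.json.

```typescript
/** Build the relative iframe URL for a Storybook story ID (hypothetical helper) */
export function storyUrl(storyID: string): string {
  // viewMode=story renders the story on its own, without the Storybook UI around it
  const search = new URLSearchParams({ viewMode: 'story', id: storyID });
  return `iframe.html?${search.toString()}`;
}

// Example: the button story used in this post
export const buttonUrl = storyUrl('components-button--default');
```

For the button story this produces iframe.html?viewMode=story&id=components-button--default, a relative URL that page.goto() resolves against the local static server.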
With setup out of the way, we'll write our first test. Let's pretend we want to take a screenshot of our button story, which lives at /story/components-button--default. Strap in, because it's going to be a real doozy:
// src/components/button/button.test.ts
import { expect, test } from '@playwright/test';
import { loadStory } from '../../test/utils.js';

test('Button: default', async ({ page }) => {
  await loadStory(page, 'components-button--default');
  await expect(page).toHaveScreenshot();
});
Just kidding, that's it! In two lines, we can load a page and take a screenshot. When you run npm run test:vrt the first time, it will create PNGs from the screenshots it took:
src/components/button/button.jsx
src/components/button/button.test.ts
+ src/components/button/button.test.ts-snapshots/Button-default-chromium-darwin.png
+ src/components/button/button.test.ts-snapshots/Button-default-firefox-darwin.png
+ src/components/button/button.test.ts-snapshots/Button-default-webkit-darwin.png
Commit these to your project. Playwright will compare against these files in future runs, and if the pixel diff is significant enough, it will fail the test.
✨ Tip: our example uses the default .test.ts extension, but if you're also using another test runner for unit testing, like Vitest, you'll have conflicts. Consider using something like an .e2e.ts extension for Playwright tests, which you can set via the testMatch config option.
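As a rough sketch of that tip, testMatch accepts a regular expression, so one option is a pattern that only matches files ending in .e2e.ts. The pattern and the file names below are illustrative assumptions, not taken from a real project.

```typescript
// Hypothetical pattern: treat only files ending in `.e2e.ts` as Playwright tests
export const testMatch = /.*\.e2e\.ts$/;

// Fragment to merge into the config object in playwright.config.ts
export const configFragment = { testMatch };

// Quick check of which files the pattern would pick up
const files = [
  'src/components/button/button.e2e.ts', // Playwright VRT test
  'src/components/button/button.test.ts', // unit test, left to the other runner
];
export const playwrightFiles = files.filter((file) => testMatch.test(file));
```

Only the .e2e.ts file ends up in playwrightFiles. The unit test runner would need the mirror-image setting so it skips .e2e.ts files, and renaming a test file also changes the name of its snapshots folder, so existing screenshots would need to move with it.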
To run tests again, run npm run test:vrt locally to run the full suite. Any regressions should either be fixed or, if the change was intentional, have their screenshots updated with npx playwright test --update-snapshots.
Step three: writing a behavioral test
The real win in using Playwright is testing component interactivity cross-browser, and ensuring it behaves as expected. Let's expand our button tests with one testing the focus state:
// src/components/button/button.test.ts
import { expect, test } from '@playwright/test';
import { loadStory } from '../../test/utils.js';

test('Button: default', async ({ page }) => {
  // …
});

test('Button: focus', async ({ page }) => {
  await loadStory(page, 'components-button--default');
  // focus on button (note: Storybook has several hidden buttons; select ours with text "Button")
  await page.locator('text=button').first().focus();
  await expect(page).toHaveScreenshot();
});
Because we're using actual browsers, we can simply select an element and call .focus(). This is powerful: we get to test accessibility compliance automatically across multiple browsers. And there's so much more you could test, such as:
- Testing forms show validation errors after blur, etc.
- Testing dialogs open as expected
- Testing keyboard functionality on complex components (e.g. comboboxes)
- Responsive testing using page.setViewportSize()
If you're testing the accessibility of a website, such as a visible focus ring, the only real way to test this is with VRT. And it couldn't be easier to do than with Playwright.
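Two of the ideas from the list above might look something like the following. This is a sketch rather than a tested suite: it reuses the loadStory helper and the button story from earlier, assumes the button is the first focusable element in the story, and picks 375×667 as an arbitrary phone-sized viewport.

```typescript
// src/components/button/button.test.ts (additional tests, sketch)
import { expect, test } from '@playwright/test';
import { loadStory } from '../../test/utils.js';

test('Button: keyboard focus', async ({ page }) => {
  await loadStory(page, 'components-button--default');
  // Press Tab like a keyboard user would, instead of calling .focus() directly
  // (assumption: the button is the first focusable element on the page)
  await page.keyboard.press('Tab');
  await expect(page).toHaveScreenshot();
});

test('Button: narrow viewport', async ({ page }) => {
  // Resize before loading so the story renders at the smaller size from the start
  // (375x667 is an assumed example size, not a value from this setup)
  await page.setViewportSize({ width: 375, height: 667 });
  await loadStory(page, 'components-button--default');
  await expect(page).toHaveScreenshot();
});
```

Each test produces its own screenshot per browser, so the number of committed PNGs grows quickly. It is worth being selective about which states deserve a snapshot.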
Step four: Playwright VRT in CI
Of course, what good is VRT if it's not baked into continuous integration? Tests are worthless if they're never run, after all.
It's worth noting that for VRT to work, the test environment must match the environment that took the screenshots. Meaning, if you and your team take screenshots on a Mac, VRT in CI needs to run on a Mac. We'll come back to the scenario where this isn't possible for you, and why paid VRT tools do it differently. But for now we'll run with the assumption that committing screenshots to git as part of a pull request works best for you and your team.
We'll set up our VRT suite using GitHub Actions, but this can be done in any CI setup. We'll create our /.github/workflows/ci.yml file along with a vrt job that runs our script:
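One way to make that constraint visible is to fail fast when the suite runs on a different platform than the one the screenshots were captured on. The sketch below is an illustration under assumptions: `BASELINE_PLATFORM` and `platformMismatch` are hypothetical names, and 'darwin' is assumed because the screenshot files listed earlier carry a -darwin suffix.

```typescript
/** Platform the committed screenshots were taken on (assumed: macOS) */
const BASELINE_PLATFORM = 'darwin';

/**
 * Return a warning message when the current platform differs from the one
 * the baselines were captured on, or null when they match.
 */
export function platformMismatch(
  current: string,
  baseline: string = BASELINE_PLATFORM,
): string | null {
  if (current === baseline) return null;
  return `Screenshots were taken on "${baseline}" but tests are running on "${current}".`;
}
```

Calling platformMismatch(process.platform) near the top of playwright.config.ts, and throwing when it returns a message, would give one clear error instead of a wall of failed screenshot comparisons.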
on:
  push:
    branches:
      - main
  pull_request:
  workflow_dispatch:
jobs:
  vrt:
    runs-on: macos-12 # note: "macos-latest" is 11.x
    concurrency:
      group: ${{ github.ref }}/vrt
      cancel-in-progress: true
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 18
      - run: npm ci --ignore-scripts
      - run: npx playwright install --with-deps
      - run: npm run test:vrt
      - name: Upload failures
        if: ${{ failure() }}
        uses: actions/upload-artifact@v3
        with:
          name: Playwright Failure
          path: test-results/
This will run in GitHub Actions on every PR and report any failures. Playwright will list the specific tests that failed in its error logs, but to see the actual diff you'll need to download the artifact it generated. Assuming it was an actual regression and not a flaky test, the PR diff itself will give pretty strong hints as to what caused the regression.
You could get even fancier with this, like using GitHub's comment API to post the image diffs in a PR comment. But this is just a basic setup which took very little time to create, and is about as low-cost as you can get.
That's all there is to it! This is all you need to get up and running with Playwright VRT for the low, low cost of free. And Playwright provides some exciting tools to iterate on this foundation to best meet your team's needs.
Sidenote: when local screenshots aren't possible
Back to the problem mentioned earlier of creating/updating screenshots locally: when does that fall apart? You may need an alternate approach if:
- It isn't possible for your CI environment to match your local machine
- The people that need to update screenshots use different operating systems
- You're finding local screenshots are too flaky and/or too inconsistent between machines
The solution for all of the above is to move screenshot creation into CI. But this raises other questions, such as: When should this happen? Who has the authority to approve this? How can changes be tracked?
The simplest solution is to create a manual workflow that can be run from the GitHub UI inside each PR. You'll have to make sure the action makes a commit in the right place and follows all guidelines you have in place. But there are dozens of other ways to solve this problem that will differ based on team and project needs.
There are also additional problems to solve by moving screenshot creation to CI, such as:
- The VRT tests now can't be run locally
- The dev workflow now requires usage of an external website
- Diffs/regressions will be extremely annoying to look at
- The mechanism to update screenshots now has to be maintained
Solving all those problems will take either a decent investment or buying into one of the many paid VRT SaaS products that handle everything for you. At this point it's up to you to decide what's best. But arguably the best approach to using Playwright is the local screenshots/local testing approach outlined in the previous steps. So try to exhaust every option to make that possible before trying to generate screenshots in CI.
Additional tips
- Sadly, dialing in tests to reduce flakiness is just part of VRT. Even paid VRT products struggle with this. Fortunately, you can set sensible defaults in your global config as well as using overrides in individual tests (but worth noting that Playwright has pretty great defaults out-of-the-box that are neither too strict nor too lax).
- Allow for some margin of error with text antialiasing (⚠️ warning: trying to get perfect results is a rabbit hole)
- maxDiffPixels and maxDiffPixelRatio can be hard to choose between. Whereas maxDiffPixels is great at catching precise regressions such as a thin border change, it also increases test flakiness by not dealing well with text antialiasing noise. Likewise, maxDiffPixelRatio is ideal for text-heavy screenshots, but it can sometimes be too loose when it comes to catching subtle pixel regressions for larger components/pages. Knowing when to use one over the other will take experimentation and distinguishing between the goals of what you're trying to capture in your VRT.
- Screenshots have too much whitespace? Try the clip option
- Add a suggestion to update branches if possible. This helps eliminate the possibility of a VRT failure caused by merging an out-of-date PR.
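To put rough numbers on the maxDiffPixels versus maxDiffPixelRatio trade-off from the tips above, this sketch computes how many differing pixels a given ratio tolerates. It assumes screenshots are measured in CSS pixels (the default scale for toHaveScreenshot, as far as I know), and both `pixelBudget` and the 0.01 ratio are made-up example values rather than recommendations.

```typescript
/** Number of differing pixels a ratio allows for a screenshot of the given size */
export function pixelBudget(width: number, height: number, ratio: number): number {
  return Math.round(width * height * ratio);
}

// Full-viewport screenshot: 1280 x 800 = 1,024,000 pixels
export const pageBudget = pixelBudget(1280, 800, 0.01); // 10240

// A hypothetical 120 x 40 button clipped out of the page: 4,800 pixels
export const buttonBudget = pixelBudget(120, 40, 0.01); // 48

// Example defaults to merge under `expect` in playwright.config.ts
export const expectDefaults = {
  toHaveScreenshot: { maxDiffPixelRatio: 0.01 },
};
```

At 1%, the full-page screenshot tolerates 10,240 changed pixels. A one-pixel border around the whole page is only about 4,000 pixels, so it would slip through, while the same ratio on the clipped button allows just 48. That is the sense in which a ratio can be too loose for large screenshots and why an absolute maxDiffPixels, or a tighter clip, can suit those cases better.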