How to Fix Flaky Tests in Playwright: 10 Battle-Tested Strategies

5 min read

Flaky tests destroy team confidence and slow deployment. Here are 10 proven strategies to eliminate flaky Playwright tests — from someone who's fixed thousands of them.

The Cost of Flaky Tests

A flaky test is one that sometimes passes and sometimes fails without any code changes. They seem harmless at first — just re-run the pipeline, right?

Wrong. Flaky tests are silent killers:

  • Eroded trust — Teams stop believing test results
  • Wasted time — Engineers debug tests instead of building features
  • Slower releases — "Just re-run it" becomes the norm
  • Hidden bugs — Real failures get dismissed as "flaky"

At one company, I inherited a test suite with a 68% pass rate. Not because the application was broken — but because the tests were. Here's how I fixed it.

Strategy 1: Use Role-Based Locators

The problem: CSS selectors and XPath break when developers change class names or restructure HTML.

The fix: Use Playwright's role-based locators:

Bad:

page.locator('#submit-btn')

page.locator('.form-container > button:nth-child(2)')

Good:

page.getByRole('button', { name: 'Submit' })

page.getByLabel('Email address')

page.getByText('Welcome back')

Role-based locators are more resilient because they target what users see, not implementation details.

Strategy 2: Never Use Hard-Coded Waits

The problem: waitForTimeout() is one of the most common causes of flaky tests.

await page.waitForTimeout(5000); // ❌ Please don't do this

Why it fails: 5 seconds might be enough on your machine but not in CI. Or it might be way too long, slowing tests unnecessarily.

The fix: Wait for specific conditions:

await page.waitForLoadState('networkidle'); // fallback only; prefer the assertions below

await expect(page.getByRole('button', { name: 'Submit' })).toBeEnabled();

await expect(page.getByText('Success')).toBeVisible();

Playwright's auto-waiting covers most cases: actions such as click() wait for the element to be visible, stable, and enabled before they run. Trust it.

Strategy 3: Isolate Test State

The problem: Tests depend on state from previous tests.

test('login', ...);  // Creates session

test('add to cart', ...);  // Expects logged-in state

If the login test fails, the cart test also fails — but not because of a cart bug.

The fix: Each test should set up its own state:

test.beforeEach(async ({ page }) => {
  await loginAsUser(page, 'testuser');
});

Or log in once, save Playwright's storage state to a file, and reuse it so tests share authentication without depending on each other:

await page.context().storageState({ path: 'auth.json' });
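
One way to wire this up is the setup-project pattern from Playwright's authentication docs, sketched below. The auth.setup.ts file name, the project names, and the auth.json path are assumptions for illustration. The idea is that one setup project logs in and saves the state, and every other project starts from that saved state.

```typescript
// playwright.config.ts (sketch; project names and paths are illustrative)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Runs first: a hypothetical auth.setup.ts logs in and calls
    // page.context().storageState({ path: 'auth.json' })
    { name: 'setup', testMatch: /.*\.setup\.ts/ },
    {
      name: 'chromium',
      use: {
        ...devices['Desktop Chrome'],
        // Every test in this project starts with the saved cookies and local storage
        storageState: 'auth.json',
      },
      // Project-level ordering: wait for the setup project to finish
      dependencies: ['setup'],
    },
  ],
});
```

Note that dependencies here orders projects, not individual tests, so the tests themselves stay independent of one another.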

Strategy 4: Handle Loading States Explicitly

The problem: Clicking a button that's still loading, or reading text before it's rendered.

The fix: Wait for loading indicators to disappear:

// Wait for spinner to go away
await expect(page.getByTestId('loading-spinner')).toBeHidden();
// Then interact with the element
await page.getByRole('button', { name: 'Submit' }).click();

Or wait for the element to be in a specific state:

const submit = page.getByRole('button', { name: 'Submit' }); // name the button so the locator matches exactly one element
await expect(submit).toBeEnabled();
await submit.click();

Strategy 5: Use data-testid for Dynamic Content

The problem: Dynamically generated elements often lack a stable role, label, or text to target.

The fix: Add data-testid attributes for testing:

In your application code:

<button data-testid="checkout-button">Checkout</button>

In your test:

await page.getByTestId('checkout-button').click();

This creates a contract between frontend and tests that survives refactoring.

Strategy 6: Retry Failed Assertions (Not Whole Tests)

The problem: A test fails once and you re-run the entire suite.

The fix: Use Playwright's built-in expect retries:

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  expect: {
    timeout: 10000, // Wait up to 10 seconds for assertions
  },
});

Assertions like toBeVisible() and toHaveText() will automatically retry until timeout — no manual retries needed.

Strategy 7: Handle Network Variability

The problem: API calls take longer in CI than locally.

The fix: Wait for network responses explicitly:

// Wait for a specific API call to complete (register this wait before triggering the request)
await page.waitForResponse(resp =>
  resp.url().includes('/api/products') && resp.status() === 200
);
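
Ordering matters here: the wait has to be registered before the request fires, otherwise the response can arrive first and the wait times out. The sketch below wraps that ordering in a helper. ResponseLike and PageLike are minimal stand-ins (an assumption, so the example runs without Playwright installed) for the Response and Page types you would normally import from @playwright/test, and triggerAndAwaitProducts is a hypothetical name.

```typescript
// Minimal stand-ins for the parts of Playwright's Response and Page used here.
// In a real suite, import { Page, Response } from '@playwright/test' instead.
interface ResponseLike {
  url(): string;
  status(): number;
}
interface PageLike {
  waitForResponse(predicate: (resp: ResponseLike) => boolean): Promise<ResponseLike>;
}

// Same condition as in the snippet above, pulled out so it can be reused.
function isProductsOk(resp: ResponseLike): boolean {
  return resp.url().includes('/api/products') && resp.status() === 200;
}

// Start listening BEFORE triggering the request, then await the result.
async function triggerAndAwaitProducts(
  page: PageLike,
  trigger: () => Promise<void>,
): Promise<ResponseLike> {
  const responsePromise = page.waitForResponse(isProductsOk); // no await yet
  await trigger(); // the action that causes the request, e.g. a click
  return responsePromise;
}
```

In a test this might be called as triggerAndAwaitProducts(page, () => page.getByRole('button', { name: 'Refresh' }).click()), where the Refresh button is a made-up example.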

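Or use networkidle for simpler cases, keeping in mind that Playwright's documentation discourages it for tests in favor of web-first assertions: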

await page.goto('/dashboard', { waitUntil: 'networkidle' });

Strategy 8: Run Tests in Parallel Correctly

The problem: Tests interfere with each other when running in parallel.

Test A creates user "testuser@example.com"

Test B also creates user "testuser@example.com"

One fails due to duplicate email.

The fix: Use unique data per test:

import { faker } from '@faker-js/faker';
const email = faker.internet.email();
await page.getByLabel('Email').fill(email);

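Playwright already runs each test in its own browser context, so cookies and local storage don't leak between tests. That does not isolate server-side data, though, which is why unique test data still matters.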
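
Random data makes collisions unlikely. If you also want leftover records to be traceable, one option is to build the value from a UUID plus the worker index. A minimal sketch, where uniqueEmail is a hypothetical helper and example.com is a placeholder domain:

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical helper: builds an email address that is unique per call.
// workerIndex is meant to come from Playwright's testInfo.workerIndex,
// so a leftover record can be traced back to the worker that created it.
function uniqueEmail(prefix: string, workerIndex: number): string {
  return `${prefix}-w${workerIndex}-${randomUUID()}@example.com`;
}
```

In a test, this might be called as uniqueEmail('signup', testInfo.workerIndex), where testInfo is the second argument Playwright passes to the test function.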

Strategy 9: Use Trace Viewer for Debugging

The problem: You can't see what happened when a test failed in CI.

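The fix: Enable retries and record a trace on the first retry:

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: process.env.CI ? 2 : 0, // 'on-first-retry' only records when a test is retried
  use: {
    trace: 'on-first-retry',
  },
});

Now, when a test fails and retries, Playwright captures: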

  • Screenshots at every step
  • DOM snapshots
  • Network requests
  • Console logs

Open traces with:

npx playwright show-trace trace.zip

This is the single best debugging tool for flaky tests.

Strategy 10: Set Realistic Timeouts

The problem: Playwright's defaults (30 seconds per test, 5 seconds per assertion) can be too short for slow environments.

The fix: Configure appropriate timeouts based on your CI environment:

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  timeout: 60000,        // Test timeout: 60 seconds
  expect: {
    timeout: 10000,      // Assertion timeout: 10 seconds
  },
  use: {
    actionTimeout: 15000, // Click/fill timeout: 15 seconds
    navigationTimeout: 30000, // Page load timeout: 30 seconds
  },
});

Don't make them too long — slow failures are frustrating. Find the right balance for your infrastructure.
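
If CI is consistently slower than local machines, one approach is to keep a single set of base values and scale them for CI. The sketch below uses a 2x multiplier purely as an illustration. buildTimeouts is a hypothetical helper, and the base numbers are examples chosen so that the CI values match the config above, not recommendations.

```typescript
// Hypothetical helper: derives Playwright timeout settings from one multiplier.
interface Timeouts {
  timeout: number;             // per-test timeout (ms)
  expect: { timeout: number }; // assertion timeout (ms)
  use: { actionTimeout: number; navigationTimeout: number };
}

function buildTimeouts(isCI: boolean): Timeouts {
  const factor = isCI ? 2 : 1; // illustrative multiplier; tune for your infrastructure
  return {
    timeout: 30_000 * factor,
    expect: { timeout: 5_000 * factor },
    use: {
      actionTimeout: 7_500 * factor,
      navigationTimeout: 15_000 * factor,
    },
  };
}

// In playwright.config.ts this could be spread into the config, assuming
// your CI provider sets the CI environment variable:
//   export default defineConfig({ ...buildTimeouts(!!process.env.CI) });
```

Keeping the numbers in one place makes it easier to notice when a timeout has crept up to hide a real performance problem.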

Bonus: The Flaky Test Triage Process

When you encounter a flaky test, follow this process:

1. Reproduce locally

Run the test 10 times:

npx playwright test tests/checkout.spec.ts --repeat-each=10

If it passes every time locally but fails in CI, it's likely a timing or environment issue.
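
Keep in mind that 10 clean runs do not prove much for a test that fails rarely. Assuming each run fails independently with the same probability p (a simplification), the chance of seeing at least one failure in n runs is 1 - (1 - p)^n. A small sketch of that arithmetic, with hypothetical function names:

```typescript
// Probability of seeing at least one failure in `runs` runs,
// assuming each run fails independently with probability `failureRate`.
function chanceOfCatchingFlake(failureRate: number, runs: number): number {
  return 1 - Math.pow(1 - failureRate, runs);
}

// Smallest number of runs needed to see a failure with the given confidence.
function runsNeeded(failureRate: number, confidence: number): number {
  return Math.ceil(Math.log(1 - confidence) / Math.log(1 - failureRate));
}

console.log(chanceOfCatchingFlake(0.05, 10).toFixed(2)); // 0.40
console.log(runsNeeded(0.05, 0.95));                     // 59
```

Under these assumptions, a test that fails 5% of the time passes all 10 local runs about 60% of the time, so a larger --repeat-each value is worth trying before concluding the problem is CI-only.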

2. Check the trace

Open the trace file from CI and look for:

  • Slow network requests
  • Elements not visible when clicked
  • Unexpected modals or popups

3. Identify the root cause

Common causes:

  • Hard-coded wait times
  • Race conditions
  • Shared test state
  • Unstable locators

4. Fix or delete

If you can't fix it after 30 minutes, delete it. A flaky test is worse than no test. You can always rewrite it properly later.

My Flaky Test Scorecard

After applying these strategies at CooperVision, we went from a 68% pass rate to 98%+ in three months.

  • Pass rate: 68% before, 98.5% after
  • Average test time: 4.2 minutes before, 1.8 minutes after
  • "Re-run pipeline" requests: daily before, rare after
  • Team trust in automation: low before, high after

The biggest win wasn't technical — it was cultural. When tests are reliable, developers actually care about failures.

Need Help with Your Flaky Tests?

If your test suite is unreliable and slowing down your team, I can help. I've fixed test suites at Fortune 500 companies and can usually identify the main issues in just a few hours.

Book a consultation
