I Built Tautest: A Mutation Testing Workflow for AI-Written Tests
🇺🇸 United States • May 11, 2026

Originally published by Dev.to

AI coding agents are getting really good at writing tests.

But I kept running into one uncomfortable problem:

Passing tests do not always mean strong tests.

Sometimes an AI agent writes tests that pass, but those tests only confirm that the current implementation runs. They do not necessarily prove that the behavior is protected.

That is why I built Tautest.

Tautest is an open-source CLI and GitHub Action that runs mutation testing on changed source lines, finds weak tests, and generates an AI-ready fix prompt for Claude Code, Cursor, Codex, or human reviewers.

GitHub:

https://github.com/canblmz1/tautest

npm package:

https://www.npmjs.com/package/tautest

The problem

Let's say your code has a condition like this:

if (age >= 65) {
  return subtotal * 0.2;
}

Your normal tests might pass.

But what if this condition is mutated to:

if (age > 65) {
  return subtotal * 0.2;
}

If your tests still pass, then the exact boundary at 65 is not protected.

That is a weak test.

This is the kind of thing Tautest is designed to expose.
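
To make the boundary gap concrete, here is a minimal sketch. The name `calculateDiscount` appears later in this post, but its full body (including the `return 0` fallback), the hand-written `mutantDiscount` copy, and the two check helpers are assumptions made for illustration only.

```typescript
// Assumed shape of the function under test: 20% discount from age 65, otherwise none.
function calculateDiscount(age: number, subtotal: number): number {
  if (age >= 65) {
    return subtotal * 0.2;
  }
  return 0;
}

// Hand-written copy of the mutant: `>=` replaced by `>`.
function mutantDiscount(age: number, subtotal: number): number {
  if (age > 65) {
    return subtotal * 0.2;
  }
  return 0;
}

type Discount = (age: number, subtotal: number) => number;

// A weak check: it only samples ages far away from the boundary.
function weakCheck(fn: Discount): boolean {
  return fn(70, 100) === 20 && fn(30, 100) === 0;
}

// A boundary check: it pins the behavior at exactly 65.
function boundaryCheck(fn: Discount): boolean {
  return fn(65, 80) === 16;
}

console.log(weakCheck(calculateDiscount), weakCheck(mutantDiscount)); // true true
console.log(boundaryCheck(calculateDiscount), boundaryCheck(mutantDiscount)); // true false
```

The weak check cannot tell the original from the mutant, so the mutant survives. The boundary check passes on the original and fails on the mutant, which is what "killing" a mutant means.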

Demo

Regular tests pass, but Tautest finds a surviving mutant that the tests missed. After adding the missing boundary test, the mutation score improves to 100%.

[Image: Tautest demo]

What Tautest does

Tautest is not a mutation testing engine.

It uses StrykerJS as the mutation testing engine and adds a workflow layer around it.

Tautest:

  • reads changed source lines from git diff
  • runs StrykerJS mutation testing only on those changed lines
  • parses surviving mutants
  • generates Markdown, JSON, and terminal reports
  • writes an AI-ready fix prompt
  • can post a sticky GitHub PR comment
  • supports Vitest
  • has Jest beta support

The goal is simple:

Do not just ask whether the tests pass. Ask whether the tests fail when the behavior is mutated.
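
The first step in the list, reading changed source lines from git diff, can be sketched roughly as follows. This is not Tautest's actual implementation, which this post does not show. It assumes the input is the text produced by `git diff --unified=0 <base>`, and the function name `changedLines` is hypothetical.

```typescript
// Maps each changed file to the line numbers added or modified in its new version.
function changedLines(diffText: string): Map<string, Set<number>> {
  const result = new Map<string, Set<number>>();
  let currentFile: string | null = null;

  for (const line of diffText.split("\n")) {
    // "+++ b/src/discount.ts" names the new version of the file.
    // (Sketch limitation: an added line whose content starts with "++ " would be misread.)
    if (line.startsWith("+++ ")) {
      const path = line.slice(4).trim();
      currentFile = path === "/dev/null" ? null : path.replace(/^b\//, "");
      continue;
    }
    // "@@ -2 +2,3 @@" means: three lines starting at new-file line 2.
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/.exec(line);
    if (hunk && currentFile !== null) {
      const start = Number(hunk[1]);
      // A missing count means exactly one line; a count of 0 is a pure deletion.
      const count = hunk[2] === undefined ? 1 : Number(hunk[2]);
      const lines = result.get(currentFile) ?? new Set<number>();
      for (let i = 0; i < count; i++) lines.add(start + i);
      result.set(currentFile, lines);
    }
  }
  return result;
}

const sampleDiff = [
  "diff --git a/src/discount.ts b/src/discount.ts",
  "--- a/src/discount.ts",
  "+++ b/src/discount.ts",
  "@@ -2 +2 @@",
  "-  if (age > 60) {",
  "+  if (age >= 65) {",
].join("\n");

console.log(Array.from(changedLines(sampleDiff).get("src/discount.ts") ?? [])); // [ 2 ]
```

Only the hunk headers are needed here, because with zero context lines each header already states which new-file lines were touched.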

Example output

A regular test run can be green:

Test Files  1 passed
Tests       3 passed

But Tautest can still find a surviving mutant:

Tautest: MIXED (75.00%, threshold 60.00%)
Killed: 3 | Survived: 1 | No coverage: 0

Top surviving mutants:
- src/discount.ts:2 EqualityOperator

The surviving mutant:

age >= 65  ->  age > 65

After adding the missing boundary test:

it("applies the senior discount at exactly 65", () => {
  expect(calculateDiscount(65, 80)).toBe(16);
});

Tautest reports:

Tautest: STRONG (100.00%, threshold 60.00%)
Killed: 4 | Survived: 0
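
The percentages in both reports are consistent with the usual mutation score arithmetic: killed mutants divided by all mutants that could have been killed. The sketch below reproduces the two numbers. The type and function names are hypothetical, other statuses such as timeouts are left out, and the rule Tautest uses to choose labels like MIXED or STRONG is not described in this post, so it is not modeled.

```typescript
interface MutantCounts {
  killed: number;
  survived: number;
  noCoverage: number;
}

// Mutation score as a percentage of killed mutants among killed, survived, and uncovered ones.
function mutationScore({ killed, survived, noCoverage }: MutantCounts): number {
  const total = killed + survived + noCoverage;
  // With nothing to mutate there is no meaningful score; NaN makes that explicit.
  return total === 0 ? Number.NaN : (killed / total) * 100;
}

// True when the score reaches the configured threshold (60 in the examples above).
function meetsThreshold(counts: MutantCounts, threshold: number): boolean {
  return mutationScore(counts) >= threshold;
}

console.log(mutationScore({ killed: 3, survived: 1, noCoverage: 0 }).toFixed(2)); // 75.00
console.log(mutationScore({ killed: 4, survived: 0, noCoverage: 0 }).toFixed(2)); // 100.00
```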

The AI fix prompt workflow

One thing I wanted Tautest to do was help AI coding agents write better tests without letting them rewrite production code.

So Tautest generates a file:

.tautest/fix-prompt.md

The prompt includes rules like:

  • do not change production code
  • only edit or add test files
  • every new test must pass against the original code
  • every new test must fail against the mutant behavior
  • do not weaken existing assertions
  • do not write filler tests like expect(true).toBe(true)
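
As a rough idea of how such a prompt could be assembled, the sketch below renders the rules above plus a list of surviving mutants into Markdown. The `SurvivingMutant` shape, the `buildFixPrompt` name, and the headings are all hypothetical; the real template in `.tautest/fix-prompt.md` may look quite different.

```typescript
// Hypothetical shape for one surviving mutant; not Tautest's actual data model.
interface SurvivingMutant {
  file: string;
  line: number;
  mutator: string;
  original: string;
  replacement: string;
}

// The rules listed above, phrased as instructions to the agent.
const RULES: readonly string[] = [
  "Do not change production code.",
  "Only edit or add test files.",
  "Every new test must pass against the original code.",
  "Every new test must fail against the mutant behavior.",
  "Do not weaken existing assertions.",
  "Do not write filler tests like expect(true).toBe(true).",
];

// Renders a Markdown prompt: fixed rules first, then one bullet per surviving mutant.
function buildFixPrompt(mutants: SurvivingMutant[]): string {
  const ruleLines = RULES.map((rule) => `- ${rule}`);
  const mutantLines = mutants.map(
    (m) => `- ${m.file}:${m.line} ${m.mutator}: \`${m.original}\` -> \`${m.replacement}\``,
  );
  return [
    "# Fix weak tests",
    "",
    "## Rules",
    ...ruleLines,
    "",
    "## Surviving mutants",
    ...mutantLines,
  ].join("\n");
}

console.log(
  buildFixPrompt([
    {
      file: "src/discount.ts",
      line: 2,
      mutator: "EqualityOperator",
      original: "age >= 65",
      replacement: "age > 65",
    },
  ]),
);
```

Because the prompt is built from a fixed rule list and the mutant data, the same mutation report always produces the same prompt, and no LLM call is involved.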

The workflow becomes:

  1. Run Tautest.
  2. Open .tautest/fix-prompt.md.
  3. Paste it into Claude Code, Cursor, Codex, or use it yourself.
  4. Add the missing test.
  5. Run your normal tests.
  6. Run Tautest again.

Install

For Vitest projects:

pnpm add -D tautest @stryker-mutator/core @stryker-mutator/vitest-runner
pnpm exec tautest init --yes --runner vitest --no-install
pnpm exec tautest doctor
pnpm exec tautest run --base origin/main

For Jest projects (Jest support is currently in beta):

pnpm add -D tautest @stryker-mutator/core @stryker-mutator/jest-runner
pnpm exec tautest init --yes --runner jest --no-install

GitHub Action usage

Tautest also ships with a GitHub Action that can run on pull requests and post a sticky PR comment.

name: Tautest

on:
  pull_request:

permissions:
  contents: read
  pull-requests: write

jobs:
  tautest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - uses: pnpm/action-setup@v4
        with:
          version: 10

      - run: pnpm install --frozen-lockfile
      - run: pnpm build

      - uses: canblmz1/tautest/packages/github-action@v1
        with:
          base: ${{ github.base_ref }}
          threshold: 60
          comment: changes
          cache: true

Important notes:

  • fetch-depth: 0 is required because Tautest diffs against the base branch and needs the full git history to do that.
  • pull-requests: write is required for sticky PR comments.
  • The v1 action currently ships from the monorepo path.
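
A "sticky" comment usually means one comment per pull request that is updated in place instead of a new comment on every push. The sketch below shows one common way to decide between creating and updating, using a hidden marker in the comment body. The marker string, the types, and `planStickyComment` are hypothetical, and the post does not say how the Tautest action implements this.

```typescript
// Hidden HTML comment used to recognize our own PR comment. The exact marker is hypothetical.
const MARKER = "<!-- tautest-sticky-comment -->";

interface PrComment {
  id: number;
  body: string;
}

type CommentAction =
  | { kind: "create"; body: string }
  | { kind: "update"; id: number; body: string };

// Decides whether to create a new comment or update the one posted earlier.
function planStickyComment(existing: PrComment[], report: string): CommentAction {
  const body = `${MARKER}\n${report}`;
  const previous = existing.find((comment) => comment.body.includes(MARKER));
  return previous === undefined
    ? { kind: "create", body }
    : { kind: "update", id: previous.id, body };
}

console.log(planStickyComment([], "Score: 75.00%").kind); // create
console.log(
  planStickyComment([{ id: 7, body: `${MARKER}\nScore: 75.00%` }], "Score: 100.00%").kind,
); // update
```

Carrying out the planned action would mean calling the GitHub API to create or edit a comment, which is why the workflow needs write permission on pull requests.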

What Tautest does not do

Tautest is intentionally limited.

It does not:

  • implement its own mutation engine
  • replace StrykerJS
  • call any LLM API
  • prove that your tests are perfect
  • fully support monorepos in v1
  • classify AI-written tests with certainty

It is a deterministic workflow:

changed source lines -> mutation testing -> surviving mutants -> report -> fix prompt
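
The middle of that pipeline, going from a mutation report to a list of surviving mutants, can be sketched like this. The interfaces cover only a small subset of the JSON report format that StrykerJS's JSON reporter writes. Tautest limits the mutation run itself to changed lines, so filtering a finished report by line, as done here, is an assumption chosen to keep the example self-contained. All function and type names are hypothetical.

```typescript
// Small subset of the mutation testing report that StrykerJS can write as JSON.
interface ReportMutant {
  mutatorName: string;
  status: string; // for example "Killed", "Survived", "NoCoverage"
  location: { start: { line: number } };
}

interface MutationReport {
  files: Record<string, { mutants: ReportMutant[] }>;
}

interface Survivor {
  file: string;
  line: number;
  mutatorName: string;
}

// Keeps only mutants that survived and that start on a changed line of a changed file.
function survivorsOnChangedLines(
  report: MutationReport,
  changed: Record<string, number[]>,
): Survivor[] {
  const survivors: Survivor[] = [];
  for (const [file, { mutants }] of Object.entries(report.files)) {
    const lines = changed[file];
    if (lines === undefined) continue;
    for (const mutant of mutants) {
      const line = mutant.location.start.line;
      if (mutant.status === "Survived" && lines.includes(line)) {
        survivors.push({ file, line, mutatorName: mutant.mutatorName });
      }
    }
  }
  return survivors;
}

const sampleReport: MutationReport = {
  files: {
    "src/discount.ts": {
      mutants: [
        { mutatorName: "ConditionalExpression", status: "Killed", location: { start: { line: 2 } } },
        { mutatorName: "EqualityOperator", status: "Survived", location: { start: { line: 2 } } },
        { mutatorName: "ArithmeticOperator", status: "Survived", location: { start: { line: 10 } } },
      ],
    },
  },
};

console.log(survivorsOnChangedLines(sampleReport, { "src/discount.ts": [2] }));
// [ { file: 'src/discount.ts', line: 2, mutatorName: 'EqualityOperator' } ]
```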

Why I built it

AI coding agents are useful, but I do not want to blindly trust generated tests.

I wanted a workflow where an AI agent can write or improve tests, but a deterministic tool checks whether those tests actually protect behavior.

That is the main idea behind Tautest.

It is not:

AI wrote tests, so trust them.

It is:

AI wrote tests, now mutate the changed code and see whether those tests actually fail.

Current status

Tautest v1.0.0 is published.

Validated before v1:

  • tautest@1.0.0 published
  • @tautest/core@1.0.0 published
  • Release Readiness workflow passed
  • source-changing PR smoke passed
  • mutation run completed in GitHub Actions
  • JSON output parsed
  • sticky PR comment create and update verified
  • artifact upload verified

Roadmap

Some things I want to improve next:

  • Node 24 GitHub Action runtime migration
  • better cache observability
  • monorepo beta support
  • standalone GitHub Action repo
  • PR line annotations
  • more Jest fixtures

Links

GitHub:

https://github.com/canblmz1/tautest

npm:

https://www.npmjs.com/package/tautest

Core package:

https://www.npmjs.com/package/@tautest/core

I would love feedback on:

  • whether the README and demo explain the idea clearly
  • whether the GitHub Action workflow makes sense
  • whether the AI fix prompt workflow is useful
  • whether this should stay JS and TS focused for now
