Skill: dev-tdd
TDD development with the Red-Green-Refactor cycle. Use to implement a feature by writing tests BEFORE the code. Trigger automatically when the user asks for TDD, wants to write tests first, mentions "test first", or asks to implement, add, create, or fix code — a new feature, a bugfix, or a piece of functionality.
Configuration
| Property | Value |
|---|---|
| Context | fork |
| Allowed tools | Read, Write, Edit, Bash, Glob, Grep |
| Keywords | dev, tdd, as a reference, adapt, improve, it worked when i tried, waste |
Detailed description
Test-Driven Development (TDD)
Iron Law
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
If code was written before the test: delete it. Start over with TDD.
- Don't keep it "as a reference"
- Don't "adapt" it by writing tests around it afterward
- Don't look at it
- Delete = delete
Implement from scratch starting from the tests. Period.
TDD Cycle
```
┌─────────┐      ┌─────────┐      ┌──────────┐
│   RED   │ ──▶  │  GREEN  │ ──▶  │ REFACTOR │
│  Test   │      │  Code   │      │  Clean   │
│  fail   │      │  pass   │      │  up      │
└─────────┘      └─────────┘      └──────────┘
     ▲                                  │
     └──────────────────────────────────┘
```
Phase 1: RED - Write a failing test
Write ONE minimal test showing the expected behavior
```typescript
describe('Module', () => {
  describe('function', () => {
    it('should [behavior] when [condition]', () => {
      // Arrange - Prepare
      // Act - Execute
      // Assert - Verify
    });
  });
});
```
Good test vs Bad test
Good: Clear name, tests real behavior, one thing only
```typescript
test('retries failed operations 3 times', async () => {
  let attempts = 0;
  const operation = () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };
  const result = await retryOperation(operation);
  expect(result).toBe('success');
  expect(attempts).toBe(3);
});
```
Bad: Vague name, tests the mock instead of the code
```typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});
```
Verify RED (MANDATORY - never skip)
```bash
npm test path/to/test.test.ts
```
Confirm:
- The test fails (no syntax error)
- The failure message is the expected one
- The failure comes from the missing feature (not a typo)
Test passes immediately? You're testing existing behavior. Fix the test.
Test has a syntax error? Fix it, rerun until you get a proper failure.
Phase 2: GREEN - Minimal code
Write the simplest code to pass the test. Nothing more.
Good: Just enough to pass
```typescript
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === 2) throw e;
    }
  }
  throw new Error('unreachable');
}
```
Bad: Over-engineering, YAGNI
```typescript
async function retryOperation<T>(
  fn: () => Promise<T>,
  options?: {
    maxRetries?: number;
    backoff?: 'linear' | 'exponential';
    onRetry?: (attempt: number) => void;
  }
): Promise<T> {
  // Features not requested by a test
}
```
Don't add features, refactor other code, or "improve" beyond the test.
Verify GREEN (MANDATORY)
```bash
npm test path/to/test.test.ts
```
Confirm:
- The test passes
- Other tests still pass
- Clean output (no errors, warnings)
Test fails? Fix the code, not the test.
Other tests fail? Fix them now.
Phase 3: REFACTOR - Clean up
After GREEN only:
- Remove duplications
- Improve names
- Extract helpers
Keep tests green. Don't add behavior.
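As a sketch of what this step can look like, here is one way to refactor the earlier `retryOperation`: the magic number becomes a named constant and the last error is tracked explicitly. Behavior is unchanged, so the existing test stays green (`MAX_ATTEMPTS` and `lastError` are illustrative names, not from the tests above):

```typescript
// Refactor of the GREEN-phase retryOperation: same behavior, clearer names.
const MAX_ATTEMPTS = 3; // extracted from the literal "3"

async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e; // remember the failure; rethrown only after the last attempt
    }
  }
  throw lastError;
}
```

Rerunning the existing test after this change is what proves the refactor added no behavior.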
Commit
```bash
git commit -m "test(scope): add tests for [feature]"
git commit -m "feat(scope): implement [feature]"
```
Then start over: next failing test for the next feature.
Why the order matters
"I'll write the tests after to verify"
Tests written after the code pass immediately. Passing immediately proves nothing:
- The test may be testing the wrong thing
- The test may be testing the implementation instead of the behavior
- The test may miss forgotten edge cases
- You never saw the test catch the bug
Test-first forces you to see the test fail, proving it tests something.
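A concrete sketch of the trap (the `applyDiscount` function and its bug are hypothetical, invented for illustration): a test written after the code mirrors what the code happens to do, so it can pass even while the code is wrong.

```typescript
// Buggy implementation: subtracts a flat amount instead of a percentage.
function applyDiscount(price: number, percent: number): number {
  return price - percent; // bug
}

// Test written AFTER the code, using the one input the author tried:
// 100 - 20 === 80, which coincides with the correct 100 * 0.8 === 80.
console.assert(applyDiscount(100, 20) === 80); // passes, bug undetected

// A test written FIRST, from the spec, would have caught it:
// applyDiscount(50, 20) should be 40, but the buggy code returns 30.
```

The test-after "passes" only because the chosen input masks the bug; seeing a spec-driven test fail first is what exposes it.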
"I already manually tested all the cases"
Manual testing is ad-hoc:
- No record of what was tested
- Impossible to rerun when the code changes
- Easy to forget cases under pressure
- "It worked when I tried" ≠ complete test
Automated tests are systematic. They run the same way every time.
"Deleting X hours of work is a waste"
Sunk cost fallacy. The time is already lost. The choice now:
- Delete and rewrite in TDD (X more hours, high confidence)
- Keep and add tests after (30 min, low confidence, likely bugs)
The "waste" is keeping code you can't trust.
Common rationalizations
| Excuse | Reality |
|---|---|
| "Too simple to test" | Simple code breaks. The test takes 30 seconds. |
| "I'll write the tests after" | Tests that pass immediately prove nothing. |
| "Tests-after reach the same goal" | Tests-after = "what does it do?" Tests-first = "what should it do?" |
| "I already manually tested" | Ad-hoc ≠ systematic. No record, not replayable. |
| "Deleting X hours of work is a waste" | Sunk cost. Keeping unverified code = technical debt. |
| "I keep it as a reference and write the tests first" | You'll adapt it. It's disguised test-after. Delete = delete. |
| "I need to explore first" | OK. Throw away the exploration, start in TDD. |
| "It's hard to test = unclear design" | Listen to the test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD is faster than debugging. Pragmatic means test-first. |
| "Manual testing is faster" | Manual testing doesn't prove edge cases. You retest on every change. |
| "Existing code has no tests" | We're improving it. Add tests for the existing code. |
| "It's different because..." | No. No exception without explicit user permission. |
Red Flags — STOP and start over
Stop immediately if you find yourself:
- Writing code before the test
- Writing the test after the implementation
- A test that passes immediately
- Unable to explain why the test failed
- Adding tests "later"
- Rationalizing "just this once"
- "I already manually tested"
- "Tests-after reach the same goal"
- "It's the spirit that counts, not the ritual"
- "I keep it as a reference" or "I adapt the existing code"
- "I already spent X hours, deleting is a waste"
- "TDD is dogmatic, I'm pragmatic"
- "It's different because..."
All these signals mean: delete the code. Start over in TDD.
Qualities of a good test
| Quality | Good | Bad |
|---|---|---|
| Minimal | One thing only. "and" in the name? Split it. | test('validates email and domain and whitespace') |
| Clear | The name describes the behavior | test('test1') |
| Intentional | Demonstrates the desired API | Obscures what the code should do |
Complete example: Bug Fix
Bug: Empty email accepted
RED
```typescript
test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});
```
Verify RED
```
$ npm test
FAIL: expected 'Email required', got undefined
```
GREEN
```typescript
function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}
```
Verify GREEN
```
$ npm test
PASS
```
REFACTOR: Extract the validation for other fields if necessary.
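One possible shape for that extraction, assuming a hypothetical `requireField` helper (the name and the `ok` field are illustrative, not from the bug-fix example above):

```typescript
// Generic required-field check, reusable for email, name, etc.
function requireField(value: string | undefined, label: string): string | null {
  return value && value.trim() ? null : `${label} required`;
}

function submitForm(data: { email?: string }): { error?: string; ok?: boolean } {
  const error = requireField(data.email, 'Email');
  if (error) return { error };
  // ... rest of submission
  return { ok: true };
}
```

The original test still passes after the extraction, which is exactly what the REFACTOR phase requires.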
Verification checklist
Before declaring the work done:
- Each new function/method has a test
- Each test was seen failing before implementing
- Each test failed for the right reason (missing feature, not a typo)
- Minimal code written to pass each test
- All tests pass
- Clean output (no errors, warnings)
- Tests on real code (mocks only if unavoidable)
- Edge cases and errors covered
Can't check all boxes? TDD was skipped. Start over.
When you're stuck
| Problem | Solution |
|---|---|
| Don't know how to test | Write the desired API. Write the assertion first. Ask the user. |
| Test too complicated | Design too complicated. Simplify the interface. |
| Everything must be mocked | Code too coupled. Use dependency injection. |
| Huge test setup | Extract helpers. Still complex? Simplify the design. |
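The dependency-injection fix in the table can be sketched like this (`Fetcher`, `fetchUserName`, and the `/users/:id` URL are illustrative names invented for the example): instead of importing a network client and mocking the module, the function takes the dependency as a parameter, so tests pass a plain fake.

```typescript
// The dependency is a parameter, not a hard-wired import.
type Fetcher = (url: string) => Promise<string>;

async function fetchUserName(id: number, fetch: Fetcher): Promise<string> {
  const body = await fetch(`/users/${id}`);
  return JSON.parse(body).name;
}

// In tests, no mocking framework is needed — just a plain fake:
const fakeFetch: Fetcher = async () => JSON.stringify({ name: 'Ada' });
```

The test now exercises the real parsing logic; only the external boundary is faked.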
Useful commands
```bash
# Run the tests
npm test

# Tests in watch mode
npm run test:watch

# With coverage
npm run test:coverage

# Tests whose name matches a pattern (Jest)
npm test -- -t "test name"
```
Rules
- NEVER write the code before the tests
- A test that passes from the start is a BAD test
- Cover edge cases (null, undefined, empty, limits)
- Mocks ONLY for external dependencies (API, DB, filesystem)
- NEVER modify a test to make it pass — fix the implementation
- Each test MUST be seen failing before writing the code
- Delete code written without a test. No exception.
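The edge-case rule above (null, undefined, empty, limits) might look like this in practice, using a hypothetical `isNonEmpty` helper that is not part of the document's examples:

```typescript
// Returns true only for strings with at least one non-whitespace character.
function isNonEmpty(value: string | null | undefined): boolean {
  return typeof value === 'string' && value.trim().length > 0;
}

// Edge cases: null, undefined, empty, whitespace-only, minimal valid input
console.assert(isNonEmpty(null) === false);
console.assert(isNonEmpty(undefined) === false);
console.assert(isNonEmpty('') === false);
console.assert(isNonEmpty('   ') === false);
console.assert(isNonEmpty('a') === true);
```

Each assertion corresponds to one edge case from the rule; in a real suite each would be its own named test.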
Automatic triggering
This skill is automatically activated when:
- The matching keywords are detected in the conversation
- The task context matches the skill's domain
Triggering examples
- "I want to dev a new feature..."
- "Let's do TDD on this module..."
- "I'll keep the old code as a reference..."
Context fork
Fork means the skill runs in an isolated context:
- Does not pollute the main conversation
- Results are returned cleanly
- Ideal for autonomous tasks
Practical examples
1. TDD Example: Price calculation function with discount
User request
"Create a calculatePrice function that applies a percentage discount"
Phase 1: RED - Tests
```typescript
// src/__tests__/calculatePrice.test.ts
import { calculatePrice } from '../utils/calculatePrice';

describe('calculatePrice', () => {
  // Nominal cases
  it('should return original price when no discount', () => {
    expect(calculatePrice(100, 0)).toBe(100);
  });

  it('should apply percentage discount correctly', () => {
    expect(calculatePrice(100, 20)).toBe(80);
  });

  // Edge cases
  it('should handle decimal prices', () => {
    expect(calculatePrice(99.99, 10)).toBeCloseTo(89.99);
  });

  it('should return 0 when price is 0', () => {
    expect(calculatePrice(0, 50)).toBe(0);
  });

  // Error cases
  it('should throw when discount > 100', () => {
    expect(() => calculatePrice(100, 150)).toThrow('Discount must be between 0 and 100');
  });

  it('should throw when discount < 0', () => {
    expect(() => calculatePrice(100, -10)).toThrow('Discount must be between 0 and 100');
  });
});
```
Expected result: 6 tests fail
Phase 2: GREEN - Minimal implementation
```typescript
// src/utils/calculatePrice.ts
export function calculatePrice(price: number, discountPercent: number): number {
  if (discountPercent < 0 || discountPercent > 100) {
    throw new Error('Discount must be between 0 and 100');
  }
  return price * (1 - discountPercent / 100);
}
```
Expected result: 6 tests pass
Phase 3: REFACTOR - Improvement
```typescript
// src/utils/calculatePrice.ts
const MIN_DISCOUNT = 0;
const MAX_DISCOUNT = 100;

function validateDiscount(discount: number): void {
  if (discount < MIN_DISCOUNT || discount > MAX_DISCOUNT) {
    throw new Error(`Discount must be between ${MIN_DISCOUNT} and ${MAX_DISCOUNT}`);
  }
}

export function calculatePrice(price: number, discountPercent: number): number {
  validateDiscount(discountPercent);
  const discountMultiplier = 1 - discountPercent / 100;
  return price * discountMultiplier;
}
```
Expected result: 6 tests still pass
Commits
```bash
# After RED phase
git commit -m "test(pricing): add tests for calculatePrice function

- Test nominal cases (no discount, with discount)
- Test edge cases (decimal prices, zero price)
- Test error handling (invalid discount range)"

# After GREEN + REFACTOR phases
git commit -m "feat(pricing): implement calculatePrice with discount

- Add price calculation with percentage discount
- Validate discount range (0-100)
- Extract validation to separate function"
```