Skill: dev-debug

Fork

Debug and resolve problems. Use when the user has a bug, an error, an unexpected behavior, or wants to understand why something is not working.

Configuration

Property	Value
Context	fork
Allowed tools	`Read`, `Glob`, `Grep`, `Bash`
Keywords	`dev`, `debug`, `quick fix for now`, `let's just try changing x`

Detailed description

Debug a Problem

Goal

Identify the root cause of a bug methodically via a systematic 4-phase approach.

Systematic Methodology (4 Phases)

┌──────────────────────────────────────────────────────────────────┐
│                   SYSTEMATIC DEBUGGING                            │
├──────────────────────────────────────────────────────────────────┤
│                                                                   │
│  PHASE 1: OBSERVATION          Collect without interpreting       │
│  ═══════════════════                                              │
│  - Reproduce the exact symptom                                    │
│  - Document the environment                                       │
│  - Capture logs, stack traces, states                             │
│  - DO NOT jump to conclusions                                     │
│                                                                   │
│  PHASE 2: HYPOTHESES           Reason systematically              │
│  ════════════════════                                             │
│  - List ALL possible causes                                       │
│  - Rank by probability (high/medium/low)                          │
│  - Define a validation test for each hypothesis                   │
│  - Use the 5 Whys technique                                       │
│                                                                   │
│  PHASE 3: INVESTIGATION        Prove, do not assume               │
│  ══════════════════════                                           │
│  - Test ONE hypothesis at a time                                  │
│  - Use strategic tracing and logging                              │
│  - Isolate with binary search (code or git bisect)                │
│  - Document each tested hypothesis (confirmed/refuted)            │
│                                                                   │
│  PHASE 4: VERIFICATION         Confirm the fix is real            │
│  ════════════════════                                             │
│  - Reproduce the original bug (must fail without the fix)         │
│  - Apply the minimal fix                                          │
│  - Prove the bug is fixed                                         │
│  - Verify the absence of side effects                             │
│  - Add a non-regression test                                      │
│  - Defense in depth: assertions on invariants                     │
│                                                                   │
└──────────────────────────────────────────────────────────────────┘

Phase 1: Observation

Key questions:

What is happening exactly?
What should happen?
When did it start?
Is it 100% reproducible?
What are the aggravating/mitigating factors?

# Recent logs
tail -100 logs/app.log 2>/dev/null

# Latest commits
git log --oneline -10

# Recent changes in the suspect area
git log --oneline -5 -- src/path/suspect/

Phase 2: Hypotheses

Hypothesis matrix

#	Hypothesis	Probability	Validation test
1	[Most likely]	High	[How to verify]
2	[Secondary]	Medium	[How to verify]
3	[Less likely]	Low	[How to verify]

5 Whys technique

Problem: The application crashes at login

Why? -> The JWT token is invalid
Why? -> The token has expired
Why? -> The refresh token was not called
Why? -> The interceptor did not detect the expiration
Why? -> Timezone bug in the comparison

Root cause: Timezone bug in the refresh logic

Common causes by type

Bug type	Frequent causes
Null/Undefined	Missing data, race condition, API changed
Type error	Wrong type, JSON parsing, implicit conversion
Off-by-one	Array index, loop, `<` vs `<=` comparison
Race condition	Async without await, shared state, timing
Memory leak	Event listeners, closures, circular references
Regression	Recent change, side effect, dependency update

Phase 3: Investigation

Tracing techniques

// Strategic tracing (not console.log everywhere)
function trace(label: string, data: unknown) {
  console.log(`[TRACE:${label}]`, JSON.stringify(data, null, 2));
}

// Trace points at boundaries
trace('INPUT', { args });       // Function entry
trace('STATE', { variables });  // Intermediate state
trace('OUTPUT', { result });    // Function exit
trace('BRANCH', { condition }); // Decision taken

Binary search in the code

1. Comment out half of the suspect code
2. Does the bug persist?
   - Yes -> The bug is in the remaining half
   - No  -> The bug is in the commented half
3. Repeat until isolating the exact line

Git bisect (find the faulty commit)

git bisect start
git bisect bad                 # Current version broken
git bisect good <commit>       # Last known good version
# Test and mark good/bad until finding the commit
git bisect reset

Phase 4: Verification (MANDATORY)

Prove the fix works

WITHOUT the fix: reproduce the bug -> failure confirmed
WITH the fix: same scenario        -> success confirmed
Existing tests                     -> all pass
Non-regression test                -> written and passes
Side effects                       -> verified absent

Defense in depth

// Add assertions on critical invariants
function processPayment(amount: number, userId: string) {
  assert(amount > 0, 'Payment amount must be positive');
  assert(userId, 'User ID is required');
  // ...business code...
}

Completion checklist

[ ] Bug reproduced reliably
[ ] Root cause identified (not just the symptom)
[ ] Minimal fix applied (no opportunistic refactoring)
[ ] Non-regression test added
[ ] Existing tests pass
[ ] No side effects
[ ] Fix documentation (descriptive commit message)

Expected output

## Diagnosis: [Bug description]

### Phase 1 - Observation
**Symptom:** [What happens]
**Expected behavior:** [What should happen]
**Reproduction:** [Steps 1, 2, 3...]

### Phase 2 - Hypotheses
| # | Hypothesis | Probability | Result |
|---|------------|-------------|--------|
| 1 | [...] | High | Confirmed/Refuted |

### Phase 3 - Investigation
**Root cause:** `src/xxx.ts:42` - [Technical explanation]
**5 Whys:** [Causal chain]

### Phase 4 - Verification
- [x] Bug reproduced
- [x] Fix applied
- [x] Non-regression test added
- [x] All tests pass
- [x] No side effects

Rules

Do not assume - verify (Phase 4 mandatory)
One bug at a time
Understand BEFORE fixing
Always add a non-regression test
Document every tested hypothesis, even refuted ones
The fix must be MINIMAL - no opportunistic refactoring

Iron Law: No fix without root cause

IMPORTANT: NEVER propose a fix before identifying the root cause. Symptom fixes mask the real problem and create new bugs.

Red flags (rationalizations to avoid)

Phrase	Problem
"Quick fix for now"	Avoids the root cause
"Let's just try changing X"	Guess-and-check, not systematic
"I don't really understand but it should work"	Blind fix
"It's urgent, no time to investigate"	Systematic investigation is FASTER

Rule of 3 failures

After 3 failed fix attempts: STOP. Do not attempt a 4th fix.

Question the basic assumptions
Broaden the search perimeter
Check whether the problem is architectural (not just a local bug)
Consider git bisect to find the faulty commit

Metrics

Approach	Average time	1st-attempt fix rate
Systematic (4 phases)	15-30 min	~95%
Random fixing (guess-and-check)	2-3h	~40%

Automatic triggering

This skill is automatically activated when:

The matching keywords are detected in the conversation
The task context matches the skill's domain

Triggering examples

"I want to dev..."
"I want to debug..."
"I want to quick fix for now..."

Context fork

Fork means the skill runs in an isolated context:

Does not pollute the main conversation
Results are returned cleanly
Ideal for autonomous tasks

Practical examples

1. Debugging example

Debugging example

Reported issue

"Users can no longer log in since this morning. 500 error on /api/login"

Diagnostic

1. Reproduction

curl -X POST https://api.example.com/api/login \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"test123"}'

Result:

{
  "error": "Internal Server Error",
  "statusCode": 500
}

Bug reproduced 100%.

2. Information gathering

# Recent logs
tail -100 /var/log/app/error.log

Logs found:

2024-01-15 09:23:45 ERROR [AuthService] Connection refused: Redis
2024-01-15 09:23:45 ERROR [AuthService] Failed to get session: ECONNREFUSED
2024-01-15 09:23:46 ERROR [AuthController] Login failed: Redis connection error

# Latest commits
git log --oneline -5

Result:

a1b2c3d (HEAD) chore: update dependencies
f4e5d6c feat: add rate limiting
9g8h7i6 fix: typo in README

3. Hypothesis

The commit f4e5d6c feat: add rate limiting likely added a Redis dependency that is not configured in production.

4. Verification

# Check the suspect commit
git show f4e5d6c --stat

Modified files:

src/middleware/rate-limit.ts | 45 +++++
src/config/redis.ts          | 12 ++
package.json                 |  1 +

# Check Redis config in prod
echo $REDIS_URL
# (empty)

Cause confirmed: REDIS_URL variable not defined in production.

5. Root cause identified

Element	Value
File	`src/config/redis.ts:8`
Problem	Redis connection without configured URL
Why	Deployment without the environment variable
Faulty commit	`f4e5d6c`

// src/config/redis.ts:8
const redis = new Redis(process.env.REDIS_URL); // REDIS_URL is undefined

Solution

Option 1: Configure Redis (recommended)

# Add the variable in production
heroku config:set REDIS_URL=redis://...

Option 2: Graceful fallback (temporary)

// src/config/redis.ts
const redis = process.env.REDIS_URL
  ? new Redis(process.env.REDIS_URL)
  : null;

// src/middleware/rate-limit.ts
if (!redis) {
  console.warn('Rate limiting disabled: Redis not configured');
  return next();
}

Applied resolution

# 1. Configure Redis in prod
heroku config:set REDIS_URL="redis://..."

# 2. Restart the application
heroku restart

# 3. Verify the fix
curl -X POST https://api.example.com/api/login ...
# ✅ 200 OK

Post-mortem

Impact

Duration: 2h15 (09:00 - 11:15)
Affected users: ~500
Severity: P1 (major feature broken)

Preventive actions

Add environment variable checks at startup
Add integration test for rate limiting
Document required variables in the README

Non-regression test added

describe('Rate Limiting', () => {
  it('should work without Redis configured', () => {
    delete process.env.REDIS_URL;
    // The middleware must not crash
    expect(() => rateLimitMiddleware(req, res, next)).not.toThrow();
  });
});

Configuration​

Detailed description​

Debug a Problem

Goal​

Systematic Methodology (4 Phases)​

Phase 1: Observation​

Phase 2: Hypotheses​

Hypothesis matrix​

5 Whys technique​

Common causes by type​

Phase 3: Investigation​

Tracing techniques​

Binary search in the code​

Git bisect (find the faulty commit)​

Phase 4: Verification (MANDATORY)​

Prove the fix works​

Defense in depth​

Completion checklist​

Expected output​

Rules​

Iron Law: No fix without root cause​

Red flags (rationalizations to avoid)​

Rule of 3 failures​

Metrics​

Automatic triggering​

Triggering examples​

Context fork​

Practical examples​

1. Debugging example​