Claude Code — Autonomous Development Pipeline (GSD + autonomous-dev)

The strongest pattern found so far for fully autonomous requirements → prototype development with Claude Code: GSD handles planning and context management, and the autonomous-dev harness adds adversarial verification.

Why / When to Use

Use when you need to run a multi-phase development task overnight or in CI without human-in-the-loop review at each step. The pipeline is designed to catch its own failures before declaring “done.”

Core Concept

Four fundamental failure modes in autonomous coding, and how this stack addresses them:

| Failure | Problem | Solution |
| --- | --- | --- |
| Drift | Claude interprets requirements differently than intended | GSD locks requirements in PROJECT.md, REQUIREMENTS.md before code is written |
| Context rot | Quality degrades mid-execution on long tasks | GSD spawns fresh subagent contexts (200K window each), rotates as needed |
| No verification gate | “Done” = Claude says done, not actually working | autonomous-dev: 0 test failures gate + spec-blind reviewer |
| No recovery | Failure at step 6 = restart from 0 | GSD’s verify step diagnoses, generates fix plans, re-executes |

The Two Components

GSD (Get Shit Done)

Handles the requirements → structured plan → execution phase.

State files GSD maintains across sessions:

  • PROJECT.md — project rules, constraints, architectural decisions
  • REQUIREMENTS.md — full feature spec locked before any code
  • ROADMAP.md — phases, milestones, what’s been completed
  • STATE.md — current phase, what’s in progress, what’s next
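
Because every later phase depends on these four files, a headless run can check that they exist before it starts. The following is a minimal shell sketch, assuming the files sit together in one directory passed as an argument (where GSD actually keeps them is not stated here). missing_state_files is a hypothetical helper, not a GSD command.

```shell
# Hypothetical preflight helper (not part of GSD): print the name of each
# state file that is missing from the given directory (default: current dir).
missing_state_files() {
  local dir="${1:-.}" f
  for f in PROJECT.md REQUIREMENTS.md ROADMAP.md STATE.md; do
    [ -f "$dir/$f" ] || printf '%s\n' "$f"
  done
}
```

An empty result means all four files are present; any printed name is a reason to stop before spending tokens on execution.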

Key commands:

# Bootstrap from requirements file, fully autonomous
gsd headless new-milestone --context requirements.md --auto
 
# Or interactive phase-by-phase
/gsd-new-project       # parallel research agents → roadmap
/gsd-discuss-phase     # lock decisions: API shapes, data model
/gsd-plan-phase        # 2–3 tasks per plan, fits in 50% context window
/gsd-execute-phase     # wave-based parallel subagents, atomic commits
/gsd-verify-work       # diagnose → fix plan → re-execute
/gsd-autonomous        # runs all phases to completion (headless)

Install:

npx get-shit-done-cc@latest --claude --global

autonomous-dev Harness

Adds adversarial verification on top of GSD’s execution layer. The key innovation is a spec-blind reviewer: a separate agent that tests the implementation having seen only the acceptance criteria and the running code, never the source. This is the closest approximation to an independent QA reviewer.

Hard gates (pipeline stops if any fail):

  1. Tests written before implementation (spec → test → code order)
  2. 0 test failures required to proceed
  3. No stubs or placeholders allowed
  4. Security scan must pass
  5. Spec-blind validation: separate agent writes behavioural tests from acceptance criteria alone, then validates against the implementation
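
The "stops if any fail" behaviour of the gates above can be sketched as a short shell function. This is an illustrative sketch only, not the harness's actual implementation: run_gates is a hypothetical name, and the commands in the closing comment are placeholders for whatever test, stub-scan, and security tools a project uses.

```shell
# Hypothetical gate runner, not the harness's actual implementation.
# Each argument is "name=command"; gates run in order and the first
# failing command stops the pipeline with a non-zero status.
run_gates() {
  local gate name cmd
  for gate in "$@"; do
    name="${gate%%=*}"   # text before the first "="
    cmd="${gate#*=}"     # text after the first "="
    if ! sh -c "$cmd" >/dev/null 2>&1; then
      echo "GATE FAILED: $name"
      return 1
    fi
    echo "gate passed: $name"
  done
  echo "all gates passed"
}

# Example (placeholder commands; substitute the project's own):
#   run_gates "tests=npm test" "stubs=! grep -rqE 'TODO|FIXME' src" "security=npm audit"
```

Ordering matters: later gates never run once an earlier one fails, so the cheapest checks are best placed first.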

The adversarial layer:

implementer agent    → builds the feature
reviewer agent       → sees only: acceptance criteria + running code
                     → writes its own tests from spec, never from implementation
                     → verdict: pass / fail / escalate
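
One way to enforce the "sees only" rule is to start the reviewer in a directory that contains nothing but its permitted inputs. The sketch below assumes the acceptance criteria live in a single file and the running code is reachable at a URL. make_reviewer_workspace, ACCEPTANCE.md, and TARGET_URL are hypothetical names chosen for illustration, and how the harness itself isolates the reviewer is not described here.

```shell
# Hypothetical helper: build an isolated workspace for the reviewer agent.
# It receives the acceptance criteria and the address of the running app,
# and nothing from the implementation's source tree.
make_reviewer_workspace() {
  local criteria="$1" target_url="$2" workspace
  workspace="$(mktemp -d)" || return 1
  cp "$criteria" "$workspace/ACCEPTANCE.md" || return 1
  printf '%s\n' "$target_url" > "$workspace/TARGET_URL"
  printf '%s\n' "$workspace"   # caller starts the reviewer in this directory
}
```

Because the workspace is created empty and only two files are copied in, the reviewer cannot derive its tests from the implementation even by accident.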

Trigger (the harness must already be installed; see its GitHub repo):

/implement    # runs the full autonomous-dev pipeline

Full End-to-End Pipeline

requirements.md
  ↓
GSD: /gsd-new-project   → parallel research agents → roadmap
  ↓
GSD: /gsd-discuss-phase → lock API shapes, data model
  ↓
GSD: /gsd-plan-phase    → 2–3 tasks per plan, 50% context headroom
  ↓
GSD: /gsd-execute-phase → wave-based parallel subagents, atomic commits
  ↓
autonomous-dev: /implement
  ├── acceptance tests written BEFORE implementation
  ├── 0 failures gate — loops back if failing
  ├── no stubs/placeholders gate
  └── security scan gate + spec-blind reviewer
  ↓
GSD: /gsd-verify-work   → diagnose → fix plan → re-execute
  ↓
prototype on branch, PR opened
  ↓
YOU review diff and merge → production
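
The "loops back if failing" step in the pipeline above, combined with the reviewer's escalate verdict, amounts to a bounded retry loop. This is a minimal sketch under one stated assumption: a maximum attempt count, which the source does not specify. retry_until_green is a hypothetical name, and a real pipeline would run a fix step where the comment indicates.

```shell
# Hypothetical loop-back wrapper: re-run a check until it passes, and
# escalate (exit status 2) once the attempt budget is exhausted.
# Usage: retry_until_green MAX_ATTEMPTS COMMAND [ARGS...]
retry_until_green() {
  local max="$1" attempt=1
  shift
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      echo "pass after $attempt attempt(s)"
      return 0
    fi
    # A real pipeline would diagnose and apply a fix here before retrying.
    echo "attempt $attempt failed; looping back"
    attempt=$((attempt + 1))
  done
  echo "escalate: still failing after $max attempts"
  return 2
}
```

The distinct exit status lets a caller tell "needs a human" apart from an ordinary failure, which matters for unattended runs where an unbounded loop would keep consuming tokens.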

Human Checkpoints (Intentionally Minimal)

| Checkpoint | Why human | Time |
| --- | --- | --- |
| Approve roadmap after /gsd-new-project | Confirm scope before any code | 5 min |
| Review /gsd-discuss-phase decisions | Lock API shapes, data model | 10–15 min |
| Review final PR diff | Merge decision | Your call |

Everything else — research, planning, coding, testing, fixing, committing — runs autonomously.

GitHub Actions Pattern (Overnight / CI)

Trigger autonomously when requirements.md is pushed:

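name: Autonomous pipeline

on:
  push:
    paths: ['requirements.md']

jobs:
  pipeline:                      # job name and runner are illustrative
    runs-on: ubuntu-latest
    steps:
      # Check out the repo so requirements.md is available to the agent
      - uses: actions/checkout@v4
      - name: Run autonomous pipeline
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          npm install -g @anthropic-ai/claude-code
          npx get-shit-done-cc@latest --claude --global
          claude --dangerously-skip-permissions -p \
            "/gsd-autonomous" --max-turns 100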

Local Overnight Pattern

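tmux new -s build                      # start a detachable session
claude --dangerously-skip-permissions  # launch Claude Code inside tmux
/gsd-autonomous     # typed at the Claude Code prompt; runs all phases to completion
# detach (Ctrl+B, D); the run continues only while the machine stays awake,
# so close the terminal rather than suspending the laptop (or run on a remote host)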

Comparison with Simpler Approaches

| Method | Laptop needed? | Adversarial testing? | Context management | Best for |
| --- | --- | --- | --- | --- |
| Ralph bash loop | Yes | No | None (relies on CLAUDE.md) | Simple sequential tasks |
| GSD alone | Yes | No | ★★★★★ | Large multi-phase projects |
| GSD + autonomous-dev | Yes/CI | ★★★★★ | ★★★★★ | Production-quality autonomous builds |
| Claude Code Routines | No | No | None | Scheduled cloud automation |

Gotchas

  • The “fully autonomous, zero human touch” framing is aspirational; you still want PR review gates for production
  • Some hook behaviours and config details may shift between Claude Code versions; cross-check against code.claude.com/docs
  • autonomous-dev’s spec-blind reviewer only works if acceptance criteria are precise — vague specs produce false passes
  • GSD spawns many subagents; costs can accumulate quickly on large codebases
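
The point about vague acceptance criteria can be checked cheaply before a run. The sketch below is a rough heuristic: it flags criteria lines that contain words with no measurable meaning. flag_vague_criteria is a hypothetical helper, and the word list is an assumption chosen for illustration, so a clean result does not prove the criteria are precise.

```shell
# Hypothetical lint: print acceptance-criteria lines (with line numbers)
# that contain words with no measurable meaning. The word list is
# illustrative, not exhaustive. Exit status is grep's: 0 if any vague
# line was found, 1 if none.
flag_vague_criteria() {
  grep -nEiw 'fast|easy|intuitive|user-friendly|robust|properly|appropriate' "$1"
}
```

A flagged line such as "Login page should be fast" is better rewritten with a threshold the reviewer can test, for example a response-time limit.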

Source

Conversations “CC-Autonomous” (Claude Code project) and “Evaluating Claude code automation credibility” — 2026-05-19. Article by Kevin Collins (Echofold / Manus Fellow), published April 2026. GitHub: autonomous-dev harness repo.