Claude Code — Autonomous Development Pipeline (GSD + autonomous-dev)

The strongest pattern found so far (per the sources below) for fully autonomous requirements → prototype development using Claude Code. It combines GSD for planning and context management with the autonomous-dev harness for adversarial verification.

Why / When to Use

Use when you need to run a multi-phase development task overnight or in CI without human-in-the-loop review at each step. The pipeline is designed to catch its own failures before declaring “done.”

Core Concept

Four fundamental failure modes in autonomous coding, and how this stack addresses them:

Failure	Problem	Solution
Drift	Claude interprets requirements differently than intended	GSD locks requirements in PROJECT.md and REQUIREMENTS.md before code is written
Context rot	Quality degrades mid-execution on long tasks	GSD spawns fresh subagent contexts (200K window each), rotates as needed
No verification gate	“Done” = Claude says done, not actually working	autonomous-dev: 0 test failures gate + spec-blind reviewer
No recovery	Failure at step 6 = restart from 0	GSD’s verify step diagnoses, generates fix plans, re-executes
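The "No recovery" row can be illustrated with a small sketch. This is not GSD's implementation: the function name, the `max_fix_attempts` parameter, and the idea of modelling each step as a callable that returns True on success are all assumptions made for illustration. It only shows the difference between retrying the failed step and restarting from step 0.

```python
from typing import Callable, List


def run_with_recovery(steps: List[Callable[[], bool]], max_fix_attempts: int = 2) -> int:
    """Run steps in order; on failure, retry that step rather than restarting.

    Returns the total number of step executions. Raises RuntimeError when a
    step still fails after max_fix_attempts retries (hypothetical limit).
    """
    executions = 0
    for index, step in enumerate(steps):
        attempts = 0
        while True:
            executions += 1
            if step():
                break  # step succeeded, move on to the next one
            attempts += 1
            if attempts > max_fix_attempts:
                raise RuntimeError(
                    f"step {index} failed after {max_fix_attempts} fix attempts"
                )
    return executions


# Demo: the middle step fails once, then succeeds on the retry.
calls = {"n": 0}


def flaky() -> bool:
    calls["n"] += 1
    return calls["n"] >= 2


total = run_with_recovery([lambda: True, flaky, lambda: True])
print(total)  # 4 executions: earlier steps are not re-run
```

In the real pipeline the retry would presumably be preceded by a diagnosis and a generated fix plan; the sketch leaves that part out.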

The Two Components

GSD (Get Shit Done)

Handles the requirements → structured plan → execution phase.

State files GSD maintains across sessions:

PROJECT.md — project rules, constraints, architectural decisions
REQUIREMENTS.md — full feature spec locked before any code
ROADMAP.md — phases, milestones, what’s been completed
STATE.md — current phase, what’s in progress, what’s next
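A pre-flight check for these files might look like the sketch below. The four file names come from the list above; the helper `missing_state_files` and the rule that an empty file counts as missing are assumptions for illustration, not part of GSD.

```python
import tempfile
from pathlib import Path
from typing import List

# File names as listed in this note.
STATE_FILES = ("PROJECT.md", "REQUIREMENTS.md", "ROADMAP.md", "STATE.md")


def missing_state_files(project_root: Path) -> List[str]:
    """Return the state files that are absent or empty under project_root."""
    missing = []
    for name in STATE_FILES:
        path = project_root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


# Demo in a throwaway directory: only two of the four files exist.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "PROJECT.md").write_text("rules\n")
    (root / "REQUIREMENTS.md").write_text("spec\n")
    demo_missing = missing_state_files(root)

print(demo_missing)  # ['ROADMAP.md', 'STATE.md']
```

A check like this could run before an unattended session so that a missing REQUIREMENTS.md stops the run early instead of surfacing as drift later.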

Key commands:

# Bootstrap from requirements file, fully autonomous
gsd headless new-milestone --context requirements.md --auto
 
# Or interactive phase-by-phase
/gsd-new-project       # parallel research agents → roadmap
/gsd-discuss-phase     # lock decisions: API shapes, data model
/gsd-plan-phase        # 2–3 tasks per plan, fits in 50% context window
/gsd-execute-phase     # wave-based parallel subagents, atomic commits
/gsd-verify-work       # diagnose → fix plan → re-execute
/gsd-autonomous        # runs all phases to completion (headless)
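The planning constraint in /gsd-plan-phase (2 to 3 tasks per plan, fitting in 50% of the context window) can be worked through as arithmetic: with the 200K window quoted earlier, each plan gets a budget of 100,000 tokens. The sketch below packs tasks greedily under that budget. The per-task token estimates and the `pack_plans` helper are hypothetical; how GSD actually sizes plans is not documented in this note.

```python
from typing import List, Tuple

CONTEXT_WINDOW_TOKENS = 200_000  # per-subagent window quoted in this note
PLAN_BUDGET_FRACTION = 0.5       # "fits in 50% context window"
MAX_TASKS_PER_PLAN = 3           # upper end of "2 to 3 tasks per plan"

Task = Tuple[str, int]  # (task name, estimated tokens); estimates are hypothetical


def pack_plans(tasks: List[Task]) -> List[List[str]]:
    """Greedily group tasks, in order, into plans that respect both limits."""
    budget = int(CONTEXT_WINDOW_TOKENS * PLAN_BUDGET_FRACTION)
    plans: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, tokens in tasks:
        if tokens > budget:
            raise ValueError(f"task {name!r} alone exceeds the {budget}-token budget")
        # Start a new plan when the current one is full by count or by tokens.
        if current and (len(current) == MAX_TASKS_PER_PLAN or used + tokens > budget):
            plans.append(current)
            current, used = [], 0
        current.append(name)
        used += tokens
    if current:
        plans.append(current)
    return plans


demo_plans = pack_plans([
    ("schema", 30_000), ("api", 40_000), ("auth", 45_000),
    ("ui", 20_000), ("tests", 20_000), ("docs", 10_000),
])
print(demo_plans)
# [['schema', 'api'], ['auth', 'ui', 'tests'], ['docs']]
```

The first plan closes at two tasks because adding "auth" would reach 115,000 tokens; the second closes at three tasks because of the count limit.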

Install:

npx get-shit-done-cc@latest --claude --global

autonomous-dev Harness

Adds adversarial verification on top of GSD’s execution layer. The key innovation is a spec-blind reviewer agent that tests the implementation without having seen the source code; it works only from the acceptance criteria and the running code. This approximates an independent QA reviewer.

Hard gates (pipeline stops if any fail):

Tests written before implementation (spec → test → code order)
0 test failures required to proceed
No stubs or placeholders allowed
Security scan must pass
Spec-blind validation: separate agent writes behavioural tests from acceptance criteria alone, then validates against the implementation
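The stop-on-first-failure behaviour of these gates can be sketched as follows. The gate names, the `GateResult` type, and the input parameters are hypothetical and only mirror the five gates listed above; the harness's real interface is not described in this note.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GateResult:
    name: str
    passed: bool
    detail: str = ""


def evaluate_gates(tests_written_first: bool, test_failures: int, stub_count: int,
                   security_scan_passed: bool, spec_blind_passed: bool) -> List[GateResult]:
    """Build one result per hard gate, in the order listed in this note."""
    return [
        GateResult("tests-before-implementation", tests_written_first),
        GateResult("zero-test-failures", test_failures == 0, f"{test_failures} failing"),
        GateResult("no-stubs", stub_count == 0, f"{stub_count} stubs or placeholders"),
        GateResult("security-scan", security_scan_passed),
        GateResult("spec-blind-validation", spec_blind_passed),
    ]


def first_failed_gate(results: List[GateResult]) -> Optional[GateResult]:
    """Return the first failing gate, or None when every gate passed."""
    for result in results:
        if not result.passed:
            return result
    return None


# Demo: everything is green except one leftover placeholder.
results = evaluate_gates(True, 0, 1, True, True)
blocked_by = first_failed_gate(results)
print(blocked_by.name if blocked_by else "all gates passed")  # no-stubs
```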

The adversarial layer:

implementer agent    → builds the feature
reviewer agent       → sees only: acceptance criteria + running code
                     → writes its own tests from spec, never from implementation
                     → verdict: pass / fail / escalate
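As a rough illustration of the spec-blind idea, the reviewer below receives only acceptance criteria and a callable to exercise, never the source. The `Criterion` shape, the `review` function, the example `slugify` feature, and the rule that an empty criteria list escalates are all assumptions; the real harness uses an LLM agent and its escalation rules are not recorded here.

```python
from typing import Callable, List, Tuple

# (description, call arguments, expected result), derived from the spec only.
Criterion = Tuple[str, tuple, object]


def review(implementation: Callable, criteria: List[Criterion]) -> str:
    """Black-box verdict: 'pass', 'fail', or 'escalate'."""
    if not criteria:
        return "escalate"  # nothing testable, so a human has to decide
    for _description, args, expected in criteria:
        try:
            actual = implementation(*args)
        except Exception:
            return "fail"
        if actual != expected:
            return "fail"
    return "pass"


def slugify(title: str) -> str:
    """Example implementation under review (hypothetical feature)."""
    return "-".join(title.lower().split())


precise = [
    ("lowercases and hyphenates words", ("Hello World",), "hello-world"),
    ("trims surrounding whitespace", ("  Hi  ",), "hi"),
]
vague = [("lowercases the title", ("Hello",), "hello")]


def broken(title: str) -> str:
    return title.lower()  # forgets to hyphenate


print(review(slugify, precise))  # pass
print(review(broken, precise))   # fail
print(review(broken, vague))     # pass (a false pass caused by a vague spec)
```

The last line shows why precise acceptance criteria matter: the vague criterion never exercises a multi-word title, so the broken implementation passes.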

Trigger (installation steps are not recorded in this note; see the autonomous-dev harness repo):

/implement    # runs the full autonomous-dev pipeline

Full End-to-End Pipeline

requirements.md
  ↓
GSD: /gsd-new-project   → parallel research agents → roadmap
  ↓
GSD: /gsd-discuss-phase → lock API shapes, data model
  ↓
GSD: /gsd-plan-phase    → 2–3 tasks per plan, 50% context headroom
  ↓
GSD: /gsd-execute-phase → wave-based parallel subagents, atomic commits
  ↓
autonomous-dev: /implement
  ├── acceptance tests written BEFORE implementation
  ├── 0 failures gate — loops back if failing
  ├── no stubs/placeholders gate
  └── security scan gate + spec-blind reviewer
  ↓
GSD: /gsd-verify-work   → diagnose → fix plan → re-execute
  ↓
prototype on branch, PR opened
  ↓
YOU review diff and merge → production

Human Checkpoints (Intentionally Minimal)

Checkpoint	Why human	Time
Approve roadmap after `/gsd-new-project`	Confirm scope before any code	5 min
Review `/gsd-discuss-phase` decisions	Lock API shapes, data model	10–15 min
Review final PR diff	Merge decision	Your call

Everything else — research, planning, coding, testing, fixing, committing — runs autonomously.

GitHub Actions Pattern (Overnight / CI)

Trigger autonomously when requirements.md is pushed:

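name: autonomous-pipeline        # workflow name is a placeholder
on:
  push:
    paths: ['requirements.md']

jobs:
  build:                          # job id is a placeholder
    runs-on: ubuntu-latest
    steps:
      # Check out the repo so requirements.md is available to the run
      - uses: actions/checkout@v4
      - name: Run autonomous pipeline
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          npm install -g @anthropic-ai/claude-code
          npx get-shit-done-cc@latest --claude --global
          claude --dangerously-skip-permissions -p \
            "/gsd-autonomous" --max-turns 100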

Local Overnight Pattern

tmux new -s build
claude --dangerously-skip-permissions
# inside the Claude Code session:
/gsd-autonomous     # runs all phases to completion
# detach (Ctrl+B, D); the run continues as long as the machine stays awake

Comparison with Simpler Approaches

Method	Laptop needed?	Adversarial testing?	Context management	Best for
Ralph bash loop	Yes	No	None (relies on CLAUDE.md)	Simple sequential tasks
GSD alone	Yes	No	★★★★★	Large multi-phase projects
GSD + autonomous-dev	Yes/CI	★★★★★	★★★★★	Production-quality autonomous builds
Claude Code Routines	No	No	None	Scheduled cloud automation

Gotchas

The “fully autonomous, zero human touch” framing is aspirational: you still want PR review gates before anything reaches production
Some hook behaviours and config details may shift between Claude Code versions; cross-check against code.claude.com/docs
autonomous-dev’s spec-blind reviewer only works if acceptance criteria are precise — vague specs produce false passes
GSD spawns many subagents; costs can accumulate quickly on large codebases

Source

Conversations “CC-Autonomous” (Claude Code project) and “Evaluating Claude code automation credibility” — 2026-05-19. Article by Kevin Collins (Echofold / Manus Fellow), published April 2026. GitHub: autonomous-dev harness repo.
