Claude Code — Autonomous Development Pipeline (GSD + autonomous-dev)
The strongest known pattern for fully autonomous requirements → prototype development using Claude Code, combining GSD for planning/context management with the autonomous-dev harness for adversarial verification.
Why / When to Use
Use when you need to run a multi-phase development task overnight or in CI without human-in-the-loop review at each step. The pipeline is designed to catch its own failures before declaring “done.”
Core Concept
Four fundamental failure modes in autonomous coding, and how this stack addresses them:
| Failure | Problem | Solution |
|---|---|---|
| Drift | Claude interprets requirements differently than intended | GSD locks requirements in PROJECT.md and REQUIREMENTS.md before code is written |
| Context rot | Quality degrades mid-execution on long tasks | GSD spawns fresh subagent contexts (200K window each) and rotates them as needed |
| No verification gate | “Done” = Claude says done, not actually working | autonomous-dev: 0 test failures gate + spec-blind reviewer |
| No recovery | Failure at step 6 = restart from 0 | GSD’s verify step diagnoses, generates fix plans, re-executes |
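The "No recovery" row is easiest to see with a small sketch. The following bash function records each completed step in a log file so that a rerun skips finished work instead of restarting from step 0. It illustrates the general idea only: the function name `run_resumable` and the log format are hypothetical, and this is not how GSD itself stores or replays state.

```shell
#!/usr/bin/env bash
# Sketch of resumable execution (illustrative, not GSD's implementation).
# Completed steps are appended to a log; a rerun skips anything already logged.

# run_resumable LOG STEP [STEP ...]
# Each STEP is the name of a shell function or command to run.
run_resumable() {
  local log="$1"; shift
  local step
  touch "$log"
  for step in "$@"; do
    # -x: whole-line match, -F: fixed string, so step names are compared literally
    if grep -qxF -- "$step" "$log"; then
      echo "skip: $step"
      continue
    fi
    if "$step"; then
      echo "$step" >> "$log"
      echo "done: $step"
    else
      echo "failed: $step" >&2
      return 1   # stop here; earlier steps stay recorded in the log
    fi
  done
}
```

After a failure at a later step, fixing the cause and calling `run_resumable` again with the same log resumes at the failed step rather than at the beginning.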
The Two Components
GSD (Get Shit Done)
Handles the requirements → structured plan → execution phase.
State files GSD maintains across sessions:
- PROJECT.md: project rules, constraints, architectural decisions
- REQUIREMENTS.md: full feature spec locked before any code
- ROADMAP.md: phases, milestones, what’s been completed
- STATE.md: current phase, what’s in progress, what’s next
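Because an unattended run depends on these files being present, a pre-flight check can be useful. The sketch below reports which of the four state files are missing from a directory. The function names are hypothetical, and the directory in which GSD actually keeps these files is not stated here, so it is passed in as a parameter.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight check for the four state files listed above.
# The directory argument is an assumption; point it at wherever GSD writes them.

# missing_state_files [DIR]: print the name of each state file absent from DIR.
missing_state_files() {
  local dir="${1:-.}" f
  for f in PROJECT.md REQUIREMENTS.md ROADMAP.md STATE.md; do
    [ -f "$dir/$f" ] || printf '%s\n' "$f"
  done
}

# state_ready DIR: succeed only when every state file exists.
state_ready() {
  local missing
  missing="$(missing_state_files "$1")"
  if [ -n "$missing" ]; then
    printf 'missing state files:\n%s\n' "$missing" >&2
    return 1
  fi
}
```

A wrapper script could call `state_ready "$dir" || exit 1` before starting an overnight run, so a half-finished planning phase is caught before any code is generated.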
Key commands:
# Bootstrap from requirements file, fully autonomous
gsd headless new-milestone --context requirements.md --auto
# Or interactive phase-by-phase
/gsd-new-project # parallel research agents → roadmap
/gsd-discuss-phase # lock decisions: API shapes, data model
/gsd-plan-phase # 2–3 tasks per plan, fits in 50% context window
/gsd-execute-phase # wave-based parallel subagents, atomic commits
/gsd-verify-work # diagnose → fix plan → re-execute
/gsd-autonomous # runs all phases to completion (headless)
Install:
npx get-shit-done-cc@latest --claude --global
autonomous-dev Harness
Adds adversarial verification on top of GSD’s execution layer. The key innovation: a spec-blind reviewer agent tests the implementation without having seen the source code — only the acceptance criteria. This is the closest approximation to an independent QA reviewer.
Hard gates (pipeline stops if any fail):
- Tests written before implementation (spec → test → code order)
- 0 test failures required to proceed
- No stubs or placeholders allowed
- Security scan must pass
- Spec-blind validation: separate agent writes behavioural tests from acceptance criteria alone, then validates against the implementation
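The "pipeline stops if any fail" behaviour can be sketched as a runner that executes named checks in order and aborts at the first failure. Everything here is an assumption for illustration: the function names, the stub markers searched for, and the commands in the usage comment are placeholders, not autonomous-dev's actual gates.

```shell
#!/usr/bin/env bash
# Sketch of hard gates: run checks in order, stop at the first failure.
# Gate names, stub markers, and commands are placeholders for illustration.

# no_stubs_gate DIR: succeed only if no file under DIR contains a stub marker.
no_stubs_gate() {
  local rc=0
  grep -rIqE 'TODO|FIXME|NotImplementedError' -- "$1" || rc=$?
  # grep exit codes: 0 = marker found, 1 = none found, 2 = error.
  # Only "none found" passes, so a grep error fails the gate rather than passing it.
  [ "$rc" -eq 1 ]
}

# run_gates NAME CMD [NAME CMD ...]: each CMD is a shell command string.
run_gates() {
  local name cmd
  while [ "$#" -ge 2 ]; do
    name="$1"; cmd="$2"
    shift 2
    # eval is used because the commands are written by the operator, not the agent
    if ! eval "$cmd"; then
      echo "GATE FAILED: $name" >&2
      return 1
    fi
    echo "gate passed: $name"
  done
}

# Example invocation (commands are assumptions about your project):
#   run_gates "tests" "npm test" "no stubs" "no_stubs_gate src"
```

Treating a grep error as a failure is a deliberate choice: a gate that cannot run should block the pipeline, not wave it through.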
The adversarial layer:
implementer agent → builds the feature
reviewer agent    → sees only: acceptance criteria + running code
                  → writes its own tests from spec, never from implementation
                  → verdict: pass / fail / escalate
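One way to picture the isolation described above is a reviewer workspace that contains the acceptance criteria and nothing from the source tree, plus a rule that maps the reviewer's test counts to a verdict. This is a minimal sketch under stated assumptions: the file name `ACCEPTANCE.md`, both function names, and the escalation rule are hypothetical and not taken from the autonomous-dev harness.

```shell
#!/usr/bin/env bash
# Sketch of spec-blind isolation (illustrative; names and rules are assumptions).

# make_reviewer_workspace CRITERIA_FILE
# Creates a fresh directory holding only the acceptance criteria and prints its path.
make_reviewer_workspace() {
  local criteria="$1" ws
  ws="$(mktemp -d)" || return 1
  cp -- "$criteria" "$ws/ACCEPTANCE.md" || return 1
  printf '%s\n' "$ws"
}

# verdict PASSED FAILED ERRORED
# Counts come from the reviewer's own behavioural tests against the running code.
verdict() {
  local passed="$1" failed="$2" errored="$3"
  if [ "$errored" -gt 0 ] || [ "$((passed + failed))" -eq 0 ]; then
    echo escalate   # tests could not run, or nothing was actually tested
  elif [ "$failed" -gt 0 ]; then
    echo fail
  else
    echo pass
  fi
}
```

Escalating when zero tests ran guards against an empty test suite being reported as a pass.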
Install and trigger:
/implement # runs the full autonomous-dev pipeline
Full End-to-End Pipeline
requirements.md
↓
GSD: /gsd-new-project → parallel research agents → roadmap
↓
GSD: /gsd-discuss-phase → lock API shapes, data model
↓
GSD: /gsd-plan-phase → 2–3 tasks per plan, 50% context headroom
↓
GSD: /gsd-execute-phase → wave-based parallel subagents, atomic commits
↓
autonomous-dev: /implement
├── acceptance tests written BEFORE implementation
├── 0 failures gate — loops back if failing
├── no stubs/placeholders gate
└── security scan gate + spec-blind reviewer
↓
GSD: /gsd-verify-work → diagnose → fix plan → re-execute
↓
prototype on branch, PR opened
↓
YOU review diff and merge → production
Human Checkpoints (Intentionally Minimal)
| Checkpoint | Why human | Time |
|---|---|---|
| Approve roadmap after /gsd-new-project | Confirm scope before any code | 5 min |
| Review /gsd-discuss-phase decisions | Lock API shapes, data model | 10–15 min |
| Review final PR diff | Merge decision | Your call |
Everything else — research, planning, coding, testing, fixing, committing — runs autonomously.
GitHub Actions Pattern (Overnight / CI)
Trigger autonomously when requirements.md is pushed:
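on:
  push:
    paths: ['requirements.md']
jobs:
  autonomous-build:            # job name is illustrative
    runs-on: ubuntu-latest     # assumed runner; the original snippet omitted the job wrapper
    steps:
      - uses: actions/checkout@v4   # check out the repo so requirements.md is available
      - name: Run autonomous pipeline
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          npm install -g @anthropic-ai/claude-code
          npx get-shit-done-cc@latest --claude --global
          claude --dangerously-skip-permissions -p \
            "/gsd-autonomous" --max-turns 100
Local Overnight Pattern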
tmux new -s build
claude --dangerously-skip-permissions
/gsd-autonomous # runs all phases to completion
# detach (Ctrl+B, D), close laptop
Comparison with Simpler Approaches
| Method | Laptop needed? | Adversarial testing? | Context management | Best for |
|---|---|---|---|---|
| Ralph bash loop | Yes | No | None (relies on CLAUDE.md) | Simple sequential tasks |
| GSD alone | Yes | No | ★★★★★ | Large multi-phase projects |
| GSD + autonomous-dev | Yes/CI | ★★★★★ | ★★★★★ | Production-quality autonomous builds |
| Claude Code Routines | No | No | None | Scheduled cloud automation |
Gotchas
- The “fully autonomous, zero human touch” framing is aspirational; you still want PR review gates before anything reaches production
- Some hook behaviours and config details may shift between Claude Code versions; cross-check against code.claude.com/docs
- autonomous-dev’s spec-blind reviewer only works if acceptance criteria are precise — vague specs produce false passes
- GSD spawns many subagents; costs can accumulate quickly on large codebases
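Since vague acceptance criteria are the main way the spec-blind reviewer produces false passes, a crude lint before the run can help. The sketch below flags lines containing wording that is hard to test. The word list is an arbitrary starting point and the function name is hypothetical; neither is part of autonomous-dev.

```shell
#!/usr/bin/env bash
# Sketch: flag acceptance criteria with vague, hard-to-test wording.
# The word list is an assumption; extend it for your own specs.

# vague_criteria FILE
# Prints "LINE_NUMBER:line" for each suspicious line.
# Exit status 0 means vague wording WAS found; 1 means the file looks clean.
vague_criteria() {
  grep -niwE 'fast|quickly|user-friendly|intuitive|robust|appropriate|etc' -- "$1"
}
```

A pre-run script could refuse to start with `if vague_criteria criteria.md; then exit 1; fi`, forcing each flagged line to be rewritten as a measurable statement first.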
Source
Conversations “CC-Autonomous” (Claude Code project) and “Evaluating Claude code automation credibility” — 2026-05-19. Article by Kevin Collins (Echofold / Manus Fellow), published April 2026. GitHub: autonomous-dev harness repo.