Claude Agent SDK — Hermes Orchestration Pattern

Use the Claude Agent SDK (Python) to let Hermes act as a top-level orchestrator that delegates long-running development tasks to Claude Code sub-agents, all routing through your existing LiteLLM proxy.

Why / When to Use

Use this pattern when Hermes needs to control Claude Code agents programmatically for multi-step dev workflows: multi-file refactoring, test generation loops, CI/CD automation, or research→spec→implement→review cycles. The Agent SDK provides typed messages, session state, budget controls, and MCP support, so you do not have to build the tool loop yourself.

Core Concept / Architecture

Hermes (top-level orchestrator)
│
│  calls claude_agent_sdk.query() per sub-task
▼
Claude Agent SDK (Python library, same process)
│
│  spawns Claude Code CLI subprocess internally
▼
Claude Code Agent
├── Read / Edit / Write / Bash (built-in tools)
├── LiteLLM proxy (localhost:4000)
│   ├── GLM-4.7 / GLM-4.5-air via Z.ai
│   └── OpenRouter fallback
└── MCP servers (git, CI, browser — optional)

Hermes plans the high-level task, then delegates each phase to a named subagent via the SDK. Sessions preserve context across turns: the agent can read files in turn 1 and edit them in turn 2 without losing state.
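
The delegation flow can be sketched as a small phase runner. This is a minimal sketch, not Hermes's actual implementation: `Phase`, `run_pipeline`, and the `runner` hook are hypothetical names introduced for illustration, and `fake_runner` stands in for a real SDK call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Phase:
    """One delegated step: which subagent runs it and what it is asked to do."""
    agent: str
    prompt: str


def run_pipeline(phases: List[Phase], runner: Callable[[str, str], str]) -> List[str]:
    """Run phases in order, appending the previous phase's output to each prompt.

    `runner(agent, prompt)` is a hypothetical hook; a real one would wrap
    a claude_agent_sdk.query() call and return the agent's final text.
    """
    outputs: List[str] = []
    previous = ""
    for phase in phases:
        prompt = phase.prompt
        if previous:
            prompt = f"{phase.prompt}\n\nPrevious phase output:\n{previous}"
        previous = runner(phase.agent, prompt)
        outputs.append(previous)
    return outputs


def fake_runner(agent: str, prompt: str) -> str:
    # Stand-in for an SDK call so the sketch runs without network access.
    return f"[{agent}] handled {len(prompt)} chars"
```

Injecting the runner keeps the planning logic testable without spawning a Claude Code subprocess; Hermes would swap `fake_runner` for a function that streams a real `query()` call.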

Setup

pip install claude-agent-sdk

# Route through LiteLLM proxy instead of Anthropic directly
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=<your_litellm_master_key>

The SDK sends claude-* model strings; LiteLLM maps them to GLM/OpenRouter, provided the proxy configuration defines aliases for those model names. No SDK changes are needed.
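
If Hermes should set the routing itself rather than rely on the shell, the two exports can be built in Python. This is a minimal sketch: `litellm_env` is a hypothetical helper name, and the default URL is the localhost:4000 proxy address from this note.

```python
import os
from typing import Dict, Optional


def litellm_env(
    base_url: str = "http://localhost:4000",
    api_key: Optional[str] = None,
) -> Dict[str, str]:
    """Return the two variables from the Setup step as a dict.

    Falls back to an ANTHROPIC_API_KEY already present in the environment.
    """
    key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
    if not key:
        raise ValueError("no LiteLLM key provided")
    return {
        "ANTHROPIC_BASE_URL": base_url.rstrip("/"),
        "ANTHROPIC_API_KEY": key,
    }
```

Calling `os.environ.update(litellm_env(api_key="..."))` before the first `query()` call has the same effect as the exports, since the Claude Code subprocess inherits the parent environment. Some SDK versions may also accept an `env` mapping on the options object; treat that as an assumption to check against the installed version.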

Core Commands

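import asyncio

from claude_agent_sdk import (
    AssistantMessage,
    ClaudeAgentOptions,  # named ClaudeCodeOptions in the older claude-code-sdk package
    ResultMessage,
    ToolUseBlock,
    query,
)


async def main() -> None:
    # Basic subagent call from Hermes
    options = ClaudeAgentOptions(
        allowed_tools=["Read", "Edit", "Glob", "Grep", "Bash"],
        max_budget_usd=0.50,  # needs an SDK version that supports this option
        cwd="/path/to/project",
    )

    # query() returns an async iterator of typed messages; stream them back to Hermes
    async for message in query(
        prompt="Refactor auth module: extract JWT logic into auth/jwt.py",
        options=options,
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, ToolUseBlock):
                    print(f"Agent used: {block.name}")
        elif isinstance(message, ResultMessage):
            print(f"Done: {message.result}")


asyncio.run(main())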

Four Subagent Types

Agent	Allowed Tools	Purpose
`refactor-agent`	Read, Edit, Glob, Grep, Bash	Multi-file codebase changes
`test-agent`	Read, Edit, Bash	Test generation + execution loops (self-correcting)
`ci-agent`	Bash, git MCP	CI/CD pipeline automation, auto-fix and commit
`reviewer`	Read, Glob, Grep (read-only)	Code review, no modifications
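
The table can be captured as a small registry that Hermes consults before each call. This is a sketch under assumptions: `SubagentProfile`, `PROFILES`, `is_read_only`, and `options_kwargs` are hypothetical names, and the `mcp_servers` field is only a label for the "git MCP" entry, not an SDK parameter.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class SubagentProfile:
    purpose: str
    allowed_tools: Tuple[str, ...]
    mcp_servers: Tuple[str, ...] = ()  # illustrative label only


# Tool lists copied from the table in this note.
PROFILES: Dict[str, SubagentProfile] = {
    "refactor-agent": SubagentProfile(
        "Multi-file codebase changes", ("Read", "Edit", "Glob", "Grep", "Bash")
    ),
    "test-agent": SubagentProfile(
        "Test generation + execution loops", ("Read", "Edit", "Bash")
    ),
    "ci-agent": SubagentProfile(
        "CI/CD pipeline automation", ("Bash",), mcp_servers=("git",)
    ),
    "reviewer": SubagentProfile("Code review", ("Read", "Glob", "Grep")),
}

# Tools that can change files or run commands.
MUTATING_TOOLS = {"Edit", "Write", "Bash"}


def is_read_only(name: str) -> bool:
    """True when the profile holds no tool that can modify the project."""
    return not (set(PROFILES[name].allowed_tools) & MUTATING_TOOLS)


def options_kwargs(name: str, cwd: str, max_budget_usd: float) -> Dict[str, Any]:
    """Keyword arguments for the SDK options object, using names from this note."""
    profile = PROFILES[name]
    return {
        "allowed_tools": list(profile.allowed_tools),
        "cwd": cwd,
        "max_budget_usd": max_budget_usd,
    }
```

Keeping the profiles in one place makes the least-privilege intent checkable: `is_read_only("reviewer")` holds, while any profile with Bash or Edit does not.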

Key Options / Variants

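max_budget_usd: hard cost cap per query() call. The cap starts fresh on every call, so a workflow of N calls can spend up to N times the cap; track cumulative spend in Hermes if you need a workflow-level limit
allowed_tools: scope each subagent to only what it needs (principle of least privilege)
cwd: sets the working directory the agent operates in. Bash commands can still reach outside it, so it is not a sandbox on its own
Sessions: capture session_id from the result message of one query() call and pass it as the resume option on the next call to preserve agent state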
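
Because max_budget_usd applies per query() call, a workflow-level limit has to be tracked by the orchestrator. The following is a minimal sketch of such a ledger; `BudgetLedger` and its methods are hypothetical names, and the per-call spend is assumed to come from the cost figure reported in the SDK's result message.

```python
class BudgetLedger:
    """Tracks cumulative spend across several query() calls."""

    def __init__(self, total_usd: float) -> None:
        self.total_usd = total_usd
        self.spent_usd = 0.0

    @property
    def remaining_usd(self) -> float:
        return max(self.total_usd - self.spent_usd, 0.0)

    def next_cap(self, requested_usd: float) -> float:
        """Cap to pass as max_budget_usd for the next call."""
        if self.remaining_usd <= 0:
            raise RuntimeError("workflow budget exhausted")
        return min(requested_usd, self.remaining_usd)

    def record(self, spent_usd: float) -> None:
        """Add the cost reported for a finished call."""
        self.spent_usd += spent_usd
```

Hermes would call `next_cap()` before each delegation and `record()` after it, so the last call in a workflow receives only what is left of the overall budget.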

Gotchas

The SDK always requests Claude model strings; routing through LiteLLM works because LiteLLM accepts Claude model names and maps them internally
test-agent self-corrects by reading Bash output (test failures), editing code, and re-running. With Bash access it loops until tests pass, so pair it with max_budget_usd to stop a suite that never goes green
ANTHROPIC_BASE_URL is the same environment variable the Claude Code CLI reads when pointed at a gateway; the SDK's subprocess inherits it

Source

Conversation “Hermes orchestrating Claude agents for development workflows” — 2026-05-13
