Claude Agent SDK — Hermes Orchestration Pattern

Use the Claude Agent SDK (Python) to let Hermes act as a top-level orchestrator that delegates long-running development tasks to Claude Code sub-agents, all routing through your existing LiteLLM proxy.

Why / When to Use

Use this pattern when Hermes needs to control Claude Code agents programmatically for multi-step dev workflows: multi-file refactoring, test generation loops, CI/CD automation, or research→spec→implement→review cycles. The Agent SDK gives you typed events, session state, budget controls, and MCP support, so you do not have to build the tool loop yourself.

Core Concept / Architecture

Hermes (top-level orchestrator)
│
│  calls claude_agent_sdk.query() per sub-task
▼
Claude Agent SDK (Python library, same process)
│
│  spawns Claude Code CLI subprocess internally
▼
Claude Code Agent
├── Read / Edit / Write / Bash (built-in tools)
├── LiteLLM proxy (localhost:4000)
│   ├── GLM-4.7 / GLM-4.5-air via Z.ai
│   └── OpenRouter fallback
└── MCP servers (git, CI, browser — optional)

Hermes plans the high-level task, then delegates each phase to a named subagent via the SDK. Sessions preserve context across turns: the agent can read files in turn 1 and edit them in turn 2 without losing state.
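The plan-then-delegate flow can be sketched roughly as below. This is only an illustration under stated assumptions: Phase, run_pipeline, and the run_agent callable are hypothetical names, not SDK types. In a real setup run_agent would wrap an SDK call and return the agent's final text; here a stand-in is used so the sketch runs without a proxy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Phase:
    """One step of a Hermes plan (hypothetical structure, not an SDK type)."""
    agent: str   # name of the subagent that handles this phase
    prompt: str  # instruction; "{previous}" is replaced with the prior output


def run_pipeline(phases: List[Phase], run_agent: Callable[[str, str], str]) -> List[str]:
    """Run phases in order, feeding each phase the previous phase's output."""
    outputs: List[str] = []
    previous = ""
    for phase in phases:
        prompt = phase.prompt.replace("{previous}", previous)
        previous = run_agent(phase.agent, prompt)
        outputs.append(previous)
    return outputs


def fake_agent(agent: str, prompt: str) -> str:
    """Stand-in for a real SDK call so the sketch runs offline."""
    return f"[{agent}] {prompt}"


plan = [
    Phase("reviewer", "Summarize the auth module"),
    Phase("refactor-agent", "Apply this plan: {previous}"),
]
results = run_pipeline(plan, fake_agent)
```

Passing the runner in as a callable keeps the planning logic testable without spawning any agent.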

Setup

pip install claude-agent-sdk
# Route through LiteLLM proxy instead of Anthropic directly
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=<your_litellm_master_key>

The SDK sends claude-* model strings; LiteLLM maps them to GLM/OpenRouter, provided its model list defines aliases for those names. No SDK changes are needed.
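Before Hermes starts delegating, it may help to confirm that the two variables above are actually set. The following is a minimal sketch of such a preflight check; check_proxy_env is a hypothetical helper, and it only validates the shape of the values, not whether the proxy is reachable.

```python
from typing import List, Mapping
from urllib.parse import urlparse


def check_proxy_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems with the proxy variables (empty if none)."""
    problems: List[str] = []
    parsed = urlparse(env.get("ANTHROPIC_BASE_URL", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("ANTHROPIC_BASE_URL is missing or not an http(s) URL")
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


# Example with explicit values; pass os.environ to check the real environment.
ok = check_proxy_env({
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "ANTHROPIC_API_KEY": "sk-example",  # placeholder, not a real key
})
missing = check_proxy_env({})
```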

Core Commands

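import asyncio

from claude_agent_sdk import (
    AssistantMessage,
    ClaudeAgentOptions,
    ResultMessage,
    ToolUseBlock,
    query,
)


async def main() -> None:
    # Basic subagent call from Hermes
    options = ClaudeAgentOptions(
        allowed_tools=["Read", "Edit", "Glob", "Grep", "Bash"],
        max_budget_usd=0.50,  # needs an SDK version that supports this option
        cwd="/path/to/project",
    )

    # query() is an async iterator of typed messages; stream them back to Hermes
    async for message in query(
        prompt="Refactor auth module: extract JWT logic into auth/jwt.py",
        options=options,
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, ToolUseBlock):
                    print(f"Agent used: {block.name}")
        elif isinstance(message, ResultMessage):
            print(f"Done: {message.result}")


asyncio.run(main())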

Four Subagent Types

Agent           Allowed Tools                   Purpose
refactor-agent  Read, Edit, Glob, Grep, Bash    Multi-file codebase changes
test-agent      Read, Edit, Bash                Test generation + execution loops (self-correcting)
ci-agent        Bash, git MCP                   CI/CD pipeline automation, auto-fix and commit
reviewer        Read, Glob, Grep (read-only)    Code review, no modifications
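One way to keep these tool scopes in a single place is a small registry that Hermes consults before each call. The sketch below is an assumption about how that could look: SUBAGENT_TOOLS, tools_for, and is_read_only are hypothetical names, and the git MCP tools for ci-agent are left out because their exact names depend on the MCP server configuration.

```python
from typing import Dict, List

# Tool scopes copied from the table above; the dict layout itself is hypothetical.
SUBAGENT_TOOLS: Dict[str, List[str]] = {
    "refactor-agent": ["Read", "Edit", "Glob", "Grep", "Bash"],
    "test-agent": ["Read", "Edit", "Bash"],
    "ci-agent": ["Bash"],  # plus the git MCP server's tools, named per your config
    "reviewer": ["Read", "Glob", "Grep"],
}

# Tools that can modify the project or run arbitrary commands.
WRITE_TOOLS = {"Edit", "Write", "Bash"}


def tools_for(agent: str) -> List[str]:
    """Look up an agent's tool scope, failing loudly on unknown names."""
    try:
        return list(SUBAGENT_TOOLS[agent])
    except KeyError:
        raise ValueError(f"unknown subagent: {agent!r}") from None


def is_read_only(agent: str) -> bool:
    """True if the agent has no tool that can change files or run commands."""
    return WRITE_TOOLS.isdisjoint(tools_for(agent))
```

The list returned by tools_for(name) can then be passed as allowed_tools for that subagent, and is_read_only gives Hermes a cheap guard before handing a review task to an agent that could modify code.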

Key Options / Variants

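  • max_budget_usd: hard cost cap per query() call; the cap starts fresh on every call, so track cumulative spend in Hermes if a whole workflow needs one limit
  • allowed_tools: scope each subagent to only what it needs (principle of least privilege)
  • cwd: sets the agent's working directory; built-in file tools operate relative to it, though an agent with Bash access can still reach paths outside it
  • Sessions: to preserve agent state across multiple query() calls, pass the session ID reported by the previous call back in through the options (the resume option in current SDK versions)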
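Because the budget cap applies per call and the session ID has to be carried forward by the caller, Hermes needs a little state between calls. The sketch below shows one possible shape for it. RunState is a hypothetical helper, and the resume key is an assumed option name for continuing a session, so check it against the SDK version in use.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class RunState:
    """Tracks cumulative spend and the latest session ID across query() calls."""
    total_budget_usd: float
    spent_usd: float = 0.0
    session_id: Optional[str] = None

    def remaining(self) -> float:
        return max(self.total_budget_usd - self.spent_usd, 0.0)

    def next_call_kwargs(self, per_call_cap_usd: float) -> Dict[str, Any]:
        """Option values for the next call; raises once the budget is gone."""
        if self.remaining() <= 0:
            raise RuntimeError("overall budget exhausted")
        kwargs: Dict[str, Any] = {
            "max_budget_usd": min(per_call_cap_usd, self.remaining()),
        }
        if self.session_id is not None:
            # Assumed option name for continuing a session; verify for your SDK.
            kwargs["resume"] = self.session_id
        return kwargs

    def record(self, cost_usd: float, session_id: Optional[str]) -> None:
        """Store the cost and session ID reported at the end of a call."""
        self.spent_usd += cost_usd
        if session_id is not None:
            self.session_id = session_id


state = RunState(total_budget_usd=1.0)
first = state.next_call_kwargs(per_call_cap_usd=0.5)
state.record(cost_usd=0.75, session_id="session-abc")  # made-up values
second = state.next_call_kwargs(per_call_cap_usd=0.5)
```

The returned dict is meant to be merged into the options for the next call, so the per-call cap shrinks as the overall allowance is used up.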

Gotchas

  • The SDK locks to Claude model strings at the API level; routing through LiteLLM works because LiteLLM accepts Claude model names and maps them internally
  • test-agent self-corrects by reading Bash output (test failures), editing code, and re-running; give it Bash access and it keeps iterating until tests pass or it exhausts its budget
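  • The ANTHROPIC_BASE_URL override is the same environment variable the Claude Code CLI reads when pointed at an LLM gateway; the SDK picks it up because it spawns the CLI as a subprocess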
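Since a self-correcting test loop can in principle run for a long time, Hermes may want an outer guard around it. The following is a rough sketch of a bounded fix loop: fix_until_green and both callables are hypothetical, with run_tests standing in for a real test command and ask_agent_to_fix standing in for a call to test-agent.

```python
from typing import Callable, Tuple


def fix_until_green(
    run_tests: Callable[[], Tuple[bool, str]],
    ask_agent_to_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[bool, int]:
    """Run tests, hand failures to the agent, stop after max_attempts fix rounds.

    Returns (passed, fix_rounds_used).
    """
    for attempt in range(max_attempts + 1):
        passed, output = run_tests()
        if passed:
            return True, attempt
        if attempt == max_attempts:
            break  # out of fix rounds; do not call the agent again
        ask_agent_to_fix(output)
    return False, max_attempts


# Simulated project: two failing tests, each "fix" repairs one of them.
failures_left = [2]


def fake_run_tests() -> Tuple[bool, str]:
    return (failures_left[0] == 0, f"{failures_left[0]} failing")


def fake_fix(output: str) -> None:
    failures_left[0] -= 1


outcome = fix_until_green(fake_run_tests, fake_fix, max_attempts=3)
```

Capping the number of rounds complements the per-call budget: the budget bounds cost, while the attempt limit bounds how often a stuck agent is retried.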

Source

Conversation “Hermes orchestrating Claude agents for development workflows” — 2026-05-13