Agent Builder
Design, evaluate, and improve agentic harnesses. Methodology playbook plus a catalog of architectures, frameworks, memory substrates, and lab patterns.
The Answer
Agent Builder is the skill you load before deciding how an agent should be shaped, which framework to pick, and what state substrate to trust. It answers “should this be one agent or four?” before you write code.
The Problem It Solves
Harness decisions compound. Picking LangGraph versus CrewAI versus a raw LLM loop changes how state flows, how errors surface, and what fails in production. Most teams decide by vibes, copy a blog-post starter, and inherit its limits. Two weeks later a stale-context bug shows up that looks like a prompting problem but is actually caused by the framework.
How It Works
Two knowledge bodies, one skill.
Methodology covers principles, shapes, tools, state, context, extensibility, UX, a design playbook, an evaluation playbook, output patterns, and cross-client portability notes. 11 topic files.
Catalog inventories architecture Types I through V, a six-component harness model, 7 framework deep dives (LangGraph, CrewAI, Pydantic AI, smolagents, DSPy, AutoGen, Bedrock), memory substrates, and 14 production lab patterns (Anthropic, OpenAI, Perplexity, Manus, Google, Devin, Cursor, Windsurf, and more). 6 catalog files.
Four modes cover the job: design for a new harness, evaluation for an existing one, design + evaluation for a target architecture plus acceptance criteria, and catalog-lookup for factual questions.
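As a rough illustration of how a request might map onto those four modes, here is a minimal Python sketch. Only the mode names come from the description above. The `pick_mode` function, its three boolean inputs, and the order of the checks are hypothetical and do not reflect the skill's actual trigger logic.

```python
from enum import Enum


class Mode(Enum):
    # Mode names taken from the description above.
    DESIGN = "design"
    EVALUATION = "evaluation"
    DESIGN_AND_EVALUATION = "design + evaluation"
    CATALOG_LOOKUP = "catalog-lookup"


def pick_mode(
    has_existing_harness: bool,
    wants_acceptance_criteria: bool,
    is_factual_question: bool,
) -> Mode:
    """Hypothetical routing rule, assumed for illustration only."""
    # Factual questions go straight to the catalog.
    if is_factual_question:
        return Mode.CATALOG_LOOKUP
    # A target architecture plus acceptance criteria needs both playbooks.
    if wants_acceptance_criteria:
        return Mode.DESIGN_AND_EVALUATION
    # An existing harness is evaluated; anything else is a new design.
    if has_existing_harness:
        return Mode.EVALUATION
    return Mode.DESIGN


if __name__ == "__main__":
    print(pick_mode(False, False, False).value)
    print(pick_mode(True, False, False).value)
```

Checking the factual-question case first is a design choice in this sketch: a lookup needs neither playbook, so it can return early before any harness questions are asked.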
What Makes It Different
Host-specific variants (Anthropic, Codex) carry their own trigger metadata, so the same skill activates correctly in either runtime. Worked examples trace the full path from requirements to an implemented harness rather than stopping at principles.