Active Started Feb 2026

Agent Builder

Design, evaluate, and improve agentic harnesses. Methodology playbook plus a catalog of architectures, frameworks, memory substrates, and lab patterns.

Claude Code Skill Markdown Knowledge Base

The Answer

Agent Builder is the skill you load before deciding how an agent should be shaped, which framework to pick, and what state substrate to trust. It answers “should this be one agent or four” before you write code.

The Problem It Solves

Harness decisions compound. Picking LangGraph versus CrewAI versus raw LLM loops changes how state flows, how errors surface, and what fails in production. Most teams decide by vibes, copy a blog-post starter, and inherit its limits. Two weeks later, the stale-context bug is the framework’s fault but looks like a prompting problem.

How It Works

Two knowledge bodies, one skill.

Methodology covers principles, shapes, tools, state, context, extensibility, UX, a design playbook, an evaluation playbook, output patterns, and cross-client portability notes. 11 topic files.

Catalog inventories architecture Types I through V, a six-component harness model, 7 framework deep dives (LangGraph, CrewAI, Pydantic AI, smolagents, DSPy, AutoGen, Bedrock), memory substrates, and 14 production lab patterns (Anthropic, OpenAI, Perplexity, Manus, Google, Devin, Cursor, Windsurf, and more). 6 catalog files.

Four modes cover the job: design for a new harness, evaluation for an existing one, design + evaluation for a target architecture plus acceptance criteria, and catalog-lookup for factual questions.
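One way to picture how a request maps onto the four modes is the small sketch below. It is illustrative only: the `Mode` enum, the `pick_mode` function, its three boolean inputs, and the order in which they are checked are all assumptions made for this example, not the skill's actual routing logic.

```python
from enum import Enum


class Mode(Enum):
    """The four modes named above (enum itself is hypothetical)."""
    DESIGN = "design"
    EVALUATION = "evaluation"
    DESIGN_AND_EVALUATION = "design + evaluation"
    CATALOG_LOOKUP = "catalog-lookup"


def pick_mode(
    *,
    factual_question: bool,
    has_existing_harness: bool,
    needs_acceptance_criteria: bool,
) -> Mode:
    """Hypothetical router; the precedence of the checks is an assumption."""
    # A pure factual question is answered from the catalog.
    if factual_question:
        return Mode.CATALOG_LOOKUP
    # An existing harness is evaluated rather than redesigned.
    if has_existing_harness:
        return Mode.EVALUATION
    # A new harness that must also ship with acceptance criteria.
    if needs_acceptance_criteria:
        return Mode.DESIGN_AND_EVALUATION
    # Otherwise: design a new harness from requirements.
    return Mode.DESIGN


if __name__ == "__main__":
    mode = pick_mode(
        factual_question=False,
        has_existing_harness=False,
        needs_acceptance_criteria=True,
    )
    print(mode.value)
```

In practice the choice is made from the conversation rather than from three flags; the sketch only makes the distinctions between the modes explicit.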

What Makes It Different

Host-specific variants (Anthropic, Codex) carry their own trigger metadata, so the same skill activates correctly in either runtime. Worked examples trace the full path from requirements to implemented harness instead of stopping at principles.