Interface Built Right
End-to-end UI design tool for AI coding agents. Web/iOS/macOS archetype routing, /ibr:build orchestrator, custom CDP engine, deterministic rule + sensor layers.
Install
claude plugin marketplace add tyroneross/RossLabs-AI-Toolkit
claude plugin install ibr@rosslabs-ai-toolkit
The Problem
AI coding agents can render a UI without knowing whether it is the right UI. Tests pass, the page returns 200, the screenshot looks plausible — and the navbar is shifted 40px, the empty state has no CTA, the iOS app violates HIG, the form has no focus indicators. A regression checker tells you something changed. It does not tell you what to build, which patterns fit the platform, or whether the result is discoverable. IBR started as the regression checker; v1.0+ is the rest.
What I Built
IBR is an end-to-end design tool for AI coding agents — design, build, and validate interfaces across iOS, macOS, and web. The validation engine that used to be the product is now the verification layer underneath a build orchestrator, platform-aware design routers, and a custom CDP engine. Works from terminal, Codex, Claude Code slash commands, or code. Zero config.
/ibr:build Orchestrator
/ibr:build <topic> runs a fixed sequence:
- Preamble: platform, scope, design mode, archetype hints, UI template, references, density
- Optional imagegen concepts: generated only when useful; require explicit approval before becoming a visual-target
- Design Director: produces design-intent.json, specialist planning passes, target roles, and validation criteria
- Brainstorm & Plan: guided exploration with platform-specific rules and a concrete implementation plan
- Implement: Calm Precision, web/iOS/macOS routers, component patterns, data-viz guidance when needed
- Validate: scan, match wireframe / visual targets, test interactions, iterate until passing
The orchestrator is the one place that knows the active design system, the reference templates, and the verification gates — so an agent calling it cannot skip the design step on the way to a passing test. Specialist passes cover flow, visual system, interaction states, content states, Mockup Gallery targets, data visualization, and validation. IBR does not spawn tiny per-atom agents; those stay inside component patterns and design tokens.
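As a rough illustration of the fixed sequence and its two gates (concepts are optional, and a concept only becomes a visual target after explicit approval), here is a minimal TypeScript sketch. Every type and function name in it (Phase, planPhases, iterateUntilPassing, the maxRounds limit) is hypothetical and chosen for this example; it is not IBR's actual API.

```typescript
// Illustrative only: phase names mirror the list above; all identifiers
// here are hypothetical, not IBR's real implementation.
type Phase =
  | "preamble"
  | "imagegen-concepts"
  | "design-director"
  | "brainstorm-plan"
  | "implement"
  | "validate";

const FIXED_ORDER: readonly Phase[] = [
  "preamble",
  "imagegen-concepts",
  "design-director",
  "brainstorm-plan",
  "implement",
  "validate",
];

interface BuildOptions {
  wantConcepts: boolean; // imagegen concepts are generated only when useful
  conceptApproved: boolean; // explicit approval gates the visual target
}

interface BuildPlan {
  phases: Phase[];
  hasVisualTarget: boolean;
}

// Keep the order fixed; only the optional concept phase can drop out.
function planPhases(opts: BuildOptions): BuildPlan {
  const phases = FIXED_ORDER.filter(
    (p) => p !== "imagegen-concepts" || opts.wantConcepts,
  );
  return { phases, hasVisualTarget: opts.wantConcepts && opts.conceptApproved };
}

// Validate phase: re-check after each fix until the check passes.
// Returns the round that passed, or -1 if maxRounds (an assumed cap) ran out.
function iterateUntilPassing(
  check: () => boolean,
  fix: () => void,
  maxRounds = 5,
): number {
  for (let round = 1; round <= maxRounds; round++) {
    if (check()) return round;
    fix();
  }
  return -1;
}
```

The point of the sketch is that the design phases are part of the plan itself, so there is no path to "validate" that bypasses them.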
Two-Tier Architecture
IBR scans return structured data, not raw element dumps. Two layers run on every scan.
Tier 1 — Deterministic Rule Engine (no LLM, zero tokens). Pure algorithms against runtime data:
| Rule Preset | What It Checks | Algorithm |
|---|---|---|
| wcag-contrast | Text contrast ratios, AA and AAA | WCAG 2.1 relative luminance |
| touch-targets | Interactive element sizing | 44px mobile (WCAG 2.5.5), 24px desktop (WCAG 2.5.8) |
| calm-precision | Gestalt, Signal-to-Noise, Fitts, Hick, Content-Chrome, Cognitive Load | Principle-based checks |
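To make the wcag-contrast row concrete, the sketch below computes a contrast ratio from the WCAG 2.1 relative-luminance definition and grades it against the standard AA and AAA thresholds. The function names and the RGB tuple type are assumptions made for this example, not IBR's code; the formula and thresholds come from WCAG 2.1 itself.

```typescript
// Sketch of a WCAG 2.1 contrast check. Names are illustrative.
type RGB = [number, number, number]; // 8-bit sRGB channels, 0..255

// Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.1 definition).
function channelToLinear(c8: number): number {
  const c = c8 / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance: weighted sum of the linearized channels.
function relativeLuminance(color: RGB): number {
  const [r, g, b] = color;
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

// Contrast ratio is (lighter + 0.05) / (darker + 0.05), from 1 to 21.
function contrastRatio(fg: RGB, bg: RGB): number {
  const l1 = relativeLuminance(fg);
  const l2 = relativeLuminance(bg);
  const lighter = Math.max(l1, l2);
  const darker = Math.min(l1, l2);
  return (lighter + 0.05) / (darker + 0.05);
}

// AA needs 4.5:1 (3:1 for large text); AAA needs 7:1 (4.5:1 for large text).
function contrastLevel(ratio: number, largeText: boolean): "AAA" | "AA" | "fail" {
  const aa = largeText ? 3 : 4.5;
  const aaa = largeText ? 4.5 : 7;
  if (ratio >= aaa) return "AAA";
  if (ratio >= aa) return "AA";
  return "fail";
}
```

Because the check is pure arithmetic on computed colors, it can run on every scan without spending model tokens, which is the property Tier 1 relies on.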
Tier 2 — Sensor Layer (structured summaries). Pre-computed so the model focuses on judgment, not pattern re-discovery:
| Sensor | What It Produces |
|---|---|
| visualPatterns | Groups elements by style fingerprint per category |
| componentCensus | Tag/role counts + orphan cursor:pointer elements with no handler |
| interactionMap | Which interactive-looking elements actually have handlers |
| contrast | WCAG pass/fail grouped, only failures listed |
| navigation | Link structure with depth and counts |
| semanticState | Page intent, states, available actions |
| oneLiners | 5-second scannable summary lines |
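A sensor in this sense is a plain reduction over scanned elements. The sketch below shows what a componentCensus-style summary could look like: tag and role counts, plus the names of elements that look clickable (cursor: pointer) but have no handler. The ScannedElement and CensusSummary shapes are assumptions for illustration; the real scan output may be structured differently.

```typescript
// Hypothetical input shape: one record per element found by a scan.
interface ScannedElement {
  tag: string;
  role?: string;
  name: string; // accessible name or a short label
  cursor: string; // computed CSS cursor value
  hasHandler: boolean; // whether a click handler was detected
}

// Hypothetical output shape for a census-style sensor.
interface CensusSummary {
  tagCounts: Record<string, number>;
  roleCounts: Record<string, number>;
  orphans: string[]; // look interactive, but nothing is wired up
}

function componentCensus(elements: ScannedElement[]): CensusSummary {
  const summary: CensusSummary = { tagCounts: {}, roleCounts: {}, orphans: [] };
  for (const el of elements) {
    summary.tagCounts[el.tag] = (summary.tagCounts[el.tag] ?? 0) + 1;
    if (el.role) {
      summary.roleCounts[el.role] = (summary.roleCounts[el.role] ?? 0) + 1;
    }
    // Orphan: styled as clickable, but no handler behind it.
    if (el.cursor === "pointer" && !el.hasHandler) {
      summary.orphans.push(el.name);
    }
  }
  return summary;
}
```

Handing the model a summary like this, rather than the element list, is what lets it spend its effort on judgment instead of counting.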
Running ibr scan --output summary cuts token usage by roughly 60% compared with the full scan output.
Platform Routers
Web — web-design-router classifies into seven archetypes (SaaS dashboard, data/research tool, editor/workbench, AI agent chat, commerce/checkout, content/publication, internal admin). Each archetype sets defaults for navigation, density, primary content, mobile behavior, and validation focus.
iOS — ios-design-router classifies into six archetypes — Utility, Content/Feed, Productivity, Consumer/Habit, Editorial, Tool/Pro — and routes to matching reference files. A meditation app and a developer tool both ask “what navigation pattern?” and get different correct answers without the agent guessing. Sits next to ios-design (HIG rules: navigation, color, type, motion, SF Symbols, materials, Liquid Glass) and apple-platform (architecture, SwiftData, Watch connectivity, concurrency, CI/CD, TestFlight — folded in from the standalone apple-dev skill, which is now deprecated).
Reference library — seven domain files lifted from the Calm Precision iOS design system, loaded on demand by archetype rather than all at once: navigation, lists and cards, buttons, color and typography, motion and states, task economy, and the archetype catalog the router reads from.
CDP Engine
The browser layer is a custom Chrome DevTools Protocol engine over WebSocket. No Playwright, no Puppeteer, no heavyweight automation dependency. It builds in LLM-native features that the legacy stack could not do cleanly:
| Feature | What it does |
|---|---|
| queryAXTree-first resolution | Find elements by semantic name + role, not fragile CSS selectors. 4-tier: CDP-native search → Jaro-Winkler fuzzy → vision fallback |
| DOM chunking | Filter to interactive/leaf elements; 60–70% fewer tokens |
| Adaptive modality | Scores AX tree quality. High → text data; low → include screenshot. Vision only when needed |
| Resolution cache | Repeating the same intent-to-element query returns instantly from cache; the cache clears on navigation
| observe / extract / interact | Preview available actions, pull structured data, click/type by accessible name |
| Hydration waiting | Detects Next.js/React markers + polls AX tree until stable — eliminates “0 elements” on modern SPAs |
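The fuzzy tier named in the table uses Jaro-Winkler similarity. Below is a self-contained sketch of the standard algorithm, plus a small helper that picks the best accessible-name match or returns undefined so a caller can fall through to the next tier. The helper name bestMatch, the lowercase normalization, and the 0.85 threshold are assumptions for this example, not values taken from IBR.

```typescript
// Standard Jaro similarity between two strings, in [0, 1].
function jaro(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;
  // Characters match only if they are within this distance of each other.
  const matchWindow = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array<boolean>(a.length).fill(false);
  const bMatched = new Array<boolean>(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const start = Math.max(0, i - matchWindow);
    const end = Math.min(b.length - 1, i + matchWindow);
    for (let j = start; j <= end; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;
  // Count matched characters that appear in a different order.
  let k = 0;
  let outOfOrder = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const transpositions = outOfOrder / 2;
  return (matches / a.length + matches / b.length + (matches - transpositions) / matches) / 3;
}

// Winkler adjustment: boost pairs that share a prefix of up to 4 characters.
function jaroWinkler(a: string, b: string, scaling = 0.1): number {
  const base = jaro(a, b);
  const maxPrefix = Math.min(4, a.length, b.length);
  let prefix = 0;
  while (prefix < maxPrefix && a[prefix] === b[prefix]) prefix++;
  return base + prefix * scaling * (1 - base);
}

// Hypothetical helper: best candidate at or above the threshold, else undefined.
function bestMatch(query: string, candidates: string[], threshold = 0.85): string | undefined {
  let best: string | undefined;
  let bestScore = threshold;
  for (const candidate of candidates) {
    const score = jaroWinkler(query.toLowerCase(), candidate.toLowerCase());
    if (score >= bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```

Returning undefined instead of a weak guess keeps the tiers honest: a low-similarity name is a reason to escalate, not to click something.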
Safari support via safaridriver + macOS AX API; cross-browser diff via ibr compare-browsers.
Verification, Still
Comparison did not go away — it became a step inside the loop instead of the whole product. Capture a baseline, run the change, diff with Pixelmatch at a 0.1 threshold, classify the result as MATCH, EXPECTED_CHANGE, UNEXPECTED_CHANGE, or LAYOUT_BROKEN, and feed the verdict back into the iterate phase. Landmark detection extracts the accessibility tree on capture so the verdict carries page intent (auth form, list view, dashboard) rather than just a pixel count.
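The four verdicts can be thought of as a small decision function over the diff result. The source does not spell out the classification rules, so the sketch below is one plausible mapping under stated assumptions: the DiffResult fields (a mismatched-pixel count, a count of landmarks lost since the baseline, and a flag for whether a change was intended) and the order of the checks are all hypothetical.

```typescript
type Verdict = "MATCH" | "EXPECTED_CHANGE" | "UNEXPECTED_CHANGE" | "LAYOUT_BROKEN";

// Hypothetical inputs; the real engine may use different signals.
interface DiffResult {
  mismatchedPixels: number; // count reported by the pixel diff
  landmarksLost: number; // landmarks in the baseline that are now missing
  changeExpected: boolean; // did the agent intend to change this view?
}

// Assumed rule order: identical pixels win, structural loss beats intent.
function classify(diff: DiffResult): Verdict {
  if (diff.mismatchedPixels === 0) return "MATCH";
  if (diff.landmarksLost > 0) return "LAYOUT_BROKEN";
  return diff.changeExpected ? "EXPECTED_CHANGE" : "UNEXPECTED_CHANGE";
}
```

Whatever the exact rules are, a closed set of verdicts gives the iterate phase something it can branch on directly instead of interpreting a raw pixel count.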
Memory
A three-tier preference store keeps the agent’s design context small and current: summary.json as a hot cache under 2KB, a preferences/ directory with full details, and an archive/ for evicted entries. Cap is 50 active preferences; the least recently used drops to archive when a 51st arrives. Prompt hooks inject relevant baselines and preferences into the agent’s context before a UI change instead of asking it to fetch them. Zod schemas validate every preference file and auto-migrate old shapes when the format changes.
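The eviction rule described above (50 active preferences, least recently used moves to archive) can be sketched in memory with a Map, which iterates in insertion order. The class and method names are assumptions for this example, and the sketch leaves out the on-disk layout (summary.json, preferences/, archive/) and the Zod validation entirely.

```typescript
interface Preference {
  id: string;
  detail: string;
}

// In-memory sketch of the active/archive split. Names are hypothetical.
class PreferenceStore {
  private readonly cap: number;
  // Map keeps insertion order, so the first key is the least recently used.
  private readonly active = new Map<string, Preference>();
  readonly archive: Preference[] = [];

  constructor(cap = 50) {
    this.cap = cap;
  }

  get activeCount(): number {
    return this.active.size;
  }

  has(id: string): boolean {
    return this.active.has(id);
  }

  // Reading a preference marks it as most recently used.
  use(id: string): Preference | undefined {
    const pref = this.active.get(id);
    if (pref) {
      this.active.delete(id);
      this.active.set(id, pref);
    }
    return pref;
  }

  // Adding beyond the cap moves the least recently used entry to the archive.
  add(pref: Preference): void {
    this.active.delete(pref.id);
    this.active.set(pref.id, pref);
    if (this.active.size > this.cap) {
      const oldestId = this.active.keys().next().value;
      if (oldestId !== undefined) {
        const evicted = this.active.get(oldestId);
        this.active.delete(oldestId);
        if (evicted) this.archive.push(evicted);
      }
    }
  }
}
```

Archiving instead of deleting means an evicted preference can still be recovered later, while the hot set stays small enough to inject into context.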
Why the Repositioning
IBR already had build orchestration, design-system enforcement, and platform skills. Calling it a testing tool described one slice and left the other three under-marketed and under-used. The new framing — design-first, verified end to end — matches what the plugin actually does on a real build, and gives the iOS work, the apple-platform integration, and the CDP engine a coherent place to live.