Prompt Builder
Prompt Policy Engine. Classify, diagnose, rewrite, and score prompts calibrated to model tier and deployment context. Callable by agents, tools, humans.
The Answer
Feed it a raw prompt and get back a structured rewrite, a 5-dimension score, the top three issues, tagged assumptions, and a temperature hint, all calibrated to the model tier and deployment you name.
The Problem It Solves
Prompts are rewritten by vibes. A prompt tuned for Opus backend use gets pasted into a Haiku mobile app and underperforms, but nobody can point to which dimension regressed. “Improve this prompt” returns a longer version that is not measurably better.
How It Works
Every call returns a fixed contract, sketched in code below the list:
- A 6-Part-Stack prompt: Role, Task, Constraints, Context, Output Format, Acceptance Criteria
- A 5-dimension score: Accuracy, Clarity, Constraint Strength, Output Determinism, Completeness
- Top 3 diagnosed issues in the original
- Tagged assumptions (`[ASSUMED: ...]`) the caller should confirm
- A runtime `TEMPERATURE_HINT`
- Regression detection when iterating on a prior version
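To make the contract concrete, here is a minimal sketch of the returned structure as a Python dataclass. The field names (`stack`, `scores`, `top_issues`, and so on) are illustrative assumptions, not the engine's actual schema; only the parts themselves come from the contract above.

```python
from dataclasses import dataclass, field

@dataclass
class PromptReport:
    """Illustrative shape of the fixed contract; field names are assumptions."""
    stack: dict[str, str]        # Role, Task, Constraints, Context, Output Format, Acceptance Criteria
    scores: dict[str, float]     # Accuracy, Clarity, Constraint Strength, Output Determinism, Completeness
    top_issues: list[str]        # top 3 diagnosed issues in the original prompt
    assumptions: list[str]       # statements tagged [ASSUMED: ...] for the caller to confirm
    temperature_hint: float      # runtime TEMPERATURE_HINT
    regressions: list[str] = field(default_factory=list)  # dimensions that dropped vs. a prior version
```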
Calibration covers model tier (T1 frontier, T2 mid, T3 small or fast) and deployment (interactive, backend, rag_pipeline, agent, plugin, eval_judge, personal_mobile). The same prompt emerges differently for each pairing.
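The tier and deployment values above can be pinned down as enumerations. The values mirror the list in the previous paragraph; the `build_prompt` entry point in the comment is hypothetical, shown only to illustrate how a caller names the pairing.

```python
from enum import Enum

class ModelTier(Enum):
    T1 = "frontier"
    T2 = "mid"
    T3 = "small_or_fast"

class Deployment(Enum):
    INTERACTIVE = "interactive"
    BACKEND = "backend"
    RAG_PIPELINE = "rag_pipeline"
    AGENT = "agent"
    PLUGIN = "plugin"
    EVAL_JUDGE = "eval_judge"
    PERSONAL_MOBILE = "personal_mobile"

# Hypothetical call: the same raw prompt emerges differently per pairing.
# report = build_prompt(raw_prompt, tier=ModelTier.T3, deployment=Deployment.PERSONAL_MOBILE)
```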
What Makes It Different
Scoring runs alongside rewriting, so iterations are measurable rather than vibes-based. Tagged assumptions keep the rewrite honest about what it invented. The shared contract makes the skill callable by other plugins, agents, or humans through the same interface.
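Regression detection reduces to comparing the 5-dimension score vectors of two versions. A minimal sketch, assuming scores arrive as dicts keyed by dimension name:

```python
DIMENSIONS = ["Accuracy", "Clarity", "Constraint Strength",
              "Output Determinism", "Completeness"]

def regressed_dimensions(prior: dict[str, float], new: dict[str, float]) -> list[str]:
    """Return the dimensions where the new version scores below the prior one."""
    return [d for d in DIMENSIONS if new.get(d, 0.0) < prior.get(d, 0.0)]

# Example: a rewrite that got clearer but less deterministic.
# regressed_dimensions(
#     {"Accuracy": 8, "Clarity": 6, "Constraint Strength": 7, "Output Determinism": 9, "Completeness": 7},
#     {"Accuracy": 8, "Clarity": 9, "Constraint Strength": 7, "Output Determinism": 6, "Completeness": 7},
# )  # -> ["Output Determinism"]
```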