Evaluator-Optimizer
The evaluator-optimizer pattern refines output iteratively using two components: a generator that produces output and an evaluator that assesses it against predefined criteria. The workflow cycles between generation and evaluation until the output meets the criteria or an iteration limit is reached.
This mirrors how people refine their own work: draft, review, then revise.
Why It Matters
- High-quality outputs: The evaluator catches suboptimal results and the generator revises them before delivery
- Adaptable to complex tasks: Suited to nuanced or multi-faceted outputs that benefit from incremental improvement
- Human-like iteration: Mirrors the draft-review-revise cycle that produces polished work
- Dynamic feedback: The evaluator can give domain-specific feedback or adapt its criteria, so the pattern carries across use cases
Key Components
| Component | Purpose | Example |
|---|---|---|
| Generator (LLM Call) | Produces an initial solution or response, then revises it using evaluator feedback | Drafts a marketing tagline for a product |
| Evaluator (LLM Call) | Reviews output against evaluation criteria and provides actionable feedback | Assesses whether the tagline aligns with brand voice and audience preferences |
| Feedback Loop | Iteratively refines output until it meets criteria | If the tagline misses brand guidelines, the evaluator says what to change and the generator produces a revised draft |
| Accepted Output | Delivers the finalized result once criteria are satisfied | A polished tagline ready for the campaign |
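The four components in the table can be sketched as a small loop. This is a minimal sketch, not a definitive implementation: `refine`, `Evaluation`, `max_rounds`, and the callable signatures are names assumed here for illustration, and the generator and evaluator are plain functions standing in for LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Evaluation:
    accepted: bool  # True once the draft meets the criteria
    feedback: str   # actionable guidance for the next draft


def refine(
    task: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    evaluate: Callable[[str, str], Evaluation],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Alternate generation and evaluation until accepted or out of rounds."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    draft: Optional[str] = None
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        # The generator sees the previous draft and feedback (None on round one).
        draft = generate(task, draft, feedback)
        result = evaluate(task, draft)
        if result.accepted:
            return draft, True   # accepted output
        feedback = result.feedback
    return draft, False          # best effort after hitting the cap


# Hypothetical stand-ins for the two LLM calls.
def fake_generate(task, previous, feedback):
    if previous is None:
        return "Our shoes are really very comfortable for everyone everywhere"
    return " ".join(previous.split()[:-2])  # "revise" by trimming two words


def fake_evaluate(task, draft):
    ok = len(draft.split()) <= 6
    return Evaluation(ok, "" if ok else "Shorten to six words or fewer.")


tagline, accepted = refine(
    "Tagline for running shoes", fake_generate, fake_evaluate, max_rounds=5
)
```

Returning the acceptance flag alongside the draft lets the caller distinguish an accepted output from one that merely ran out of rounds.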
When to Use It
- Nuanced tasks — When outputs require multiple layers of refinement (creative content, technical documentation)
- High-stakes outputs — When accuracy and quality are critical (legal documents, strategic reports)
- Clear evaluation criteria — When you can define measurable standards for what "good" looks like
Example: Refining a Marketing Campaign
A company needs compelling marketing content (taglines, ad copy, social posts) that must reflect brand tone, resonate with the audience, and meet platform guidelines:
- Generator — Produces initial drafts of taglines, ad copy, and social media posts
- Evaluator — Reviews drafts against brand tone, audience fit, and platform requirements; provides actionable feedback ("Make the tagline more emotionally engaging for the target audience")
- Feedback Loop — Refines outputs iteratively until all criteria are met
- Accepted Output — Polished, ready-to-use marketing content aligned with campaign goals
Expected results:
- Quality: Outputs that match brand tone and audience expectations
- Efficiency: Less manual editing, which saves time and resources
- Consistency: Uniform tone and style across campaign elements
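The evaluator's three checks in this example (brand tone, audience fit, platform requirements) could be expressed as explicit criteria that each return feedback on failure. This is a sketch under loud assumptions: the rules below (a banned word, addressing the reader as "you", a 60-character limit) are invented for illustration, and simple string checks stand in for what would normally be an LLM judgment.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Criterion:
    name: str
    passes: Callable[[str], bool]  # deterministic stand-in for an LLM judgment
    feedback: str                  # guidance returned when the check fails


# Hypothetical rules; real brand and platform rules will differ.
CRITERIA = (
    Criterion(
        "brand tone",
        lambda t: "cheap" not in t.lower(),
        "Avoid the word 'cheap'; the brand voice is premium.",
    ),
    Criterion(
        "audience fit",
        lambda t: re.search(r"\byour?\b", t, re.IGNORECASE) is not None,
        "Address the reader directly with 'you' or 'your'.",
    ),
    Criterion(
        "platform limit",
        lambda t: len(t) <= 60,
        "Keep the tagline within 60 characters.",
    ),
)


def evaluate(text: str, criteria: Sequence[Criterion] = CRITERIA) -> List[str]:
    """Return feedback for every failed criterion; an empty list means accepted."""
    return [f"{c.name}: {c.feedback}" for c in criteria if not c.passes(text)]


rejected = evaluate("Cheap shoes for all")   # fails brand tone and audience fit
approved = evaluate("Run your best mile")    # passes every check
```

Keeping each criterion's feedback next to its check means a rejection always comes with something the generator can act on.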
How to Implement
- Define evaluation criteria: Establish clear, measurable standards (quality, alignment, tone)
- Set up the feedback loop: Pass the evaluator's feedback, along with the previous draft, back to the generator on each round
- Set iteration limits: Cap the number of refinement cycles so the loop always terminates
- Test the workflow: Validate that evaluator feedback actually moves the generator toward the desired output
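One way to set up the feedback loop is to build the generator's prompt from the task, the previous draft, and the evaluator's feedback. The sketch below assumes a hypothetical helper, `build_generator_prompt`; the prompt wording is illustrative and not taken from any particular library or from the source article.

```python
from typing import Optional


def build_generator_prompt(
    task: str,
    previous_draft: Optional[str] = None,
    feedback: Optional[str] = None,
) -> str:
    """Build the generator's prompt; include the last draft and feedback when revising."""
    if previous_draft is None or feedback is None:
        # First round: nothing to revise yet.
        return f"Task: {task}\nWrite a first draft."
    return (
        f"Task: {task}\n"
        f"Previous draft:\n{previous_draft}\n"
        f"Evaluator feedback:\n{feedback}\n"
        "Revise the draft so that it addresses every point of feedback."
    )


first = build_generator_prompt("Write a tagline for running shoes")
revision = build_generator_prompt(
    "Write a tagline for running shoes",
    previous_draft="Shoes that are good",
    feedback="Make the tagline more emotionally engaging for the target audience",
)
```

Because the function is pure, a test can check that the feedback reaches the generator without calling a model.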
Based on Building Effective Agents by Anthropic.
Related
- Workflow Architecture Patterns Overview
- Prompt Chaining — sequential steps with gates (but not iterative)
- Autonomous Agents — agents with built-in evaluation through the think-act-observe loop
- Build > Design Your AI Workflow