
Prompt Optimization

MutagenT’s optimization engine automatically improves prompts using AI-driven mutation and evaluation cycles. Instead of manually tweaking prompts, let the system find better variations for you.

How It Works

The optimization process follows an evolutionary approach, repeating a four-step cycle.

The Four Steps

  1. Analyze - Evaluate the current prompt against the dataset to establish a baseline score
  2. Mutate - Use AI to generate prompt variations (rewording, restructuring, adding/removing content)
  3. Test - Evaluate each variation against the same dataset using the evaluation criteria
  4. Select - Keep the best performing version as the new baseline
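The cycle repeats until one of the following occurs:
  • The target score is reached
  • The maximum number of iterations is reached
  • No improvement is found for the configured number of iterations (patience exceeded)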
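The cycle and its three stopping conditions can be sketched in Python. This is a minimal illustration under stated assumptions, not MutagenT's implementation: the `optimize` function, the `score` and `mutate` callables, and the default values are all hypothetical, and a toy keyword-based scorer stands in for dataset evaluation and AI-driven mutation.

```python
from typing import Callable, List, Tuple


def optimize(
    prompt: str,
    score: Callable[[str], float],
    mutate: Callable[[str], List[str]],
    target_score: float = 0.9,
    max_iterations: int = 10,
    patience: int = 3,
) -> Tuple[str, float, str]:
    """Return (best_prompt, best_score, stop_reason). All names are hypothetical."""
    # 1. Analyze: score the starting prompt to establish a baseline.
    best_prompt, best_score = prompt, score(prompt)
    stale = 0  # consecutive iterations without improvement

    for _ in range(max_iterations):
        if best_score >= target_score:
            return best_prompt, best_score, "target_reached"
        # 2. Mutate: generate variations of the current best prompt.
        candidates = mutate(best_prompt)
        # 3. Test: score every variation with the same scoring function.
        scored = [(score(c), c) for c in candidates]
        top_score, top_prompt = max(
            scored, key=lambda pair: pair[0], default=(best_score, best_prompt)
        )
        # 4. Select: keep the variation only if it beats the baseline.
        if top_score > best_score:
            best_prompt, best_score, stale = top_prompt, top_score, 0
        else:
            stale += 1
            if stale >= patience:
                return best_prompt, best_score, "patience_exceeded"

    if best_score >= target_score:
        return best_prompt, best_score, "target_reached"
    return best_prompt, best_score, "max_iterations"


# Toy stand-ins: the score is the fraction of desired keywords present,
# and each mutation appends one keyword to the prompt.
KEYWORDS = ["concise", "json", "cite sources"]


def toy_score(prompt: str) -> float:
    return sum(k in prompt.lower() for k in KEYWORDS) / len(KEYWORDS)


def toy_mutate(prompt: str) -> List[str]:
    return [f"{prompt} [{k}]" for k in KEYWORDS]


best, final_score, reason = optimize(
    "Summarize the article.", toy_score, toy_mutate, target_score=1.0
)
print(best, final_score, reason)
```

In this sketch a variation replaces the baseline only when it scores strictly higher, so the patience counter resets on every real improvement and grows otherwise.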

Key Concepts

Evaluation Criteria

The metrics defined in your evaluation that score each prompt variant. Optimization improves scores across selected criteria.
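One way to picture how several criteria could combine into a single score is a weighted mean, using the weights and thresholds that an evaluation definition carries. The sketch below is an assumption for illustration only: how MutagenT actually aggregates criteria is not described here, and the `Criterion` class, `aggregate`, and `failing` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Criterion:
    name: str
    weight: float     # relative importance (hypothetical field)
    threshold: float  # minimum acceptable score (hypothetical field)


def aggregate(criteria: List[Criterion], scores: Dict[str, float]) -> float:
    """Weighted mean of per-criterion scores (assumed aggregation rule)."""
    total_weight = sum(c.weight for c in criteria)
    if total_weight <= 0:
        raise ValueError("criteria weights must sum to a positive number")
    return sum(c.weight * scores[c.name] for c in criteria) / total_weight


def failing(criteria: List[Criterion], scores: Dict[str, float]) -> List[str]:
    """Names of criteria whose score falls below their threshold."""
    return [c.name for c in criteria if scores[c.name] < c.threshold]


criteria = [
    Criterion("accuracy", weight=2.0, threshold=0.8),
    Criterion("tone", weight=1.0, threshold=0.7),
]
scores = {"accuracy": 0.9, "tone": 0.6}

print(round(aggregate(criteria, scores), 3))
print(failing(criteria, scores))
```

Reporting failing criteria separately matters because a variant can raise the aggregate score while still falling below the threshold on one criterion.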

Target Score

The goal score to achieve. Optimization stops early when this threshold is reached.

Patience

Number of iterations without improvement before stopping early. Prevents wasted compute.

Iterations

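Maximum number of mutation-evaluation cycles. More iterations generally produce better results, with diminishing returns; a run may stop earlier if the target score is reached or patience is exceeded.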

Key Features

Optimization Jobs

Configure, run, and manage optimization jobs with full lifecycle control.

Real-time Streaming

Watch optimization progress in real time via WebSocket updates.

Quick Start

# Start optimization
mutagent prompts optimize start 123 \
  --dataset 456 \
  --max-iterations 10 \
  --model claude-sonnet-4-6

# Check status
mutagent prompts optimize status <job-id>

# Get results
mutagent prompts optimize results <job-id>
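When scripting these commands, it can help to build the argument list in code rather than concatenating strings. The following sketch uses only the subcommand and flags shown above; the helper name `build_start_command` is hypothetical, and the CLI's output format is not assumed.

```python
import shlex
from typing import List


def build_start_command(
    prompt_id: int, dataset_id: int, max_iterations: int, model: str
) -> List[str]:
    """Build argv for `mutagent prompts optimize start` (hypothetical helper)."""
    return [
        "mutagent", "prompts", "optimize", "start", str(prompt_id),
        "--dataset", str(dataset_id),
        "--max-iterations", str(max_iterations),
        "--model", model,
    ]


cmd = build_start_command(123, 456, 10, "claude-sonnet-4-6")
# Print the command for review; to execute it, pass the list to
# subprocess.run(cmd, check=True) on a machine with the CLI installed.
print(shlex.join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting problems if a model name or ID ever contains unusual characters.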

Prerequisites for Optimization

Before running optimization, ensure you have:
  1. A Prompt - The prompt you want to optimize, with defined variables and content
  2. A Dataset - Test cases with inputs and expected outputs (20+ items recommended)
  3. Evaluation Criteria - An evaluation definition with criteria (metrics, weights, thresholds) linked to the prompt
  4. Configured LLM Provider - At least one LLM provider configured in Settings > Providers (e.g., OpenAI, Anthropic). The CLI runs a provider pre-flight check before starting optimization.
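A quick local sanity check on the dataset can catch problems before a job starts. This sketch assumes each test case is a dictionary with `input` and `expected_output` keys; those field names are hypothetical and may differ from MutagenT's actual dataset schema. The 20-item figure is the recommendation stated above.

```python
from typing import Any, Dict, List

RECOMMENDED_MIN_ITEMS = 20  # from the "20+ items recommended" guidance


def check_dataset(items: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; empty means no issues found."""
    problems = []
    if len(items) < RECOMMENDED_MIN_ITEMS:
        problems.append(
            f"only {len(items)} items; {RECOMMENDED_MIN_ITEMS}+ recommended"
        )
    for i, item in enumerate(items):
        # Field names below are assumptions, not a documented schema.
        for key in ("input", "expected_output"):
            if not item.get(key):
                problems.append(f"item {i} is missing '{key}'")
    return problems


sample = [{"input": "What is 2 + 2?", "expected_output": "4"}, {"input": "Hi"}]
for problem in check_dataset(sample):
    print(problem)
```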

Optimization Strategies

Choose the right approach based on your needs:

Conservative

Small, incremental changes. Lower risk of breaking existing behavior. Good for production prompts.
mutagent prompts optimize start 123 \
  --dataset 456 \
  --max-iterations 20 \
  --model claude-sonnet-4-6
Best for:
  • Prompts already in production
  • Risk-sensitive applications
  • Fine-tuning existing prompts

Balanced

Moderate changes with balanced exploration. Good default for most use cases.
mutagent prompts optimize start 123 \
  --dataset 456 \
  --max-iterations 10 \
  --model claude-sonnet-4-6
Best for:
  • New prompt development
  • General improvement
  • Unknown optimization potential

Aggressive

Larger, more experimental changes. May find significantly better prompts but requires more iterations.
mutagent prompts optimize start 123 \
  --dataset 456 \
  --max-iterations 10 \
  --model claude-sonnet-4-6
Best for:
  • Prompts with low baseline scores
  • Exploring new approaches
  • When conservative optimization plateaus

When to Optimize

  • Once you have a working prompt, dataset, and evaluation criteria, run optimization to improve the prompt before going to production.
  • If your prompt’s scores decline (due to model changes, new edge cases, etc.), optimization can help recover them.
  • As part of your release process, optimize prompts to ensure they’re performing at their best.
  • Schedule regular optimization runs to prevent gradual degradation and capture improvement opportunities.
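To decide when declining scores justify a new run, one simple approach is to compare recent evaluation scores against the recorded baseline. The sketch below is illustrative only: the function name, the 0.05 tolerance, and the idea of averaging recent scores are assumptions, not MutagenT behavior.

```python
from statistics import mean
from typing import Sequence


def needs_reoptimization(
    baseline: float, recent_scores: Sequence[float], tolerance: float = 0.05
) -> bool:
    """True if the recent average has dropped more than `tolerance` below baseline."""
    if not recent_scores:
        return False  # nothing to compare against yet
    return mean(recent_scores) < baseline - tolerance


print(needs_reoptimization(0.85, [0.78, 0.76, 0.79]))
print(needs_reoptimization(0.85, [0.84, 0.83]))
```

The tolerance keeps ordinary run-to-run variation from triggering unnecessary optimization jobs.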

Optimization vs Manual Tuning

| Aspect | Manual Tuning | Automated Optimization |
|---|---|---|
| Time | Hours to days | Minutes to hours |
| Consistency | Subjective | Objective, metric-driven |
| Coverage | Limited variations tried | Many variations explored |
| Reproducibility | Hard to reproduce | Fully tracked and reproducible |
| Expertise required | High | Low (once dataset is ready) |

What’s Next?

Optimization Jobs

Learn to configure, run, and manage optimization jobs

Streaming Updates

Get real-time progress updates via WebSocket