Optimization Jobs

An optimization job runs multiple mutation-evaluation cycles to improve a prompt. This guide covers job configuration, lifecycle management, and result handling.

Creating a Job

```typescript
import { Mutagent } from '@mutagent/sdk';

const client = new Mutagent({ bearerAuth: 'sk_live_...' });

const job = await client.optimization.start({
  promptId: 'prompt_xxxx',
  datasetId: 'dataset_xxxx',
  config: {
    maxIterations: 10,
    targetScore: 0.9,
    mutationStrength: 0.5,
    evaluationMetrics: ['g_eval', 'semantic_similarity'],
  },
});

console.log('Job ID:', job.id);
console.log('Status:', job.status);
console.log('Initial Score:', job.currentScore);
```

Configuration Options

| Option | Type | Default | Description |
|---|---|---|---|
| `maxIterations` | number | 10 | Maximum optimization cycles to run |
| `targetScore` | number | 0.95 | Stop early when this score is reached |
| `mutationStrength` | number | 0.5 | How much to vary prompts (0.0-1.0) |
| `evaluationMetrics` | string[] | All available | Metrics to optimize for |
| `preserveVariables` | boolean | true | Keep original variable names and structure |
| `preserveStructure` | boolean | false | Maintain overall prompt organization |
| `minImprovement` | number | 0.01 | Stop if improvement falls below this |
| `timeout` | number | 3600000 | Max runtime in milliseconds (1 hour default) |
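
For reference, here is a sketch of a config object spelling out every option at its documented default; any of these can be omitted. The explicit `evaluationMetrics` list is illustrative only, since the default is all available metrics:

```typescript
// Every option at its documented default; omit anything you don't need to override.
const config = {
  maxIterations: 10,
  targetScore: 0.95,
  mutationStrength: 0.5,        // 0.0 (minimal edits) to 1.0 (heavy rewrites)
  evaluationMetrics: ['g_eval', 'semantic_similarity'], // default is all available metrics
  preserveVariables: true,      // keep original variable names and structure
  preserveStructure: false,     // allow reorganizing the prompt
  minImprovement: 0.01,         // stop once per-iteration gains fall below this
  timeout: 3_600_000,           // max runtime in ms (1 hour)
};
```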

Configuration Examples

Conservative optimization:

```typescript
config: {
  maxIterations: 20,
  mutationStrength: 0.2,
  minImprovement: 0.005,
  preserveStructure: true,
}
```

Aggressive optimization:

```typescript
config: {
  maxIterations: 10,
  mutationStrength: 0.8,
  targetScore: 0.95,
}
```

Metric-focused optimization:

```typescript
config: {
  maxIterations: 15,
  evaluationMetrics: ['g_eval'],  // Focus on a single metric
  targetScore: 0.9,
}
```

Job States

Jobs progress through these states:
| State | Description | Transitions |
|---|---|---|
| `pending` | Queued, waiting to start | -> running, cancelled |
| `running` | Actively optimizing | -> completed, paused, failed, cancelled |
| `paused` | Temporarily stopped | -> running, cancelled |
| `completed` | Successfully finished | Terminal |
| `failed` | Error occurred | Can retry |
| `cancelled` | Manually stopped | Terminal |
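
If you track job state client-side, the transition rules above can be encoded directly. A minimal sketch (the `JobStatus` type and helper names are ours, not part of the SDK):

```typescript
type JobStatus = 'pending' | 'running' | 'paused' | 'completed' | 'failed' | 'cancelled';

// Legal transitions, taken from the table above.
const TRANSITIONS: Record<JobStatus, JobStatus[]> = {
  pending:   ['running', 'cancelled'],
  running:   ['completed', 'paused', 'failed', 'cancelled'],
  paused:    ['running', 'cancelled'],
  completed: [],  // terminal
  failed:    [],  // terminal, though the job can be retried as a new run
  cancelled: [],  // terminal
};

const isTerminal = (s: JobStatus) => TRANSITIONS[s].length === 0;
const canTransition = (from: JobStatus, to: JobStatus) => TRANSITIONS[from].includes(to);
```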

Managing Jobs

Check Status

```typescript
const job = await client.optimization.getStatus('job_xxxx');

console.log('Status:', job.status);
console.log('Iteration:', `${job.currentIteration}/${job.maxIterations}`);
console.log('Current Score:', job.currentScore.toFixed(2));
console.log('Best Score:', job.bestScore.toFixed(2));
console.log('Started:', job.startedAt);
console.log('Elapsed:', `${(Date.now() - job.startedAt.getTime()) / 1000}s`);
```

List Jobs

```typescript
// List jobs, optionally filtered by status
const jobs = await client.optimization.list({
  limit: 20,
  status: 'running',
});

jobs.data.forEach(job => {
  console.log(`${job.id}: ${job.status} - Score: ${job.currentScore}`);
});
```

Pause a Job

Temporarily stop a running job (it can be resumed later):

```typescript
await client.optimization.pause('job_xxxx');

// Check it paused
const job = await client.optimization.getStatus('job_xxxx');
console.log('Status:', job.status); // 'paused'
```

Pausing preserves the current best prompt and all progress. The job can be resumed from where it left off.

Resume a Job

Continue a paused job:

```typescript
await client.optimization.resume('job_xxxx');

// Fetch the status to see where it picked up
const job = await client.optimization.getStatus('job_xxxx');
console.log('Job resumed from iteration', job.currentIteration);
```

Cancel a Job

Permanently stop a job (it cannot be resumed):

```typescript
await client.optimization.cancel('job_xxxx');

// Results up to this point are still available; finalScore is the best achieved
const results = await client.optimization.getResults('job_xxxx');
console.log('Best score achieved:', results.finalScore);
```

Cancellation is permanent. If you might want to continue later, use pause instead.

Getting Results

Retrieve detailed results when a job completes:

```typescript
const results = await client.optimization.getResults('job_xxxx');

console.log('=== Optimization Results ===\n');

// Original vs optimized
console.log('Original Prompt:');
console.log(results.originalPrompt.content);
console.log(`Score: ${results.originalScore.toFixed(2)}\n`);

console.log('Optimized Prompt:');
console.log(results.optimizedPrompt.content);
console.log(`Score: ${results.finalScore.toFixed(2)}\n`);

// Improvement summary
console.log('Summary:');
console.log(`  Improvement: +${(results.improvement * 100).toFixed(1)}%`);
console.log(`  Iterations: ${results.totalIterations}`);
console.log(`  Duration: ${results.duration / 1000}s`);
console.log(`  Variants tested: ${results.variantsTested}`);
```

Results Structure

```typescript
interface OptimizationResults {
  jobId: string;
  status: 'completed' | 'cancelled' | 'failed';

  // Prompts
  originalPrompt: {
    id: string;
    content: string;
    variables: Record<string, string>;
  };
  optimizedPrompt: {
    id: string;
    content: string;
    variables: Record<string, string>;
  };

  // Scores
  originalScore: number;
  finalScore: number;
  improvement: number;           // finalScore - originalScore

  // Iteration history
  iterations: Array<{
    number: number;
    score: number;
    isNewBest: boolean;
    promptVariant: string;
  }>;

  // Metadata
  totalIterations: number;
  variantsTested: number;
  duration: number;              // milliseconds
  startedAt: Date;
  completedAt: Date;
}
```
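
The iterations array lets you reconstruct how the score evolved over the run. For example, a small sketch (using only fields from the interface above) that prints the score trajectory and flags each new best:

```typescript
const results = await client.optimization.getResults('job_xxxx');

// Print the score at each iteration, marking improvements.
for (const it of results.iterations) {
  const marker = it.isNewBest ? '  <- new best' : '';
  console.log(`#${it.number}: ${it.score.toFixed(2)}${marker}`);
}

// The last new-best entry is where the winning variant was found.
const best = results.iterations.filter(it => it.isNewBest).at(-1);
console.log('Best variant found at iteration', best?.number);
```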

Applying Results

After optimization, create a new prompt version with the optimized content:
```typescript
const results = await client.optimization.getResults('job_xxxx');

// Only apply if there was meaningful improvement
if (results.improvement > 0.02) {
  // Create a new version with the optimized content
  const newVersion = await client.prompts.createVersion(results.originalPrompt.id, {
    content: results.optimizedPrompt.content,
    description: `Optimized via job ${results.jobId}. Score: ${results.originalScore.toFixed(2)} -> ${results.finalScore.toFixed(2)} (+${(results.improvement * 100).toFixed(1)}%)`,
  });

  console.log('Created optimized version:', newVersion.currentVersion);

  // Optionally run an evaluation to verify
  const verification = await client.evaluations.run({
    promptId: newVersion.id,
    datasetId: 'golden_dataset_xxxx',
    metrics: ['g_eval', 'semantic_similarity'],
  });

  console.log('Verification evaluation:', verification.id);
} else {
  console.log('Improvement too small, keeping original');
}
```

Monitoring Progress

Polling

Check status periodically:

```typescript
async function waitForJob(jobId: string) {
  while (true) {
    const job = await client.optimization.getStatus(jobId);

    console.log(`Iteration ${job.currentIteration}/${job.maxIterations}`);
    console.log(`Current: ${job.currentScore.toFixed(2)} | Best: ${job.bestScore.toFixed(2)}`);

    if (job.status === 'completed' || job.status === 'failed' || job.status === 'cancelled') {
      return job;
    }

    await new Promise(r => setTimeout(r, 5000)); // Wait 5s
  }
}
```
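
Polling forever can hang a script if something goes wrong upstream. A variant with a client-side deadline (our own addition, not an SDK feature) is a safer default:

```typescript
async function waitForJobWithDeadline(jobId: string, maxWaitMs = 30 * 60 * 1000) {
  const deadline = Date.now() + maxWaitMs;
  while (Date.now() < deadline) {
    const job = await client.optimization.getStatus(jobId);
    if (job.status === 'completed' || job.status === 'failed' || job.status === 'cancelled') {
      return job;
    }
    await new Promise(r => setTimeout(r, 5000));
  }
  // The job keeps running server-side; we just stop waiting for it.
  throw new Error(`Job ${jobId} did not finish within ${maxWaitMs} ms`);
}
```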
Streaming

Use WebSocket streaming for real-time updates:

```typescript
const unsubscribe = client.optimization.subscribe('job_xxxx', {
  onEvent: (event) => {
    if (event.type === 'iteration_complete') {
      console.log(`Iteration ${event.iteration}: ${event.score.toFixed(2)}`);
    }
    if (event.type === 'new_best') {
      console.log(`New best! Score: ${event.score.toFixed(2)}`);
    }
  },
  onComplete: (results) => {
    console.log('Done! Final score:', results.finalScore);
  },
});
```

Call the returned unsubscribe function to stop receiving events. See Streaming for full details.

Best Practices

- Optimization is only as good as your test cases. Make sure your dataset is representative and well designed before optimizing.
- A target score of 1.0 is rarely achievable. Set targets based on your baseline and acceptable quality level.
- Optimization is stochastic, so running several jobs and comparing their results can yield better outcomes (see the sketch after this list).
- Always run a separate evaluation on the optimized prompt to verify that improvements hold on different test cases.
- Keep preserveVariables: true so the optimized prompt stays compatible with your application.
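
A hedged sketch of that multi-run pattern, using only the calls shown above (the helper name is ours, and `waitForJob` is the polling helper from Monitoring Progress):

```typescript
// Launch several identical jobs and keep the result with the highest final score.
async function optimizeBestOfN(promptId: string, datasetId: string, n = 3) {
  const jobs = await Promise.all(
    Array.from({ length: n }, () =>
      client.optimization.start({
        promptId,
        datasetId,
        config: { maxIterations: 10, targetScore: 0.9 },
      })
    )
  );

  // Wait for every run to reach a terminal state.
  const finished = await Promise.all(jobs.map(j => waitForJob(j.id)));
  const allResults = await Promise.all(finished.map(j => client.optimization.getResults(j.id)));

  return allResults.reduce((best, r) => (r.finalScore > best.finalScore ? r : best));
}
```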

Troubleshooting

**Job stuck in pending:** Check provider configuration and rate limits. Jobs queue when resources are constrained.

**Score not improving:** Try increasing the mutation strength or using a different strategy. The prompt may already be near optimal for the given dataset.

**Optimized score lower than the original:** This shouldn't happen, since optimization keeps the best variant. Check whether the dataset or evaluation configuration changed between runs.

**Job hits the timeout:** Increase the timeout config or use a smaller dataset for faster iterations.