Optimization Jobs
An optimization job runs multiple mutation-evaluation cycles to improve a prompt. This guide covers job configuration, lifecycle management, and result handling.
Creating a Job
```typescript
import { Mutagent } from '@mutagent/sdk';

const client = new Mutagent({ bearerAuth: 'sk_live_...' });

const job = await client.optimization.start({
  promptId: 'prompt_xxxx',
  datasetId: 'dataset_xxxx',
  config: {
    maxIterations: 10,
    targetScore: 0.9,
    mutationStrength: 0.5,
    evaluationMetrics: ['g_eval', 'semantic_similarity'],
  },
});

console.log('Job ID:', job.id);
console.log('Status:', job.status);
console.log('Initial Score:', job.currentScore);
```
Configuration Options
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxIterations | number | 10 | Maximum optimization cycles to run |
| targetScore | number | 0.95 | Stop early when this score is reached |
| mutationStrength | number | 0.5 | How much to vary prompts (0.0-1.0) |
| evaluationMetrics | string[] | All available | Metrics to optimize for |
| preserveVariables | boolean | true | Keep original variable names and structure |
| preserveStructure | boolean | false | Maintain overall prompt organization |
| minImprovement | number | 0.01 | Stop if improvement falls below this |
| timeout | number | 3600000 | Max runtime in milliseconds (1 hour default) |
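The numeric options above have bounded ranges, so invalid values can be caught before a job is submitted. The following is an illustrative client-side check; the `OptimizationConfig` shape mirrors the table and is not an SDK export:

```typescript
// Illustrative validation for the options above (not part of the SDK).
interface OptimizationConfig {
  maxIterations?: number;
  targetScore?: number;
  mutationStrength?: number;
  minImprovement?: number;
  timeout?: number;
}

// Returns a list of human-readable problems; empty means the config is valid.
function validateConfig(config: OptimizationConfig): string[] {
  const errors: string[] = [];
  if (config.maxIterations !== undefined && config.maxIterations < 1) {
    errors.push('maxIterations must be at least 1');
  }
  if (config.targetScore !== undefined && (config.targetScore <= 0 || config.targetScore > 1)) {
    errors.push('targetScore must be in (0, 1]');
  }
  if (config.mutationStrength !== undefined && (config.mutationStrength < 0 || config.mutationStrength > 1)) {
    errors.push('mutationStrength must be in [0.0, 1.0]');
  }
  if (config.timeout !== undefined && config.timeout <= 0) {
    errors.push('timeout must be a positive number of milliseconds');
  }
  return errors;
}
```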
Configuration Examples
Conservative optimization:
```typescript
config: {
  maxIterations: 20,
  mutationStrength: 0.2,
  minImprovement: 0.005,
  preserveStructure: true,
}
```
Aggressive optimization:
```typescript
config: {
  maxIterations: 10,
  mutationStrength: 0.8,
  targetScore: 0.95,
}
```
Metric-focused optimization:
```typescript
config: {
  maxIterations: 15,
  evaluationMetrics: ['g_eval'], // Focus on a single metric
  targetScore: 0.9,
}
```
Job States
Jobs progress through these states:
| State | Description | Transitions |
| --- | --- | --- |
| pending | Queued, waiting to start | -> running, cancelled |
| running | Actively optimizing | -> completed, paused, failed, cancelled |
| paused | Temporarily stopped | -> running, cancelled |
| completed | Successfully finished | Terminal |
| failed | Error occurred | Can retry |
| cancelled | Manually stopped | Terminal |
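The transitions in the table can be encoded as a small lookup, which is handy for client-side guards before calling pause, resume, or cancel. This is a sketch derived from the table, not an SDK export:

```typescript
type JobState = 'pending' | 'running' | 'paused' | 'completed' | 'failed' | 'cancelled';

// Allowed transitions, taken directly from the table above.
const TRANSITIONS: Record<JobState, JobState[]> = {
  pending: ['running', 'cancelled'],
  running: ['completed', 'paused', 'failed', 'cancelled'],
  paused: ['running', 'cancelled'],
  completed: [], // terminal
  failed: [],    // terminal (assumed: a retry starts a new job)
  cancelled: [], // terminal
};

function canTransition(from: JobState, to: JobState): boolean {
  return TRANSITIONS[from].includes(to);
}
```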
Managing Jobs
Check Status
```typescript
const job = await client.optimization.getStatus('job_xxxx');

console.log('Status:', job.status);
console.log('Iteration:', `${job.currentIteration}/${job.maxIterations}`);
console.log('Current Score:', job.currentScore.toFixed(2));
console.log('Best Score:', job.bestScore.toFixed(2));
console.log('Started:', job.startedAt);
console.log('Elapsed:', `${(Date.now() - job.startedAt.getTime()) / 1000}s`);
```
List Jobs
```typescript
// List jobs, optionally filtered by status
const jobs = await client.optimization.list({
  limit: 20,
  status: 'running',
});

jobs.data.forEach((job) => {
  console.log(`${job.id}: ${job.status} - Score: ${job.currentScore}`);
});
```
Pause a Job
Temporarily stop a running job (can be resumed later):
```typescript
await client.optimization.pause('job_xxxx');

// Check it paused
const job = await client.optimization.getStatus('job_xxxx');
console.log('Status:', job.status); // 'paused'
```
Pausing preserves the current best prompt and all progress. The job can be resumed from where it left off.
Resume a Job
Continue a paused job:
```typescript
await client.optimization.resume('job_xxxx');

const job = await client.optimization.getStatus('job_xxxx');
console.log('Job resumed from iteration', job.currentIteration);
```
Cancel a Job
Permanently stop a job (cannot be resumed):
```typescript
await client.optimization.cancel('job_xxxx');

// Results up to this point are still available
const results = await client.optimization.getResults('job_xxxx');
console.log('Best score achieved:', results.bestScore);
```
Cancellation is permanent. If you might want to continue later, use pause instead.
Getting Results
Retrieve detailed results when a job completes:
```typescript
const results = await client.optimization.getResults('job_xxxx');

console.log('=== Optimization Results ===\n');

// Original vs optimized
console.log('Original Prompt:');
console.log(results.originalPrompt.content);
console.log(`Score: ${results.originalScore.toFixed(2)}\n`);

console.log('Optimized Prompt:');
console.log(results.optimizedPrompt.content);
console.log(`Score: ${results.finalScore.toFixed(2)}\n`);

// Improvement summary
console.log('Summary:');
console.log(`  Improvement: +${(results.improvement * 100).toFixed(1)}%`);
console.log(`  Iterations: ${results.totalIterations}`);
console.log(`  Duration: ${results.duration / 1000}s`);
console.log(`  Variants tested: ${results.variantsTested}`);
```
Results Structure
```typescript
interface OptimizationResults {
  jobId: string;
  status: 'completed' | 'cancelled' | 'failed';

  // Prompts
  originalPrompt: {
    id: string;
    content: string;
    variables: Record<string, string>;
  };
  optimizedPrompt: {
    id: string;
    content: string;
    variables: Record<string, string>;
  };

  // Scores
  originalScore: number;
  finalScore: number;
  improvement: number; // finalScore - originalScore

  // Iteration history
  iterations: Array<{
    number: number;
    score: number;
    isNewBest: boolean;
    promptVariant: string;
  }>;

  // Metadata
  totalIterations: number;
  variantsTested: number;
  duration: number; // milliseconds
  startedAt: Date;
  completedAt: Date;
}
```
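Given the `iterations` array above, a small helper can recover when the best score was last found, which is useful for judging whether more iterations would have helped. This is an illustrative helper, not an SDK method:

```typescript
// Matches the element type of OptimizationResults.iterations above.
interface IterationRecord {
  number: number;
  score: number;
  isNewBest: boolean;
  promptVariant: string;
}

// Returns the iteration number at which the best score was recorded,
// or null if the history is empty or never improved.
function lastImprovementAt(iterations: IterationRecord[]): number | null {
  const best = iterations.filter((it) => it.isNewBest).pop();
  return best ? best.number : null;
}
```

If the last improvement happened well before the final iteration, the job likely converged; if it happened on the last iteration, a higher `maxIterations` may be worth trying.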
Applying Results
After optimization, create a new prompt version with the optimized content:
```typescript
const results = await client.optimization.getResults('job_xxxx');

// Only apply if there was meaningful improvement
if (results.improvement > 0.02) {
  // Create a new version with the optimized content
  const newVersion = await client.prompts.createVersion(results.originalPrompt.id, {
    content: results.optimizedPrompt.content,
    description: `Optimized. Score: ${results.originalScore.toFixed(2)} -> ${results.finalScore.toFixed(2)} (+${(results.improvement * 100).toFixed(1)}%)`,
  });
  console.log('Created optimized version:', newVersion.currentVersion);

  // Optionally run an evaluation to verify
  const verification = await client.evaluations.run({
    promptId: newVersion.id,
    datasetId: 'golden_dataset_xxxx',
    metrics: ['g_eval', 'semantic_similarity'],
  });
  console.log('Verification evaluation:', verification.id);
} else {
  console.log('Improvement too small, keeping original');
}
```
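The gating logic above can be factored into a pure helper so the threshold lives in one place. Both the helper and the 0.02 default are illustrative (the threshold comes from the example, not an SDK default):

```typescript
// Promote the optimized prompt only when the gain clears a minimum threshold.
function shouldApply(originalScore: number, finalScore: number, minGain = 0.02): boolean {
  return finalScore - originalScore > minGain;
}
```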
Monitoring Progress
Polling
Check status periodically:
```typescript
async function waitForJob(jobId: string) {
  while (true) {
    const job = await client.optimization.getStatus(jobId);
    console.log(`Iteration ${job.currentIteration}/${job.maxIterations}`);
    console.log(`Current: ${job.currentScore.toFixed(2)} | Best: ${job.bestScore.toFixed(2)}`);

    if (job.status === 'completed' || job.status === 'failed' || job.status === 'cancelled') {
      return job;
    }

    await new Promise((r) => setTimeout(r, 5000)); // Wait 5s
  }
}
```
Streaming (Recommended)
Use WebSocket streaming for real-time updates:
```typescript
const unsubscribe = client.optimization.subscribe('job_xxxx', {
  onEvent: (event) => {
    if (event.type === 'iteration_complete') {
      console.log(`Iteration ${event.iteration}: ${event.score.toFixed(2)}`);
    }
    if (event.type === 'new_best') {
      console.log(`New best! Score: ${event.score.toFixed(2)}`);
    }
  },
  onComplete: (results) => {
    console.log('Done! Final score:', results.finalScore);
  },
});
```
See Streaming for full details.
Best Practices
Start with a quality dataset
Optimization is only as good as your test cases. Ensure your dataset is representative and well-designed before optimizing.
Set a realistic target score
A target score of 1.0 is rarely achievable. Set targets based on your baseline and acceptable quality levels.
Run multiple jobs
Optimization is stochastic, so running multiple jobs and comparing results can yield better outcomes than a single run.
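As a sketch of the compare-multiple-runs approach, given results from several finished jobs you can pick the one with the highest final score. The field names follow the results structure above; the helper itself is illustrative:

```typescript
// Minimal slice of the per-job results needed for comparison.
interface JobOutcome {
  jobId: string;
  finalScore: number;
}

// Returns the outcome with the highest finalScore, or null for an empty list.
function bestOutcome(outcomes: JobOutcome[]): JobOutcome | null {
  return outcomes.reduce<JobOutcome | null>(
    (best, o) => (best === null || o.finalScore > best.finalScore ? o : best),
    null,
  );
}
```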
Verify before deploying
Always run a separate evaluation on the optimized prompt to verify that improvements hold on different test cases.
Preserve variable structure
Keep preserveVariables: true to ensure the optimized prompt remains compatible with your application.
Troubleshooting
Job stuck in pending
Check provider configuration and rate limits. Jobs queue when resources are constrained.
No improvement after many iterations
Try increasing mutation strength or using a different strategy. The prompt may be near optimal for the given dataset.
Score lower than a previous run
This shouldn't happen within a single job (optimization keeps the best variant). Check whether the dataset or evaluation configuration changed between runs.
Job hits the timeout
Increase the timeout config or use a smaller dataset for faster iterations.