Optimization Jobs
An optimization job runs multiple mutation-evaluation cycles to improve a prompt. This guide covers job configuration, lifecycle management, and result handling.
Creating a Job
```typescript
import { Mutagent } from '@mutagent/sdk';

const client = new Mutagent({ bearerAuth: 'sk_live_...' });

const job = await client.optimization.start({
  promptId: 'prompt_xxxx',
  datasetId: 'dataset_xxxx',
  config: {
    maxIterations: 10,
    targetScore: 0.9,
    mutationStrength: 0.5,
    evaluationMetrics: ['g_eval', 'semantic_similarity'],
  },
});

console.log('Job ID:', job.id);
console.log('Status:', job.status);
console.log('Initial Score:', job.currentScore);
```
Configuration Options
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxIterations | number | 10 | Maximum optimization cycles to run |
| targetScore | number | 0.95 | Stop early when this score is reached |
| mutationStrength | number | 0.5 | How much to vary prompts (0.0-1.0) |
| evaluationMetrics | string[] | All available | Metrics to optimize for |
| preserveVariables | boolean | true | Keep original variable names and structure |
| preserveStructure | boolean | false | Maintain overall prompt organization |
| minImprovement | number | 0.01 | Stop if improvement falls below this |
| timeout | number | 3600000 | Max runtime in milliseconds (1 hour default) |
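The numeric options above have bounded ranges, so invalid values can be caught before a job is submitted. The following is an illustrative client-side check; the `OptimizationConfig` shape mirrors the table and is not an SDK export:

```typescript
// Illustrative validation for the options above (not part of the SDK).
interface OptimizationConfig {
  maxIterations?: number;
  targetScore?: number;
  mutationStrength?: number;
  minImprovement?: number;
  timeout?: number;
}

// Returns a list of human-readable problems; empty means the config is valid.
function validateConfig(config: OptimizationConfig): string[] {
  const errors: string[] = [];
  if (config.maxIterations !== undefined && config.maxIterations < 1) {
    errors.push('maxIterations must be at least 1');
  }
  if (config.targetScore !== undefined && (config.targetScore <= 0 || config.targetScore > 1)) {
    errors.push('targetScore must be in (0, 1]');
  }
  if (config.mutationStrength !== undefined && (config.mutationStrength < 0 || config.mutationStrength > 1)) {
    errors.push('mutationStrength must be in [0.0, 1.0]');
  }
  if (config.timeout !== undefined && config.timeout <= 0) {
    errors.push('timeout must be a positive number of milliseconds');
  }
  return errors;
}
```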
Configuration Examples
Conservative optimization:
```typescript
config: {
  maxIterations: 20,
  mutationStrength: 0.2,
  minImprovement: 0.005,
  preserveStructure: true,
}
```
Aggressive optimization:
```typescript
config: {
  maxIterations: 10,
  mutationStrength: 0.8,
  targetScore: 0.95,
}
```
Metric-focused optimization:
```typescript
config: {
  maxIterations: 15,
  evaluationMetrics: ['g_eval'], // Focus on a single metric
  targetScore: 0.9,
}
```
Job States
Jobs progress through these states:
| State | Description | Transitions |
| --- | --- | --- |
| pending | Queued, waiting to start | -> running, cancelled |
| running | Actively optimizing | -> completed, paused, failed, cancelled |
| paused | Temporarily stopped | -> running, cancelled |
| completed | Successfully finished | Terminal |
| failed | Error occurred | Can retry |
| cancelled | Manually stopped | Terminal |
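The transitions in the table can be encoded as a small lookup, which is handy for client-side guards before calling pause, resume, or cancel. This is a sketch derived from the table, not an SDK export:

```typescript
type JobState = 'pending' | 'running' | 'paused' | 'completed' | 'failed' | 'cancelled';

// Allowed transitions, taken directly from the table above.
const TRANSITIONS: Record<JobState, JobState[]> = {
  pending: ['running', 'cancelled'],
  running: ['completed', 'paused', 'failed', 'cancelled'],
  paused: ['running', 'cancelled'],
  completed: [], // terminal
  failed: [],    // terminal (assumed: a retry starts a new job)
  cancelled: [], // terminal
};

function canTransition(from: JobState, to: JobState): boolean {
  return TRANSITIONS[from].includes(to);
}
```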
Managing Jobs
Check Status
```typescript
const job = await client.optimization.getStatus('job_xxxx');

console.log('Status:', job.status);
console.log('Iteration:', `${job.currentIteration}/${job.maxIterations}`);
console.log('Current Score:', job.currentScore.toFixed(2));
console.log('Best Score:', job.bestScore.toFixed(2));
console.log('Started:', job.startedAt);
console.log('Elapsed:', `${(Date.now() - job.startedAt.getTime()) / 1000}s`);
```
List Jobs
```typescript
// List jobs, optionally filtered by status
const jobs = await client.optimization.list({
  limit: 20,
  status: 'running',
});

jobs.data.forEach((job) => {
  console.log(`${job.id}: ${job.status} - Score: ${job.currentScore}`);
});
```
Pause a Job
Temporarily stop a running job (can be resumed later):
```typescript
await client.optimization.pause('job_xxxx');

// Check it paused
const job = await client.optimization.getStatus('job_xxxx');
console.log('Status:', job.status); // 'paused'
```
Pausing preserves the current best prompt and all progress. The job can be resumed from where it left off.
Resume a Job
Continue a paused job:
```typescript
await client.optimization.resume('job_xxxx');

const job = await client.optimization.getStatus('job_xxxx');
console.log('Job resumed from iteration', job.currentIteration);
```
Cancel a Job
Permanently stop a job (cannot be resumed):
```typescript
await client.optimization.cancel('job_xxxx');

// Results up to this point are still available
const results = await client.optimization.getResults('job_xxxx');
console.log('Best score achieved:', results.bestScore);
```
Cancellation is permanent. If you might want to continue later, use pause instead.
Getting Results
Retrieve detailed results when a job completes:
```typescript
const results = await client.optimization.getResults('job_xxxx');

console.log('=== Optimization Results ===\n');

// Original vs optimized
console.log('Original Prompt:');
console.log(results.originalPrompt.content);
console.log(`Score: ${results.originalScore.toFixed(2)}\n`);

console.log('Optimized Prompt:');
console.log(results.optimizedPrompt.content);
console.log(`Score: ${results.finalScore.toFixed(2)}\n`);

// Improvement summary
console.log('Summary:');
console.log(`  Improvement: +${(results.improvement * 100).toFixed(1)}%`);
console.log(`  Iterations: ${results.totalIterations}`);
console.log(`  Duration: ${results.duration / 1000}s`);
console.log(`  Variants tested: ${results.variantsTested}`);
```
Results Structure
```typescript
interface OptimizationResults {
  jobId: string;
  status: 'completed' | 'cancelled' | 'failed';

  // Prompts
  originalPrompt: {
    id: string;
    content: string;
    variables: Record<string, string>;
  };
  optimizedPrompt: {
    id: string;
    content: string;
    variables: Record<string, string>;
  };

  // Scores
  originalScore: number;
  finalScore: number;
  improvement: number; // finalScore - originalScore

  // Iteration history
  iterations: Array<{
    number: number;
    score: number;
    isNewBest: boolean;
    promptVariant: string;
  }>;

  // Metadata
  totalIterations: number;
  variantsTested: number;
  duration: number; // milliseconds
  startedAt: Date;
  completedAt: Date;
}
```
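Given the `iterations` array above, a small helper can recover when the best score was last found, which is useful for judging whether more iterations would have helped. This is an illustrative helper, not an SDK method:

```typescript
// Matches the element type of OptimizationResults.iterations above.
interface IterationRecord {
  number: number;
  score: number;
  isNewBest: boolean;
  promptVariant: string;
}

// Returns the iteration number at which the best score was recorded,
// or null if the history is empty or never improved.
function lastImprovementAt(iterations: IterationRecord[]): number | null {
  const best = iterations.filter((it) => it.isNewBest).pop();
  return best ? best.number : null;
}
```

If the last improvement happened well before the final iteration, the job likely converged; if it happened on the last iteration, a higher `maxIterations` may be worth trying.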
Applying Results
After optimization, create a new prompt version with the optimized content:
```typescript
const results = await client.optimization.getResults('job_xxxx');

// Only apply if there was meaningful improvement
if (results.improvement > 0.02) {
  // Create a new version with the optimized content
  const newVersion = await client.prompts.createVersion(results.originalPrompt.id, {
    content: results.optimizedPrompt.content,
    description: `Optimized. Score: ${results.originalScore.toFixed(2)} -> ${results.finalScore.toFixed(2)} (+${(results.improvement * 100).toFixed(1)}%)`,
  });
  console.log('Created optimized version:', newVersion.currentVersion);

  // Optionally run an evaluation to verify
  const verification = await client.evaluations.run({
    promptId: newVersion.id,
    datasetId: 'golden_dataset_xxxx',
    metrics: ['g_eval', 'semantic_similarity'],
  });
  console.log('Verification evaluation:', verification.id);
} else {
  console.log('Improvement too small, keeping original');
}
```
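The gating logic above can be factored into a pure helper so the threshold lives in one place. Both the helper and the 0.02 default are illustrative (the threshold comes from the example, not an SDK default):

```typescript
// Promote the optimized prompt only when the gain clears a minimum threshold.
function shouldApply(originalScore: number, finalScore: number, minGain = 0.02): boolean {
  return finalScore - originalScore > minGain;
}
```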
Monitoring Progress
Polling
Check status periodically:
```typescript
async function waitForJob(jobId: string) {
  while (true) {
    const job = await client.optimization.getStatus(jobId);
    console.log(`Iteration ${job.currentIteration}/${job.maxIterations}`);
    console.log(`Current: ${job.currentScore.toFixed(2)} | Best: ${job.bestScore.toFixed(2)}`);

    if (job.status === 'completed' || job.status === 'failed' || job.status === 'cancelled') {
      return job;
    }

    await new Promise((r) => setTimeout(r, 5000)); // Wait 5s
  }
}
```
Streaming (Recommended)
Use WebSocket streaming for real-time updates:
```typescript
const unsubscribe = client.optimization.subscribe('job_xxxx', {
  onEvent: (event) => {
    if (event.type === 'iteration_complete') {
      console.log(`Iteration ${event.iteration}: ${event.score.toFixed(2)}`);
    }
    if (event.type === 'new_best') {
      console.log(`New best! Score: ${event.score.toFixed(2)}`);
    }
  },
  onComplete: (results) => {
    console.log('Done! Final score:', results.finalScore);
  },
});
```
See Streaming for full details.
Best Practices
Start with a quality dataset
Optimization is only as good as your test cases. Ensure your dataset is representative and well-designed before optimizing.
Set a realistic target score
A target score of 1.0 is rarely achievable. Set targets based on your baseline and acceptable quality levels.
Run multiple jobs
Optimization is stochastic, so running multiple jobs and comparing results can yield better outcomes than a single run.
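As a sketch of the compare-multiple-runs approach, given results from several finished jobs you can pick the one with the highest final score. The field names follow the results structure above; the helper itself is illustrative:

```typescript
// Minimal slice of the per-job results needed for comparison.
interface JobOutcome {
  jobId: string;
  finalScore: number;
}

// Returns the outcome with the highest finalScore, or null for an empty list.
function bestOutcome(outcomes: JobOutcome[]): JobOutcome | null {
  return outcomes.reduce<JobOutcome | null>(
    (best, o) => (best === null || o.finalScore > best.finalScore ? o : best),
    null,
  );
}
```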
Verify before deploying
Always run a separate evaluation on the optimized prompt to verify that improvements hold on different test cases.
Preserve variable structure
Keep preserveVariables: true to ensure the optimized prompt remains compatible with your application.
Troubleshooting
Job stuck in pending
Check provider configuration and rate limits. Jobs queue when resources are constrained.
No improvement after many iterations
Try increasing mutation strength or using a different strategy. The prompt may be near optimal for the given dataset.
Score lower than a previous run
This shouldn't happen within a single job (optimization keeps the best variant). Check whether the dataset or evaluation configuration changed between runs.
Job hits the timeout
Increase the timeout config or use a smaller dataset for faster iterations.