
Creating Datasets

Build comprehensive datasets to evaluate and optimize your prompts. This guide covers all methods for creating and populating datasets.

Create Empty Dataset

Start by creating a dataset shell, then populate it with items:
const dataset = await client.datasets.create({
  promptId: 'prompt_xxxx',
  name: 'Customer Support Test Cases',
  description: 'Common support questions and expected answers for Q1 2024',
});

console.log('Dataset ID:', dataset.id);
console.log('Name:', dataset.name);
console.log('Items:', dataset.itemCount); // 0

Add Items One by One

Add individual test cases when you need precise control:
// Add a single item
const item = await client.datasets.addItem(dataset.id, {
  input: {
    customer_name: 'John',
    question: 'How do I cancel my subscription?',
  },
  expectedOutput: 'To cancel your subscription, go to Account > Subscription > Cancel. Your access will continue until the end of your billing period.',
  metadata: {
    category: 'billing',
    difficulty: 'easy',
    source: 'support_tickets',
  },
});

console.log('Added item:', item.id);

Bulk Add Items

Add multiple items at once for efficiency:
// Add multiple items in one call
await client.datasets.addItems(dataset.id, [
  {
    input: { question: 'What is the pricing?' },
    expectedOutput: 'Our plans start at $9/month for Basic, $29/month for Pro, and custom pricing for Enterprise.',
    metadata: { category: 'pricing' },
  },
  {
    input: { question: 'Is there a free trial?' },
    expectedOutput: 'Yes, we offer a 14-day free trial with full access to all Pro features. No credit card required.',
    metadata: { category: 'pricing' },
  },
  {
    input: { question: 'How do I contact support?' },
    expectedOutput: 'You can reach our support team via email at [email protected], live chat on our website, or phone at 1-800-EXAMPLE.',
    metadata: { category: 'support' },
  },
  {
    input: { question: 'What payment methods do you accept?' },
    expectedOutput: 'We accept all major credit cards (Visa, MasterCard, American Express), PayPal, and bank transfers for Enterprise plans.',
    metadata: { category: 'billing' },
  },
]);

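// `dataset` is the object returned by create() earlier, so its itemCount
// is likely stale at this point; log what was just added instead.
console.log('Added 4 items to dataset', dataset.id);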
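For larger imports it can help to split the array into batches before calling `addItems`. This guide does not state a maximum batch size, so the sketch below treats the size as a caller-supplied assumption; `chunk` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper (not part of the SDK): split an array into
// consecutive batches of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (Math.floor(size) !== size || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}
```

Each batch could then be passed to `client.datasets.addItems(dataset.id, batch)` in a loop, with a batch size chosen to suit your own limits.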

Import from JSON

Import datasets from JSON files for easy migration and backup restoration:
import fs from 'fs';

// Load items from a JSON file
const items = JSON.parse(
  fs.readFileSync('test-cases.json', 'utf-8')
);

// Expected JSON format:
// [
//   {
//     "input": { "question": "..." },
//     "expectedOutput": "...",
//     "metadata": { ... }
//   },
//   ...
// ]

await client.datasets.addItems(dataset.id, items);

console.log(`Imported ${items.length} items`);

JSON File Format

[
  {
    "input": {
      "customer_name": "Alice",
      "question": "How do I upgrade my account?"
    },
    "expectedOutput": "Visit Account Settings and click Upgrade Plan.",
    "metadata": {
      "category": "account",
      "priority": "high"
    }
  },
  {
    "input": {
      "customer_name": "Bob",
      "question": "Can I get a refund?"
    },
    "expectedOutput": "Refunds are available within 30 days of purchase.",
    "metadata": {
      "category": "billing",
      "priority": "medium"
    }
  }
]
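Before uploading a parsed JSON file, it may be worth checking that every entry matches the format above. The following is a minimal sketch of such a check; `findItemProblems` and `isPlainObject` are hypothetical helpers, and the rules (object `input`, optional string `expectedOutput`, optional object `metadata`) are inferred from the examples in this guide rather than from a published schema.

```typescript
// True for plain (non-null, non-array) objects.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means every entry looks like a valid dataset item.
function findItemProblems(raw: unknown): string[] {
  if (!Array.isArray(raw)) {
    return ['Top-level JSON value must be an array of items'];
  }
  const problems: string[] = [];
  raw.forEach((entry: unknown, index: number) => {
    if (!isPlainObject(entry)) {
      problems.push(`Item ${index}: must be an object`);
      return;
    }
    if (!isPlainObject(entry['input'])) {
      problems.push(`Item ${index}: "input" must be an object`);
    }
    const expected = entry['expectedOutput'];
    if (expected !== undefined && typeof expected !== 'string') {
      problems.push(`Item ${index}: "expectedOutput" must be a string`);
    }
    const metadata = entry['metadata'];
    if (metadata !== undefined && !isPlainObject(metadata)) {
      problems.push(`Item ${index}: "metadata" must be an object`);
    }
  });
  return problems;
}
```

Running this on the result of `JSON.parse` and aborting when the list is non-empty keeps malformed entries out of the dataset.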

Import from CSV

Convert CSV data to dataset items:
import fs from 'fs';
import { parse } from 'csv-parse/sync';

// Load and parse CSV
const csvContent = fs.readFileSync('test-cases.csv', 'utf-8');
const records = parse(csvContent, {
  columns: true,
  skip_empty_lines: true,
});

// Transform to dataset items
const items = records.map((row: any) => ({
  input: {
    question: row.question,
    customer_name: row.customer_name || 'Guest',
  },
  expectedOutput: row.expected_answer,
  metadata: {
    category: row.category,
    source: 'csv_import',
  },
}));

await client.datasets.addItems(dataset.id, items);
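CSV exports often contain rows with blank cells, which the mapping above would turn into items with an undefined `question` or `expectedOutput`. One possible guard, sketched here under the assumption that `question` and `expected_answer` are the columns you consider mandatory, is to separate usable rows from rejected ones first. `checkRows` is a hypothetical helper.

```typescript
type CsvRow = Record<string, string | undefined>;

interface RowCheck {
  valid: CsvRow[];
  // `record` is the 0-based index into the parsed records array.
  rejected: Array<{ record: number; reason: string }>;
}

// Hypothetical helper: keep rows whose required columns are non-blank.
function checkRows(
  rows: CsvRow[],
  required: string[] = ['question', 'expected_answer']
): RowCheck {
  const result: RowCheck = { valid: [], rejected: [] };
  rows.forEach((row, index) => {
    const missing = required.filter((col) => (row[col] ?? '').trim() === '');
    if (missing.length > 0) {
      result.rejected.push({ record: index, reason: `missing ${missing.join(', ')}` });
    } else {
      result.valid.push(row);
    }
  });
  return result;
}
```

Mapping only `checkRows(records).valid` to dataset items, and reviewing the `rejected` list by hand, avoids importing incomplete test cases.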

Clone Existing Dataset

Create a copy of an existing dataset for iteration or A/B testing:
// Clone with a new name
const cloned = await client.datasets.clone(originalDatasetId, {
  name: 'Support Cases v2',
  description: 'Updated version with additional edge cases',
});

console.log('Cloned dataset:', cloned.id);
console.log('Items copied:', cloned.itemCount);

// Now add new items to the clone
await client.datasets.addItems(cloned.id, [
  {
    input: { question: 'New edge case question' },
    expectedOutput: 'Expected response for edge case',
  },
]);

Export Dataset

Export datasets for backup, sharing, or analysis:
import fs from 'fs';

// Export to JSON
const data = await client.datasets.export(dataset.id);

// Save to file
fs.writeFileSync(
  'dataset-backup.json',
  JSON.stringify(data, null, 2)
);

console.log('Exported', data.items.length, 'items');

Export Format

interface DatasetExport {
  dataset: {
    id: string;
    name: string;
    description: string;
    promptId: string;
    createdAt: string;
  };
  items: Array<{
    input: Record<string, any>;
    expectedOutput?: string;
    metadata?: Record<string, any>;
  }>;
}
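As an illustration of the analysis use case, the exported `items` array can be summarized locally. The sketch below counts items per value of one metadata key; `countByMetadata` is a hypothetical helper, and the `(none)` label for items lacking the key is an arbitrary choice.

```typescript
interface ExportedItem {
  input: Record<string, unknown>;
  expectedOutput?: string;
  metadata?: Record<string, unknown>;
}

// Hypothetical helper: count items per value of a metadata key.
function countByMetadata(items: ExportedItem[], key: string): Record<string, number> {
  // A prototype-free object, so labels such as "constructor" are safe keys.
  const counts = Object.create(null) as Record<string, number>;
  for (const item of items) {
    const raw = item.metadata ? item.metadata[key] : undefined;
    const label = raw === undefined || raw === null ? '(none)' : String(raw);
    counts[label] = (counts[label] ?? 0) + 1;
  }
  return counts;
}
```

Calling `countByMetadata(data.items, 'category')` on an export gives a quick view of how evenly the categories are covered.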

Update Dataset Items

Modify existing items when requirements change:
// Update a single item
await client.datasets.updateItem(datasetId, itemId, {
  expectedOutput: 'Updated expected response with more detail.',
  metadata: {
    lastReviewed: new Date().toISOString(),
    reviewedBy: 'team-lead',
  },
});

// Delete an item
await client.datasets.deleteItem(datasetId, itemId);
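This guide does not say whether `updateItem` merges the `metadata` object with the stored one or replaces it. If it replaces it, existing keys such as `category` would be lost by the update above. A defensive sketch, assuming you already hold the item's current metadata, is to merge locally before sending; `mergeMetadata` is a hypothetical helper.

```typescript
type Metadata = Record<string, unknown>;

// Hypothetical helper: shallow-merge changes over existing metadata
// without mutating either argument. Keys in `changes` win.
function mergeMetadata(existing: Metadata | undefined, changes: Metadata): Metadata {
  return { ...(existing ?? {}), ...changes };
}
```

The merged object can then be passed as the `metadata` field of the update payload.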

List Dataset Items

Retrieve items with filtering and pagination:
// List the first page of items (up to 50)
const items = await client.datasets.listItems(dataset.id, {
  limit: 50,
  offset: 0,
});

// Filter by metadata (if supported)
const billingItems = await client.datasets.listItems(dataset.id, {
  filter: { 'metadata.category': 'billing' },
});

items.data.forEach(item => {
  console.log('Input:', item.input);
  console.log('Expected:', item.expectedOutput);
});

Dataset Statistics

Get insights about your dataset:
const stats = await client.datasets.getStats(dataset.id);

console.log('Total items:', stats.itemCount);
console.log('Items with expected output:', stats.withExpectedOutput);
console.log('Items without expected output:', stats.withoutExpectedOutput);
console.log('Average input length:', stats.avgInputLength);
console.log('Created:', stats.createdAt);
console.log('Last updated:', stats.updatedAt);

Best Practices for Creation

Begin with 10-20 high-quality items and expand based on evaluation results.
Ensure all items use the same variable names and data types as your prompt.
Add categories, difficulty levels, and sources for filtering and analysis.
Have domain experts validate expected outputs before using for evaluation.
Datasets are associated with a specific prompt. If your prompt’s variables change, you may need to update your dataset items accordingly.
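The variable-name practice can be checked mechanically. The sketch below compares each item's `input` keys against the list of variables your prompt expects and reports any that are missing or unexpected. `findInputMismatches` is a hypothetical helper, and the list of prompt variables is something you supply yourself.

```typescript
interface InputMismatch {
  index: number;        // position of the item in the array
  missing: string[];    // prompt variables the item does not provide
  unexpected: string[]; // input keys the prompt does not use
}

// Hypothetical helper: report items whose input keys differ from
// the prompt's variable names.
function findInputMismatches(
  inputs: Array<Record<string, unknown>>,
  promptVariables: string[]
): InputMismatch[] {
  const mismatches: InputMismatch[] = [];
  inputs.forEach((input, index) => {
    const keys = Object.keys(input);
    const missing = promptVariables.filter((name) => keys.indexOf(name) === -1);
    const unexpected = keys.filter((name) => promptVariables.indexOf(name) === -1);
    if (missing.length > 0 || unexpected.length > 0) {
      mismatches.push({ index, missing, unexpected });
    }
  });
  return mismatches;
}
```

Running this over `items.map((item) => item.input)` after a prompt's variables change shows which items need updating.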