# Datasets

> Manage datasets with the Python SDK

# Datasets SDK

Create and manage evaluation datasets. Datasets are collections of test cases used for prompt evaluation, each containing input variables and expected outputs.

## Dataset Architecture

<Mermaid>
  flowchart TD
  P[Prompt] -->|prompt_group_id| D1[Dataset A]
  P -->|prompt_group_id| D2[Dataset B]
  D1 --> I1[Item 1]
  D1 --> I2[Item 2]
  D1 --> I3[Item N...]
  D2 --> I4[Item 1]
  D2 --> I5[Item 2]
</Mermaid>

Datasets are scoped to a **prompt group** (all versions of a prompt share the same `prompt_group_id`). A dataset created for prompt v1.0.0 is automatically available when testing v2.0.0.
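To make the scoping concrete, the sketch below groups dataset records by their `promptGroupId` field, the key used in the listing examples on this page. `group_by_prompt_group` is a hypothetical helper, not part of the SDK, and it assumes each record is a dict carrying `name` and `promptGroupId`.

```python
from collections import defaultdict
from typing import Any


def group_by_prompt_group(datasets: list[dict[str, Any]]) -> dict[Any, list[str]]:
    """Map each prompt group ID to the names of the datasets scoped to it.

    Hypothetical helper: assumes every record has "name" and "promptGroupId".
    """
    groups: dict[Any, list[str]] = defaultdict(list)
    for dataset in datasets:
        groups[dataset["promptGroupId"]].append(dataset["name"])
    return dict(groups)
```

Every prompt version in a group would see the same list of dataset names under that group's key.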

## List All Datasets

```python theme={null}
from mutagent import Mutagent

with Mutagent() as client:
    result = client.prompt_datasets.list_prompt_datasets(limit=20, offset=0)
    for d in result.get("data", []):
        print(d["name"], d["promptGroupId"])
```
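The `limit` and `offset` parameters suggest offset-based pagination. Below is a minimal sketch of walking every page. It assumes each response is a dict with a `data` list and that a page shorter than `limit` marks the end; neither is confirmed on this page. `iter_all_datasets` is a hypothetical helper, and `list_page` is any callable that accepts `limit` and `offset` keyword arguments.

```python
from typing import Any, Callable, Iterator


def iter_all_datasets(
    list_page: Callable[..., dict],
    limit: int = 20,
) -> Iterator[dict[str, Any]]:
    """Yield every dataset by requesting consecutive pages.

    Assumption: each response is a dict with a "data" list, and a page
    shorter than `limit` means there are no more results.
    """
    offset = 0
    while True:
        page = list_page(limit=limit, offset=offset).get("data", [])
        yield from page
        if len(page) < limit:
            break
        offset += limit
```

Under those assumptions, it could be called as `iter_all_datasets(client.prompt_datasets.list_prompt_datasets)` inside the `with Mutagent() as client:` block.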

Filter by `prompt_id`, `prompt_group_id`, `name`, or `created_by`:

```python theme={null}
with Mutagent() as client:
    result = client.prompt_datasets.list_prompt_datasets(
        prompt_id=123,
        name="baseline",
    )
```

## List Datasets for a Prompt

```python theme={null}
with Mutagent() as client:
    datasets = client.prompt_datasets.list_datasets_for_prompt(id_=123)
    for d in datasets:
        print(d["name"])
```

## Create Dataset

Create a dataset for a specific prompt by passing the prompt ID as `id_`:

```python theme={null}
from mutagent.models import NameDescriptionMetadata2

with Mutagent() as client:
    dataset = client.prompt_datasets.create_prompt_dataset(
        id_=123,  # Prompt ID
        body=NameDescriptionMetadata2(
            name="Support Scenarios",
            description="Common customer support scenarios",
            labels=["baseline", "v1"],
        ),
    )
    print("Created dataset:", dataset["id"])
```

Source locations:

* `mutagent-sdk-python/src/mutagent/prompt_datasets.py`: `PromptDatasets.create_prompt_dataset`
* `mutagent-sdk-python/src/mutagent/models/name_description_metadata2.py`: `NameDescriptionMetadata2`

### `NameDescriptionMetadata2` fields

| Field         | Type        | Required | Description           |
| ------------- | ----------- | -------- | --------------------- |
| `name`        | `str`       | Yes      | Dataset name          |
| `description` | `str`       | No       | Dataset description   |
| `metadata`    | `Any`       | No       | Arbitrary metadata    |
| `labels`      | `list[str]` | No       | Classification labels |

## Get Dataset

```python theme={null}
with Mutagent() as client:
    dataset = client.prompt_datasets.get_prompt_dataset(id_=456)
    print(dataset["name"], dataset["promptGroupId"])
```

## Update Dataset

```python theme={null}
from mutagent.models import NameDescriptionMetadata

with Mutagent() as client:
    updated = client.prompt_datasets.update_prompt_dataset(
        id_=456,
        body=NameDescriptionMetadata(
            name="Updated Name",
            description="New description",
        ),
    )
```

## Delete Dataset

Use `force=True` to delete a dataset that still contains items:

```python theme={null}
with Mutagent() as client:
    client.prompt_datasets.delete_prompt_dataset(id_=456, force=True)
```

## Clone Dataset

Clone a dataset to another prompt. Provide either `target_prompt_id` or `target_prompt_group_id`:

```python theme={null}
from mutagent.models import TargetPromptIdTargetPromptGroupIdNewName

with Mutagent() as client:
    cloned = client.prompt_datasets.clone_prompt_dataset(
        id_=456,
        body=TargetPromptIdTargetPromptGroupIdNewName(
            target_prompt_id=789,
            new_name="Support Scenarios (Copy)",
        ),
    )
```
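Because the clone body takes either `target_prompt_id` or `target_prompt_group_id`, it can help to validate the choice before calling the API. The sketch below uses a hypothetical helper, `clone_target_fields`, that builds the keyword arguments for the body. Rejecting the case where both targets are supplied is an assumption made here, since this page does not say how the API handles it.

```python
from typing import Any, Optional


def clone_target_fields(
    target_prompt_id: Optional[int] = None,
    target_prompt_group_id: Optional[Any] = None,
    new_name: Optional[str] = None,
) -> dict[str, Any]:
    """Build clone-body keyword arguments with exactly one target set.

    Hypothetical helper: rejecting "both targets" is an assumption,
    not documented API behavior.
    """
    if (target_prompt_id is None) == (target_prompt_group_id is None):
        raise ValueError(
            "Provide exactly one of target_prompt_id or target_prompt_group_id"
        )
    fields: dict[str, Any] = {}
    if target_prompt_id is not None:
        fields["target_prompt_id"] = target_prompt_id
    else:
        fields["target_prompt_group_id"] = target_prompt_group_id
    if new_name is not None:
        fields["new_name"] = new_name
    return fields
```

The result could then be unpacked into the model, for example `TargetPromptIdTargetPromptGroupIdNewName(**clone_target_fields(target_prompt_id=789))`, assuming the model accepts these keyword names.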

## Export Dataset

Export a dataset with all items and metadata:

```python theme={null}
with Mutagent() as client:
    exported = client.prompt_datasets.export_prompt_dataset(id_=456)
    print("Dataset:", exported["dataset"]["name"])
    print("Items:", len(exported.get("items", [])))
```
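An export is often saved to disk as a backup or for review. The sketch below assumes the exported value is a JSON-serializable dict with an optional `items` list, as the example above implies. `save_export` is a hypothetical helper, not an SDK function.

```python
import json
from pathlib import Path


def save_export(exported: dict, path: str) -> int:
    """Write an exported dataset to a JSON file and return its item count.

    Assumption: `exported` is JSON-serializable and may contain "items".
    """
    Path(path).write_text(
        json.dumps(exported, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return len(exported.get("items", []))
```

For example, `save_export(exported, "support-scenarios.json")` would write the file and return the number of items written.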

***

## Dataset Items

Dataset items are managed via the `prompt_dataset_items` namespace.

### List Items

```python theme={null}
with Mutagent() as client:
    items = client.prompt_dataset_items.list_prompt_dataset_items(id_=456)
    for item in items:
        print("Input:", item["input"])
        print("Expected:", item["expectedOutput"])
```

### Add Single Item

```python theme={null}
from mutagent.models import NameInputExpectedOutput

with Mutagent() as client:
    item = client.prompt_dataset_items.create_prompt_dataset_item(
        id_=456,  # Dataset ID
        body=NameInputExpectedOutput(
            input={"company": "Acme Inc", "question": "How do I reset my password?"},
            expected_output={"answer": "To reset your password, go to Settings > Security..."},
            name="Password Reset",
        ),
    )
```

### Bulk Add Items

```python theme={null}
with Mutagent() as client:
    result = client.prompt_dataset_items.bulk_create_prompt_dataset_items(
        id_=456,  # Dataset ID
        body={
            "items": [
                {
                    "input": {"company": "Acme", "question": "Pricing?"},
                    "expectedOutput": {"answer": "Our pricing starts at..."},
                    "name": "Pricing Inquiry",
                },
                {
                    "input": {"company": "Acme", "question": "Refund policy?"},
                    "expectedOutput": {"answer": "We offer 30-day refunds..."},
                    "name": "Refund Question",
                },
            ]
        },
    )
    print(f"Added {len(result)} items")
```
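For large item lists, splitting the payload into smaller batches keeps each request modest. This page does not state a maximum batch size, so the size is entirely up to the caller. `chunked` is a hypothetical helper.

```python
from typing import Iterator, Sequence


def chunked(items: Sequence[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive batches of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each batch could then be sent with `bulk_create_prompt_dataset_items(id_=456, body={"items": batch})` inside a loop over `chunked(items, 100)`, where 100 is an arbitrary choice.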

### Get Item

```python theme={null}
with Mutagent() as client:
    item = client.prompt_dataset_items.get_prompt_dataset_item(id_=789)  # Item ID
```

### Update Item

```python theme={null}
from mutagent.models import NameInputExpectedOutput2

with Mutagent() as client:
    client.prompt_dataset_items.update_prompt_dataset_item(
        id_=789,  # Item ID
        body=NameInputExpectedOutput2(
            expected_output={"answer": "Updated expected output..."},
        ),
    )
```

### Delete Item

```python theme={null}
with Mutagent() as client:
    client.prompt_dataset_items.delete_prompt_dataset_item(id_=789)  # Item ID
```

***

## Two-Step Upload Pattern

To upload a dataset from a file (JSON, JSONL, or CSV), first create the dataset, then bulk-insert the parsed items:

```python theme={null}
import json
from mutagent import Mutagent
from mutagent.models import NameDescriptionMetadata2

prompt_id = 123  # ID of the prompt the dataset is created for

with Mutagent() as client:
    # Step 1: Create dataset metadata
    dataset = client.prompt_datasets.create_prompt_dataset(
        id_=prompt_id,
        body=NameDescriptionMetadata2(name="Imported Dataset"),
    )

    # Step 2: Parse file and bulk insert items
    with open("test-cases.json") as f:
        raw = json.load(f)

    items = [
        {
            "input": row["input"],
            "expectedOutput": row["expectedOutput"],
            "name": row.get("name"),
        }
        for row in raw
    ]

    client.prompt_dataset_items.bulk_create_prompt_dataset_items(
        id_=dataset["id"],
        body={"items": items},
    )
```
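The example above reads only JSON. The sketch below extends the parsing step to JSONL and CSV using the standard library. `load_items` is a hypothetical helper, and the file layouts are assumptions: JSON holds a list of objects, JSONL holds one object per line, and CSV has `input` and `expectedOutput` columns containing JSON-encoded objects plus an optional `name` column.

```python
import csv
import json
from pathlib import Path


def load_items(path: str) -> list[dict]:
    """Parse a .json, .jsonl, or .csv file into bulk-insert item dicts.

    Assumed layouts (not defined by the SDK): JSON is a list of objects,
    JSONL has one object per line, and CSV stores JSON-encoded objects in
    its "input" and "expectedOutput" columns.
    """
    file = Path(path)
    suffix = file.suffix.lower()
    if suffix == ".json":
        rows = json.loads(file.read_text(encoding="utf-8"))
    elif suffix == ".jsonl":
        lines = file.read_text(encoding="utf-8").splitlines()
        rows = [json.loads(line) for line in lines if line.strip()]
    elif suffix == ".csv":
        with file.open(newline="", encoding="utf-8") as f:
            rows = [
                {
                    "input": json.loads(r["input"]),
                    "expectedOutput": json.loads(r["expectedOutput"]),
                    "name": r.get("name") or None,  # empty cell becomes None
                }
                for r in csv.DictReader(f)
            ]
    else:
        raise ValueError(f"Unsupported file type: {suffix}")

    # Keep only the fields used by the bulk-insert examples on this page.
    return [
        {
            "input": row["input"],
            "expectedOutput": row["expectedOutput"],
            "name": row.get("name"),
        }
        for row in rows
    ]
```

The returned list can be passed as `body={"items": load_items("test-cases.jsonl")}` in Step 2.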

## Method Reference

### Dataset Methods (`client.prompt_datasets`)

| Method                               | Description                     |
| ------------------------------------ | ------------------------------- |
| `list_prompt_datasets(...)`          | List all datasets with filters  |
| `list_datasets_for_prompt(id_)`      | List datasets for a prompt      |
| `create_prompt_dataset(id_, body)`   | Create dataset for a prompt     |
| `get_prompt_dataset(id_)`            | Get dataset by ID               |
| `update_prompt_dataset(id_, body)`   | Update dataset                  |
| `delete_prompt_dataset(id_, force?)` | Delete dataset                  |
| `clone_prompt_dataset(id_, body)`    | Clone dataset to another prompt |
| `export_prompt_dataset(id_)`         | Export dataset with all items   |

### Item Methods (`client.prompt_dataset_items`)

| Method                                        | Description        |
| --------------------------------------------- | ------------------ |
| `list_prompt_dataset_items(id_)`              | List dataset items |
| `create_prompt_dataset_item(id_, body)`       | Add single item    |
| `bulk_create_prompt_dataset_items(id_, body)` | Add multiple items |
| `get_prompt_dataset_item(id_)`                | Get item by ID     |
| `update_prompt_dataset_item(id_, body)`       | Update item        |
| `delete_prompt_dataset_item(id_)`             | Delete item        |
