Tracing Overview

Tracing gives you full visibility into what your AI application is doing at every step. Each request flows through multiple stages — LLM calls, tool invocations, retrieval lookups, orchestration logic — and tracing captures all of them as a structured timeline you can inspect, debug, and optimize.

Why Tracing Matters for AI Applications

Traditional application monitoring tracks HTTP requests and database queries. AI applications introduce new challenges:
  • Non-deterministic outputs — The same prompt can produce different results across runs
  • Multi-step orchestration — Agents chain multiple LLM calls, tool uses, and retrieval steps together
  • Cost and latency visibility — Token usage and model latency vary per request and need tracking
  • Debugging complex chains — When an agent fails on step 7 of 12, you need to see the full execution path

MutagenT tracing captures this entire execution graph as a trace — a tree of spans that represent each operation in your AI pipeline.

Core Concepts

Traces

A trace represents a single end-to-end request through your application. It is identified by a unique traceId and contains one or more spans arranged in a parent-child hierarchy.

Spans

A span represents a single operation within a trace. Each span captures:
  • Name — What this operation does (e.g., “generate-response”, “search-docs”)
  • Kind — The category of operation (see SpanKind below)
  • Input / Output — Structured data flowing through the operation
  • Metrics — Token counts, cost, latency, model information
  • Status — Whether the operation succeeded (ok), failed (error), or is in progress (unset)
  • Duration — How long the operation took
  • Attributes — Custom key-value metadata
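
Taken together, these fields can be pictured as a single record. The sketch below is a hypothetical shape for illustration only; the SDK's actual span type may differ:

```typescript
// Hypothetical span record, for illustration only (not the SDK's real type).
type SpanStatus = "ok" | "error" | "unset";

interface SpanRecord {
  traceId: string;                       // groups spans into one trace
  spanId: string;
  parentSpanId?: string;                 // absent for the root span
  name: string;                          // e.g. "generate-response"
  kind: string;                          // e.g. "llm.chat" (see SpanKind below)
  input?: unknown;                       // structured data flowing in
  output?: unknown;                      // structured data flowing out
  metrics?: {
    inputTokens?: number;
    outputTokens?: number;
    costUsd?: number;
    model?: string;
  };
  status: SpanStatus;
  durationMs?: number;
  attributes?: Record<string, string>;   // custom key-value metadata
}

const example: SpanRecord = {
  traceId: "tr_123",
  spanId: "sp_1",
  name: "generate-response",
  kind: "llm.chat",
  status: "ok",
  durationMs: 840,
};
```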

Parent-Child Relationships

Spans form a tree. When a span is created while another span is active, the new span automatically becomes a child of the active span, giving you a hierarchical view of your application's execution.

Context propagation is handled automatically using AsyncLocalStorage (TypeScript) or contextvars (Python), so child spans are linked to their parents without any manual wiring.
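
A minimal sketch of how this mechanism works, using a tiny hypothetical withSpan() helper (not the SDK's API) built on Node's AsyncLocalStorage:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Illustrative only: how AsyncLocalStorage links a child span to the
// active parent without any manual wiring.
type Span = { id: string; name: string; parentId?: string };

const activeSpan = new AsyncLocalStorage<Span>();
let nextId = 1;

async function withSpan<T>(name: string, fn: (span: Span) => Promise<T>): Promise<T> {
  const parent = activeSpan.getStore();  // whichever span is active right now
  const span: Span = { id: String(nextId++), name, parentId: parent?.id };
  return activeSpan.run(span, () => fn(span)); // make `span` the active one
}

// Nesting calls yields a parent-child tree automatically:
async function demo(): Promise<Span[]> {
  const tree: Span[] = [];
  await withSpan("rag-pipeline", async (root) => {
    tree.push(root);
    await withSpan("search-docs", async (child) => {
      tree.push(child); // child.parentId === root.id
    });
  });
  return tree;
}
```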

SpanKind Taxonomy

Every span has a kind that classifies the type of operation it represents. MutagenT defines 15 span kinds, aligned with the OpenTelemetry Gen AI semantic conventions (v1.37+):

Generation

Kind            Description           Example Use
llm.chat        Chat completion call  OpenAI chat.completions.create()
llm.completion  Text completion call  Legacy completion APIs
llm.embedding   Embedding generation  Creating vector embeddings for search

Orchestration

Kind        Description                     Example Use
chain       Sequential pipeline of steps    LangChain chain, RAG pipeline
agent       Autonomous agent execution      ReAct agent, tool-using agent
graph       Graph-based execution           LangGraph, state machine workflows
node        Single node in a graph          Individual graph node execution
edge        Transition between graph nodes  Conditional routing logic
workflow    Multi-step workflow             Mastra workflow, orchestration flow
middleware  Request/response middleware     Input validation, output formatting

Tools

Kind  Description               Example Use
tool  External tool invocation  Function calling, API calls, code execution

Retrieval

Kind       Description              Example Use
retrieval  Document/data retrieval  Vector store query, database lookup
rerank     Result re-ranking        Cross-encoder reranking of search results

Safety

Kind       Description             Example Use
guardrail  Safety or policy check  Content moderation, PII detection

Other

Kind    Description                      Example Use
custom  Any operation not covered above  Custom business logic

How Tracing Works

Automatic Tracing with Integration Packages

If you use one of MutagenT’s integration packages (@mutagent/langchain, @mutagent/langgraph, @mutagent/openai, @mutagent/vercel-ai, @mutagent/mastra), tracing happens automatically. The integration intercepts framework callbacks and creates properly structured spans without any manual instrumentation.

import { initTracing } from "@mutagent/sdk/tracing";
import { MutagentCallbackHandler } from "@mutagent/langchain";

initTracing({ apiKey: process.env.MUTAGENT_API_KEY! });

const handler = new MutagentCallbackHandler();

// Traces are created automatically for every chain/agent/LLM call
const result = await chain.invoke({ query: "..." }, { callbacks: [handler] });

If MUTAGENT_API_KEY is set in your environment, you can skip the initTracing() call entirely. The SDK will lazily initialize tracing on the first startSpan() call. This is especially useful for quick scripts and development environments.

Manual Tracing with the SDK

When you need fine-grained control, or when building custom orchestration logic, use the SDK’s tracing API directly. MutagenT provides three levels of abstraction:

Use the @trace() decorator to automatically wrap class methods with span creation, input/output capture, and error handling.

import { trace } from "@mutagent/sdk/tracing";

class MyAgent {
  @trace({ kind: "agent", name: "my-agent" })
  async run(query: string): Promise<string> {
    // Span is created automatically
    return await this.generate(query);
  }
}

Use withTrace() to wrap any function with a span and get a SpanHandle for manual enrichment.

import { withTrace } from "@mutagent/sdk/tracing";

const result = await withTrace(
  { kind: "chain", name: "rag-pipeline" },
  async (span) => {
    span.setMetrics({ model: "gpt-4o", provider: "openai" });
    const answer = await generateAnswer(query);
    span.setOutput({ text: answer });
    return answer;
  }
);

Use startSpan() and endSpan() for maximum flexibility, typically in integration adapters.

import { startSpan, endSpan, runInSpanContext } from "@mutagent/sdk/tracing";

const span = startSpan({ kind: "tool", name: "web-search" });
if (span) {
  const result = await runInSpanContext(span, async () => {
    return await searchWeb(query);
  });
  endSpan(span, { status: "ok", output: { raw: result } });
}

Architecture

The tracing pipeline follows a simple, non-blocking flow from your application to the MutagenT API. The API also accepts traces from OpenTelemetry-compatible exporters via the OTLP bridge endpoint (POST /api/traces/otlp), which normalizes ExportTraceServiceRequest payloads into the MutagenT format before storing them.

Key design decisions:
  • Non-blocking — Span collection never blocks your application. Spans are buffered in memory and flushed asynchronously.
  • Batch transport — Spans are grouped by trace ID and sent in batches to minimize HTTP overhead. Default: flush every 5 seconds or when 10 spans accumulate.
  • Retry with backoff — Failed flushes are retried with exponential backoff (up to 3 retries, base delay 1s doubling each retry).
  • Graceful shutdown — shutdownTracing() flushes all remaining spans before the process exits.
  • Lazy initialization — If MUTAGENT_API_KEY is set in the environment, tracing initializes automatically on the first startSpan() call without requiring an explicit initTracing().
  • Hybrid OTel strategy — MutagenT uses a lean native schema with gen_ai.* OTel semantic conventions (v1.37+) plus mutagent.* extensions, and provides an OTLP bridge for any OTel-compatible exporter.
  • Idempotent ingestion — Traces and spans use upsert semantics, so re-sending the same trace ID updates rather than duplicates.
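
The batch-and-retry behavior described above can be sketched as follows. This is illustrative only, not the SDK's actual transport code: a buffer that flushes when 10 spans accumulate and retries failed flushes with exponential backoff (1 s base, doubling, up to 3 retries). The real transport also flushes on a 5-second timer, omitted here for brevity.

```typescript
type QueuedSpan = { traceId: string; name: string };

class SpanBatcher {
  private buffer: QueuedSpan[] = [];

  constructor(
    private send: (batch: QueuedSpan[]) => Promise<void>,
    private maxBatch = 10,
  ) {}

  add(span: QueuedSpan): void {
    this.buffer.push(span);                  // enqueue; never blocks the caller
    if (this.buffer.length >= this.maxBatch) void this.flush();
  }

  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const batch = this.buffer.splice(0);     // take everything queued so far
    for (let attempt = 0; attempt <= 3; attempt++) {
      try {
        await this.send(batch);
        return;                              // flush succeeded
      } catch {
        if (attempt === 3) return;           // give up after 3 retries
        // Exponential backoff: 1 s, 2 s, 4 s between retries.
        await new Promise((r) => setTimeout(r, 1000 * 2 ** attempt));
      }
    }
  }
}
```

Grouping spans by trace ID and sending them in batches keeps HTTP overhead low while the in-memory buffer keeps span collection off your application's hot path.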

When to Use What

Scenario                                                  Approach
Using LangChain, LangGraph, Vercel AI, OpenAI, or Mastra  Use the integration package — tracing is automatic
Custom orchestration logic                                Use withTrace() for function-level spans
Class-based agent architecture                            Use @trace() decorator on methods
Building a framework integration                          Use the low-level startSpan() / endSpan() API
Simple scripts or one-off calls                           Use withTrace() for quick instrumentation
OTel-compatible exporter already in use                   Send traces to the OTLP bridge endpoint (POST /api/traces/otlp)

Next Steps

Tracing Setup

Configure tracing in your application with initTracing() and environment variables.

Tracing API Reference

Full reference for decorators, wrappers, and low-level span primitives.