Spans

A span is a named unit of work inside a trace. Spans form a tree: a parent span contains child spans. The tree should mirror the logical steps of your workflow, so that reading it top to bottom tells you what happened and in what order.
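
As a rough illustration of the parent/child structure, here is a minimal sketch in Python. The Span class, the render helper, and the root name request.handle are hypothetical and exist only for this example; they are not the API of any particular tracing library, and the nesting shown is just one plausible layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """Hypothetical span record: a name plus its child spans."""
    name: str
    children: List["Span"] = field(default_factory=list)

    def child(self, name: str) -> "Span":
        # Create a span nested under this one and return it.
        span = Span(name)
        self.children.append(span)
        return span


def render(span: Span, depth: int = 0) -> List[str]:
    # Flatten the tree into indented lines, parents before children.
    lines = ["  " * depth + span.name]
    for c in span.children:
        lines.extend(render(c, depth + 1))
    return lines


# One possible tree for a single request (the nesting is an assumption).
root = Span("request.handle")
root.child("retrieval.query")
root.child("retrieval.rerank")
completion = root.child("llm.completion")
completion.child("tool.execute")
root.child("postprocess.format")

print("\n".join(render(root)))
```

Printing the rendered lines gives an indented outline of the request, which is a quick way to check that the tree matches the workflow you have in mind.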

What deserves a span

Create spans for steps you would want to:

measure latency for
debug when things go wrong
compare across releases and models

Typical spans:

retrieval.query
retrieval.rerank
llm.completion
tool.execute
guardrails.validate
postprocess.format

Span naming conventions

Use stable, predictable names:

Prefer noun.verb or domain.action (retrieval.query, tool.execute)
Avoid embedding dynamic values in names (use metadata instead)
Keep the set of names small so dashboards and filters stay clean
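
The conventions above can be checked mechanically. The following sketch assumes a two-segment, lowercase domain.action pattern; the regular expression, is_stable_name, and make_span are illustrative helpers made up for this example, not part of any real SDK.

```python
import re

# Assumed convention: exactly two lowercase segments separated by a dot.
NAME_PATTERN = re.compile(r"[a-z][a-z_]*\.[a-z][a-z_]*")


def is_stable_name(name: str) -> bool:
    # True only when the whole name matches the assumed pattern.
    return NAME_PATTERN.fullmatch(name) is not None


def make_span(name: str, **metadata):
    # Reject names that carry dynamic values; those belong in metadata.
    if not is_stable_name(name):
        raise ValueError(f"unstable span name: {name!r}")
    return {"name": name, "metadata": metadata}


# Dynamic values (which tool, which document) go in metadata, not the name.
good = make_span("tool.execute", tool="web_search", document_id="doc-123")

# A name with an embedded dynamic value fails the check.
bad_name_ok = is_stable_name("tool.execute:web_search:doc-123")
```

With this split, a dashboard can group by the single name tool.execute and filter on the tool field, instead of accumulating one name per tool or document.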

What to attach to a span

Add structured fields that help debugging and filtering:

Provider/model for llm.* spans
Token usage and finish reasons when available
Tool arguments/results (sanitized) for tool.* spans
Cache hits for retrieval or completion caches
Retries and error codes
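
To make the list above concrete, here is a sketch of what the attached fields might look like. The field names, the placeholder values, and the sanitize helper with its key list are all assumptions chosen for illustration; use whatever schema your tracing backend expects.

```python
from typing import Any, Dict

# Assumed list of argument keys that should never be stored verbatim.
SENSITIVE_KEYS = {"api_key", "password", "token", "authorization"}


def sanitize(fields: Dict[str, Any]) -> Dict[str, Any]:
    # Replace values of sensitive keys; leave everything else untouched.
    return {
        key: ("[REDACTED]" if key.lower() in SENSITIVE_KEYS else value)
        for key, value in fields.items()
    }


# Placeholder values for an llm.* span.
llm_span = {
    "name": "llm.completion",
    "metadata": {
        "provider": "example-provider",
        "model": "example-model",
        "input_tokens": 412,
        "output_tokens": 96,
        "finish_reason": "stop",
        "cache_hit": False,
        "retries": 1,
    },
}

# Placeholder values for a tool.* span, with arguments sanitized first.
tool_span = {
    "name": "tool.execute",
    "metadata": {
        "tool": "web_search",
        "arguments": sanitize({"query": "span naming", "api_key": "not-a-real-key"}),
        "result_count": 3,
        "retries": 0,
        "error_code": None,
    },
}
```

Keeping these as flat, typed fields (numbers as numbers, booleans as booleans) tends to make filtering and aggregation easier than packing them into a free-text message.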

Timing

If you wrap a function call in a span, the span measures the call's duration for you, with no separate timing code to maintain. That gives you:

per-step latency breakdowns
p95/p99 hotspots
“what changed?” comparisons across releases
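
A minimal sketch of how wrapping a call can produce timings, using only the Python standard library. The span context manager and the RECORDED list are hypothetical stand-ins for a real tracing client.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; a real client would export spans instead.
RECORDED = []


@contextmanager
def span(name, **metadata):
    # perf_counter is a monotonic clock intended for measuring intervals.
    start = time.perf_counter()
    try:
        yield metadata
    finally:
        # Record the duration even if the wrapped code raises.
        duration_ms = (time.perf_counter() - start) * 1000.0
        RECORDED.append(
            {"name": name, "duration_ms": duration_ms, "metadata": metadata}
        )


def fetch_documents(query):
    # Stand-in for real retrieval work.
    time.sleep(0.01)
    return ["doc-1", "doc-2"]


with span("retrieval.query", cache_hit=False):
    docs = fetch_documents("span naming")

print(RECORDED[0]["name"], round(RECORDED[0]["duration_ms"], 1), "ms")
```

Because the duration is captured in a finally block, failed calls are timed too, which matters when slow failures are the hotspot you are looking for.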

Error handling

A span should record failures as first-class data:

mark spans as error when exceptions occur
record the error type and message (omit stack traces if they may contain secrets)
include retry counts and whether a fallback was used
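
The points above might be sketched as follows. The span context manager, the RECORDED list, call_with_retries, and the field names status, error_type, error_message, retries, and fallback_used are all assumptions made for this example rather than a specific library's API.

```python
from contextlib import contextmanager

# Hypothetical in-memory sink for finished spans.
RECORDED = []


@contextmanager
def span(name, **metadata):
    record = {"name": name, "status": "ok", "metadata": metadata}
    try:
        yield record
    except Exception as exc:
        # Store type and message only; no stack trace is captured.
        record["status"] = "error"
        record["error_type"] = type(exc).__name__
        record["error_message"] = str(exc)
        raise
    finally:
        RECORDED.append(record)


def call_with_retries(primary, fallback, max_retries=2):
    with span("llm.completion") as record:
        record["metadata"]["fallback_used"] = False
        for attempt in range(max_retries + 1):
            # Attempt 0 is the first try, so attempt equals the retry count.
            record["metadata"]["retries"] = attempt
            try:
                return primary()
            except TimeoutError:
                continue
        record["metadata"]["fallback_used"] = True
        return fallback()


def always_times_out():
    raise TimeoutError("upstream timed out")


# Retries are exhausted, then the fallback answers; the span stays "ok".
result = call_with_retries(always_times_out, lambda: "fallback answer")

# An unhandled exception marks the span as an error and is re-raised.
try:
    with span("tool.execute"):
        raise ValueError("bad arguments")
except ValueError:
    pass
```

In this sketch a request rescued by the fallback is still recorded as ok, with retries and fallback_used showing what it took to get there. Whether that case should instead count as an error is a design choice for your dashboards.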

Next steps

Read Events for capturing inputs/outputs.
Read Tracing for recommended span trees.