Retries & timeouts

Retries are necessary in production, but if implemented incorrectly they create hidden costs and duplicate side effects.

Timeout taxonomy

connect timeout: the connection to the provider cannot be established in time
read timeout: the provider accepts the request but stalls while sending the response or stream
overall deadline: a time budget for the whole request (recommended)

Use request-level deadlines to prevent a single step from consuming the entire SLA.
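
A request-level deadline can be sketched as a small helper that every step consults before it starts. This is a minimal illustration, not a prescribed implementation: the `Deadline` class, its method names, and the per-attempt cap are all hypothetical.

```python
import time


class Deadline:
    """Tracks one request-level time budget shared by every step (hypothetical helper)."""

    def __init__(self, budget_s: float, clock=time.monotonic):
        # A monotonic clock is used so wall-clock adjustments cannot shift the deadline.
        self._clock = clock
        self._expires_at = clock() + budget_s

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def expired(self) -> bool:
        return self.remaining() <= 0.0

    def timeout_for(self, per_attempt_cap_s: float) -> float:
        """Timeout for the next step: the per-attempt cap, or less if the budget is nearly spent."""
        return min(per_attempt_cap_s, self.remaining())
```

Each step would pass `deadline.timeout_for(cap)` as its connect or read timeout and stop early once `deadline.expired()` is true, so a slow step shrinks the time available to later steps instead of extending the request.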

Retry rules (recommended)

Retry only when:

request is idempotent (or you have idempotency keys)
error is transient (timeouts, 5xx, rate limits)

Avoid retrying:

invalid requests (4xx, except rate-limit responses such as 429)
safety refusals
deterministic tool failures, unless the inputs have changed
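
The rules above can be expressed as a single predicate. The sketch below is one possible reading, under stated assumptions: the function name and parameters are hypothetical, and treating HTTP 408 and 429 as transient is an assumption layered on the "timeouts" and "rate limits" wording. Deterministic tool failures are not modelled here.

```python
from typing import Optional

# Assumption: among 4xx codes, only request timeout (408) and rate limit (429) are transient.
TRANSIENT_4XX = {408, 429}


def should_retry(
    status: Optional[int],
    timed_out: bool,
    idempotent: bool,
    refused: bool = False,
) -> bool:
    """Return True only when a retry is both safe and likely to help."""
    if not idempotent:
        # Without idempotency (or an idempotency key) a retry may duplicate side effects.
        return False
    if refused:
        # A safety refusal is a decision, not a fault; repeating the request will not change it.
        return False
    if timed_out:
        return True
    if status is None:
        return False
    if 500 <= status <= 599:
        return True
    return status in TRANSIENT_4XX
```

For example, `should_retry(503, False, True)` is true, while `should_retry(400, False, True)` and `should_retry(503, False, False)` are both false.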

Backoff strategy

exponential backoff with jitter
cap max retries and total time spent
record retry count and last error in spans
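
A minimal sketch of this strategy follows, using the "full jitter" variant (a uniform random delay between zero and the exponential ceiling). The function names, default values, and the choice of `TimeoutError` as the only transient error are assumptions made for illustration. The retry count is returned so the caller can attach it to a span.

```python
import random
import time


def backoff_delay(attempt: int, base_s: float = 0.5, cap_s: float = 8.0, rng=random.random) -> float:
    """Full jitter: a uniform delay in [0, min(cap_s, base_s * 2**attempt)]."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng() * ceiling


def call_with_retries(
    fn,
    max_retries: int = 3,
    total_budget_s: float = 10.0,
    sleep=time.sleep,
    clock=time.monotonic,
    is_transient=lambda exc: isinstance(exc, TimeoutError),  # assumption for this sketch
):
    """Call fn until it succeeds; return (result, retries_used).

    Re-raises the last error when it is not transient, when max_retries is
    reached, or when the next delay would exceed the total time budget.
    """
    start = clock()
    retries = 0
    while True:
        try:
            return fn(), retries
        except Exception as exc:
            if not is_transient(exc) or retries >= max_retries:
                raise
            delay = backoff_delay(retries)
            if (clock() - start) + delay > total_budget_s:
                raise
            sleep(delay)
            retries += 1
```

Capping both the retry count and the total elapsed time matters because either limit alone can be defeated: a few very slow attempts exhaust the time budget, and many fast failures exhaust the retry budget.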

Observability

Always capture:

retries attempted
fallback used (yes/no)
provider/model
latency per attempt
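
One way to collect these fields is a small per-request record that is flattened into span attributes when the request finishes. The class names and attribute keys below are hypothetical and not tied to any particular tracing library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AttemptRecord:
    provider: str
    model: str
    latency_ms: float
    error: Optional[str] = None  # None means the attempt succeeded


@dataclass
class RequestTrace:
    attempts: List[AttemptRecord] = field(default_factory=list)
    fallback_used: bool = False

    def record(self, provider: str, model: str, latency_ms: float, error: Optional[str] = None) -> None:
        self.attempts.append(AttemptRecord(provider, model, latency_ms, error))

    def to_attributes(self) -> dict:
        """Flatten the trace into key/value pairs suitable for a span."""
        last = self.attempts[-1] if self.attempts else None
        errors = [a.error for a in self.attempts if a.error is not None]
        return {
            # The first attempt is not a retry, so retries = attempts - 1.
            "retries_attempted": max(0, len(self.attempts) - 1),
            "fallback_used": self.fallback_used,
            "provider": last.provider if last else None,
            "model": last.model if last else None,
            "latency_ms_per_attempt": [a.latency_ms for a in self.attempts],
            "last_error": errors[-1] if errors else None,
        }
```

Recording latency per attempt, rather than only the total, makes it possible to tell one slow call apart from several fast failures.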

Next steps

Fallback models