Setting Up OpenTelemetry in FastAPI
This guide solves one precise problem: wiring OpenTelemetry into a FastAPI service so every request produces a correctly parented span without breaking Starlette's async middleware chain. It is part of the Distributed Tracing and OpenTelemetry in Python guide and applies the provider lifecycle from the OpenTelemetry SDK setup guide to FastAPI's event-loop architecture, so that spans are exported in batches off the request path and instrumentation overhead stays low. It sits alongside the broader guide on instrumenting Python web frameworks, which applies the same ASGI/WSGI pattern to Django, Flask, and Starlette directly.
FastAPI is built on Starlette and ASGI, so instrumentation hooks the ASGI application rather than individual routes. That is why a single instrument_app call covers every endpoint, including ones added later, and why the wrapping must respect the async call chain — a synchronous middleware injected in the wrong place breaks the await that drives the whole stack.
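To make the wrapping idea concrete, here is a minimal standard-library sketch of an ASGI middleware that wraps a whole application rather than individual routes. It is not the OpenTelemetry implementation; toy_app, RecordingMiddleware, and drive are hypothetical names, and the recorded path list only stands in for server spans.

```python
import asyncio


# Hypothetical stand-in for a FastAPI/Starlette app: any ASGI app is an
# async callable taking (scope, receive, send).
async def toy_app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": scope["path"].encode()})


class RecordingMiddleware:
    """Wraps an entire ASGI app, so every route passes through it."""

    def __init__(self, app):
        self.app = app
        self.seen = []  # paths observed; stands in for server spans

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            self.seen.append(scope["path"])
        # Awaiting the inner app keeps the async call chain intact.
        await self.app(scope, receive, send)


async def drive(app, path):
    """Send one fake HTTP request through the ASGI callable."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": path}, receive, send)
    return sent


wrapped = RecordingMiddleware(toy_app)
for path in ("/process/1", "/added/later"):
    asyncio.run(drive(wrapped, path))
print(wrapped.seen)
```

Because the wrapper sits in front of the application callable, a route registered after wrapping is still observed, which is the property instrument_app relies on.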
Prerequisites
Async context loss can occur when the instrumentation packages are out of step with Starlette's routing layer, so pin versions that are released together. Mismatched opentelemetry-api and opentelemetry-sdk versions can fail with an ImportError during provider initialization.
pip install \
"opentelemetry-sdk>=1.30.0,<2.0.0" \
"opentelemetry-instrumentation-fastapi>=0.51b0,<1.0.0" \
"opentelemetry-exporter-otlp-proto-grpc>=1.30.0,<2.0.0"
export OTEL_SERVICE_NAME="fastapi-backend"
export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=production,team=platform"
export OTEL_EXPORTER_OTLP_ENDPOINT="otel-collector:4317"
Programmatic resource defaults still matter even with the env vars set, so the service never falls back to a generic unknown_service label when a variable is missing. The order of these dependencies is deliberate: opentelemetry-instrumentation-fastapi pulls in a compatible opentelemetry-instrumentation-asgi, which is the layer that actually wraps Starlette. If you upgrade the SDK without upgrading the instrumentation package to a matching beta, the ASGI wrapper can fall out of step with Starlette's middleware contract and lose async context, which is exactly the failure the version pins prevent.
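A quick way to catch a mismatch before it surfaces at runtime is to compare the installed distribution versions at startup. The sketch below uses only importlib.metadata; the helper names installed and api_sdk_aligned are hypothetical, and the equality check assumes the API and SDK packages are released in lockstep under the same version number.

```python
from importlib.metadata import PackageNotFoundError, version


def installed(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def api_sdk_aligned(api_version, sdk_version):
    """True when both are installed and report the same release number."""
    return api_version is not None and api_version == sdk_version


def describe_environment():
    """Summarize the relevant pins so a startup log shows what is installed."""
    api = installed("opentelemetry-api")
    sdk = installed("opentelemetry-sdk")
    fastapi_instr = installed("opentelemetry-instrumentation-fastapi")
    status = "aligned" if api_sdk_aligned(api, sdk) else "MISMATCH or missing"
    return f"api={api} sdk={sdk} instrumentation-fastapi={fastapi_instr} ({status})"


print(describe_environment())
```

Logging this line once at startup makes a drifted environment visible in the first seconds of a deploy instead of in missing traces later.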
Implementation
1. Bootstrap the provider before the app exists. Build the resource and provider, attach a BatchSpanProcessor (never SimpleSpanProcessor, which exports synchronously and blocks the loop), and set the global provider, following the same deterministic lifecycle described in the OpenTelemetry SDK setup guide. Programmatic resource values guarantee a real service.name even if OTEL_SERVICE_NAME is absent.
2. Instrument after constructing the app. Call FastAPIInstrumentor.instrument_app(app) once the FastAPI() instance exists so the ASGI middleware wraps every route natively. The middleware extracts traceparent and tracestate from inbound requests and opens a server span, so downstream services receive consistent context with no manual header parsing.
3. Nest manual spans for business logic. Auto-instrumentation captures only the request boundary. Open a child span with tracer.start_as_current_span() inside the handler or a dependency to record the work that defines your latency, preserving parent-child links across each await. OpenTelemetry stores the active span in a contextvar, and asyncio runs each request's task in its own copy of the context, so the span stays correct across awaits within one request without any extra effort; the manual span you open simply nests under the server span the middleware already created. You only need to intervene when you hand work to a thread pool through an API that does not copy the context (loop.run_in_executor, for example) or schedule a fire-and-forget task, which can outlive the server span it was started under.
4. Flush on shutdown. Force-flush and shut down the provider in the lifespan handler so spans buffered at process exit are not dropped during a graceful restart. The yield in the lifespan context separates startup from shutdown; everything after it runs once when the server begins draining, which is the right moment to call force_flush before shutdown closes the exporter connection.
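The context behavior described in the third step can be observed with the standard library alone. This is a sketch, not OpenTelemetry code: current_span is a hypothetical stand-in for the contextvar that holds the active span, and the result assumes a standard CPython build where new threads start with an empty context.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the contextvar that tracks the active span.
current_span = contextvars.ContextVar("current_span", default="none")


def read_span():
    return current_span.get()


async def child():
    await asyncio.sleep(0)
    return read_span()


async def handler():
    current_span.set("server-span")
    loop = asyncio.get_running_loop()

    # Same task: the value survives the await.
    across_await = await child()
    # Plain executor call: the worker thread does not see the task's context.
    in_executor = await loop.run_in_executor(None, read_span)
    # Copying the context and running inside it carries the value across.
    ctx = contextvars.copy_context()
    carried = await loop.run_in_executor(None, ctx.run, read_span)
    return across_await, in_executor, carried


results = asyncio.run(handler())
print(results)
```

asyncio.to_thread performs the context copy for you, which makes it a convenient replacement for a bare run_in_executor call when a span should remain current inside the worker thread.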
Two details make this robust under real traffic. First, instrument_app must receive the same tracer_provider you registered globally; passing it explicitly removes any ambiguity about which provider the middleware uses and avoids a subtle bug where the middleware binds to a stale default. Second, excluded_urls keeps health checks, metrics scrapes, and the docs UI out of your traces — these fire constantly, carry no diagnostic value, and would otherwise dominate span volume and cost. Exclude them by path fragment so a load balancer's liveness probe never creates a span.
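Matching by path fragment is broader than it looks. The sketch below approximates the matching with the standard library, assuming the comma-separated entries are treated as regular expressions joined with "|" and searched anywhere in the URL; build_excluder is a hypothetical helper, not part of the instrumentation package.

```python
import re


def build_excluder(excluded_urls):
    """Approximate fragment matching: split on commas, join with '|',
    and search for a match anywhere in the URL."""
    fragments = [part.strip() for part in excluded_urls.split(",") if part.strip()]
    pattern = re.compile("|".join(fragments)) if fragments else None
    return lambda url: bool(pattern and pattern.search(url))


is_excluded = build_excluder("healthz,metrics,docs")

for url in (
    "http://localhost:8000/healthz",
    "http://localhost:8000/process/12345",
    "http://localhost:8000/api/docs-archive/7",  # also contains "docs"
):
    print(url, is_excluded(url))
```

If a business route happens to contain one of the fragments, as the third URL does, it is silently dropped from tracing; anchoring the entries (for example "/docs$") narrows the match.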
When a FastAPI route depends on a yield-based dependency (a database session, a unit of work), the dependency's setup and teardown run outside the handler's manual span unless you open one inside the dependency itself. If you care about the time spent acquiring a connection or committing a transaction, wrap that work in its own start_as_current_span within the dependency so it nests under the request span rather than disappearing into the auto-generated server span. The same manual-span discipline carries over to background tasks scheduled with BackgroundTasks, which run after the response is sent and therefore after the server span has closed — give them their own span linked to the request's context.
import os
import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI, Request
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.trace import SpanKind

# 1. Resource + provider, bootstrapped before the app is built.
resource = Resource.create({
    "service.name": os.getenv("OTEL_SERVICE_NAME", "fastapi-backend"),
    "deployment.environment": os.getenv("DEPLOYMENT_ENV", "production"),
})
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(  # async export, never SimpleSpanProcessor
    OTLPSpanExporter(
        endpoint=os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT", "otel-collector:4317"),
        insecure=os.getenv("OTEL_EXPORTER_OTLP_INSECURE", "false").lower() == "true",
    ),
    max_export_batch_size=512,
    max_queue_size=2048,
    schedule_delay_millis=5000,
))
trace.set_tracer_provider(provider)


# 4. Flush buffered spans on graceful shutdown.
@asynccontextmanager
async def lifespan(app: FastAPI):
    yield
    provider.force_flush(timeout_millis=5000)
    provider.shutdown()


app = FastAPI(lifespan=lifespan)
FastAPIInstrumentor.instrument_app(  # 2. instrument after app construction
    app,
    tracer_provider=provider,
    excluded_urls="healthz,metrics,docs",
)
tracer = trace.get_tracer(__name__)


@app.get("/process/{item_id}")
async def process_item(item_id: str, request: Request):
    # 3. Manual child span for business logic, nested under the server span.
    with tracer.start_as_current_span(
        "process_item_logic",
        kind=SpanKind.INTERNAL,
        attributes={"item.id": item_id},
    ) as span:
        await asyncio.sleep(0.05)
        span.set_attribute("processing.status", "completed")
        return {"item_id": item_id, "status": "processed"}
For a high-traffic endpoint, add a sampler to the provider so the service does not record every request. A ParentBased(TraceIdRatioBased(0.1)) sampler keeps 10% of the traces this service roots while honoring any upstream sampling decision carried in the inbound traceparent, so a request that was already sampled by an upstream gateway is recorded here too. Set the sampler when you construct the TracerProvider, the same place the resource is fixed, and let the collector handle tail sampling for errors and slow requests rather than trying to encode that logic in the application.
Configuration Options
| Option | Where | Default | Recommended |
|---|---|---|---|
| excluded_urls | instrument_app | none | healthz,metrics,docs to drop noise |
| tracer_provider | instrument_app | global | pass explicitly to avoid ambiguity |
| OTEL_EXPORTER_OTLP_INSECURE | env | false | false in production (use TLS) |
| max_queue_size | BatchSpanProcessor | 2048 | 2× peak concurrent requests |
| schedule_delay_millis | BatchSpanProcessor | 5000 | 2000–5000 to amortize I/O |
| OTEL_EXPORTER_OTLP_TIMEOUT | env | 10000 | 5000 so retries cannot block the loop |
A note on units: the OTLP specification defines OTEL_EXPORTER_OTLP_TIMEOUT in milliseconds, but the Python exporters have historically interpreted the value as seconds, so confirm the unit for the exporter version you pin before setting it.
Verification
Send a request and confirm the collector receives a server span and the nested process_item_logic child sharing one trace_id.
curl -s localhost:8000/process/12345
Expected Output (collector side, excerpt showing the manual child span):
{
  "resourceSpans": [{
    "resource": {"attributes": [
      {"key": "service.name", "value": {"stringValue": "fastapi-backend"}},
      {"key": "deployment.environment", "value": {"stringValue": "production"}}
    ]},
    "scopeSpans": [{"spans": [{
      "name": "process_item_logic",
      "kind": "SPAN_KIND_INTERNAL",
      "attributes": [
        {"key": "item.id", "value": {"stringValue": "12345"}},
        {"key": "processing.status", "value": {"stringValue": "completed"}}
      ]
    }]}]
  }]
}
A correctly wired service shows two spans per request: an auto-generated GET /process/{item_id} server span and the manual child nested beneath it. The server span also carries the standard HTTP attributes — method, route, and status code — applied by the ASGI instrumentation, so you can filter and aggregate by route in the backend without adding them yourself. Note that the route appears as the templated path, not the concrete 12345, which is what keeps span names low-cardinality — the actual value lives in the item.id attribute where it can be queried without exploding the number of distinct span names.
If you need to verify without a collector, attach a ConsoleSpanExporter through a SimpleSpanProcessor in development and watch the two spans print to stdout in parent-child order on each request. Confirm the child's parent_span_id matches the server span's span_id; if it is empty or points elsewhere, async context was lost — usually because the CLI launcher was used instead of instrument_app, the mistake covered below.
Common Mistakes
CLI Auto-Instrumentation Breaks Async Context
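Error signature: Broken or missing parent-child relationships between the server span and manual spans, sometimes accompanied by RuntimeWarning: coroutine 'Starlette.__call__' was never awaited.
Root cause: The opentelemetry-instrument CLI wrapper instruments FastAPI outside your application code, so the middleware it installs can end up ordered incorrectly relative to the async ASGI stack and bound to a provider other than the one you register. It is also known to misbehave under uvicorn's --reload and --workers modes.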
Remediation: Drop the CLI wrapper and call FastAPIInstrumentor.instrument_app() programmatically after the app is constructed.
SimpleSpanProcessor Stalls Under Load
Error signature: asyncio.exceptions.TimeoutError at peak load, followed by SpanExportError: Export timed out.
Root cause: SimpleSpanProcessor runs a synchronous gRPC call on every request completion, blocking the event loop.
Remediation: Use BatchSpanProcessor, set max_queue_size to twice expected concurrency, and keep schedule_delay_millis between 2000 and 5000.
Spans Lost on Restart
Error signature: Final requests before a deploy are missing from the backend.
Root cause: The provider is never flushed, so buffered spans die with the process.
Remediation: Call provider.force_flush() and provider.shutdown() in the lifespan handler as shown above, and make sure your process gets a real shutdown signal — a hard SIGKILL skips lifespan teardown entirely, so configure a graceful termination grace period in your orchestrator.
Frequently Asked Questions
Does FastAPI auto-instrumentation capture async generator dependencies?
No. The HTTP instrumentation only covers the outer request and response cycle. Wrap yield-based dependencies and async generators in tracer.start_as_current_span manually to record their sub-spans.
How do I prevent OTLP exporter retries from blocking the event loop?
Set a bounded exporter timeout and pair it with a BatchSpanProcessor sized to your concurrency. The processor flushes on a background thread and drops spans under backpressure rather than queuing indefinitely on the request path.
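The drop-under-backpressure behavior can be modeled in a few lines. This is a simplified standard-library model, not the BatchSpanProcessor source: DroppingBatcher is a hypothetical class, and flush is called by hand here where a real processor would drain the queue from a background thread on a timer.

```python
import queue


class DroppingBatcher:
    """Simplified model of a batch processor with a bounded queue."""

    def __init__(self, export, max_queue_size=4, max_export_batch_size=2):
        self._queue = queue.Queue(maxsize=max_queue_size)
        self._export = export
        self._batch_size = max_export_batch_size
        self.dropped = 0

    def on_end(self, span):
        """Called on the request path: never blocks, drops when the queue is full."""
        try:
            self._queue.put_nowait(span)
        except queue.Full:
            self.dropped += 1

    def flush(self):
        """Drain the queue in batches of at most max_export_batch_size."""
        while not self._queue.empty():
            batch = []
            while len(batch) < self._batch_size and not self._queue.empty():
                batch.append(self._queue.get_nowait())
            self._export(batch)


exported = []
batcher = DroppingBatcher(exported.append)
for i in range(6):          # six spans arrive before any export happens
    batcher.on_end(f"span-{i}")
batcher.flush()
print(batcher.dropped, exported)
```

The request path pays only for a non-blocking enqueue, so a slow or unreachable collector costs dropped spans rather than request latency.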
Can I inject custom baggage into the FastAPI request context?
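Yes. Call opentelemetry.baggage.set_baggage inside a dependency or middleware before the route runs, and attach the Context it returns with opentelemetry.context.attach, since set_baggage does not modify the current context in place. Once attached, the W3C baggage header propagates automatically to downstream HTTP and gRPC calls made through instrumented clients.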