Context Propagation and Baggage in Python OpenTelemetry
A trace only stays whole if its context survives every hop: the trace ID, the parent span ID, and the sampling decision have to ride along with each outbound request and be reconstructed on the other side. Context propagation is the serialization layer that does this, and baggage is the mechanism for carrying your own key-value metadata along the same path. This guide sits within the broader Distributed Tracing and OpenTelemetry in Python guide and builds on a provider configured per OpenTelemetry SDK Setup; it pairs closely with Span Lifecycle and Attributes, which governs the spans whose context you are moving.
Key implementation areas covered below:
- W3C TraceContext and Baggage headers versus the legacy B3 format.
- The inject and extract lifecycle through the TextMapPropagator interface.
- contextvars isolation across await, threads, and task queues.
- Baggage size limits, serialization cost, and detachment hygiene.
Prerequisites
pip install \
"opentelemetry-api>=1.30.0,<2.0.0" \
"opentelemetry-sdk>=1.30.0,<2.0.0"
For B3 compatibility with a Zipkin-era mesh, also install the propagator extension:
pip install "opentelemetry-propagator-b3>=1.30.0,<2.0.0"
The default global propagator is already composite W3C TraceContext plus Baggage, so most services need no extra propagator configuration. You only override it to add or reorder formats.
Concept and Architecture
Propagation is a header exchange standardized by the W3C TraceContext spec. The traceparent header carries the version, trace ID, parent span ID, and trace flags (including the sampled bit). The tracestate header carries vendor-specific routing data. OpenTelemetry hides the byte format behind the TextMapPropagator interface, which exposes exactly two operations: inject writes the active context into a carrier, and extract reads a carrier back into a Context object.
The lifecycle is strict and directional. On an outbound call the sender invokes inject(carrier), serializing the active context into a mutable mapping such as a dict of HTTP headers, gRPC metadata, or a message payload. On the inbound side the receiver invokes extract(carrier), which returns a new Context containing the remote span context as a parent reference. Crucially, extract does not make that context active. To make it current, call context.attach(), which returns a token, and later call context.detach(token) to restore the previous state. Alternatively, pass the extracted context explicitly through the context= argument when starting a span.
Baggage rides the same carrier through a separate baggage header. Where a span attribute is local to one span and consumed by the backend, baggage is propagated metadata: a tenant ID, a feature-flag cohort, or a routing directive that every downstream service can read off the active context without re-querying. The distinction matters because misusing one for the other is a common and expensive mistake; Span Lifecycle and Attributes explains when local span storage is the right home for data instead.
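A separate but related concern is correlation with metrics. An exemplar attached to a histogram bucket records the trace ID and span ID of the span that was active when the measurement was taken, and the default trace-based exemplar filter keeps exemplars only for sampled spans. The sampled flag carried in traceparent therefore decides whether a backend can jump from a bucket to the exact trace, a pattern detailed in Python Metrics and Instrumentation.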
The wire format is worth understanding because malformed headers fail silently. A traceparent is four hyphen-delimited fields: 00-{32 hex trace id}-{16 hex span id}-{2 hex flags}. The 00 is the version, and the low bit of the flags byte is the sampled flag — 01 means the trace was sampled, 00 means it was not. A receiver that gets a traceparent with the wrong field count, a non-hex character, or an all-zero trace ID treats it as absent and starts a new root rather than raising, so a subtly corrupted header from a buggy intermediary produces broken traces with no error in the logs. tracestate is a separate, optional comma-separated list of vendor key=value pairs that survives alongside traceparent; it is where sampling systems and vendors stash routing hints, and OpenTelemetry preserves it untouched even for vendors it does not understand.
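The validation rules above can be sketched with the standard library alone. This is an illustrative parser, not the OpenTelemetry implementation: parse_traceparent and ParsedTraceparent are hypothetical names, and the sketch accepts only version 00, whereas real propagators also handle future versions.

```python
import re
from typing import NamedTuple, Optional

# Version 00 layout: 00-{32 hex}-{16 hex}-{2 hex}, lowercase hex only.
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class ParsedTraceparent(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(value: str) -> Optional[ParsedTraceparent]:
    """Return the parsed fields, or None for a header a receiver would ignore."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None  # wrong field count, wrong length, or non-hex character
    trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are invalid
    # The low bit of the flags byte is the sampled flag.
    return ParsedTraceparent(trace_id, span_id, bool(int(flags, 16) & 0x01))


good = parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
bad = parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331")
print(good)
print(bad)
```

Returning None instead of raising mirrors the fail-silent behavior described above, which is why a helper like this is more useful in a test or a debugging middleware than in the request path.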
B3 is the older format from the Zipkin ecosystem, carried either as a single b3 header or as several X-B3-* headers. It encodes the same trace ID, span ID, and sampled bit but is not interchangeable with traceparent on the wire. A service mesh mid-migration will have some hops speaking W3C and some speaking B3, which is the entire reason for a composite propagator: extraction runs every registered format against the inbound headers, so a single deployment can accept either. If a request carries both, the Python CompositePropagator lets the propagator later in the list override the earlier one. Injection runs every registered propagator too, so W3C is always emitted on the way out, alongside B3 for as long as B3 stays registered.
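To make the "same data, different wire shape" point concrete, the sketch below renders one span context in each format using plain string formatting. The function names are hypothetical and no OpenTelemetry code is involved; it only illustrates the header layouts.

```python
def to_traceparent(trace_id: str, span_id: str, sampled: bool) -> dict[str, str]:
    """W3C: one header, version prefix, flags as a two-digit hex byte."""
    return {"traceparent": f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"}


def to_b3_single(trace_id: str, span_id: str, sampled: bool) -> dict[str, str]:
    """B3 single: one header, no version, sampling state as 1 or 0."""
    return {"b3": f"{trace_id}-{span_id}-{'1' if sampled else '0'}"}


def to_b3_multi(trace_id: str, span_id: str, sampled: bool) -> dict[str, str]:
    """B3 multi: one header per field (header names are case-insensitive)."""
    return {
        "X-B3-TraceId": trace_id,
        "X-B3-SpanId": span_id,
        "X-B3-Sampled": "1" if sampled else "0",
    }


ids = ("0af7651916cd43dd8448eb211c80319c", "b7ad6b7169203331", True)
for render in (to_traceparent, to_b3_single, to_b3_multi):
    print(render(*ids))
```

A receiver that only understands one of these shapes sees the others as unknown headers, which is why a mixed mesh needs both formats registered.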
Step-by-Step Implementation
Step 1 — Confirm or set the global propagator. The SDK's default global propagator is a composite of W3C TraceContext and Baggage. To add B3 for a mixed mesh, register a composite explicitly. Order matters only on extraction when a carrier holds both formats: the Python CompositePropagator runs every entry in list order and a later entry overrides an earlier one, so list the format that should win last.
from opentelemetry.propagate import set_global_textmap
from opentelemetry.propagators.composite import CompositePropagator
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator
from opentelemetry.baggage.propagation import W3CBaggagePropagator
from opentelemetry.propagators.b3 import B3MultiFormat

set_global_textmap(CompositePropagator([
    B3MultiFormat(),                  # legacy fallback: extracted first
    TraceContextTextMapPropagator(),  # primary: overrides B3 when both are present
    W3CBaggagePropagator(),           # baggage header
]))
Step 2 — Inject on outbound requests. Set baggage on the context, then start the client span and inject. Instrumentation libraries do this for you, but explicit injection is required for custom transports.
import httpx
from opentelemetry import propagate, trace, baggage, context

tracer = trace.get_tracer("gateway.outbound")

async def call_downstream(url: str, payload: dict) -> dict:
    ctx = baggage.set_baggage("tenant.id", "acme-corp")
    ctx = baggage.set_baggage("region", "us-east-1", context=ctx)
    token = context.attach(ctx)
    try:
        with tracer.start_as_current_span("outbound_api_call"):
            headers: dict[str, str] = {}
            propagate.inject(headers)  # writes traceparent + baggage
            async with httpx.AsyncClient() as client:
                resp = await client.post(url, json=payload, headers=headers)
                return resp.json()
    finally:
        context.detach(token)
Step 3 — Extract on inbound work and start a child span. The receiver parses headers into a context, attaches it, and starts its span as a child of the remote parent.
from opentelemetry import baggage, context
from opentelemetry.propagate import extract

# tracer is the module-level tracer created in Step 2

async def handle_inbound(request_headers: dict) -> None:
    ctx = extract(request_headers)  # parse traceparent + baggage
    token = context.attach(ctx)
    try:
        tenant = baggage.get_baggage("tenant.id")  # available across this service
        with tracer.start_as_current_span("handle_request") as span:
            span.set_attribute("tenant.id", tenant or "unknown")
    finally:
        context.detach(token)  # restore previous context
Step 4 — Detach on completion. The finally block is not optional. contextvars state persists across await within a task, so a missing detach leaves the attached context current for everything that runs later in the same task or thread (the next message in a long-lived consumer loop, for example), and any task spawned from there inherits a copy of it. detach takes the exact token that attach returned and restores the context to its prior state. It is a stack discipline: detaching tokens out of order, for instance detaching the outer token of a nested attach before the inner one, leaves the wrong context current, and detaching a token in a different task or thread than the one that attached it fails with an error that OpenTelemetry logs rather than raises. Keep each attach/detach pair lexically scoped within one try/finally and never share a token across functions.
Most of this is invisible when you use the official instrumentation. opentelemetry-instrumentation-fastapi, -requests, -httpx, and -grpc inject and extract automatically at the framework's request boundary, so a fully instrumented service propagates context with zero manual inject/extract calls. You only drop to the manual API for transports the instrumentation does not cover: a custom binary protocol, a raw socket, a message format the queue instrumentation does not recognize, or a background task spawned outside the request path. Mixing the two — manually extracting in a handler that the framework instrumentation already extracted for — is a real source of doubled spans, so reach for the manual API only where automatic instrumentation genuinely has no hook.
Baggage carries a subtle security and cost property that span attributes do not: it is automatically forwarded to every downstream service. A tenant ID or feature-flag cohort set once at the edge reaches services three hops away with no further code, which is the feature. The flip side is that anything you put in baggage leaves your trust boundary on every outbound call. Never place secrets, tokens, or PII in baggage, because it will be serialized into plaintext headers and may cross into third-party services or logs you do not control. Treat baggage as a public broadcast channel scoped to the trace, and keep it to small, non-sensitive routing keys.
There is also no automatic eviction. A key set early in a long trace rides every subsequent hop until something explicitly removes it with remove_baggage. In a deep call graph that compounds: each service that adds a key without pruning grows the header monotonically, and you discover the 8 KB ceiling only when a downstream service silently drops the truncated header. The discipline is to set baggage as close to the edge as possible, prune keys the moment they are no longer needed downstream, and treat the baggage header budget as a shared resource the whole request graph spends from.
Configuration Reference
| Setting | Env var / API | Default | Notes |
|---|---|---|---|
| Active propagators | OTEL_PROPAGATORS | tracecontext,baggage | Comma list; e.g. b3,tracecontext,baggage |
| Global propagator | set_global_textmap() | composite W3C | Programmatic override of the above |
| Baggage entry size | W3C spec limit | 4096 chars/entry | Longer entries risk truncation |
| Total baggage header | W3C spec limit | 8192 bytes | Whole baggage header budget |
| Inbound extraction | propagate.extract() | none | Returns context; does not activate |
| Context activation | context.attach() | none | Must be paired with detach() |
Async and Concurrency Considerations
asyncio is the easy case. contextvars propagate across await, so a span started before an await is still the parent of one started after it within the same coroutine. The hazard is sharing context where you did not intend to, which is exactly why every attach needs a matching detach.
Thread pools are the hard case. A ThreadPoolExecutor worker begins with a fresh, empty context, so any span created there is orphaned unless you carry the context in. Snapshot it with contextvars.copy_context() and run the worker through ctx.run(...).
import asyncio, contextvars
from opentelemetry import trace

tracer = trace.get_tracer("worker")

def blocking_work():
    # parent context was copied in via ctx.run, so this span is correctly parented
    with tracer.start_as_current_span("blocking_work") as span:
        span.set_attribute("work.kind", "cpu")

async def dispatch():
    with tracer.start_as_current_span("dispatch"):
        loop = asyncio.get_running_loop()
        ctx = contextvars.copy_context()
        await loop.run_in_executor(None, lambda: ctx.run(blocking_work))

asyncio.run(dispatch())
Task queues need the most care because the context travels as data, not as a thread-local. The producer injects context into the message payload; the consumer extracts it before work begins and detaches after, so trace state never bleeds into the next task. The full worker lifecycle, including retries and acknowledgement, is in propagating trace context across Celery tasks.
There is a semantic choice at the queue boundary that propagation alone does not make for you: should the consumer's span be a child of the producer, or a separate trace linked to it? A child span ties the two into one trace, which reads naturally for a synchronous-feeling request that happens to hop a queue. But for a fan-out where one producer message triggers thousands of consumers hours later, a single trace with thousands of children becomes unwieldy and the timing is misleading. The convention there is a CONSUMER-kind span in its own trace carrying a span link back to the PRODUCER span, which records the causal relationship without forcing everything into one timeline. The choice is yours to make in the worker; propagation just guarantees the producer's span context is available to either parent from or link to.
Connection pools and contextvars interact in one non-obvious way. A pooled object — a database connection, an HTTP client — is created once and reused across many requests, so any span created at pool-construction time is bound to the wrong, long-dead context. Always create the span at borrow-or-use time inside the active request context, never at pool initialization, or every query will appear to descend from the first request that warmed the pool.
Production Configuration and Trade-offs
Propagator order matters only on extraction, and only when a carrier could plausibly carry more than one format. The Python composite propagator runs each registered propagator in turn against the same carrier, and when more than one yields a span context the later one overrides the earlier. Injection likewise runs every propagator, so a composite of W3C plus B3 emits both header sets on outbound calls. That dual emission is usually what you want during a migration (old services read B3, new ones read W3C), but it roughly doubles the propagation header footprint, so drop the legacy format the moment the last B3-only service is retired.
Header serialization is the measurable cost of propagation, and it scales with what you propagate, not merely that you propagate. The traceparent header is a fixed ~55 bytes; baggage is whatever you put in it. In a high-throughput service the dictionary construction and string formatting on every outbound call is negligible next to the network call it accompanies, but a baggage map that has grown to several kilobytes across many hops is not — it is bytes on every request in the fan-out. Measure header size, not just request count, when you suspect propagation overhead.
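Measuring header size can be as simple as summing the propagation headers in a carrier. The helper below is a standard-library sketch with a hypothetical name; it counts bytes as the headers would appear in an HTTP/1.1 request, so treat the result as an upper bound where header compression (HTTP/2, gRPC) applies.

```python
PROPAGATION_HEADERS = frozenset({"traceparent", "tracestate", "baggage", "b3"})


def propagation_bytes(carrier: dict[str, str]) -> dict[str, int]:
    """Return the on-the-wire size of each propagation header in the carrier."""
    sizes: dict[str, int] = {}
    for name, value in carrier.items():
        lowered = name.lower()
        if lowered in PROPAGATION_HEADERS or lowered.startswith("x-b3-"):
            # "name: value\r\n" as serialized in an HTTP/1.1 request
            sizes[lowered] = len(f"{name}: {value}\r\n".encode("utf-8"))
    return sizes


carrier = {
    "traceparent": "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
    "baggage": "tenant.id=acme-corp",
    "content-type": "application/json",  # not a propagation header; ignored
}
sizes = propagation_bytes(carrier)
print(sizes, "total:", sum(sizes.values()))
```

Recording the total as a histogram on outbound calls makes baggage growth across hops visible long before it approaches the header limit.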
The hardest production case is the partially observable mesh: some services propagate, some strip headers, some are third parties you cannot instrument. The rule is to fail open. A stripped header yields an empty context, the next span becomes a new root, and you keep partial observability rather than crashing the request. Log the gap so the broken hop is visible, and where a service genuinely cannot be instrumented, consider injecting context into a payload field it forwards verbatim, so the trace can be stitched back together on the far side.
Production Code Examples
End-to-end: producer injects, consumer extracts, with verifiable headers
This example shows both halves of one hop and prints the carrier so you can assert on it in a test.
from opentelemetry import trace, propagate, baggage, context
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("demo")

def producer() -> dict:
    ctx = baggage.set_baggage("tenant.id", "acme-corp")
    token = context.attach(ctx)
    try:
        with tracer.start_as_current_span("produce"):
            carrier: dict[str, str] = {}
            propagate.inject(carrier)  # serialize context into the message
            return carrier
    finally:
        context.detach(token)

def consumer(carrier: dict) -> str:
    ctx = propagate.extract(carrier)
    token = context.attach(ctx)
    try:
        with tracer.start_as_current_span("consume") as span:
            tenant = baggage.get_baggage("tenant.id")
            span.set_attribute("tenant.id", tenant or "unknown")
            return tenant or "unknown"
    finally:
        context.detach(token)

carrier = producer()
print("carrier:", carrier)
print("consumed tenant:", consumer(carrier))
Expected Output:
carrier: {'traceparent': '00-9f0c...e1-7b2f...-01', 'baggage': 'tenant.id=acme-corp'}
consumed tenant: acme-corp
The two console-exported spans share one trace_id, and the consume span's parent_id equals the produce span's span_id, confirming the hop reconstructed the tree.
Graceful fallback when context is missing
A legacy upstream that strips headers yields an empty carrier. extract still succeeds and the next span becomes a new root; log the gap so the broken hop is visible rather than silent.
# continues from the previous example: propagate, trace, context, and tracer are already defined
import logging

logging.basicConfig()  # root handler, so warnings print as LEVEL:name:message
logger = logging.getLogger(__name__)

def handle(carrier: dict) -> None:
    ctx = propagate.extract(carrier)
    parent = trace.get_current_span(ctx).get_span_context()
    if not parent.is_valid:
        logger.warning("no inbound trace context; starting new root trace")
    token = context.attach(ctx)
    try:
        with tracer.start_as_current_span("handle_legacy"):
            pass
    finally:
        context.detach(token)

handle({})  # simulate a stripped-header request
Expected Output:
WARNING:__main__:no inbound trace context; starting new root trace
Common Mistakes
Overloading baggage with large payloads. Error signature: truncated baggage headers, dropped context downstream, intermittent broken traces. Root cause: exceeding the 4096-character per-entry or 8192-byte total limits. Remediation: keep baggage to a handful of small routing keys; move bulk data into the request body or a span attribute, and audit header size under load.
Failing to detach context in async loops. Error signature: a trace ID from one request appearing in an unrelated one under concurrency. Root cause: context.attach() without a paired context.detach(), so the attached context stays current in that task after the work finishes and is inherited by whatever runs or is spawned there next. Remediation: always wrap attached context in try/finally with detach in the finally.
Using span attributes instead of baggage for cross-service correlation. Error signature: tenant or routing data present in the edge service but absent downstream. Root cause: span attributes terminate at the span; they are not propagated. Remediation: put data that must cross hops in baggage, and reserve attributes for per-operation detail as described in Span Lifecycle and Attributes.
Re-extracting after instrumentation already did. Error signature: doubled or mis-parented spans on instrumented frameworks. Root cause: manually extracting and attaching context in a handler that an instrumentation library already extracted for. Remediation: let the framework instrumentation own extraction; only inject/extract manually on custom transports it does not cover.
Frequently Asked Questions
How does baggage differ from span attributes in OpenTelemetry?
Baggage propagates across service boundaries through headers, while span attributes stay local to a single span. Use baggage for cross-service routing or tenant correlation, and attributes for per-operation detail that the backend indexes.
What is the performance impact of context propagation in Python?
It is minimal when you use native contextvars; the measurable cost is header serialization and the extra bytes on the wire. Keep total baggage well under 8 KB and audit it under high throughput to avoid latency spikes.
How do I handle missing trace context from legacy services?
Extract still returns a valid context, just without a remote parent, so the next span you start becomes a new root. Decide whether to start a fresh sampled trace or drop the request, and log the gap so the broken hop is visible.
Why does my trace ID leak into the wrong request under asyncio?
You attached a context with context.attach but never detached it. The attached context stays current across await points for the rest of that task, and tasks created from it inherit a copy, so always pair attach with detach in a finally block.