Span Lifecycle and Attributes in Python OpenTelemetry

A span is the atomic unit of a distributed trace, and every span moves through a fixed lifecycle: creation, context attachment, attribute enrichment, status assignment, termination, and export. Getting this lifecycle right is what separates a queryable trace graph from a pile of orphaned, oversized, or silently dropped spans. This guide sits within the broader Distributed Tracing and OpenTelemetry in Python guide, and it builds directly on a correctly initialized provider from OpenTelemetry SDK Setup. It also pairs with context propagation and baggage, which governs how a span's context survives a network hop.

[Figure: span lifecycle stages. A horizontal flow of five stages: start span, attach context, attributes + status, end span, batch export. The context manager guarantees end; the processor decouples export.]
The span lifecycle: the context manager owns stages one through four, and the batch processor owns export off the request path.

Key implementation areas covered below:

  • Span creation and context attachment mechanics across sync and async code.
  • Attribute cardinality, type validation, and length-limit enforcement.
  • Status code assignment, span events, and exception recording.
  • Termination guarantees and the export pipeline under load.

Prerequisites

Install the API and SDK together with matched versions. A mismatch between opentelemetry-api and opentelemetry-sdk is the most common cause of ImportError during provider construction.

pip install \
  "opentelemetry-api>=1.30.0,<2.0.0" \
  "opentelemetry-sdk>=1.30.0,<2.0.0" \
  "opentelemetry-semantic-conventions>=0.51b0,<1.0.0"

The examples assume a TracerProvider registered as the global provider. If you have not set one up, follow the deterministic bootstrap in OpenTelemetry SDK Setup first.

Concept and Architecture

A span records one operation: an HTTP handler, a database call, a queue consume. It carries a span context (trace ID, span ID, trace flags), a parent reference, a start and end timestamp, a set of attributes, an ordered list of events, and a status. The trace ID is shared by every span in the request; the parent reference is what reconstructs the tree.

The Python SDK tracks the "current" span using PEP 567 contextvars. When you call tracer.start_as_current_span(), the SDK does two things: it creates the span, and it sets that span as the active one in the current context. Any span you start while it is active becomes its child automatically. This implicit parenting is the entire reason synchronous tracing works without you threading a span object through every function.

It helps to separate three things the word "span" can mean. There is the live span object you hold and mutate inside the with block; there is its immutable SpanContext — the trace ID, span ID, and flags that identify it and propagate to other services; and there is the finished, read-only span the SDK hands to the processor after end(). You set attributes and status on the live object; you read the SpanContext to correlate with logs or metrics; and the processor only ever sees the immutable finished form. A common confusion is trying to mutate a span after it has ended — the SDK ignores the call rather than raising, so the change vanishes silently. Everything you want recorded must happen before the with block exits.

Asynchronous event loops complicate this because work is suspended and resumed across await points. contextvars propagate correctly across await, but they do not propagate across a raw thread boundary or a ThreadPoolExecutor submission unless you carry the context explicitly. Detaching a span from its logical execution path produces an orphaned span whose parent reference points at nothing the backend can resolve. The rules for carrying context across those boundaries are detailed in context propagation and baggage.
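The propagation rule can be seen with the standard library alone. The sketch below uses a plain ContextVar as a stand-in for the SDK's current-span slot (an assumption made to keep the example dependency-free) and reads it after an await, from a bare executor submission, and from a submission wrapped in a copied context.

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the "current span" slot; the name is hypothetical.
current_span_name = contextvars.ContextVar("current_span_name", default=None)


def read_span_name():
    return current_span_name.get()


async def main():
    current_span_name.set("handle_request")
    await asyncio.sleep(0)
    after_await = read_span_name()             # survives the await
    ctx = contextvars.copy_context()           # snapshot to carry explicitly
    with ThreadPoolExecutor(max_workers=1) as pool:
        bare = pool.submit(read_span_name).result()              # worker thread has its own context
        carried = pool.submit(ctx.run, read_span_name).result()  # runs inside the snapshot
    return after_await, bare, carried


after_await, bare, carried = asyncio.run(main())
print(after_await, bare, carried)
```

On a standard CPython build the bare submission sees the default value, while the await and the explicitly carried context both see the value set by the handler.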

There are two ways to start a span. start_as_current_span() returns a context manager that both activates the span and guarantees end(). start_span() returns a detached span you must activate and end yourself; reserve it for cases where the span outlives a single function scope, such as a long-running stream where start and end happen in different callbacks.

A span also carries a SpanKind that the backend uses to assemble service topology. SERVER marks the inbound side of a remote call, CLIENT the outbound side, and PRODUCER and CONSUMER the two ends of an asynchronous message hop. A CLIENT span in one service and the SERVER span it triggers in the next share a trace ID and form a parent-child edge; that edge is how a distributed map gets drawn. Leaving everything INTERNAL produces correct traces but a flat, hop-less topology, so set kind deliberately on anything that crosses a process boundary.

Sampling intersects the lifecycle at exactly one point: span creation. The configured sampler runs inside start_as_current_span() and decides whether the span records and exports. A non-recording span is cheap — set_attribute and add_event become near no-ops — which is why you should never guard instrumentation behind your own if span.is_recording() checks for correctness; the SDK already short-circuits the work. The exception is when computing an attribute value is itself expensive (a JSON serialization, a hash), in which case the is_recording() guard saves real CPU. How that sampling decision is made and propagated is the subject of sampling strategies for distributed tracing.

Step-by-Step Implementation

Step 1 — Acquire a tracer. Get a named tracer from the global provider. The name should identify the instrumenting library or module, not the service.

from opentelemetry import trace

tracer = trace.get_tracer("order-service.checkout")

Step 2 — Start the span and bind context. Use the context manager form so the span is both activated and guaranteed to end. Pass kind to describe the span's role; SERVER, CLIENT, PRODUCER, and CONSUMER drive backend topology views, while INTERNAL is the default for in-process work.

from opentelemetry.trace import SpanKind

with tracer.start_as_current_span("process_order", kind=SpanKind.INTERNAL) as span:
    ...  # span is now the active span; children attach automatically

Step 3 — Set attributes with semantic conventions. Attributes must be primitives (str, bool, int, float) or homogeneous arrays of primitives. The SDK does not raise on bad input: an array that mixes types, or a None value, is dropped with a logged warning, so the attribute is simply missing at the backend. Prefer the documented semantic convention keys (http.request.method, db.system, messaging.destination.name) so dashboards and alerts written against one service work across all of them.

span.set_attributes({
    "order.id": "ORD-991",
    "order.total": 149.99,
    "http.request.method": "POST",
})

Step 4 — Record outcome. Set StatusCode.ERROR only when the operation fails its contract, and record the exception first so the stack trace is preserved as a span event. StatusCode.UNSET is the correct default for success; explicitly setting OK is rarely needed and can hide a downstream override. The three-state status model is deliberate: UNSET means "no opinion, treat as success," OK means "definitively succeeded, do not override," and ERROR means "failed." Reserve OK for the rare case where a downstream auto-instrumentation might otherwise mark a span as failed and you need to assert success — for example, a 404 that is an expected, handled outcome rather than a fault.

from opentelemetry.trace import Status, StatusCode

try:
    charge_payment()
except PaymentError as exc:
    span.record_exception(exc)            # captures type, message, stacktrace
    span.set_status(Status(StatusCode.ERROR, str(exc)))
    raise

Step 5 — Let it end and export. Exiting the with block calls end(), stamping the end time and handing the finished span to the registered span processor. With a BatchSpanProcessor, export happens on a background thread, off the request path.

Span events and links are the two enrichment mechanisms beyond attributes. An event is a timestamped annotation inside the span — a retry, a cache miss, a validation failure — recorded with span.add_event(name, attributes). Events are cheaper than child spans and ideal for marking moments that do not warrant their own duration. A link, created at span start, points at another span context that is causally related but not the parent: the canonical case is a batch consumer whose single span links to the many producer spans that contributed messages. Reach for links when one span has several upstream causes rather than one.

The export path is asynchronous by design. The BatchSpanProcessor enqueues each finished span and a worker thread drains the queue every schedule_delay_millis or whenever max_export_batch_size accumulates. This decoupling is what keeps tracing off the critical path, but it has a failure mode: if the queue fills faster than the exporter drains it, spans are dropped once the queue is full. The SDK logs a warning but raises nothing, so the loss is easy to miss. The drop is intentional load shedding, not an error, so under sustained load you tune max_queue_size and schedule_delay_millis rather than expecting an exception.

Configuration Reference

These limits and processor settings are the levers that govern span size and export behavior. Environment variables are read once at SDK initialization.

| Setting | Env var / parameter | Default | Effect |
| --- | --- | --- | --- |
| Attribute count limit | OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT | 128 | Attributes beyond the limit are dropped |
| Attribute value length | OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT | unlimited | Longer string values are truncated |
| Event count limit | OTEL_SPAN_EVENT_COUNT_LIMIT | 128 | Events beyond the limit are dropped |
| Link count limit | OTEL_SPAN_LINK_COUNT_LIMIT | 128 | Links beyond the limit are dropped |
| Max queue size | max_queue_size | 2048 | Spans buffered before drops begin |
| Batch size | max_export_batch_size | 512 | Spans per export call |
| Schedule delay | schedule_delay_millis | 5000 | Interval between scheduled exports (ms) |
| Export timeout | export_timeout_millis | 30000 | Per-export deadline (ms) |

Distinguish resource attributes from span attributes. Resource attributes (service.name, deployment.environment) describe the producer and are attached once on the TracerProvider. Span attributes describe one operation. Copying static resource data onto every span inflates payload size with no analytical gain.

The limits matter more than they first appear because the SDK enforces them silently. A value that exceeds OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT is truncated without warning, so a serialized payload you stuffed into an attribute arrives at the backend half-present and you debug a "corrupt" value that was simply cut. An attribute past the count limit is dropped entirely. Set the value-length limit deliberately — many backends index only the first few hundred characters of a string field anyway, so a limit of 1024 both protects the wire and matches what the backend can actually search. There is no per-attribute override; the limits are global to the SDK, which is the right level since they exist to protect the export pipeline, not individual spans.

A useful discipline is to set attributes as early in the span as you have the data, not all at the end. The SDK applies count and length limits at set time, and a sampler that inspects attributes (covered in sampling strategies for distributed tracing) only sees attributes passed at span creation through the attributes= argument — attributes added later with set_attribute are invisible to the sampling decision. If a route or tenant drives sampling, pass it at start_as_current_span() time, not afterward.

Async and Concurrency Considerations

In a pure asyncio handler, start_as_current_span() behaves exactly as it does synchronously because contextvars follow the coroutine across await. The danger appears at concurrency boundaries the SDK cannot see. When you offload CPU-bound work with loop.run_in_executor() or hand a job to a background thread, the new thread starts with an empty context, so a span created there has no parent.

The fix is to capture the active context and re-attach it inside the worker. The example below uses contextvars.copy_context(), which snapshots the active OpenTelemetry context so the executor thread runs the child span under the correct parent.

import asyncio
import contextvars
from opentelemetry import trace

tracer = trace.get_tracer("async-handler")


def cpu_bound(n: int) -> int:
    # Runs in a worker thread; ctx.run in handle_request supplies the parent context
    with tracer.start_as_current_span("hash_payload") as span:
        span.set_attribute("payload.size", n)
        return sum(i * i for i in range(n))


async def handle_request(payload_size: int) -> int:
    with tracer.start_as_current_span("async_handler") as span:
        span.set_attribute("processing.duration_ms", 50)
        await asyncio.sleep(0.05)
        loop = asyncio.get_running_loop()
        ctx = contextvars.copy_context()
        # ctx.run rebinds the captured OTel context inside the worker thread
        return await loop.run_in_executor(None, lambda: ctx.run(cpu_bound, payload_size))


asyncio.run(handle_request(10000))

Without ctx.run, the hash_payload span would become a second root span. With it, the backend renders a clean async_handler → hash_payload parent-child edge.

The same principle applies to asyncio.create_task(), but with a subtler default. A task created from inside an active span inherits a copy of the current context at creation time, so a span started in the task is correctly parented — provided you create the task while the parent span is still active. Create the task after the parent's with block exits and the parent context is already gone, leaving the task's span orphaned. Two patterns avoid this: create all child tasks inside the parent span's scope, or capture the context explicitly and wrap the coroutine. For long-lived background tasks that outlive the request, prefer an explicit start_span() with a captured parent context over relying on the ambient one, because the request's context will be torn down long before the task finishes.

Shutdown is the other place async tracing leaks data. A BatchSpanProcessor holds spans in memory between flushes, so a process that exits without flushing loses everything still queued. In a container that means registering a SIGTERM handler that calls provider.shutdown(), which flushes and then stops the worker thread. Web frameworks expose a cleaner hook: an ASGI lifespan shutdown or a Flask teardown that calls force_flush() before the process returns. The FastAPI setup guide wires this into the lifespan context manager.

Production Code Examples

Full span lifecycle with events, status, and graceful flush

This end-to-end example wires a provider, records a span event, handles an error path, and flushes pending spans on shutdown so a scaling pod loses nothing.

import signal
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.resources import Resource
from opentelemetry.trace import Status, StatusCode, SpanKind
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# 1. Resource attributes describe the producer once, not per span
resource = Resource.create({"service.name": "order-service"})
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("order-service.checkout")


def process_order(order_id: str, total: float) -> str:
    with tracer.start_as_current_span("process_order", kind=SpanKind.INTERNAL) as span:
        span.set_attributes({"order.id": order_id, "order.total": total})
        span.add_event("inventory_reserved", {"warehouse": "us-east-1"})
        try:
            if total <= 0:
                raise ValueError("non-positive order total")
            span.set_attribute("payment.status", "captured")
            return "success"
        except ValueError as exc:
            span.record_exception(exc)               # record before status
            span.set_status(Status(StatusCode.ERROR, str(exc)))
            raise


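# 2. Flush on SIGTERM so a scaling pod exports buffered spans
def _shutdown(*_):
    provider.shutdown()   # flushes the queue, then stops the worker thread
    raise SystemExit(0)   # a custom handler replaces the default SIGTERM exit


signal.signal(signal.SIGTERM, _shutdown)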

if __name__ == "__main__":
    process_order("ORD-991", 149.99)
    provider.force_flush()

Expected Output (abbreviated; timestamps, links, and SDK-populated resource attributes are omitted):

{
  "name": "process_order",
  "context": {"trace_id": "0x8a3c...", "span_id": "0x7b2f...", "trace_state": "[]"},
  "kind": "SpanKind.INTERNAL",
  "parent_id": null,
  "status": {"status_code": "UNSET"},
  "attributes": {
    "order.id": "ORD-991",
    "order.total": 149.99,
    "payment.status": "captured"
  },
  "events": [
    {"name": "inventory_reserved", "attributes": {"warehouse": "us-east-1"}}
  ],
  "resource": {"attributes": {"service.name": "order-service"}}
}

Correlating a span with metrics via exemplars

A span ID embedded in a histogram bucket is an exemplar: it lets a latency spike on a dashboard link straight to the trace that caused it. The trace-side requirement is simply that the span be sampled and recording when the metric is observed. The metric side is covered in Python Metrics and Instrumentation, but the join key is the span context read here.

from opentelemetry import trace

tracer = trace.get_tracer("order-service.checkout")

with tracer.start_as_current_span("checkout") as span:
    ctx = span.get_span_context()
    # Pass trace_id/span_id to your metric recording so backends attach an exemplar
    labels = {
        "trace_id": format(ctx.trace_id, "032x"),
        "span_id": format(ctx.span_id, "016x"),
        "sampled": ctx.trace_flags.sampled,
    }
    # record_latency_histogram(value, exemplar=labels)

Illustrative result with a recording SDK provider configured (the snippet itself prints nothing):

checkout span_id=0x7b2f... sampled=True  -> histogram bucket carries this exemplar

Common Mistakes

Unbounded attribute cardinality. Error signature: backend indexing failures, slow trace search, ballooning storage cost. Root cause: high-cardinality values such as raw user IDs, UUIDs, or full request bodies attached as span attributes. Remediation: keep attributes low-cardinality; route per-user debugging data into structured logs carrying the trace ID, or into context propagation and baggage when it genuinely must cross services.
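One common way to keep an attribute low-cardinality is to record a route template instead of the raw path. The helper below is a hypothetical sketch using only the standard library; route_template and its pattern are assumptions, and a web framework's own matched-route value is preferable when available.

```python
import re

# Matches a path segment that is all digits or shaped like a UUID.
_ID_SEGMENT = re.compile(r"/(?:\d+|[0-9a-fA-F]{8}-[0-9a-fA-F-]{27})(?=/|$)")


def route_template(path: str) -> str:
    """Collapse identifier-like segments so the value has bounded cardinality."""
    return _ID_SEGMENT.sub("/{id}", path)


print(route_template("/users/48213/orders/7"))
print(route_template("/carts/3f2b8c1e-9d4a-4e6b-8a1f-0c2d3e4f5a6b"))
```

The raw identifiers can still go to a structured log line that carries the trace ID.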

Manual span.end() without scope cleanup. Error signature: orphaned spans, leaked context, incorrect parent-child edges in async or threaded code. Root cause: using start_span() plus a manual end() while forgetting to detach the attached context token. Remediation: prefer start_as_current_span() as a context manager; if you must use a detached span, pair context.attach() with context.detach() in a finally block.

Overriding resource attributes at span level. Error signature: inflated OTLP payloads, duplicated service.name on every span. Root cause: setting static topology metadata as span attributes instead of on the Resource. Remediation: define service.name, deployment.environment, and similar once during provider initialization.

Setting ERROR status before recording the exception. Error signature: error traces with no stack trace event. Root cause: an early raise after set_status skips record_exception. Remediation: always call span.record_exception(exc) first, then set_status(StatusCode.ERROR). Error status also drives retention in tail-based sampling strategies for distributed tracing, so a missing status can cost you the trace entirely.

Frequently Asked Questions

How do OpenTelemetry attribute limits affect Python span performance?

Exceeding the limits truncates string values or drops the excess attributes, events, and links; the span itself is still exported. Configure OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT and OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT to align with your backend indexing capacity rather than letting values be silently cut.

Should I use span events or log attributes for high-frequency debugging?

Use span events for discrete, trace-correlated occurrences such as a retry or a cache miss. For high-frequency, high-cardinality data, use structured logging with trace ID injection to avoid bloating the trace and throttling the backend.

How does the span lifecycle interact with async Python frameworks like FastAPI or aiohttp?

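Within a single event loop, contextvars keep the active span correct across await and isolated between concurrent requests, so the official instrumentation packages need no extra wiring. Explicit context capture is needed only where work leaves the loop's context: executor threads, background threads, and tasks created after the parent span has ended.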

Do I need to call span.end() myself?

Not when you use start_as_current_span as a context manager, which calls end() even on unhandled exceptions. You only call end() manually when you start a detached span with start_span and manage its scope yourself.