Adding Trace IDs to Python Log Records

Correlating a log line with the distributed trace that produced it requires injecting the active trace_id and span_id into every LogRecord, then emitting those fields in structured output so a backend can join logs to spans. This guide is part of the Python Logging Fundamentals and Structured Data guide and extends the formatter configuration material with a focused, production-ready correlation recipe.

Figure: How the active span context becomes correlatable fields in JSON log output.

Prerequisites

Install the OpenTelemetry API and SDK, constrained to a compatible version range. The API exposes the current span context; the SDK, usually together with an instrumentation library, produces the real spans that give you non-zero IDs.

pip install "opentelemetry-api>=1.30.0,<2.0.0" \
            "opentelemetry-sdk>=1.30.0,<2.0.0"

No environment variables are strictly required for the correlation logic itself, but a tracer provider must be configured somewhere in the process. If you have not set one up, the distributed tracing and OpenTelemetry in Python guide covers provider initialization, and propagating context across service boundaries explains how a trace_id flows between processes so the same ID appears in logs on both sides.

Implementation

The correlation pipeline has three responsibilities: read the active span context, attach its identifiers to each LogRecord, and serialize those identifiers as discrete JSON fields.

Step 1 — Read the active span context. Call opentelemetry.trace.get_current_span() and then get_span_context() on the result. The returned SpanContext carries integer trace_id and span_id values plus an is_valid property that is False when either ID is zero, which is the case when no span is active. Note that is_valid says nothing about whether the span is recording or sampled; that information is in trace_flags. The IDs are plain Python integers, not strings (128 bits for the trace ID, 64 bits for the span ID), so format them as zero-padded lowercase hex to match W3C Trace Context, using width 32 for the trace ID and 16 for the span ID. Reading the context this way is cheap and lock-free, because the active span lives in a context variable rather than shared mutable state, so injection on the hot path adds little overhead per record.
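The formatting rule can be checked without any tracing library. The sketch below uses made-up integer values in place of SpanContext.trace_id and SpanContext.span_id (the variable names are illustrative only) and shows why the width matters: a small ID keeps its leading zeros.

```python
# Hypothetical integers standing in for SpanContext.trace_id / span_id.
trace_id_int = 0x7651916A3DF52B3F86D0C2A1BB9F4E10
span_id_int = 0x3A1F9C5D2E7B0846
small_trace_id_int = 0xABC  # a small value: leading zeros must be preserved

# "032x" / "016x" = zero-padded, lowercase hex, fixed width.
trace_id_hex = format(trace_id_int, "032x")
span_id_hex = format(span_id_int, "016x")
small_hex = format(small_trace_id_int, "032x")

print(trace_id_hex)  # 7651916a3df52b3f86d0c2a1bb9f4e10
print(span_id_hex)   # 3a1f9c5d2e7b0846
print(small_hex)     # 00000000000000000000000000000abc

# The hex string round-trips back to the original integer.
assert int(trace_id_hex, 16) == trace_id_int
```

Using hex(value) or str(value) instead would drop the padding (and add a 0x prefix in the first case), which is the mismatch described under Common mistakes.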

Step 2 — Stamp the IDs onto every record with a LogRecordFactory. The factory wraps the default record constructor and is invoked for every record the logging system creates, so you avoid attaching a filter to each handler. This is the most robust injection point.

import logging
from opentelemetry import trace

# Capture the original factory so we can delegate to it.
_old_factory = logging.getLogRecordFactory()


def _record_factory(*args, **kwargs):
    record = _old_factory(*args, **kwargs)
    span = trace.get_current_span()
    ctx = span.get_span_context()
    if ctx.is_valid:
        # Zero-padded lowercase hex matching W3C Trace Context.
        record.trace_id = format(ctx.trace_id, "032x")
        record.span_id = format(ctx.span_id, "016x")
        # trace_flags is an int bitmask; 01 means sampled.
        record.trace_flags = format(ctx.trace_flags, "02x")
    else:
        record.trace_id = "0" * 32
        record.span_id = "0" * 16
        record.trace_flags = "00"
    return record


logging.setLogRecordFactory(_record_factory)

The factory approach has one important property: it runs for records created anywhere in the process, including records emitted by third-party libraries that you do not control. That is usually what you want, because a database driver or web framework that logs a slow query should carry the same trace_id as your own handlers. The cost is global mutation of process state, so install the factory exactly once during startup and keep a reference to the previous factory, as shown above, so the chain remains intact if another component installed its own factory first.
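One way to make "install exactly once" hold even if startup code runs twice is to tag the wrapping factory and skip re-wrapping when the tag is present. This is a minimal sketch, not part of the logging API: install_once, the marker attribute name, and the placeholder enrich function are all hypothetical, and the enrich callable stands in for the span-reading logic shown above.

```python
import logging

_MARKER = "_adds_trace_ids"  # hypothetical attribute used to tag our factory


def install_once(enrich):
    """Wrap the current record factory with `enrich` unless it is already wrapped."""
    current = logging.getLogRecordFactory()
    if getattr(current, _MARKER, False):
        return current  # already installed; do not wrap a second time

    def factory(*args, **kwargs):
        record = current(*args, **kwargs)  # delegate to the previous factory
        enrich(record)
        return record

    setattr(factory, _MARKER, True)
    logging.setLogRecordFactory(factory)
    return factory


def stamp_placeholder(record):
    # Stand-in for the real span lookup; always writes the zero placeholder.
    record.trace_id = "0" * 32


first = install_once(stamp_placeholder)
second = install_once(stamp_placeholder)  # no-op: the tag is detected
record = logging.getLogRecordFactory()("t", logging.INFO, "example.py", 1, "x", None, None)
print(first is second, record.trace_id)
```

The tag is only visible while the tagged factory is the outermost one. If another component wraps the factory afterwards, a later call would wrap again, so this guards against repeated startup code rather than against every ordering.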

Step 3 — Emit the fields from a JSON formatter. Because the attributes now exist on every record, the formatter reads them with plain attribute access. Use getattr defaults so the formatter still works if the factory was not installed. Keeping the field present even when empty matters for downstream schemas: a log pipeline that indexes on trace_id will reject or mis-map records where the key is sometimes absent, so emitting an explicit zero placeholder is safer than omitting it.

import json
import logging


class TraceJSONFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", "0" * 32),
            "span_id": getattr(record, "span_id", "0" * 16),
            "trace_flags": getattr(record, "trace_flags", "00"),
        }
        return json.dumps(payload, separators=(",", ":"))

Step 4 — Wire it together and emit inside a span. Attach the formatter to a stream handler and log from within an active span so the IDs are populated. The same structured-output discipline appears in the structured logging with the Python standard library guide.

import logging
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer("checkout")

handler = logging.StreamHandler()
handler.setFormatter(TraceJSONFormatter())
log = logging.getLogger("checkout")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False

with tracer.start_as_current_span("process_payment"):
    # order_id is attached to the record, but TraceJSONFormatter does not emit extra fields.
    log.info("payment authorized", extra={"order_id": "A-9182"})

Expected Output (the timestamp and IDs differ on every run, and the UTC offset reflects the local time zone):

{"timestamp":"2026-06-19T10:14:02+0000","level":"INFO","logger":"checkout","message":"payment authorized","trace_id":"7651916a3df52b3f86d0c2a1bb9f4e10","span_id":"3a1f9c5d2e7b0846","trace_flags":"01"}

Alternative: a logging.Filter

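If you cannot replace the global record factory, for example when another library already owns it, attach a logging.Filter instead. A filter mutates the record in place and returns True so the record is still emitted. The trade-off is that it must be added to every handler that should carry the IDs. A filter attached to a logger runs only for records logged directly on that logger, not for records propagated up from its descendants, so handler-level attachment is the dependable choice.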

import logging
from opentelemetry import trace


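class TraceContextFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        ctx = trace.get_current_span().get_span_context()
        valid = ctx.is_valid
        # Mirror the factory: hex IDs when a span is active, zero placeholders otherwise.
        record.trace_id = format(ctx.trace_id, "032x") if valid else "0" * 32
        record.span_id = format(ctx.span_id, "016x") if valid else "0" * 16
        record.trace_flags = format(ctx.trace_flags, "02x") if valid else "00"
        return True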


handler.addFilter(TraceContextFilter())
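Where the filter is attached changes which records it sees. The sketch below uses only the standard library; StampFilter is a stand-in for TraceContextFilter that writes a fixed, made-up ID, and the logger names are hypothetical. It logs from a child logger and compares a filter on the parent logger with a filter on the parent's handler.

```python
import io
import logging


class StampFilter(logging.Filter):
    """Stand-in for TraceContextFilter: stamps a fixed placeholder ID."""

    def filter(self, record):
        record.trace_id = "f" * 32  # hypothetical value for the demo
        return True


class IdFormatter(logging.Formatter):
    def format(self, record):
        return f"{record.name} trace_id={getattr(record, 'trace_id', 'MISSING')}"


def emit_from_child(attach_to):
    parent = logging.getLogger(f"demo_{attach_to}")
    child = logging.getLogger(f"demo_{attach_to}.db")
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(IdFormatter())
    parent.addHandler(handler)
    parent.setLevel(logging.INFO)
    parent.propagate = False
    # Attach the filter either to the handler or to the parent logger.
    target = handler if attach_to == "handler" else parent
    target.addFilter(StampFilter())
    child.info("slow query")  # propagates up to the parent's handler
    return stream.getvalue().strip()


on_logger = emit_from_child("logger")
on_handler = emit_from_child("handler")
print(on_logger)   # demo_logger.db trace_id=MISSING
print(on_handler)  # demo_handler.db trace_id=ffffffffffffffffffffffffffffffff
```

The logger-level filter never runs for the child's record, because logger filters are consulted only on the logger where the call was made. The handler-level filter runs for every record the handler processes.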

Configuration options

Choice	Option	When to use
Injection point	`LogRecordFactory`	Default; stamps every record process-wide with no per-handler wiring.
Injection point	`logging.Filter`	When the factory is already owned, or you need IDs on only some handlers.
Trace ID format	`format(id, "032x")`	W3C Trace Context hex expected by Tempo, Jaeger, and most backends.
Trace ID format	raw `int`	Only if a downstream consumer requires the integer; avoid for portability.
Missing-span value	`"0" * 32`	Explicit zeros keep the field present and queryable for un-traced logs.
Missing-span value	omit field	Smaller lines, but breaks schemas that require a fixed key set.

Verification

Run the wired example and confirm three things in the emitted line: trace_id is 32 hex characters, span_id is 16 hex characters, and trace_flags is 01 when the span is sampled. To prove the factory degrades correctly, log once outside any span and assert the zero placeholders appear.

log.info("startup complete")  # no active span

Expected Output (the timestamp will differ):

{"timestamp":"2026-06-19T10:14:00+0000","level":"INFO","logger":"checkout","message":"startup complete","trace_id":"00000000000000000000000000000000","span_id":"0000000000000000","trace_flags":"00"}
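The three checks can be automated against captured log lines. The following is a small sketch using only the standard library; check_line and the sample lines are illustrative, and the field names are the ones emitted by the formatter in this guide.

```python
import json
import re

# Lowercase hex of the exact widths the formatter is expected to emit.
PATTERNS = {
    "trace_id": re.compile(r"[0-9a-f]{32}"),
    "span_id": re.compile(r"[0-9a-f]{16}"),
    "trace_flags": re.compile(r"[0-9a-f]{2}"),
}


def check_line(line):
    """Return a list of problems found in one JSON log line (empty if none)."""
    payload = json.loads(line)
    problems = []
    for key, pattern in PATTERNS.items():
        value = payload.get(key)
        if not isinstance(value, str) or not pattern.fullmatch(value):
            problems.append(f"{key}: {value!r}")
    return problems


good = (
    '{"level":"INFO","message":"payment authorized",'
    '"trace_id":"7651916a3df52b3f86d0c2a1bb9f4e10",'
    '"span_id":"3a1f9c5d2e7b0846","trace_flags":"01"}'
)
# Raw integer trace ID and uppercase span ID: both should be flagged.
bad = (
    '{"level":"INFO","message":"raw int",'
    '"trace_id":157,"span_id":"3A1F9C5D2E7B0846","trace_flags":"01"}'
)

print(check_line(good))  # []
print(check_line(bad))   # ['trace_id: 157', "span_id: '3A1F9C5D2E7B0846'"]
```

The zero placeholders also pass this check, since they are well-formed. To assert that a span was actually active, compare trace_id against "0" * 32 as a separate step.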

Once the fields reach your backend, correlation is a query rather than a guess. In a trace-aware log store you filter logs by the trace_id shown on a slow or failed span and immediately see every line that request produced across services, provided each service injects the same ID and shares context as described in the context propagation and baggage guide. The reverse jump also works: from a log line you copy the trace_id into the trace UI and open the full span tree.

A quick assertion in a unit test confirms the factory is installed without needing a live span context.

import logging

record = logging.getLogRecordFactory()("t", logging.INFO, __file__, 1, "x", None, None)
assert len(record.trace_id) == 32
assert len(record.span_id) == 16

Common mistakes

Logging outside the span's context and expecting populated IDs. OpenTelemetry context is held in a context variable. A record created in a thread or callback that did not inherit that context sees no active span and produces zeros. Propagate context explicitly when crossing thread or task boundaries, as covered in the context variables and thread safety guide.
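The effect is reproducible with a plain context variable. In this sketch, active_trace_id is a hypothetical stand-in for the slot in which OpenTelemetry keeps the active span, and the worker pool plays the role of any long-lived background thread.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the active-span slot (hypothetical name, not an OpenTelemetry API).
active_trace_id = contextvars.ContextVar("active_trace_id", default=None)


def read_trace_id():
    return active_trace_id.get()


with ThreadPoolExecutor(max_workers=1) as pool:
    pool.submit(lambda: None).result()  # worker thread exists before any "span"

    active_trace_id.set("7651916a3df52b3f86d0c2a1bb9f4e10")  # "enter a span"

    # Plain submit: the function runs in the worker thread's own context.
    without_context = pool.submit(read_trace_id).result()

    # Explicit propagation: snapshot the caller's context and run inside it.
    snapshot = contextvars.copy_context()
    with_context = pool.submit(snapshot.run, read_trace_id).result()

print(without_context)  # None
print(with_context)     # 7651916a3df52b3f86d0c2a1bb9f4e10
```

A log call made inside read_trace_id would be stamped with zeros in the first case and with the real ID in the second. With OpenTelemetry itself, the same idea is usually expressed by capturing the context in the caller and attaching it in the worker, or by using an instrumentation that does this for you.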

Emitting the trace ID as a raw integer. Backends key on the 32-character hex string. A bare integer or an uppercase or unpadded value will not match the span the backend stored, so correlation silently fails. Always format with "032x" and "016x".

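Stacking a factory and a filter that both set the same attributes. If both run, the last writer wins, and the two do not always see the same context: a filter on a handler that runs in another thread (behind a QueueListener, for example) finds no active span there and overwrites the valid ID stamped at record creation with zeros. Choose one injection point and remove the other.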
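The overwrite is easiest to see with a queue-based handler. This sketch uses only the standard library and makes several simplifying assumptions: active_trace_id stands in for the active span, the extra argument stands in for the record factory as the first writer, and StampFilter stands in for a trace filter as the second writer.

```python
import contextvars
import io
import logging
import logging.handlers
import queue

# Stand-in for the active span (hypothetical; real code would read OpenTelemetry).
active_trace_id = contextvars.ContextVar("active_trace_id", default=None)


def current_id():
    return active_trace_id.get() or "0" * 32


class StampFilter(logging.Filter):
    def filter(self, record):
        record.trace_id = current_id()  # overwrites whatever is already there
        return True


def run(add_filter_downstream):
    stream = io.StringIO()
    sink = logging.StreamHandler(stream)
    sink.setFormatter(logging.Formatter("%(trace_id)s %(message)s"))
    if add_filter_downstream:
        sink.addFilter(StampFilter())  # second writer, runs in the listener thread

    q = queue.Queue()
    listener = logging.handlers.QueueListener(q, sink)
    listener.start()  # listener thread starts before any "span" exists

    log = logging.getLogger(f"stack_demo_{add_filter_downstream}")
    log.setLevel(logging.INFO)
    log.propagate = False
    log.addHandler(logging.handlers.QueueHandler(q))

    def emit():
        active_trace_id.set("7651916a3df52b3f86d0c2a1bb9f4e10")
        # First writer: stamped in the calling thread, as a record factory would.
        log.info("payment authorized", extra={"trace_id": current_id()})

    contextvars.copy_context().run(emit)  # keep the "span" scoped to this call
    listener.stop()  # drains the queue and joins the listener thread
    return stream.getvalue().strip()


single_writer = run(add_filter_downstream=False)
stacked = run(add_filter_downstream=True)
print(single_writer)  # 7651916a3df52b3f86d0c2a1bb9f4e10 payment authorized
print(stacked)        # 00000000000000000000000000000000 payment authorized
```

With a single writer the ID stamped in the calling thread survives the queue. With the downstream filter added, the listener thread has no active "span", so the filter replaces the valid ID with the placeholder.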

Frequently Asked Questions

Why is trace_id always 0 in my log records?

A trace_id of 0 (rendered as 32 zeros) means there is no active span in the current context when the log call runs. Either instrumentation has not started a span yet, or the log call executes outside the span's context, for example in a background thread that did not inherit the OpenTelemetry context.

Should I use a logging.Filter or a LogRecordFactory to inject trace IDs?

A LogRecordFactory injects the IDs into every record globally with no per-logger wiring, which is the most reliable approach. A Filter must be attached to each handler (or logger) that should carry the IDs, and a logger-level filter is not applied to records that propagate up from child loggers, so prefer the factory unless you need per-handler control.

Do I need the full OpenTelemetry SDK just to add trace IDs to logs?

You need the API to read the current span context, and an SDK plus instrumentation to actually create spans. Reading trace_id with the API alone returns zeros unless something is producing real spans, so a configured tracer provider is required for meaningful correlation.

How should trace IDs be formatted for backends like Grafana Tempo or Jaeger?

Emit the trace ID as a lowercase 32-character hex string and the span ID as 16 hex characters, matching the W3C Trace Context format. Most backends expect this exact zero-padded hex representation to link a log line to its trace.