structlog Architecture and Setup
structlog separates the act of capturing a log event from the act of rendering it, passing a single dictionary through an ordered chain of processor functions until a terminal renderer turns it into text or JSON. This guide, part of the Modern Python Logging Libraries Deep Dive, explains that pipeline, gives a production-ready bootstrap configuration, and covers the async-safety and performance details that decide whether structlog helps or hurts under load. It links to the architectural comparison in the standard library versus third-party libraries, to the migration path in migrating from standard logging to structlog, and to its focused child on binding context variables in structlog.
The mental model that makes structlog click is that a log call does not produce a string — it produces a dictionary, and you decide how that dictionary is transformed and rendered. Filtering, context injection, and serialization are all just functions in a list, which is what gives the pipeline its determinism and its testability. The sections below set up that pipeline correctly and then guard the two places teams get it wrong: async context and per-call cost.
Because every stage is an ordinary function, the pipeline is something you can reason about and test in isolation rather than a black box. A processor takes three arguments and returns a dictionary, so you can call one directly in a unit test and assert on the keys it adds or removes — there is no logging framework to mock and no global state to reset. This testability is not incidental; it is the reason a platform team can package a processor chain as a shared library, write tests that lock its output schema, and roll it out across services with confidence that every one of them emits the same fields in the same shape.
Prerequisites
Pin structlog so the processor API and JSON shape stay stable, and pin OpenTelemetry if you intend to inject trace context.
pip install \
"structlog>=24.1.0,<26.0.0" \
"opentelemetry-sdk>=1.30.0,<2.0.0"
The examples assume Python 3.11 or newer so that contextvars propagates correctly across asyncio tasks.
Concept & architecture
structlog replaces the handler-and-formatter graph with a flat, ordered list of processors. A log call constructs an event_dict whose first key is the event name, then hands it to each processor in turn. A processor receives the wrapped logger, the called method name, and the current dict, and returns a possibly modified dict. Early processors enrich and filter; the final processor is a terminal renderer that returns a string — JSONRenderer for production, ConsoleRenderer for a local TTY. Because every processor is an ordinary function, the chain is deterministic and trivially unit-testable: feed in a dict, assert on what comes out.
Two configuration choices govern behavior. wrapper_class=make_filtering_bound_logger(logging.INFO) builds a bound logger that discards sub-INFO calls before any processor runs, so a dropped DEBUG line costs almost nothing. logger_factory decides where the final string goes: PrintLoggerFactory() writes straight to stdout, while stdlib.LoggerFactory() routes through the standard library so structlog coexists with libraries that log via logging. That coexistence is the foundation of the migration covered in migrating from standard logging to structlog.
The choice of factory is more consequential than it first appears. PrintLoggerFactory() is the simplest path and is perfect for a greenfield service that logs only through structlog, but it bypasses the standard library entirely, which means a third-party dependency logging through logging will not pass through your structlog chain. stdlib.LoggerFactory() is the right default for any real service, because it lets you attach a ProcessorFormatter to the root logger and capture both structlog records and stdlib records from your dependencies through one chain. The cost is a slightly more involved setup; the benefit is a single, consistent schema across your code and everything it imports. When in doubt, route through the standard library — you can always simplify later, but retrofitting dependency capture into a PrintLoggerFactory setup means reconfiguring the whole stack.
Immutability is the property that makes the pipeline safe under concurrency. When you call log.bind(key=value), structlog returns a new bound logger with the added key rather than mutating the one you called, so two coroutines that bind different values to the same base logger never interfere. The event dict each call constructs is likewise local to that call. This copy-on-bind discipline is why the same configured logger can be shared freely across an entire async application without any locking, and why structlog avoids the whole class of bugs that come from mutating shared logging state.
Context is carried out of band through contextvars, not through call arguments. The merge_contextvars processor reads whatever was bound for the current execution context and folds it into the event dict, so a trace_id bound once in middleware appears on every subsequent line in that request. The binding API and its lifecycle rules are the subject of binding context variables in structlog.
structlog actually offers two layers of context, and distinguishing them prevents a lot of confusion. The first is per-logger context, attached with log = log.bind(component="payments"), which travels with that specific bound logger instance and is ideal for static fields scoped to a module or object. The second is the contextvars-backed context, set with bind_contextvars, which is global to the execution context and is the right home for request-scoped identifiers that must reach code holding a different logger instance. The merge_contextvars processor pulls in the second layer; the bound logger carries the first. A common mistake is to reach for bind() for a request_id and then wonder why a helper function's own logger does not see it — the answer is that request_id belonged in the contextvars layer all along.
The ordering of processors is itself a design decision with real consequences. Put merge_contextvars early so every downstream processor and the renderer can see the merged fields. Put any redaction or field-dropping processor before the renderer so secrets never reach serialization. Put format_exc_info before the renderer if you log exceptions, so the traceback becomes a structured field rather than an unhandled object. The renderer is always last because once the dict has become a string, no further processor can inspect or transform its fields.
Step-by-step implementation
Step 1 — Define the processor chain in order. Put merge_contextvars first so request context is available to everything downstream, then add level and timestamp, then the terminal renderer last.
Step 2 — Configure once at bootstrap. Call structlog.configure() a single time before any worker starts, and enable cache_logger_on_first_use=True so the bound logger is built once rather than rebuilt on every call.
import asyncio
import structlog
import logging
# Configure once during application bootstrap
structlog.configure(
processors=[
structlog.contextvars.merge_contextvars, # request context first
structlog.processors.add_log_level,
structlog.processors.TimeStamper(fmt="iso"),
structlog.processors.JSONRenderer(), # terminal renderer last
],
wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
cache_logger_on_first_use=True, # build logger once
logger_factory=structlog.stdlib.LoggerFactory(),
)
async def process_request() -> None:
# Step 3 — bind request-scoped context with contextvars
structlog.contextvars.bind_contextvars(
request_id="req-8f3a9c",
user_agent="curl/7.88.1",
)
log = structlog.get_logger()
log.info("request_started", method="GET", path="/api/v1/orders")
if __name__ == "__main__":
asyncio.run(process_request())
Expected Output:
{"request_id": "req-8f3a9c", "user_agent": "curl/7.88.1", "event": "request_started", "method": "GET", "path": "/api/v1/orders", "level": "info", "timestamp": "2026-06-19T14:22:01.123456Z"}
Step 3 — Bind context in middleware. Call bind_contextvars at the start of a request and clear_contextvars at the end so identifiers do not bleed into the next request handled by the same coroutine pool.
Step 4 — Select the renderer per environment. Keep the chain identical and swap only the final processor: ConsoleRenderer() when stdout is a TTY, JSONRenderer() otherwise. A single sys.stderr.isatty() check at bootstrap chooses between them, so a developer running locally gets readable, colored output and the same code in a container emits machine JSON, with no divergence in the fields produced.
Step 5 — Capture exceptions as structured data. Add structlog.processors.format_exc_info before the renderer and call log.exception("operation_failed") inside an except block. The traceback is attached as an exception field rather than dumped as an unstructured multi-line string, so your aggregator can group failures by exception type instead of treating each stack trace as unique text.
Configuration reference
| Option | Purpose | Production value |
|---|---|---|
processors |
ordered transform chain | context merge → level → timestamp → renderer |
wrapper_class |
early level filter | make_filtering_bound_logger(logging.INFO) |
cache_logger_on_first_use |
build bound logger once | True |
logger_factory |
where strings go | stdlib.LoggerFactory() to coexist with stdlib |
context_class |
per-logger context store | default dict |
| terminal renderer | final serialization | JSONRenderer() in prod, ConsoleRenderer() locally |
Async & concurrency considerations
The reason structlog is safe in asyncio is merge_contextvars. It reads from contextvars, whose values asyncio copies per task, so each coroutine sees its own bound context and never another request's. A thread-local approach would silently leak across coroutines sharing one OS thread, which is the most common source of mislabeled logs in async services. Bind in middleware, clear at request end, and the context is always correct across await points without passing identifiers through every function.
The performance lever is cache_logger_on_first_use. With it disabled, get_logger() rebuilds the bound logger and re-resolves the processor chain on every call, which roughly doubles per-call latency under high QPS. With it enabled, the bound logger is constructed once and reused. Pair that with early level filtering so sub-threshold records are dropped before any processor — including JSON serialization — runs. structlog itself does not provide a non-blocking sink; for that, route through the standard library and put a QueueHandler/QueueListener pair behind the stdlib.LoggerFactory() so the actual write happens on a background thread.
The filtering bound logger deserves emphasis because it is where structlog earns its keep under load. make_filtering_bound_logger(logging.INFO) returns a logger whose debug() method is a no-op that returns immediately, before the event dict is even constructed. That means a hot loop peppered with log.debug(...) calls costs essentially nothing in production, where the level sits at INFO, while remaining fully active when you drop the level to DEBUG for an investigation. Contrast this with building the event dict and only checking the level inside a processor: that pays dict construction and argument evaluation for every dropped line. Filtering at the wrapper class is strictly cheaper, and it is the default reason structlog scales to high call rates.
The number of processors also matters linearly. Each processor is a function call on every surviving record, so a chain of fifteen processors costs noticeably more than a chain of five. Keep the production chain lean: context merge, level, timestamp, any redaction, and the renderer is usually enough. Resist the urge to add diagnostic processors that run on every line when an occasional one would do, and remember that the most expensive processor is almost always the renderer itself, which is exactly why nothing should reach it that the level filter could have dropped.
Microservice deployment considerations
In a containerized service the deployment rule is simple: render JSON to stdout and let the platform's log agent ship it. The application stays stateless, the agent owns buffering and retry, and a collector outage never becomes an application latency spike. Keep stdout for normal logs and reserve stderr for the process's own fatal startup errors, so the orchestrator can distinguish a crash-on-boot from routine output. Because structlog's renderer is the only environment-specific stage, the same image runs identically across staging and production with the renderer chosen by a single TTY check.
Across a fleet, the highest-leverage move is to package the configuration. Ship a small internal module whose one public function calls structlog.configure() with the agreed processor chain, and have every service import and call it at startup. This turns schema consistency into a dependency-version problem rather than a copy-paste discipline problem: bump the package, and every service that redeploys picks up the new fields. It also gives you one place to add a redaction processor or a new semantic-convention field for the whole organization at once. When you do compare structlog against the alternatives for a specific service shape, the head-to-head guide lays out the trade-offs.
Production code examples
This example injects OpenTelemetry trace context into every structlog record. A small processor reads the active span and binds its trace_id and span_id, giving log-and-trace correlation without custom middleware.
import asyncio
import logging
import structlog
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor
# Initialize the OTel SDK for the demonstration
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
SimpleSpanProcessor(ConsoleSpanExporter())
)
tracer = trace.get_tracer(__name__)
structlog.configure(
processors=[
structlog.contextvars.merge_contextvars,
structlog.processors.add_log_level,
structlog.processors.TimeStamper(fmt="iso"),
structlog.processors.JSONRenderer(),
],
wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
cache_logger_on_first_use=True,
)
async def handle_order() -> None:
with tracer.start_as_current_span("process_order") as span:
# Bind W3C-format trace identifiers for this execution context
ctx = span.get_span_context()
structlog.contextvars.bind_contextvars(
trace_id=format(ctx.trace_id, "032x"),
span_id=format(ctx.span_id, "016x"),
trace_flags=int(ctx.trace_flags),
)
log = structlog.get_logger()
log.info("order_validated", order_id="ORD-992", amount=45.99)
if __name__ == "__main__":
asyncio.run(handle_order())
Tested with structlog>=24.1.0,<26.0.0 and opentelemetry-sdk>=1.30.0,<2.0.0. Because the identifiers are bound into contextvars, every later log call in the same span carries them automatically.
There are two ways to wire this correlation, and the choice has consequences. Binding the identifiers into contextvars once at span entry, as above, is explicit and visible, but it requires that you remember to bind at every span boundary. The alternative is a dedicated processor — like the inject_trace_id processor shown on the parent guide — that reads trace.get_current_span() on every log call and attaches the ids automatically. The processor approach is harder to forget because it runs on every line regardless of where the call sits, and it correctly reflects the innermost active span even as spans nest. For most services the processor is the better default; explicit binding is useful when you want the ids in the bound context for reasons beyond logging, such as passing them to a downstream call. Either way, only attach the ids for sampled spans, because correlating to a span that was never exported gives you a dangling reference your backend cannot resolve.
The same contextvars mechanism is what lets a span started deep in a call stack still correlate correctly. structlog does not parse the traceparent header itself; the OpenTelemetry instrumentation extracts it and establishes the span context, and structlog simply reads whatever span is current when each line is logged. That clean separation — OTel owns context extraction, structlog owns rendering — is why the integration needs no custom middleware and why it survives refactoring that moves where spans start.
Expected Output:
{"trace_id": "0000000000000000a1b2c3d4e5f60718", "span_id": "1a2b3c4d5e6f0718", "trace_flags": 1, "event": "order_validated", "order_id": "ORD-992", "amount": 45.99, "level": "info", "timestamp": "2026-06-19T14:22:03.987654Z"}
Common mistakes
- Calling
structlog.configure()more than once. Reconfiguring at runtime invalidates cached loggers and races with in-flight records; the symptom is intermittently missing fields or dropped lines under load. The root cause is mutating global config after loggers were cached. Configure exactly once during bootstrap. - Passing preformatted strings to
log.info(). Writinglog.info(f"user {uid} paid")bypasses structured extraction, so the data lands in the opaque event string and cannot be queried. Pass key-value pairs —log.info("user_paid", user_id=uid)— and let the renderer build the final string. - Disabling
cache_logger_on_first_use. Without caching, the bound logger and processor chain are rebuilt on every call, raising per-call latency by a large factor under high QPS. The signature is logging CPU scaling with call rate; enable caching for both sync and async paths. - Expecting structlog to be non-blocking by itself. structlog renders the string but does not own the write, so a slow sink still blocks the caller. Route through
stdlib.LoggerFactory()with a boundedQueueHandler/QueueListenerso the write moves to a background thread.
Frequently Asked Questions
Does structlog replace the standard logging module?
No. structlog is a structured wrapper that can route through standard library handlers via LoggerFactory. It adds deterministic key-value output and a processor pipeline while preserving compatibility with existing Python logging infrastructure.
How much latency does structlog add per log call?
With cache_logger_on_first_use enabled and a short processor chain, overhead is on the order of tens of microseconds per call. Heavy JSON serialization, deep context merging, or disabled caching increases it, and disabling caching can roughly double per-call cost under high QPS.
Can structlog integrate with OpenTelemetry directly?
Yes. Because structlog reads from contextvars, a small processor can inject the active span's trace_id and span_id into every record, giving unified log-and-trace correlation without custom middleware or manual header parsing.
Why must structlog.configure run only once?
Reconfiguring at runtime invalidates cached loggers and races with in-flight records, which can drop fields or crash under concurrency. Configure exactly once during application bootstrap before any requests are served.