Handler Architecture for Python Logging
Handler architecture defines how a LogRecord is routed, filtered, and dispatched to its destinations. The decisive design choice in any high-throughput Python service is separating log emission from the I/O that writes it, because synchronous disk and network writes on a request thread translate directly into tail-latency spikes. This guide is part of the Python Logging Fundamentals and Structured Data reference and focuses on non-blocking dispatch, backpressure control, and trace correlation. It builds on formatter configuration for serialization and on log levels and severity mapping for routing decisions.
Key architectural principles:
- Decouple log emission from disk and network writes so the calling thread never blocks.
- Deploy one handler per sink with independent filters and formatters to isolate failure domains.
- Bound every queue and define an explicit overflow policy to prevent memory exhaustion.
- Inject trace context through filters rather than mutating global state.
Prerequisites
Handler decoupling and trace injection use only the standard library; the OTLP sink requires the OpenTelemetry SDK. Pin versions to keep opentelemetry-api and opentelemetry-sdk aligned.
pip install \
"opentelemetry-sdk>=1.30.0,<2.0.0" \
"opentelemetry-exporter-otlp-proto-grpc>=1.30.0,<2.0.0"
Concept and Architecture
A handler is the object that takes a filtered LogRecord and emits it to a destination. Python's logging module dispatches a record to every handler attached to the logger and to its ancestors, in order. By default each handler performs its formatting and write synchronously, on whatever thread called logger.info(). For a StreamHandler to stdout this is cheap, but for a FileHandler during rollover or a network handler during a collector slowdown, the write latency lands directly on the request path.
The structural fix is to make the only handler on the application logger a QueueHandler. Its emit does nothing but enqueue the record onto a queue.Queue, which is constant-time and lock-protected. A separate QueueListener runs on a background thread, pulls records off the queue, and dispatches them to the real handlers. This inverts the cost model: the request thread pays the enqueue, and a single dedicated thread absorbs all serialization and I/O. The detailed mechanics, shutdown ordering, and respect_handler_level semantics are covered in non-blocking logging with QueueHandler.
Filters refine this further. A logging.Filter attached to a handler decides, per record, whether that handler emits it. Use one filter to inject trace context and another to route by severity, so a single record can land on stdout at INFO while only ERROR records reach the file sink. Filters run before the formatter, which means an expensive JSON serialization is skipped entirely for records a sink rejects.
There is an important distinction between a handler's level and a filter. The level is a single floor compared against record.levelno; a filter is arbitrary per-record logic that can also enrich the record by mutating it. Filters attached to a logger run during record creation on the calling thread, while filters attached to a handler run when that handler emits. In the queue pattern this matters: a trace-injection filter must run on the side where the context still exists. If you read contextvars in a filter attached to a handler owned by the background QueueListener, the context is already gone, because the listener thread never had it. Attach context-reading filters to the QueueHandler or the logger, so they capture context on the producing thread, and reserve handler-side filters for severity routing that depends only on data already on the record.
A second consequence of the queue boundary is that the record is pickled implicitly only if you use a multiprocessing queue; with an in-process queue.Queue the same LogRecord object crosses to the listener thread by reference. The default QueueHandler.prepare() already formats the message and detaches exc_info into a string so the record is safe to hand off, which is why you should not rely on lazy %-style arguments surviving the queue unchanged.
Step-by-Step Implementation
Step 1 — Inject trace context with a filter. Read the active trace_id and span_id from contextvars and attach them as record attributes. This keeps trace correlation out of every call site and out of global mutable state.
import contextvars
import logging
trace_id_ctx: contextvars.ContextVar[str | None] = contextvars.ContextVar("trace_id", default=None)
span_id_ctx: contextvars.ContextVar[str | None] = contextvars.ContextVar("span_id", default=None)
class OTelTraceFilter(logging.Filter):
"""Inject W3C Trace Context identifiers into every record."""
def filter(self, record: logging.LogRecord) -> bool:
record.trace_id = trace_id_ctx.get() or "0" * 32
record.span_id = span_id_ctx.get() or "0" * 16
return True
Step 2 — Route by severity with a second filter. A range filter lets one sink take everything from INFO up while another takes only ERROR and above.
class SeverityRouter(logging.Filter):
"""Pass records whose level is within [min_level, max_level]."""
def __init__(self, min_level: int, max_level: int):
super().__init__()
self.min_level = min_level
self.max_level = max_level
def filter(self, record: logging.LogRecord) -> bool:
return self.min_level <= record.levelno <= self.max_level
Step 3 — Build per-sink handlers. Each handler gets its own formatter and the filters it needs. Failure in one sink cannot corrupt another.
import sys
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setFormatter(logging.Formatter(
'{"time":"%(asctime)s","level":"%(levelname)s","msg":"%(message)s",'
'"trace_id":"%(trace_id)s","span_id":"%(span_id)s"}'
))
stdout_handler.addFilter(SeverityRouter(logging.INFO, logging.CRITICAL))
stdout_handler.addFilter(OTelTraceFilter())
error_handler = logging.FileHandler("errors.log")
error_handler.setFormatter(logging.Formatter(
"%(asctime)s [%(levelname)s] %(name)s - %(message)s (trace=%(trace_id)s)"
))
error_handler.addFilter(SeverityRouter(logging.ERROR, logging.CRITICAL))
error_handler.addFilter(OTelTraceFilter())
Step 4 — Bound the queue and wire the listener. Attach only the QueueHandler to the logger, and let the QueueListener own the concrete handlers. Override the enqueue path to drop rather than block when the queue is full.
import queue
from logging.handlers import QueueHandler, QueueListener
def setup_production_handlers(queue_size: int = 10000) -> tuple[QueueHandler, QueueListener]:
log_queue: queue.Queue = queue.Queue(maxsize=queue_size)
listener = QueueListener(
log_queue, stdout_handler, error_handler, respect_handler_level=True
)
listener.start()
queue_handler = QueueHandler(log_queue)
def enqueue_nowait(record: logging.LogRecord) -> None:
try:
log_queue.put_nowait(record) # never block the caller
except queue.Full:
if record.levelno >= logging.ERROR:
log_queue.get_nowait() # evict an old record to keep errors
log_queue.put_nowait(record)
# else: drop low-severity records under saturation
queue_handler.enqueue = enqueue_nowait # type: ignore[method-assign]
return queue_handler, listener
Step 5 — Run it. Attach the queue handler, set context, emit, and stop the listener on shutdown so buffered records flush.
import time
if __name__ == "__main__":
handler, listener = setup_production_handlers(queue_size=5)
logger = logging.getLogger("payment.service")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False
trace_id_ctx.set("4bf92f3577b34da6a3ce929d0e0e4736")
span_id_ctx.set("00f067aa0ba902b7")
logger.info("Transaction initiated")
logger.warning("Retry attempt 1")
logger.error("Payment gateway timeout")
time.sleep(0.1) # let the background thread drain
listener.stop() # flush and join
Expected Output: (newline-delimited JSON to stdout, Python 3.12)
{"time":"2026-06-19 14:32:11,123","level":"INFO","msg":"Transaction initiated","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736","span_id":"00f067aa0ba902b7"}
{"time":"2026-06-19 14:32:11,124","level":"WARNING","msg":"Retry attempt 1","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736","span_id":"00f067aa0ba902b7"}
{"time":"2026-06-19 14:32:11,125","level":"ERROR","msg":"Payment gateway timeout","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736","span_id":"00f067aa0ba902b7"}
The errors.log file receives only the ERROR record because its SeverityRouter rejects everything below ERROR. With respect_handler_level=True, the listener honors each handler's own level in addition to its filters.
Backpressure and Failure Domains
High request rates expose the queue to saturation whenever a downstream sink slows. The queue depth is your backpressure signal: a steadily growing queue means the listener cannot drain as fast as producers enqueue, almost always because one sink, typically the network exporter, has stalled. Size maxsize to a few seconds of peak emission rate so a transient blip is absorbed without dropping, but not so large that a sustained outage buffers gigabytes before the drop policy engages.
The overflow policy is a deliberate trade-off, not an error path to ignore. Dropping DEBUG and INFO under saturation preserves the ERROR and CRITICAL records that incident response actually needs, and it keeps the logging subsystem from being the cause of an out-of-memory kill during an unrelated outage. Emit a counter metric each time a record is dropped so the loss is visible rather than silent; a logging pipeline that quietly discards records during exactly the incidents you are trying to debug is worse than one that is merely slow.
Keeping one handler per sink also confines failure. If the file handler raises during rollover or the OTLP exporter times out, the exception is contained to that handler's emit, and stdout continues to receive records. A single multiplexing handler would couple these fates together, so an outage in the slowest sink would stall every destination behind it.
Configuration Reference
| Component | Parameter | Purpose | Production guidance |
|---|---|---|---|
QueueHandler |
queue |
Target queue for enqueue | Always a bounded queue.Queue(maxsize=N) |
queue.Queue |
maxsize |
Cap on buffered records | Size to a few seconds of peak rate; never 0 (unbounded) |
QueueListener |
respect_handler_level |
Honor each handler's level | True so per-handler levels apply on the drain side |
QueueListener |
*handlers |
Concrete sinks drained in background | One per destination, each with own formatter |
logging.Filter |
subclass filter() |
Inject context or route by severity | Return False to suppress before formatting |
| Logger | propagate |
Forward to ancestor handlers | False on any logger that owns handlers |
FileHandler |
delay |
Defer file open until first emit | True to avoid opening files in unused workers |
Async and Concurrency Considerations
Standard logging emission is thread-safe, but the QueueHandler pattern is what makes it safe for an asyncio event loop. Because enqueue is non-blocking and constant-time, a coroutine that logs does not yield control to disk or network latency. The single QueueListener thread becomes the only place blocking I/O happens, which also serializes writes to a shared file without any additional locking.
Context variables are the correct carrier for request-scoped trace data in async code: they propagate across await points within a task and are snapshotted when a child task is created. A worker thread, however, does not inherit the event loop's context. When you offload work to a ThreadPoolExecutor, capture the context with contextvars.copy_context() and run the callable inside it, or the trace filter will read empty defaults. The broader rules for safe propagation live in context variables and thread safety.
Production Code Examples
Atomic listener swap for zero-loss reconfiguration
Because the logger only ever holds a QueueHandler, you can rebuild the entire sink topology by draining the old listener and starting a new one against the same queue, without ever detaching the handler the application logs through.
import logging
import queue
from logging.handlers import QueueHandler, QueueListener
def reconfigure(logger: logging.Logger, log_queue: queue.Queue,
old: QueueListener, *new_handlers: logging.Handler) -> QueueListener:
old.stop() # drains remaining records into old handlers
new = QueueListener(log_queue, *new_handlers, respect_handler_level=True)
new.start()
logger.debug("listener reconfigured with %d sinks", len(new_handlers))
return new
Expected Output: (no records are lost; the next emit lands on the new sinks)
DEBUG listener reconfigured with 2 sinks
Common Mistakes
Synchronous I/O in request handlers
Attaching a StreamHandler or FileHandler directly to a request-scoped logger blocks the event loop or worker thread during the write. Under concurrent load this surfaces as P99 latency degradation. Make the QueueHandler the only handler on the application logger and move concrete handlers behind the listener.
Unbounded queue growth under load
Omitting maxsize on queue.Queue lets the queue grow without limit when a downstream sink stalls, ending in an out-of-memory kill instead of graceful degradation. Always set maxsize and define an overflow policy that drops low-severity records first.
Redundant handler attachment
Adding handlers to both the root logger and a child logger without setting propagate = False on the child emits every record twice. This doubles ingestion cost and corrupts any per-record counts derived from the logs.
Forgetting to stop the listener on shutdown
If the process exits without calling listener.stop(), records still in the queue are never drained. Register stop() with an atexit handler or your framework's shutdown hook so the final flush always runs.
Frequently Asked Questions
How does handler architecture impact P99 latency in async Python services?
Synchronous handlers block the event loop or worker thread during I/O. Using a QueueHandler with a background QueueListener offloads serialization and writes, which preserves event loop responsiveness and stabilizes latency under load.
Should I use one handler per log sink or a single multiplexing handler?
Deploy a dedicated handler per sink with its own filter and formatter. This isolates failure domains and lets you tune backpressure per sink without cross-contamination between destinations.
How do I safely reload handler configuration without dropping logs?
Use a QueueHandler as the stable attachment point on the logger and swap the backend listener atomically. Drain the queue before replacing the concrete handlers so no buffered record is lost during the hot reload.
What happens to log records when the queue is full?
By default QueueHandler blocks the caller, which is rarely acceptable in production. Override the enqueue path with put_nowait and a drop policy that sheds debug and info records first, preserving error and critical visibility.