Structured Logging with Python Standard Library: Zero-Dependency JSON Output
Implementing structured logging with the Python standard library requires subclassing logging.Formatter to serialize LogRecord attributes into deterministic JSON. This approach eliminates third-party overhead while maintaining full compatibility with modern log aggregation pipelines. By configuring a custom formatter and wiring it through logging.config.dictConfig, engineers get parseable output suited to SRE dashboards and automated alerting. For foundational context on formatter pipelines and handler attachment, review Python Logging Fundamentals and Structured Data.
Key implementation goals: eliminate external dependencies, map LogRecord attributes to standardized observability fields, guarantee thread-safe context propagation for concurrent workloads, and preserve backward compatibility with existing logging calls during migration.
Custom JSON Formatter Implementation
Override format() to serialize record.__dict__ safely while filtering internal keys. Exclude exc_info, msg, and args to prevent recursion or payload bloat. Handle exception tracebacks by capturing formatException() output before serialization. Ensure deterministic field ordering using sort_keys=True for consistent downstream parsing. Align output fields with OpenTelemetry semantic conventions for seamless ingestion.
Map record.levelname to severity_text for OTel compliance. Convert record.created to RFC 3339 by calling datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(); a naive fromtimestamp() call would emit local time with no UTC offset. Inject trace_id and span_id propagated from W3C Trace Context headers. Maintain strict type safety by enforcing string conversion for non-primitive values.
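A minimal sketch of this approach, filtering record.__dict__ against a deny-list and mapping the surviving attributes onto OTel field names; the RESERVED_ATTRS set and the DictFilterFormatter name are illustrative assumptions, distinct from the production module later in this section:

import json
import logging
from datetime import datetime, timezone

# Illustrative deny-list of LogRecord internals; extend to taste.
RESERVED_ATTRS = {
    "args", "asctime", "created", "exc_info", "exc_text", "filename",
    "funcName", "levelname", "levelno", "lineno", "module", "msecs",
    "msg", "name", "pathname", "process", "processName",
    "relativeCreated", "stack_info", "thread", "threadName", "taskName",
}

class DictFilterFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # RFC 3339 timestamp in UTC, per OTel conventions
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "severity_text": record.levelname,
            "severity_number": record.levelno,
            "message": record.getMessage(),
        }
        # Anything passed via extra= lands in record.__dict__;
        # keep only keys that are not LogRecord internals.
        payload.update(
            {k: v for k, v in record.__dict__.items() if k not in RESERVED_ATTRS}
        )
        return json.dumps(payload, default=str, sort_keys=True)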
dictConfig Pipeline Integration
Define the formatter class path in your YAML or JSON configuration. Attach the formatter to StreamHandler or RotatingFileHandler to route output correctly. Configure log levels per handler to control verbosity across staging and production environments. Validate configuration on startup using logging.config.dictConfig() to fail fast on syntax errors. This declarative approach decouples logging setup from business logic.
Avoid inline basicConfig() calls in production deployments. Use explicit handler dictionaries to manage I/O boundaries. Set disable_existing_loggers=False to preserve third-party library output. Route ERROR and CRITICAL streams to separate sinks if required by your SRE alerting thresholds.
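As a sketch of that routing, the configuration below duplicates ERROR and CRITICAL to stderr while stdout carries everything from INFO up; the handler names and the myapp.logging.OTelJSONFormatter import path are assumptions, and ext:// references resolve external objects at config time:

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        # Hypothetical import path for a formatter like the one defined below
        "otel_json": {"()": "myapp.logging.OTelJSONFormatter"},
    },
    "handlers": {
        "stdout": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout",
            "formatter": "otel_json",
            "level": "INFO",
        },
        "stderr": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stderr",
            "formatter": "otel_json",
            "level": "ERROR",  # only ERROR and CRITICAL reach this sink
        },
    },
    "root": {"level": "INFO", "handlers": ["stdout", "stderr"]},
}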
Context Injection and Thread Safety
Use logging.LoggerAdapter for per-request context enrichment without mutating global state. Leverage contextvars for async and thread-safe propagation across event loops. Avoid mutable default arguments in formatter initialization to prevent cross-request data leakage. Ensure consistent field naming across microservices to maintain OTel compliance. For advanced formatting pipelines and handler attachment strategies, reference Formatter Configuration.
Extract W3C traceparent headers at the ingress boundary. Populate contextvars immediately upon request receipt. Pass the adapter through middleware layers instead of patching global loggers. Isolate request-scoped metadata from process-scoped configuration to guarantee thread safety.
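A minimal sketch of that ingress step, assuming a plain dict of lowercased request headers; extract_trace_context is a hypothetical helper, and the contextvars mirror those used in the module below:

from contextvars import ContextVar

trace_id_var: ContextVar[str] = ContextVar("trace_id", default="")
span_id_var: ContextVar[str] = ContextVar("span_id", default="")

def extract_trace_context(headers: dict[str, str]) -> None:
    # W3C traceparent: "00-<32 hex trace_id>-<16 hex span_id>-<2 hex flags>"
    parts = headers.get("traceparent", "").split("-")
    if len(parts) == 4 and len(parts[1]) == 32 and len(parts[2]) == 16:
        trace_id_var.set(parts[1])
        span_id_var.set(parts[2])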
Production Code Examples
The following implementation combines the custom formatter, dictConfig wiring, async-safe context injection, and a LoggerAdapter subclass that merges call-site extra fields (the stock adapter replaces them outright on Python versions before 3.13) into a single runnable module. It maps directly to the OpenTelemetry log data model and handles exception serialization safely.
import asyncio
import json
import logging
import logging.config
from contextvars import ContextVar
from datetime import datetime, timezone

# Async-safe context variables for W3C Trace Context propagation
trace_id_var: ContextVar[str] = ContextVar("trace_id", default="")
span_id_var: ContextVar[str] = ContextVar("span_id", default="")

class OTelJSONFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Map LogRecord attributes to OpenTelemetry semantic conventions
        log_obj = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "severity_text": record.levelname,
            # Python levelno, not the OTel severity enum; remap if your
            # backend expects OTel numbering (INFO=9, ERROR=17, ...)
            "severity_number": record.levelno,
            "logger": record.name,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
            "trace_id": trace_id_var.get(),
            "span_id": span_id_var.get(),
        }
        # Safely capture exception tracebacks; flatten newlines so the
        # decoded string stays single-line for line-oriented collectors
        if record.exc_info and record.exc_info[0]:
            log_obj["exception"] = self.formatException(record.exc_info).replace("\n", "\\n")
        # Merge custom extra fields without clobbering standard keys
        extra = getattr(record, "extra_fields", {})
        log_obj.update({k: v for k, v in extra.items() if k not in log_obj})
        return json.dumps(log_obj, default=str, sort_keys=True)

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "otel_json": {"()": "__main__.OTelJSONFormatter"}
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "otel_json",
            "level": "INFO",
        }
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}

# Initialize pipeline
logging.config.dictConfig(LOGGING_CONFIG)

class ContextAdapter(logging.LoggerAdapter):
    # Merge call-site extra with adapter extra instead of replacing it,
    # which the stock LoggerAdapter does on Python < 3.13
    def process(self, msg, kwargs):
        kwargs["extra"] = {**self.extra, **(kwargs.get("extra") or {})}
        return msg, kwargs

# Async-safe adapter wrapper
def get_context_logger(name: str) -> logging.LoggerAdapter:
    return ContextAdapter(logging.getLogger(name), {})

async def simulate_request():
    # Simulate W3C Trace Context extraction
    trace_id_var.set("4bf92f3577b34da6a3ce929d0e0e4736")
    span_id_var.set("00f067aa0ba902b7")
    logger = get_context_logger("payment.service")
    logger.info("Transaction processed", extra={"extra_fields": {"amount": 49.99, "currency": "USD"}})
    try:
        raise ValueError("Invalid payment gateway response")
    except Exception:
        logger.exception("Payment routing failed")

if __name__ == "__main__":
    asyncio.run(simulate_request())
Expected Output:
{"amount": 49.99, "currency": "USD", "function": "simulate_request", "line": 60, "logger": "payment.service", "message": "Transaction processed", "module": "__main__", "severity_number": 20, "severity_text": "INFO", "span_id": "00f067aa0ba902b7", "timestamp": "2024-01-15T10:30:00+00:00", "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"}
{"exception": "Traceback (most recent call last):\\n File \"__main__.py\", line 63, in simulate_request\\n raise ValueError(\"Invalid payment gateway response\")\\nValueError: Invalid payment gateway response", "function": "simulate_request", "line": 66, "logger": "payment.service", "message": "Payment routing failed", "module": "__main__", "severity_number": 40, "severity_text": "ERROR", "span_id": "00f067aa0ba902b7", "timestamp": "2024-01-15T10:30:00+00:00", "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"}
Common Mistakes
Serializing non-serializable objects directly into JSON
Error Signature: TypeError: Object of type datetime is not JSON serializable
Remediation: Pass default=str to json.dumps() or explicitly convert datetime, Decimal, and custom classes to primitives before serialization. Never rely on implicit type coercion.
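A minimal illustration of the failure and the fix:

import json
from datetime import datetime, timezone
from decimal import Decimal

payload = {"ts": datetime.now(timezone.utc), "amount": Decimal("49.99")}
# json.dumps(payload) raises TypeError: Object of type datetime is not JSON serializable
# default=str coerces any unhandled type to its string form
print(json.dumps(payload, default=str, sort_keys=True))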
Overriding formatMessage() instead of format()
Error Signature: Exception tracebacks appended as raw text after the JSON object, producing unparseable log lines.
Remediation: Override format() to intercept the complete record lifecycle. formatMessage() only interpolates the message template; Formatter.format() appends exception and stack text afterward, so a JSON body built in formatMessage() ends up with a raw traceback concatenated outside the JSON object.
Using global variables for request context
Error Signature: Cross-request data leakage in WSGI/ASGI servers under concurrent load.
Remediation: Replace module-level dictionaries with contextvars. Wrap loggers in logging.LoggerAdapter to inject request-scoped metadata safely. Isolate context per execution unit.
FAQ
Can I use the standard library for structured logging in async applications?
Yes. Pair logging with contextvars to propagate trace IDs across event loops without thread-local storage. This ensures safe async context isolation and prevents race conditions during concurrent I/O.
How do I handle multi-line exception tracebacks in JSON logs?
Replace newline characters with literal \n or split the traceback into a stack_trace array field during formatter serialization. This maintains valid JSON structure while preserving full debugging context.
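A sketch of the array variant using the standard traceback module; the stack_trace_field helper name is illustrative:

import traceback

def stack_trace_field(exc_info) -> list[str]:
    # One JSON array element per traceback line; json.dumps handles escaping
    return "".join(traceback.format_exception(*exc_info)).splitlines()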
Does this approach impact performance compared to third-party libraries?
Overhead is minimal. Avoiding extra dependencies and serialization layers reduces import cost and memory footprint. json.dumps() typically adds a few microseconds to a few tens of microseconds per log line, depending on payload size and hardware, which remains negligible for standard observability pipelines. The snippet below shows how to measure it on your own stack.
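A quick, illustrative timeit check; absolute numbers vary by hardware and payload:

import json
import timeit

sample = {
    "severity_text": "INFO",
    "message": "Transaction processed",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "amount": 49.99,
}
# Average serialization cost over 100k iterations
per_call = timeit.timeit(
    lambda: json.dumps(sample, default=str, sort_keys=True), number=100_000
) / 100_000
print(f"json.dumps: {per_call * 1e6:.1f} microseconds per log line")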