Logging · Tracing · Metrics

Python observability that holds up in production

Field-tested guides for backend engineers, SREs, and platform teams: structured logging, context-safe trace propagation, OpenTelemetry pipelines, and Prometheus metrics that scale under load.

Start at an architecture hub, then drill into implementation pages for async safety, cardinality control, sampling, and deployment trade-offs — each with runnable code and expected output.

Start with tracing · Logging fundamentals

Four pillars, one observability stack

Each guide starts with architecture decisions and operational constraints, then links to focused implementation pages — one pattern at a time.

Python Logging Fundamentals

Handler architecture, formatter choices, severity routing, and context-safe structured logging.

Modern Logging Libraries

structlog and Loguru deployment patterns, async sinks, and production trade-offs.

Distributed Tracing + OpenTelemetry

OTel SDK setup, context propagation, span lifecycle controls, and sampling strategies.

Metrics & Instrumentation

Prometheus client and OpenTelemetry metrics, metric types, and cardinality control for SRE dashboards.

Built for production, not demos

Runnable code

Every snippet pins version ranges and ships with an expected-output block — console logs or OTLP collector JSON.

Async-safe by default

Patterns for asyncio, contextvars, thread pools, and process boundaries, so request context survives each hop instead of being dropped or mixed between concurrent tasks.
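As a rough illustration of what "async-safe" means here, the sketch below uses only the standard library. The `request_id` variable and the `handle` and `blocking_work` helpers are hypothetical names invented for this example. It shows two concurrent tasks each keeping their own value, and two ways of carrying that value into a worker thread.

```python
import asyncio
import contextvars

# Hypothetical context variable holding the current request ID.
request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="-"
)


def blocking_work() -> str:
    # Runs in a worker thread and reads whatever context it was given.
    return request_id.get()


async def handle(rid: str) -> tuple[str, str, str]:
    request_id.set(rid)      # visible only inside this task's context
    await asyncio.sleep(0)   # yield so the two tasks interleave
    in_task = request_id.get()

    # asyncio.to_thread copies the current context into the worker thread.
    via_to_thread = await asyncio.to_thread(blocking_work)

    # loop.run_in_executor does not copy the context for you, so this
    # sketch copies it explicitly and runs the function inside the copy.
    loop = asyncio.get_running_loop()
    ctx = contextvars.copy_context()
    via_executor = await loop.run_in_executor(None, ctx.run, blocking_work)

    return in_task, via_to_thread, via_executor


async def main() -> list[tuple[str, str, str]]:
    return await asyncio.gather(handle("req-a"), handle("req-b"))


results = asyncio.run(main())
```

Each tuple in `results` holds the same ID three times, one per hop. Process boundaries need a different mechanism (for example, passing the ID explicitly in the payload), which this sketch does not cover.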

Cost-aware

Sampling strategies and label-cardinality control to keep telemetry useful without blowing up storage bills.
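To make the two cost levers concrete, here is a minimal standard-library sketch. The route allow-list, the helper names (`route_label`, `status_class`, `keep_trace`), and the CRC-based bucketing are all assumptions made for illustration. They are not the Prometheus client's or the OpenTelemetry SDK's API; a real deployment would more likely rely on the SDK's own ratio-based sampler.

```python
import re
import zlib

# Hypothetical allow-list of route templates; everything else becomes "other".
KNOWN_ROUTES = {"/users/{id}", "/orders/{id}", "/health"}

# Matches a path segment that is all digits or looks like a lowercase UUID.
_ID_SEGMENT = re.compile(r"/(\d+|[0-9a-f]{8}-[0-9a-f-]{27})(?=/|$)")


def route_label(path: str) -> str:
    """Collapse raw request paths into a small, bounded set of label values."""
    template = _ID_SEGMENT.sub("/{id}", path)
    return template if template in KNOWN_ROUTES else "other"


def status_class(code: int) -> str:
    """Bucket HTTP status codes into classes such as '2xx' or '4xx'."""
    return f"{code // 100}xx"


def keep_trace(trace_id: str, ratio: float) -> bool:
    """Deterministic head sampling: one trace ID always gets one decision."""
    bucket = zlib.crc32(trace_id.encode()) % 10_000
    return bucket < ratio * 10_000


labels = {
    "route": route_label("/users/42"),
    "status": status_class(404),
}
```

With this shape, the `route` label can take at most four values and `status` at most a handful, so the number of time series stays bounded no matter how many distinct user IDs appear in traffic.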

Cross-signal correlation

Tie logs, traces, and metrics together with shared resource attributes and injected trace IDs.
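One way this correlation might look on the logging side is sketched below with the standard library only. The `current_trace_id` variable, the `RESOURCE` dictionary, and both class names are hypothetical stand-ins. In a real setup the trace ID would come from the tracing SDK's active span, and the resource attributes from its resource configuration.

```python
import contextvars
import io
import json
import logging

# Hypothetical stand-ins for values a tracing SDK would normally provide.
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)
RESOURCE = {"service.name": "checkout", "deployment.environment": "dev"}


class TraceContextFilter(logging.Filter):
    """Attach the active trace ID to every record passing through."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per record, including shared resource attributes."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
            **RESOURCE,
        }
        return json.dumps(payload)


# Log into an in-memory buffer so the emitted line can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
handler.addFilter(TraceContextFilter())

logger = logging.getLogger("correlation_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
logger.info("payment authorized")

line = json.loads(stream.getvalue())
```

Because the log line carries the same `service.name` and `trace_id` that spans and metrics would carry, a backend can join the three signals on those fields.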