Python observability that holds up in production
Field-tested guides for backend engineers, SREs, and platform teams: structured logging, context-safe trace propagation, OpenTelemetry pipelines, and Prometheus metrics that scale under load.
Start at an architecture hub, then drill into implementation pages for async safety, cardinality control, sampling, and deployment trade-offs — each with runnable code and expected output.
Four pillars, one observability stack
Each guide starts with architecture decisions and operational constraints, then links to focused implementation pages — one pattern at a time.
Built for production, not demos
Runnable code
Every snippet pins version ranges and ships with an expected-output block — console logs or OTLP collector JSON.
Async-safe by default
Patterns for asyncio, contextvars, thread pools, and process boundaries so request context survives task switches, worker handoffs, and process hops.
Cost-aware
Sampling strategies and label-cardinality control to keep telemetry useful without blowing up storage bills.
Cross-signal correlation
Tie logs, traces, and metrics together with shared resource attributes and injected trace IDs.
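As a taste of the async-safe and cross-signal patterns above, here is a minimal standard-library sketch, not the implementation from the guides. It assumes a hypothetical trace_id_var context variable and a hypothetical TraceIdFilter that stamps each log record with the current trace ID. A real setup would more likely read the ID from the active OpenTelemetry span.

```python
import asyncio
import contextvars
import io
import logging

# Hypothetical context variable; a real service would likely derive the
# trace ID from its tracing library rather than set it by hand.
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = trace_id_var.get()
        return True


def build_logger(stream):
    """Build an isolated logger that writes trace-tagged lines to `stream`."""
    logger = logging.getLogger("observability_demo")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("trace_id=%(trace_id)s msg=%(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


async def handle_request(logger, name, trace_id):
    # Each asyncio task runs in its own copy of the context, so this set()
    # is not visible to the sibling task.
    trace_id_var.set(trace_id)
    await asyncio.sleep(0)  # yield so the two tasks interleave
    logger.info("handled %s", name)


async def main(logger):
    await asyncio.gather(
        handle_request(logger, "a", "trace-a"),
        handle_request(logger, "b", "trace-b"),
    )


def run_demo():
    """Run both handlers concurrently and return the captured log lines."""
    stream = io.StringIO()
    logger = build_logger(stream)
    asyncio.run(main(logger))
    return stream.getvalue().splitlines()
```

Calling run_demo() returns one log line per request, each tagged with its own trace ID even though the two tasks interleave. The context variable in the calling code keeps its default value, because asyncio runs each task in a copy of the context.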