Logging · Tracing · Metrics

Python observability that holds up in production

Field-tested guides for backend engineers, SREs, and platform teams: structured logging, context-safe trace propagation, OpenTelemetry pipelines, and Prometheus metrics that scale under load.

Start at an architecture hub, then drill into implementation pages for async safety, cardinality control, sampling, and deployment trade-offs — each with runnable code and expected output.

[Diagram: Python service, Logs / Traces / Metrics, OTel collector, store]

Four pillars, one observability stack

Each guide starts with architecture decisions and operational constraints, then links to focused implementation pages — one pattern at a time.

Built for production, not demos

Runnable code

Every snippet pins version ranges and ships with an expected-output block — console logs or OTLP collector JSON.

Async-safe by default

Patterns for asyncio, contextvars, thread pools, and process boundaries so context never fragments.
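As a rough illustration of what "async-safe" means here, the sketch below carries a per-request value through concurrent asyncio tasks and into worker threads using only the standard library. The names `request_id`, `blocking_work`, and `handle` are hypothetical, and `asyncio.to_thread` assumes Python 3.9 or newer.

```python
import asyncio
import contextvars

# Hypothetical per-request identifier; "-" means "no request in scope".
request_id = contextvars.ContextVar("request_id", default="-")


def blocking_work():
    # Runs in a worker thread and reads whatever context it was invoked in.
    return request_id.get()


async def handle(rid):
    # Each task created by gather() runs in its own copy of the context,
    # so this set() is invisible to sibling tasks.
    request_id.set(rid)
    await asyncio.sleep(0)  # yield so the two tasks interleave

    loop = asyncio.get_running_loop()
    # run_in_executor does not copy the caller's context on its own,
    # so snapshot it and run the function inside that snapshot.
    ctx = contextvars.copy_context()
    carried = await loop.run_in_executor(None, ctx.run, blocking_work)

    # asyncio.to_thread copies the current context for us.
    to_thread = await asyncio.to_thread(blocking_work)

    return {"task": request_id.get(), "carried": carried, "to_thread": to_thread}


async def main():
    return await asyncio.gather(handle("req-a"), handle("req-b"))


results = asyncio.run(main())
print(results)
```

Both tasks see only their own `request_id`, and the value set inside a task never leaks back into the context that called `asyncio.run`.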

Cost-aware

Sampling strategies and label-cardinality control to keep telemetry useful without blowing up storage bills.
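A minimal stdlib sketch of both ideas, under stated assumptions: `keep_trace` is a hypothetical deterministic ratio sampler keyed on the trace ID (similar in spirit to ratio-based head sampling, but not OpenTelemetry's implementation), and `LabelLimiter` is a hypothetical guard that folds overflow label values into `"other"`.

```python
import hashlib


def keep_trace(trace_id, ratio):
    """Keep roughly `ratio` of traces, always deciding the same way per trace ID."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    # Integer comparison avoids float rounding at the edges of the range.
    return bucket < int(ratio * 2**64)


class LabelLimiter:
    """Caps the number of distinct values one label may take."""

    def __init__(self, max_values):
        self.max_values = max_values
        self.seen = set()

    def normalize(self, value):
        if value in self.seen:
            return value
        if len(self.seen) < self.max_values:
            self.seen.add(value)
            return value
        # Budget exhausted: collapse new values into a single series.
        return "other"


limiter = LabelLimiter(max_values=2)
labels = [limiter.normalize(v) for v in ["/users", "/orders", "/users/123", "/users"]]
print(labels)  # ['/users', '/orders', 'other', '/users']
```

Because the sampling decision depends only on the trace ID, every service that applies the same ratio keeps or drops the same traces, which avoids partially recorded traces.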

Cross-signal correlation

Tie logs, traces, and metrics together with shared resource attributes and injected trace IDs.
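One way this correlation could look, sketched with the standard library only: a logging filter stamps each record with the active trace ID, and a JSON formatter adds shared resource attributes. `current_trace_id`, `RESOURCE`, `TraceContextFilter`, and `JsonFormatter` are hypothetical names; in a real setup the trace ID would come from the tracing SDK rather than being set by hand.

```python
import contextvars
import io
import json
import logging

# Hypothetical holder for the active trace ID (normally supplied by the tracer).
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)

# Assumed resource attributes shared by logs, traces, and metrics.
RESOURCE = {"service.name": "checkout", "service.version": "1.0.0"}


class TraceContextFilter(logging.Filter):
    def filter(self, record):
        # Attach the trace ID to every record; never drop the record.
        record.trace_id = current_trace_id.get()
        return True


class JsonFormatter(logging.Formatter):
    def format(self, record):
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
            **RESOURCE,
        }
        return json.dumps(payload, sort_keys=True)


stream = io.StringIO()  # stands in for stdout so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
handler.addFilter(TraceContextFilter())

logger = logging.getLogger("correlation_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
logger.info("order placed")

line = json.loads(stream.getvalue())
print(line)
```

With the same `service.name` on every signal and a `trace_id` on every log line, a log entry can be joined to its trace and to the metrics emitted by the same service.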