OpenTelemetry vs Prometheus for Python Metrics

Choosing between OpenTelemetry and Prometheus for a Python service is really a choice between a push pipeline and a pull pipeline, and between two instrumentation libraries (the OpenTelemetry metrics SDK and prometheus_client) that record the same metric types but ship them very differently. This decision shapes how you deploy, how you control cost, and how metrics correlate with your other signals. This article is part of the Python Metrics and Instrumentation guide, and it builds on the two implementation walkthroughs it compares: instrumenting with prometheus_client and recording metrics with the OpenTelemetry metrics SDK. The goal here is a defensible decision, not a feature list.

The two models compared: Prometheus pulls from an exposition endpoint, OpenTelemetry pushes OTLP to a collector, and the OTel Prometheus exporter bridges them.

Concept & Architecture

The deepest difference is who initiates data movement. In the pull model, the application is passive: prometheus_client keeps current values in an in-process registry and exposes a /metrics endpoint, and a Prometheus server decides when to scrape it. The monitoring system owns timing and discovery, and a failed scrape is itself a useful liveness signal. In the push model, the application is active: the OpenTelemetry SDK runs a PeriodicExportingMetricReader that collects instrument values and sends them over OTLP to a collector or backend on a cadence the app controls.

These models imply different operational shapes. Pull needs the app to be network-reachable by the scraper and works best for long-lived processes with stable addresses — classic Kubernetes pods behind service discovery. Push works when the app cannot be scraped: short-lived batch jobs, serverless functions, or processes behind a NAT, where the workload may exit before any scrape would have fired. Push also folds metrics into the same OTLP pipeline used for distributed traces, giving you one exporter, one endpoint, and one resource definition across signals.

The reliability characteristics differ too. With pull, the monitoring system has a built-in health check: if a scrape fails, Prometheus records the target as up == 0, and you get a free signal that the process is unreachable. There is no need for the app to retry or buffer, because the next scrape simply reads whatever the current value is. With push, the application owns delivery, so transient export failures must be retried and ideally buffered, which is exactly what the OpenTelemetry Collector adds in front of the backend. The flip side is that pull cannot capture a metric from a process that has already exited, whereas push can flush a final export on shutdown — the deciding factor for batch and serverless workloads.

The libraries differ accordingly. prometheus_client is small and synchronous: instruments mutate in memory, serialization happens at scrape, there is no background thread in the simple case. The OpenTelemetry metrics SDK is a fuller pipeline — MeterProvider, readers, exporters, and views that can rename instruments, drop attributes, or reshape histogram buckets before export. That richness costs a background reader/exporter thread and more configuration surface, which is the price of vendor neutrality and cross-signal unification.

Storage and query are where the two stop overlapping. Prometheus is not only a scraper but a complete time series database with its own storage engine and the PromQL query language; choosing Prometheus instrumentation usually means choosing the whole Prometheus stack: server, storage, Alertmanager, and PromQL-driven dashboards. OpenTelemetry deliberately stops at instrumentation and transport: it defines how a metric is recorded and shipped but has no storage or query layer of its own. An OTLP pipeline must terminate in some backend, which can be Prometheus (via remote-write or the exporter bridge), a managed vendor, or another OTLP-native store. That separation is the whole point: OpenTelemetry decouples how you instrument from where you store, so you can change backends without rewriting application code.

A second architectural consequence is temporality. Prometheus only understands cumulative values: a counter is the running total since process start, and PromQL functions like rate() derive per-second rates by differencing successive scrapes. The OpenTelemetry SDK can export either cumulative or delta temporality, where delta reports the change since the last export. Delta suits short-lived workloads that never accumulate a meaningful long-run total, but if your backend is Prometheus you must keep cumulative temporality so rate() behaves correctly. Mismatched temporality is a subtle source of broken dashboards after a migration.
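The relationship between the two temporalities can be made concrete with a few lines of plain Python. This is a simplified sketch: the helper names are hypothetical, the sample data is invented, and average_rate omits the window extrapolation that PromQL's rate() performs.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def cumulative_to_delta(samples: List[Sample]) -> List[Sample]:
    """Turn running totals into per-interval changes.

    A drop in value is treated as a process restart (counter reset), so the
    new value itself is taken as the change for that interval.
    """
    deltas = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        deltas.append((ts, cur - prev if cur >= prev else cur))
    return deltas


def delta_to_cumulative(deltas: List[Sample], start: float = 0.0) -> List[Sample]:
    """Rebuild a running total from per-interval changes."""
    total = start
    cumulative = []
    for ts, change in deltas:
        total += change
        cumulative.append((ts, total))
    return cumulative


def average_rate(samples: List[Sample]) -> float:
    """Reset-aware per-second rate across the whole window."""
    if len(samples) < 2:
        return 0.0
    increase = sum(change for _, change in cumulative_to_delta(samples))
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Four scrapes 15 s apart; the process restarted before the third one.
scrapes = [(0.0, 100.0), (15.0, 130.0), (30.0, 10.0), (45.0, 40.0)]
```

If a backend that expects cumulative totals is fed delta points instead, every point looks like a counter reset, which is one way the broken dashboards described above come about.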

Step-by-step Decision

Work through these questions in order; the first decisive answer usually settles it.

Do you already run Prometheus, and are your services long-lived and scrapeable? If yes, the pull model with prometheus_client is the lowest-friction path — no exporters to configure, no collector to run. Use the exposition endpoint and let Prometheus discover and scrape it.

# pip install "prometheus-client>=0.20.0,<1.0.0"
from prometheus_client import Counter, start_http_server

requests = Counter("http_requests_total", "Requests.", ["route", "status"])
start_http_server(9100)  # Prometheus scrapes :9100/metrics
requests.labels("/orders/{id}", "200").inc()

Do you need one pipeline shared with traces and logs, or vendor neutrality? If you want a single OTLP path to a collector that you can re-route without touching app code, choose the OpenTelemetry SDK and push.

# pip install "opentelemetry-sdk>=1.30.0,<2.0.0" \
#   "opentelemetry-exporter-otlp-proto-grpc>=1.30.0,<2.0.0"
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter

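# insecure=True assumes a plaintext local collector; drop it when the endpoint uses TLS.
exporter = OTLPMetricExporter(endpoint="localhost:4317", insecure=True)
reader = PeriodicExportingMetricReader(exporter)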
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))
counter = metrics.get_meter("svc").create_counter("http.server.requests")
counter.add(1, {"http.route": "/orders/{id}", "http.status_code": 200})

Are your workloads short-lived or unscrapeable? Batch jobs and serverless favor push, because a scraper may never reach a process that exits in seconds. Prefer OTLP, and reserve the Prometheus Pushgateway for genuine batch jobs.

Do you want OTel instrumentation but Prometheus storage? Use the bridge: the SDK's PrometheusMetricReader exposes a scrapeable endpoint while you write code against the OpenTelemetry API.

# pip install "opentelemetry-sdk>=1.30.0,<2.0.0" \
#   "opentelemetry-exporter-prometheus>=0.50b0,<1.0.0" \
#   "prometheus-client>=0.20.0,<1.0.0"
from prometheus_client import start_http_server
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader

reader = PrometheusMetricReader()           # OTel instruments -> exposition format
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))
start_http_server(9100)                      # Prometheus scrapes as usual
metrics.get_meter("svc").create_counter("http.server.requests").add(1)

Expected Output: A scrape of :9100/metrics shows the OTel-created instrument rendered in Prometheus exposition format:

# TYPE http_server_requests_total counter
http_server_requests_total{otel_scope_name="svc"} 1.0

Configuration Reference

Dimension	Prometheus (`prometheus_client`)	OpenTelemetry (metrics SDK)
Transport model	Pull / scrape	Push (also pull via bridge)
Wire format	Exposition text (`text/plain; version=0.0.4`)	OTLP (gRPC or HTTP/protobuf)
Core objects	`Counter`, `Gauge`, `Histogram`, `Summary`, `REGISTRY`	`MeterProvider`, `Meter`, `PeriodicExportingMetricReader`, `OTLPMetricExporter`
Export trigger	On scrape, server-driven	On interval, app-driven
Multiprocess	`PROMETHEUS_MULTIPROC_DIR` + `MultiProcessCollector`	Per-process provider, exported with instance attributes
Histogram quantiles	Computed server-side from buckets	Buckets via views; quantiles computed in backend
Attribute/series shaping	`metric_relabel_configs` at ingestion	Views (rename, drop attributes, change buckets)
Temporality	Cumulative only	Cumulative or delta (configurable)
Cross-signal correlation	Exemplars (trace_id on buckets)	Native resource shared with traces/logs
Collector required	No	Recommended, not required

Async & Concurrency Considerations

Both libraries are safe to call from async handlers: recording a counter or histogram is a fast, non-blocking in-memory operation in each. The concurrency concern is process topology, not coroutines. Under gunicorn or multi-worker uvicorn, each scrape reaches only whichever worker accepts the connection, so the pull model requires multiprocess mode to aggregate across workers; without it, totals are silently wrong. The same forking constraint applies to OpenTelemetry but inverts: build the MeterProvider after the worker forks (in a post_fork hook or ASGI lifespan), because the exporter's background thread does not survive fork().
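A gunicorn configuration along these lines is one way to defer provider creation until after the fork. Treat it as a sketch under assumptions: the service name, the worker count, the service.instance.id format, and the plaintext localhost:4317 endpoint are all illustrative. The OpenTelemetry imports sit inside the hook so nothing from the SDK is initialized in the master process.

```python
# gunicorn.conf.py
workers = 4


def post_fork(server, worker):
    """Runs in each worker after fork; build the metrics pipeline here."""
    from opentelemetry import metrics
    from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import (
        OTLPMetricExporter,
    )
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
    from opentelemetry.sdk.resources import Resource

    # A per-worker instance id keeps each worker's series distinct downstream.
    resource = Resource.create(
        {
            "service.name": "order-service",
            "service.instance.id": f"worker-{worker.pid}",
        }
    )
    # insecure=True assumes a local collector without TLS.
    reader = PeriodicExportingMetricReader(
        OTLPMetricExporter(endpoint="localhost:4317", insecure=True)
    )
    metrics.set_meter_provider(
        MeterProvider(resource=resource, metric_readers=[reader])
    )
```

Because every worker exports under its own instance attribute, the backend sums across workers at query time, with no shared directory needed.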

For values you sample rather than increment — queue depth, pool size, resident memory — both offer a pull-at-collection pattern: a Gauge with a set_function callback in prometheus_client, or an observable gauge with a callback in OpenTelemetry. Use these so you never block a request path to read an expensive value. The callback fires at scrape or collection time on the collection thread, keeping your handlers fast.

Ecosystem & Auto-Instrumentation

Library maturity often decides the question more than the model does. Prometheus has a long-established ecosystem: client libraries in every language, exporters for databases and message brokers, and a vast catalogue of community dashboards and alert rules built around PromQL. For a Python team standardized on Prometheus, the integrations for Flask, Django, Celery, and the common databases are battle-tested, and the prometheus_client instrumentation guide covers the framework hooks directly. The cost is that this ecosystem is Prometheus-shaped — moving off it later means re-instrumenting or relying on bridges.

OpenTelemetry's advantage is breadth across signals and vendors. The same project that emits your metrics also emits your traces and logs, and a single set of opentelemetry-instrumentation-* packages auto-instruments web frameworks and clients for all three signals at once. If you have already adopted OpenTelemetry for distributed tracing in Python, extending it to metrics reuses the resource definition, the collector, and the operational know-how you already have. The trade-off is that the metrics half of OpenTelemetry stabilized later than tracing, so some exporter and auto-instrumentation packages still carry beta version markers and need version-pinning discipline.

A realistic recommendation: greenfield services on a platform that already standardizes on OpenTelemetry should use the OTel SDK and OTLP; teams with an entrenched Prometheus stack and long-lived scrapeable services should use prometheus_client; and teams mid-transition should run the bridge so they can change the backend on their own schedule rather than the application's.

Production Code Examples

Migration Bridge: Instrument Once, Serve Both

A common real-world state is a fleet mid-migration: new code uses the OpenTelemetry API, but the platform team still runs Prometheus. The bridge lets both coexist with no double instrumentation.

# pip install "opentelemetry-sdk>=1.30.0,<2.0.0" \
#   "opentelemetry-exporter-prometheus>=0.50b0,<1.0.0" \
#   "opentelemetry-exporter-otlp-proto-grpc>=1.30.0,<2.0.0" \
#   "prometheus-client>=0.20.0,<1.0.0"
from prometheus_client import start_http_server
from opentelemetry import metrics
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter

resource = Resource.create({"service.name": "order-service"})

# Two readers on one provider: scrapeable endpoint AND OTLP push.
prom_reader = PrometheusMetricReader()
otlp_reader = PeriodicExportingMetricReader(
    # insecure=True assumes a plaintext local collector; drop it for TLS endpoints.
    OTLPMetricExporter(endpoint="localhost:4317", insecure=True),
    export_interval_millis=10_000,
)
metrics.set_meter_provider(
    MeterProvider(resource=resource, metric_readers=[prom_reader, otlp_reader])
)
start_http_server(9100)  # Prometheus still scrapes; OTLP also flows

meter = metrics.get_meter("order-service")
orders = meter.create_counter("orders.processed")
orders.add(1, {"tier": "gold"})

Expected Output: The same instrument is both scrapeable and pushed. The scrape shows orders_processed_total{otel_scope_name="order-service",tier="gold"} 1.0, while the collector receives an equivalent OTLP data point — letting you cut over backends without touching instrumentation code.

Common Mistakes

Error signature: counters appear to reset or jump between unrelated values from one scrape to the next under gunicorn. Root cause: running the pull model with multiple workers and no multiprocess directory, so each scrape hits one worker and reports only that worker's totals. Remediation: set PROMETHEUS_MULTIPROC_DIR and build the scrape registry with MultiProcessCollector, or switch that service to OTLP push.
Error signature: OTLP metrics stop arriving from forked workers despite working in a single process. Root cause: the MeterProvider was created at import time, before fork(), so the exporter thread did not survive. Remediation: initialize the provider in a post_fork hook or lifespan startup.
Error signature: a global p99 dashboard looks wrong after consolidating instances. Root cause: a Summary (or in-process quantile) was used; per-process quantiles cannot be aggregated. Remediation: switch to a histogram so the backend computes the quantile from summed buckets, as covered in choosing counter, gauge, histogram, and summary.
Error signature: Prometheus memory climbs steadily after adopting OTLP push. Root cause: OTel attributes were not constrained the way Prometheus labels are, reintroducing unbounded cardinality. Remediation: apply views that drop high-cardinality attributes before export, following controlling label cardinality.
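The quantile mistake in the list above is easiest to see with numbers. The sketch below merges cumulative histogram buckets from two workers and estimates a quantile by linear interpolation inside the matching bucket, similar in spirit to PromQL's histogram_quantile but not a reimplementation of it. The helper names and bucket counts are invented for illustration.

```python
import math
from typing import Dict, List

Buckets = Dict[float, int]  # upper bound -> cumulative count, including math.inf


def merge_buckets(per_process: List[Buckets]) -> Buckets:
    """Sum cumulative bucket counts across processes with identical bounds."""
    merged: Buckets = {}
    for buckets in per_process:
        for upper_bound, count in buckets.items():
            merged[upper_bound] = merged.get(upper_bound, 0) + count
    return merged


def quantile_from_buckets(q: float, buckets: Buckets) -> float:
    """Estimate the q-quantile by interpolating within the matching bucket."""
    bounds = sorted(buckets)
    total = buckets[bounds[-1]]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for upper_bound in bounds:
        count = buckets[upper_bound]
        if count >= rank:
            if math.isinf(upper_bound) or count == prev_count:
                return prev_bound  # no finite width to interpolate over
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper_bound - prev_bound) * fraction
        prev_bound, prev_count = upper_bound, count
    return prev_bound


fast_worker = {0.1: 90, 0.5: 99, 1.0: 100, math.inf: 100}
slow_worker = {0.1: 10, 0.5: 50, 1.0: 95, math.inf: 100}

merged_p90 = quantile_from_buckets(0.9, merge_buckets([fast_worker, slow_worker]))
averaged_p90 = (
    quantile_from_buckets(0.9, fast_worker)
    + quantile_from_buckets(0.9, slow_worker)
) / 2
```

Averaging the two per-worker estimates gives a noticeably lower figure than the quantile computed from the summed buckets, which is why quantiles must be derived after aggregation rather than before.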

Frequently Asked Questions

Is OpenTelemetry replacing Prometheus for Python metrics?

No. They solve overlapping but distinct problems. OpenTelemetry is an instrumentation and transport standard with a push model; Prometheus is a storage and query system with a pull model. Many teams instrument with OpenTelemetry and still store and query in Prometheus by using the OTel Prometheus exporter or remote-write.

Can I scrape OpenTelemetry metrics with Prometheus?

Yes. The OpenTelemetry SDK ships a PrometheusMetricReader that exposes a scrapeable exposition endpoint, so you can instrument with the OTel API and still let Prometheus pull. This is the usual bridge during a migration.

Which has lower overhead in a Python process, prometheus_client or the OTel SDK?

prometheus_client is lighter because it only maintains in-memory counters and serializes on scrape, with no background export. The OTel SDK runs a periodic reader and exporter thread, which is modest but non-zero. For most services the difference is negligible compared to request handling.

Do I need the OpenTelemetry Collector to use OTLP metrics?

Not strictly. You can export OTLP directly from the SDK to any OTLP-capable backend. The collector is recommended in production because it adds batching, retries, and the ability to reroute or filter without redeploying the app.

Should a new Python microservice start with pull or push?

If you already run Prometheus and your services are long-lived and scrapeable, start with the pull model and prometheus_client for the least friction. If you are building a vendor-neutral pipeline shared with traces and logs, or running short-lived or unscrapeable workloads, start with OpenTelemetry and OTLP push.

Related Guides