Exposing Custom Metrics with the Prometheus Client

Framework auto-instrumentation gives you HTTP counters, but the numbers that matter to the business — orders placed, payments declined, queue depth, batch rows processed — only you can emit. This guide shows how to define those custom metrics with prometheus_client, name and label them so they stay queryable and cheap, and expose them through your existing endpoint. It builds on the broader Prometheus client instrumentation guide and is part of the Python Metrics and Instrumentation guide.

The whole discipline reduces to three decisions per metric: which instrument type, what name and unit, and which labels. Get those right and the series are cheap, aggregatable, and self-documenting; get the labels wrong and unbounded series growth can exhaust the memory of your Prometheus server.

Route each business signal to the right instrument, then attach only bounded labels and divert identifiers to logs.

Prerequisites

Only the client is required to define and render metrics; you expose them through whatever server you already run.

pip install "prometheus-client>=0.20.0,<1.0.0"

If you serve metrics from a prefork server (gunicorn, multi-worker uvicorn), set the multiprocess directory before the application starts, so the samples each worker writes for your custom series can be merged at scrape time.

export PROMETHEUS_MULTIPROC_DIR=/tmp/app_prom
mkdir -p "$PROMETHEUS_MULTIPROC_DIR"
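
Setting the directory covers the writing side; the scrape handler also has to merge the per-worker files. A minimal sketch of that handler, assuming the variable above is exported before the application imports prometheus_client. The function name render_multiprocess_metrics is hypothetical.

```python
from prometheus_client import CollectorRegistry, generate_latest, multiprocess


def render_multiprocess_metrics() -> bytes:
    """Merge every worker's samples into one exposition payload."""
    # A fresh registry per scrape: the collector reads the files that the
    # workers write under PROMETHEUS_MULTIPROC_DIR and combines them.
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    return generate_latest(registry)
```

Return the payload from whatever route your server already uses for /metrics.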

Implementation

Step 1 — Map each signal to an instrument type. A monotonic total uses a Counter. A value that rises and falls uses a Gauge. A distribution of magnitudes or durations uses a Histogram. The decision is rarely ambiguous once you ask "does this only ever go up?" — and the deeper trade-offs, especially Histogram versus Summary, are covered in choosing between Counter, Gauge, Histogram, and Summary.

# business_metrics.py — one module owns the instruments
from prometheus_client import Counter, Gauge, Histogram

# Counter: orders only ever accumulate
ORDERS = Counter(
    "orders_processed_total",
    "Orders processed, by outcome and payment method",
    ["outcome", "payment_method"],          # both are small enumerations
)

# Gauge: queue depth goes up and down
QUEUE_DEPTH = Gauge(
    "order_queue_depth",
    "Orders currently waiting in the processing queue",
)

# Histogram: payment latency distribution, buckets in seconds
PAYMENT_LATENCY = Histogram(
    "payment_gateway_duration_seconds",
    "Payment gateway round-trip latency in seconds",
    ["payment_method"],
    buckets=(0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0),
)

Step 2 — Record from business code. Increment counters on outcomes, set or shift gauges as state changes, and time histograms around the operation. The .time() context manager observes elapsed seconds on exit; .count_exceptions() increments only when the block raises.

from business_metrics import ORDERS, QUEUE_DEPTH, PAYMENT_LATENCY

def process_order(order, method):
    QUEUE_DEPTH.dec()                                   # leaving the queue
    with PAYMENT_LATENCY.labels(method).time():         # observe round trip
        ok = charge(order, method)
    outcome = "success" if ok else "declined"
    ORDERS.labels(outcome=outcome, payment_method=method).inc()
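
The .count_exceptions() helper mentioned above is not used in process_order, so here is a small sketch of it in isolation. The metric name, the charge_or_raise stand-in, and the private registry are assumptions made for illustration only.

```python
from prometheus_client import CollectorRegistry, Counter

registry = CollectorRegistry()  # private registry keeps the sketch isolated
CHARGE_FAILURES = Counter(
    "payment_charge_failures_total",  # hypothetical metric name
    "Charge attempts that raised an exception",
    registry=registry,
)


def charge_or_raise(amount):
    """Hypothetical stand-in for a gateway call that can fail."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    return True


def guarded_charge(amount):
    # The counter moves only if the block raises; the exception still propagates.
    with CHARGE_FAILURES.count_exceptions():
        return charge_or_raise(amount)


guarded_charge(10)       # succeeds, counter stays at 0
try:
    guarded_charge(0)    # raises, counter becomes 1
except ValueError:
    pass
```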

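For values your code does not naturally push on every change, such as a queue depth held by an external system or a cache size, use a Gauge callback so the value is read at scrape time. This avoids drift between the real value and a stale reported one. Pick one approach per gauge: once a callback is set, the exposed value comes from the callback, so inc(), dec(), and set() calls on that gauge no longer affect what is scraped.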

# `redis` is assumed to be an already-connected client object, not the module
QUEUE_DEPTH.set_function(lambda: redis.llen("order_queue"))  # read at scrape
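
To see the scrape-time behaviour without a Redis server, the following sketch swaps in an in-memory deque as the queue. The deque, the variable names, and the private registry are illustrative assumptions, not part of the application above.

```python
from collections import deque

from prometheus_client import CollectorRegistry, Gauge

registry = CollectorRegistry()
pending_orders = deque(["o-1", "o-2", "o-3"])  # hypothetical in-memory queue

QUEUE_DEPTH_LIVE = Gauge(
    "order_queue_depth",
    "Orders currently waiting in the processing queue",
    registry=registry,
)
# The callback runs on every collection, so no code has to call set().
QUEUE_DEPTH_LIVE.set_function(lambda: len(pending_orders))

depth_before = registry.get_sample_value("order_queue_depth")
pending_orders.popleft()  # the queue changes; the gauge is never touched
depth_after = registry.get_sample_value("order_queue_depth")
```

The client's multiprocess mode documents limitations around scrape-time callbacks and custom collectors, so check those notes before relying on this pattern under a prefork server.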

Step 3 — Follow naming and unit conventions. Names are the contract every dashboard and alert depends on. Use snake_case, base units, and the conventional suffixes: _total for counters, _seconds for any duration, _bytes for sizes. Never bake units into the value (no milliseconds, no kilobytes) — store seconds and bytes and let PromQL scale for display. The # TYPE line is emitted automatically from the instrument class, so you do not write it.

Signal	Good name	Why
Orders processed	`orders_processed_total`	`_total` marks a counter
Payment latency	`payment_gateway_duration_seconds`	base unit seconds, `_seconds` suffix
Queue depth	`order_queue_depth`	gauge, no `_total` suffix
Cache size	`cache_resident_bytes`	base unit bytes, `_bytes` suffix
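
As a quick check that the type metadata comes from the instrument class, this sketch renders a private registry and keeps only the # TYPE lines. The registry and sample values are assumptions for the demonstration.

```python
from prometheus_client import CollectorRegistry, Counter, Gauge, generate_latest

registry = CollectorRegistry()
Counter("orders_processed_total", "Orders processed", registry=registry).inc()
Gauge("cache_resident_bytes", "Bytes held in the cache", registry=registry).set(4096)

# Render the text exposition and keep only the type declarations.
exposition = generate_latest(registry).decode("utf-8")
type_lines = [line for line in exposition.splitlines() if line.startswith("# TYPE")]
print("\n".join(type_lines))
```

Depending on the client version and settings, the output may include additional declarations beyond the two instruments defined here.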

Step 4 — Keep labels bounded. Every distinct combination of label values is a separate time series stored in memory and on the wire. Labels must draw from small, enumerable sets — outcome, payment method, region, HTTP status. Never label by user ID, email, order ID, raw URL, or any free-form string, because each new value adds a permanent series and the total multiplies across labels. A counter with outcome (3 values) and payment_method (4 values) is 12 series; adding user_id makes it unbounded. The full treatment, including how to estimate and cap the series budget, is in controlling label cardinality in Prometheus.

# WRONG: user_id is unbounded -> one series per user, forever
# (only runs if user_id is also declared in the Counter's label names)
ORDERS.labels(outcome="success", payment_method="card", user_id=order.user).inc()

# RIGHT: bounded labels only; identity belongs in logs or traces
ORDERS.labels(outcome="success", payment_method="card").inc()
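
The series arithmetic and one way to enforce a bound can be sketched as follows. The allowlist contents, the "other" fallback, and both helper names are assumptions for illustration; adapt them to your own enumerations.

```python
from math import prod

ALLOWED_METHODS = frozenset({"card", "wallet", "transfer"})  # assumed enumeration


def bounded_label(value: str, allowed: frozenset, fallback: str = "other") -> str:
    """Collapse anything outside the known set into one fallback value."""
    return value if value in allowed else fallback


def series_count(label_sizes: dict) -> int:
    """Worst-case series for one metric: the product of each label's value count."""
    return prod(label_sizes.values())


print(series_count({"outcome": 3, "payment_method": 4}))  # 12
print(bounded_label("card", ALLOWED_METHODS))             # card
print(bounded_label("gift-card-7781", ALLOWED_METHODS))   # other
```

Passing every label value through a helper like bounded_label means an unexpected input adds at most one extra series instead of an open-ended number.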

Step 5 — Expose the instruments. Custom metrics registered with the default REGISTRY render automatically through the same endpoint as everything else — make_wsgi_app(), make_asgi_app(), start_http_server(), or a manual generate_latest() route. There is no extra registration step; constructing the instrument is the registration.

from prometheus_client import start_http_server
import business_metrics  # importing it registers the instruments

start_http_server(8000)   # custom series now appear at /metrics
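
If you prefer the manual generate_latest() route, a minimal WSGI sketch looks roughly like this. The metrics_app and serve names are hypothetical, and the 404 handling is one reasonable choice rather than a required one.

```python
from wsgiref.simple_server import make_server

from prometheus_client import CONTENT_TYPE_LATEST, REGISTRY, generate_latest


def metrics_app(environ, start_response):
    """Minimal WSGI app: /metrics renders the default registry, all else is 404."""
    if environ.get("PATH_INFO") == "/metrics":
        body = generate_latest(REGISTRY)
        headers = [
            ("Content-Type", CONTENT_TYPE_LATEST),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found\n"]


def serve(port: int = 8000) -> None:
    """Blocking helper for local experiments; call it from your own entry point."""
    make_server("", port, metrics_app).serve_forever()
```

In a real service you would mount metrics_app next to your existing application instead of running a second server.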

Step 6 — Initialize label children so series exist before the first event. A labeled metric does not emit a series for a given label combination until that combination is first observed. This means a dashboard panel for orders_processed_total{outcome="declined"} shows "no data" until the first decline happens, which breaks alerts that expect the series to exist. Pre-create the children you know about by calling .labels(...) once at startup; for a counter this registers a series at value 0.

from business_metrics import ORDERS

# Pre-seed known label combinations so the series exist from boot
for outcome in ("success", "declined"):
    for method in ("card", "wallet", "transfer"):
        ORDERS.labels(outcome=outcome, payment_method=method)

This only helps for label sets you can enumerate — which is exactly the bounded labels you should be using. If you cannot enumerate the values to pre-seed, that is a strong signal the label is too high-cardinality and belongs in a log line, not a metric.
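
The effect of pre-seeding is easy to confirm against a private registry. This sketch assumes a throwaway copy of the counter (ORDERS_DEMO) so it does not collide with the real instrument.

```python
from prometheus_client import CollectorRegistry, Counter

registry = CollectorRegistry()
ORDERS_DEMO = Counter(
    "orders_processed_total",
    "Orders processed, by outcome and payment method",
    ["outcome", "payment_method"],
    registry=registry,
)
declined_card = {"outcome": "declined", "payment_method": "card"}

# Before the child exists there is no sample to read.
before = registry.get_sample_value("orders_processed_total", declined_card)
ORDERS_DEMO.labels(**declined_card)  # create the child without incrementing
after = registry.get_sample_value("orders_processed_total", declined_card)

print(before, after)  # None 0.0
```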

Configuration Options

Option	Applies to	Purpose	Notes
`labelnames=[...]`	all instruments	Declares label keys	Keep value sets small and enumerable
`_total` / `_seconds` / `_bytes` suffix	naming	PromQL/Grafana convention	Base units only; never ms or KB
`Histogram(buckets=...)`	histogram	Distribution boundaries	Tune to expected magnitude range
`Gauge.set_function(fn)`	gauge	Read value at scrape time	For externally owned values
`Counter.count_exceptions()`	counter	Increment only on raise	Wraps a block as context manager
`multiprocess_mode`	gauge	Cross-worker reduction	`livesum`, `max`, etc. under prefork
`registry=`	all	Isolate from default	Use for tests; omit in production
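
The registry= option is most useful in tests, where each case should start from a clean slate. A minimal sketch, assuming a hypothetical make_orders_counter factory and a pytest-style test function:

```python
from prometheus_client import CollectorRegistry, Counter


def make_orders_counter(registry: CollectorRegistry) -> Counter:
    """Build the counter against a caller-supplied registry (hypothetical helper)."""
    return Counter(
        "orders_processed_total",
        "Orders processed, by outcome and payment method",
        ["outcome", "payment_method"],
        registry=registry,
    )


def test_declined_order_is_counted():
    registry = CollectorRegistry()  # fresh per test, no shared state
    orders = make_orders_counter(registry)
    orders.labels(outcome="declined", payment_method="card").inc()
    value = registry.get_sample_value(
        "orders_processed_total",
        {"outcome": "declined", "payment_method": "card"},
    )
    assert value == 1.0
```

Because every test builds its own registry, the same metric name can be constructed repeatedly without a duplicate-registration error.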

Verification

Drive a few operations, then scrape and confirm the suffixes, types, and bounded labels.

curl -s localhost:8000/metrics | grep -E "orders_processed|payment_gateway|order_queue"

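Expected output (abridged: most histogram bucket lines are omitted, and recent client versions may also print _created series):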

# HELP orders_processed_total Orders processed, by outcome and payment method
# TYPE orders_processed_total counter
orders_processed_total{outcome="success",payment_method="card"} 18.0
orders_processed_total{outcome="declined",payment_method="card"} 2.0
# HELP order_queue_depth Orders currently waiting in the processing queue
# TYPE order_queue_depth gauge
order_queue_depth 4.0
# HELP payment_gateway_duration_seconds Payment gateway round-trip latency in seconds
# TYPE payment_gateway_duration_seconds histogram
payment_gateway_duration_seconds_bucket{payment_method="card",le="0.5"} 15.0
payment_gateway_duration_seconds_bucket{payment_method="card",le="1.0"} 20.0
payment_gateway_duration_seconds_bucket{payment_method="card",le="+Inf"} 20.0
payment_gateway_duration_seconds_sum{payment_method="card"} 7.84
payment_gateway_duration_seconds_count{payment_method="card"} 20.0

A healthy scrape shows a handful of series per metric — one per bounded label combination — not thousands. If grep -c returns a number that grows with traffic or user count, a label is unbounded and must be removed.
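
The same check can be scripted instead of eyeballed. This sketch counts samples per exposed name using the parser that ships with the client; the series_per_metric helper and the embedded sample scrape are assumptions for illustration.

```python
from collections import Counter as TallyCounter

from prometheus_client.parser import text_string_to_metric_families


def series_per_metric(exposition_text: str) -> dict:
    """Count samples per exposed sample name in a text-format scrape."""
    tally = TallyCounter()
    for family in text_string_to_metric_families(exposition_text):
        for sample in family.samples:
            tally[sample.name] += 1
    return dict(tally)


SAMPLE_SCRAPE = """\
# HELP order_queue_depth Orders currently waiting in the processing queue
# TYPE order_queue_depth gauge
order_queue_depth 4.0
# HELP orders_processed_total Orders processed, by outcome and payment method
# TYPE orders_processed_total counter
orders_processed_total{outcome="success",payment_method="card"} 18.0
orders_processed_total{outcome="declined",payment_method="card"} 2.0
"""

counts = series_per_metric(SAMPLE_SCRAPE)
print(counts)
```

Feed it the body of a real scrape and compare the counts against the number of label combinations you expect.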

Common Mistakes

Units encoded in the name or value

Error signature: PromQL math is off by 1000; a panel labeled "seconds" shows milliseconds.
Root cause: the metric stores milliseconds or kilobytes and the name lies, or the name carries a unit the value contradicts.
Remediation: always store base units (seconds, bytes) and suffix the name accordingly (_seconds, _bytes). Let Grafana and PromQL scale for display rather than pre-scaling in the application.
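
When an upstream SDK reports milliseconds, convert once at the point of observation. A small sketch, assuming a hypothetical observe_gateway_ms wrapper and a private registry:

```python
from prometheus_client import CollectorRegistry, Histogram

registry = CollectorRegistry()
GATEWAY_LATENCY = Histogram(
    "payment_gateway_duration_seconds",
    "Payment gateway round-trip latency in seconds",
    buckets=(0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0),
    registry=registry,
)


def observe_gateway_ms(elapsed_ms: float) -> None:
    """Convert at the boundary so the stored value is always base-unit seconds."""
    GATEWAY_LATENCY.observe(elapsed_ms / 1000.0)


observe_gateway_ms(250.0)  # an SDK-reported 250 ms is stored as 0.25 s
```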

A label value set that grows without bound

Error signature: Prometheus memory climbs steadily, prometheus_tsdb_head_series grows linearly with traffic, and queries slow down.
Root cause: a label carries a high-cardinality value such as user ID, order ID, or raw path, creating a new permanent series per value.
Remediation: remove the offending label and move that identity into logs or trace attributes. Keep labels to small enumerations and validate the series budget per the label cardinality guidance.

Frequently Asked Questions

What suffix should a custom metric name use?

Suffix monotonic counters with _total and any duration with _seconds, using base units throughout. A latency histogram is request_duration_seconds and a processed-items counter is items_processed_total. The TYPE comment is set automatically by the instrument class.

Can I use a high-cardinality value like user ID as a label?

No. Each unique combination of label values is a separate stored time series, so user IDs, request paths, and email addresses cause unbounded series growth that overloads Prometheus. Keep labels to small enumerable sets and push high-cardinality identifiers into logs or traces instead.

How do I track a value the application does not push, like a queue depth?

Use a Gauge with a callback through the set_function method, or a custom collector, so the value is read at scrape time rather than maintained on every change. This avoids drift between the real value and the reported one.
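
For the custom collector option, a minimal sketch might look like the following. The QueueDepthCollector class, the in-memory backlog list, and the private registry are illustrative assumptions; a real collector would query the external system inside collect().

```python
from prometheus_client import CollectorRegistry
from prometheus_client.core import GaugeMetricFamily


class QueueDepthCollector:
    """Hypothetical collector that reads queue depth each time it is scraped."""

    def __init__(self, read_depth):
        self._read_depth = read_depth  # any zero-argument callable

    def collect(self):
        # Called on every scrape, so the value is never stale.
        yield GaugeMetricFamily(
            "order_queue_depth",
            "Orders currently waiting in the processing queue",
            value=self._read_depth(),
        )


backlog = ["o-1", "o-2"]  # stand-in for an externally owned queue
registry = CollectorRegistry()
registry.register(QueueDepthCollector(lambda: len(backlog)))
```

A collector is the better fit when one scrape-time query yields several related metrics; for a single value, set_function is less code.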
