Exposing Custom Metrics with the Prometheus Client
Framework auto-instrumentation gives you HTTP counters, but the numbers that matter to the business — orders placed, payments declined, queue depth, batch rows processed — only you can emit. This guide shows how to define those custom metrics with prometheus_client, name and label them so they stay queryable and cheap, and expose them through your existing endpoint. It builds on the broader Prometheus client instrumentation guide and is part of the Python Metrics and Instrumentation guide.
The whole discipline reduces to three decisions per metric: which instrument type, what name and unit, and which labels. Get those right and the series are cheap, aggregatable, and self-documenting; get the labels wrong and you take down your Prometheus server.
Prerequisites
Only the client is required to define and render metrics; you expose them through whatever server you already run.
pip install "prometheus-client>=0.20.0,<1.0.0"
If you serve metrics from a prefork server (gunicorn, multi-worker uvicorn), set the multiprocess directory before the workers start, so that each worker writes its samples to shared files and the custom series from every worker can be aggregated at scrape time.
export PROMETHEUS_MULTIPROC_DIR=/tmp/app_prom
mkdir -p "$PROMETHEUS_MULTIPROC_DIR"
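With the directory in place, the scrape endpoint has to merge every worker's files instead of reading one process's in-memory registry. A minimal sketch of that aggregation follows; the helper name build_multiprocess_registry and the throwaway demo directory are illustrative assumptions, not part of the original setup:

```python
import tempfile

from prometheus_client import CollectorRegistry, generate_latest, multiprocess


def build_multiprocess_registry(path=None):
    """Return a registry that merges the metric files written by all workers.

    With path=None the collector falls back to PROMETHEUS_MULTIPROC_DIR.
    """
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry, path=path)
    return registry


# Demo only: a temporary directory stands in for /tmp/app_prom.
demo_dir = tempfile.mkdtemp(prefix="app_prom_")
payload = generate_latest(build_multiprocess_registry(demo_dir))
```

In a real deployment you would call the helper without arguments inside the metrics route, so the collector picks up the exported environment variable.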
Implementation
Step 1 — Map each signal to an instrument type. A monotonic total uses a Counter. A value that rises and falls uses a Gauge. A distribution of magnitudes or durations uses a Histogram. The decision is rarely ambiguous once you ask "does this only ever go up?" — and the deeper trade-offs, especially Histogram versus Summary, are covered in choosing between Counter, Gauge, Histogram, and Summary.
# business_metrics.py: one module owns the instruments
from prometheus_client import Counter, Gauge, Histogram

# Counter: orders only ever accumulate
ORDERS = Counter(
    "orders_processed_total",
    "Orders processed, by outcome and payment method",
    ["outcome", "payment_method"],  # both are small enumerations
)

# Gauge: queue depth goes up and down
QUEUE_DEPTH = Gauge(
    "order_queue_depth",
    "Orders currently waiting in the processing queue",
)

# Histogram: payment latency distribution, buckets in seconds
PAYMENT_LATENCY = Histogram(
    "payment_gateway_duration_seconds",
    "Payment gateway round-trip latency in seconds",
    ["payment_method"],
    buckets=(0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0),
)
Step 2 — Record from business code. Increment counters on outcomes, set or shift gauges as state changes, and time histograms around the operation. The .time() context manager observes elapsed seconds on exit; .count_exceptions() increments only when the block raises.
from business_metrics import ORDERS, QUEUE_DEPTH, PAYMENT_LATENCY
def process_order(order, method):
    QUEUE_DEPTH.dec()  # leaving the queue
    with PAYMENT_LATENCY.labels(method).time():  # observe round trip
        ok = charge(order, method)
    outcome = "success" if ok else "declined"
    ORDERS.labels(outcome=outcome, payment_method=method).inc()
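The .count_exceptions() helper mentioned above can be sketched as follows. The counter name payment_errors_total, the charge_safely function, and the isolated registry are hypothetical, chosen only so the example runs on its own:

```python
from prometheus_client import CollectorRegistry, Counter

# Isolated registry so the demo does not touch the default one.
registry = CollectorRegistry()
PAYMENT_ERRORS = Counter(
    "payment_errors_total",
    "Exceptions raised while charging a payment",
    registry=registry,
)


def charge_safely(should_fail):
    """Pretend to charge; the counter moves only when the block raises."""
    try:
        with PAYMENT_ERRORS.count_exceptions():
            if should_fail:
                raise RuntimeError("gateway timeout")
            return True
    except RuntimeError:
        # count_exceptions() does not swallow the error, so handle it here.
        return False


charge_safely(False)  # no exception: counter stays at 0
charge_safely(True)   # exception: counter goes to 1
errors = registry.get_sample_value("payment_errors_total")
```

The context manager re-raises after incrementing, so error handling stays where it already lives in your code.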
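For values your code does not naturally push on every change (a queue depth held by an external system, a cache size), use a Gauge callback so the value is read at scrape time. This avoids drift between the real value and a stale reported one. Pick one approach per gauge: once set_function is installed, inc(), dec(), and set() on that gauge no longer affect the reported value.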
QUEUE_DEPTH.set_function(lambda: redis.llen("order_queue")) # read at scrape
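To see the scrape-time behaviour without a Redis server, here is a small sketch in which an in-memory list stands in for the external queue. The names pending_orders and demo_queue_depth are illustrative assumptions:

```python
from prometheus_client import CollectorRegistry, Gauge

registry = CollectorRegistry()  # isolated registry for the demo

# Stand-in for an externally owned queue (hypothetical).
pending_orders = ["a", "b", "c"]

DEMO_DEPTH = Gauge(
    "demo_queue_depth",
    "Orders waiting in the demo queue",
    registry=registry,
)
# The callback runs every time the registry is collected.
DEMO_DEPTH.set_function(lambda: len(pending_orders))

depth_before = registry.get_sample_value("demo_queue_depth")
pending_orders.pop()  # the queue changes without touching the gauge
depth_after = registry.get_sample_value("demo_queue_depth")
```

The gauge is never updated explicitly; each collection re-reads the source, which is why the two reads differ.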
Step 3 — Follow naming and unit conventions. Names are the contract every dashboard and alert depends on. Use snake_case, base units, and the conventional suffixes: _total for counters, _seconds for any duration, _bytes for sizes. Never bake units into the value (no milliseconds, no kilobytes) — store seconds and bytes and let PromQL scale for display. The # TYPE line is emitted automatically from the instrument class, so you do not write it.
| Signal | Good name | Why |
|---|---|---|
| Orders processed | orders_processed_total | _total marks a counter |
| Payment latency | payment_gateway_duration_seconds | base unit seconds, _seconds suffix |
| Queue depth | order_queue_depth | gauge, no _total suffix |
| Cache size | cache_resident_bytes | base unit bytes, _bytes suffix |
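When an upstream client reports milliseconds, convert at the point of observation so the stored value matches the _seconds suffix. A minimal sketch, assuming a hypothetical observe_millis helper and a demo metric name:

```python
from prometheus_client import CollectorRegistry, Histogram

registry = CollectorRegistry()  # isolated registry for the demo
DEMO_LATENCY = Histogram(
    "demo_gateway_duration_seconds",
    "Demo gateway latency in seconds",
    registry=registry,
)


def observe_millis(elapsed_ms):
    """Convert a millisecond reading to base units before recording it."""
    DEMO_LATENCY.observe(elapsed_ms / 1000.0)


observe_millis(250)
observe_millis(750)
total_seconds = registry.get_sample_value("demo_gateway_duration_seconds_sum")
observations = registry.get_sample_value("demo_gateway_duration_seconds_count")
```

Keeping the conversion in one helper means the name and the stored value cannot drift apart.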
Step 4 — Keep labels bounded. Every distinct combination of label values is a separate time series stored in memory and on the wire. Labels must draw from small, enumerable sets — outcome, payment method, region, HTTP status. Never label by user ID, email, order ID, raw URL, or any free-form string, because each new value adds a permanent series and the total multiplies across labels. A counter with outcome (3 values) and payment_method (4 values) is 12 series; adding user_id makes it unbounded. The full treatment, including how to estimate and cap the series budget, is in controlling label cardinality in Prometheus.
# WRONG: user_id is unbounded -> one series per user, forever
# (only runs if "user_id" was added to the label names; otherwise labels() raises ValueError)
ORDERS.labels(outcome="success", payment_method="card", user_id=order.user).inc()
# RIGHT: bounded labels only; identity belongs in logs or traces
ORDERS.labels(outcome="success", payment_method="card").inc()
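The multiplication described in Step 4 can be checked with a few lines of arithmetic. This is only an estimate of the worst case, and the figure of 50,000 users is a made-up number for illustration:

```python
from math import prod


def series_count(label_value_counts):
    """Worst-case series for one metric: the product of per-label value counts."""
    return prod(label_value_counts.values())


# The bounded example from the text: 3 outcomes x 4 payment methods.
bounded = series_count({"outcome": 3, "payment_method": 4})

# Adding a user_id label with a hypothetical 50,000 users.
with_users = series_count({"outcome": 3, "payment_method": 4, "user_id": 50_000})
```

For a histogram the real number is higher still, because every label combination also carries one series per bucket plus the sum and count.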
Step 5 — Expose the instruments. Custom metrics registered with the default REGISTRY render automatically through the same endpoint as everything else — make_wsgi_app(), make_asgi_app(), start_http_server(), or a manual generate_latest() route. There is no extra registration step; constructing the instrument is the registration.
from prometheus_client import start_http_server
import business_metrics # importing it registers the instruments
start_http_server(8000) # custom series now appear at /metrics
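If your framework needs a hand-written route instead, the manual generate_latest() path mentioned above might look like this sketch. The metrics_response helper and the demo counter are assumptions; in production you would pass the default REGISTRY rather than an isolated one:

```python
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    CollectorRegistry,
    Counter,
    generate_latest,
)

registry = CollectorRegistry()  # isolated registry for the demo
DEMO_ORDERS = Counter(
    "demo_orders_total",
    "Demo orders by outcome",
    ["outcome"],
    registry=registry,
)
DEMO_ORDERS.labels(outcome="success").inc()


def metrics_response(reg):
    """Return (body, content_type) for a framework-agnostic /metrics route."""
    return generate_latest(reg), CONTENT_TYPE_LATEST


body, content_type = metrics_response(registry)
```

Return the body as bytes and set the Content-Type header to the second value in whatever response object your framework uses.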
Step 6 — Initialize label children so series exist before the first event. A labeled metric does not emit a series for a given label combination until that combination is first observed. This means a dashboard panel for orders_processed_total{outcome="declined"} shows "no data" until the first decline happens, which breaks alerts that expect the series to exist. Pre-create the children you know about by calling .labels(...) once at startup; for a counter this registers a series at value 0.
# Pre-seed known label combinations so the series exist from boot
for outcome in ("success", "declined"):
    for method in ("card", "wallet", "transfer"):
        ORDERS.labels(outcome=outcome, payment_method=method)
This only helps for label sets you can enumerate — which is exactly the bounded labels you should be using. If you cannot enumerate the values to pre-seed, that is a strong signal the label is too high-cardinality and belongs in a log line, not a metric.
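One way to confirm the pre-seeding took effect is to read the values back from the registry. The sketch below uses a demo counter on an isolated registry; the "refunded" outcome is a deliberately unseeded value added for contrast:

```python
from itertools import product

from prometheus_client import CollectorRegistry, Counter

registry = CollectorRegistry()  # isolated registry for the demo
SEEDED = Counter(
    "demo_seeded_orders_total",
    "Demo orders, pre-seeded at startup",
    ["outcome", "payment_method"],
    registry=registry,
)

# Create every known child once; counters start at 0.
for outcome, method in product(("success", "declined"), ("card", "wallet", "transfer")):
    SEEDED.labels(outcome=outcome, payment_method=method)

seeded_value = registry.get_sample_value(
    "demo_seeded_orders_total",
    {"outcome": "declined", "payment_method": "wallet"},
)
unseeded_value = registry.get_sample_value(
    "demo_seeded_orders_total",
    {"outcome": "refunded", "payment_method": "card"},
)
```

A seeded combination reads back as zero, while an unseeded one does not exist at all, which is the "no data" case a dashboard would hit.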
Configuration Options
| Option | Applies to | Purpose | Notes |
|---|---|---|---|
| labelnames=[...] | all instruments | Declares label keys | Keep value sets small and enumerable |
| _total / _seconds / _bytes suffix | naming | PromQL/Grafana convention | Base units only; never ms or KB |
| Histogram(buckets=...) | histogram | Distribution boundaries | Tune to expected magnitude range |
| Gauge.set_function(fn) | gauge | Read value at scrape time | For externally owned values |
| Counter.count_exceptions() | counter | Increment only on raise | Wraps a block as context manager |
| multiprocess_mode | gauge | Cross-worker reduction | livesum, max, etc. under prefork |
| registry= | all | Isolate from default | Use for tests; omit in production |
Verification
Drive a few operations, then scrape and confirm the suffixes, types, and bounded labels.
curl -s localhost:8000/metrics | grep -E "orders_processed|payment_gateway|order_queue"
Expected output (abridged; the remaining histogram buckets are omitted):
# HELP orders_processed_total Orders processed, by outcome and payment method
# TYPE orders_processed_total counter
orders_processed_total{outcome="success",payment_method="card"} 18.0
orders_processed_total{outcome="declined",payment_method="card"} 2.0
# HELP order_queue_depth Orders currently waiting in the processing queue
# TYPE order_queue_depth gauge
order_queue_depth 4.0
# HELP payment_gateway_duration_seconds Payment gateway round-trip latency in seconds
# TYPE payment_gateway_duration_seconds histogram
payment_gateway_duration_seconds_bucket{payment_method="card",le="0.5"} 15.0
payment_gateway_duration_seconds_bucket{payment_method="card",le="1.0"} 20.0
payment_gateway_duration_seconds_bucket{payment_method="card",le="+Inf"} 20.0
payment_gateway_duration_seconds_sum{payment_method="card"} 7.84
payment_gateway_duration_seconds_count{payment_method="card"} 20.0
A healthy scrape shows a handful of series per metric — one per bounded label combination — not thousands. If grep -c returns a number that grows with traffic or user count, a label is unbounded and must be removed.
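The same check can be scripted by parsing the exposition text and counting samples per name. A rough sketch using the client's text parser; the series_per_name helper and the inline sample text are illustrative, and in practice the text would come from the scrape:

```python
from collections import Counter as Tally

from prometheus_client.parser import text_string_to_metric_families

SAMPLE_SCRAPE = """\
# HELP orders_processed_total Orders processed, by outcome and payment method
# TYPE orders_processed_total counter
orders_processed_total{outcome="success",payment_method="card"} 18.0
orders_processed_total{outcome="declined",payment_method="card"} 2.0
# HELP order_queue_depth Orders currently waiting in the processing queue
# TYPE order_queue_depth gauge
order_queue_depth 4.0
"""


def series_per_name(exposition_text):
    """Count how many series each sample name contributes to a scrape."""
    tally = Tally()
    for family in text_string_to_metric_families(exposition_text):
        for sample in family.samples:
            tally[sample.name] += 1
    return tally


counts = series_per_name(SAMPLE_SCRAPE)
```

Run it against successive scrapes; a count that keeps climbing for one name points at the label to remove.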
Common Mistakes
Units encoded in the name or value
Error signature: PromQL math is off by a factor of 1000; a panel labeled "seconds" shows milliseconds.
Root cause: the metric stores milliseconds or kilobytes and the name lies, or the name carries a unit the value contradicts.
Remediation: always store base units — seconds, bytes — and suffix the name accordingly (_seconds, _bytes). Let Grafana and PromQL scale for display rather than pre-scaling in the application.
A label value set that grows without bound
Error signature: Prometheus memory climbs steadily, prometheus_tsdb_head_series grows linearly with traffic, and queries slow down.
Root cause: a label carries a high-cardinality value such as user ID, order ID, or raw path, creating a new permanent series per value.
Remediation: remove the offending label and move that identity into logs or trace attributes. Keep labels to small enumerations and validate the series budget per the label cardinality guidance.
Frequently Asked Questions
What suffix should a custom metric name use?
Suffix monotonic counters with _total and any duration with _seconds, using base units throughout. A latency histogram is request_duration_seconds and a processed-items counter is items_processed_total. The TYPE comment is set automatically by the instrument class.
Can I use a high-cardinality value like user ID as a label?
No. Each unique combination of label values is a separate stored time series, so user IDs, request paths, and email addresses cause unbounded series growth that overloads Prometheus. Keep labels to small enumerable sets and push high-cardinality identifiers into logs or traces instead.
How do I track a value the application does not push, like a queue depth?
Use a Gauge with a callback through the set_function method, or a custom collector, so the value is read at scrape time rather than maintained on every change. This avoids drift between the real value and the reported one.