Instrumenting aiohttp Client Requests with OpenTelemetry

Outbound calls made with aiohttp need their trace context injected into the request headers; otherwise the downstream service starts a fresh trace and the call chain breaks at every hop. This guide belongs to the Distributed Tracing and OpenTelemetry in Python guide and the async tracing patterns guide. It shows how AioHttpClientInstrumentor produces client spans and propagates W3C trace context so client and server spans correlate in a single trace.

The instrumentation patches aiohttp.ClientSession so every request opens a CLIENT span and writes the active trace context into the outgoing headers as a traceparent value. The receiving service, if it extracts that header, attaches its SERVER span to the same trace. This handoff is the client-side half of context propagation and baggage, and it is what turns a pile of independent spans into one connected call graph.

The traceparent header carries one trace id from the caller's CLIENT span to the downstream SERVER span.

Prerequisites

Pin the instrumentation against the SDK. The client instrumentation tracks the unstable instrumentation channel, so a bounded range keeps builds reproducible.

pip install "opentelemetry-sdk>=1.30.0,<2.0.0" \
            "opentelemetry-instrumentation-aiohttp-client>=0.51b0,<1.0.0" \
            "opentelemetry-exporter-otlp-proto-grpc>=1.30.0,<2.0.0" \
            "aiohttp>=3.9.0,<4.0.0"

Configure the exporter endpoint and service name so client spans are attributed correctly.

export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_SERVICE_NAME="checkout-gateway"

Implementation

First, set up the TracerProvider and a BatchSpanProcessor at import time, before any ClientSession is created. If the provider is installed after the instrumentation runs, requests bind to a NoOpTracer and no spans are exported.

import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# 1. Provider first so instrumentation binds to a real tracer.
provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint=os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT"))
    )
)
trace.set_tracer_provider(provider)

Second, instrument the aiohttp client once. The single instrument() call patches ClientSession process-wide, so it must run before the sessions you want traced are constructed.

from opentelemetry.instrumentation.aiohttp_client import AioHttpClientInstrumentor

# 2. Patch ClientSession so every request opens a CLIENT span and injects context.
AioHttpClientInstrumentor().instrument()

Third, issue requests inside an active span. The instrumentation injects traceparent from whatever span is current in the asyncio context, so the outbound call must run beneath a parent span to propagate a meaningful chain. Wrapping the work in an explicit span makes the relationship clear.

import asyncio
import aiohttp
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

async def fetch_inventory(sku: str) -> dict:
    # The CLIENT span and traceparent header are created under this parent span.
    with tracer.start_as_current_span("check-inventory"):
        async with aiohttp.ClientSession() as session:
            async with session.get(
                f"http://inventory:8080/items/{sku}"
            ) as resp:
                resp.raise_for_status()
                return await resp.json()

asyncio.run(fetch_inventory("SKU-9931"))

The downstream service receives a request whose headers carry the injected context.

Expected Output: the outgoing request includes a W3C traceparent header that the downstream service can extract.

GET /items/SKU-9931 HTTP/1.1
Host: inventory:8080
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

How the traceparent header is injected

The instrumentation does not rewrite the request path itself. instrument() wraps ClientSession.__init__ so that each new session is created with an extra aiohttp TraceConfig, which is why sessions built before the call stay untraced. When a request begins, the config's on_request_start callback opens the CLIENT span, makes that span current in a copy of the active context, and then calls the configured global propagator's inject method against the request's headers mapping. Unless OTEL_PROPAGATORS says otherwise, the global propagator is a composite of the W3C trace context and baggage propagators: the first writes the traceparent field, and the second writes a baggage header when baggage is set. If you add B3 or another propagator to the composite, each one writes its own header in the same pass. Injection uses a setter that assigns into the outgoing CIMultiDict headers in place, so the headers you supplied yourself are kept alongside the injected ones.

The traceparent value encodes four fields separated by hyphens: the version (00), the 32-hex-character trace id, the 16-hex-character parent span id (which is this client span's id, not the request's logical parent), and a two-character trace-flags byte whose low bit is the sampled flag. A downstream service that extracts a traceparent with the sampled bit clear should respect that decision under a parent-based sampler, which is how a head-sampling choice made at the edge propagates intact through every hop. Because the injected span id is the client span's id, the downstream SERVER span's parent_id will equal that value, which is the join key your backend uses to draw the edge between the two services.
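The four-field layout described above can be checked with a small parser. This is a minimal sketch using only the standard library; parse_traceparent and the TraceParent tuple are illustrative names, not part of OpenTelemetry, and the sketch accepts only version 00.

```python
import re
from typing import NamedTuple

# version-traceid-parentid-flags, all lowercase hex, per W3C Trace Context.
_TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})"
    r"-(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> TraceParent:
    """Split a version-00 traceparent value into its four fields."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None or match["version"] != "00":
        raise ValueError(f"not a version-00 traceparent: {header!r}")
    if set(match["trace_id"]) == {"0"} or set(match["parent_id"]) == {"0"}:
        raise ValueError("all-zero trace id or parent id is invalid")
    # The low bit of the flags byte is the sampled flag.
    sampled = bool(int(match["flags"], 16) & 0x01)
    return TraceParent("00", match["trace_id"], match["parent_id"], sampled)


parsed = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
print(parsed.trace_id, parsed.parent_id, parsed.sampled)
```

Comparing parsed.parent_id with the span id of the exported CLIENT span is a quick way to confirm that the downstream service received the header this client actually sent.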

The injection reads the current context at request time, so if you change the context between opening your span and issuing the request, for example by entering a nested span, the CLIENT span is parented to the innermost active span and the header carries that CLIENT span's id. That is usually correct, but it explains why a request issued from inside a helper span links to the helper rather than the handler.

Enriching spans with request and response hooks

The default CLIENT span carries http.method, http.url, and http.status_code, but production traces usually need more: the upstream service name, a request id, a payload size, or a categorized error. Hooks run inside the span's lifetime so anything they set lands on the right span without manual context juggling. The request_hook fires after the span opens but before the bytes go out; the response_hook fires once headers return.

from aiohttp import (
    TraceRequestEndParams,
    TraceRequestExceptionParams,
    TraceRequestStartParams,
)
from opentelemetry.instrumentation.aiohttp_client import AioHttpClientInstrumentor
from opentelemetry.trace import Span

def request_hook(span: Span, params: TraceRequestStartParams) -> None:
    # Add a stable peer name and the target host for easier filtering.
    if span and span.is_recording():
        span.set_attribute("peer.service", "inventory")
        span.set_attribute("http.request.host", params.url.host or "")

def response_hook(
    span: Span, params: TraceRequestEndParams | TraceRequestExceptionParams
) -> None:
    # On a failed request the params carry an exception and no response.
    response = getattr(params, "response", None)
    if response is None or not (span and span.is_recording()):
        return
    # Record the upstream's content length and flag server errors.
    length = response.headers.get("Content-Length")
    if length is not None and length.isdigit():
        span.set_attribute("http.response.body.size", int(length))
    if response.status >= 500:
        span.set_attribute("error.type", "upstream_5xx")

# Pass the hooks in the one instrument() call; a second call on an
# already-instrumented process is ignored with a warning.
AioHttpClientInstrumentor().instrument(
    request_hook=request_hook,
    response_hook=response_hook,
)

Expected Output: the enriched CLIENT span now carries the hook-added attributes.

{
  "name": "GET",
  "kind": "SpanKind.CLIENT",
  "attributes": {
    "http.method": "GET",
    "http.url": "http://inventory:8080/items/SKU-9931",
    "http.status_code": 200,
    "peer.service": "inventory",
    "http.request.host": "inventory",
    "http.response.body.size": 184
  }
}

Always guard hook bodies with span.is_recording(). Under a sampler that dropped the trace, the span is a non-recording stub; calling set_attribute on it is harmless but the surrounding work (parsing headers, computing sizes) is wasted, and the guard makes the cost-free path explicit.

Connection pooling and span timing

A ClientSession owns a TCPConnector whose pool is bounded by limit (total) and limit_per_host. This matters for trace interpretation because the CLIENT span covers the whole request, including any time spent waiting for a free connection from the pool. Under saturation — more concurrent requests than limit_per_host allows — a span's duration includes queueing latency that is invisible in the status code, so a slow span with a 200 result and a healthy upstream is a strong signal that the connector is the bottleneck. Create one long-lived session per service and reuse it; constructing a session per request defeats keep-alive, forces a fresh TCP and TLS handshake on every call, and inflates every span with connection-setup time that should have been amortized.

import aiohttp
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

# One shared, long-lived session; the connector pools and reuses sockets.
_session: aiohttp.ClientSession | None = None

async def get_session() -> aiohttp.ClientSession:
    global _session
    if _session is None or _session.closed:
        # Build the connector here, inside the running event loop. A closed
        # session also closes the connector it owned, so make a fresh one.
        connector = aiohttp.TCPConnector(limit=100, limit_per_host=20)
        _session = aiohttp.ClientSession(connector=connector)
    return _session

async def fetch_price(sku: str) -> dict:
    session = await get_session()
    with tracer.start_as_current_span("fetch-price"):
        async with session.get(f"http://pricing:8080/p/{sku}") as resp:
            resp.raise_for_status()
            return await resp.json()

Expected Output: reused sockets keep span durations dominated by upstream time, not handshakes.

GET   http.url=http://pricing:8080/p/SKU-9931   duration=12ms   (warm pool)
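The queueing effect described in this section can be reproduced without a network. The sketch below is a simulation under stated assumptions: an asyncio.Semaphore stands in for the connector's limit_per_host, asyncio.sleep stands in for the upstream's work, and timed_call and simulate are hypothetical helpers. The timer starts before the pool is acquired, just as the CLIENT span starts before a connection is available.

```python
import asyncio
import time


async def timed_call(pool: asyncio.Semaphore, upstream_seconds: float) -> float:
    """Return the wall-clock time a caller observes, queueing included."""
    started = time.perf_counter()  # where a CLIENT span would start
    async with pool:  # wait for a free "connection"
        await asyncio.sleep(upstream_seconds)  # the upstream's real work
    return time.perf_counter() - started


async def simulate(limit_per_host: int, callers: int) -> list[float]:
    pool = asyncio.Semaphore(limit_per_host)
    return await asyncio.gather(
        *(timed_call(pool, 0.02) for _ in range(callers))
    )


durations = asyncio.run(simulate(limit_per_host=1, callers=3))
print([f"{d * 1000:.0f}ms" for d in durations])
```

Every caller does the same 20 ms of upstream work, yet later callers report longer durations because they waited for the pool. Raising limit_per_host to 3 in the simulation removes the difference, which mirrors what a larger connector limit does for real spans.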

Retries and per-attempt spans

Transient upstream failures are normal, and you usually want each retry attempt to appear as its own CLIENT span so the trace shows the attempt count and the backoff gaps. Wrap the retry loop in an application span and re-enter session.get on each iteration — that re-enters the instrumented request path, producing a fresh span per attempt that each carries its own traceparent (a new client span id per attempt), so the downstream sees distinct requests.

import asyncio
import aiohttp

# Reuses tracer and get_session() from the pooling example above.

async def fetch_with_retry(sku: str, attempts: int = 3) -> dict:
    session = await get_session()
    with tracer.start_as_current_span("fetch-inventory-retrying") as parent:
        for attempt in range(1, attempts + 1):
            parent.set_attribute("retry.attempt", attempt)
            try:
                # Each iteration opens a new instrumented CLIENT span.
                async with session.get(
                    f"http://inventory:8080/items/{sku}"
                ) as resp:
                    resp.raise_for_status()
                    return await resp.json()
            except aiohttp.ClientError:
                if attempt == attempts:
                    raise
                await asyncio.sleep(0.2 * attempt)  # linear backoff

Expected Output: three sibling CLIENT spans under one parent, each with its own span id.

fetch-inventory-retrying   parent_id=null
GET   parent_id=<fetch-inventory-retrying>   http.status_code=503
GET   parent_id=<fetch-inventory-retrying>   http.status_code=503
GET   parent_id=<fetch-inventory-retrying>   http.status_code=200

Configuration options

AioHttpClientInstrumentor().instrument accepts hooks and filters to shape the emitted spans.

Option	Type	Effect
`request_hook`	`callable`	Invoked with the span and request params; use it to add attributes.
`response_hook`	`callable`	Invoked with the span and the end-of-request params, which hold either the response or the exception; use it to record status detail.
`url_filter`	`callable`	Rewrites the URL recorded on the span, e.g. to strip query secrets.
`tracer_provider`	`TracerProvider`	Overrides the global provider for these spans.

A url_filter is the standard way to keep high-cardinality path segments or sensitive query strings out of the http.url attribute.
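A filter of this kind can be sketched with the standard library alone. The function name strip_sensitive_parts is hypothetical, and the sketch assumes the filter is handed a URL object whose str() form is the full URL and is expected to return the string to record. It removes credentials, query string, and fragment; it does not collapse high-cardinality path segments, which would need route-specific rules.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_sensitive_parts(url) -> str:
    """Drop credentials, query string, and fragment from a URL."""
    parts = urlsplit(str(url))
    # Rebuild the host without any user:password@ prefix.
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal needs its brackets back
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


print(strip_sensitive_parts(
    "https://user:secret@inventory:8080/items/SKU-9931?token=abc#frag"
))
# prints https://inventory:8080/items/SKU-9931
```

Passing the function as url_filter=strip_sensitive_parts in the instrument() call would apply it to the URL recorded on each CLIENT span.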

Verification

Attach a ConsoleSpanExporter locally and confirm a CLIENT span is produced with the expected HTTP attributes. The trace_id must match the SERVER span recorded by the downstream service.

{
  "name": "GET",
  "kind": "SpanKind.CLIENT",
  "attributes": {
    "http.method": "GET",
    "http.url": "http://inventory:8080/items/SKU-9931",
    "http.status_code": 200
  },
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"
}

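If the downstream SERVER span carries a trace_id different from the CLIENT span's, the header did not survive the hop: the two services are configured with different propagators, an intermediary stripped the header, or the receiving service is not extracting it. If the two spans share a trace_id but your application spans are missing from the trace, the request ran outside any active span and the CLIENT span became the root of a new trace. Confirm the client call sits under a parent span and that the server runs matching instrumentation, the extraction half of context propagation and baggage.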

Common mistakes

Issuing requests outside an active span. When no span is current, the CLIENT span becomes the root of a new trace: the downstream SERVER span still joins it, but nothing connects the call to the work that triggered it. Wrap outbound calls in start_as_current_span, or rely on an inbound server instrumentation to establish the parent, the same discipline applied when setting up OpenTelemetry in FastAPI.

Leaving full URLs unfiltered. Recording raw URLs that embed identifiers or tokens both leaks data and can inflate the cardinality of http.url, which complicates trace queries. Supply a url_filter to normalize paths, consistent with how sampling strategies keep trace volume manageable.

Creating a session per request. Constructing a new ClientSession for every call discards the connection pool, so each request pays a fresh TCP and TLS handshake and the CLIENT span's duration is dominated by connection setup rather than upstream work. Worse, sessions left unclosed leak connections and emit "Unclosed client session" warnings. Build one long-lived session per upstream and reuse it, so spans measure the call and not the handshake. This mirrors the broader rule for shared clients under concurrency covered in async tracing patterns.

Frequently Asked Questions

Does AioHttpClientInstrumentor inject trace headers automatically?

Yes. Once instrument() runs, every request issued through a ClientSession created afterwards gets a W3C traceparent header injected from the active context, so the downstream server span links to your client span without any manual header code.

Why is my downstream server not joining the trace?

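Either the receiving service is not extracting the traceparent header, or the header never left the client because the ClientSession was created before instrument() ran. Confirm the client span has a valid trace_id and that the server side runs a matching instrumentation that reads incoming context. If the server does join but your application spans are missing from the trace, the request ran outside any active span and the CLIENT span became the trace root.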

Can I instrument only some requests?

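instrument() patches the client globally. The request_hook and response_hook enrich spans but cannot drop them, and url_filter only redacts or normalizes the recorded URL. There is no per-request switch in instrument() itself. For per-session control, the package also exposes create_trace_config, which builds a TraceConfig you can pass to individual sessions through trace_configs instead of patching every session.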

What span kind do client requests produce?

Each outgoing request produces a CLIENT span named for the HTTP method, with attributes such as http.method, http.url, and http.status_code. The receiving service produces the matching SERVER span under the same trace.

Does each retry get its own span when using a retry wrapper?

It depends where the retry lives. If you retry by issuing a fresh session.get inside the same parent span, each attempt produces its own CLIENT span, which is what you want for visibility into transient failures. A retry library that replays the same request object may reuse one span, hiding the attempt count, so prefer wrapping the call so each attempt re-enters the instrumented request path.

Related Guides