
Observability

AuthPlane exposes Prometheus metrics on the admin port, structured JSON logs to stdout, and optional OTLP export for logs, metrics, and traces.

| Surface | Where | Configured by |
| --- | --- | --- |
| Prometheus metrics | GET /metrics on the admin port (default :9001) | observability.metrics.provider: prometheus (default), observability.metrics.path: /metrics |
| Structured logs | stdout, JSON | observability.logging.format: json (default), level via AUTHPLANE_LOG_LEVEL |
| OTLP logs | observability.logging.outputs.otel: true + endpoint | AUTHPLANE_LOG_OTEL, AUTHPLANE_LOG_OTEL_ENDPOINT |
| OTLP traces | observability.tracing.enabled: true + endpoint | AUTHPLANE_TRACING_ENABLED, AUTHPLANE_TRACING_ENDPOINT |
| OTLP metrics (in parallel with Prometheus) | observability.metrics.provider: both | AUTHPLANE_METRICS_PROVIDER, AUTHPLANE_METRICS_OTEL_ENDPOINT |

/metrics is bound on the admin port so it is not publicly exposed by default. Make sure your scraper can reach it — same cluster network in Kubernetes, loopback for systemd.

# prometheus.yml
scrape_configs:
  - job_name: authserver
    scrape_interval: 15s
    metrics_path: /metrics
    static_configs:
      - targets: ["authserver:9001"]  # admin port

In Kubernetes via the Helm chart, serviceMonitor.enabled: true already points at the admin port. The metrics provider has three modes — prometheus (default pull-based), otel (push over OTLP), or both.

observability:
  tracing:
    enabled: true                  # AUTHPLANE_TRACING_ENABLED
    endpoint: otel-collector:4317  # AUTHPLANE_TRACING_ENDPOINT
    insecure: true                 # AUTHPLANE_TRACING_INSECURE (TLS off for local)
    sample_rate: 1.0               # AUTHPLANE_TRACING_SAMPLE_RATE; lower for prod

Drop the sample rate (e.g. 0.1) once you’ve validated traces; exporting every trace at a sample rate of 1.0 is expensive in collector load and trace storage.
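To make the effect of sample_rate concrete, here is a minimal Python sketch of ratio-based head sampling in the style of OpenTelemetry's TraceIdRatioBased sampler. Whether AuthPlane samples exactly this way is an assumption; the function names keep_trace and kept_fraction are hypothetical and exist only for illustration.

```python
import random

LOW_64_BITS = (1 << 64) - 1


def keep_trace(trace_id: int, sample_rate: float) -> bool:
    """Keep a trace when the low 64 bits of its ID fall below rate * 2**64.

    Assumption: this mirrors a common OpenTelemetry-style ratio sampler,
    not necessarily what AuthPlane does internally.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be in [0, 1]")
    bound = int(sample_rate * (1 << 64))
    return (trace_id & LOW_64_BITS) < bound


def kept_fraction(sample_rate: float, n: int = 100_000, seed: int = 7) -> float:
    """Estimate the share of random 128-bit trace IDs that would be exported."""
    rng = random.Random(seed)
    kept = sum(keep_trace(rng.getrandbits(128), sample_rate) for _ in range(n))
    return kept / n


if __name__ == "__main__":
    for rate in (1.0, 0.1):
        print(f"sample_rate={rate}: kept about {kept_fraction(rate):.3f} of traces")
```

Because the decision depends only on the trace ID, every service that sees the same trace and uses the same rate makes the same keep-or-drop choice, so sampled traces stay complete.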

observability:
  logging:
    level: info                          # AUTHPLANE_LOG_LEVEL
    format: json
    outputs:
      stdout: true                       # keep stdout; pod-logs are forever
      otel: true                         # AUTHPLANE_LOG_OTEL
    otel_endpoint: otel-collector:4317
    insecure: true

Each log line includes trace_id, span_id, request_id, client_id, and the grant name — click a trace in Grafana Tempo to pivot to the matching logs in Loki.
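The same pivot can be done by hand on raw stdout logs. The sketch below filters JSON log lines by trace_id. The sample lines are invented for illustration: trace_id, span_id, request_id, and client_id come from the description above, while the level, msg, and grant keys are assumed names that may differ in real output.

```python
import json

# Invented sample output; real AuthPlane log lines may carry different keys.
SAMPLE_LOGS = """\
{"level":"info","msg":"token issued","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736","span_id":"00f067aa0ba902b7","request_id":"req-1","client_id":"web-app","grant":"authorization_code"}
{"level":"warn","msg":"auth denied","trace_id":"a3ce929d0e0e47364bf92f3577b34da6","span_id":"b7ad6b7169203331","request_id":"req-2","client_id":"cli-tool","grant":"client_credentials"}
not json: a stray line from a sidecar
{"level":"info","msg":"response sent","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736","span_id":"53995c3f42cd8ad8","request_id":"req-1","client_id":"web-app","grant":"authorization_code"}
"""


def logs_for_trace(raw: str, trace_id: str) -> list[dict]:
    """Return the parsed log records whose trace_id matches, in input order."""
    matches = []
    for line in raw.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON lines mixed into the stream
        if isinstance(record, dict) and record.get("trace_id") == trace_id:
            matches.append(record)
    return matches


if __name__ == "__main__":
    for rec in logs_for_trace(SAMPLE_LOGS, "4bf92f3577b34da6a3ce929d0e0e4736"):
        print(rec["span_id"], rec["msg"])
```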

The repo ships a Grafana LGTM overlay at deploy/observability/docker-compose.observability.yml running Alloy (OTLP gateway on :4317), Tempo (traces), Loki (logs), Mimir (metrics), Prometheus, and Grafana. Run it standalone:

docker compose -f deploy/observability/docker-compose.observability.yml up -d
# Grafana at http://localhost:3000 (admin/admin)

Or pull it into your own stack with Compose’s include: directive and point logs, traces, and metrics at alloy:4317 — this is exactly what the reference deploy/docker-compose.yml does.

Metric names currently use two prefixes: authserver_* for the OAuth core and authplane_* for newer subsystems (DPoP, token exchange, client credentials, XAA). Both are live, so dashboards and alerts should match both.

| Metric | Why operators care |
| --- | --- |
| authserver_tokens_issued_total{grant_type} | Throughput baseline, anomaly detection |
| authserver_refresh_token_reuse_total | Stolen-token reuse (RFC 6749 §10.4); page on any non-zero increase |
| authserver_auth_denied_total{reason} | locked_out = brute-force; invalid_client = misconfigured caller |
| authplane_dpop_proofs_rejected_total | Token-binding violations |
| authplane_token_exchange_denied_total | Cross-client policy denials |
| authserver_token_issuance_duration_seconds{grant_type} | Per-grant p99 latency |
| authserver_http_request_duration_seconds{method,path,status} | Full HTTP-surface SLO |
| authserver_active_token_families | Outstanding token families; sizing input for the purge schedule |

The exhaustive instrument list lives in docs/reference/metrics.md, generated from the metrics source of truth.
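Since two prefixes are in use, a quick way to see what a scrape actually returns is to count series per prefix. The sketch below parses Prometheus text exposition output. The sample payload, including its HELP text and values, is invented; only the metric names come from the table above, and series_by_prefix is a hypothetical helper.

```python
from collections import Counter

# Invented sample of /metrics output; values and HELP text are illustrative only.
SAMPLE = """\
# HELP authserver_tokens_issued_total Tokens issued.
# TYPE authserver_tokens_issued_total counter
authserver_tokens_issued_total{grant_type="authorization_code"} 42
authserver_tokens_issued_total{grant_type="refresh_token"} 17
authserver_refresh_token_reuse_total 0
authplane_dpop_proofs_rejected_total 3
go_goroutines 12
"""


def series_by_prefix(text: str, prefixes=("authserver_", "authplane_")) -> Counter:
    """Count exposed series per metric-name prefix, ignoring comments and other metrics."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        for prefix in prefixes:
            if name.startswith(prefix):
                counts[prefix] += 1
                break
    return counts


if __name__ == "__main__":
    print(dict(series_by_prefix(SAMPLE)))
```

Against a live server, the text argument would be the body returned by GET /metrics on the admin port.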

groups:
  - name: authserver
    rules:
      - alert: RefreshTokenReuse  # page-worthy
        expr: increase(authserver_refresh_token_reuse_total[5m]) > 0
        for: 1m
        labels: { severity: critical }
        annotations:
          summary: "Refresh token reuse — possible theft"
      - alert: HighAuthDenialRate  # likely an attack or a regression
        expr: rate(authserver_auth_denied_total[5m]) > 10
        for: 5m
        labels: { severity: warning }
        annotations:
          summary: "auth_denied rate >10/s for 5m"
      - alert: TokenP99Slow  # something's wrong with signing or DB
        expr: histogram_quantile(0.99, rate(authserver_token_issuance_duration_seconds_bucket[5m])) > 2
        for: 5m
        labels: { severity: warning }
        annotations:
          summary: "Token issuance p99 > 2s"
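The p99 in the TokenP99Slow rule is an estimate, not a measured value: histogram_quantile interpolates linearly inside the bucket that contains the requested rank. The following Python sketch reproduces that arithmetic in simplified form (it omits Prometheus edge cases such as non-positive bucket bounds). The bucket boundaries and counts are hypothetical, not AuthPlane's actual histogram layout.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    The list must include the +Inf bucket, as Prometheus histograms do.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must end with a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls past the last finite bound: report that bound.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            # Linear interpolation within the bucket that holds the rank.
            share = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * share
        lower_bound, lower_count = upper_bound, count
    return math.nan


if __name__ == "__main__":
    # Hypothetical cumulative counts per "le" bound, in seconds.
    observed = [(0.1, 50), (0.5, 90), (1.0, 97), (2.5, 100), (math.inf, 100)]
    print(histogram_quantile(0.99, observed))
```

With these hypothetical buckets the 99th-percentile rank lands in the 1.0 to 2.5 second bucket, so the estimate can only be as precise as that bucket is wide. A 2 second threshold is most reliable when a bucket boundary sits at or near 2 seconds.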
To verify the setup from a terminal:
# Scrape the admin port — sanity-check metric names
curl -fsS http://localhost:9001/metrics | grep -E '^authserver_|^authplane_' | head -5
# Confirm tracing is exporting (look for the OTLP grpc connection log line)
kubectl logs deploy/authplane | grep -iE 'otlp|tracer'

Next: Docker · Kubernetes · Configuration