Traefik rate limit middleware — average, burst, sourceCriterion, and what each does

Configure the Traefik rateLimit middleware — token bucket parameters, per-IP vs per-header limiting, and choosing limits that protect without false positives.

The Traefik rateLimit middleware throttles incoming traffic using a token-bucket algorithm: an average request rate, plus a burst allowance for momentary spikes. It is the right primitive for protecting expensive endpoints (login, search, write APIs) from credential stuffing, scrapers, and accidental client loops. What trips people up is the parameters: a limit that is too tight rejects legitimate traffic, and one that is too loose does not stop the abuse. This article covers average, burst, and period, the sourceCriterion that decides what the limit is keyed on, and the patterns we deploy.

How to verify

# Hammer the route and watch for 429
for i in $(seq 1 50); do
  curl -sI https://api.example.com/login -o /dev/null -w "%{http_code}\n"
done | sort | uniq -c
# Expected: most 200, a tail of 429 once the bucket drains
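# Confirm the middleware is wired (match on the rateLimit config key,
# since the API may report the type field in lowercase)
curl -s http://127.0.0.1:8082/api/http/middlewares | jq '.[] | select(.rateLimit != null)'
# 429 count in Prometheus. Requests rejected by the middleware never reach the
# service, so check the entrypoint and router counters (router metrics need
# addRoutersLabels enabled)
curl -s http://127.0.0.1:9100/metrics | grep -E 'traefik_(entrypoint|router)_requests_total.*code="429"'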

What’s happening

The token bucket starts full. Every incoming request consumes one token. Tokens refill at a rate of average per second. When the bucket is empty, requests get HTTP 429. The burst parameter is the maximum tokens the bucket holds — a higher burst means a bigger initial spike is allowed.
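To make the bucket arithmetic concrete, here is a toy simulation in POSIX shell. It is a sketch of the behavior described above, not Traefik's implementation; the function name simulate_bucket and its arguments are made up for illustration.

```shell
# Toy token bucket. Tokens are stored in thousandths so integer math is enough.
# Usage: simulate_bucket AVERAGE BURST GAP_MS REQUESTS
#   AVERAGE  tokens refilled per second
#   BURST    bucket capacity
#   GAP_MS   milliseconds between consecutive requests
simulate_bucket() (
  average=$1 burst=$2 gap_ms=$3 requests=$4
  capacity=$((burst * 1000))
  tokens=$capacity                 # the bucket starts full
  i=0
  while [ "$i" -lt "$requests" ]; do
    if [ "$i" -gt 0 ]; then
      # Refill: average tokens/second equals average millitokens/millisecond
      tokens=$((tokens + average * gap_ms))
      if [ "$tokens" -gt "$capacity" ]; then tokens=$capacity; fi
    fi
    if [ "$tokens" -ge 1000 ]; then
      tokens=$((tokens - 1000))    # one request costs one token
      echo 200
    else
      echo 429                     # bucket empty: request rejected
    fi
    i=$((i + 1))
  done
)

# 10 back-to-back requests against average=100, burst=5
simulate_bucket 100 5 0 10 | sort | uniq -c
```

With no gap between requests, the first five drain the bucket and the remaining five are rejected. Spacing the same requests 10 ms apart (simulate_bucket 100 5 10 10) refills one token per request, so none are rejected.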

A common confusion: average is in requests per period, where period defaults to 1 second. So average: 100, burst: 50 means 100 RPS sustained, with at most 50 requests let through in a single instantaneous spike. The 50 is the bucket size, not an allowance on top of the average. You can set period: 1m to talk in requests per minute; the math is the same but the unit changes.
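One way to sanity-check an average/period pair is to convert it into the time between refilled tokens. The helper below is an illustrative sketch; token_interval is a hypothetical name, and period is passed in seconds rather than as a Traefik duration string.

```shell
# Seconds between refilled tokens for a given average and period (in seconds).
# Usage: token_interval AVERAGE PERIOD_SECONDS
token_interval() {
  awk -v avg="$1" -v period="$2" 'BEGIN { printf "%.2f\n", period / avg }'
}

token_interval 100 1    # average: 100, period: 1s  -> 0.01
token_interval 5 60     # average: 5,   period: 1m  -> 12.00
```

A result of 12.00 means a client that has emptied its bucket waits about 12 seconds for the next token, which is a more intuitive figure than "5 per minute" when tuning a login limit.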

The middleware needs to know what to limit by. By default it limits per source IP using the immediate peer. Behind a proxy you need sourceCriterion.ipStrategy.depth to walk X-Forwarded-For (same semantics as ipAllowList). You can also limit by a request header value (X-Tenant, Authorization) using sourceCriterion.requestHeaderName — useful for per-tenant or per-API-key limits.
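The depth setting is easier to reason about with a concrete header. The sketch below mimics the selection the Traefik documentation describes for ipStrategy.depth, counting entries of X-Forwarded-For from the right. The function name xff_at_depth and the addresses are illustrative assumptions, and this is not Traefik's own code.

```shell
# Print the X-Forwarded-For entry at a given depth (1 = rightmost entry).
# Prints nothing if the header has fewer entries than the depth.
# Usage: xff_at_depth "HEADER VALUE" DEPTH
xff_at_depth() {
  printf '%s\n' "$1" | awk -F',' -v d="$2" '{
    i = NF - d + 1
    if (d >= 1 && i >= 1) {
      ip = $i
      gsub(/^[ \t]+|[ \t]+$/, "", ip)   # trim spaces around the entry
      print ip
    }
  }'
}

# Client -> CDN -> LB -> Traefik: the LB appends the CDN address it saw.
xff_at_depth "203.0.113.7, 198.51.100.20" 1   # 198.51.100.20 (the CDN)
xff_at_depth "203.0.113.7, 198.51.100.20" 2   # 203.0.113.7   (the client)
```

If depth is too shallow, every client behind the CDN shares one bucket; if it is too deep, the key comes from a part of the header the client can forge.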

The store is in-memory and per-Traefik-instance. If you run two Traefik replicas, each has its own bucket and the effective limit is 2x average. There is no shared-store option in core; for true global limits you need a plugin backed by Redis or similar.
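The replica arithmetic can be scripted when generating config. This is a minimal sketch under a stated assumption: the load balancer in front spreads each client's requests evenly across replicas. The helper name per_replica_average is hypothetical.

```shell
# Per-replica average for a target global limit. Rounds down so the sum
# across replicas never exceeds the target, with a floor of 1.
# Usage: per_replica_average TARGET_AVERAGE REPLICAS
per_replica_average() (
  target=$1 replicas=$2
  value=$((target / replicas))
  if [ "$value" -lt 1 ]; then value=1; fi
  echo "$value"
)

per_replica_average 100 2   # 50
per_replica_average 100 3   # 33
```

If the load balancer pins a client to one replica (sticky sessions or source-IP hashing), that client only ever sees one bucket, so dividing by N would make the limit N times stricter than intended.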

The procedure

  1. Basic per-IP rate limit. 100 RPS sustained, 50-request burst, default per-IP keying.

    http:
      middlewares:
        rl-api:
          rateLimit:
            average: 100
            burst: 50
      routers:
        api:
          rule: "Host(`api.example.com`)"
          entryPoints: [websecure]
          service: api-backend
          middlewares: [rl-api]
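  2. Tight limit on a sensitive endpoint. Login gets a stricter limit over a longer window: 5 attempts per minute with no burst. With burst: 1 the bucket holds a single token that refills every 12 seconds, so attempts are spaced out rather than allowed five at once.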

    http:
      middlewares:
        rl-login:
          rateLimit:
            average: 5
            period: 1m
            burst: 1
      routers:
        login:
          rule: "Host(`api.example.com`) && Path(`/login`)"
          entryPoints: [websecure]
          service: api-backend
          middlewares: [rl-login]
  3. Per-tenant rate limit. Limit by X-Tenant header — each tenant gets its own bucket.

    http:
      middlewares:
        rl-tenant:
          rateLimit:
            average: 1000
            burst: 200
            sourceCriterion:
              requestHeaderName: X-Tenant
      routers:
        api:
          rule: "Host(`api.example.com`)"
          entryPoints: [websecure]
          service: api-backend
          middlewares: [rl-tenant]
  4. Behind a CDN / load balancer. Set ipStrategy.depth to walk past trusted hops.

    http:
      middlewares:
        rl-api:
          rateLimit:
            average: 100
            burst: 50
            sourceCriterion:
              ipStrategy:
                depth: 2     # behind CDN + LB
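  5. Combine with a retry middleware, but be careful. With rl-api listed before the retry middleware, the rate limiter runs first, so retries that Traefik issues toward the backend do not consume tokens. Client-side retries are different: a 429 returned to the client often triggers an immediate retry, and each retry is a new request that counts against the bucket and adds load.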

    http:
      routers:
        api-reads:
          rule: "Host(`api.example.com`) && Method(`GET`)"
          entryPoints: [websecure]
          service: api-backend
          middlewares: [rl-api, retry-safe]
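  6. Kubernetes CRD equivalent. Define the limit as a Middleware resource, then reference it by name from the IngressRoute's middlewares list.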

    apiVersion: traefik.io/v1alpha1
    kind: Middleware
    metadata:
      name: rl-api
      namespace: my-app
    spec:
      rateLimit:
        average: 100
        burst: 50
        sourceCriterion:
          ipStrategy:
            depth: 2

Operational notes

  • Rate-limit values should map to realistic traffic — an audit of the existing P99 RPS per route is the right baseline before flipping the middleware on.
  • Per-IP limits punish NAT’d users — an office of 50 sharing one egress IP hits the limit immediately. Soften with a higher limit, or layer with a per-session limit upstream.
  • The default in-memory store means N replicas of Traefik = N times the effective limit. Set the average to 1/N of your target when running multiple replicas, or use a single-replica Traefik for the rate-limited surface.
  • 429 responses should carry a Retry-After header so well-behaved clients back off. Traefik's rate limiter sets Retry-After (and X-Retry-In) on the 429s it generates; confirm this with curl -i against your version, and check that nothing in front of Traefik strips the header.
  • A rate-limit alert on a sudden jump in 429s is more useful than the limit itself — it tells you when the limit is working overtime.
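For the baseline audit mentioned in the first note, a rough P99 of requests per second can be pulled from request timestamps. The sketch below assumes a hypothetical input format of one epoch-second timestamp per line, which you would extract from your own access logs for a single route; p99_rps is an illustrative name.

```shell
# Nearest-rank P99 of requests per second.
# Reads one epoch-second timestamp per line on stdin.
# Seconds with zero requests never appear in the input, so they are not
# counted and the result leans high for sparse traffic.
p99_rps() {
  sort -n | uniq -c | awk '{ print $1 }' | sort -n |
    awk '{ count[NR] = $1 }
         END { if (NR > 0) print count[int((NR * 99 + 99) / 100)] }'
}

# Three requests in one second, two in the next, one in the third
printf '%s\n' 1700000000 1700000000 1700000000 \
              1700000001 1700000001 1700000002 | p99_rps
```

Setting average somewhat above this figure, and burst near the largest legitimate spike, gives a starting point that can then be tuned against the 429 rate.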

Rate limiting is one of the most underused middlewares because the parameters require thought — a value picked from a blog post rarely matches your traffic. The audit-then-tune approach is part of how Stack Harbor deploys managed operations. For the related retry middleware see traefik-retry-middleware.