HAProxy custom log format

Move past httplog to a structured HAProxy log — log-format fields, JSON output, the variables we capture, and how to wire it to Loki, Elastic, or CloudWatch.

The line produced by option httplog is dense, position-dependent, and a pain to parse downstream. Once you have more than one log consumer (a SIEM, a dashboard, an incident search), you want structured logs: JSON, key=value, or a custom layout your aggregator understands. This article covers the log-format directive, the variables you should capture, and the production pattern we ship to clients.

How to verify

Look at what HAProxy is producing now:

sudo tail -f /var/log/haproxy.log
grep -E 'log[ -]format' /etc/haproxy/haproxy.cfg
sudo journalctl -u haproxy -n 50 --no-pager | head -10
sudo haproxy -c -f /etc/haproxy/haproxy.cfg

If the lines look like 192.0.2.1:54321 [02/Jun/2026:14:23:01.123] fe_http be_app/app1 0/0/1/15/16 200 1234 ..., that is httplog. If they look like JSON, someone already customized.
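If you would rather check a batch of lines than eyeball them, the same test can be scripted. The following is a minimal sketch, not a complete parser: the function name classify_line is hypothetical, the httplog check only looks for the five slash-separated timers followed by a status code, and it assumes a JSON payload may be preceded by a syslog header.

```python
import json
import re

# Five slash-separated timers followed by a 3-digit status, as in option httplog.
HTTPLOG_TIMERS = re.compile(r"\s[-+]?\d+(?:/[-+]?\d+){4}\s\d{3}\s")


def classify_line(line):
    """Return 'json', 'httplog', or 'unknown' for one log line."""
    start = line.find("{")
    if start != -1:
        try:
            # Anything before the first '{' is treated as a syslog header.
            if isinstance(json.loads(line[start:]), dict):
                return "json"
        except json.JSONDecodeError:
            pass
    if HTTPLOG_TIMERS.search(line):
        return "httplog"
    return "unknown"
```

Feeding it the last few hundred lines of /var/log/haproxy.log and counting the results shows quickly whether every frontend has been switched to the structured format or only some of them.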

What’s happening

HAProxy logs each request after it completes. The format is controlled by:

  • option httplog — the built-in HTTP format. Dense, fixed columns, classic.
  • option tcplog — the TCP equivalent, fewer fields.
  • log-format <string> — your own format, in HAProxy’s log-format syntax with variables.
  • log-format-sd <string> — RFC 5424 structured-data appendix; useful when shipping over syslog to a structured-data-aware collector.

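The log-format string contains literal text, short %-aliases such as %ci or %ST, and %[expr] sample-fetch expressions. Useful fetches include src, ssl_fc, ssl_fc_sni, capture.req.hdr(N), and res.hdr(name). The aliases %b and %s give the backend and server names, %hr gives the captured request headers, and the timing aliases %TR, %Tw, %Tc, %Tr, %Ta cover request receive time, queue wait, connect time, server response time, and total active time.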

Two output paths:

  • Syslog: log /dev/log local0 sends to the local syslog daemon; rsyslog or journald writes to disk. This is the default Ubuntu setup.
  • stdout/stderr: log stdout format raw local0 (or log stderr ...) writes to the process's standard output or standard error, which systemd captures into the journal without a syslog daemon in between. Useful in containers.

The procedure

  1. Define a JSON log format. This is the production format we ship: every field has a name, and each request produces one JSON object on one line. The log target goes in global, but log-format itself belongs in a defaults, frontend, or listen section:

    global
        log /dev/log local0

    defaults
        mode http
        log global
        log-format '{"timestamp":"%[date,utime(%Y-%m-%dT%H:%M:%SZ)]","client_ip":"%ci","client_port":%cp,"frontend":"%f","backend":"%b","server":"%s","tr":%TR,"tw":%Tw,"tc":%Tc,"tr_dur":%Tr,"ta":%Ta,"status":%ST,"bytes_read":%B,"req_method":"%HM","req_uri":"%HU","req_proto":"%HV","req_id":"%ID","ssl":%[ssl_fc],"sni":"%[ssl_fc_sni]","host":"%[capture.req.hdr(0)]","user_agent":"%[capture.req.hdr(1)]","x_forwarded_for":"%[capture.req.hdr(2)]"}'

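    What the variables mean (all timers are in milliseconds):

    • %TR: time to receive the full HTTP request from the client, counted from its first byte.
    • %Tw: time spent waiting in the queue.
    • %Tc: time to establish the connection to the server.
    • %Tr: time waiting for the server's response, until the response headers arrive.
    • %Ta: total active time of the request, from its first byte to the last byte of the response.
    • %B: bytes read from the server and sent to the client (response size, headers included).
    • %ID: unique request ID, populated when unique-id-format is set.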
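To show how the timers are read in practice, here is a small sketch that takes one JSON log line and reports which phase took longest. The key names (tr, tw, tc, tr_dur) follow the log-format from step 1. The function name slowest_phase is hypothetical, and the sketch assumes the line is the bare JSON object with no syslog header in front of it.

```python
import json

# Keys follow the JSON log-format from step 1; values are milliseconds.
PHASES = {
    "tr": "receiving the request from the client (%TR)",
    "tw": "waiting in the queue (%Tw)",
    "tc": "connecting to the server (%Tc)",
    "tr_dur": "waiting for the server response (%Tr)",
}


def slowest_phase(line):
    """Return (key, milliseconds) for the longest completed phase, or None."""
    entry = json.loads(line)
    # HAProxy logs -1 for a phase that never completed; ignore those.
    timings = {key: entry[key] for key in PHASES if entry.get(key, -1) >= 0}
    if not timings:
        return None
    key = max(timings, key=timings.get)
    return key, timings[key]
```

A request dominated by tr_dur points at the application, one dominated by tw points at queueing (maxconn limits), and one dominated by tc points at the network path or the server's accept backlog.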
  2. Capture request headers. The log variables %[capture.req.hdr(N)] reference indexed header captures. Set them at the frontend:

    frontend fe_http
        bind *:80
        capture request header Host len 64
        capture request header User-Agent len 200
        capture request header X-Forwarded-For len 64
        default_backend be_app

    The capture index matches the %[capture.req.hdr(0)], (1), (2) ordering — first capture is index 0.

  3. Unique request ID for tracing. Generate one per request, log it, and forward it to the backend. unique-id-format is a proxy-level directive (defaults, frontend, or listen), not a global one:

    frontend fe_http
        bind *:80
        unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
        http-request set-header X-Request-Id %[unique-id]

    The backend receives X-Request-Id and can write it to its own logs, so anyone correlating an HAProxy log entry with a backend log entry has a stable key.
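The correlation itself is a join on that key. The sketch below illustrates it under stated assumptions: HAProxy lines use the req_id field from step 1, while the backend is assumed to log JSON with a request_id field, which is a hypothetical name for whatever key your application uses. The function names are illustrative as well.

```python
import json


def index_by_request_id(haproxy_lines):
    """Map req_id -> parsed HAProxy log entry."""
    index = {}
    for line in haproxy_lines:
        entry = json.loads(line)
        if entry.get("req_id"):
            index[entry["req_id"]] = entry
    return index


def correlate(haproxy_lines, backend_lines, backend_key="request_id"):
    """Return (haproxy_entry, backend_record) pairs that share a request ID."""
    index = index_by_request_id(haproxy_lines)
    pairs = []
    for line in backend_lines:
        record = json.loads(line)
        match = index.get(record.get(backend_key))
        if match is not None:
            pairs.append((match, record))
    return pairs
```

In an aggregator the same join is a query on the shared field; the point of the sketch is only that both sides must log the identical value for it to work.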

  4. Wire to Loki. With promtail on the host:

    # /etc/promtail/config.yml
    scrape_configs:
      - job_name: haproxy
        static_configs:
          - targets: [localhost]
            labels:
              job: haproxy
              __path__: /var/log/haproxy.log
        pipeline_stages:
          - json:
              expressions:
                status: status
                backend: backend
                tr_dur: tr_dur
          - labels:
              status:
              backend:

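    The pipeline stages parse the JSON and promote status and backend to indexed labels. The json stage expects each line to be the bare JSON object; if rsyslog prepends its usual timestamp/host/tag header, strip it first, for example with a regex stage or an rsyslog template that writes only the message.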
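To sanity-check what the json and labels stages will see before deploying the config, the extraction can be mimicked locally. This is a rough sketch, not promtail's actual implementation: extract_labels is a hypothetical helper, and it assumes that lines written through rsyslog may carry a syslog header before the JSON object, which it skips by starting at the first opening brace.

```python
import json

LABEL_KEYS = ("status", "backend")  # the fields the labels stage promotes


def extract_labels(line):
    """Skip any syslog header, parse the JSON payload, return label values as strings."""
    start = line.find("{")
    if start == -1:
        raise ValueError("no JSON object found in line")
    entry = json.loads(line[start:])
    # Loki label values are strings, so numeric fields are converted.
    return {key: str(entry[key]) for key in LABEL_KEYS if key in entry}
```

If this raises on real lines from /var/log/haproxy.log, the promtail json stage is likely to fail on them too.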

  5. Wire to Elastic / OpenSearch. Filebeat picks up the file and ships to Logstash or directly:

    # /etc/filebeat/filebeat.yml
    filebeat.inputs:
      - type: filestream
        paths: [/var/log/haproxy.log]
        parsers:
          - ndjson:
              keys_under_root: true
    output.elasticsearch:
      hosts: ["elastic.example.com:9200"]

    With ndjson parsing, Filebeat decodes each line into a structured document — no Logstash grok needed.

  6. Wire to CloudWatch. The CloudWatch agent picks up the file:

    {
      "logs": {
        "logs_collected": {
          "files": {
            "collect_list": [
              { "file_path": "/var/log/haproxy.log", "log_group_name": "haproxy", "log_stream_name": "{hostname}" }
            ]
          }
        }
      }
    }

    CloudWatch Logs Insights queries the JSON natively: fields @timestamp, status, tr_dur | filter status >= 500.

  7. Rotate the log file. The packaged logrotate config covers /var/log/haproxy.log:

    cat /etc/logrotate.d/haproxy
    sudo logrotate -d /etc/logrotate.d/haproxy

    For high-volume sites, rotate hourly with dateext and offload to S3 / object storage daily. Always test the rotate command in -d (debug) mode first.

Common pitfalls

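  • The log-format string is easy to mangle on its way into the config: a shell or templating layer may interpret the quotes, backslashes, or % sequences. Keep it as a single-quoted string in your config management. Inside HAProxy, a literal percent sign is written %%, and an unquoted format needs every space escaped with a backslash.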
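  • Capturing a header that does not exist yields an empty value; in JSON that becomes "". HAProxy has no default() converter, so if your parser refuses empty strings, copy the header into a variable with http-request set-var(txn.xff) req.hdr(X-Forwarded-For) and log it with a fallback, %[var(txn.xff,-)] (the default argument to var() requires a recent HAProxy release), or handle the empty string at the aggregator.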
  • option httplog and log-format set the same underlying format, so whichever appears last in a section wins. If option httplog comes after log-format in the defaults block, it overrides it. Either remove option httplog or place log-format after it.
  • Sending HAProxy logs to journald without a structured format makes them less useful. Either keep file-based logs and ship the file, or use log-format to produce JSON before journald.
  • The high-cardinality fields (unique-id, full URL, user-agent) bloat your storage. Drop them at the aggregator if cost matters; never log them if you do not need them.
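Dropping fields at the aggregator usually comes down to a small transform on each record. The following sketch shows the idea: the field names come from the log-format in step 1, while the function name slim_line and the choice of which fields to drop are assumptions to adapt to your own cost and debugging needs.

```python
import json

# Fields from the step 1 format that tend to have very high cardinality.
HIGH_CARDINALITY = ("req_id", "req_uri", "user_agent")


def slim_line(line, drop=HIGH_CARDINALITY):
    """Return the JSON log line with the given fields removed."""
    entry = json.loads(line)
    for key in drop:
        entry.pop(key, None)  # missing keys are ignored
    return json.dumps(entry, separators=(",", ":"))
```

Most shippers offer an equivalent built-in step, so a script like this is mainly useful for estimating how much volume the dropped fields account for before changing the pipeline.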

Stack Harbor ships HAProxy with a JSON log-format from day one — request ID, timing variables, backend identity, response status, and the four or five headers we care about. Logs feed the same aggregator as the application logs so an incident search finds both. This is part of how we wire observability for Managed Operations — the load balancer is not an opaque box.