The default option httplog line in HAProxy is dense, position-dependent, and a pain to parse downstream. Once you have more than one log consumer (a SIEM, a dashboard, an incident search), you want structured logs — JSON, key=value, or a custom layout your aggregator understands. This article covers the log-format directive, the variables you should capture, and the production pattern we ship to clients.

How to verify

Look at what HAProxy is producing now:

```
sudo tail -f /var/log/haproxy.log
grep -E 'log[ -]format' /etc/haproxy/haproxy.cfg
sudo journalctl -u haproxy -n 50 --no-pager | head -10
sudo haproxy -c -f /etc/haproxy/haproxy.cfg
```

If the lines look like 192.0.2.1:54321 [02/Jun/2026:14:23:01.123] fe_http be_app/app1 0/0/1/15/16 200 1234 ..., that is httplog. If they look like JSON, someone already customized.
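The same check can be scripted when you have many hosts to audit. The following is a minimal sketch, not a complete parser: the function name `classify_line` and the regular expression are illustrative assumptions, and the JSON branch tolerates a syslog prefix in front of the object.

```python
import json
import re

# Five slash-separated timers followed by a three-digit status, as in
# " 0/0/1/15/16 200 ". Timers can be -1 when a phase was never reached.
HTTPLOG_TIMERS = re.compile(r"\s(-?\d+/){4}-?\d+\s\d{3}\s")


def classify_line(line: str) -> str:
    """Return 'json', 'httplog', or 'unknown' for a single log line."""
    start = line.find("{")
    if start != -1:
        try:
            # Parse from the first brace so a syslog prefix is ignored.
            if isinstance(json.loads(line[start:]), dict):
                return "json"
        except json.JSONDecodeError:
            pass  # a brace in an httplog line (captured headers) lands here
    if HTTPLOG_TIMERS.search(line):
        return "httplog"
    return "unknown"
```

Feeding it the last few lines of the log file is usually enough to tell which format a host is emitting.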

What’s happening

HAProxy logs each request after it completes. The format is controlled by:

option httplog — the built-in HTTP format. Dense, fixed columns, classic.
option tcplog — the TCP equivalent, fewer fields.
log-format <string> — your own format, in HAProxy’s log-format syntax with variables.
log-format-sd <string> — RFC 5424 structured-data appendix; useful when shipping over syslog to a structured-data-aware collector.

The log-format string mixes literal text with two kinds of substitution: predefined aliases such as %ci (client IP), %b (backend name), %s (server name), and %hr (captured request headers), and sample-fetch expressions written as %[fetch], such as %[src], %[ssl_fc], or %[res.hdr(name)]. The timing aliases are %TR, %Tw, %Tc, %Tr, and %Ta (request, queue wait, connect, response, total active time).

Two output paths:

Syslog: log /dev/log local0 sends to the local syslog daemon; rsyslog or journald writes to disk. This is the default Ubuntu setup.
stdout/stderr: log stdout format raw local0 (or stderr in place of stdout) writes to the process's standard stream without going through syslog. Under systemd that stream is captured by the journal; in a container it goes to the runtime's log driver. Useful in containers.

The procedure

Define a JSON log format. This is the production format we ship — every field has a name, the line is one valid JSON object per request:
```
global
    log /dev/log local0

defaults
    log global
    # log-format is a proxy keyword (defaults, frontend, listen), not a global one.
    # The date fetch takes an offset, not a format name; utime() renders it as UTC.
    log-format '{"timestamp":"%[date,utime(%Y-%m-%dT%H:%M:%SZ)]","client_ip":"%ci","client_port":%cp,"frontend":"%f","backend":"%b","server":"%s","tr":%TR,"tw":%Tw,"tc":%Tc,"tr_dur":%Tr,"ta":%Ta,"status":%ST,"bytes_read":%B,"req_method":"%HM","req_uri":"%HU","req_proto":"%HV","req_id":"%ID","ssl":%[ssl_fc],"sni":"%[ssl_fc_sni]","host":"%[capture.req.hdr(0)]","user_agent":"%[capture.req.hdr(1)]","x_forwarded_for":"%[capture.req.hdr(2)]"}'
```
Variables matter:
- %TR: time to receive the full HTTP request headers from the client, counted from the first byte.
- %Tw: time spent waiting in queues.
- %Tc: time to establish the TCP connection to the backend server.
- %Tr: time waiting for the server to send its response headers.
- %Ta: total active time for the request, from the first byte of the request to the last byte of the response.
- %B: bytes sent to the client (response size, headers included).
- %ID: unique request ID if unique-id-format is set.
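To show how these timers are read downstream, here is a small sketch that works on one decoded log record using the JSON keys from the format above (tr, tw, tc, tr_dur, ta). The helper names `slowest_phase` and `transfer_ms` are illustrative. It relies on two documented HAProxy behaviours: a timer is -1 when its phase was never reached, and the response transfer time can be derived as Ta minus the sum of the other reached timers.

```python
from typing import Optional, Tuple

# JSON keys that hold %TR, %Tw, %Tc and %Tr in the format above.
TIMER_FIELDS = ("tr", "tw", "tc", "tr_dur")


def slowest_phase(record: dict) -> Optional[Tuple[str, int]]:
    """Return (field, milliseconds) for the longest phase that was reached."""
    # Skip -1 values: they mean the phase never happened, not "very fast".
    reached = {k: record[k] for k in TIMER_FIELDS if record.get(k, -1) >= 0}
    if not reached:
        return None
    field = max(reached, key=reached.get)
    return field, reached[field]


def transfer_ms(record: dict) -> Optional[int]:
    """Approximate response transfer time: Ta minus the reached timers."""
    total = record.get("ta", -1)
    if total < 0:
        return None
    spent = 0
    for key in TIMER_FIELDS:
        value = record.get(key, -1)
        if value >= 0:
            spent += value
    return total - spent
```

A record where tr_dur dominates points at a slow backend, a large tw points at queueing, and a large transfer time points at a big response or a slow client.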

Capture request headers. The log variables %[capture.req.hdr(N)] reference indexed header captures. Set them at the frontend:

```
frontend fe_http
    bind *:80
    capture request header Host len 64
    capture request header User-Agent len 200
    capture request header X-Forwarded-For len 64
    default_backend be_app
```

The capture index matches the %[capture.req.hdr(0)], (1), (2) ordering — first capture is index 0.
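Because the index is purely positional, reordering the capture lines silently shifts every field. One way to keep the two in step is to generate the JSON fragment from the same ordered list that produces the capture lines. This is a sketch under that assumption; `capture_fields` and its key-naming rule (lowercase, dashes to underscores) are illustrative choices, not HAProxy features.

```python
from typing import List


def capture_fields(headers: List[str]) -> str:
    """Build the JSON fragment for captured headers, in declaration order."""
    parts = []
    for index, header in enumerate(headers):
        # "User-Agent" becomes the JSON key "user_agent".
        key = header.lower().replace("-", "_")
        parts.append('"%s":"%%[capture.req.hdr(%d)]"' % (key, index))
    return ",".join(parts)


if __name__ == "__main__":
    print(capture_fields(["Host", "User-Agent", "X-Forwarded-For"]))
```

The output for those three headers is the same host, user_agent, and x_forwarded_for fragment used at the end of the log-format string above.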

Unique request ID for tracing. Generate one per request, log it, forward to backend:
```
frontend fe_http
    bind *:80
    # unique-id-format is a proxy keyword (defaults, frontend, listen), not a global one
    unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
    http-request set-header X-Request-Id %[unique-id]
```
The backend receives the X-Request-Id header and can write it to its own logs, so engineers correlating an HAProxy log entry with a backend log entry have a stable key. The same value appears in the HAProxy log as %ID.
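The correlation itself is a join on that key. The sketch below assumes both sides are already decoded into dictionaries, that HAProxy records carry the ID under req_id (as in the JSON format above), and that the application logs it under a field called request_id. That backend field name is hypothetical and depends entirely on your application's logger.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def correlate(
    haproxy_records: Iterable[dict],
    backend_records: Iterable[dict],
    backend_key: str = "request_id",  # hypothetical application field name
) -> Dict[str, List[dict]]:
    """Map each HAProxy request ID to the backend records that share it."""
    by_id = defaultdict(list)
    for record in backend_records:
        request_id = record.get(backend_key)
        if request_id:
            by_id[request_id].append(record)

    joined = {}
    for record in haproxy_records:
        request_id = record.get("req_id")
        if request_id:
            # An empty list means the request never produced a backend log line.
            joined[request_id] = by_id.get(request_id, [])
    return joined
```

An ID that maps to an empty list is often the interesting case during an incident: HAProxy saw the request, the application never logged it.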

Wire to Loki. With promtail on the host:

```
# /etc/promtail/config.yml
scrape_configs:
  - job_name: haproxy
    static_configs:
      - targets: [localhost]
        labels:
          job: haproxy
          __path__: /var/log/haproxy.log
    pipeline_stages:
      - json:
          expressions:
            status: status
            backend: backend
            tr_dur: tr_dur
      - labels:
          status:
          backend:
```
The pipeline_stages block parses the JSON and promotes status and backend to indexed labels.

Wire to Elastic / OpenSearch. Filebeat picks up the file and ships to Logstash or directly:

```
# /etc/filebeat/filebeat.yml
filebeat.inputs:
  - type: filestream
    paths: [/var/log/haproxy.log]
    parsers:
      - ndjson:
          keys_under_root: true
output.elasticsearch:
  hosts: ["elastic.example.com:9200"]
```

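With ndjson parsing, Filebeat decodes each line into a structured document, so no Logstash grok is needed. This assumes every line in the file is a bare JSON object: if rsyslog writes its usual timestamp, host, and tag prefix in front of the message, strip that prefix first (for example with an rsyslog template that outputs only the message).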
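Before pointing a line-oriented JSON decoder at the file, it helps to know whether the lines really are bare JSON. The following preflight sketch counts three cases: lines that are a JSON object on their own, lines where a JSON object follows some prefix (typically a syslog header), and everything else. The function name `ndjson_report` and the category names are illustrative.

```python
import json
from typing import Dict, Iterable


def _is_object(text: str) -> bool:
    """True if text is exactly one JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


def ndjson_report(lines: Iterable[str]) -> Dict[str, int]:
    """Count bare-JSON lines, prefixed-JSON lines, and other lines."""
    report = {"bare_json": 0, "prefixed_json": 0, "other": 0}
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines are ignored
        if _is_object(text):
            report["bare_json"] += 1
        elif "{" in text and _is_object(text[text.index("{"):]):
            report["prefixed_json"] += 1
        else:
            report["other"] += 1
    return report
```

A non-zero prefixed_json count means a strict one-object-per-line decoder would reject those lines until the prefix is removed at the source.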

Wire to CloudWatch. The CloudWatch agent picks up the file:

```
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          { "file_path": "/var/log/haproxy.log", "log_group_name": "haproxy", "log_stream_name": "{hostname}" }
        ]
      }
    }
  }
}
```

CloudWatch Logs Insights queries the JSON natively: fields @timestamp, status, tr_dur | filter status >= 500.

Rotate the log file. The packaged logrotate config covers /var/log/haproxy.log:
```
cat /etc/logrotate.d/haproxy
sudo logrotate -d /etc/logrotate.d/haproxy
```
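For high-volume sites, rotate hourly with dateext and offload to S3 / object storage daily. Hourly rotation only takes effect if logrotate itself runs every hour, and the packaged cron job or timer usually runs it once a day. Always test the configuration in -d (debug) mode first.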

Common pitfalls

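If the log-format string passes through a shell or a templating layer on its way into haproxy.cfg, quotes, backslashes, and % signs can get mangled. Store it as a single-quoted string in your config management so it arrives unchanged. Inside HAProxy's single quotes every character is literal; in an unquoted format string each space must be escaped with a backslash, as in the unique-id-format line above.
Capturing a header that does not exist yields an empty value; in JSON that becomes "". If your parser rejects empty strings, substitute a placeholder such as "-" at the aggregator. The form %[hdr(X),default("-")] relies on a default converter that does not appear in the converter list of the HAProxy configuration manual, so confirm it against your version with haproxy -c before depending on it.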
The default option httplog overrides log-format if it appears later in the defaults block. Either remove option httplog or place log-format after it.
Sending HAProxy logs to journald without a structured format makes them less useful. Either keep file-based logs and ship the file, or use log-format to produce JSON before journald.
The high-cardinality fields (unique-id, full URL, user-agent) bloat your storage. Drop them at the aggregator if cost matters; never log them if you do not need them.
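Two of the pitfalls above (empty strings and high-cardinality fields) can be handled in one pass at the aggregator side. This is a minimal sketch that operates on one decoded record; the function name `normalize`, the "-" placeholder, and the default drop list are assumptions to adapt, with field names taken from the JSON format in this article.

```python
from typing import AbstractSet

# Fields from the JSON format above that tend to be high-cardinality.
HIGH_CARDINALITY = frozenset({"req_id", "req_uri", "user_agent"})


def normalize(
    record: dict,
    placeholder: str = "-",
    drop: AbstractSet[str] = HIGH_CARDINALITY,
) -> dict:
    """Return a copy with empty strings replaced and costly fields removed."""
    cleaned = {}
    for key, value in record.items():
        if key in drop:
            continue
        # Only the empty string is replaced; 0 and False are left alone.
        cleaned[key] = placeholder if value == "" else value
    return cleaned
```

Pass drop=frozenset() to keep every field, for example on a short-retention stream where the request ID is still needed for tracing.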

Stack Harbor ships HAProxy with a JSON log-format from day one — request ID, timing variables, backend identity, response status, and the four or five headers we care about. Logs feed the same aggregator as the application logs so an incident search finds both. This is part of how we wire observability for Managed Operations — the load balancer is not an opaque box.