WAL-G continuous WAL archive for PostgreSQL

How WAL-G streams base backups and WAL segments straight to object storage, with delta backups, parallelization, and the encryption and storage class settings we use in production.

WAL-G is a streaming backup tool for PostgreSQL (and several other engines) that targets object storage directly: base backups are streamed out as the data files are read, WAL segments are uploaded as PostgreSQL closes them, and restores stream back without intermediate disk. It is the tool we reach for when pgBackRest’s repo-on-disk model is awkward, typically when the cluster runs in Kubernetes or on ephemeral instances with no large local volume for staging. This article covers a working WAL-G install with S3, the delta-backup feature that keeps storage cheap, and the restore drill.

How to verify

After install and the first base backup, confirm the binary, the env, the archive command, and the S3 contents:

wal-g --version
sudo -u postgres psql -c "SHOW archive_command;"
sudo -u postgres -- env WALG_S3_PREFIX=s3://acme-wal/prod wal-g backup-list
aws s3 ls s3://acme-wal/prod/wal_005/ | head
aws s3 ls s3://acme-wal/prod/basebackups_005/ | head

The two prefixes wal_005/ and basebackups_005/ should both be populated after a base backup and a few WAL switches. The 005 suffix is WAL-G’s storage layout version; it does not change with the PostgreSQL major version.
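
The two listing checks can be folded into a pass/fail helper. The following is a minimal sketch, not part of WAL-G: count_objects and check_prefix are hypothetical names, and the parsing assumes the usual aws s3 ls layout in which sub-prefixes are printed as PRE lines and objects as date, time, size, key.

```shell
# count_objects: read `aws s3 ls` output on stdin and print the number of
# object lines, skipping "PRE <prefix>/" entries and blank lines.
count_objects() {
  awk 'NF >= 4 && $1 != "PRE" { n++ } END { print n + 0 }'
}

# check_prefix URL: print OK or EMPTY for one S3 prefix and return
# non-zero when nothing is there. Requires a configured aws CLI.
check_prefix() {
  n=$(aws s3 ls "$1" | count_objects)
  if [ "$n" -gt 0 ]; then
    echo "OK    $1 ($n objects)"
  else
    echo "EMPTY $1"
    return 1
  fi
}
```

Once the functions are loaded, check_prefix s3://acme-wal/prod/wal_005/ and check_prefix s3://acme-wal/prod/basebackups_005/ give one line each that a monitoring job can act on.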

What’s happening

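WAL-G uses PostgreSQL’s standard archive_command to push closed WAL segments. The trick is that WAL-G uploads them concurrently in batches, with optional client-side encryption and compression, directly to object storage, with no intermediate disk. When backup-push is given the data directory, as in the procedure below, it reads the cluster’s data files from local disk while PostgreSQL is in backup mode, tars them on the fly, compresses and encrypts them, and streams the result to S3. If no data directory is passed, WAL-G can instead pull the backup over the same replication protocol pg_basebackup uses.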

The “delta backup” feature is the cost lever: WAL-G base backups can be incremental against a prior full, storing only the file pages that changed (using the page LSN against the previous backup’s LSN). A typical retention is one full per week, deltas the rest of the days, plus continuous WAL. Restore re-applies the latest full + chain of deltas + WAL segments.
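
The page-level filter can be illustrated with a few lines of shell. This is a sketch of the comparison only, not WAL-G’s implementation: lsn_to_int and page_changed are hypothetical helpers, and treating a page whose LSN equals the base LSN as changed is an assumption made here to err on the side of copying.

```shell
# lsn_to_int LSN: convert a PostgreSQL LSN such as "0/3000060" into a
# decimal integer. The text before the slash is the high 32 bits and the
# text after it the low 32 bits, both written in hexadecimal.
lsn_to_int() {
  _hi=$(printf '%d' "0x${1%%/*}")
  _lo=$(printf '%d' "0x${1##*/}")
  echo $(( _hi * 4294967296 + _lo ))
}

# page_changed PAGE_LSN BASE_LSN: succeed when the page was last written
# at or after the LSN of the backup the delta is taken against, meaning
# the page would have to be stored in the delta.
page_changed() {
  [ "$(lsn_to_int "$1")" -ge "$(lsn_to_int "$2")" ]
}
```

For example, page_changed 0/3000060 0/3000000 succeeds (the page is newer than the base), while page_changed 0/2FFFFFF 0/3000000 fails and the page would be skipped.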

The encryption story is client-side. Set WALG_LIBSODIUM_KEY (a symmetric key) or WALG_PGP_KEY (a PGP key) and every object WAL-G writes is encrypted before upload. The S3 provider sees only ciphertext. Lose the key, lose the data.
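
A libsodium key is 32 random bytes, and the base64 key transform used in the procedure expects it base64-encoded. A minimal sketch of generating and sanity-checking one follows; gen_walg_key and key_len_ok are hypothetical helper names, and base64 -d assumes GNU coreutils as found on Debian/Ubuntu.

```shell
# gen_walg_key: print 32 random bytes, base64-encoded, on one line.
gen_walg_key() {
  head -c 32 /dev/urandom | base64
}

# key_len_ok KEY: succeed only if KEY decodes to exactly 32 bytes.
key_len_ok() {
  _n=$(printf '%s' "$1" | base64 -d 2>/dev/null | wc -c | tr -d ' ')
  [ "$_n" = "32" ]
}
```

The output of gen_walg_key is what would go into WALG_LIBSODIUM_KEY; running key_len_ok on the stored value before the first backup catches copy-paste truncation while it is still cheap to fix.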

The procedure

  1. Install WAL-G. The upstream publishes static binaries; on Debian/Ubuntu we typically drop it to /usr/local/bin/wal-g.

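    curl -fsSL -o /tmp/wal-g.tar.gz https://github.com/wal-g/wal-g/releases/download/v3.0.3/wal-g-pg-ubuntu-22.04-amd64.tar.gz
    tar xf /tmp/wal-g.tar.gz -C /usr/local/bin/
    # The archive is expected to unpack a binary named after the asset; rename it.
    mv /usr/local/bin/wal-g-pg-ubuntu-22.04-amd64 /usr/local/bin/wal-g
    /usr/local/bin/wal-g --version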
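  2. Configure WAL-G via environment for the postgres user. We use a systemd drop-in override on the PostgreSQL unit so the environment is loaded for every command PostgreSQL invokes (including archive_command). Run systemctl daemon-reload after creating it. Commands started outside the unit, such as sudo or cron invocations of wal-g, do not inherit these variables.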

    # /etc/systemd/system/postgresql@16-main.service.d/walg.conf
    [Service]
    Environment=AWS_REGION=us-east-1
    Environment=WALG_S3_PREFIX=s3://acme-wal/prod
    Environment=WALG_COMPRESSION_METHOD=brotli
    Environment=WALG_LIBSODIUM_KEY_TRANSFORM=base64
    EnvironmentFile=/etc/wal-g/secrets.env

    With /etc/wal-g/secrets.env (mode 0600, postgres:postgres) holding AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and WALG_LIBSODIUM_KEY.

  3. Wire PostgreSQL to push WAL through WAL-G.

    # postgresql.conf
    archive_mode = on
    archive_command = '/usr/local/bin/wal-g wal-push %p'
    archive_timeout = 60
    wal_level = replica
    max_wal_senders = 10

    Restart PostgreSQL. Confirm pg_stat_archiver shows successful archives after a SELECT pg_switch_wal();.

  4. Take the first base backup.

    sudo -u postgres /usr/local/bin/wal-g backup-push /var/lib/postgresql/16/main
    sudo -u postgres /usr/local/bin/wal-g backup-list
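  5. Schedule rotation. Weekly full, daily delta, retention via delete retain. WAL-G only takes deltas when WALG_DELTA_MAX_STEPS is greater than 0 in the job’s environment (the default of 0 makes every backup a full), so set it alongside the other variables; --full forces the weekly full regardless.

    # /etc/cron.d/wal-g
    30 02 * * 0 postgres /usr/local/bin/wal-g backup-push --full /var/lib/postgresql/16/main
    30 02 * * 1-6 postgres /usr/local/bin/wal-g backup-push --full=false /var/lib/postgresql/16/main
    00 04 * * 0 postgres /usr/local/bin/wal-g delete retain FULL 8 --confirm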
  6. Restore drill. Run the drill at least quarterly, on a scratch host rather than the primary because it wipes the data directory, and publish the result on the same dashboard as your other operational reports.

    systemctl stop postgresql@16-main
    rm -rf /var/lib/postgresql/16/main/*
    sudo -u postgres /usr/local/bin/wal-g backup-fetch /var/lib/postgresql/16/main LATEST
    echo "restore_command = '/usr/local/bin/wal-g wal-fetch %f %p'" \
      | sudo -u postgres tee -a /var/lib/postgresql/16/main/postgresql.auto.conf
    sudo -u postgres touch /var/lib/postgresql/16/main/recovery.signal
    systemctl start postgresql@16-main
    sudo -u postgres psql -c "SELECT now(), pg_is_in_recovery();"
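
Steps 4 to 6 call wal-g through sudo and cron, where the variables from the systemd drop-in are not present. One way to supply them is a small wrapper, sketched below. load_env_file and walg_run are hypothetical names, the non-secret values are copied from step 2, and the sketch assumes secrets.env contains only plain KEY=VALUE lines that are also valid shell assignments.

```shell
# load_env_file FILE: export every KEY=VALUE assignment found in FILE.
# systemd's EnvironmentFile syntax is close to, but not identical to,
# shell syntax, so this only suits simple values without spaces.
load_env_file() {
  set -a          # auto-export everything assigned while sourcing
  . "$1"
  set +a
}

# walg_run ARGS...: load the secrets, set the non-secret variables from
# the drop-in, then run wal-g with the given arguments.
walg_run() {
  load_env_file /etc/wal-g/secrets.env
  export AWS_REGION=us-east-1
  export WALG_S3_PREFIX=s3://acme-wal/prod
  export WALG_COMPRESSION_METHOD=brotli
  export WALG_LIBSODIUM_KEY_TRANSFORM=base64
  /usr/local/bin/wal-g "$@"
}
```

With these functions sourced by the postgres user, walg_run backup-list or walg_run backup-push /var/lib/postgresql/16/main behaves the same from an interactive shell as from a cron job.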

Operational notes

  • WALG_LIBSODIUM_KEY is the recovery key. Put it in your password manager, in your runbook, and on paper somewhere offline. Losing it means the backups are noise.
  • archive_command failure is silent unless you alert on pg_stat_archiver.failed_count. WAL-G does not retry endlessly — PostgreSQL will, and the WAL directory fills up.
  • The delta-backup chain has a length limit. Beyond about 7 deltas the restore time exceeds the time saved on backup; tune WALG_DELTA_MAX_STEPS and force a new full periodically.
  • S3 storage class matters. We default to S3 Standard for the first 7 days of WAL (hot restores), then a lifecycle rule to Standard-IA, then optionally Glacier Instant Retrieval for long retention. Glacier Deep Archive breaks restore time SLAs.
  • WAL-G can back up logical-replication-aware streams (wal-g db-archive) but the common case stays physical. Logical streams are a niche we evaluate per engagement.
  • WAL-G also handles MySQL, MongoDB, FoundationDB, and Redis with different sub-commands. The PostgreSQL surface is the most mature; treat the others as separate evaluations.
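
The alert on pg_stat_archiver.failed_count mentioned above can be sketched as a check that compares the counter between two runs. read_failed_count and check_archiver are hypothetical names, the exit code 2 follows the common Nagios-style convention for critical, and where the previous value is stored is left to the monitoring system.

```shell
# read_failed_count: print the cumulative number of failed archive
# attempts. Needs psql access to the local cluster (e.g. as postgres).
read_failed_count() {
  psql -XAtc "SELECT failed_count FROM pg_stat_archiver"
}

# check_archiver PREV CUR: compare the counter from the previous run with
# the current one. Prints a status line and returns 2 when it has grown.
check_archiver() {
  if [ "$2" -gt "$1" ]; then
    echo "CRITICAL: $(( $2 - $1 )) new archive_command failure(s)"
    return 2
  fi
  echo "OK: failed_count is $2"
}
```

A scheduled job would call check_archiver with the stored value and the output of read_failed_count, then persist the new value for the next run. A counter that went down (after a statistics reset) is treated as healthy here.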

In the engagements we run, WAL-G is the choice when PostgreSQL lives in a container or on instances that can’t hold a local repo. The full + daily-delta + continuous-WAL pattern, paired with a client-side encryption key managed in our vault, gives us an S3-native backup model with PITR that survives the loss of the entire cluster fleet. The full operational wrap is at /en/services/managed-operations/.