WAL-G is a streaming backup tool for PostgreSQL (and several other engines) that targets object storage directly: base backups stream out of pg_basebackup-like internals, WAL segments are uploaded as PostgreSQL closes them, and restores stream back without intermediate disk. It is the tool we reach for when pgBackRest’s repo-on-disk model is awkward — typically when the cluster runs in Kubernetes or on ephemeral instances with no large local volume for staging. This article covers a working WAL-G install with S3, the delta-backup feature that keeps the data cheap, and the restore drill.
How to verify
After install and the first base backup, confirm the binary, the env, the archive command, and the S3 contents:
wal-g --version
sudo -u postgres psql -c "SHOW archive_command;"
sudo -u postgres -- env WALG_S3_PREFIX=s3://acme-wal/prod wal-g backup-list
aws s3 ls s3://acme-wal/prod/wal_005/ | head
aws s3 ls s3://acme-wal/prod/basebackups_005/ | head
The two prefixes wal_005/ (for PostgreSQL 16; the number tracks the wal-protocol version) and basebackups_005/ should both be populated after a base backup + a few WAL switches.
What’s happening
WAL-G uses PostgreSQL’s standard archive_command to push closed WAL segments. The trick is that WAL-G uploads them concurrently in batches, with optional client-side encryption and compression, directly to object storage — no intermediate disk. Base backups read the cluster’s data files through the same replication protocol pg_basebackup uses, tar them on the fly, encrypt+compress, and stream to S3.
The “delta backup” feature is the cost lever: WAL-G base backups can be incremental against a prior full, storing only the file pages that changed (using the page LSN against the previous backup’s LSN). A typical retention is one full per week, deltas the rest of the days, plus continuous WAL. Restore re-applies the latest full + chain of deltas + WAL segments.
The encryption story is client-side. Set WALG_LIBSODIUM_KEY (or WALG_PGP_KEY) and every object WAL-G writes is encrypted with that symmetric key before upload. The S3 provider sees ciphertext. Lose the key, lose the data.
The procedure
-
Install WAL-G. The upstream publishes static binaries; on Debian/Ubuntu we typically drop it to
/usr/local/bin/wal-g.curl -fsSL -o /tmp/wal-g.tar.gz https://github.com/wal-g/wal-g/releases/download/v3.0.3/wal-g-pg-ubuntu-22.04-amd64.tar.gz tar xf /tmp/wal-g.tar.gz -C /usr/local/bin/ /usr/local/bin/wal-g --version -
Configure WAL-G via environment for the
postgresuser. We use a drop-in systemd override on top of the PostgreSQL unit so the env is loaded for every command PostgreSQL invokes (includingarchive_command).# /etc/systemd/system/[email protected]/walg.conf [Service] Environment=AWS_REGION=us-east-1 Environment=WALG_S3_PREFIX=s3://acme-wal/prod Environment=WALG_COMPRESSION_METHOD=brotli Environment=WALG_LIBSODIUM_KEY_TRANSFORM=base64 EnvironmentFile=/etc/wal-g/secrets.envWith
/etc/wal-g/secrets.env(mode 0600, postgres:postgres) holdingAWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY, andWALG_LIBSODIUM_KEY. -
Wire PostgreSQL to push WAL through WAL-G.
# postgresql.conf archive_mode = on archive_command = '/usr/local/bin/wal-g wal-push %p' archive_timeout = 60 wal_level = replica max_wal_senders = 10Restart PostgreSQL. Confirm
pg_stat_archivershows successful archives after aSELECT pg_switch_wal();. -
Take the first base backup.
sudo -u postgres /usr/local/bin/wal-g backup-push /var/lib/postgresql/16/main sudo -u postgres /usr/local/bin/wal-g backup-list -
Schedule rotation. Weekly full, daily delta, retention via
backup-retain.# /etc/cron.d/wal-g 30 02 * * 0 postgres /usr/local/bin/wal-g backup-push /var/lib/postgresql/16/main 30 02 * * 1-6 postgres /usr/local/bin/wal-g backup-push --full=false /var/lib/postgresql/16/main 00 04 * * 0 postgres /usr/local/bin/wal-g delete retain FULL 8 --confirm -
Restore drill. The drill must run quarterly and the report should be in the same dashboard as everything else.
systemctl stop postgresql@16-main rm -rf /var/lib/postgresql/16/main/* sudo -u postgres /usr/local/bin/wal-g backup-fetch /var/lib/postgresql/16/main LATEST echo "restore_command = '/usr/local/bin/wal-g wal-fetch %f %p'" \ | sudo -u postgres tee /var/lib/postgresql/16/main/postgresql.auto.conf -a sudo -u postgres touch /var/lib/postgresql/16/main/recovery.signal systemctl start postgresql@16-main sudo -u postgres psql -c "SELECT now(), pg_is_in_recovery();"
Operational notes
WALG_LIBSODIUM_KEYis the recovery key. Put it in your password manager, in your runbook, and on paper somewhere offline. Losing it means the backups are noise.archive_commandfailure is silent unless you alert onpg_stat_archiver.failed_count. WAL-G does not retry endlessly — PostgreSQL will, and the WAL directory fills up.- The delta-backup chain has a length limit. Beyond about 7 deltas the restore time exceeds the time saved on backup; tune
WALG_DELTA_MAX_STEPSand force a new full periodically. - S3 storage class matters. We default to S3 Standard for the first 7 days of WAL (hot restores), then a lifecycle rule to Standard-IA, then optionally Glacier Instant Retrieval for long retention. Glacier Deep Archive breaks restore time SLAs.
- WAL-G can back up logical-replication-aware streams (
wal-g db-archive) but the common case stays physical. Logical streams are a niche we evaluate per engagement. - WAL-G also handles MySQL, MongoDB, FoundationDB, and Redis with different sub-commands. The PostgreSQL surface is the most mature; treat the others as separate evaluations.
In the engagements we run, WAL-G is the choice when PostgreSQL lives in a container or on instances that can’t hold a local repo. The full + daily-delta + continuous-WAL pattern, paired with a client-side encryption key managed in our vault, gives us an S3-native backup model with PITR that survives the loss of the entire cluster fleet. The full operational wrap is at /en/services/managed-operations/.