Garage is a lighter, geo-aware alternative to MinIO and Ceph for self-hosted S3 — written in Rust, deliberately designed to tolerate high inter-node latency, and capable of running on consumer-grade hardware including ARM SBCs. It targets a niche where MinIO requires too many resources, Ceph is too operationally heavy, and the workload tolerates eventual consistency. This article documents the four-node install we use when a customer wants self-hosted S3 across two physical sites without the operational weight of Ceph multi-site, and where Garage fits and doesn’t.
How to verify
# Service status on every node
sudo systemctl status garage
sudo journalctl -u garage --since "5 minutes ago" | tail -30
# Cluster status and layout
sudo garage status
sudo garage layout show
# Bucket and key inventory
sudo garage bucket list
sudo garage key list
# S3 API check
curl -i http://localhost:3900/
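The checks above can be wrapped in a small pass/fail helper when they need to run on several nodes in a row. This is a minimal sketch, not part of Garage: `run_check` is a hypothetical helper name, and the commented usage lines assume the service name and port shown above.

```shell
#!/bin/sh
# run_check LABEL COMMAND [ARGS...]
# Runs COMMAND with its output discarded and prints "OK   LABEL" or "FAIL LABEL".
# Returns the command's exit status so a caller can stop on the first failure.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        printf 'OK   %s\n' "$label"
    else
        status=$?
        printf 'FAIL %s\n' "$label"
        return "$status"
    fi
}

# Example usage on a node (commented out so the file can be sourced safely):
# run_check "service active"   systemctl is-active --quiet garage
# run_check "cluster status"   garage status
# run_check "S3 port answers"  curl -s -o /dev/null http://localhost:3900/
```

The curl line only confirms that something answers on port 3900; without `-f`, curl exits successfully on any HTTP response, including an access-denied reply to an unauthenticated request.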
What’s happening
Garage’s architecture splits into two services: an API server speaking S3, and a storage layer that persists object data and metadata. Nodes form a cluster via a gossip protocol; each node knows about every other, and the cluster layout specifies how data is replicated. The “zone” concept is first-class — when you assign a layout, you declare which zone each node lives in (e.g. dc1, dc2, home), and Garage places replicas to maximize zone diversity. With replication factor 3 and 3 zones, each object has one replica per zone, which gives you single-DC-loss tolerance.
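The zone arithmetic in that last sentence can be checked for any layout before committing to it. The sketch below assumes replicas are spread across zones as evenly as possible, which is how "maximize zone diversity" is read here; both function names are hypothetical.

```shell
#!/bin/sh
# max_replicas_in_one_zone REPLICATION_FACTOR ZONE_COUNT
# Pigeonhole bound: with R replicas spread as evenly as possible over Z zones,
# the fullest zone holds ceil(R / Z) replicas.
max_replicas_in_one_zone() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# replicas_left_after_zone_loss REPLICATION_FACTOR ZONE_COUNT
# Worst case: how many replicas of an object remain if the fullest zone is lost.
replicas_left_after_zone_loss() {
    echo $(( $1 - ($1 + $2 - 1) / $2 ))
}
```

With replication factor 3, `replicas_left_after_zone_loss 3 3` prints 2, while `replicas_left_after_zone_loss 3 2` prints 1: in a two-zone cluster, some objects are down to a single copy after losing one site.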
The key tradeoff vs MinIO is consistency. Garage is eventually consistent for writes — a PUT returns success when one replica confirms, and other replicas catch up asynchronously. A read immediately after a write may not see the new object if the read hits a different replica. For backup, archive, and CDN-origin workloads, that’s fine; for transactional applications that read-after-write, it’s a correctness problem.
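Where a job has to confirm that an object is visible before moving on, one workaround is to poll for it. The helper below is a generic retry loop, a sketch rather than anything Garage provides; `retry_until` is a hypothetical name, and the usage line assumes the bucket and endpoint from the procedure in this article.

```shell
#!/bin/sh
# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Prints the number of attempts made; returns 0 on success, 1 on giving up.
retry_until() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@" >/dev/null 2>&1; then
            echo "$attempt"
            return 0
        fi
        # Stop without sleeping once the attempt budget is spent.
        [ "$attempt" -ge "$max" ] && break
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "$attempt"
    return 1
}

# Example (commented out): wait up to ~10 s for an uploaded object to be visible.
# retry_until 10 1 aws --endpoint-url http://garage1:3900 \
#     s3api head-object --bucket app-uploads --key hostname
```

Logging the attempt count during testing gives a rough picture of how often, and for how long, reads lag behind writes for a particular workload.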
Geographic distribution is where Garage genuinely shines. The protocol assumes inter-node links may be slow and lossy, and tolerates partition windows that would push MinIO out of quorum. You can run a cluster across an office DC and a colo location 100 ms apart and have it work. MinIO will technically work in that topology, but writes get slow because every fragment must be ACKed by remote nodes.
The hardware story is also different. Garage runs on a Raspberry Pi 4 with 2 GB RAM and an external USB SSD; a small cluster is realistic on commodity SBCs. MinIO needs more, Ceph needs much more. For lab and homelab S3 the resource floor is meaningfully lower.
The procedure
- On four nodes (garage1 through garage4, with garage1 and garage2 in zone dc1 and garage3 and garage4 in zone dc2), prepare data and metadata directories, on separate SSDs if possible:
    sudo mkdir -p /var/lib/garage/data /var/lib/garage/meta
    sudo useradd -r garage -s /sbin/nologin
    sudo chown -R garage:garage /var/lib/garage
- Install the Garage binary:
    wget https://garagehq.deuxfleurs.fr/_releases/v1.0.0/x86_64-unknown-linux-musl/garage
    sudo install -m 0755 garage /usr/local/bin/
- Generate a shared rpc_secret (32 random bytes, hex-encoded; the same value on every node):
    openssl rand -hex 32
- Create /etc/garage.toml on every node. The file is identical across nodes except for the node-specific rpc_public_addr; replication_mode must match everywhere, and zones are assigned later in the layout step:
    metadata_dir = "/var/lib/garage/meta"
    data_dir = "/var/lib/garage/data"
    db_engine = "lmdb"
    replication_mode = "3"
    rpc_bind_addr = "[::]:3901"
    rpc_public_addr = "10.0.0.11:3901"
    rpc_secret = "<32-byte-hex>"
    [s3_api]
    s3_region = "garage-cluster"
    api_bind_addr = "[::]:3900"
    root_domain = ".s3.example.com"
    [s3_web]
    bind_addr = "[::]:3902"
    root_domain = ".web.example.com"
    index = "index.html"
    [admin]
    api_bind_addr = "[::]:3903"
    admin_token = "<random-admin-token>"
- Create a systemd unit on each node:
    # /etc/systemd/system/garage.service
    [Unit]
    Description=Garage S3
    After=network-online.target
    [Service]
    User=garage
    ExecStart=/usr/local/bin/garage server
    Restart=always
    [Install]
    WantedBy=multi-user.target
  Then enable and start the service:
    sudo systemctl enable --now garage
- From one node, register the cluster layout, assigning each node a zone and a capacity. If garage status lists only the local node, connect the others first with garage node connect <node-id>@<ip>:3901 (garage node id prints each node's full identifier):
    sudo garage status    # note the node IDs
    sudo garage layout assign <node-id-1> -z dc1 -c 1T -t garage1
    sudo garage layout assign <node-id-2> -z dc1 -c 1T -t garage2
    sudo garage layout assign <node-id-3> -z dc2 -c 1T -t garage3
    sudo garage layout assign <node-id-4> -z dc2 -c 1T -t garage4
    sudo garage layout show
    sudo garage layout apply --version 1
- Create an S3 user and bucket:
    sudo garage key create app01    # Output: KeyID and SecretKey
    sudo garage bucket create app-uploads
    sudo garage bucket allow app-uploads --read --write --owner --key app01
    # Test from a workstation (AWS CLI configured with the key pair above and region garage-cluster)
    aws --endpoint-url http://garage1:3900 s3 ls
    aws --endpoint-url http://garage1:3900 s3 cp /etc/hostname s3://app-uploads/
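Because the rpc_secret and the top of garage.toml have to match on every node, it can help to generate the per-node files from one place instead of editing them by hand. The following is a minimal sketch under stated assumptions: both function names are hypothetical, only the top-level keys are rendered, and any address other than 10.0.0.11 is an illustrative placeholder.

```shell
#!/bin/sh
# is_valid_rpc_secret SECRET
# Succeeds only if SECRET is exactly 64 hexadecimal characters (32 bytes),
# which is what `openssl rand -hex 32` produces.
is_valid_rpc_secret() {
    case $1 in
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    [ "${#1}" -eq 64 ]
}

# render_node_settings PUBLIC_IP RPC_SECRET
# Prints the top-level garage.toml keys for one node.
# Only rpc_public_addr differs between nodes; everything else is shared.
render_node_settings() {
    is_valid_rpc_secret "$2" || { echo "invalid rpc_secret" >&2; return 1; }
    cat <<EOF
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
db_engine = "lmdb"
replication_mode = "3"
rpc_bind_addr = "[::]:3901"
rpc_public_addr = "$1:3901"
rpc_secret = "$2"
EOF
}

# Example (commented out; 10.0.0.12 is a placeholder address):
# secret=$(openssl rand -hex 32)
# render_node_settings 10.0.0.12 "$secret" > garage2.toml
```

The `[s3_api]`, `[s3_web]`, and `[admin]` sections from the config step are the same on every node and could be appended from a static file.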
Common pitfalls
- An rpc_secret mismatch between nodes is the most common first-day failure: nodes silently refuse to join the cluster. Distribute the secret via your config management before any node starts.
- Replication factor 3 with only 2 zones means at least two replicas land in the same zone, which defeats the geo-fault tolerance. This applies to the two-zone layout in the procedure above. Always have N zones for replication factor N.
- The first-time layout apply requires every node to be reachable. If one is down, the apply fails and the cluster has no layout. Plan a maintenance window for the initial bring-up.
- Eventually consistent reads are a real surprise to applications. Test with your specific read-after-write patterns; Garage isn’t a drop-in replacement for strongly consistent stores like MinIO with default-fs mode.
- Garage’s S3 implementation is partial: multipart, presigned URLs, and basic bucket policies work, but newer S3 features and Object Lock may not. Check your SDK’s specific call surface.
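For the rpc_secret pitfall, comparing a hash of the secret on each node catches a mismatch without printing the secret itself. This is a sketch only: `rpc_secret_fingerprint` is a hypothetical helper, it assumes the secret is written inline in the config file as in this article, and it relies on `sha256sum` from GNU coreutils.

```shell
#!/bin/sh
# rpc_secret_fingerprint CONFIG_FILE
# Extracts the quoted rpc_secret value from a garage.toml and prints its SHA-256,
# so values can be compared across nodes without exposing the secret.
rpc_secret_fingerprint() {
    sed -n 's/^[[:space:]]*rpc_secret[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" \
        | head -n 1 \
        | tr -d '\n' \
        | sha256sum \
        | cut -d ' ' -f 1
}

# Example (commented out): run on each node and compare the printed values.
# rpc_secret_fingerprint /etc/garage.toml
```

All four nodes should print the same 64-character value; any node that differs is the one that will refuse to join.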
In the engagements we run, Garage shows up for two patterns: geo-distributed backup targets where consistency can be eventual and the WAN link is the constraint, and lab/edge deployments where the resource budget can’t sustain MinIO. We monitor cluster status and per-node disk usage, alert on layout discrepancies, and document the consistency model with the customer at deploy time — Garage is a different deal from strongly consistent S3, and the operating model needs to acknowledge that.