Garage: self-hosted S3 cluster designed for geo-distributed deployment

Install Garage, a Rust-based S3-compatible object store built for low-end hardware and geo-distributed nodes, with the layout config and the operational tradeoffs vs MinIO.

Garage is a lighter, geo-aware alternative to MinIO and Ceph for self-hosted S3: written in Rust, deliberately designed to tolerate high inter-node latency, and capable of running on consumer-grade hardware including ARM SBCs. It targets a niche where MinIO requires too many resources, Ceph is too operationally heavy, and the workload tolerates a weaker consistency model. This article documents the four-node install we use when a customer wants self-hosted S3 across two physical sites without the operational weight of Ceph multi-site, and covers where Garage fits and where it doesn’t.

How to verify

# Service status on every node
sudo systemctl status garage
sudo journalctl -u garage --since "5 minutes ago" | tail -30

# Cluster status and layout
sudo garage status
sudo garage layout show

# Bucket and key inventory
sudo garage bucket list
sudo garage key list

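# S3 API check (an unauthenticated request should get an HTTP error response from Garage, which still proves the listener is up)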
curl -i http://localhost:3900/

What’s happening

Garage’s architecture splits into two layers inside a single daemon: an API server speaking S3, and a storage layer that persists object data and metadata. Nodes form a cluster via a gossip protocol; each node knows about every other, and the cluster layout specifies how data is replicated. The “zone” concept is first-class: when you assign a layout, you declare which zone each node lives in (e.g. dc1, dc2, home), and Garage places replicas to maximize zone diversity. With replication factor 3 and 3 zones, each object has one replica per zone, which gives you single-DC-loss tolerance.
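
To make the zone arithmetic concrete, here is a minimal sketch that computes how many replicas of an object survive the loss of the most heavily loaded zone. It assumes replicas are spread as evenly as possible across zones, which is our simplification for illustration and not a description of Garage's actual placement algorithm. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes replicas are spread as evenly as possible across zones.

# Largest number of replicas of one object that share a zone: ceil(rf / zones).
max_replicas_in_one_zone() {
    rf=$1; zones=$2
    echo $(( (rf + zones - 1) / zones ))
}

# Replicas left after losing the zone that holds the most copies.
replicas_after_zone_loss() {
    rf=$1; zones=$2
    echo $(( rf - $(max_replicas_in_one_zone "$rf" "$zones") ))
}

echo "rf=3 zones=3: $(replicas_after_zone_loss 3 3) replicas survive"
echo "rf=3 zones=2: $(replicas_after_zone_loss 3 2) replica survives"
```

With three zones, two replicas survive a zone loss. With two zones, only one does, which matters as soon as operations need a majority of replicas.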

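The key tradeoff vs MinIO is the consistency model. Garage does not run a consensus protocol; it relies on quorums and conflict-free data types instead. With replication_mode = "3", reads and writes each wait for two of the three replicas, which Garage documents as giving read-after-write consistency for a single object. The relaxed modes ("3-degraded", "3-dangerous") lower the read quorum, or both quorums, to one replica, and then a read immediately after a write may not see the new object if it hits a replica that has not caught up. Concurrent writes to the same key are reconciled rather than serialized. For backup, archive, and CDN-origin workloads, that’s fine; for transactional applications that depend on strict ordering, or that read-after-write in a relaxed mode, it’s a correctness problem.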
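
Whether a read can miss a just-completed write comes down to quorum overlap: with N replicas, a read quorum R and a write quorum W always share at least one replica when R + W > N. The sketch below only checks that inequality. The (N, R, W) triples are illustrative values, not values read from a running cluster, so check the Garage configuration reference for what your chosen mode actually uses.

```shell
#!/bin/sh
# read_after_write_ok N R W
# Prints "yes" when every read quorum must overlap every write quorum (R + W > N).
read_after_write_ok() {
    n=$1; r=$2; w=$3
    if [ $(( r + w )) -gt "$n" ]; then
        echo yes
    else
        echo no
    fi
}

read_after_write_ok 3 2 2   # majority reads and majority writes
read_after_write_ok 3 1 2   # single-replica reads
read_after_write_ok 3 1 1   # single-replica reads and writes
```

Only the first case guarantees overlap. In the other two, a read can be served entirely by replicas that have not yet received the write.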

Geographic distribution is where Garage genuinely shines. The protocol assumes inter-node links may be slow and lossy, and tolerates partition windows that would push MinIO out of quorum. You can run a cluster across an office DC and a colo location 100 ms apart and have it work. MinIO will technically work in that topology, but writes get slow because every fragment must be acknowledged by remote nodes.

The hardware story is also different. Garage runs on a Raspberry Pi 4 with 2 GB RAM and an external USB SSD; a small cluster is realistic on commodity SBCs. MinIO needs more, Ceph needs much more. For lab and homelab S3 the resource floor is meaningfully lower.

The procedure

  1. On four nodes (garage1 through garage4, with garage1+garage2 in zone dc1 and garage3+garage4 in zone dc2), prepare data and metadata directories on separate SSDs if possible:

    sudo mkdir -p /var/lib/garage/data /var/lib/garage/meta
    sudo useradd -r garage -s /sbin/nologin
    sudo chown -R garage:garage /var/lib/garage
  2. Install the Garage binary:

    wget https://garagehq.deuxfleurs.fr/_releases/v1.0.0/x86_64-unknown-linux-musl/garage
    sudo install -m 0755 garage /usr/local/bin/
  3. Generate a shared rpc_secret (32 random bytes, hex-encoded as 64 characters; the same value on every node):

    openssl rand -hex 32
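  4. Create /etc/garage.toml on every node. The file is identical across nodes except for rpc_public_addr, which must be that node’s own reachable address; zones are assigned later in the layout, not in this file: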

    metadata_dir = "/var/lib/garage/meta"
    data_dir = "/var/lib/garage/data"
    db_engine = "lmdb"
    
    replication_mode = "3"
    
    rpc_bind_addr = "[::]:3901"
    rpc_public_addr = "10.0.0.11:3901"
    rpc_secret = "<32-byte-hex>"
    
    [s3_api]
    s3_region = "garage-cluster"
    api_bind_addr = "[::]:3900"
    root_domain = ".s3.example.com"
    
    [s3_web]
    bind_addr = "[::]:3902"
    root_domain = ".web.example.com"
    index = "index.html"
    
    [admin]
    api_bind_addr = "[::]:3903"
    admin_token = "<random-admin-token>"
  5. Create a systemd unit and start on each node:

    # /etc/systemd/system/garage.service
    [Unit]
    Description=Garage S3
    Wants=network-online.target
    After=network-online.target
    
    [Service]
    User=garage
    ExecStart=/usr/local/bin/garage server
    Restart=always
    
    [Install]
    WantedBy=multi-user.target
    
    # Then, on each node:
    sudo systemctl daemon-reload
    sudo systemctl enable --now garage
  6. From one node, register the cluster layout (assigning each node to a zone and capacity):

    sudo garage status   # note the node IDs
    
    sudo garage layout assign <node-id-1> -z dc1 -c 1T -t garage1
    sudo garage layout assign <node-id-2> -z dc1 -c 1T -t garage2
    sudo garage layout assign <node-id-3> -z dc2 -c 1T -t garage3
    sudo garage layout assign <node-id-4> -z dc2 -c 1T -t garage4
    
    sudo garage layout show
    sudo garage layout apply --version 1
  7. Create an S3 user and bucket:

    sudo garage key create app01
    # Output: KeyID and SecretKey
    sudo garage bucket create app-uploads
    sudo garage bucket allow app-uploads --read --write --owner --key app01
    
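    # Test from a workstation (the AWS CLI must be configured with the KeyID and SecretKey above, and with region garage-cluster to match s3_region)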
    aws --endpoint-url http://garage1:3900 s3 ls
    aws --endpoint-url http://garage1:3900 s3 cp /etc/hostname s3://app-uploads/
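
Steps 3 and 4 are easy to get wrong by hand across four nodes, so here is a minimal sketch of rendering the core of garage.toml from one shared secret. The function names are hypothetical, and the [s3_api], [s3_web] and [admin] sections from step 4 are left out for brevity and would need to be appended. The validation simply checks for 64 hexadecimal characters, matching the output of openssl rand -hex 32.

```shell
#!/bin/sh
# is_valid_rpc_secret SECRET: succeeds only for exactly 64 hexadecimal characters.
is_valid_rpc_secret() {
    case $1 in
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    [ "${#1}" -eq 64 ]
}

# render_garage_config RPC_PUBLIC_IP RPC_SECRET
# Prints the node-independent settings plus this node's rpc_public_addr.
render_garage_config() {
    addr=$1; secret=$2
    if ! is_valid_rpc_secret "$secret"; then
        echo "rpc_secret must be 64 hex characters" >&2
        return 1
    fi
    cat <<EOF
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
db_engine = "lmdb"

replication_mode = "3"

rpc_bind_addr = "[::]:3901"
rpc_public_addr = "${addr}:3901"
rpc_secret = "${secret}"
EOF
}

# Example use (the output file name is a placeholder):
#   secret=$(openssl rand -hex 32)
#   render_garage_config 10.0.0.11 "$secret" > garage1.toml
```

Generating the secret once and passing it to every render keeps the value identical across nodes, while the address argument is the only thing that changes per node.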

Common pitfalls

  • rpc_secret mismatch between nodes is the most common first-day failure — nodes silently refuse to join the cluster. Distribute the secret via your config management before any node starts.
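  • Replication factor 3 with only 2 zones means two replicas of every partition land in the same zone, which weakens the geo-fault tolerance: losing the zone that holds two copies leaves a single replica, below the two-of-three quorum. The two-site layout in this procedure has exactly that limitation. For full single-site-loss tolerance, have N zones for replication factor N.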
  • The first-time layout apply requires every node to be reachable. If one is down, the apply fails and the cluster has no layout. Plan a maintenance window for the initial bring-up.
  • Garage’s consistency model is a real surprise to applications. Test with your specific read-after-write patterns; Garage isn’t a drop-in replacement for strongly consistent stores like MinIO.
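  • Garage’s S3 implementation is partial. Multipart uploads and presigned URLs work, but bucket policies, ACLs, versioning, and Object Lock are not implemented as of v1.0; access is granted per key with garage bucket allow. Check your SDK’s specific call surface against Garage’s S3 compatibility table.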
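
For the rpc_secret pitfall, a pre-flight comparison of the config files catches a mismatch before any node starts. This is a minimal sketch that assumes you have copies of each node's garage.toml in one place (for example in your config management checkout). The function names and file paths are hypothetical.

```shell
#!/bin/sh
# rpc_secret_of FILE: prints the rpc_secret value found in a garage.toml.
rpc_secret_of() {
    sed -n 's/^[[:space:]]*rpc_secret[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# secrets_match FILE...: succeeds only if every file has the same non-empty rpc_secret.
secrets_match() {
    first=$(rpc_secret_of "$1")
    [ -n "$first" ] || return 1
    for f in "$@"; do
        [ "$(rpc_secret_of "$f")" = "$first" ] || return 1
    done
}

# Example use (paths are placeholders):
#   secrets_match nodes/garage1.toml nodes/garage2.toml nodes/garage3.toml nodes/garage4.toml \
#       && echo "rpc_secret consistent" || echo "rpc_secret MISMATCH"
```

The check never prints the secret itself, only whether the values agree.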

In the engagements we run, Garage shows up for two patterns: geo-distributed backup targets where consistency can be eventual and the WAN link is the constraint, and lab/edge deployments where the resource budget can’t sustain MinIO. We monitor cluster status and per-node disk usage, alert on layout discrepancies, and document the consistency model with the customer at deploy time — Garage is a different deal from strongly consistent S3, and the operating model needs to acknowledge that.
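
As one possible shape for that monitoring, the sketch below probes a node over HTTP and maps the status code to a coarse state. It assumes the admin listener from the config above (port 3903) and the admin API's /health endpoint, which we understand to answer 200 when the cluster can serve requests and 503 otherwise; verify that against the admin API documentation for your Garage version. The function names are hypothetical.

```shell
#!/bin/sh
# classify_health_code HTTP_CODE: prints ok | unavailable | unreachable | unexpected
classify_health_code() {
    case $1 in
        200) echo ok ;;
        503) echo unavailable ;;
        000) echo unreachable ;;   # curl reports 000 when it received no HTTP response
        *)   echo unexpected ;;
    esac
}

# probe_garage_health HOST: queries the assumed /health endpoint on the admin port.
probe_garage_health() {
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "http://$1:3903/health") || true
    classify_health_code "$code"
}

# Example use (hostnames from the procedure above):
#   for h in garage1 garage2 garage3 garage4; do
#       echo "$h: $(probe_garage_health "$h")"
#   done
```

Keeping the classification separate from the probe makes it easy to feed the same states into whatever alerting system is already in place.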