Ceph gives you unified block, object, and file storage on commodity hardware, but the install used to be a punishment — ceph-deploy, hand-rolled YAML, copying keyrings around. cephadm replaced all of that with a containerized orchestrator that pulls Ceph daemons as Podman/Docker images and manages them from a single SSH-reachable admin node. This article walks the three-node bootstrap we use as the baseline for any Ceph engagement, what to verify before declaring the cluster ready, and the operational pre-flight that catches misconfigured network MTU or clock drift before they wreck quorum.
How to verify
# Bootstrap node only — admin keyring and config in /etc/ceph
sudo cephadm shell -- ceph -s
sudo cephadm shell -- ceph orch host ls
sudo cephadm shell -- ceph orch ps
sudo cephadm shell -- ceph osd tree
sudo cephadm shell -- ceph mon dump | head -20
# Cluster health
sudo cephadm shell -- ceph health detail
sudo cephadm shell -- ceph df
sudo cephadm shell -- ceph version
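For scripted checks, it can help to turn the health status into an exit code. The following is a minimal sketch, assuming the first word of the ceph health output is the status (HEALTH_OK, HEALTH_WARN, or HEALTH_ERR). The function name health_exit_code is ours for illustration and is not part of Ceph.

```shell
# Reads `ceph health` output on stdin and maps the status word to an exit code:
#   0 = HEALTH_OK, 1 = HEALTH_WARN, 2 = anything else (HEALTH_ERR, empty input).
# health_exit_code is a hypothetical helper name, not a Ceph command.
health_exit_code() {
    status=""
    # Only the first word matters; the rest of the line is a summary.
    read -r status _rest || true
    case "$status" in
        HEALTH_OK)   return 0 ;;
        HEALTH_WARN) return 1 ;;
        *)           return 2 ;;
    esac
}

# Example usage on the bootstrap node (not executed here):
#   sudo cephadm shell -- ceph health | health_exit_code || echo "cluster not healthy"
```

Treating HEALTH_WARN as a distinct non-zero code lets a pipeline decide whether warnings should block a rollout or only be logged.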
What’s happening
A Ceph cluster has three core daemon types. Monitors (mons) maintain the cluster map and quorum — 3 or 5 of them in production. OSDs (Object Storage Daemons) are one-per-disk processes that hold actual data and replicate to peers. Managers (mgrs) run the dashboard, Prometheus exporter, and the orchestrator that talks to cephadm. On top of those, MDS handles CephFS metadata, RGW handles S3-compatible gateways, RBD is the block service. cephadm deploys each daemon as a container (Podman by default on Ubuntu 24.04), placed across hosts according to placement specs you submit.
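One way to see these daemon types on a running cluster is to tally the ceph orch ps listing by name prefix. The sketch below assumes the listing has a single header row and that the first column holds daemon names of the form type.id (for example mon.ceph1 or osd.0). The helper name count_daemons_by_type is hypothetical.

```shell
# Reads `ceph orch ps` output on stdin and prints "<type> <count>" per daemon type.
# Assumes one header line and daemon names like "mon.ceph1" in the first column.
# count_daemons_by_type is an illustrative name, not a Ceph command.
count_daemons_by_type() {
    awk 'NR > 1 && NF > 0 {
             split($1, name, ".")   # "mgr.ceph1.abcdef" -> name[1] = "mgr"
             count[name[1]]++
         }
         END { for (t in count) print t, count[t] }' | sort
}

# Example usage (not executed here):
#   sudo cephadm shell -- ceph orch ps | count_daemons_by_type
```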
The bootstrap process creates the first monitor, the first manager, the admin keyring, and the cluster config, all on the node where you run cephadm bootstrap. It also generates the cluster SSH key pair and writes the public half to /etc/ceph/ceph.pub. After that, every other host is added via ceph orch host add once that public key is installed on it; the orchestrator then connects over SSH, pulls the Ceph container images, and starts daemons according to placement rules. The default placement spreads monitors across distinct hosts. OSDs are not created automatically: they land on every available block device (one with no partitions, filesystem, or mounts) only after you apply an OSD service such as --all-available-devices.
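If you would rather pin monitors to named hosts than rely on the default placement, a service spec can say so explicitly. The sketch below writes a minimal mon spec for the three hosts used in this article. The helper name write_mon_spec and the /tmp path are assumptions for illustration, and the apply commands are shown as comments because they need a working ceph CLI on the admin node.

```shell
# Writes a mon service spec that pins monitors to three named hosts.
# Host names come from this article; write_mon_spec is a hypothetical helper.
write_mon_spec() {
    cat > "$1" <<'EOF'
service_type: mon
placement:
  hosts:
    - ceph1
    - ceph2
    - ceph3
EOF
}

# Example usage on the admin node (not executed here):
#   write_mon_spec /tmp/mon-spec.yaml
#   sudo ceph orch apply -i /tmp/mon-spec.yaml --dry-run   # preview the change
#   sudo ceph orch apply -i /tmp/mon-spec.yaml
```

Keeping the spec in a file means it can be reviewed and version-controlled alongside the rest of the cluster configuration.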
There are three pre-flight items that, if wrong, will eat your bootstrap silently or cause health-warn states forever. First, clock skew: monitors raise a warning once their clocks drift more than 50ms apart (the default mon_clock_drift_allowed of 0.05 seconds). Run chronyd or systemd-timesyncd against a reliable source on every node before bootstrap. Second, network MTU: if you’re using jumbo frames (9000 MTU) on the cluster network, every host and switch port on the path must agree. With a mismatch, small packets such as heartbeats still get through while full-size frames are dropped, so recovery and scrub traffic stalls with no obvious error. Third, dedicated cluster network: while not required, separating client traffic from OSD-to-OSD replication onto a second NIC keeps replication and recovery from competing with client I/O, which is often the difference between a Ceph cluster that performs and one that doesn’t.
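The clock and MTU checks can be scripted. The sketch below makes two assumptions: that chronyc tracking prints a "System time" line with the offset in seconds, and that a node's offset from its NTP source is an acceptable proxy for skew between peers. The function names and the peer address 10.0.1.12 are placeholders for illustration.

```shell
# Pre-flight helpers. Function names are illustrative, not part of Ceph or chrony.

# Succeeds if the "System time" offset in `chronyc tracking` output (stdin)
# is below LIMIT seconds (default 0.05, the 50 ms figure used in this article).
chrony_offset_ok() {
    limit="${1:-0.05}"
    awk -v limit="$limit" '
        /^System time/ {
            # Line looks like: "System time     : 0.000006523 seconds slow of NTP time"
            split($0, parts, ":")
            split(parts[2], fields, " ")
            offset = fields[1] + 0
            found = 1
        }
        END { exit !(found && offset < limit) }
    '
}

# Largest ICMP payload that fits in one frame at a given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Example usage on each node (peer address is a placeholder; not executed here):
#   chronyc tracking | chrony_offset_ok || echo "clock offset too large"
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu 9000)" 10.0.1.12
```

The ping uses the don't-fragment flag, so a path that cannot carry a full 9000-byte frame fails immediately instead of fragmenting quietly.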
The procedure
- On all three nodes (ceph1, ceph2, ceph3), install Podman and the cephadm bootstrap script:
    sudo apt update
    sudo apt install -y podman lvm2 chrony curl
    sudo systemctl enable --now chrony
    curl --silent --remote-name --location https://download.ceph.com/rpm-19.2.0/el9/noarch/cephadm
    sudo install -m 0755 cephadm /usr/local/sbin/cephadm
    sudo cephadm add-repo --release squid
    sudo cephadm install
- On ceph1, bootstrap the cluster:
    sudo cephadm bootstrap \
      --mon-ip 10.0.0.11 \
      --cluster-network 10.0.1.0/24 \
      --initial-dashboard-user admin \
      --initial-dashboard-password '<set-a-strong-password>' \
      --dashboard-password-noupdate
  The output prints the dashboard URL and admin credentials. Copy them.
- Get CLI access on ceph1. Either open the containerized shell, which already has the admin keyring and config available:
    sudo cephadm shell
    # inside the container shell:
    ceph -s
  Or install the client locally and reference /etc/ceph/ceph.conf:
    sudo cephadm install ceph-common
    sudo ceph -s
- Add the other two nodes to the cluster:
    sudo ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph2
    sudo ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph3
    sudo ceph orch host add ceph2 10.0.0.12
    sudo ceph orch host add ceph3 10.0.0.13
    sudo ceph orch host ls
- Wait for the monitor count to converge to 3 (cephadm places mons by default across distinct hosts):
    watch -n 5 'sudo ceph mon dump | tail -3 && sudo ceph orch ps --daemon-type mon'
- Provision OSDs across all available block devices:
    sudo ceph orch device ls
    sudo ceph orch apply osd --all-available-devices
    sudo ceph osd tree
  This deploys an OSD daemon per disk on every host. Wait for OSDs to show up and in.
- Confirm health and create the first pool:
    sudo ceph -s
    sudo ceph osd pool create rbd-data 64 64
    sudo ceph osd pool application enable rbd-data rbd
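Several steps in the procedure involve waiting for the cluster to settle. When the bootstrap is scripted rather than typed, a small polling helper can replace watching the terminal. The sketch below is generic shell, not a Ceph feature; the name wait_until and the timeout values in the usage comment are assumptions.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds; gives up once TIMEOUT_SECONDS have elapsed.
# wait_until is a hypothetical helper name.
wait_until() {
    timeout_s="$1"
    interval_s="$2"
    shift 2
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            echo "timed out after ${elapsed}s waiting for: $*" >&2
            return 1
        fi
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
}

# Example usage after provisioning OSDs (not executed here):
#   wait_until 600 5 sh -c 'sudo ceph health | grep -q HEALTH_OK'
```

Elapsed time is counted from the sleep intervals only, so a slow command can push the real wait past the nominal timeout.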
Common pitfalls
- Bootstrap does not verify that --mon-ip is reachable from the other planned hosts. If it sits on the wrong interface, ceph1 comes up cleanly and the failure only shows later, when mons on the other hosts cannot join quorum. Use the address on the network you intend mons to communicate on.
- --cluster-network is the OSD-to-OSD replication network. If you don’t specify it, all replication traffic goes over the public network and competes with client I/O. Set it.
- Clock drift past 50ms produces MON_CLOCK_SKEW warnings that escalate to outages. Verify chronyc tracking shows sync on every node before bootstrap.
- --all-available-devices consumes every unpartitioned disk. If you have a disk you want to keep out (boot disk, NVMe for cache), use explicit ceph orch daemon add osd <host>:<device> syntax instead.
- Pre-19.x clusters had quirks around stretched placement; we standardize on Reef (18) or Squid (19) for new deployments.
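An alternative to adding OSDs one device at a time is an OSD service spec with a device filter. The sketch below is one possible spec, assuming the disks you want as OSDs are rotational and the ones you want to keep out (for example NVMe) are not; on virtual machines that assumption often does not hold. The service_id hdd_only, the host pattern, and the helper name write_osd_spec are placeholders.

```shell
# Writes an OSD service spec that only claims rotational devices,
# leaving non-rotational disks (e.g. NVMe) untouched.
# write_osd_spec and the service_id are hypothetical names.
write_osd_spec() {
    cat > "$1" <<'EOF'
service_type: osd
service_id: hdd_only
placement:
  host_pattern: 'ceph*'
spec:
  data_devices:
    rotational: 1
EOF
}

# Example usage on the admin node (not executed here):
#   write_osd_spec /tmp/osd-spec.yaml
#   sudo ceph orch apply -i /tmp/osd-spec.yaml --dry-run   # preview which devices match
#   sudo ceph orch apply -i /tmp/osd-spec.yaml
```

Running with --dry-run first shows which devices the filter would claim before any disk is touched.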
In the engagements we run, Ceph is the storage layer for clustered environments that need block, file, and object semantics under one operations story. We bootstrap as above, layer in an RGW S3 gateway and CephFS on the same cluster, wire the dashboard to our SSO, and pipe Prometheus metrics into the same time series the rest of the platform uses. The bootstrap takes an hour; the operational discipline — capacity forecasts, PG count tuning, slow-OSD detection — is the work that follows.