When setting up Zabbix monitoring for the home lab, we determined that the Ceph storage network was not initially configured as per best practices.
Rather than configuring Ceph with the monitors communicating on a "public" network and a private "cluster" network for OSD traffic, we had put all Ceph-related communication on an isolated VLAN. This worked quite well, but it was not in line with best practices and left no way for us to use Ceph's built-in Zabbix reporter.
After a large amount of research, we determined that the most often given advice regarding modifying the storage network configuration of an operational Ceph cluster is “Don’t.”
Taking this to heart, we started the backup process.
Our first requirement was having off-cluster backups of all VMs. That way, even if the process fails, recovery is still possible.
The second requirement was an available window for a cluster reboot. This process involves significant changes and can cause a temporary loss of quorum as monitors are brought down on one network and back up on another.
As our test rack servers have only two NICs apiece, we chose to move the Ceph monitor daemons from the private cluster network to the public network, which also serves as the interface for ProxMox and the VM bridges.
Note that monitor configuration is shared across all members of the cluster, so while you do need to configure each monitor, these steps only need to be done on one member of the cluster.
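Before touching anything, it can help to record the current monitor layout as the cluster sees it. This is a minimal sketch, assuming the `ceph` CLI and an admin keyring are available on the node (the guard just skips the check anywhere else):

```shell
# Skip gracefully if this is not run on a Ceph cluster node.
command -v ceph >/dev/null 2>&1 || { echo "ceph CLI not found; run this on a cluster node"; exit 0; }

# Shows each monitor's name and current address.
ceph mon dump
```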
- Backup storage network configuration:
cp /etc/pve/storage.cfg /tmp/backup/storage.cfg.bak
cp /etc/pve/ceph.conf /tmp/backup/ceph.conf.bak
ceph mon getmap -o /tmp/backup/monmap.bak
- Create working files:
cp /etc/pve/storage.cfg /tmp/storage.cfg
cp /etc/pve/ceph.conf /tmp/ceph.conf
ceph mon getmap -o /tmp/monmap
- Make changes to the Ceph Monitor map:
Note that only Ceph cluster members which are running a monitor daemon need to be configured here.
monmaptool --rm prox1 --rm prox2 --rm prox3 /tmp/monmap
monmaptool --add prox1 <newip:port> --add prox2 <newip:port> --add prox3 <newip:port> /tmp/monmap
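Each `<newip:port>` placeholder must be filled in with that monitor's new address. As a small, hypothetical sanity check before handing an address to monmaptool (the 10.10.10.1 address and Ceph's default monitor port 6789 below are examples, not values from this cluster), you can verify the format:

```shell
# Hypothetical example address; substitute your monitor's new IP and port.
# 6789 is Ceph's default monitor port.
new_addr="10.10.10.1:6789"

# Accept only a dotted-quad IPv4 address followed by :port.
if echo "$new_addr" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]+$'; then
    echo "address format ok: $new_addr"
else
    echo "bad address format: $new_addr"
fi
```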
- Verify the Ceph Monitor Map changes:
monmaptool --print /tmp/monmap
- Edit /tmp/ceph.conf, and change the entries listed below:
public network = <new monitor subnet in CIDR notation>

[mon.prox1]
	host = prox1
	mon addr = <newip:port>

[mon.prox2]
	host = prox2
	mon addr = <newip:port>

[mon.prox3]
	host = prox3
	mon addr = <newip:port>
If your OSDs are shown here, you may need to modify additional settings.
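As a concrete illustration, with a hypothetical 10.10.10.0/24 subnet and Ceph's default monitor port 6789 standing in for the placeholders, the edited section might look like this (the addresses are examples only):

```
public network = 10.10.10.0/24

[mon.prox1]
	host = prox1
	mon addr = 10.10.10.11:6789

[mon.prox2]
	host = prox2
	mon addr = 10.10.10.12:6789

[mon.prox3]
	host = prox3
	mon addr = 10.10.10.13:6789
```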
- Commit changes:
First we need to inject the new monitor map into each monitor daemon. Each monitor must be stopped before its map is injected, and each inject command must be run on the node that hosts that monitor.
ceph-mon -i prox1 --inject-monmap /tmp/monmap
ceph-mon -i prox2 --inject-monmap /tmp/monmap
ceph-mon -i prox3 --inject-monmap /tmp/monmap
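The stop/inject/start cycle for a single node can be sketched as follows, assuming the standard `ceph-mon@<id>` systemd unit name used by recent Ceph packages (repeat on each monitor node with its own id):

```shell
# Bail out politely if this is not a monitor node.
command -v ceph-mon >/dev/null 2>&1 || { echo "ceph-mon not found; run this on a monitor node"; exit 0; }

# This monitor's id; use the right one for each node.
id=prox1

systemctl stop ceph-mon@"$id"            # stop the monitor so its store can be modified
ceph-mon -i "$id" --inject-monmap /tmp/monmap
systemctl start ceph-mon@"$id"           # bring it back up on the new address
```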
Now we need to overwrite /etc/pve/ceph.conf and verify that the change replicates to each node of the cluster.
cp /tmp/ceph.conf /etc/pve/ceph.conf
You should be able to simply restart the monitor daemons at this point; however, we rebooted each node of the cluster, as the ProxMox web interface was not responding after the monitor restart.
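Once everything is back up, it's worth confirming that the monitors formed quorum on their new addresses. A minimal check, assuming the `ceph` CLI is available on the node (the guard skips it anywhere else):

```shell
# Skip gracefully on machines without the ceph CLI.
command -v ceph >/dev/null 2>&1 || { echo "ceph CLI not found; run this on a cluster node"; exit 0; }

# Overall cluster state; look for HEALTH_OK and all monitors in quorum.
ceph -s

# The monitor map should now list the new addresses.
ceph mon dump
```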
If you have any questions about this process, feel free to comment below!