
Ceph - Distributed File System in Proxmox

Ceph is a massively scalable distributed file system which runs on Linux. I primarily use Ceph within a Proxmox cluster. If you haven’t checked out Proxmox as a virtual machine hypervisor then I strongly suggest you do so!

 

Ceph Troubleshooting

Clearing Crash Notifications

Warning
I’d suggest you always investigate the logs before simply dismissing crash notifications. Once you’re happy there isn’t an underlying issue, you’re free to get rid of that annoying notification.

From the console, simply run:

    ceph crash archive-all
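
If you want to see which crashes are being reported before archiving them, Ceph has commands to list and inspect them (the crash ID below is just a placeholder):

    ceph crash ls
    ceph crash info <crash-id>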

Removing Ceph Completely

If you need to completely remove Ceph from a server, either because you no longer need it or because you’d like to reinstall cleanly, you can follow these steps.

Warning
This will wipe all data and is non-recoverable!
    apt purge ceph-mon ceph-osd ceph-mgr ceph-mds
    rm -rf /var/lib/ceph/mon/  /var/lib/ceph/mgr/  /var/lib/ceph/mds/
    rm -r /var/lib/ceph
    rm -r /etc/pve/priv/ceph
    rm /etc/ceph/ceph.conf
    rm /etc/pve/ceph.conf
    pveceph purge
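
As a quick sanity check afterwards (my own habit rather than an official step), you can confirm nothing obvious has been left behind:

    # list any Ceph-related packages still installed
    dpkg -l | grep -i ceph
    # the data directory should be gone at this point
    ls /var/lib/ceph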

Improve Recovery / Rebalance Performance

By default Ceph prioritises client traffic over recovery and rebalancing. If you need to give Ceph recovery a higher priority (for instance, when recovering from a disk replacement), you can change the settings on each OSD to optimise the cluster behaviour.

    ceph tell 'osd.*' injectargs '--osd-max-backfills 32'
    ceph tell 'osd.*' injectargs '--osd-recovery-max-active 32'
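
While the higher limits are in place, you can watch recovery progress with the usual status commands and confirm the cluster returns to HEALTH_OK:

    ceph -s
    ceph -w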

Don’t forget to reset afterwards:

    ceph tell 'osd.*' injectargs '--osd-max-backfills 1' 
    ceph tell 'osd.*' injectargs '--osd-recovery-max-active 0'

Additional performance tuning options you can experiment with:

    ceph tell 'osd.*' config set osd_min_pg_log_entries 10
    ceph tell 'osd.*' config set osd_max_pg_log_entries 10
    ceph tell 'osd.*' config set osd_pg_log_dups_tracked 10
    ceph tell 'osd.*' config set osd_pg_log_trim_min 10
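
Note that injectargs and tell only change the running daemons; the values revert when an OSD restarts. If you want a tuning value to persist, one option (shown here with the backfill setting as an example) is the monitors’ central config database:

    ceph config set osd osd_max_backfills 32
    ceph config rm osd osd_max_backfills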

Destroying an Unused Disk Reference

This destroys an old (unused) disk that Proxmox is still holding onto as a reference.

Warning
Check you know exactly which device mount you are destroying!
    ceph-volume lvm zap /dev/sdX --destroy
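
Before zapping anything, it’s worth double-checking which device belongs to which OSD; lsblk and ceph-volume can both help confirm you have the right /dev/sdX:

    lsblk
    ceph-volume lvm list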

Changing the Ceph Public Network

Changing the Ceph Public Network once the cluster is up and running can be tricky to figure out, but the process itself is quite straightforward.

Ceph Public Network
The public network is where all client (VM) traffic flows; the cluster network can optionally be defined separately and carries recovery and rebalance traffic. By default the cluster network is not specified.
  • Edit the Ceph config file from a shell (/etc/ceph/ceph.conf) and set the new IP range (see the sketch after this list)
  • One by one, destroy and recreate the monitors and managers from the UI; they should come up with an IP on the new public network
  • Check the cluster health at each step
  • Reboot each server one by one
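
For reference, the line being edited looks something like the following; the subnet is a placeholder for your own new range:

    [global]
        # new public (client) network - replace with your own subnet
        public_network = 10.10.10.0/24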

 

Benchmarking

It’s often helpful to measure the performance of your Ceph setup. For me this has helped verify that the network configuration is set up correctly, as well as confirming the disk performance settings are optimal.

Create a dedicated Ceph pool to benchmark in:

    ceph osd pool create scbench 128 128
    ceph osd pool application enable scbench rbd

Run the benchmarks using rados (the 10 is the test duration in seconds; the write run uses --no-cleanup so the seq and rand read tests have objects left behind to read):

    rados bench -p scbench 10 write --no-cleanup
    rados bench -p scbench 10 seq
    rados bench -p scbench 10 rand
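
Since the pool has the rbd application enabled, you can also take a block-level measurement with rbd bench. This is just a sketch; the image name and sizes are arbitrary:

    rbd create --size 1G scbench/benchimage
    rbd bench --io-type write --io-size 4K --io-total 256M scbench/benchimage
    rbd rm scbench/benchimage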

Cleanup once you’re finished:

    rados -p scbench cleanup
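
If the scbench pool was created purely for benchmarking, you may also want to remove it entirely. This is destructive and only works if pool deletion is allowed (mon_allow_pool_delete):

    ceph osd pool delete scbench scbench --yes-i-really-really-mean-it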