Optimizing Ceph Storage: How to Remove and Reuse OSD Drives

Ceph is an open-source software-defined storage system. It provides object, block, and file storage in a unified system. It is designed to be fault-tolerant, self-healing, and capable of running on commodity hardware, making it a perfect choice for large-scale storage needs. One of the main components of Ceph’s architecture is the Object Storage Daemon (OSD), which is responsible for storing data, handling replication, recovery, rebalancing, and providing some of the cluster’s monitoring information.

Each OSD in a Ceph cluster typically manages one or more physical or logical storage devices, and the cluster relies on these OSDs to distribute data across the storage pool. The OSDs are critical to the performance and reliability of the Ceph cluster, ensuring that data is both safe and accessible.

If you’re interested in learning how to deploy a Ceph Cluster, I already have another blog covering the deployment process on AWS.

Ceph Cluster Deployment on AWS: Mastering Distributed Storage

However, there are scenarios where you may need to remove an OSD from the Ceph cluster. These situations could include:

  1. Maintenance or Hardware Replacement: When an OSD drive shows signs of failure or degradation, it may need to be replaced with a new one to maintain the cluster’s health and performance.
  2. Upgrading Hardware: If you are upgrading your storage infrastructure with larger or faster drives, you may need to remove older OSDs to make room for the new hardware.
  3. Repurposing Hardware: In some cases, you might want to repurpose an OSD’s storage device for a different use, such as allocating it to a different storage system, using it for another service, or redeploying it in a different cluster.
  4. Cluster Rebalancing: Sometimes, to achieve better data distribution or to optimize the cluster’s performance, you might need to remove an OSD temporarily or permanently.

Removing an OSD from a Ceph cluster is a delicate process that must be performed carefully to avoid disrupting the cluster’s operation and to ensure that data remains accessible and protected. In this blog, we will walk through the steps involved in safely removing an OSD from a Ceph cluster and repurposing the drive for a different purpose.

Preparation for removing an OSD

Before removing an OSD from a Ceph cluster, it is essential to ensure that the cluster is in a stable state. Removing an OSD can impact data redundancy and cluster performance, so careful preparation is necessary to avoid data loss or prolonged recovery times. Below are the steps you should follow to ensure the cluster’s stability before proceeding with the OSD removal:

Check Cluster Health:

The first step is to verify the current health status of the Ceph cluster. You can check the cluster’s health using the Ceph CLI command:

root@rke2-server1:~# kubectl exec -it rook-ceph-tools-6d4568fd7f-nnf6l -n rook-ceph -- ceph status
  cluster:
    id:     0f4e757c-e7fb-4113-96d3-ac323b05c6f0
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,e (age 32h)
    mgr: a(active, since 32h), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 25h), 3 in (since 25h); 52 remapped pgs
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   12 pools, 169 pgs
    objects: 3.39M objects, 2.3 TiB
    usage:   7.1 TiB used, 92 TiB / 99 TiB avail
    pgs:     169 active+clean

  io:
    client:   1.3 KiB/s rd, 4.5 KiB/s wr, 2 op/s rd, 0 op/s wr

root@rke2-server1:~#

Pay particular attention to the health field at the top of the output:

cluster:
  id:     0f4e757c-e7fb-4113-96d3-ac323b05c6f0
  health: HEALTH_OK

Make sure the cluster health is reported as HEALTH_OK. If the health status is HEALTH_WARN or HEALTH_ERR, you will need to investigate the cause and resolve any issues before proceeding. Pay attention to warnings related to degraded data, undersized PGs (placement groups), or OSD flapping.

In the case of HEALTH_WARN, check for any recovery activity or ongoing data migration that could be affected by the OSD removal, and wait until those operations are complete before proceeding. Also confirm that no significant client I/O or recovery activity is currently running on the cluster.
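As a minimal sketch, the following standard Ceph commands (run from the toolbox pod, as in the example above) surface health warnings, placement-group state, and ongoing recovery or client I/O:

ceph health detail     # explains any HEALTH_WARN or HEALTH_ERR conditions
ceph pg stat           # summary of PG states (look for degraded, undersized, or recovering PGs)
ceph osd pool stats    # per-pool client I/O and recovery rates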

Removing an OSD: Detailed Instructions

When removing an OSD from a Ceph cluster, you need to follow a series of steps to ensure the process is done safely and smoothly. This involves marking the OSD as “out,” removing it from the CRUSH map, and deleting the OSD from the Ceph cluster. Below are detailed instructions for each step, along with the necessary commands and explanations.

Step 1: Marking the OSD as “Out”
  • Identify the OSD ID

First, identify the OSD ID you want to remove. You can find the OSD ID by running:

root@rke2-server1:~# kubectl exec -it rook-ceph-tools-6d4568fd7f-nnf6l -n rook-ceph -- bash
bash-4.4$ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME                  STATUS  REWEIGHT  PRI-AFF
-1         4.98488  root default
-3         1.81929      host rke2-agent2-gse
 0    hdd  0.90959          osd.0                  up   1.00000  1.00000
 2    hdd  0.90970          osd.2                  up   1.00000  1.00000
-7         3.16559      host rke2-server1-gse
 1    hdd  2.72899          osd.1                  up   1.00000  1.00000
 3    ssd  0.43660          osd.3                  up   1.00000  1.00000
bash-4.4$

Look for the osd.<id> you want to remove.

  • Mark the OSD as “out”

Next, mark the OSD as “out” so Ceph begins migrating the data to other OSDs:

bash-4.4$ ceph osd out osd.1
marked out osd.1.
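Before touching the CRUSH map, the OSD daemon itself has to be stopped. In this Rook-Ceph setup the daemon runs as a pod managed by a Deployment, typically named rook-ceph-osd-<ID>; a minimal sketch, run on the Kubernetes host rather than inside the toolbox pod:

kubectl -n rook-ceph scale deployment rook-ceph-osd-1 --replicas=0   # stop the osd.1 daemon pod
kubectl -n rook-ceph get pods | grep rook-ceph-osd-1                 # confirm the pod has terminated

Once the pod is gone, ceph osd tree will eventually report osd.1 as down, which is required for the final removal step.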

Step 2: Remove the OSD from the CRUSH Map

Ceph uses the CRUSH map to determine where to store data across the OSDs. After marking the OSD as out and stopping the pod, remove the OSD from the CRUSH map:

bash-4.4$ ceph osd crush remove osd.1
removed item id 1 name 'osd.1' from crush map

Step 3: Deleting the OSD from the Ceph Cluster

Now that the OSD is marked out, its process is stopped, and it has been removed from the CRUSH map, you can delete the OSD entirely from the Ceph cluster.

  • Remove the OSD Authentication Key

Ceph stores authentication keys for each OSD. Remove the OSD key from the cluster:

bash-4.4$ ceph auth del osd.1
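To confirm the key is gone, you can list the remaining auth entities; osd.1 should no longer appear:

ceph auth ls | grep '^osd\.'   # osd.1 should be absent from the list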
  • Delete the OSD

Finally, remove the OSD from the cluster:

Note that the OSD must be reported as down before it can be removed. This can take a little while to be reflected after the daemon is stopped; as shown below, the first attempt fails while osd.1 is still up and succeeds once ceph osd tree shows it as down.

bash-4.4$ ceph osd rm osd.1
Error EBUSY: osd.1 is still up; must be down before removal.
bash-4.4$ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME                  STATUS  REWEIGHT  PRI-AFF
-1         2.25589  root default
-3         1.81929      host rke2-agent2-gse
 0    hdd  0.90959          osd.0                  up   1.00000  1.00000
 2    hdd  0.90970          osd.2                  up   1.00000  1.00000
-7         0.43660      host rke2-server1-gse
 3    ssd  0.43660          osd.3                  up   1.00000  1.00000
 1               0  osd.1                          up         0  1.00000
bash-4.4$ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME                  STATUS  REWEIGHT  PRI-AFF
-1         2.25589  root default
-3         1.81929      host rke2-agent2-gse
 0    hdd  0.90959          osd.0                  up   1.00000  1.00000
 2    hdd  0.90970          osd.2                  up   1.00000  1.00000
-7         0.43660      host rke2-server1-gse
 3    ssd  0.43660          osd.3                  up   1.00000  1.00000
 1               0  osd.1                        down         0  1.00000
bash-4.4$ ceph osd rm osd.1
removed osd.1
bash-4.4$

Once the OSD is removed, we can verify it is gone using the same command:

bash-4.4$ ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME                  STATUS  REWEIGHT  PRI-AFF
-1         2.25589  root default
-3         1.81929      host rke2-agent2-gse
 0    hdd  0.90959          osd.0                  up   1.00000  1.00000
 2    hdd  0.90970          osd.2                  up   1.00000  1.00000
-7         0.43660      host rke2-server1-gse
 3    ssd  0.43660          osd.3                  up   1.00000  1.00000
bash-4.4$ exit
exit

Step 4: Clearing the Disk

If you want to repurpose the physical drive used by the removed OSD, you will need to clear the disk. This involves wiping the disk clean of the Ceph data structures:

  • Identify the Disk

Identify the disk associated with the removed OSD. You can do this by matching the OSD ID with the corresponding disk using lsblk or similar commands.

root@rke2-server1:~# lsblk -fp | grep ceph_bluestore
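If the grep matches more than one device and you are not sure which one belonged to osd.1, a useful precaution is to record the mapping from the toolbox while the OSD still exists (i.e. before the ceph osd rm step):

ceph osd metadata 1 | grep -E '"devices"|dev_node'   # prints the backing device(s) reported by osd.1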
  • Wipe the Disk

Use wipefs to remove the Ceph data structures from the disk:

root@rke2-server1:~# wipefs -a /dev/sdi
/dev/sdi: 22 bytes were erased at offset 0x00000000 (ceph_bluestore): 62 6c 75 65 73 74 6f 72 65 20 62 6c 6f 63 6b 20 64 65 76 69 63 65
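If you want to be more thorough before reusing the drive (for example, if it previously carried partition tables or LVM metadata), a common, destructive cleanup sketch looks like this; double-check the device name before running it:

sgdisk --zap-all /dev/sdi                                  # remove GPT and MBR partition tables
dd if=/dev/zero of=/dev/sdi bs=1M count=100 oflag=direct   # zero the start of the disk to clear leftover metadata
partprobe /dev/sdi                                         # have the kernel re-read the (now empty) partition table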

You can now repurpose the disk for another use.

The same steps apply regardless of how the cluster was deployed; the ceph commands are identical whether you run them from a Rook-Ceph toolbox pod, as shown here, or directly on a node in a non-Kubernetes Ceph deployment.

Handling Degraded State

Removing an OSD has a direct impact on the cluster’s health, so it is important to set expectations and know how to manage the cluster while the data it held is being re-replicated.

  • Cluster Health During Removal: Removing an OSD may temporarily put the cluster into a degraded state, where data redundancy is reduced. This is normal, but the situation needs monitoring.
  • Monitoring Cluster Status: Keep an eye on the cluster’s status using ceph status. Warnings such as undersized or degraded PGs mean some placement groups temporarily have fewer copies of their data than the pool requires.
  • Addressing Issues: Degraded redundancy normally resolves itself once the data has been rebalanced onto the remaining OSDs; if those OSDs are short on capacity, add a new OSD. The commands sketched below help check the progress of data migration and any scrub issues.
  • Returning to a Healthy State: After recovery completes, verify that no PGs are undersized or degraded. Running scrubs and deep-scrubs is a good final check before considering the removal finished.
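A minimal sketch of commands for this monitoring, run from the toolbox pod as in the earlier examples (the deep-scrub line is optional and can generate extra load):

ceph status                  # overall health and recovery progress
ceph health detail           # explanation of any degraded or undersized PG warnings
ceph pg stat                 # counts of PGs by state
ceph osd pool stats          # per-pool recovery and client I/O rates
ceph pg dump_stuck unclean   # list any PGs stuck in a non-clean state
ceph osd deep-scrub all      # optionally ask all OSDs to deep-scrub their PGs ("*" on older releases)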

Repurposing the Drive

Wiping the disk is only the first step; what you do next depends on how you intend to reuse the drive.

  • Reformatting or Reconfiguring the Drive: After wiping, format the drive for its intended use (for example, ext4 for Linux or NTFS for Windows); see the sketch after this list.
  • Examples of New Uses: A cleaned drive can be repurposed in several ways, such as:
  • Adding to Another Ceph Cluster: prepare the drive and add it as a new OSD in a different Ceph cluster.
  • Using for General Storage: mount the drive as additional storage on a Linux system.
  • Incorporating into a RAID Array: add the disk to a RAID array for improved performance and redundancy.
  • Deploying on a Different System: prepare the drive for use in a new machine or as a backup drive.
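As an example of the general-storage path, here is a minimal sketch for formatting and mounting the wiped drive on Linux. The device name /dev/sdi carries over from the wipefs example above, and /mnt/reuse is just an arbitrary mount point:

mkfs.ext4 /dev/sdi            # create an ext4 filesystem on the wiped disk
mkdir -p /mnt/reuse           # create a mount point
mount /dev/sdi /mnt/reuse     # mount the new filesystem
blkid /dev/sdi                # note the UUID if you want a persistent /etc/fstab entry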

Conclusion

Removing an OSD from a Ceph cluster is a critical task that, when done correctly, allows you to maintain cluster health and repurpose hardware effectively. By following the steps in this guide, you can safely manage the process, ensuring minimal impact on your storage system.

Always monitor the cluster’s status during and after the removal to address any issues promptly. Whether you’re upgrading hardware or reallocating resources, careful planning and execution are key to keeping your Ceph cluster running smoothly.

If you have any questions or experiences to share, feel free to comment below. Your insights are valuable as we continue to explore Ceph management together.

Happy clustering!
