Hardware
Disk Failures
Michael W Lucas
Disks are evil. They lie about their characteristics and layout, they hide errors, and they fail in unexpected ways. Your disks are secretly plotting against you.
One of the most significant threats to data integrity and system performance in GPFS is hardware disk failures. Understanding how GPFS handles disk failures and their potential impact on the overall system is crucial for maintaining data availability and minimizing downtime. Disk failures can lead to temporary performance degradation and increased operational complexity.
Once we have the pdisk state from any of the commands above, we can look up what it means in the Pdisk States List.
Disks and RAID
Detailed references for disk management are located in the RAID administration guide from IBM.
mmvdisk greatly simplifies IBM Spectrum Scale RAID administration.
Command structure: mmvdisk <noun> <parameter>
The nouns that mmvdisk recognizes are:
- mmvdisk nodeclass - Manage server node classes
- mmvdisk server - Manage recovery group servers
- mmvdisk recoverygroup - Manage recovery groups
- mmvdisk vdiskset - Manage vdisk sets
- mmvdisk filesystem - Manage file systems made from vdisk sets
- mmvdisk pdisk - Manage pdisks
- mmvdisk vdisk - Manage vdisks
Parameters are applied to nouns, but specific parameters are designed to work with certain nouns. Common parameters include actions such as list, change, add, delete, and configure.
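The noun-plus-parameter pattern can be illustrated with a small sketch. The helper name `mmvdisk_cmd` is hypothetical and not part of GPFS; it only checks the noun against the list above and prints the resulting command line, so it is safe to try on any machine. The three commands shown are, to the best of our knowledge, valid listing commands, but check the IBM command reference for your release.

```shell
# Hypothetical helper: composes an mmvdisk command line and prints it.
# It validates only the noun; it never runs mmvdisk itself.
mmvdisk_cmd() {
  noun="$1"
  if [ "$#" -gt 0 ]; then shift; fi
  case "$noun" in
    nodeclass|server|recoverygroup|vdiskset|filesystem|pdisk|vdisk) ;;
    *) echo "unknown mmvdisk noun: $noun" >&2; return 1 ;;
  esac
  echo "mmvdisk $noun $*"
}

# Noun first, then the action and its options.
mmvdisk_cmd nodeclass list
mmvdisk_cmd recoverygroup list
mmvdisk_cmd vdiskset list --vdisk-set all
```

Printing rather than executing keeps the sketch harmless; on a Spectrum Scale RAID server the printed lines can be run as they are.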
Disk Topography
The disk subsystem configuration on an IBM Spectrum Scale RAID server can be captured by following the next steps.
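The original steps are not reproduced on this page. As a minimal sketch, assuming they refer to the standard IBM Spectrum Scale RAID tools `mmgetpdisktopology` (dump the topology) and `topsummary` (summarize the dump), the capture might look like the following. The function name `capture_topology`, the `RUN` switch, and the output path are illustrative assumptions; `topsummary` may not be on the default PATH on every installation.

```shell
# Sketch only: prints the commands by default.
# Set RUN=1 on a Spectrum Scale RAID server to actually execute them.
capture_topology() {
  out="${1:-/tmp/pdisk-topology.out}"   # assumed output path
  if [ "${RUN:-0}" = "1" ]; then
    # Dump the raw topology, then summarize it.
    mmgetpdisktopology > "$out" && topsummary "$out"
  else
    echo "mmgetpdisktopology > $out"
    echo "topsummary $out"
  fi
}

capture_topology /tmp/server1.top
```

Keeping the raw dump in a file means the same capture can be summarized again later or attached to a support case.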
Failed Disk Replacement
Locating the disk: there is more than one way to locate the failed disk. One option is to list the pdisks that are marked for replacement:
#mmvdisk pdisk list --replace --recovery-group all
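A second option is to list every pdisk that is not in a healthy state. The sketch below shows both filters side by side; it assumes the `--not-ok` filter of `mmvdisk pdisk list` is available in your release. The function name `list_suspect_pdisks` and the `RUN` switch are hypothetical.

```shell
# Sketch only: prints the commands by default.
# Set RUN=1 on a Spectrum Scale RAID server to actually execute them.
list_suspect_pdisks() {
  for filter in --replace --not-ok; do
    cmd="mmvdisk pdisk list --recovery-group all $filter"
    if [ "${RUN:-0}" = "1" ]; then
      $cmd            # unquoted on purpose so the words split into arguments
    else
      echo "$cmd"
    fi
  done
}

list_suspect_pdisks
```

`--replace` narrows the list to pdisks the system considers ready for replacement, while `--not-ok` is broader and also shows pdisks that are degraded but not yet marked for replacement.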
Replacing a pdisk requires the following three steps:
- Run the mmvdisk pdisk replace --prepare command to prepare the pdisk for physical removal.
- Physically remove the disk and replace it with a new disk of the same type.
- Run the mmvdisk pdisk replace command to complete the replacement.
Warning
Please consult with your support before replacing the disk since this process can void the support agreement.
#mmvdisk pdisk replace --prepare --recovery-group <recovery-group> --pdisk <disk-name>
After removing the old disk and inserting the new one, run mmvdisk pdisk replace again for the same recovery group and pdisk, this time without --prepare, to complete the replacement.
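The three-step sequence can be sketched as one small script. This is an illustration under stated assumptions, not a supported procedure: the names `run` and `replace_pdisk`, the `RUN` switch, and the example recovery group `RG1` and pdisk `e1s05` are all hypothetical. By default the script only prints the commands.

```shell
# Sketch only: prints the commands by default.
# Set RUN=1 on a Spectrum Scale RAID server to actually execute them.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

replace_pdisk() {
  rg="$1"
  pdisk="$2"
  if [ -z "$rg" ] || [ -z "$pdisk" ]; then
    echo "usage: replace_pdisk <recovery-group> <pdisk>" >&2
    return 2
  fi
  # Step 1: prepare the pdisk for physical removal.
  run mmvdisk pdisk replace --prepare --recovery-group "$rg" --pdisk "$pdisk" || return 1
  # Step 2: wait for the physical swap (only when really executing).
  if [ "${RUN:-0}" = "1" ]; then
    printf 'Swap the physical disk, then press Enter: '
    read -r reply
  fi
  # Step 3: complete the replacement.
  run mmvdisk pdisk replace --recovery-group "$rg" --pdisk "$pdisk"
}

replace_pdisk RG1 e1s05   # hypothetical recovery group and pdisk names
```

Stopping when the prepare step fails avoids prompting for a physical swap on a disk the system has not released.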