3.3.5 Replacing a Hard Disk Proactively

Exadata Storage software provides a complete set of automated operations for hard disk maintenance when a hard disk has failed or has been flagged as problematic. However, there are situations where a hard disk must be removed proactively from the configuration.

In the CellCLI ALTER PHYSICALDISK command, the DROP FOR REPLACEMENT option checks whether a normally functioning hard disk can be removed safely without risk of data loss. However, after the command runs, the grid disks on the hard disk are inactivated on the storage cell and set to offline in the Oracle ASM disk groups.

To reduce the risk of having a disk group without full redundancy and proactively replace a hard disk, follow this procedure:

  1. Identify the LUN, cell disk, and grid disk associated with the hard disk.

    Use a command similar to the following, where X:Y identifies the hard disk name of the drive you are replacing.

    # cellcli -e "list diskmap" | grep 'X:Y'

    The output should be similar to the following:

       20:5            KEBTDJ          5                       normal  559G           
        CD_05_exaceladm01    /dev/sdf                
        "DATAC1_CD_05_exaceladm01, DBFS_DG_CD_05_exaceladm01, 
         RECOC1_CD_05_exaceladm01"
    

    To get the LUN, issue a command similar to the following:

    CellCLI> list lun where deviceName='/dev/sdf'
             0_5     0_5     normal
    
  2. Drop the disk.
    • If you are using Oracle Exadata System Software release 21.2.0 or later, use the following command to drop the physical disk while maintaining redundancy:

      CellCLI> alter physicaldisk X:Y drop for replacement maintain redundancy

      Wait for the operation to complete before continuing.

    • If you are using an Oracle Exadata System Software release before 21.2.0, do the following:

      1. Drop the affected grid disks from the Oracle ASM disk groups in normal mode.

        SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name;
      2. Wait for the ASM rebalance operation to complete before continuing.

      3. Drop the physical disk.

        Use a command similar to the following, where X:Y identifies the hard disk name of the drive you are replacing.

        CellCLI> alter physicaldisk X:Y drop for replacement
  3. Ensure that the blue OK to Remove LED on the disk is lit before removing the disk.
  4. Replace the hard disk with the new one.
  5. Verify that the LUN, cell disk, and grid disk associated with the hard disk were created.
    CellCLI> list lun lun_name
    CellCLI> list celldisk where lun=lun_name
    CellCLI> list griddisk where celldisk=celldisk_name
  6. Verify the grid disk was added to the Oracle ASM disk groups.

    The following query should return no rows, because a disk with group_number=0 does not belong to any mounted disk group.

    SQL> SELECT path,header_status FROM v$asm_disk WHERE group_number=0;

    The following query shows whether all the failure groups have the same number of disks:

    SQL> SELECT group_number, failgroup, mode_status, count(*) FROM v$asm_disk
         GROUP BY group_number, failgroup, mode_status;
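
The result of the last query has to be compared by eye across failure groups. As a minimal sketch of that comparison, the following Python function takes the rows returned by the query and reports the disk groups whose failure groups do not all have the same number of online disks. The function name and the row layout are assumptions made for illustration; they are not part of CellCLI or Oracle ASM. The sketch also assumes that mode_status is reported as ONLINE for usable disks.

```python
from collections import defaultdict


def unbalanced_disk_groups(rows):
    """Return disk groups whose failure groups differ in online disk count.

    rows: iterable of (group_number, failgroup, mode_status, disk_count)
    tuples, in the column order of the GROUP BY query above.
    (Hypothetical helper: the row layout is an assumption for illustration.)
    """
    online = defaultdict(lambda: defaultdict(int))
    for group_number, failgroup, mode_status, disk_count in rows:
        # Register the failure group even if none of its disks are online,
        # so that a fully offline failure group is counted as 0.
        online[group_number][failgroup] += (
            disk_count if mode_status == "ONLINE" else 0
        )

    unbalanced = {}
    for group_number, per_failgroup in online.items():
        # More than one distinct count means the failure groups differ.
        if len(set(per_failgroup.values())) > 1:
            unbalanced[group_number] = dict(per_failgroup)
    return unbalanced
```

An empty result means that every failure group in each disk group has the same number of online disks. A non-empty result lists, for each affected disk group number, the online disk count for each failure group, which may indicate that a grid disk on the replaced drive has not yet been added back.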