3.4.2 About Flash Disk Degraded Performance Statuses

If a flash disk has degraded performance, you might need to replace the disk.

You may need to replace a flash disk because the disk is has one of the following statuses:

  • warning - predictive failure
  • warning - poor performance
  • warning - write-through caching
  • warning - peer failure

Note:

For Oracle Exadata System Software releases earlier than release 11.2.3.2.2, the status is not present.

An alert is generated when a flash disk is in predictive failure, poor performance, write-through caching or peer failure status. The alert includes specific instructions for replacing the flash disk. If you have configured the system for alert notifications, then the alerts are sent by e-mail message to the designated address.

predictive failure

Flash disk predictive failure status indicates that the flash disk will fail soon, and should be replaced at the earliest opportunity. If the flash disk is used for flash cache, then it continues to be used as flash cache. If the flash disk is used for grid disks, then the Oracle ASM disks associated with these grid disks are automatically dropped, and Oracle ASM rebalance relocates the data from the predictively failed disk to other disks.

When a flash disk has predictive failure due to one flash disk, then the data is copied. If the flash disk is used for grid disks, then Oracle ASM re-partners the associated partner, and does a rebalance. If the flash disk is used for write back flash cache, then the data is flushed from the flash disks to the grid disks.

To identify a predictive failure flash disk, use the following command:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS=  \
'warning - predictive failure' DETAIL

         name:               FLASH_1_1
         deviceName:         /dev/nvme3n1
         diskType:           FlashDisk
         luns:               1_1
         makeModel:          "Oracle Flash Accelerator F160 PCIe Card"
         physicalFirmware:   8DV1RA13
         physicalInsertTime: 2016-11-30T21:24:45-08:00
         physicalSerial:     CVMD519000251P6KGN
         physicalSize:       1.4554837569594383T
         slotNumber:         "PCI Slot: 1; FDOM: 1"
         status:             warning - predictive failure

poor performance

Flash disk poor performance status indicates that the flash disk demonstrates extremely poor performance, and should be replaced at the earliest opportunity. Starting with Oracle Exadata System Software release 11.2.3.2, an under-performing disk is automatically identified and removed from active configuration. If the flash disk is used for flash cache, then flash cache is dropped from this disk thus reducing the effective flash cache size for the storage server. If the flash disk is used for grid disks, then the Oracle ASM disks associated with the grid disks on this flash disk are automatically dropped with FORCE option, if possible. If DROP...FORCE cannot succeed due to offline partners, then the grid disks are automatically dropped normally, and Oracle ASM rebalance relocates the data from the poor performance disk to other disks.

Oracle Exadata Database Machine then runs a set of performance tests. When poor disk performance is detected by CELLSRV, the cell disk status changes to normal - confinedOnline, and the physical disk status changes to warning - confinedOnline. The following conditions trigger disk confinement:

  • Disk stopped responding. The cause code in the storage alert log is CD_PERF_HANG.
  • Slow cell disk such as the following:
    • High service time threshold (cause code CD_PERF_SLOW_ABS)
    • High relative service time threshold (cause code CD_PERF_SLOW_RLTV)
  • High read or write latency such as the following:
    • High latency on writes (cause code CD_PERF_SLOW_LAT_WT)
    • High latency on reads (cause code CD_PERF_SLOW_LAT_RD)
    • High latency on reads and writes (cause code CD_PERF_SLOW_LAT_RW)
    • Very high absolute latency on individual I/Os happening frequently (cause code CD_PERF_SLOW_LAT_ERR)
  • Errors such as I/O errors (cause code CD_PERF_IOERR).

If the disk problem is temporary and passes the tests, then it is brought back into the configuration. If the disk does not pass the tests, then it is marked as poor performance, and Oracle Auto Service Request (ASR) submits a service request to replace the disk. If possible, Oracle ASM takes the grid disks offline for testing. If Oracle ASM cannot take the disks offline, then the cell disk status stays at normal - confinedOnline until the disks can be taken offline safely.

To identify a poor performance flash disk, use the following command:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS= \
'warning - poor performance' DETAIL

         name:                FLASH_1_4
         diskType:            FlashDisk
         luns:                1_4
         makeModel:           "Sun Flash Accelerator F20 PCIe Card"
         physicalFirmware:    D20Y
         physicalInsertTime:  2012-09-27T13:11:16-07:00
         physicalSerial:      508002000092e70FMOD2
         physicalSize:        22.8880615234375G
         slotNumber:          "PCI Slot: 1; FDOM: 3"
         status:              warning - poor performance

The disk status change is associated with the following entry in the cell alert history:

MESSAGE ID date_time info "Hard disk entered confinement status. The LUN
 n_m changed status to warning - confinedOnline. CellDisk changed status to normal
 - confinedOnline. Status: WARNING - CONFINEDONLINE  Manufacturer: name  Model
 Number: model  Size: size  Serial Number: serial_number  Firmware: fw_release 
 Slot Number: m  Cell Disk: cell_disk_name  Grid Disk: grid disk 1, grid disk 2
 ... Reason for confinement: threshold for service time exceeded"

The following would be logged in the storage server alert log:

CDHS: Mark cd health state change cell_disk_name  with newState HEALTH_BAD_
ONLINE pending HEALTH_BAD_ONLINE ongoing INVALID cur HEALTH_GOOD
Celldisk entering CONFINE ACTIVE state with cause CD_PERF_SLOW_ABS activeForced: 0
inactiveForced: 0 trigger HistoryFail: 0, forceTestOutcome: 0 testFail: 0
global conf related state: numHDsConf: 1 numFDsConf: 0 numHDsHung: 0 numFDsHung: 0
...

Note:

In Oracle Exadata System Software releases earlier than release 11.2.3.2, use the CALIBRATE command to identify a bad flash disk, and look for very low throughput and IOPS for each flash disk.

If a flash disk exhibits extremely poor performance, then it is marked as poor performance. The flash cache on that flash disk is automatically disabled, and the grid disks on that flash disk are automatically dropped from the Oracle ASM disk group.

write-through caching

Flash disk write-through caching status indicates the capacitors used to support data cache on the PCIe card have failed, and the card should be replaced as soon as possible.

peer failure

Flash disk peer failure status indicates one of the flash disks on the same Sun Flash Accelerator PCIe card has failed or has a problem. For example, if FLASH_5_3 fails, then FLASH_5_0, FLASH_5_1, and FLASH_5_2 have peer failure status. The following is an example:

CellCLI> LIST PHYSICALDISK
         36:0            L45F3A          normal
         36:1            L45WAE          normal
         36:2            L45WQW          normal
...
         FLASH_5_0       5L0034XM        warning - peer failure
         FLASH_5_1       5L0034JE        warning - peer failure
         FLASH_5_2       5L002WJH        warning - peer failure
         FLASH_5_3       5L002X4P        failed

When CellSRV detects a predictive or peer failure in any flash disk used for write back flash cache and only one FDOM is bad, then the data on the bad FDOM is resilvered, and the data on the other three FDOMs is flushed. CellSRV then initiates an Oracle ASM rebalance for the disks if there are valid grid disks. The bad disk cannot be replaced until the tasks are completed. MS sends an alert when the disk can be replaced.