3 Maintaining Exadata Storage Servers of Oracle Exadata Racks

This chapter contains the following topics:

Note:

  • For ease of reading, the name "Oracle Exadata Rack" is used when information refers to both Oracle Exadata Database Machine and Oracle Exadata Storage Expansion Rack.

  • All procedures in this chapter are applicable to Oracle Exadata Database Machine and Oracle Exadata Storage Expansion Rack.

3.1 Maintaining Exadata Storage Servers

This section describes how to perform maintenance on Exadata Storage Servers. It contains the following topics:

See Also:

Oracle Exadata Storage Server Software User's Guide for additional information about the Oracle ASM disk repair timer

3.1.1 Shutting Down Exadata Storage Server

When performing maintenance on Exadata Storage Servers, it may be necessary to power down or restart the cell. If Exadata Storage Server is to be shut down when one or more databases are running, then verify that taking Exadata Storage Server offline will not impact Oracle ASM disk group and database availability. The ability to take Exadata Storage Server offline without affecting database availability depends on the level of Oracle ASM redundancy used on the affected disk groups, and on the current status of the disks in other Exadata Storage Servers that hold mirror copies of the data on the Exadata Storage Server to be taken offline.

The following procedure describes how to power down Exadata Storage Server.

  1. (Optional) Run the following command to have the grid disks remain offline after restarting the cell:
    CellCLI> ALTER GRIDDISK ALL INACTIVE
    

    This step is useful if there are multiple restarts, or to control when the cell becomes active again, such as verifying the planned maintenance activity was successful before the cell is used.

    Note:

    If this step is performed, then it is necessary to perform step 6 to activate the grid disks.

  2. Stop the cell services using the following command:
    CellCLI> ALTER CELL SHUTDOWN SERVICES ALL
    

    The preceding command checks if any disks are offline, in predictive failure status, or need to be copied to their mirrors. If Oracle ASM redundancy is intact, then the command takes the grid disks offline in Oracle ASM, and then stops the cell services. If the following error is displayed, then it may not be safe to stop the cell services because a disk group may be forced to dismount due to reduced redundancy.

    Stopping the RS, CELLSRV, and MS services...
    The SHUTDOWN of ALL services was not successful.
    CELL-01548: Unable to shut down CELLSRV because disk group DATA, RECO may be
    forced to dismount due to reduced redundancy.
    Getting the state of CELLSRV services... running
    Getting the state of MS services... running
    Getting the state of RS services... running
    

    If the error occurs, then restore Oracle ASM disk group redundancy and retry the command when disk status is back to normal for all the disks.

  3. Shut down the cell.
  4. After performing the maintenance, power up the cell. The cell services are started automatically. As part of the cell startup, all grid disks are automatically brought ONLINE in Oracle ASM.
  5. Verify that all grid disks have been successfully put online using the following command:
    CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
    

    Wait until asmmodestatus shows ONLINE or UNUSED for all grid disks.

  6. Set the grid disks online using the following command. This step is only necessary when step 1 has been performed. If step 1 was not performed, then the grid disks are activated automatically.
    CellCLI> ALTER GRIDDISK ALL ACTIVE
    

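The check in step 5 can be scripted. The following sketch is illustrative: the helper function and the sample grid disk names are not part of CellCLI. On a real cell, you would pipe the output of `cellcli -e "LIST GRIDDISK ATTRIBUTES name, asmmodestatus"` into the function instead of the sample text.

```shell
# Succeeds (exit 0) only if every grid disk reports ONLINE or UNUSED.
# Reads "name asmmodestatus" pairs, one per line, on standard input.
all_griddisks_online() {
  awk 'NF >= 2 && $2 != "ONLINE" && $2 != "UNUSED" { bad++ }
       END { exit (bad > 0) }'
}

# Illustrative input; a SYNCING disk means the resync is still running.
printf '%s\n' \
  "DATA_CD_00_cell01   ONLINE" \
  "RECO_CD_00_cell01   SYNCING" \
  | all_griddisks_online && echo "all grid disks online" || echo "still waiting"
```

On the cell, the same function can be called in a loop, with a pause between iterations, until it succeeds.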
3.1.2 Dropping Exadata Storage Server

The following procedure describes how to drop Exadata Storage Server:

  1. From Oracle ASM, drop the Oracle ASM disks that reside on Exadata Storage Server using the following command:
    SQL> ALTER DISKGROUP diskgroup_name DROP DISKS IN FAILGROUP failgroup_name;
    

    To ensure correct redundancy level in Oracle ASM, wait for the rebalance to complete before proceeding.

  2. Remove the IP address entry from the cellip.ora file on each database server that accesses Exadata Storage Server.
  3. From Exadata Storage Server, drop the grid disks, cell disks, and cell using a command similar to the following:
    CellCLI> DROP CELLDISK ALL FORCE
    
  4. Shut down all services on Exadata Storage Server.
  5. Power down the cell.
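Step 2 of this procedure amounts to deleting one line per dropped cell on each database server. The following sketch operates on a scratch copy so it can be tried safely; the IP addresses are examples, and on a database server the real file is /etc/oracle/cell/network-config/cellip.ora.

```shell
# Build a scratch copy of cellip.ora with example cell IP addresses.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
cell="192.168.10.3"
cell="192.168.10.4"
cell="192.168.10.5"
EOF

# Delete the entry for the cell being dropped (192.168.10.4 in this
# example), keeping a .bak backup of the original file.
sed -i.bak '/192\.168\.10\.4/d' "$scratch"
cat "$scratch"
```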

3.1.3 Checking Status of a Rebalance Operation

Oracle ASM rebalance occurs when dropping or adding a disk. To check the status of the rebalance, do the following:

  • The rebalance operation may have been successfully run. Check the Oracle ASM alert logs to confirm.

  • The rebalance operation may be currently running. Check the GV$ASM_OPERATION view to determine if the rebalance operation is still running.

  • The rebalance operation may have failed. Check the ERROR column of the V$ASM_OPERATION view to determine whether the rebalance operation failed.

  • Rebalance operations from multiple disk groups can be done on different Oracle ASM instances in the same cluster if the physical disk being replaced contains ASM disks from multiple disk groups. One Oracle ASM instance can run one rebalance operation at a time. If all Oracle ASM instances are busy, then rebalance operations are queued.

Note:

On storage servers running Oracle Exadata Storage Server Software release 12.1.2.0 with Oracle Database release 12.1.0.2 with BP4, Oracle ASM sends an e-mail about the status of a rebalance operation. In earlier releases, the administrator had to check the status of the operation manually.

3.1.4 Understanding Patch Application for Exadata Cells

Software updates are applied to Exadata Storage Servers using the patchmgr utility. The patchmgr utility installs and updates all software and firmware needed on Exadata Storage Servers. The patchmgr utility is included with the software release that is downloaded from My Oracle Support.

Note:

Do not install, change, reconfigure, or remove software on Exadata Storage Servers unless you follow the specific Oracle Exadata instructions provided in this guide, Oracle Exadata Storage Server Software User's Guide, or My Oracle Support. It is not supported to make any change to Exadata Storage Servers unless it is documented in the specific Oracle Exadata documentation.

Note:

Regular Oracle Exadata Cell updates as described in My Oracle Support note 888828.1 can only be applied when the date-stamp of the target release is newer than the date-stamp of the release currently running. Updates from images running a release with a date-stamp newer than the target release will be blocked.

For example: Updating 12.1.1.1.2.150411 to 12.1.2.1.0.141206.1 is blocked because 12.1.1.1.2 (with date-stamp 150411) has a date-stamp newer than 12.1.2.1.0 (with date-stamp 141206), even though Exadata release 12.1.2.1.0 appears to be more recent than 12.1.1.1.2 based on the release number. In situations like this, update to a maintenance release with a newer date-stamp for the same major release.
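The date-stamp comparison can be expressed as a small script. The helper functions below are illustrative, not part of patchmgr; they extract the six-digit YYMMDD date-stamp from a full version string and apply the blocking rule described above.

```shell
# Extract the 6-digit date-stamp (YYMMDD) from a full Exadata version
# string such as 12.1.1.1.2.150411 or 12.1.2.1.0.141206.1.
datestamp() {
  echo "$1" | grep -oE '\.[0-9]{6}(\.|$)' | tr -d '.'
}

# An update is blocked when the currently running release has a newer
# date-stamp than the target release.
update_blocked() {
  current_ds=$(datestamp "$1")
  target_ds=$(datestamp "$2")
  [ "$current_ds" -gt "$target_ds" ]
}

update_blocked 12.1.1.1.2.150411 12.1.2.1.0.141206.1 \
  && echo "blocked" || echo "allowed"
```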

The patchmgr utility performs the software update on all Exadata Cells in the configuration. The utility supports rolling and non-rolling software update methods. The utility can send e-mail messages about patch completion, and the status of rolling and non-rolling patch application. The following is the command to send e-mail messages:

# ./patchmgr -cells cell_group -patch [-rolling] [-ignore_alerts] [-unkey] \
           [-smtp_from "addr" -smtp_to "addr1 addr2 addr3 ..."]

In the preceding command, addr is the sending address, and addr1, addr2 and addr3 are the receiving addresses.

This section describes the methods to apply patches to Exadata Cells, as well as the differences between each method. This section contains the following topics:

3.1.4.1 Understanding Rolling Updates

The rolling update method is also known as the no deployment-wide downtime method.

  • Benefits of this method:

    Does not require any database downtime.

    If there is a problem, then only one cell is affected.

  • Considerations for this method:

    Takes much longer than the non-rolling update. Cells are processed one at a time, one after the other. The minimum time it takes to apply the patch is approximately the number of cells multiplied by the time it takes to patch one cell in non-rolling updates. A single cell update in the non-rolling case takes approximately 1 hour. The time can be significantly longer when done as a rolling update and a significant load is running on the deployment. Such load conditions add to the time spent in activating and re-activating the grid disks on the cells as they are patched.

    Note:

    Oracle recommends that you subscribe to the software update notification when you launch the patchmgr utility. The administrator user will automatically receive an email notification that clearly indicates when the software update is complete. This helps prevent the administrator from manually rebooting the cell too early and interrupting the software update process.

    Oracle ASM repair timeout needs to be increased during the patching to avoid Oracle ASM dropping the grid disks on the cell. Re-adding these grid disks is a time consuming manual operation. Ensure that disk_repair_time on the disk groups is set to a minimum of the default value of 3.6 hours.
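The patch-time consideration above is simple to estimate: because cells are processed one at a time, the floor on elapsed time is the number of cells multiplied by the per-cell patch time. The sketch below uses the approximate one-hour single-cell figure from this section; the cell count is an example, and real durations grow under load.

```shell
# Rough lower bound for a rolling update across a deployment:
# cells are patched one after the other, never in parallel.
cells=14                # example: number of cells in the deployment
minutes_per_cell=60     # approximate single-cell patch time (see text)
total_minutes=$((cells * minutes_per_cell))
echo "Minimum rolling update time: about $((total_minutes / 60)) hours"
```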

Caution:

  • Apply the fixes on Oracle Database 11g database servers as listed in My Oracle Support note 1485475.1 prior to using the -rolling option with the patchmgr utility for rolling update or rollback. The note identifies the database releases that need the fixes.

The following is a high-level description of applying a patch to Exadata Cells using a rolling update. These steps are performed automatically by the patchmgr utility.

  1. Inactivate all the grid disks on one cell that are eligible to be inactivated. The eligible grid disks have their attribute asmdeactivationoutcome set to Yes.

  2. Confirm that the inactivated disks are OFFLINE.

  3. Patch the cell.

  4. After the cell reboots and comes up correctly, activate the grid disks that were inactivated.

  5. Wait for the Oracle ASM resync operation to complete and confirm the grid disks activated in step 4 are now ONLINE.

  6. Move to the next cell to be patched and repeat the steps.

The patchmgr utility performs the preceding steps one cell at a time.
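The six steps above can be sketched as a per-cell loop. Everything below illustrates the control flow only: a stub records each action instead of invoking cellcli or patching anything, and patchmgr performs the real sequence for you.

```shell
# Stub standing in for remote cellcli/patch execution; it only records
# the action so the ordering can be illustrated and checked.
actions=""
run_on_cell() {    # run_on_cell <cell> <action>
  actions="$actions $1:$2"
}

for cell in cell01 cell02; do
  run_on_cell "$cell" inactivate_griddisks   # step 1: eligible grid disks
  run_on_cell "$cell" confirm_offline        # step 2
  run_on_cell "$cell" patch_and_reboot       # step 3
  run_on_cell "$cell" activate_griddisks     # step 4
  run_on_cell "$cell" wait_for_resync        # step 5: confirm ONLINE
done                                         # step 6: next cell
echo "$actions"
```

Note that cell02 is not touched until cell01 has completed its resync, which is the property that keeps redundancy intact during the rolling update.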

3.1.4.2 Understanding Non-Rolling Updates

The non-rolling update method is also known as deployment-wide downtime. The patch is applied to all cells in parallel when you choose to have deployment-wide downtime. The patchmgr utility provides this functionality for Exadata Cells.

  • Benefits of this method:

    All cells are done in parallel. Therefore, the patch time is almost constant no matter how many cells there are to patch.

    There is no time spent trying to inactivate and re-activate grid disks.

  • Considerations for this method:

    No cells are available during the patching process. All databases using the cell services must remain shut down for the duration of the patch.

    When a cell encounters problems during patching, the entire deployment is unavailable until resolution of the problem on that one cell. It is usually possible to bring up the database by starting Oracle Clusterware in exclusive mode and mounting the disk groups in Oracle ASM with the FORCE option followed by dropping of the grid disks in the problematic cell. However, it still extends the downtime for the duration of these steps.

3.1.5 Enabling Network Connectivity with the Diagnostics ISO

The diagnostics ISO may be needed to access a cell that fails to restart, so that the cell can be repaired manually. The ISO is located on every Exadata Cell at /opt/oracle.SupportTools/diagnostics.iso. Use the diagnostics ISO only after other boot methods, such as booting from a USB device, have failed.

The following procedure enables networking with the diagnostics ISO so files can be transferred to repair the cell:

  1. Enable a one-time CD-ROM boot in the service processor using a web interface or serial console. The following is an example of the command for the serial console:
    set boot_device=cdrom
    
  2. Mount a local copy of diagnostics.iso as a CD-ROM using the service processor interface.
  3. Restart the cell using the reboot command.
  4. Log into the cell as the root user with the diagnostics ISO password.
  5. Use the following command to limit each ping to a fixed number of packets, where 3 is the packet count:
    alias ping="ping -c 3"
    
  6. Make a directory named /etc/network.
  7. Make a directory named /etc/network/if-pre-up.d.
  8. Add the following lines to the /etc/network/interfaces file:
    iface eth0 inet static
    address IP_address_of_cell
    netmask netmask_of_cell
    gateway gateway_IP_address_of_cell
    
  9. Bring up the eth0 interface using the following command:
    ifup eth0
     
    

    There may be some warning messages, but the interface is operational.

  10. Use either FTP or the wget command to retrieve the files to repair the cell.
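Steps 6 through 8 reduce to a few commands. The sketch below writes under a scratch directory instead of /etc so it can be tried anywhere; the addresses are placeholders for the cell's real values.

```shell
# Use a scratch root for illustration; on the cell itself, drop the
# $root prefix so the files land under /etc/network.
root=$(mktemp -d)
mkdir -p "$root/etc/network/if-pre-up.d"

# Placeholder addresses; substitute the cell's IP, netmask, and gateway.
cat > "$root/etc/network/interfaces" <<'EOF'
iface eth0 inet static
address 192.0.2.10
netmask 255.255.255.0
gateway 192.0.2.1
EOF

cat "$root/etc/network/interfaces"
```

On the cell, follow this with ifup eth0 as in step 9.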

3.2 Maintaining the Hard Disks of Exadata Storage Servers

Every Exadata Database Machine has a system area, which is where Oracle Exadata Storage Server Software resides. In Exadata Database Machine X7 systems, two internal M.2 devices contain the system area. In all other systems, the first two disks of Exadata Storage Server are system disks, and the portions of these system disks that hold the software are referred to as the system area.

In Exadata Database Machine X7 systems, all the hard disks in the cell are data disks. In systems prior to Exadata Database Machine X7, the non-system area of the system disks, referred to as data partitions, is used for normal data storage. All other disks in the cell are data disks.

Starting in release 11.2.3.2.0, if there is a disk failure, then Oracle Exadata Storage Server Software sends an alert stating that the disk can be replaced, and turns on the blue LED for the hard disk with predictive failure after all data has been rebalanced out from that disk. In Oracle Exadata Storage Server Software releases earlier than 11.2.3.2.0, the amber LED was turned on for a hard disk with predictive failure, but not the blue LED. In these cases, it is necessary to manually check if all data has been rebalanced out from the disk before proceeding with disk replacement.

Starting with Oracle Exadata Storage Server Software release 18.1.0.0.0 and Exadata Database Machine X7 systems, there is an additional Do Not Service LED that is lit when redundancy is reduced, to inform system administrators or field engineers that the storage server should not be powered off for servicing. When redundancy is restored, Oracle Exadata Storage Server Software automatically turns off the Do Not Service LED to indicate that the cell can be powered off for servicing.

For a hard disk that has failed, both the blue LED and the amber LED are turned on for the drive, indicating that disk replacement can proceed. This behavior is the same in all releases. The drive LED is a solid light in releases 11.2.3.2.0 and later, whereas it blinks in earlier releases.

Note:

Oracle Exadata Rack is online and available while replacing the Exadata Storage Server physical disks.

This section contains the following topics:

3.2.1 Monitoring the Status of Hard Disks

You can monitor the status of a hard disk by checking its attributes with the CellCLI LIST PHYSICALDISK command.

For example, a hard disk with a status of failed (in earlier releases, the status for failed hard disks was critical) or warning - predictive failure is probably having problems and needs to be replaced. The disk firmware maintains the error counters, and marks a drive with predictive failure when internal thresholds are exceeded. The drive, not the cell software, determines whether it needs replacement.

  • Use the CellCLI command LIST PHYSICALDISK to determine the status of a hard disk:
    CellCLI> LIST PHYSICALDISK WHERE disktype=harddisk AND status!=normal DETAIL
             name:                   8:4
             deviceId:               12
             deviceName:             /dev/sde
             diskType:               HardDisk
             enclosureDeviceId:      8
             errOtherCount:          0
             luns:                   0_4
             makeModel:              "HGST    H7280A520SUN8.0T"
             physicalFirmware:       PD51
             physicalInsertTime:     2016-11-30T21:24:45-08:00
             physicalInterface:      sas
             physicalSerial:         PA9TVR
             physicalSize:           7.153663907200098T
             slotNumber:             4
             status:                 failed
    

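When scripting around this check, the attribute: value layout of DETAIL output is straightforward to parse. The helper below is illustrative, not a CellCLI feature; the sample values are taken from the listing above.

```shell
# Parse "attribute: value" pairs from cellcli DETAIL output.
# get_attr <attribute> reads the listing on standard input and prints
# the value, including multi-word values such as status strings.
get_attr() {
  awk -v key="$1:" '$1 == key { $1 = ""; sub(/^ +/, ""); print; exit }'
}

# Sample lines from the listing above; on a cell, pipe cellcli output in.
detail='name:       8:4
slotNumber: 4
status:     failed'

slot=$(echo "$detail" | get_attr slotNumber)
status=$(echo "$detail" | get_attr status)
echo "disk in slot $slot has status $status"
```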
When disk I/O errors occur, Oracle ASM performs bad extent repair for read errors due to media errors. The disks will stay online, and no alerts are sent. When Oracle ASM gets a read error on a physically-addressed metadata block, it does not have mirroring for the blocks, and takes the disk offline. Oracle ASM then drops the disk using the FORCE option.

The Exadata Storage Server hard disk statuses are as follows:

  • Oracle Exadata Storage Server Software Release 11.2.3.3 and later:

    • normal

    • normal - dropped for replacement

    • normal - confinedOnline

    • normal - confinedOnline - dropped for replacement

    • not present

    • failed

    • failed - dropped for replacement

    • failed - rejected due to incorrect disk model

    • failed - rejected due to incorrect disk model - dropped for replacement

    • failed - rejected due to wrong slot

    • failed - rejected due to wrong slot - dropped for replacement

    • warning - confinedOnline

    • warning - confinedOnline - dropped for replacement

    • warning - peer failure

    • warning - poor performance

    • warning - poor performance - dropped for replacement

    • warning - poor performance, write-through caching

    • warning - predictive failure, poor performance

    • warning - predictive failure, poor performance - dropped for replacement

    • warning - predictive failure, write-through caching

    • warning - predictive failure

    • warning - predictive failure - dropped for replacement

    • warning - predictive failure, poor performance, write-through caching

    • warning - write-through caching

  • Oracle Exadata Storage Server Software Release 11.2.3.2:

    • normal

    • normal - confinedOnline

    • not present

    • failed

    • failed - rejected due to incorrect disk model

    • failed - rejected due to wrong slot

    • warning - confinedOnline

    • warning - peer failure

    • warning - poor performance

    • warning - poor performance, write-through caching

    • warning - predictive failure, poor performance

    • warning - predictive failure, write-through caching

    • warning - predictive failure

    • warning - predictive failure, poor performance, write-through caching

    • warning - write-through caching

  • Oracle Exadata Storage Server Software Release 11.2.3.1.1 and earlier:

    • normal

    • critical

    • poor performance

    • predictive failure

    • not present

3.2.2 Monitoring Hard Disk Controller Write-through Caching Mode

The hard disk controller on each Exadata Storage Server periodically performs a discharge and charge of the controller battery. During the operation, the write cache policy changes from write-back caching to write-through caching. Write-through cache mode is slower than write-back cache mode. However, write-back cache mode carries a risk of data loss if the Exadata Storage Server loses power or fails. For Oracle Exadata Storage Server Software releases earlier than 11.2.1.3.0, the operation occurs every month. For release 11.2.1.3.0 and later, the operation occurs every three months, for example, at 01:00 on the 17th day of January, April, July, and October.

To change the start time for the learn cycle, use a command similar to the following. The time reverts to the default learn cycle time after the cycle completes.

CellCLI> ALTER CELL bbuLearnCycleTime="2013-01-22T02:00:00-08:00"

To see the time for the next learn cycle, use the following command:

CellCLI> LIST CELL ATTRIBUTES bbuLearnCycleTime

Exadata Storage Server generates an informational alert about the status of the caching mode for logical drives on the cell, similar to the following:

HDD disk controller battery on disk controller at adapter 0 is going into a learn
cycle. This is a normal maintenance activity that occurs quarterly and runs for
approximately 1 to 12 hours. The disk controller cache might go into WriteThrough
caching mode during the learn cycle. Disk write throughput might be temporarily
lower during this time. The message is informational only, no action is required.

Use the following command to view the status of the battery:

# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0

The following is an example of the output of the command:

BBU status for Adapter: 0
 
BatteryType: iBBU08
Voltage: 3721 mV
Current: 541 mA
Temperature: 43 C
 
BBU Firmware Status:
Charging Status : Charging
Voltage : OK
Temperature : OK
Learn Cycle Requested : No
Learn Cycle Active : No
Learn Cycle Status : OK
Learn Cycle Timeout : No
I2c Errors Detected : No
Battery Pack Missing : No
Battery Replacement required : No
Remaining Capacity Low : Yes
Periodic Learn Required : No
Transparent Learn : No
 
Battery state:
 
GasGuageStatus:
Fully Discharged : No
Fully Charged : No
Discharging : No
Initialized : No
Remaining Time Alarm : Yes
Remaining Capacity Alarm: No
Discharge Terminated : No
Over Temperature : No
Charging Terminated : No
Over Charged : No
 
Relative State of Charge: 7 %
Charger System State: 1
Charger System Ctrl: 0
Charging current: 541 mA
Absolute state of charge: 0 %
Max Error: 0 %
 
Exit Code: 0x00
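Individual battery fields can be pulled out of this report in a script. The filter below is illustrative; the sample lines are taken from the output above, and on a live cell you would pipe the MegaCli64 command shown earlier into the same filter.

```shell
# Extract a "Name : Value" field from MegaCli BBU status output.
bbu_field() {
  awk -F':' -v key="$1" \
    '{ gsub(/^[ \t]+|[ \t]+$/, "", $1) }
     $1 == key { gsub(/^[ \t]+/, "", $2); print $2; exit }'
}

# Sample lines from the report above.
sample='Charging Status : Charging
Learn Cycle Active : No
Battery Replacement required : No'

echo "$sample" | bbu_field "Learn Cycle Active"
```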

3.2.3 Replacing a Hard Disk Due to Disk Failure

A hard disk outage can cause a reduction in performance and data redundancy. Therefore, the disk should be replaced with a new disk as soon as possible. When the disk fails, the Oracle ASM disks associated with the grid disks on the hard disk are automatically dropped with the FORCE option, and an Oracle ASM rebalance follows to restore the data redundancy.

An Exadata alert is generated when a disk fails. The alert includes specific instructions for replacing the disk. If you have configured the system for alert notifications, then the alert is sent by e-mail to the designated address.

After the hard disk is replaced, the grid disks and cell disks that existed on the previous disk in that slot are re-created on the new hard disk. If those grid disks were part of an Oracle ASM disk group, then they are added back to the disk group, and the data is rebalanced on them based on the disk group redundancy and the ASM_POWER_LIMIT parameter.

Note:

On storage servers running Oracle Exadata Storage Server Software release 12.1.2.0 with Oracle Database release 12.1.0.2 with BP4, Oracle ASM sends an e-mail about the status of a rebalance operation. In earlier releases, the administrator had to check the status of the operation manually.

For earlier releases, check the rebalance operation status as described in "Checking Status of a Rebalance Operation."

The following procedure describes how to replace a hard disk due to disk failure:

  1. Determine the failed disk using the following command:
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status=failed DETAIL
    

    The following is an example of the output from the command. The slot number shows the location of the disk, and the status shows that the disk has failed.

    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status=failed DETAIL
    
             name:                   28:5
             deviceId:               21
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_5
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         A01BC2
             physicalSize:           558.9109999993816G
             slotNumber:             5
             status:                 failed
    
  2. Ensure the blue OK to Remove LED on the disk is lit before removing the disk.
  3. Replace the hard disk on Exadata Storage Server and wait for three minutes. The hard disk is hot-pluggable, and can be replaced when the power is on.
  4. Confirm the disk is online.

    When you replace a hard disk, the disk must be acknowledged by the RAID controller before you can use it. This does not take long. Use a LIST PHYSICALDISK command similar to the following to ensure the status is NORMAL.

    CellCLI> LIST PHYSICALDISK WHERE name=28:5 ATTRIBUTES status
    
  5. Verify the firmware is correct using the ALTER CELL VALIDATE CONFIGURATION command.

In rare cases, the automatic firmware update may not work, and the LUN is not rebuilt. This can be confirmed by checking the ms-odl.trc file.
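The status check in step 4 can be wrapped in a poll loop. In the sketch below, a stub stands in for the cellcli query so the loop can be illustrated off-cell; the three-query acknowledgement delay is invented for the illustration.

```shell
# Stub standing in for:
#   cellcli -e "LIST PHYSICALDISK WHERE name=28:5 ATTRIBUTES status"
# It reports "normal" from the third query onward, simulating the RAID
# controller acknowledging the replacement disk.
queries=0
query_status() {
  queries=$((queries + 1))
  if [ "$queries" -ge 3 ]; then status="normal"; else status="not present"; fi
}

status=""
until [ "$status" = "normal" ]; do
  query_status
  # sleep 10   # pause between polls on a real cell
done
echo "disk reported normal after $queries queries"
```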

3.2.4 Replacing a Hard Disk Due to Disk Problems

You may need to replace a hard disk because the disk is in warning - predictive failure status.

The predictive failure status indicates that the hard disk will soon fail, and should be replaced at the earliest opportunity. The Oracle ASM disks associated with the grid disks on the hard drive are automatically dropped, and an Oracle ASM rebalance relocates the data from the predictively failed disk to other disks.

If the drop operation did not complete before the hard disk failed, then see "Replacing a Hard Disk Due to Disk Failure".

An alert is sent when the disk is removed. After replacing the hard disk, the grid disks and cell disks that existed on the previous disk in the slot are re-created on the new hard disk. If those grid disks were part of an Oracle ASM disk group, then they are added back to the disk group, and the data is rebalanced based on disk group redundancy and the ASM_POWER_LIMIT parameter.

Note:

On Exadata Storage Servers running Exadata Storage Server Software release 12.1.2.0 with Oracle Database release 12.1.0.2 with BP4, Oracle ASM sends an e-mail about the status of a rebalance operation. In earlier releases, the administrator had to check the status of the operation.

For earlier releases, check the rebalance operation status as described in "Checking Status of a Rebalance Operation."

  1. Determine which disk is the failing disk.
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status= \
            "warning - predictive failure" DETAIL
    

    The following is an example of the output. The slot number shows the location of the disk, and the status shows the disk is expected to fail.

    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status= \
             "warning - predictive failure" DETAIL
             name:                   28:3
             deviceId:               19
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_3
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         E07L8E
             physicalSize:           558.9109999993816G
             slotNumber:             3
             status:                 warning - predictive failure
    
  2. Ensure the blue OK to Remove LED on the disk is lit before removing the disk.
  3. Wait until the Oracle ASM disks associated with the grid disks on the hard disk have been successfully dropped. To determine if the grid disks have been dropped, query the V$ASM_DISK_STAT view on the Oracle ASM instance.

    Caution:

    The disks in the first two slots are system disks, which store the operating system and Oracle Exadata Storage Server Software. One system disk must be in working condition to keep the server operational.

    Wait until ALTER CELL VALIDATE CONFIGURATION shows no mdadm errors, which indicates the system disk resync has completed, before replacing the other system disk.

  4. Replace the hard disk on Exadata Storage Server and wait for three minutes. The hard disk is hot-pluggable, and can be replaced when the power is on.
  5. Confirm the disk is online.

    When you replace a hard disk, the disk must be acknowledged by the RAID controller before you can use it. This does not take long. Use the LIST PHYSICALDISK command to ensure the status is NORMAL.

    CellCLI> LIST PHYSICALDISK WHERE name=28:3 ATTRIBUTES status
    
  6. Verify the firmware is correct using the ALTER CELL VALIDATE CONFIGURATION command.

3.2.5 Replacing a Hard Disk Due to Bad Performance

A single bad hard disk can degrade the performance of other good disks. It is better to remove the bad disk from the system than let it remain. Starting with Exadata Storage Server Software release 11.2.3.2, an underperforming disk is automatically identified and removed from active configuration. Oracle Exadata Database Machine then runs a set of performance tests. When poor disk performance is detected by CELLSRV, the cell disk status changes to normal - confinedOnline, and the hard disk status changes to warning - confinedOnline.

The following conditions trigger disk confinement:

  • Disk stopped responding. The cause code in the storage alert log is CD_PERF_HANG.

  • Slow cell disk such as the following:

    • High service time threshold (cause code CD_PERF_SLOW_ABS)

    • High relative service time threshold (cause code CD_PERF_SLOW_RLTV)

  • High read or write latency such as the following:

    • High latency on writes (cause code CD_PERF_SLOW_LAT_WT)

    • High latency on reads (cause code CD_PERF_SLOW_LAT_RD)

    • High latency on reads and writes (cause code CD_PERF_SLOW_LAT_RW)

    • Very high absolute latency on individual I/Os happening frequently (cause code CD_PERF_SLOW_LAT_ERR)

  • Errors such as I/O errors (cause code CD_PERF_IOERR).

If the disk problem is temporary and passes the tests, then it is brought back into the configuration. If the disk does not pass the tests, then it is marked as poor performance, and ASR submits a service request to replace the disk. If possible, Oracle ASM takes the grid disks offline for testing. If Oracle ASM cannot take the disks offline, then the cell disk status stays at normal - confinedOnline until the disks can be taken offline safely.

The disk status change is associated with the following entry in the cell alert history:

MESSAGE ID date_time info "Hard disk entered confinement status. The LUN
 n_m changed status to warning - confinedOnline. CellDisk changed status to normal
 - confinedOnline. Status: WARNING - CONFINEDONLINE  Manufacturer: name  Model
 Number: model  Size: size  Serial Number: serial_number  Firmware: fw_release 
 Slot Number: m  Cell Disk: cell_disk_name  Grid Disk: grid disk 1, grid disk 2
 ... Reason for confinement: threshold for service time exceeded"

The following would be logged in the storage cell alert log:

CDHS: Mark cd health state change cell_disk_name  with newState HEALTH_BAD_
ONLINE pending HEALTH_BAD_ONLINE ongoing INVALID cur HEALTH_GOOD
Celldisk entering CONFINE ACTIVE state with cause CD_PERF_SLOW_ABS activeForced: 0
inactiveForced: 0 trigger HistoryFail: 0, forceTestOutcome: 0 testFail: 0
global conf related state: numHDsConf: 1 numFDsConf: 0 numHDsHung: 0 numFDsHung: 0
...

Note:

In releases earlier than Exadata Storage Server Software release 11.2.3.2, use the CALIBRATE command to identify a bad hard disk, and look for very low throughput and IOPS for each hard disk.

The following procedure describes how to remove a hard disk once the bad disk has been identified:

  1. Illuminate the hard drive service LED to identify the drive to be replaced using a command similar to the following, where disk_name is the name of the hard disk to be replaced, such as 20:2:
    cellcli -e 'alter physicaldisk disk_name serviceled on'
    
  2. Find all the grid disks on the bad disk.
    For example:
    [root@exa05celadm03 ~]# cellcli -e "list physicaldisk 20:11 attributes name, id"
            20:11 RD58EA 
    [root@exa05celadm03 ~]# cellcli -e "list celldisk where physicalDisk='RD58EA'"
            CD_11_exa05celadm03 normal 
    [root@exa05celadm03 ~]# cellcli -e "list griddisk where cellDisk='CD_11_exa05celadm03'"
            DATA_CD_11_exa05celadm03 active
            DBFS_CD_11_exa05celadm03 active
            RECO_CD_11_exa05celadm03 active
            TPCH_CD_11_exa05celadm03 active
    
  3. Direct Oracle ASM to stop using the bad disk immediately.
    SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name;
    
  4. Ensure the blue OK to Remove LED on the disk is lit before removing the disk.
  5. Ensure that the Oracle ASM disks associated with the grid disks on the bad disk have been successfully dropped by querying the V$ASM_DISK_STAT view.
  6. Remove the badly-performing disk. An alert is sent when the disk is removed.
  7. When a new disk is available, install the new disk in the system. The cell disks and grid disks are automatically created on the new hard disk.

    Note:

    When a hard disk is replaced, the disk must be acknowledged by the RAID controller before it can be used. The acknowledgement does not take long, but use the LIST PHYSICALDISK command to ensure the status is NORMAL.
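The note above recommends confirming the new disk reaches NORMAL status with the LIST PHYSICALDISK command. Purely as an illustration (Python is not part of the Exadata toolset, and this helper is hypothetical), the following sketch parses the command's three-column text output into a name-to-status map, which makes it easy to poll until the replacement slot reports normal:

```python
def parse_physicaldisk_list(output):
    """Parse `cellcli -e LIST PHYSICALDISK` output into {name: status}.

    Each line looks like:   20:0   D174LX   normal
    or:                     FLASH_5_3   5L002X4P   failed
    The status may span several tokens (e.g. "warning - peer failure")."""
    disks = {}
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) >= 3:
            disks[tokens[0]] = " ".join(tokens[2:])
    return disks
```

A wrapper script could run the CellCLI command over SSH and re-check every few seconds until the replaced slot shows normal.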

See Also:

3.2.6 Replacing a Hard Disk Proactively

Oracle Exadata Storage Server Software provides a complete set of automated operations for hard disk maintenance when a hard disk has failed or has been flagged as a problematic disk. However, there are situations where a hard disk must be proactively removed from the configuration.

In the CellCLI ALTER PHYSICALDISK command, the drop for replacement option checks whether a normally functioning hard disk can be removed safely, without the risk of data loss. After the command executes, the grid disks on the hard disk are inactivated on the storage cell and set to offline in the Oracle ASM disk groups.

The redundancy of the disk group is compromised until the hard disk has been replaced or re-enabled, and the subsequent rebalance completes. This is especially important for disk groups using normal redundancy.

To reduce the risk of having a disk group without full redundancy and proactively replace a hard disk, follow this procedure:

  1. Identify the LUN, celldisk, and grid disk associated with the hard disk.

    Use a command similar to the following, where X:Y identifies the hard disk name of the drive you are replacing.

    # cellcli -e "list diskmap" | grep 'X:Y'
    

    The output should be similar to the following:

       20:5            KEBTDJ          5                       normal  559G           
        CD_05_exaceladm01    /dev/sdf                
        "DATAC1_CD_05_exaceladm01, DBFS_DG_CD_05_exaceladm01, 
         RECOC1_CD_05_exaceladm01"
    

    To get the LUN, issue a command similar to the following:

    CellCLI> list lun where deviceName='/dev/sdf'
             0_5     0_5     normal
    
  2. Drop the grid disk from the Oracle ASM disk groups in normal mode.
    SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name;
    
  3. Wait for the rebalance to complete.
  4. Execute the drop for replacement.

    Use a command similar to the following, where X:Y identifies the hard disk name of the drive you are replacing.

    CellCLI> alter physicaldisk X:Y drop for replacement
    
  5. Ensure the blue OK to Remove LED on the disk is lit before removing the disk.
  6. Install the new hard disk.
  7. Verify the LUN, celldisk and grid disk associated with the hard disk were created.
    CellCLI> list lun lun_name
    CellCLI> list celldisk where lun=lun_name
    CellCLI> list griddisk where celldisk=celldisk_name
    
  8. Verify the grid disk was added to the Oracle ASM disk groups.

    The following query should return no rows.

    SQL> SELECT path,header_status FROM v$asm_disk WHERE group_number=0;
    

    The following query shows whether all the failgroups have the same number of disks:

    SQL> SELECT group_number, failgroup, mode_status, count(*) FROM v$asm_disk
         GROUP BY group_number, failgroup, mode_status;
    

3.2.7 Moving All Drives to Another Exadata Storage Server

It may be necessary to move all drives from one Exadata Storage Server to another Exadata Storage Server.

This need may occur when there is a chassis-level component failure, such as a motherboard or ILOM failure, or when troubleshooting a hardware problem.

  1. Back up the files in the following directories:
    • /etc/hosts

    • /etc/modprobe.conf

    • /etc/sysconfig/network

    • /etc/sysconfig/network-scripts

  2. Safely inactivate all grid disks and shut down Exadata Storage Server.

    Refer to "Shutting Down Exadata Storage Server". Make sure the Oracle ASM disk_repair_time attribute is set to a sufficiently large value so that Oracle ASM does not drop the disks before the grid disks can be activated on another Exadata Storage Server.

  3. Move the hard disks, flash disks, disk controller, and USB flash drive from the original Exadata Storage Server to the new Exadata Storage Server.

    Caution:

    • Ensure the first two disks, which are the system disks, are in the same first two slots. Failure to do so causes the Exadata Storage Server to function improperly.

    • Ensure the flash cards are installed in the same PCIe slots as the original Exadata Storage Server.

  4. Power on the new Exadata Storage Server using either the service processor interface or by pressing the power button.
  5. Log in to the console using the service processor or the KVM switch.
  6. Check the files in the following directories. If they are corrupted, then restore them from the backups.
    • /etc/hosts

    • /etc/modprobe.conf

    • /etc/sysconfig/network

    • /etc/sysconfig/network-scripts

  7. Use the ifconfig command to retrieve the new MAC address for eth0, eth1, eth2, and eth3. For example:
    # ifconfig eth0
    eth0      Link encap:Ethernet  HWaddr 00:14:4F:CA:D9:AE
              inet addr:10.204.74.184  Bcast:10.204.75.255  Mask:255.255.252.0
              inet6 addr: fe80::214:4fff:feca:d9ae/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:141455 errors:0 dropped:0 overruns:0 frame:0
              TX packets:6340 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:9578692 (9.1 MiB)  TX bytes:1042156 (1017.7 KiB)
              Memory:f8c60000-f8c80000
    
  8. Edit the ifcfg-eth0 file, ifcfg-eth1 file, ifcfg-eth2 file, and ifcfg-eth3 file in the /etc/sysconfig/network-scripts directory to change the HWADDR value based on the output from step 7. The following is an example of the ifcfg-eth0 file:
    #### DO NOT REMOVE THESE LINES ####
    #### %GENERATED BY CELL% ####
    DEVICE=eth0
    BOOTPROTO=static
    ONBOOT=yes
    IPADDR=10.204.74.184
    NETMASK=255.255.252.0
    NETWORK=10.204.72.0
    BROADCAST=10.204.75.255
    GATEWAY=10.204.72.1
    HOTPLUG=no
    IPV6INIT=no
    HWADDR=00:14:4F:CA:D9:AE
    
  9. Restart Exadata Storage Server.
  10. Activate the grid disks using the following command:
    CellCLI> ALTER GRIDDISK ALL ACTIVE
    

    If the Oracle ASM disks on the cell have not been dropped, then they change to ONLINE automatically and begin to be used.

  11. Validate the configuration using the following command:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    
  12. Activate the ILOM for ASR.
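Steps 7 and 8 of the procedure above pair each interface's new MAC address with the HWADDR line of its ifcfg file. As a hedged illustration (the function names are invented, and in practice you would edit the files under /etc/sysconfig/network-scripts directly), the following Python sketch shows one way to extract the MAC from ifconfig output and rewrite an ifcfg file body:

```python
import re

def mac_from_ifconfig(ifconfig_output):
    """Extract the HWaddr field from `ifconfig ethN` output (step 7)."""
    match = re.search(r"HWaddr\s+([0-9A-Fa-f:]{17})", ifconfig_output)
    return match.group(1) if match else None

def update_hwaddr(ifcfg_text, new_mac):
    """Rewrite the HWADDR line of an ifcfg-ethN file body with the new
    server's MAC address (step 8)."""
    return re.sub(r"(?m)^HWADDR=.*$", "HWADDR=" + new_mac.upper(), ifcfg_text)
```

Applying the same two calls for eth0 through eth3 covers all four files mentioned in step 8.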

3.2.8 Repurposing a Hard Disk

You may want to delete all data on a disk, and then use the disk for another purpose.

Before repurposing a hard disk, ensure that you have copies of the data that is on the disk.

If you use this procedure for the system disks (disk 0 and disk 1), then only the data partitions are erased, not the system partitions.

  1. Use the CellCLI LIST command to display the Exadata Storage Server objects. You must identify the grid disks and cell disks on the hard drive. For example:
    CellCLI> LIST PHYSICALDISK
             20:0   D174LX    normal
             20:1   D149R0    normal
             ...
    
  2. Determine the cell disks and grid disks on the LUN, using a command similar to the following:
    CellCLI> LIST LUN WHERE physicalDrives='20:0' DETAIL
      name:              0_0
      deviceName:        /dev/sda
      diskType:          HardDisk
      id:                0_0
      isSystemLun:       TRUE
      lunSize:           557.861328125G
      lunUID:            0_0
      physicalDrives:    20:0
      raidLevel:         0
      lunWriteCacheMode: "WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU"
      status:            normal
    

    To get the celldisks and grid disks, use a command similar to the following:

    # cellcli -e "list diskmap" | grep 20:0
    
       20:0            K68DWJ          0                       normal  559G
       CD_00_burd01celadm01    /dev/sda3   
       "DATAC1_CD_00_burd01celadm01, RECOC1_CD_00_burd01celadm01"
    
  3. From Oracle ASM, drop the Oracle ASM disks on the hard disk using the following command:
    SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name;
    
  4. From Exadata Storage Server, drop the cell disks and grid disks on the hard disk using the following command:
    CellCLI> DROP CELLDISK celldisk_on_this_lun FORCE 
    

    Note:

    To overwrite all data on the cell disk, use the ERASE option with the DROP CELLDISK command. The following is an example of the command:

    CellCLI> DROP CELLDISK CD_03_cell01 ERASE=1pass NOWAIT
    
    CellDisk CD_03_cell01 erase is in progress
    
  5. Drop the drive for hot removal. For example:
    CellCLI> ALTER PHYSICALDISK 20:0 DROP FOR REPLACEMENT
    
  6. Ensure the blue OK to Remove LED on the disk is lit before removing the disk.

    Caution:

    Ensure the blue LED on the disk is lit before removing the drive. Removing a drive whose blue LED is unlit may cause your system to crash.

  7. Remove the disk to be repurposed, and insert a new disk.
  8. Wait for the new hard disk to be added as a LUN.
    CellCLI> LIST LUN
    

    The cell disks and grid disks are automatically created on the new hard disk, and the grid disks are added to the Oracle ASM disk group.
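The `list diskmap` output shown in step 2 packs the grid disk names into a quoted, comma-separated list at the end of the record. Purely as an illustration (this helper is hypothetical, not an Oracle-supplied tool), a small Python sketch can pull the names out for scripted checks:

```python
import re

def griddisks_from_diskmap(diskmap_text):
    """Extract the grid disk names from the quoted list in the three-line
    `list diskmap` record shown in step 2 above."""
    match = re.search(r'"([^"]+)"', diskmap_text)
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",")]
```

The returned names are the grid disks whose Oracle ASM disks must be dropped in step 3.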

3.2.9 Removing and Replacing the Same Hard Disk

What happens if you accidentally remove the wrong hard disk?

If you inadvertently remove the wrong hard disk, then put the disk back. It will automatically be added back in the Oracle ASM disk group, and its data is resynchronized.

Note:

When replacing a disk due to disk failure or disk problems, the LED on the disk is lit for identification.

3.2.10 Re-Enabling a Hard Disk That Was Rejected

If a physical disk was rejected because it was inserted into the wrong slot, you can re-enable the disk.

Run the following command:

Caution:

The following command removes all data on the physical disk.

CellCLI> ALTER PHYSICALDISK hard_disk_name reenable force

The following is an example of the output from the command:

Physical disk 20:0 was reenabled.

3.3 Maintaining Flash Disks on Exadata Storage Servers

Data is mirrored across Exadata Cells, and write operations are sent to at least two storage cells. If a flash card in one Exadata Storage Server has problems, then the read and write operations are serviced by the mirrored data in another Exadata Storage Server. No interruption of service occurs for the application.

If a flash card fails, then Oracle Exadata Storage Server Software determines the data in the flash cache by reading the data from the surviving mirror. The data is then written to the cell that had the failed flash card. The location of the data lost in the failed flash cache is saved by Oracle Exadata Storage Server Software at the time of the flash failure. Resilvering then starts by replacing the lost data with the mirrored copy. During resilvering, the grid disk status is ACTIVE -- RESILVERING WORKING.

3.3.1 Replacing a Flash Disk Due to Flash Disk Failure

Each Exadata Storage Server is equipped with flash devices.

Starting with Exadata Database Machine X7, the flash devices are hot pluggable on both Extreme Flash (EF) and High Capacity (HC) cells. When performing a hot-pluggable replacement of a flash device on Exadata Database Machine X7, the disk status should be Dropped for replacement, and the power LED on the flash card should be off, which indicates the flash disk is ready for online replacement.

Caution:

Removing a card with power LED on could result in a system crash. If a failed disk has a status of "Failed – dropped for replacement" but the power LED is still on, contact Oracle Support.

For Exadata Database Machine X6 and earlier, the flash devices are hot-pluggable on EF cells, but not on HC cells. On HC cells, you must power down the cell before replacing the flash devices.

To identify a failed flash disk, use the following command:

CellCLI> LIST PHYSICALDISK WHERE disktype=flashdisk AND status=failed DETAIL

The following is an example of the output from an Extreme Flash cell:

         name:                   NVME_10
         deviceName:             /dev/nvme7n1
         diskType:               FlashDisk
         luns:                   0_10
         makeModel:              "Oracle NVMe SSD"
         physicalFirmware:       8DV1RA13
         physicalInsertTime:     2016-09-28T11:29:13-07:00
         physicalSerial:         CVMD426500E21P6LGN
         physicalSize:           1.4554837569594383T
         slotNumber:             10
         status:                 failed

The following is an example of the output from an Oracle Flash Accelerator F160 PCIe Card:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS=failed DETAIL

         name:                   FLASH_5_1
         deviceName:             /dev/nvme1n1
         diskType:               FlashDisk
         luns:                   5_1
         makeModel:              "Oracle Flash Accelerator F160 PCIe Card"
         physicalFirmware:       8DV1RA13
         physicalInsertTime:     2016-11-30T21:24:45-08:00
         physicalSerial:         1030M03UYM
         physicalSize:           1.4554837569594383T
         slotNumber:             "PCI Slot: 5; FDOM: 1"
         status:                 failed

The following is an example of the output from a Sun Flash Accelerator F40 PCIe card:

         name:                   FLASH_5_3
         diskType:               FlashDisk
         luns:                   5_3
         makeModel:              "Sun Flash Accelerator F40 PCIe Card"
         physicalFirmware:       TI35
         physicalInsertTime:     2012-07-13T15:40:59-07:00
         physicalSerial:         5L002X4P
         physicalSize:           93.13225793838501G
         slotNumber:             "PCI Slot: 5; FDOM: 3"
         status:                 failed

For the PCIe cards, the name and slotNumber attributes show the PCI slot and the FDOM number. For Extreme Flash cells, the slotNumber attribute shows the NVMe slot on the front panel.

On Exadata Database Machine X7 systems, all flash disks are in the form of an Add-in-Card (AIC), which is inserted into a PCIe slot on the motherboard. On X7 systems, the slotNumber attribute shows the PCI number and FDOM number, regardless of whether it is an EF or HC cell.
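The DETAIL listings above are simple attribute/value pairs, but values such as slotNumber contain colons of their own. The following sketch of a parser splits only on the first colon; it is an illustrative helper (not part of any Oracle toolset), assuming you have captured the CellCLI text output:

```python
def parse_detail(output):
    """Parse `LIST PHYSICALDISK ... DETAIL` output into an attribute dict.
    Each line is 'attribute:   value'; values may themselves contain colons
    (e.g. slotNumber: "PCI Slot: 5; FDOM: 1"), so split on the first only."""
    attrs = {}
    for line in output.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            attrs[key.strip()] = value.strip()
    return attrs
```

From the resulting dict, the slotNumber and name attributes identify the FRU to replace.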

If a flash disk is detected to have failed, then an alert is generated indicating that the flash disk, as well as the LUN on it, has failed. The alert message includes either the PCI slot number and FDOM number or the NVMe slot number. These numbers uniquely identify the field replaceable unit (FRU). If you have configured the system for alert notification, then an alert is sent by e-mail message to the designated address.

A flash disk outage can cause reduction in performance and data redundancy. The failed disk should be replaced with a new flash disk at the earliest opportunity. If the flash disk is used for flash cache, then the effective cache size for the cell is reduced. If the flash disk is used for flash log, then flash log is disabled on the disk thus reducing the effective flash log size. If the flash disk is used for grid disks, then the Oracle ASM disks associated with these grid disks are automatically dropped with the FORCE option from the Oracle ASM disk group, and an Oracle ASM rebalance starts to restore the data redundancy.

The following procedure describes how to replace an FDOM due to disk failure on High Capacity cells that do not support online flash replacement. Replacing an NVMe drive on Extreme Flash cells is the same as replacing a physical disk: you can just remove the NVMe drive from the front panel and insert a new one. You do not need to shut down the cell.

  1. Shut down the cell. See "Shutting Down Exadata Storage Server"
  2. Replace the failed flash disk based on the PCI number and FDOM number. A white cell LED is lit to help locate the affected cell.
  3. Power up the cell. The cell services are started automatically. As part of the cell startup, all grid disks are automatically ONLINE in Oracle ASM.
  4. Verify that all grid disks have been successfully put online using the following command:
    CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
             data_CD_00_testceladm10     ONLINE
             data_CD_01_testceladm10     ONLINE
             data_CD_02_testceladm10     ONLINE
             ...
    

    Wait until asmmodestatus shows ONLINE or UNUSED for all grid disks.
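The wait condition in step 4 can be scripted. The following sketch (a hypothetical helper, shown for illustration) scans LIST GRIDDISK ATTRIBUTES name, asmmodestatus output and reports whether every grid disk is back to ONLINE or UNUSED:

```python
def safe_to_proceed(griddisk_output):
    """True when every grid disk in `LIST GRIDDISK ATTRIBUTES name,
    asmmodestatus` output reports ONLINE or UNUSED; False while any disk
    is still syncing or offline."""
    for line in griddisk_output.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[1] not in ("ONLINE", "UNUSED"):
            return False
    return True
```

A wrapper could re-run the CellCLI command every few seconds until this check passes.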

The new flash disk is automatically used by the system. If the flash disk is used for flash cache, then the effective cache size increases. If the flash disk is used for grid disks, then the grid disks are re-created on the new flash disk. If those grid disks were part of an Oracle ASM disk group, then they are added back to the disk group, and the data is rebalanced on them based on the disk group redundancy and ASM_POWER_LIMIT parameter.

See Also:

3.3.2 Replacing a Flash Disk Due to Flash Disk Problems

Exadata Storage Server is equipped with four PCIe cards. Each card has four flash disks (FDOMs) for a total of 16 flash disks. The four PCIe cards are present on PCI slot numbers 1, 2, 4, and 5. Starting with Exadata Database Machine X7, you can replace the PCIe cards without powering down the cell. See Performing a Hot Pluggable Replacement of a Flash Disk.

In Exadata Database Machine X6 and earlier systems, the PCIe cards are not hot-pluggable. Exadata Storage Server must be powered down before replacing the flash disks or cards.

You may need to replace a flash disk because the disk has one of the following statuses:

  • warning - predictive failure

  • warning - poor performance

  • warning - write-through caching

  • warning - peer failure

Note:

For releases earlier than release 11.2.3.2.2, the status is not present.

Flash disk predictive failure status indicates that the flash disk will fail soon, and should be replaced at the earliest opportunity. If the flash disk is used for flash cache, then it continues to be used as flash cache. If the flash disk is used for grid disks, then the Oracle ASM disks associated with these grid disks are automatically dropped, and Oracle ASM rebalance relocates the data from the predictively failed disk to other disks.

Flash disk poor performance status indicates that the flash disk demonstrates extremely poor performance, and should be replaced at the earliest opportunity. Starting with release 11.2.3.2, an underperforming disk is automatically identified and removed from active configuration. If the flash disk is used for flash cache, then flash cache is dropped from this disk thus reducing the effective flash cache size for Exadata Storage Server. If the flash disk is used for grid disks, then the Oracle ASM disks associated with the grid disks on this flash disk are automatically dropped with FORCE option, if possible. If DROP...FORCE cannot succeed due to offline partners, then the grid disks are automatically dropped normally, and Oracle ASM rebalance relocates the data from the poor performance disk to other disks.

When CELLSRV detects poor disk performance, the cell disk status changes to normal - confinedOnline, and the physical disk status changes to warning - confinedOnline. Oracle Exadata Database Machine then runs a set of performance tests on the confined disk. The following conditions trigger disk confinement:

  • Disk stopped responding. The cause code in the storage alert log is CD_PERF_HANG.

  • Slow cell disk such as the following:

    • High service time threshold (cause code CD_PERF_SLOW_ABS)

    • High relative service time threshold (cause code CD_PERF_SLOW_RLTV)

  • High read or write latency such as the following:

    • High latency on writes (cause code CD_PERF_SLOW_LAT_WT)

    • High latency on reads (cause code CD_PERF_SLOW_LAT_RD)

    • High latency on reads and writes (cause code CD_PERF_SLOW_LAT_RW)

    • Very high absolute latency on individual I/Os happening frequently (cause code CD_PERF_SLOW_LAT_ERR)

  • Errors such as I/O errors (cause code CD_PERF_IOERR).

If the disk problem is temporary and passes the tests, then it is brought back into the configuration. If the disk does not pass the tests, then it is marked as poor performance, and ASR submits a service request to replace the disk. If possible, Oracle ASM takes the grid disks offline for testing. If Oracle ASM cannot take the disks offline, then the cell disk status stays at normal - confinedOnline until the disks can be taken offline safely.

The disk status change is associated with the following entry in the cell alert history:

MESSAGE ID date_time info "Hard disk entered confinement status. The LUN
 n_m changed status to warning - confinedOnline. CellDisk changed status to normal
 - confinedOnline. Status: WARNING - CONFINEDONLINE  Manufacturer: name  Model
 Number: model  Size: size  Serial Number: serial_number  Firmware: fw_release 
 Slot Number: m  Cell Disk: cell_disk_name  Grid Disk: grid disk 1, grid disk 2
 ... Reason for confinement: threshold for service time exceeded"

The following would be logged in the storage cell alert log:

CDHS: Mark cd health state change cell_disk_name  with newState HEALTH_BAD_
ONLINE pending HEALTH_BAD_ONLINE ongoing INVALID cur HEALTH_GOOD
Celldisk entering CONFINE ACTIVE state with cause CD_PERF_SLOW_ABS activeForced: 0
inactiveForced: 0 trigger HistoryFail: 0, forceTestOutcome: 0 testFail: 0
global conf related state: numHDsConf: 1 numFDsConf: 0 numHDsHung: 0 numFDsHung: 0
...

Note:

In releases earlier than release 11.2.3.2, use the CALIBRATE command to identify a bad flash disk, and look for very low throughput and IOPS for each flash disk.

If a flash disk exhibits extremely poor performance, then it is marked as poor performance. The flash cache on that flash disk is automatically disabled, and the grid disks on that flash disk are automatically dropped from the Oracle ASM disk group.

Flash disk write-through caching status indicates the capacitors used to support data cache on the PCIe card have failed, and the card should be replaced as soon as possible.

Flash disk peer failure status indicates one of the flash disks on the same Sun Flash Accelerator PCIe card has failed or has a problem. For example, if FLASH_5_3 fails, then FLASH_5_0, FLASH_5_1, and FLASH_5_2 have peer failure status. The following is an example:

CellCLI> LIST PHYSICALDISK
         36:0            L45F3A          normal
         36:1            L45WAE          normal
         36:2            L45WQW          normal
...
         FLASH_5_0       5L0034XM        warning - peer failure
         FLASH_5_1       5L0034JE        warning - peer failure
         FLASH_5_2       5L002WJH        warning - peer failure
         FLASH_5_3       5L002X4P        failed

When CELLSRV detects a predictive or peer failure in any flash disk used for write back flash cache, and only one FDOM is bad, the data on the bad FDOM is resilvered, and the data on the other three FDOMs is flushed. CELLSRV then initiates an Oracle ASM rebalance for the disks if there are valid grid disks. The bad disk cannot be replaced until these tasks are complete. MS sends an alert when the disk can be replaced.

On X7 systems, each flash card on both High Capacity and Extreme Flash cells is a field-replaceable unit (FRU). The flash cards are also hot-pluggable, so you do not have to shut down the cell before removing the flash card.

On X5 and X6 systems, each flash card on High Capacity and each flash drive on Extreme Flash are FRUs. This means that there is no peer failure for these systems.

On X3 and X4 systems, because the flash card itself is a FRU, if any FDOM fails, the cell software automatically puts the rest of the FDOMs on that card into peer failure status so that the data can be moved off in preparation for the flash card replacement.

On V2 and X2 systems, each FDOM is a FRU. There is no peer failure for flash for these systems.

When a flash disk has a predictive failure, the data on it is copied. If the flash disk is used for grid disks, then Oracle ASM re-partners the associated partner disks and performs a rebalance. If the flash disk is used for write back flash cache, then the data is flushed from the flash disk to the grid disks.

To identify a predictive failure flash disk, use the following command:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS=  \
'warning - predictive failure' DETAIL

         name:               FLASH_1_1
         deviceName:         /dev/nvme3n1
         diskType:           FlashDisk
         luns:               1_1
         makeModel:          "Oracle Flash Accelerator F160 PCIe Card"
         physicalFirmware:   8DV1RA13
         physicalInsertTime: 2016-11-30T21:24:45-08:00
         physicalSerial:     CVMD519000251P6KGN
         physicalSize:       1.4554837569594383T
         slotNumber:         "PCI Slot: 1; FDOM: 1"
         status:             warning - predictive failure

To identify a poor performance flash disk, use the following command:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS= \
'warning - poor performance' DETAIL

         name:                FLASH_1_4
         diskType:            FlashDisk
         luns:                1_4
         makeModel:           "Sun Flash Accelerator F20 PCIe Card"
         physicalFirmware:    D20Y
         physicalInsertTime:  2012-09-27T13:11:16-07:00
         physicalSerial:      508002000092e70FMOD2
         physicalSize:        22.8880615234375G
         slotNumber:          "PCI Slot: 1; FDOM: 3"
         status:              warning - poor performance

An alert is generated when a flash disk is in predictive failure, poor performance, write-through caching, or peer failure status. The alert includes specific instructions for replacing the flash disk. If you have configured the system for alert notifications, then the alerts are sent by e-mail message to the designated address.

Determining when to proceed with disk replacement depends on the release, as described in the following:

  • For releases earlier than 11.2.3.2:

    Wait until the Oracle ASM disks have been successfully dropped by querying the V$ASM_DISK_STAT view before proceeding with the flash disk replacement. If the normal drop does not complete before the flash disk fails, then the Oracle ASM disks are automatically dropped with the FORCE option from the Oracle ASM disk group. If the DROP command does not complete before the flash disk fails, then refer to "Replacing a Flash Disk Due to Flash Disk Failure".

  • For releases 11.2.3.2 and later:

    An alert is sent when the Oracle ASM disks have been dropped, and the flash disk can be safely replaced. If the flash disk is used for write-back flash cache, then wait until none of the grid disks are cached by the flash disk. Use the following command to check the cachedBy attribute of all the grid disks. The cell disk on the flash disk should not appear in any grid disk's cachedBy attribute.

    CellCLI> LIST GRIDDISK ATTRIBUTES name, cachedBy
    

    If the flash disk is used for both grid disks and flash cache, then wait until receiving the alert, and the cell disk is not shown in any grid disk's cachedBy attribute.
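The cachedBy check described above lends itself to a small script. This sketch is illustrative only (the function name is invented); it scans LIST GRIDDISK ATTRIBUTES name, cachedBy output for a given flash cell disk and reports whether any grid disk still depends on it:

```python
def flash_disk_in_use(griddisk_output, flash_cell_disk):
    """True while any grid disk's cachedBy attribute still names the flash
    cell disk, i.e. the flash card is not yet safe to replace."""
    for line in griddisk_output.splitlines():
        tokens = line.split(None, 1)  # name, then the cachedBy value
        if len(tokens) == 2 and flash_cell_disk in tokens[1]:
            return True
    return False
```

Once this returns False (and the drop alert has been received), the flash disk can be replaced safely.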

The following procedure describes how to replace a flash disk on High Capacity X6 and earlier cells due to disk problems.

Note:

On Extreme Flash X6 cells and all X7 cells, you can just remove the flash disk from the front panel and insert a new one. You do not need to shut down the cell.

  1. Stop the cell services using the following command:
    CellCLI> ALTER CELL SHUTDOWN SERVICES ALL
    

    The preceding command checks if any disks are offline, are in predictive failure status, or need to be copied to their mirror. If Oracle ASM redundancy is intact, then the command takes the grid disks offline in Oracle ASM, and then stops the cell services. If the following error is displayed, then it may not be safe to stop the cell services, because a disk group may be forced to dismount due to reduced redundancy.

    Stopping the RS, CELLSRV, and MS services...
    The SHUTDOWN of ALL services was not successful.
    CELL-01548: Unable to shut down CELLSRV because disk group DATA, RECO may be
    forced to dismount due to reduced redundancy.
    Getting the state of CELLSRV services... running
    Getting the state of MS services... running
    Getting the state of RS services... running
    

    If the error occurs, then restore Oracle ASM disk group redundancy and retry the command when disk status is back to normal for all the disks.

  2. Shut down the cell.
  3. Replace the failed flash disk based on the PCI number and FDOM number. A white cell LED is lit to help locate the affected cell.
  4. Power up the cell. The cell services are started automatically. As part of the cell startup, all grid disks are automatically ONLINE in Oracle ASM.
  5. Verify that all grid disks have been successfully put online using the following command:
    CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
    

    Wait until asmmodestatus shows ONLINE or UNUSED for all grid disks.

The new flash disk is automatically used by the system. If the flash disk is used for flash cache, then the effective cache size increases. If the flash disk is used for grid disks, then the grid disks are re-created on the new flash disk. If those grid disks were part of an Oracle ASM disk group, then they are added back to the disk group, and the data is rebalanced on them based on the disk group redundancy and ASM_POWER_LIMIT parameter.

See Also:

3.3.3 Performing a Hot Pluggable Replacement of a Flash Disk

Starting with Exadata Database Machine X7, flash disks are hot-pluggable on both Extreme Flash (EF) and High Capacity (HC) cells.

For Exadata Database Machine X6 and earlier, the flash devices are hot-pluggable on EF cells, but not on HC cells. For HC cells on the Exadata Database Machine X6 and earlier systems, you must power down the cells before replacing the flash disks.

  1. Determine if the flash disk is ready to be replaced.
    When performing a hot-pluggable replacement of a flash device on Exadata Database Machine X7, the disk status should be Dropped for replacement, which indicates the flash disk is ready for online replacement.
    CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS LIKE '.*dropped 
    for replacement.*' DETAIL
    
             name:               FLASH_6_1
             deviceName:         /dev/nvme0n1
             diskType:           FlashDisk
             luns:               6_0
             makeModel:          "Oracle Flash Accelerator F640 PCIe Card"
             physicalFirmware:   QDV1RD09
             physicalInsertTime: 2017-08-11T12:25:00-07:00
             physicalSerial:     PHLE6514003R6P4BGN-1
             physicalSize:       2.910957656800747T
             slotNumber:         "PCI Slot: 6; FDOM: 1"
             status:             failed - dropped for replacement
    
  2. Locate the failed flash disk based on the PCI number and FDOM number.
    A white cell LED is lit to help locate the affected cell. An amber attention LED is lit to identify the affected flash card.
  3. Make sure the power LED is off on the card.

    Caution:

    Removing a card with the power LED on could result in a system crash. If a failed disk has a status of "failed - dropped for replacement" but the power LED is still on, contact Oracle Support.
  4. Remove and replace the failed flash disk.
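
To go from the CellCLI DETAIL output in step 1 to the physical card location, you can extract the slotNumber field. The sketch below parses sample output of the form shown above; the parsing helper is an illustration, not an Oracle-supplied tool.

```shell
# Sample DETAIL lines for a disk that is dropped for replacement.
sample='name:               FLASH_6_1
slotNumber:         "PCI Slot: 6; FDOM: 1"
status:             failed - dropped for replacement'

# Extract everything after the "slotNumber:" label.
slot=$(printf '%s\n' "$sample" | sed -n 's/^slotNumber:[[:space:]]*//p')
echo "replace the card at $slot"
```

With the sample above, this prints: replace the card at "PCI Slot: 6; FDOM: 1"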

3.3.4 Enabling and Disabling Write Back Flash Cache for Software Versions 11.2.3.3.1 and Higher

Starting with release 11.2.3.2.1, Exadata Smart Flash Cache can transparently cache frequently-accessed data to fast solid-state storage, improving query response times and throughput. Write operations serviced by flash instead of by disk are referred to as "write back flash cache." This feature can be enabled or disabled, as needed. Note the following when changing the flash cache mode:

  • With storage cell software release 11.2.3.3.1 or higher, you do not have to stop cell services or inactivate griddisks.

  • Grid homes and Oracle Database homes must be at Oracle Database 11g release 11.2.0.3 BP9 or higher to use write back flash cache.

Note:

Any time the flash cache is dropped and re-created, there could be a performance impact when the database starts again, until the cache is warmed up.

This section contains the following topics:

3.3.4.1 Enable Write Back Flash Cache for 11.2.3.3.1 or Higher

Enable write back flash cache on the storage servers to improve query response times and throughput.

Note:

  • With release 11.2.3.3.1 or higher, you do not have to stop the cellsrv process or inactivate griddisks.

  • To reduce the performance impact on the application, enable the write back flash cache during a period of reduced workload.

  1. Validate all the Physical Disks are in NORMAL state before modifying FlashCache.

    The following command should return no rows:

    # dcli -l root -g cell_group cellcli -e "list physicaldisk attributes name,status" | grep -v NORMAL
    
  2. Drop the flash cache.
    # dcli -l root -g cell_group cellcli -e drop flashcache
    
  3. Set the flashCacheMode attribute to writeback.
    # dcli -l root -g cell_group cellcli -e "alter cell flashCacheMode=writeback"
    
  4. Re-create the flash cache.
    # dcli -l root -g cell_group cellcli -e create flashcache all
    
  5. Verify the flashCacheMode has been set to writeback.
    # dcli -l root -g cell_group cellcli -e list cell detail | grep flashCacheMode
    
  6. Validate the griddisk attributes cachingPolicy and cachedby.
    # cellcli -e list griddisk attributes name,cachingpolicy,cachedby
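
The verification in step 5 can be turned into a pass/fail check across all cells. The sketch below assumes dcli's usual `cellname: value` output form and flags any cell not yet in writeback mode; the cell names and spacing are illustrative.

```shell
# Hypothetical dcli output: one "cell: flashCacheMode: <mode>" line per cell.
sample='cel01: flashCacheMode:         writeback
cel02: flashCacheMode:         writethrough'

# Count cells whose last field is not "writeback".
bad=$(printf '%s\n' "$sample" | awk '$NF != "writeback" {n++} END {print n+0}')
echo "$bad cell(s) not in writeback mode"
```

With the sample above, one cell is reported as not yet converted.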
    

3.3.4.2 Disable Write Back Flash Cache for 11.2.3.3.1 or Higher

Use these steps if you need to disable write back flash cache on the storage servers.

Note:

  • With release 11.2.3.3.1 or higher, you do not have to stop the cellsrv process or inactivate griddisks.

  • To reduce the performance impact on the application, disable the write back flash cache during a period of reduced workload.

  1. Validate all the Physical Disks are in NORMAL state before modifying FlashCache.

    The following command should return no rows:

    # dcli -l root -g cell_group cellcli -e "LIST PHYSICALDISK ATTRIBUTES name,status" | grep -v NORMAL
    
  2. Determine amount of dirty data in the flash cache.
    # cellcli -e "LIST METRICCURRENT ATTRIBUTES name,metricvalue WHERE name LIKE \'FC_BY_DIRTY.*\' "
    
  3. Flush the flash cache.
    # dcli -g cell_group -l root cellcli -e "ALTER FLASHCACHE ALL FLUSH"
    
  4. Check the progress of the flushing of flash cache.

    The flushing process is complete when FC_BY_DIRTY is 0 MB.

    # dcli -g cell_group -l root cellcli -e "LIST METRICCURRENT ATTRIBUTES name,metricvalue WHERE name LIKE \'FC_BY_DIRTY.*\' " 
    

    Or, you can check to see if the attribute flushstatus has been set to Completed.

    # dcli -g cell_group -l root cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD 
    
  5. After flushing of the flash cache completes, drop the flash cache.
    # dcli -g cell_group -l root cellcli -e drop flashcache 
    
  6. Set the flashCacheMode attribute to writethrough.
    # dcli -g cell_group -l root cellcli -e "ALTER CELL flashCacheMode=writethrough"
    
  7. Re-create the flash cache.
    # dcli -l root -g cell_group cellcli -e create flashcache all
    
  8. Verify the flashCacheMode has been set to writethrough.
    # dcli -l root -g cell_group cellcli -e list cell detail | grep flashCacheMode
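
The flush-progress check in step 4 can be automated by inspecting the FC_BY_DIRTY metric across cells. The sketch below parses sample metric lines in dcli's `cellname: ...` form; the values are illustrative, and the flush is complete only when every cell reports 0 MB dirty.

```shell
# Hypothetical dcli output for the FC_BY_DIRTY metric, one line per cell.
sample='cel01: FC_BY_DIRTY   FLASHCACHE   0.000 MB
cel02: FC_BY_DIRTY   FLASHCACHE   12.375 MB'

# Count cells whose dirty-data value (next-to-last field) is above zero.
still_dirty=$(printf '%s\n' "$sample" |
  awk '$(NF-1) + 0 > 0 {n++} END {print n+0}')

if [ "$still_dirty" -eq 0 ]; then
  echo "flush complete"
else
  echo "flush still running on $still_dirty cell(s)"
fi
```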
    

3.3.5 Enabling and Disabling Write Back Flash Cache for Software Versions Lower Than 11.2.3.3.1

Starting with release 11.2.3.2.1, Exadata Smart Flash Cache can transparently cache frequently-accessed data to fast solid-state storage, improving query response times and throughput. Write operations serviced by flash instead of by disk are referred to as "write back flash cache." This feature can be enabled or disabled, as needed. Note the following when changing the flash cache mode:

  • The cell services must be shut down before changing the flashCacheMode attribute. The cell services can be shut down on a rolling basis or in a total shutdown.

  • If the change is done on a rolling basis, then ensure cell services resynchronization is complete before changing the next cell.

  • When changing the flash cache mode on a rolling basis, the grid disk attribute asmDeactivationOutcome must be yes, and the asmModeStatus attribute must be online for all grid disks on the current cell before moving to the next cell. To check the grid disk attributes, use the following command:

    LIST GRIDDISK ATTRIBUTES asmDeactivationOutcome, asmModeStatus
    
  • If the change is done on a non-rolling basis, then shut down the entire cluster, including Oracle Clusterware (CRS) and all the databases, before making the change.

  • Grid homes and Oracle Database homes must be at Oracle Database 11g release 11.2.0.3 BP9 or later to use write back flash cache.

Note:

Any time the flash cache is dropped and re-created, there could be a performance impact when the database starts again, until the cache is warmed up.

This section contains the following topics:

3.3.5.1 Enabling Write Back Flash Cache on a Rolling Basis

If the attribute is to be modified from writethrough to writeback, then flash cache must be dropped before modifying the attribute. The following procedure describes the steps to enable write back flash cache on a rolling basis:

Note:

There is a shell script to automate enabling and disabling write back flash cache. Refer to My Oracle Support note 1500257.1 for the script and additional information.

See Also:

My Oracle Support note 888828.1 lists the minimum release requirements for Oracle Exadata Storage Server Software, Grid Infrastructure home, and Oracle Database home

  1. Log in as the root user to the first cell to be enabled for write back flash cache.
  2. Check that the flash cache is in normal state and no flash disks are degraded or in a critical state using the following command:
    # cellcli -e LIST FLASHCACHE detail
    
  3. Drop the flash cache on the cell using the following command:
    # cellcli -e DROP FLASHCACHE
    
  4. Inactivate the grid disks on the cell using the following command:
    # cellcli -e ALTER GRIDDISK ALL INACTIVE
    
  5. Shut down CELLSRV services using the following command:
    # cellcli -e ALTER CELL SHUTDOWN SERVICES CELLSRV
    
  6. Set the flashCacheMode attribute to writeback using the following command:
    # cellcli -e "ALTER CELL FLASHCACHEMODE=writeback"
    
  7. Restart cell services using the following command:
    # cellcli -e ALTER CELL STARTUP SERVICES CELLSRV
    
  8. Reactivate the grid disks on the cell using the following command:
    # cellcli -e ALTER GRIDDISK ALL ACTIVE
    
  9. Re-create the flash cache using the following command:
    # cellcli -e CREATE FLASHCACHE ALL
    
  10. Check the status of the cell using the following command:
    # cellcli -e LIST CELL DETAIL | grep flashCacheMode
    

    The flashCacheMode attribute should be set to writeback.

  11. Check the grid disk attributes asmDeactivationOutcome and asmModeStatus before moving to the next cell using the following command:
    CellCLI> LIST GRIDDISK ATTRIBUTES name,asmdeactivationoutcome,asmmodestatus
    

    The asmDeactivationOutcome attribute should be yes, and the asmModeStatus attribute should be online.

  12. Repeat the preceding steps on the next cell.
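
The gate in step 11 (every grid disk must report an asmDeactivationOutcome of yes and an asmModeStatus of online before you touch the next cell) can be scripted as a filter. The sample rows below are illustrative; on a live cell you would pipe the output of the LIST GRIDDISK command from step 11 into the same filter.

```shell
# Sample LIST GRIDDISK ATTRIBUTES name,asmdeactivationoutcome,asmmodestatus
sample='DATA_CD_00_cel01   Yes   ONLINE
RECO_CD_00_cel01   Yes   SYNCING'

# Any row that is not (Yes, ONLINE) means it is not yet safe to move on.
blocked=$(printf '%s\n' "$sample" |
  awk '$2 != "Yes" || $3 != "ONLINE" {n++} END {print n+0}')

if [ "$blocked" -eq 0 ]; then
  echo "safe to proceed to the next cell"
else
  echo "wait: $blocked grid disk(s) not ready"
fi
```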

3.3.5.2 Enabling Write Back Flash Cache on a Non-Rolling Basis

If the attribute is to be modified from writethrough to writeback, then flash cache must be dropped before modifying the attribute. The following procedure describes the steps to enable write back flash cache on a non-rolling basis:

Note:

There is a shell script to automate enabling and disabling write back flash cache. Refer to My Oracle Support note 1500257.1 for the script and additional information.

See Also:

My Oracle Support note 888828.1 lists the minimum release requirements for Oracle Exadata Storage Server Software, Grid Infrastructure home, and Oracle Database home

  1. Log in as the root user to a database node.
  2. Shut down the entire cluster using the following commands:
    # cd $GI_HOME/bin
    # ./crsctl stop cluster -all
    
  3. Drop the flash cache for all cells using the following command:
    # dcli -g cell_group -l root cellcli -e DROP FLASHCACHE
    
  4. Shut down CELLSRV services using the following command:
    # dcli -g cell_group -l root cellcli -e ALTER CELL SHUTDOWN SERVICES CELLSRV
    
  5. Confirm that the flash cache is in writethrough mode:
    # dcli -g cell_group -l root "cellcli -e list cell detail | grep -i flashcachemode"
    
  6. Set the flashCacheMode attribute to writeback using the following command:
    # dcli -g cell_group -l root cellcli -e "ALTER CELL FLASHCACHEMODE=writeback"
    
  7. Restart cell services using the following command:
    # dcli -g cell_group -l root cellcli -e ALTER CELL STARTUP SERVICES CELLSRV
    
  8. Re-create the flash cache using the following command:
    # dcli -g cell_group -l root cellcli -e CREATE FLASHCACHE ALL
    
  9. Restart the cluster:
    # cd $GI_HOME/bin
    # ./crsctl start cluster -all
    

3.3.5.3 Disabling Write Back Flash Cache on a Rolling Basis

If the flashCacheMode attribute is modified from writeback to writethrough and there is existing flash cache, then an error is displayed. The flash cache must be flushed and dropped before changing the attribute to writethrough. Once the flush operation begins, all caching to the flash cache stops. The following procedure describes the steps to disable write back flash cache on a rolling basis:

Note:

There is a shell script to automate enabling and disabling write back flash cache. Refer to My Oracle Support note 1500257.1 for the script and additional information.

  1. Log in as the root user to the first cell to be disabled for write back flash cache.

  2. Verify the asmDeactivationOutcome attribute is yes for all grid disks on the cell using the following command:

    # dcli -g cell_group -l root cellcli -e "LIST GRIDDISK WHERE   \
     asmdeactivationoutcome != 'Yes' attributes name, asmdeactivationoutcome, \
    asmmodestatus"
    

    If a grid disk is returned, then you must resolve this issue before proceeding.

  3. Check the amount of dirty data in the flash cache using the following command:

    # dcli -g cell_group -l root cellcli -e "LIST METRICCURRENT ATTRIBUTES  \
    name,metricvalue WHERE name LIKE \'FC_BY_DIRTY.*\'"
    
  4. Flush the flash cache using the following command:

    # dcli -g cell_group -l root cellcli -e ALTER FLASHCACHE ALL FLUSH
    
  5. Check the status of the flash cache using the following command:

    # dcli -g cell_group -l root cellcli -e LIST CELLDISK ATTRIBUTES name, \
    flushstatus, flusherror | grep FD 
    

    The status shows completed when the flush is done.

  6. Perform the following set of steps for all cells, one cell at a time. That is, perform steps (a) through (i) on one cell, then perform them on the next cell, until all the cells are done.

    1. Drop the flash cache using the following command.

      # cellcli -e DROP FLASHCACHE
      
    2. Inactivate all grid disks on the cell using the following command.

      # cellcli -e ALTER GRIDDISK ALL INACTIVE
      
    3. Shut down CELLSRV services using the following command.

      # cellcli -e ALTER CELL SHUTDOWN SERVICES CELLSRV
      
    4. Set the flashCacheMode attribute to writethrough using the following command.

      # cellcli -e "ALTER CELL FLASHCACHEMODE=writethrough"
      
    5. Restart cell services using the following command.

      # cellcli -e ALTER CELL STARTUP SERVICES CELLSRV
      
    6. Reactivate the grid disks on the cell using the following command.

      # cellcli -e ALTER GRIDDISK ALL ACTIVE
      
    7. Re-create the flash cache using the following command.

      # cellcli -e CREATE FLASHCACHE ALL
      
    8. Check the status of the cell using the following command.

      # cellcli -e LIST CELL DETAIL | grep flashCacheMode
      
    9. Check the grid disk attributes asmDeactivationOutcome and asmModeStatus using the following command.

      # cellcli -e LIST GRIDDISK ATTRIBUTES name,status,asmdeactivationoutcome,asmmodestatus
      

      The asmDeactivationOutcome attribute should be yes, and the asmModeStatus attribute should be online.

      If the disk status is SYNCING, wait until it is ACTIVE before proceeding.
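
The SYNCING wait in the final step can be expressed as a polling loop. In this sketch, `check_griddisks` is a hypothetical stand-in for `cellcli -e LIST GRIDDISK ATTRIBUTES name,status` that replays canned output so the loop terminates; on a real cell you would call cellcli and sleep between polls.

```shell
attempt=0

# Hypothetical mock: reports SYNCING for the first two polls, then ACTIVE.
check_griddisks() {
  if [ "$attempt" -lt 2 ]; then
    echo 'DATA_CD_00_cel01  SYNCING'
  else
    echo 'DATA_CD_00_cel01  ACTIVE'
  fi
}

# Poll until no grid disk reports SYNCING.
while check_griddisks | grep -q SYNCING; do
  attempt=$((attempt + 1))
  # On a real cell you would sleep here, for example: sleep 60
done
echo "resync complete after $attempt poll(s)"
```

With the mock above, the loop exits after two polls.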

3.3.5.4 Disabling Write Back Flash Cache on a Non-Rolling Basis

If the flashCacheMode attribute is modified from writeback to writethrough and there is existing flash cache, then an error is displayed. The flash cache must be flushed and dropped before changing the attribute to writethrough. Once the flush operation begins, all caching to the flash cache stops. The following procedure describes the steps to disable write back flash cache on a non-rolling basis:

Note:

  • There is a shell script to automate enabling and disabling write back flash cache. Refer to My Oracle Support note 1500257.1 for the script and additional information.

  • The flash cache flush operation can be performed prior to shutting down the entire cluster.

  1. Log in as the root user to the first database node to be disabled for write back flash cache.
  2. Check the amount of dirty data in the flash cache using the following command:
    # dcli -g cell_group -l root cellcli -e "LIST METRICCURRENT ATTRIBUTES  \
            name,metricvalue WHERE name LIKE \'FC_BY_DIRTY.*\'"
    
  3. Flush the flash cache using the following command:
    # dcli -g cell_group -l root cellcli -e ALTER FLASHCACHE ALL FLUSH
    
  4. Check the status as the blocks are moved to disk using the following command. The count decreases to zero as the flush progresses.
    # dcli -g cell_group -l root cellcli -e "LIST METRICCURRENT ATTRIBUTES name, \
           metricvalue WHERE NAME LIKE \'FC_BY_DIRTY.*\'"
    
  5. Check the status of the flash disks using the following command:
    # dcli -g cell_group -l root cellcli -e LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror | grep FD 
    

    The status shows completed when the flush is done.

  6. Shut down the database and the entire cluster using the following commands:
    # cd $GI_HOME/bin
    # ./crsctl stop cluster -all
    
  7. Drop the flash cache across all cells using the following command:
    # dcli -g cell_group -l root cellcli -e DROP FLASHCACHE
    
  8. Shut down CELLSRV services using the following command:
    # dcli -g cell_group -l root cellcli -e ALTER CELL SHUTDOWN SERVICES CELLSRV
    
  9. Set the flashCacheMode attribute to writethrough using the following command:
    # dcli -g cell_group -l root cellcli -e "ALTER CELL FLASHCACHEMODE=writethrough"
    
  10. Restart cell services using the following command:
    # dcli -g cell_group -l root cellcli -e ALTER CELL STARTUP SERVICES CELLSRV
    
  11. Re-create the flash cache using the following command:
    # dcli -g cell_group -l root cellcli -e CREATE FLASHCACHE ALL
    
  12. Check the flash cache mode of the cells using the following command:
    # dcli -g cell_group -l root cellcli -e LIST CELL DETAIL | grep flashCacheMode
    
  13. Restart the cluster and database using the following commands:
    # cd $GI_HOME/bin
    # ./crsctl start cluster -all
    

3.3.6 Enabling Flash Cache Compression

Flash cache compression can be enabled on Oracle Exadata Database Machine X4-2, Oracle Exadata Database Machine X3-2, and Oracle Exadata Database Machine X3-8 Full Rack systems. Oracle Exadata Database Machine X5-2, X5-8, and later systems do not have flash cache compression. Flash cache compression dynamically increases the logical capacity of the flash cache by transparently compressing user data as it is loaded into the flash cache.

Note:

  • Oracle Advanced Compression Option is required to enable flash cache compression.

  • User data is not retained when enabling flash cache compression.

The following procedure describes how to enable flash cache compression:

  1. Perform this step only if writeback flash cache is enabled; otherwise, skip it. Performing this step when writeback flash cache is not enabled can result in error messages. You can check the flash cache mode by running the following command:
    # cellcli -e LIST CELL DETAIL | grep flashCacheMode
    

    If writeback flash cache is enabled, then save the user data on the flash cell disks.

    # cellcli -e ALTER FLASHCACHE ALL FLUSH
    

    During the flush operation, the flushstatus attribute has a value of working. When the flush operation completes successfully, the value changes to complete. For grid disks, the attribute cachedby should be null. Also, the number of dirty (unflushed) buffers will be 0 after the flush is complete.

    # cellcli -e LIST METRICCURRENT FC_BY_DIRTY
              FC_BY_DIRTY     FLASHCACHE      0.000 MB
    
  2. Remove the flash cache from the cell.
    # cellcli -e DROP FLASHCACHE ALL
    
  3. Remove the flash log from the cell.
    # cellcli -e DROP FLASHLOG ALL
    
  4. Drop the cell disks on the flash disks.
    # cellcli -e DROP CELLDISK ALL FLASHDISK
    
  5. Enable flash cache compression using the following commands, based on the system:
    • For Oracle Exadata Database Machine X3-2, X3-8, and X4-2 Servers with Exadata Storage Cell server image 11.2.3.3.1 or higher:

      # cellcli -e ALTER CELL flashcachecompress=true
      
    • For Oracle Exadata Database Machine Exadata X3-2 with Exadata Storage Cell Server image 11.2.3.3.0:

      # cellcli -e ALTER CELL flashCacheCompX3Support=true
      # cellcli -e ALTER CELL flashCacheCompress=true
      
  6. Verify the size of the physical disks has increased.
    # cellcli -e LIST PHYSICALDISK attributes name,physicalSize,status WHERE disktype=flashdisk
    

    The status should be normal. Use the following information to validate the expected size when Compression is ON:

    • Aura 2.0/F40/X3:

      • Physical Disk Size: 93.13 G (OFF) or 186.26 G (ON)

      • Flash Cache Size: 1489 G (OFF) or 2979 G (ON)

    • Aura 2.1/F80/X4:

      • Physical Disk Size: 186.26 G (OFF) or 372.53 G (ON)

      • Flash Cache Size: 2979 G (OFF) or 5959 G (ON)

  7. Create the cell disks on the flash disks.
    # cellcli -e CREATE CELLDISK ALL FLASHDISK
    CellDisk FD_00_exampleceladm18 successfully created
    ...
    CellDisk FD_15_exampleceladm18 successfully created 
    
  8. Create the flash log.
    # cellcli -e CREATE FLASHLOG ALL
    Flash log RaNdOmceladm18_FLASHLOG successfully created 
    
  9. Create the flash cache on the cell.
    # cellcli -e CREATE FLASHCACHE ALL
    Flash cache exampleceladm18_FLASHCACHE successfully created
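
Starting from the expected-size table in step 6, a quick arithmetic check confirms that the reported physical size is roughly double the compression-off baseline. The values below are the F80/X4 figures from that table, and the 1% tolerance is an arbitrary choice for this sketch.

```shell
# Baseline (compression OFF) and reported (compression ON) sizes in GB.
baseline=186.26
reported=372.53

# Accept anything within 1% of exactly double the baseline.
doubled=$(awk -v b="$baseline" -v r="$reported" \
  'BEGIN { v = (r >= 1.99 * b && r <= 2.01 * b) ? "yes" : "no"; print v }')
echo "size doubled: $doubled"
```

With these values, the check prints: size doubled: yes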
    

3.3.7 Disabling Flash Cache Compression

Flash cache compression can be disabled on Oracle Exadata Database Machine X4-2, Oracle Exadata Database Machine X3-2, and Oracle Exadata Database Machine X3-8 Full Rack systems. Oracle Exadata Database Machine X5-2, X5-8, and later systems do not have flash cache compression.

Note:

  • User data is not retained when disabling flash cache compression.

The following procedure describes how to disable flash cache compression:

  1. Save the user data on the flash cell disks.
    # cellcli -e ALTER FLASHCACHE ALL FLUSH
    

    For grid disks, the attribute cachedby should be null. Also, the number of dirty buffers (unflushed) will be 0 after flush is complete.

    # cellcli -e LIST METRICCURRENT FC_BY_DIRTY
              FC_BY_DIRTY     FLASHCACHE      0.000 MB
    
  2. Remove the flash cache from the cell.
    # cellcli -e DROP FLASHCACHE ALL
    
  3. Remove the flash log from the cell.
    # cellcli -e DROP FLASHLOG ALL
    
  4. Drop the cell disks on the flash disks.
    # cellcli -e DROP CELLDISK ALL FLASHDISK
    
  5. Disable Flash Cache Compression using the following commands, based on the system:
    • If Exadata Storage Cell Server image is 11.2.3.3.1 or higher and the Exadata Storage Cell is X3-2 or X4-2:

      # cellcli -e ALTER CELL flashcachecompress=false
      
    • If Exadata Storage Cell Server image is 11.2.3.3.0 and the Exadata Storage Cell is X3-2:

      # cellcli -e ALTER CELL flashCacheCompX3Support=true
      # cellcli -e ALTER CELL flashCacheCompress=false
      

      Note:

      Note that flashCacheCompress is set to false, but flashCacheCompX3Support has to be set to true.

    You can verify that Flash Cache Compress has been disabled by viewing the cell attributes:

    # cellcli -e LIST CELL attributes name,flashCacheCompress
    

    Correct values are FALSE or a null string.

  6. Verify the size of the physical disks has decreased.
    # cellcli -e LIST PHYSICALDISK attributes name,physicalSize,status WHERE disktype=flashdisk
    

    The status should be normal. Use the following information to validate the expected size when Compression is OFF:

    • Aura 2.0/F40/X3:

      • Physical Disk Size: 93.13 G (OFF) or 186.26 G (ON)

      • Flash Cache Size: 1489 G (OFF) or 2979 G (ON)

    • Aura 2.1/F80/X4:

      • Physical Disk Size: 186.26 G (OFF) or 372.53 G (ON)

      • Flash Cache Size: 2979 G (OFF) or 5959 G (ON)

  7. Create the cell disks on the flash disks.
    # cellcli -e CREATE CELLDISK ALL FLASHDISK
    CellDisk FD_00_exampleceladm18 successfully created
    ...
    CellDisk FD_15_exampleceladm18 successfully created 
    
  8. Create the flash log.
    # cellcli -e CREATE FLASHLOG ALL
    Flash log RaNdOmceladm18_FLASHLOG successfully created 
    

    Verify the flash log is in normal mode.

    # cellcli -e LIST FLASHLOG DETAIL
    
  9. Create the flash cache on the cell.
    # cellcli -e CREATE FLASHCACHE ALL
    Flash cache exampleceladm18_FLASHCACHE successfully created
    

    Verify the flash cache is in normal mode.

    # cellcli -e LIST FLASHCACHE DETAIL
    
  10. Verify that flash cache compression is disabled.
    # cellcli -e LIST CELL
    

    The value of the flashCacheCompress attribute should be false.

3.4 Maintaining the M.2 Disks of Exadata Storage Servers

Exadata Database Machine X7 systems come with two internal M.2 devices that contain the system area. In all previous systems, the first two disks of the Exadata Storage Server are system disks, and the portions of these system disks are referred to as the system area.

Note:

Oracle Exadata Rack and cell can remain online and available while replacing an Exadata Storage Server M.2 disk.

This section contains the following topics:

3.4.1 Monitoring the Status of M.2 Disks

You can monitor the status of an M.2 disk by checking its attributes with the CellCLI LIST PHYSICALDISK command.

The disk firmware maintains the error counters, and marks a drive with Predictive Failure when the disk is about to fail. The drive, not the cell software, determines if it needs replacement.

  • Use the CellCLI command LIST PHYSICALDISK to determine the status of an M.2 disk:
    CellCLI> LIST PHYSICALDISK WHERE disktype='M2Disk' DETAIL
             name:                  M2_SYS_0
             deviceName:            /dev/sdm
             diskType:              M2Disk
             makeModel:             "INTEL SSDSCKJB150G7"
             physicalFirmware:      N2010112
             physicalInsertTime:    2017-07-14T08:42:24-07:00
             physicalSerial:        PHDW7082000M150A
             physicalSize:          139.73558807373047G
             slotNumber:            "M.2 Slot: 0"
             status:                failed
    
             name:                  M2_SYS_1        
             deviceName:            /dev/sdn
             diskType:              M2Disk
             makeModel:             "INTEL SSDSCKJB150G7"
             physicalFirmware:      N2010112
             physicalInsertTime:    2017-07-14T12:25:05-07:00
             physicalSerial:        PHDW708200SZ150A
             physicalSize:          139.73558807373047G
             slotNumber:            "M.2 Slot: 1"
             status:                normal
    

    The Exadata Storage Server M.2 disk statuses are:

    • normal

    • normal - dropped for replacement

    • not present

    • failed

    • failed - dropped for replacement

    • warning - predictive failure

    • warning - predictive failure - dropped for replacement
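
A small filter can triage these statuses automatically. The sketch below flags any disk whose status is not plain normal; the name/status pairs are illustrative, and a real check would parse LIST PHYSICALDISK output instead.

```shell
# Sample "name: status" pairs (illustrative).
sample='M2_SYS_0: failed - dropped for replacement
M2_SYS_1: normal'

# List every disk whose status is anything other than plain "normal".
needs=$(printf '%s\n' "$sample" | awk -F': ' '$2 != "normal" {print $1}')
echo "disks needing attention: ${needs:-none}"
```

With the sample above, this prints: disks needing attention: M2_SYS_0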

3.4.2 Replacing an M.2 Disk Due to Failure or Other Problems

Failure of an M.2 disk reduces the redundancy of the system area and can impact patching, imaging, and system rescue. Therefore, the failed disk should be replaced with a new disk as soon as possible. When an M.2 disk fails, the storage server automatically and transparently switches to using the software stored on the inactive system disk, making it the active system disk.

An Exadata alert is generated when an M.2 disk fails. The alert includes specific instructions for replacing the disk. If you have configured the system for alert notifications, then the alert is sent by e-mail to the designated address.

The M.2 disk is hot-pluggable and can be replaced while the power is on.

After the M.2 disk is replaced, Exadata Storage Server Software automatically adds the new device to the system partition and starts the rebuilding process.

  1. Identify the failed M.2 disk.
    CellCLI> LIST PHYSICALDISK WHERE diskType=M2Disk AND status!=normal DETAIL
             name:                  M2_SYS_0
             deviceName:            /dev/sda
             diskType:              M2Disk
             makeModel:             "INTEL SSDSCKJB150G7"
             physicalFirmware:      N2010112
             physicalInsertTime:    2017-07-14T08:42:24-07:00
             physicalSerial:        PHDW7082000M150A
             physicalSize:          139.73558807373047G
             slotNumber:            "M.2 Slot: 0"
             status:                failed - dropped for replacement
    
  2. Locate the cell that has the white LED lit.
  3. Open the chassis and identify the M.2 disk by the slot number noted in Step 1. The amber LED for this disk should be lit to indicate service is needed.

    M.2 disks are hot pluggable, so you do not need to power down the cell before replacing the disk.

  4. Remove the M.2 disk:
    1. Rotate both riser board socket ejectors up and outward as far as they will go.
      The green power LED on the riser board turns off when you open the socket ejectors.
    2. Carefully lift the riser board straight up to remove it from the sockets.
  5. Insert the replacement M.2 disk:
    1. Unpack the replacement flash riser board and place it on an antistatic mat.
    2. Align the notch in the replacement riser board with the connector key in the connector socket.
    3. Push the riser board into the connector socket until the riser board is securely seated in the socket.

      Caution:

      If the riser board does not easily seat into the connector socket, verify that the notch in the riser board is aligned with the connector key in the connector socket. If the notch is not aligned, damage to the riser board might occur.

    4. Rotate both riser board socket ejectors inward until the ejector tabs lock the riser board in place.
      The green power LED on the riser board turns on when you close the socket ejectors.
  6. Confirm the M.2 disk has been replaced.
    CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=M2Disk DETAIL
         name:                  M2_SYS_0
         deviceName:            /dev/sdm
         diskType:              M2Disk
         makeModel:             "INTEL SSDSCKJB150G7"
         physicalFirmware:      N2010112
         physicalInsertTime:    2017-08-24T18:55:13-07:00
         physicalSerial:        PHDW708201G0150A
         physicalSize:          139.73558807373047G
         slotNumber:            "M.2 Slot: 0"
         status:                normal

         name:                  M2_SYS_1
         deviceName:            /dev/sdn
         diskType:              M2Disk
         makeModel:             "INTEL SSDSCKJB150G7"
         physicalFirmware:      N2010112
         physicalInsertTime:    2017-08-24T18:55:13-07:00
         physicalSerial:        PHDW708200SZ150A
         physicalSize:          139.73558807373047G
         slotNumber:            "M.2 Slot: 1"
         status:                normal
    
  7. Confirm that the system disk arrays have an active sync status or are being rebuilt.
    # mdadm --detail /dev/md[2-3][4-5]
    /dev/md24:
          Container : /dev/md/imsm0, member 0
         Raid Level : raid1
         Array Size : 104857600 (100.00 GiB 107.37 GB)
      Used Dev Size : 104857600 (100.00 GiB 107.37 GB)
       Raid Devices : 2
      Total Devices : 2
    
                State : active
       Active Devices : 2
      Working Devices : 2
       Failed Devices : 0
        Spare Devices : 0

                 UUID : 152f728a:6d294098:5177b2e5:8e0d766c
        Number   Major   Minor   RaidDevice   State
           1       8      16         0        active sync   /dev/sdb
           0       8       0         1        active sync   /dev/sda
    /dev/md25:
          Container : /dev/md/imsm0, member 1
         Raid Level : raid1
         Array Size : 41660416 (39.73 GiB 42.66 GB)
      Used Dev Size : 41660544 (39.73 GiB 42.66 GB)
       Raid Devices : 2
      Total Devices : 2
    
                State : clean
       Active Devices : 2
      Working Devices : 2
       Failed Devices : 0
        Spare Devices : 0

                 UUID : 466173ba:507008c7:6d65ed89:3c40cf23
        Number   Major   Minor   RaidDevice   State
           1       8      16         0        active sync   /dev/sdb
           0       8       0         1        active sync   /dev/sda
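
To reduce the mdadm output above to a quick health verdict, you can compare the device counters. The sketch below parses sample counter lines; a real check would read the `mdadm --detail` output directly, and the two-active-devices expectation reflects the RAID-1 pair shown above.

```shell
# Sample counter lines from mdadm --detail (illustrative).
sample='   Active Devices : 2
  Working Devices : 2
   Failed Devices : 0
    Spare Devices : 0'

# Healthy means both mirror members active and no failed devices.
verdict=$(printf '%s\n' "$sample" | awk -F': *' '
  /Active Devices/ { a = $2 }
  /Failed Devices/ { f = $2 }
  END { v = (a == 2 && f == 0) ? "healthy" : "degraded"; print v }')
echo "array state: $verdict"
```

With the sample above, this prints: array state: healthy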
    

3.5 Managing the RAM Cache on the Storage Servers

Cell RAM Cache is a cache in front of the Flash Cache and an extension of the database cache. It is faster than the Flash Cache, but has a smaller capacity.

The Cell RAM Cache feature was introduced in Oracle Exadata Storage Server Software release 18c (18.1.0.0.0). RAM Cache is disabled by default (ramCacheMode is set to auto).
  • To view the current status of the RAM Cache, retrieve the ramCacheMode, ramCacheSize and ramCacheMaxSize attributes for the cell.
    CellCLI> LIST CELL DETAIL
    ...
            ramCacheMaxSize:       512M
            ramCacheMode:          On
            ramCacheSize:          512M
    ...
    
  • To enable the RAM Cache feature, set the ramCacheMode cell attribute to on.
    1. Use CellCLI to alter the cell.
      CellCLI> ALTER CELL ramCacheMode=on
      
    2. Restart CellSrv.
      CellCLI> ALTER CELL RESTART SERVICES CELLSRV
      
  • To disable the RAM Cache feature, set the ramCacheMode cell attribute to off.
    1. Use CellCLI to alter the cell.
      CellCLI> ALTER CELL ramCacheMode=off
      
    2. Restart CellSrv.
      CellCLI> ALTER CELL RESTART SERVICES CELLSRV
      
  • To set the maximum size of the RAM Cache, modify the value of the ramCacheMaxSize cell attribute.

    For example, to set the maximum size of the RAM Cache to 1 GB, use the following command:

    CellCLI> ALTER CELL ramCacheMaxSize=1G
    
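
When reviewing the RAM cache configuration on many cells, it can help to extract just the three attributes from saved `LIST CELL DETAIL` output. The following sketch is illustrative only (the `ramcache_status` helper name is ours, and the here-document holds sample text rather than a live CellCLI session):

```shell
#!/bin/sh
# Hypothetical helper: pull the RAM cache attributes out of saved
# `LIST CELL DETAIL` output and print them on one line.
ramcache_status() {
  awk -F': *' '
    $1 ~ /ramCacheMode$/    { mode = $2 }
    $1 ~ /ramCacheSize$/    { size = $2 }
    $1 ~ /ramCacheMaxSize$/ { max  = $2 }
    END { printf "mode=%s size=%s max=%s\n", mode, size, max }'
}

# Sample text standing in for live `LIST CELL DETAIL` output
status=$(ramcache_status <<'EOF'
        ramCacheMaxSize:       512M
        ramCacheMode:          On
        ramCacheSize:          512M
EOF
)
echo "$status"
```

In practice you might feed this the output of `dcli -g cell_group -l root "cellcli -e list cell detail"` to compare settings across all cells at once.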

3.6 Resizing Grid Disks

You can resize grid disks and Oracle ASM disk groups to shrink one with excess free space and increase the size of another that is near capacity.

Initial configuration of Oracle Exadata Database Machine disk group sizes is based on Oracle best practices and the location of the backup files. For internal backups, space is allocated at 40% to the DATA disk group and 60% to the RECO disk group. For external backups, the allocations are 80% to the DATA disk group and 20% to the RECO disk group. The disk group allocations can be changed after deployment. For example, the DATA disk group allocation may be too small at 60%, and may need to be resized to 80%.

If your system has no free space available on the cell disks and one disk group, for example RECO, has plenty of free space, then you can resize the RECO disk group to a smaller size and reallocate the free space to the DATA disk group. The free space available after shrinking the RECO disk group is at a non-contiguous offset from the existing space allocations for the DATA disk group. Grid disks can use space anywhere on the cell disks and do not have to be contiguous.

If you are expanding the grid disks and the cell disks already have sufficient space to expand the existing grid disks, then you do not need to resize an existing disk group first. In that case, skip steps 2 and 3 below, where the example shows the RECO disk group and grid disks being shrunk. (You should still verify that the cell disks have enough free space before growing the DATA grid disks.) The amount of free space the administrator should reserve depends on the level of failure coverage.

If you are shrinking the size of the grid disks, you should understand how space is reserved for mirroring. Data is protected by Oracle ASM using normal or high redundancy to create one or two copies of data, which are stored as file extents. These copies are stored in separate failure groups. A failure in one failure group does not affect the mirror copies, so data is still accessible. When a failure occurs, Oracle ASM re-mirrors, also known as rebalances, any extents that are not accessible so that redundancy is reestablished. In order for the re-mirroring process to succeed, sufficient free space must exist in the disk group to allow creation of the new file extent mirror copies. If there is not enough free space, then some extents will not be re-mirrored and the subsequent failure of the other data copies will require the disk group be restored from backup. Oracle ASM sends an error when a re-mirror process fails due to lack of space.

You must be using Oracle Exadata Storage Server Software release 12.1.2.1.0 or higher, or have the patch for bug 19695225 applied to your software.

This procedure for resizing grid disks applies to bare metal and virtual machine (VM) deployments.

  1. Determine the Amount of Available Space
  2. Shrink the Oracle ASM Disks in the Donor Disk Group
  3. Shrink the Grid Disks in the Donor Disk Group
  4. Increase the Size of the Grid Disks Using Available Space
  5. Increase the Size of the Oracle ASM Disks

3.6.1 Determine the Amount of Available Space

To increase the size of the disks in a disk group you must either have unallocated disk space available, or you have to reallocate space currently used by a different disk group.

  1. View the space currently used by the disk groups.
    SELECT name, total_mb, free_mb, total_mb - free_mb used_mb, round(100*free_mb/total_mb,2) pct_free
    FROM v$asm_diskgroup
    ORDER BY 1;
    
    NAME                             TOTAL_MB    FREE_MB    USED_MB   PCT_FREE
    ------------------------------ ---------- ---------- ---------- ----------
    DATAC1                           68812800    9985076   58827724      14.51
    RECOC1                           94980480   82594920   12385560      86.96
    

    The example above shows that the DATAC1 disk group has only about 15% of free space available while the RECOC1 disk group has about 87% free disk space. The PCT_FREE displayed here is raw free space, not usable free space. Additional space is needed for rebalancing operations.

  2. For the disk groups you plan to resize, view the count and status of the failure groups used by the disk groups.
    SELECT dg.name, d.failgroup, d.state, d.header_status, d.mount_status, 
     d.mode_status, count(1) num_disks
    FROM V$ASM_DISK d, V$ASM_DISKGROUP dg
    WHERE d.group_number = dg.group_number
    AND dg.name IN ('RECOC1', 'DATAC1')
    GROUP BY dg.name, d.failgroup, d.state, d.header_status, d.mount_status,
      d.mode_status
    ORDER BY 1, 2, 3;
    
    NAME       FAILGROUP      STATE      HEADER_STATU MOUNT_S  MODE_ST  NUM_DISKS
    ---------- -------------  ---------- ------------ -------- -------  ---------
    DATAC1     EXA01CELADM01  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM02  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM03  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM04  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM05  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM06  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM07  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM08  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM09  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM10  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM11  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM12  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM13  NORMAL     MEMBER        CACHED  ONLINE   12
    DATAC1     EXA01CELADM14  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM01  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM02  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM03  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM04  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM05  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM06  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM07  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM08  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM09  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM10  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM11  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM12  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM13  NORMAL     MEMBER        CACHED  ONLINE   12
    RECOC1     EXA01CELADM14  NORMAL     MEMBER        CACHED  ONLINE   12
    

    The above example is for a full rack, which has 14 cells and 14 failure groups for DATAC1 and RECOC1. Verify that each failure group has at least 12 disks in the NORMAL state (num_disks). If you see disks listed as MISSING, or you see an unexpected number of disks for your configuration, then do not proceed until you resolve the problem.

    Extreme Flash (EF) systems should see a disk count of 8 instead of 12 for num_disks.

  3. List the corresponding grid disks associated with each cell and each failure group, so you know which grid disks to resize.
    SELECT dg.name, d.failgroup, d.path
    FROM V$ASM_DISK d, V$ASM_DISKGROUP dg
    WHERE d.group_number = dg.group_number
    AND dg.name IN ('RECOC1', 'DATAC1')
    ORDER BY 1, 2, 3;
    
    NAME        FAILGROUP      PATH
    ----------- -------------  ----------------------------------------------
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_00_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_01_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_02_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_03_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_04_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_05_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_06_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_07_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_08_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_09_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_10_exa01celadm01
    DATAC1      EXA01CELADM01  o/192.168.74.43/DATAC1_CD_11_exa01celadm01
    DATAC1      EXA01CELADM02  o/192.168.74.44/DATAC1_CD_00_exa01celadm01
    DATAC1      EXA01CELADM02  o/192.168.74.44/DATAC1_CD_01_exa01celadm01
    DATAC1      EXA01CELADM02  o/192.168.74.44/DATAC1_CD_02_exa01celadm01
    ...
    RECOC1      EXA01CELADM13  o/192.168.74.55/RECOC1_CD_00_exa01celadm13
    RECOC1      EXA01CELADM13  o/192.168.74.55/RECOC1_CD_01_exa01celadm13
    RECOC1      EXA01CELADM13  o/192.168.74.55/RECOC1_CD_02_exa01celadm13
    ...
    RECOC1      EXA01CELADM14  o/192.168.74.56/RECOC1_CD_09_exa01celadm14
    RECOC1      EXA01CELADM14  o/192.168.74.56/RECOC1_CD_10_exa01celadm14
    RECOC1      EXA01CELADM14  o/192.168.74.56/RECOC1_CD_11_exa01celadm14  
    
    168 rows returned.
    
  4. Check the cell disks for available free space.
    Free space on the cell disks can be used to increase the size of the DATAC1 grid disks.  If there is not enough available free space to expand the DATAC1 grid disks, then you must shrink the RECOC1 grid disks to provide the additional space for the desired new size of DATAC1 grid disks. 
    [root@exa01adm01 tmp]# dcli -g ~/cell_group -l root "cellcli -e list celldisk \
      attributes name,freespace" 
    exa01celadm01: CD_00_exa01celadm01 0 
    exa01celadm01: CD_01_exa01celadm01 0 
    exa01celadm01: CD_02_exa01celadm01 0 
    exa01celadm01: CD_03_exa01celadm01 0 
    exa01celadm01: CD_04_exa01celadm01 0 
    exa01celadm01: CD_05_exa01celadm01 0 
    exa01celadm01: CD_06_exa01celadm01 0 
    exa01celadm01: CD_07_exa01celadm01 0 
    exa01celadm01: CD_08_exa01celadm01 0 
    exa01celadm01: CD_09_exa01celadm01 0 
    exa01celadm01: CD_10_exa01celadm01 0 
    exa01celadm01: CD_11_exa01celadm01 0 
    ...
    

    In this example, there is no free space available, so you must shrink the RECOC1 grid disks first to provide space for the DATAC1 grid disks.  In your configuration there might be plenty of free space available and you can use that free space instead of shrinking the RECOC1 grid disks.

  5. Calculate the amount of space to shrink from the RECOC1 disk group and from each grid disk.

    The minimum size to safely shrink a disk group and its grid disks must take into account the following:

    • Space currently in use (USED_MB)

    • Space expected for growth (GROWTH_MB)

    • Space needed to rebalance in case of disk failure (DFC_MB), typically 15% of the total disk group size

    The minimum size calculation taking the above factors into account is: 

    Minimum DG size (MB) = (USED_MB + GROWTH_MB ) * 1.15 
    
    • USED_MB can be derived from V$ASM_DISKGROUP by calculating TOTAL_MB - FREE_MB

    • GROWTH_MB is an estimate specific to how the disk group will be used in the future and should be based on historical patterns of growth

    For the RECOC1 disk group space usage shown in step 1, the minimum size to which it can safely shrink, assuming no growth estimates, is:

    Minimum RECOC1 size = (TOTAL_MB - FREE_MB + GROWTH_MB) * 1.15
                        = (94980480 - 82594920 + 0) * 1.15 = 14243394 MB = 13,910 GB

    In the example output shown in Step 1, RECOC1 has plenty of free space and DATAC1 has less than 15% free. So, you could shrink RECOC1 and give the freed disk space to DATAC1. If you decide to reduce RECOC1 to half of its current size, the new size is 94980480 / 2 = 47490240 MB. This size is significantly above the minimum size we calculated for the RECOC1 disk group above, so it is safe to shrink it down to this value.

    The query in Step 2 shows that there are 168 grid disks for RECOC1, because there are 14 cells and 12 disks per cell (14 * 12 = 168). The estimated new size of each grid disk for the RECOC1 disk group is 47490240 / 168, or 282680 MB.

    Find the closest 16 MB boundary for the new grid disk size. If you do not perform this check, then the cell will round down the grid disk size to the nearest 16 MB boundary automatically, and you could end up with a mismatch in size between the Oracle ASM disks and the grid disks.

    SQL> SELECT 16*TRUNC(&new_disk_size/16) new_disk_size FROM dual;
    Enter value for new_disk_size: 282680
    
    NEW_DISK_SIZE
    -------------
           282672
    

    Based on the above result, you should choose 282672 MB as the new size for the grid disks in the RECOC1 disk group. After resizing the grid disks, the size of the RECOC1 disk group will be 47488896 MB.

  6. Calculate how much to increase the size of each grid disk in the DATAC1 disk group.

    Ensure the Oracle ASM disk size and the grid disk sizes match across the entire disk group. The following query shows the combinations of disk sizes in each disk group. Ideally, there is only one size found for all disks and the sizes of both the Oracle ASM (total_mb) disks and the grid disks (os_mb) match.

    SELECT dg.name, d.total_mb, d.os_mb, count(1) num_disks
    FROM v$asm_diskgroup dg, v$asm_disk d
    WHERE dg.group_number = d.group_number
    GROUP BY dg.name, d.total_mb, d.os_mb;
    
    NAME                             TOTAL_MB      OS_MB  NUM_DISKS
    ------------------------------ ---------- ---------- ----------
    DATAC1                             409600     409600        168
    RECOC1                             565360     565360        168
    

    After shrinking RECOC1's grid disks, the following space is left per disk for DATAC1:

    Additional space for DATAC1 disks = RECOC1_current_size - RECOC1_new_size
                                      = 565360 - 282672 = 282688 MB

    To calculate the new size of the grid disks for the DATAC1 disk group, use the following:

    DATAC1 disks new size = DATAC1_disks_current_size + new_free_space_from_RECOC1
                          = 409600 + 282688 = 692288 MB

    Find the closest 16 MB boundary for the new grid disk size. If you do not perform this check, then the cell will round down the grid disk size to the nearest 16 MB boundary automatically, and you could end up with a mismatch in size between the Oracle ASM disks and the grid disks.

    SQL> SELECT 16*TRUNC(&new_disk_size/16) new_disk_size FROM dual;
    Enter value for new_disk_size: 692288
    
    NEW_DISK_SIZE
    -------------
           692288
    

    Based on the query result, you can use the calculated size of 692288 MB for the disks in the DATAC1 disk groups because the size is on a 16 MB boundary. If the result of the query is different from the value you supplied, then you must use the value returned by the query because that is the value to which the cell will round the grid disk size.

    The calculated value of the new grid disk size will result in the DATAC1 disk group having a total size of 116304384 MB (168 disks * 692288 MB).
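
The arithmetic from steps 5 and 6 can be checked with a short script. This is a sketch that reproduces the example's numbers (RECOC1 usage from step 1, 168 grid disks); substitute your own values. The 15% reserve and the 16 MB rounding follow the rules described above.

```shell
#!/bin/sh
# Values from the worked example; substitute your own configuration's numbers
TOTAL_MB=94980480      # RECOC1 TOTAL_MB from V$ASM_DISKGROUP
FREE_MB=82594920       # RECOC1 FREE_MB
GROWTH_MB=0            # estimated future growth
NUM_DISKS=168          # 14 cells x 12 grid disks
DATA_CUR_MB=409600     # current DATAC1 grid disk size (OS_MB)
RECO_CUR_MB=565360     # current RECOC1 grid disk size (OS_MB)

# Minimum safe disk group size: (used + growth) * 1.15, in integer math
USED_MB=$(( TOTAL_MB - FREE_MB ))
MIN_DG_MB=$(( (USED_MB + GROWTH_MB) * 115 / 100 ))

# Proposed new size: half the current size, per disk, rounded down to 16 MB
PER_DISK_MB=$(( TOTAL_MB / 2 / NUM_DISKS / 16 * 16 ))

# New DATAC1 grid disk size: current size plus the space released per disk
DATA_NEW_MB=$(( (DATA_CUR_MB + RECO_CUR_MB - PER_DISK_MB) / 16 * 16 ))

echo "minimum RECOC1 size:       ${MIN_DG_MB} MB"
echo "new RECOC1 grid disk size: ${PER_DISK_MB} MB"
echo "new RECOC1 total size:     $(( PER_DISK_MB * NUM_DISKS )) MB"
echo "new DATAC1 grid disk size: ${DATA_NEW_MB} MB"
```

Running this with the example values prints a minimum RECOC1 size of 14243394 MB and the new grid disk sizes of 282672 MB (RECO) and 692288 MB (DATA), matching the calculations above.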

3.6.2 Shrink the Oracle ASM Disks in the Donor Disk Group

If there is no free space available on the cell disks, you can reduce the space used by one disk group to provide additional disk space for a different disk group.

This task is a continuation of an example where space in the RECOC1 disk group is being reallocated to the DATAC1 disk group.
Before resizing the disk group, make sure the disk group you are taking space from has sufficient free space.
  1. Shrink the Oracle ASM disks for the RECO disk group down to the new desired size for all disks.

    Use the new size for the disks in the RECO disk group that was calculated in step 5 of "Determine the Amount of Available Space".

    SQL> alter diskgroup RECOC1 resize all size 282672M rebalance power 64;
    

    Note:

    The ALTER DISKGROUP command may take several minutes to complete. The SQL prompt will not return until this operation has completed.

    Wait for rebalance to finish by checking the view GV$ASM_OPERATION.

    SQL> set lines 250 pages 1000
    SQL> col error_code form a10
    SQL> SELECT dg.name, o.*
      2  FROM gv$asm_operation o, v$asm_diskgroup dg
      3  WHERE o.group_number = dg.group_number;
    

    Proceed to the next step ONLY when the query against GV$ASM_OPERATION shows no rows for the disk group being altered.

  2. Verify the new size of the Oracle ASM disks using the following queries:
    SQL> SELECT name, total_mb, free_mb, total_mb - free_mb used_mb,
      2   round(100*free_mb/total_mb,2) pct_free
      3  FROM v$asm_diskgroup
      4  ORDER BY 1;
    
    NAME                             TOTAL_MB    FREE_MB    USED_MB   PCT_FREE
    ------------------------------ ---------- ---------- ---------- ----------
    DATAC1                           68812800    9985076   58827724      14.51
    RECOC1                           47488896   35103336   12385560      73.92
    
    SQL> SELECT dg.name, d.total_mb, d.os_mb, count(1) num_disks
      2  FROM v$asm_diskgroup dg, v$asm_disk d
      3  WHERE dg.group_number = d.group_number
      4  GROUP BY dg.name, d.total_mb, d.os_mb;
    
    NAME                             TOTAL_MB      OS_MB  NUM_DISKS
    ------------------------------ ---------- ---------- ----------
    DATAC1                             409600     409600        168
    RECOC1                             282672     565360        168
    

    The above query example shows that the disks in the RECOC1 disk group have been resized to 282672 MB each, and the total disk group size is 47488896 MB.

3.6.3 Shrink the Grid Disks in the Donor Disk Group

After shrinking the disks in the Oracle ASM disk group, you then shrink the size of the grid disks on each cell.

This task is a continuation of an example where space in the RECOC1 disk group is being reallocated to the DATAC1 disk group.
You must have first completed the task Shrink the Oracle ASM Disks in the Donor Disk Group.
  1. Shrink the grid disks associated with the RECO disk group on all cells down to the new, smaller size.

    For each storage cell identified in step 3 of "Determine the Amount of Available Space", shrink the grid disks to match the size of the Oracle ASM disks that were shrunk in the previous task. Use commands similar to the following:

    dcli -c exa01celadm01 -l root "cellcli -e alter griddisk RECOC1_CD_00_exa01celadm01 \
    ,RECOC1_CD_01_exa01celadm01 \
    ,RECOC1_CD_02_exa01celadm01 \
    ,RECOC1_CD_03_exa01celadm01 \
    ,RECOC1_CD_04_exa01celadm01 \
    ,RECOC1_CD_05_exa01celadm01 \
    ,RECOC1_CD_06_exa01celadm01 \
    ,RECOC1_CD_07_exa01celadm01 \
    ,RECOC1_CD_08_exa01celadm01 \
    ,RECOC1_CD_09_exa01celadm01 \
    ,RECOC1_CD_10_exa01celadm01 \
    ,RECOC1_CD_11_exa01celadm01 \
    size=282672M "
    
    dcli -c exa01celadm02 -l root "cellcli -e alter griddisk RECOC1_CD_00_exa01celadm02 \
    ,RECOC1_CD_01_exa01celadm02 \
    ,RECOC1_CD_02_exa01celadm02 \
    ,RECOC1_CD_03_exa01celadm02 \
    ,RECOC1_CD_04_exa01celadm02 \
    ,RECOC1_CD_05_exa01celadm02 \
    ,RECOC1_CD_06_exa01celadm02 \
    ,RECOC1_CD_07_exa01celadm02 \
    ,RECOC1_CD_08_exa01celadm02 \
    ,RECOC1_CD_09_exa01celadm02 \
    ,RECOC1_CD_10_exa01celadm02 \
    ,RECOC1_CD_11_exa01celadm02 \
    size=282672M "
    
    ...
    
    dcli -c exa01celadm14 -l root "cellcli -e alter griddisk RECOC1_CD_00_exa01celadm14 \
    ,RECOC1_CD_01_exa01celadm14 \
    ,RECOC1_CD_02_exa01celadm14 \
    ,RECOC1_CD_03_exa01celadm14 \
    ,RECOC1_CD_04_exa01celadm14 \
    ,RECOC1_CD_05_exa01celadm14 \
    ,RECOC1_CD_06_exa01celadm14 \
    ,RECOC1_CD_07_exa01celadm14 \
    ,RECOC1_CD_08_exa01celadm14 \
    ,RECOC1_CD_09_exa01celadm14 \
    ,RECOC1_CD_10_exa01celadm14 \
    ,RECOC1_CD_11_exa01celadm14 \
    size=282672M "
    
  2. Verify the new size of the grid disks using the following command:
    [root@exa01adm01 tmp]# dcli -g cell_group -l root "cellcli -e list griddisk attributes name,size where name like \'RECOC1.*\' "
    
    exa01celadm01: RECOC1_CD_00_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_01_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_02_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_03_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_04_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_05_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_06_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_07_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_08_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_09_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_10_exa01celadm01 276.046875G
    exa01celadm01: RECOC1_CD_11_exa01celadm01 276.046875G  
    ...
    

    The above example shows that the disks in the RECOC1 disk group have been resized to a size of 282672 MB each (276.046875 * 1024).
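
The unit conversion in the verification above can be confirmed directly: CellCLI reports grid disk sizes in binary gigabytes, so multiplying by 1024 recovers the size in megabytes.

```shell
#!/bin/sh
# Convert the CellCLI-reported size (binary GB) back to MB to confirm
# that the resize took effect exactly: 276.046875 * 1024 = 282672.
awk 'BEGIN {
  mb = 276.046875 * 1024
  printf "276.046875G = %d MB\n", mb
}'
```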

3.6.4 Increase the Size of the Grid Disks Using Available Space

You can increase the size used by the grid disks if there is unallocated disk space either already available, or made available by shrinking the space used by a different Oracle ASM disk group.

This task is a continuation of an example where space in the RECOC1 disk group is being reallocated to the DATAC1 disk group. If you already have sufficient space to expand an existing disk group, then you do not need to reallocate space from a different disk group.

  1. Check that the cell disks have the expected amount of free space.
    After completing the tasks to shrink the Oracle ASM disks and the grid disks, you would expect to see the following free space on the cell disks:
    [root@exa01adm01 tmp]# dcli -g ~/cell_group -l root "cellcli -e list celldisk \
    attributes name,freespace"
    
    exa01celadm01: CD_00_exa01celadm01 276.0625G
    exa01celadm01: CD_01_exa01celadm01 276.0625G
    exa01celadm01: CD_02_exa01celadm01 276.0625G
    exa01celadm01: CD_03_exa01celadm01 276.0625G
    exa01celadm01: CD_04_exa01celadm01 276.0625G
    exa01celadm01: CD_05_exa01celadm01 276.0625G
    exa01celadm01: CD_06_exa01celadm01 276.0625G
    exa01celadm01: CD_07_exa01celadm01 276.0625G
    exa01celadm01: CD_08_exa01celadm01 276.0625G
    exa01celadm01: CD_09_exa01celadm01 276.0625G
    exa01celadm01: CD_10_exa01celadm01 276.0625G
    exa01celadm01: CD_11_exa01celadm01 276.0625G 
    ...
    
  2. For each storage cell, increase the size of the DATA grid disks to the desired new size.

    Use the size calculated in Determine the Amount of Available Space.

    dcli -c exa01celadm01 -l root "cellcli -e alter griddisk DATAC1_CD_00_exa01celadm01 \
    ,DATAC1_CD_01_exa01celadm01 \
    ,DATAC1_CD_02_exa01celadm01 \
    ,DATAC1_CD_03_exa01celadm01 \
    ,DATAC1_CD_04_exa01celadm01 \
    ,DATAC1_CD_05_exa01celadm01 \
    ,DATAC1_CD_06_exa01celadm01 \
    ,DATAC1_CD_07_exa01celadm01 \
    ,DATAC1_CD_08_exa01celadm01 \
    ,DATAC1_CD_09_exa01celadm01 \
    ,DATAC1_CD_10_exa01celadm01 \
    ,DATAC1_CD_11_exa01celadm01 \
    size=692288M "
    ...
    dcli -c exa01celadm14 -l root "cellcli -e alter griddisk DATAC1_CD_00_exa01celadm14 \
    ,DATAC1_CD_01_exa01celadm14 \
    ,DATAC1_CD_02_exa01celadm14 \
    ,DATAC1_CD_03_exa01celadm14 \
    ,DATAC1_CD_04_exa01celadm14 \
    ,DATAC1_CD_05_exa01celadm14 \
    ,DATAC1_CD_06_exa01celadm14 \
    ,DATAC1_CD_07_exa01celadm14 \
    ,DATAC1_CD_08_exa01celadm14 \
    ,DATAC1_CD_09_exa01celadm14 \
    ,DATAC1_CD_10_exa01celadm14 \
    ,DATAC1_CD_11_exa01celadm14 \
    size=692288M "
    
  3. Verify the new size of the grid disks associated with the DATAC1 disk group using the following command:
    dcli -g cell_group -l root "cellcli -e list griddisk attributes name,size \ 
    where name like \'DATAC1.*\' "
    
    exa01celadm01: DATAC1_CD_00_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_01_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_02_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_03_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_04_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_05_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_06_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_07_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_08_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_09_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_10_exa01celadm01 676.0625G
    exa01celadm01: DATAC1_CD_11_exa01celadm01 676.0625G
    

Instead of increasing the size of the DATA disk group, you could instead create new disk groups with the newly freed space or keep it free for future use. In general, Oracle recommends using the smallest number of disk groups needed (typically DATA, RECO, and DBFS_DG) to provide the greatest flexibility and ease of administration. However, there may be cases, such as when using virtual machines or consolidating many databases, where additional disk groups or free space reserved for future use may be desired.

If you decide to leave free space on the grid disks in reserve for future use, see My Oracle Support note 1684112.1 for the steps to allocate the free space to an existing disk group at a later time.

3.6.5 Increase the Size of the Oracle ASM Disks

You can increase the size used by the Oracle ASM disks after increasing the space allocated to the associated grid disks.

This task is a continuation of an example where space in the RECOC1 disk group is being reallocated to the DATAC1 disk group.
You must have completed the task of resizing the grid disks before you can resize the corresponding Oracle ASM disk group.
  1. Increase the Oracle ASM disks for the DATAC1 disk group to the new size of the grid disks on the storage cells.
    SQL> ALTER DISKGROUP datac1 RESIZE ALL;
    

    This command resizes the Oracle ASM disks to match the size of the grid disks.

    Note:

    If the specified disk group has quorum disks configured within the disk group, then the ALTER DISKGROUP ... RESIZE ALL command could fail with error ORA-15277. Quorum disks are configured if the requirements specified in Oracle Exadata Database Machine Maintenance Guide are met.

    As a workaround, you can specify the storage server failure group names (for the ones of FAILURE_TYPE "REGULAR", not "QUORUM") explicitly in the SQL command, for example:

    SQL> ALTER DISKGROUP datac1 RESIZE DISKS IN FAILGROUP exacell01, exacell02, exacell03;
    
  2. Wait for the rebalance operation to finish.
    SQL> set lines 250 pages 1000 
    SQL> col error_code form a10 
    SQL> SELECT dg.name, o.* FROM gv$asm_operation o, v$asm_diskgroup dg 
         WHERE o.group_number = dg.group_number;
    

    Do not continue to the next step until the query returns zero rows for the disk group that was altered.

  3. Verify that the Oracle ASM disks and disk groups are at the desired new sizes.
    SQL> SELECT name, total_mb, free_mb, total_mb - free_mb used_mb, 
         round(100*free_mb/total_mb,2) pct_free
         FROM v$asm_diskgroup
         ORDER BY 1;
    
    NAME                             TOTAL_MB    FREE_MB    USED_MB   PCT_FREE
    ------------------------------ ---------- ---------- ---------- ----------
    DATAC1                          116304384   57439796   58864588      49.39
    RECOC1                           47488896   34542516   12946380      72.74
    
    SQL>  SELECT dg.name, d.total_mb, d.os_mb, count(1) num_disks
          FROM  v$asm_diskgroup dg, v$asm_disk d
          WHERE dg.group_number = d.group_number
          GROUP BY dg.name, d.total_mb, d.os_mb;
     
    NAME                             TOTAL_MB      OS_MB  NUM_DISKS
    ------------------------------ ---------- ---------- ----------
    DATAC1                             692288     692288        168
    RECOC1                             282672     282672        168
    
    

    The results of the queries show that the RECOC1 and DATAC1 disk groups and disks have been resized.
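
As a final sanity check, you can confirm that TOTAL_MB and OS_MB agree for every disk group row of the second query, which indicates the Oracle ASM resize picked up the grid disk sizes. The `verify_sizes` helper below is a sketch of ours; the here-document holds sample text standing in for the saved query output.

```shell
#!/bin/sh
# Hypothetical helper: compare the TOTAL_MB and OS_MB columns of the saved
# query output and report any disk group where they differ.
verify_sizes() {
  awk '$2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
         if ($2 != $3) { printf "%s mismatch: total=%s os=%s\n", $1, $2, $3; bad = 1 }
       }
       END { if (!bad) print "all disk sizes match"; exit bad }'
}

# Sample text standing in for the final query output
result=$(verify_sizes <<'EOF'
NAME                             TOTAL_MB      OS_MB  NUM_DISKS
------------------------------ ---------- ---------- ----------
DATAC1                             692288     692288        168
RECOC1                             282672     282672        168
EOF
)
echo "$result"
```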

3.7 Using the Oracle Exadata Storage Server Software Rescue Procedure

The rescue procedure is necessary when system disks fail, the operating system has a corrupt file system, or there was damage to the boot area. If only one system disk fails, then use CellCLI commands to recover. In the rare event that both system disks fail simultaneously, you must use the Exadata Storage Server rescue functionality provided on the Oracle Exadata Storage Server Software CELLBOOT USB flash drive.

If you are using normal redundancy, then there is only one mirror copy for the cell being rescued. The data may be irrecoverably lost if that single mirror also fails during the rescue procedure. Oracle recommends you take a complete backup of the data on the mirror copy, and immediately take the mirror copy cell offline to prevent any new data changes to it prior to attempting a rescue. This ensures that all data residing on the grid disks on the failed cell and its mirror copy is inaccessible during rescue procedure.

The Oracle ASM disk repair timer has a default repair time of 3.6 hours. If you know that you cannot perform the rescue procedure within that time frame, then you should use the Oracle ASM rebalance procedure to rebalance the disk group before performing the rescue procedure.

If you are using high redundancy disk groups, so that there is more than one mirror copy in Oracle ASM for all the grid disks of the failed cell, then take the failed cell offline. Oracle ASM automatically drops the grid disks on the failed cell after the configured Oracle ASM timeout, and starts rebalancing data using the mirror copies. The default timeout is two hours. If the cell rescue takes more than two hours, then you must re-create the grid disks on the rescued cell in Oracle ASM.

Caution:

Use the rescue procedure with extreme caution. Incorrectly using the procedure can cause data loss.

It is important to note the following when using the rescue procedure:

  • The rescue procedure can potentially rewrite some or all of the disks in the cell. If this happens, then you can lose all the content on those disks without possibility of recovery.

    Use extreme caution when using this procedure, and pay attention to the prompts. Ideally, you should use the rescue procedure only with assistance from Oracle Support Services, and when you have decided that you can afford the loss of data on some or all of the disks.

  • The rescue procedure does not destroy the contents of the data disks or the contents of the data partitions on the system disks unless you explicitly choose to do so during the rescue procedure.

  • Starting in Oracle Exadata Storage Server Software 11g Release 2 (11.2), the rescue procedure restores the Exadata Storage Server software to the same release, including any patches that existed on the cell as of the last successful boot. The following items are not restored by the rescue procedure:

    • Configuration for the cell, such as alert configurations, SMTP information, administrator email address, and so on. However, the procedure does restore the network configuration that existed at the end of the last successful run of the /usr/local/bin/ipconf utility, as well as the SSH identities for the cell and for the root, celladmin, and cellmonitor users.

    • ILOM configurations for Exadata Storage Servers. Typically, ILOM configurations remain undamaged even in case of Exadata Storage Server software failures.

  • The rescue procedure does not examine or reconstruct data disks or data partitions on the system disks. If there is data corruption on the grid disks, then do not use the rescue procedure. Instead use the rescue procedure for Oracle Database and Oracle ASM.

After a successful rescue, you must reconfigure the cell, and if you had chosen to preserve the data, then import the cell disks. If you chose not to preserve the data, then you should create new cell disks, and grid disks.

This section contains the following topics:

See Also:

3.7.1 Performing Rescue Using the CELLBOOT USB Flash Drive

Using the CELLBOOT USB flash drive, perform the following procedure:

  1. Connect to the Exadata Storage Server using the console.

  2. Boot the Exadata Storage Server. You will see something like the following:

    Press any key to enter the menu
    Booting Exadata_DBM_0: CELL_USB_BOOT_trying_C0D0_as_HD1 in 4 seconds...
    Booting Exadata_DBM_0: CELL_USB_BOOT_trying_C0D0_as_HD1 in 3 seconds...
    Press any key to see the menu.
    

    Note that for older releases of the Exadata software, you may see the "Oracle Exadata" splash screen. If the splash screen appears, press any key on the keyboard. The splash screen remains visible for only 5 seconds.

  3. In the displayed list of boot options, scroll down to the last option, CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode, and press Enter.

  4. Select the rescue option, and proceed with the rescue.

  5. Do the following when prompted to restart the system or enter the shell at the end of the first phase of the rescue:

    1. Choose to enter the shell. Do not choose to restart the system.

    2. Log in to the shell using the rescue root password.

    3. Run the reboot command from the shell.

    4. Press F8 as the cell restarts and before the Oracle Exadata splash screen appears. Pressing F8 accesses the boot device selection menu.

    5. Select the RAID controller as the boot device. This causes the cell to boot from the hard disks.

Note:

Additional options may be available that allow you to enter a rescue mode Linux login shell with limited functionality. You can log in to the shell as the root user with the password supplied by Oracle Support Services to manually run additional diagnostics and repairs on the cell. For complete details, refer to the release notes or the latest documentation for this release available to you online or through your Oracle Support Services representative.

After a successful rescue, you must configure the cell. If the data partitions were preserved, then the cell disks were imported automatically during the rescue procedure.

  1. Re-create the cell disks and grid disks for any disks that were replaced during the rescue procedure.

  2. Check the status of the grid disks. If any grid disks are inactive, run the following command to activate them:

    CellCLI> ALTER GRIDDISK ALL ACTIVE
    
  3. Log in to the Oracle ASM instance, and set the disks to ONLINE using the following command for each disk group:

    SQL> ALTER DISKGROUP disk_group_name ONLINE DISKS IN FAILGROUP cell_name WAIT;
    

    Note:

    If the command fails because the disks were already force-dropped, then you must force-add the disks back to the Oracle ASM disk groups.

    Note:

    The grid disk attributes asmmodestatus and asmdeactivationoutcome will not report correctly until the ALTER DISKGROUP statement is complete.

  4. Reconfigure the cell using the ALTER CELL command. The following is an example for the most-common parameters:

    CellCLI> ALTER CELL
    smtpServer='my_mail.example.com', -
    smtpFromAddr='john.doe@example.com', -
    smtpPwd=email_address_password, -
    smtpToAddr='jane.smith@example.com', -
    notificationPolicy='critical,warning,clear', -
    notificationMethod='mail,snmp'
    
  5. Re-create the I/O Resource Management (IORM) plan.

  6. Re-create the metric thresholds.
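
Before returning the cell to service, you can confirm that every grid disk has come back online in Oracle ASM. The helper below is a sketch only: it parses a two-column listing of the form produced by the cellcli command shown in its comment, read from standard input, so that invocation is an assumption about your environment.

```shell
#!/bin/sh
# Sketch: report whether every grid disk shows asmmodestatus=ONLINE.
# Intended use on a storage cell (assumed invocation):
#   cellcli -e "LIST GRIDDISK ATTRIBUTES name,asmmodestatus" | all_online
# Reads the name/status listing from stdin and exits non-zero if any
# disk is not ONLINE (for example SYNCING or OFFLINE).
all_online() {
  awk 'NF >= 2 && $2 != "ONLINE" { bad++ } END { exit (bad > 0) }'
}
```

On a live cell you would pipe the cellcli output into all_online and repeat the check until it succeeds.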

See Also:

3.7.1.1 Configuring Oracle Exadata Database Machine Eighth Rack Storage Server After Rescue

In Oracle Exadata Storage Server Software release 11.2.3.3 and later, no extra steps are needed after cell rescue. Use the following procedure to verify the configuration:

  1. Copy the /opt/oracle.SupportTools/resourcecontrol utility from another storage server to the /opt/oracle.SupportTools directory on the recovered server.
  2. Ensure proper permissions are set on the utility using the following command:
    # chmod 740 /opt/oracle.SupportTools/resourcecontrol
    
  3. Verify the current configuration using the following command. Sample output from the command follows.
    # /opt/oracle.SupportTools/resourcecontrol -show
    
    Validated hardware and OS. Proceed.
    Number of cores active: 6
    Number of harddisks active: 6
    Number of flashdisks active: 8
    

    For an eighth rack configuration, six cores, six hard disks, and eight flash disks should be enabled. If other values are shown, then use the following command to enable the eighth rack configuration:

    CellCLI> ALTER CELL eighthRack=true
    

3.7.2 Re-creating a Damaged CELLBOOT USB Flash Drive

If the CELLBOOT USB flash drive is lost or damaged, then you can create a new one using the following procedure:

Note:

Creating a USB flash drive for a machine running Oracle Exadata Storage Server Software release 12.1.2.1.0 or later requires a machine running Oracle Linux 6.

  1. Log in to the cell as the root user.
  2. Attach a new USB flash drive. The flash drive must have a capacity between 1 GB and 8 GB.
  3. Remove any other USB flash drives from the system.
  4. Run the following commands:
    cd /opt/oracle.SupportTools
    ./make_cellboot_usb -verbose -force
    

3.8 Changing Existing Elastic Configurations for Storage Cells

This section describes how to perform the following changes to existing elastic configurations.

3.8.1 Adding a Cell Node

In this scenario, you want to add a new cell to an existing Exadata storage grid that includes disk groups.

  1. If this is a brand new cell, perform these steps:
    1. Complete all necessary cabling requirements to make the new storage cell available to the desired storage grid.

      Refer to the Oracle Exadata Database Machine Installation and Configuration Guide.

    2. Image the cell with the appropriate Oracle Exadata Storage Server Software image and provide appropriate input when prompted for the IP addresses.
    3. Skip to step 3.
  2. If this is an existing cell in the rack and you are allocating it to another cluster within the InfiniBand network, note the IP addresses assigned to the InfiniBand interfaces (ib0 and ib1) of the cell being added.

    Perform these steps on any database server in the cluster:

    1. cd /etc/oracle/cell/network-config
    2. cp cellip.ora cellip.ora.orig
    3. cp cellip.ora cellip.ora-bak
    4. Add entries for the InfiniBand IP addresses of the new cell to /etc/oracle/cell/network-config/cellip.ora-bak.
    5. Copy the edited file to the cellip.ora file on all database nodes using the following command, where database_nodes refers to a file containing the names of each database server in the cluster, with each name on a separate line:
      /usr/local/bin/dcli -g database_nodes -l root -f cellip.ora-bak -d /etc/oracle/cell/network-config/cellip.ora
      
  3. Add the IP addresses from the step above to the /etc/oracle/cell/network-config/cellip.ora file of every Oracle RAC node.
  4. If Auto Service Request (ASR) alerting was set up on the existing storage cells, configure cell ASR alerting for the cell being added.
    1. From any existing storage grid cell, list the cell attributes required for configuring cell ASR alerting.
      cellcli -e LIST CELL ATTRIBUTES snmpsubscriber
      
    2. Apply the same SNMP values to the new cell by running the command below as the celladmin user, replacing snmpsubscriber with the value from the previous command.
      cellcli -e "ALTER CELL snmpsubscriber=snmpsubscriber"
      
  5. Configure cell alerting for the cell being added.
    1. From any existing storage server, list the cell attributes required for configuring cell alerting.
      cellcli -e LIST CELL ATTRIBUTES
       notificationMethod,notificationPolicy,smtpToAddr,smtpFrom,
       smtpFromAddr,smtpServer,smtpUseSSL,smtpPort
      
    2. Apply the same values to the new cell by running the command below as the celladmin user, substituting the placeholders with the values found from the existing cell.
      cellcli -e "ALTER CELL
       notificationMethod='notificationMethod',
       notificationPolicy='notificationPolicy',
       smtpToAddr='smtpToAddr',
       smtpFrom='smtpFrom',
       smtpFromAddr='smtpFromAddr',
       smtpServer='smtpServer',
       smtpUseSSL=smtpUseSSL,
       smtpPort=smtpPort"
      
  6. Create cell disks on the cell being added.
    1. Log in to the cell as celladmin and run the following command:
      cellcli -e CREATE CELLDISK ALL
      
    2. Check that the flash log was created by default.
      cellcli -e LIST FLASHLOG
      

      You should see the name of the flash log. It should look like cellnodename_FLASHLOG, and its status should be normal.

      If the flash log does not exist, create it.

      cellcli -e CREATE FLASHLOG ALL
      
    3. Check the current flash cache mode and compare it to the flash cache mode on existing cells.
      cellcli -e LIST CELL ATTRIBUTES flashcachemode
      

      To change the flash cache mode to match the flash cache mode of existing cells, do the following:

      1. If the flash cache exists and the cell is in WriteBack flash cache mode, you must first flush the flash cache.

        cellcli -e ALTER FLASHCACHE ALL FLUSH
        

        Wait for the command to return.

      2. Drop the flash cache.

        cellcli -e DROP FLASHCACHE ALL
        
      3. Change the flash cache mode.

        The value of the flashCacheMode attribute is either writeback or writethrough. The value has to match the flash cache mode of the other storage cells in the cluster.

        cellcli -e "ALTER CELL flashCacheMode=writeback_or_writethrough"
        
      4. Create the flash cache.

        cellcli -e CREATE FLASHCACHE ALL
        
  7. Create the grid disks on the cell being added.
    1. Query the size and cachingpolicy attributes of the existing grid disks from an existing cell.
      CellCLI> LIST GRIDDISK ATTRIBUTES name,asmDiskGroupName,cachingpolicy,size,offset
      
    2. For each disk group found by the above command, create grid disks on the new cell that is being added to the cluster.

      Match the size and the cachingpolicy attributes of the existing grid disks for the particular disk group reported by the command above. Grid disks should be created in the order of increasing offset to ensure similar layout and performance characteristics as the existing cells. For example, the LIST GRIDDISK command could return something like this:

      DATAC1          default         2.15625T        32M
      DBFS_DG         default         33.796875G      2.695465087890625T
      RECOC1          none            552.109375G     2.1562957763671875T
      

      When creating grid disks, begin with DATAC1, then RECOC1, and finally DBFS_DG using the following command:

      cellcli -e CREATE GRIDDISK ALL HARDDISK
       PREFIX=matching_prefix_of_the_corresponding_existing_diskgroup,
       size=size_followed_by_G_or_T,
       cachingPolicy=\'value_from_command_above_for_this_disk_group\',
       comment =\"Cluster cluster_name diskgroup diskgroup_name\"
      

      Caution:

      Be sure to specify the exact size shown, along with the unit (either T or G).
  8. Log in to each Oracle RAC node and verify that the newly created grid disks are visible from the Oracle RAC nodes.

    In the following example, Grid_home refers to the directory in which the Oracle Grid Infrastructure software is installed.

    $Grid_home/bin/kfod op=disks disks=all | grep cellName_being_added
    

    The kfod command should list all the grid disks created in step 7 above.

  9. Add the newly created grid disks to the respective existing ASM disk groups.

    In this example, comma_separated_disk_names refers to the names of the grid disks created in step 7 that correspond to disk_group_name.

    SQL> ALTER DISKGROUP disk_group_name ADD DISK 'comma_separated_disk_names';
    

    This command initiates an Oracle ASM rebalance at the default power level.

  10. Monitor the progress of the rebalance by querying GV$ASM_OPERATION.
    SQL> SELECT * FROM GV$ASM_OPERATION;
    

    When the rebalance completes, the addition of the cell to the Oracle RAC cluster is complete.

  11. Download and run the latest version of Oracle EXAchk to ensure that the resulting configuration implements the latest best practices for Oracle Exadata.
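
The ordering rule in step 7 (create grid disks in order of increasing offset) can be automated from a saved listing. The sketch below sorts rows by the offset column; the four-column input layout mirrors the sample LIST GRIDDISK output shown above and is an assumption about your saved file:

```shell
#!/bin/sh
# Sketch: sort disk-group rows by starting offset (lowest first) so grid
# disks are created in the same order as on the existing cells.
# Expected input columns: name cachingPolicy size offset (e.g. 32M, 2.15T).
order_by_offset() {
  awk '
    function bytes(v,    n, u) {   # convert 32M / 33.79G / 2.15T to a byte count
      n = v + 0
      u = substr(v, length(v), 1)
      if (u == "M") return n * 2^20
      if (u == "G") return n * 2^30
      if (u == "T") return n * 2^40
      return n
    }
    NF >= 4 { printf "%.0f %s\n", bytes($4), $0 }
  ' | sort -n | cut -d" " -f2-
}
```

Feeding the sample listing from step 7 through this helper yields DATAC1 first, then RECOC1, then DBFS_DG, matching the creation order stated in the procedure.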

3.8.2 Expanding an Existing Exadata Storage Grid

In this scenario, an Exadata rack contains a storage cell that is allocated to one Exadata storage grid, and you want to reallocate that cell to another storage grid in order to expand it.

  1. Decommission the storage cell from its current cluster. To do this, follow the procedure in "Dropping a Storage Cell from an Existing Disk Group or Storage Grid".
  2. Add the storage cell to the desired Exadata storage grid. To do this, follow the procedure in "Adding a Cell Node".
  3. Download and run the latest version of Oracle EXAchk to ensure that the resulting configuration implements the latest best practices for Oracle Exadata.

3.8.3 Dropping a Storage Cell from an Existing Disk Group or Storage Grid

You can remove a storage cell from an existing Oracle Exadata storage grid.

  1. Drop the disks belonging to the cell to be removed from Oracle ASM.

    Note:

    For Oracle Exadata Oracle VM deployments, the substeps below need to be executed from all the Oracle VM clusters.

    1. Log in to any node in the cluster.
    2. Query the list of grid disks being used by the cluster for the targeted Exadata cell.
      Grid_home/bin/asmcmd lsdsk --suppressheader | grep cellName_being_removed | awk -F'/' '{print $NF}'
      

      Note:

      Make sure the available free space in every disk group that contains disks from the storage cell being removed is at least 15% of the allocated storage for that disk group.

    3. Drop the Oracle ASM disks returned by the command above from their respective disk groups.
      SQL> ALTER DISKGROUP diskgroup_name DROP DISKS IN FAILGROUP cellName_being_removed;
      
    4. The disk drop operation above kicks off a rebalance operation at the default power level. Monitor for the rebalance using the following command:
      SQL> SELECT * FROM gv$asm_operation;
      

      Wait until the rebalance completes, that is, wait until gv$asm_operation returns no rows.

    5. Verify that none of the disk groups retains any references to the disks from the cell being removed.
      SQL> SELECT path, name, header_status, mode_status, mount_status, state,
       failgroup FROM v$asm_disk ORDER BY path;
      

      The header_status column for all the disks belonging to the cell being removed should show FORMER.

      Reminder:

      For Exadata Oracle VM deployments, the substeps above need to be executed from all the Oracle VM clusters.

  2. Clean up the cell being removed.

    Log in to the cell as the celladmin user, and run the following commands. Run the DROP GRIDDISK command once for each grid disk prefix:

    1. Drop the grid disk.
      cellcli -e drop griddisk all prefix=prefix_of_the_grid_disk
      
    2. If the flash cache exists and the cell is in WriteBack flash cache mode, then you must flush the flash cache before dropping it.
      cellcli -e alter flashcache all flush
      

      Wait for the command to return.

    3. Drop the flash cache.
      cellcli -e drop flashcache all
      
    4. Drop the cell disks.
      cellcli -e drop celldisk all
      

      If you need to erase data securely, you can run the DROP CELLDISK command with the ERASE option, or the DROP CELL command with the ERASE option.

      The time required to complete the erase operation is listed in the table under the DROP CELL command.

  3. Remove the entry of the cell being removed from /etc/oracle/cell/network-config/cellip.ora on all the database server nodes in the cluster.
    Run the following steps on any database server node in the cluster:
    1. cd /etc/oracle/cell/network-config
    2. cp cellip.ora cellip.ora.orig
    3. cp cellip.ora cellip.ora-bak
    4. Remove the entries for the cell being removed from /etc/oracle/cell/network-config/cellip.ora-bak.
    5. /usr/local/bin/dcli -g database_nodes -l root -f cellip.ora-bak -d /etc/oracle/cell/network-config/cellip.ora
      where database_nodes refers to a file containing the names of each database server in the cluster. Each name is on a separate line.
  4. Download and run the latest version of Oracle EXAchk to ensure that the resulting configuration implements the latest best practices for Oracle Exadata.
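
The cellip.ora edit in step 3 can be scripted. The sketch below filters a working copy of the file; the example IP address and the cell="ib0_ip;ib1_ip" line format are assumptions, so check them against your own cellip.ora before use:

```shell
#!/bin/sh
# Sketch: produce a cellip.ora copy without the cell being removed.
# $1 is an InfiniBand IP address of that cell (example: 192.168.10.25);
# the file is read on stdin and the filtered copy written to stdout.
remove_cell_entry() {
  grep -vF "$1"
}

# Typical flow (paths as in the procedure above):
#   cd /etc/oracle/cell/network-config
#   cp cellip.ora cellip.ora.orig
#   remove_cell_entry 192.168.10.25 < cellip.ora > cellip.ora-bak
#   /usr/local/bin/dcli -g database_nodes -l root -f cellip.ora-bak \
#     -d /etc/oracle/cell/network-config/cellip.ora
```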