2 Maintaining Exadata Database Servers

Note:

For ease of reading, the name "Oracle Exadata Rack" is used when information refers to both Oracle Exadata Database Machine and Oracle Exadata Storage Expansion Rack.

2.1 Management Server on Database Servers

Management Server (MS) running on database servers provides monitoring, alerting, and other administrative capabilities. It also provides the DBMCLI command-line administration tool.

2.2 Maintaining the Local Storage on Exadata Database Servers

Repair of the local drives does not require an Oracle Exadata Database Machine database server to be shut down.

No downtime of the rack is required; however, the individual server may require downtime while it is temporarily taken out of the cluster.

2.2.1 Verifying the Database Server Configuration

Oracle recommends verifying the status of the database server RAID devices to avoid a possible performance impact or outage.

The impact of validating the RAID devices is minimal. The impact of corrective actions will vary depending on the specific issue uncovered, and may range from simple reconfiguration to an outage.

2.2.1.1 About the RAID Storage Configuration

The local storage drives are configured in a RAID configuration.

Table 2-1 Disk Configurations for Exadata Database Machine Two-Socket Systems

Server Type                             RAID Controller        Disk Configuration
Oracle Exadata Database Machine X8M-2   MegaRAID SAS 9361-16i  4 disk drives in a RAID-5 configuration on each database server
Oracle Exadata Database Machine X8-2    MegaRAID SAS 9361-16i  4 disk drives in a RAID-5 configuration on each database server
Oracle Exadata Database Machine X7-2    MegaRAID SAS 9361-16i  4 disk drives in a RAID-5 configuration on each database server
Oracle Exadata Database Machine X6-2    MegaRAID SAS 9361-8i   4 disk drives in a RAID-5 configuration on each database server
Oracle Exadata Database Machine X5-2    MegaRAID SAS 9361-8i   4 disk drives in a RAID-5 configuration on each database server
Oracle Exadata Database Machine X4-2    MegaRAID SAS 9261-8i   4 disk drives in a RAID-5 configuration on each database server
Oracle Exadata Database Machine X3-2    MegaRAID SAS 9261-8i   4 disk drives in a RAID-5 configuration on each database server
Oracle Exadata Database Machine X2-2    MegaRAID SAS 9261-8i   4 disk drives in a RAID-5 configuration on each database server

Table 2-2 Disk Configurations for Exadata Database Machine Eight-Socket Systems

Server Type                             RAID Controller        Disk Configuration
Oracle Exadata Database Machine X8M-8   N/A                    Two mirrored (RAID-1) NVMe flash drives in each database server
Oracle Exadata Database Machine X8-8    N/A                    Two mirrored (RAID-1) NVMe flash drives in each database server
Oracle Exadata Database Machine X7-8    N/A                    Two mirrored (RAID-1) NVMe flash drives in each database server
Oracle Exadata Database Machine X5-8    MegaRAID SAS 9361-8i   8 disk drives in each database server with one virtual drive created across the RAID-5 set
Oracle Exadata Database Machine X4-8    MegaRAID SAS 9261-8i   7 disk drives in each database server configured as one 6-disk RAID-5 with one global hot spare drive by default
Oracle Exadata Database Machine X3-8    MegaRAID SAS 9261-8i   8 disk drives in each database server with one virtual drive created across the RAID-5 set
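As background for the configurations listed above, a RAID-5 set of N drives provides the usable space of N - 1 drives, because one drive's worth of space is consumed by parity. A minimal sketch of the arithmetic (illustrative only, not an Oracle utility):

```python
def raid5_usable_gb(num_drives: int, drive_gb: float) -> float:
    """Usable capacity of a RAID-5 set: one drive's worth of space holds parity."""
    if num_drives < 3:
        raise ValueError("RAID-5 requires at least 3 drives")
    return (num_drives - 1) * drive_gb

# 4 x 278.464 GB drives (the coerced drive size shown later in this chapter)
# yield roughly 835.39 GB, consistent with the X2-2 virtual drive size
print(raid5_usable_gb(4, 278.464))
```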

2.2.1.2 Verifying Disk Controller Configuration on Oracle Exadata Database Machine X7-8 or Later Systems
  • Query mdstat to view the database server disk controller configuration on Oracle Exadata Database Machine X7-8 or later systems.
    [root@dbnode01adm01 ~]# cat /proc/mdstat 
    Personalities : [raid1] 
    md34 : active raid1 nvme3n1[1] nvme1n1[0]
          3125613568 blocks super external:/md126/0 [2/2] [UU]
          
    md24 : active raid1 nvme2n1[1] nvme0n1[0]
          262144000 blocks super external:/md127/0 [2/2] [UU]
          
    md25 : active raid1 nvme2n1[1] nvme0n1[0]
          2863467520 blocks super external:/md127/1 [2/2] [UU]
          
    md126 : inactive nvme3n1[1](S) nvme1n1[0](S)
          6306 blocks super external:imsm
           
    md127 : inactive nvme2n1[1](S) nvme0n1[0](S)
          6306 blocks super external:imsm
           
    unused devices: <none> 

If the output you see is different, then investigate and correct the problem. Degraded virtual drives usually indicate absent or failed physical disks. A state of [1/2] with [U_] or [_U] indicates that one of the NVMe disks is down. Failed disks should be replaced quickly.

2.2.1.3 Verifying Disk Controller Configuration on Oracle Exadata Database Machine X6-8 and Earlier

For Oracle Exadata Database Machine X4-2, Oracle Exadata Database Machine X3-2, and Oracle Exadata Database Machine X2-2, the expected output is one virtual drive, none degraded or offline, five physical devices (one controller and four disks), four disks, and no critical or failed disks.

For Oracle Exadata Database Machine X3-8 Full Rack and Oracle Exadata Database Machine X2-8 Full Rack, the expected output is one virtual drive, none degraded or offline, 11 physical devices (one controller, two SAS2 expansion ports, and eight disks), eight disks, and no critical or failed disks.

If your output is different, then investigate and correct the problem. Degraded virtual drives usually indicate absent or failed physical disks. Critical disks should be replaced immediately to avoid the risk of data loss if the number of failed disks in the node exceeds the count needed to sustain system operations. Failed disks should also be replaced quickly.

Note:

If additional virtual drives or a hot spare are present, then it may be that the procedure to reclaim disks was not performed at deployment time or that a bare metal restore procedure was performed without using the dualboot=no qualifier. Contact Oracle Support Services and reference My Oracle Support note 1323309.1 for additional information and corrective steps.

When upgrading a database server that has a hot spare to Oracle Exadata System Software release 11.2.3.2.0 or later, the hot spare is removed and added as an active drive to the RAID configuration. The database servers have the same availability in terms of RAID-5 redundancy and can survive the loss of one drive. When a drive fails, Oracle Auto Service Request (ASR) sends a notification to replace the faulty drive at the earliest opportunity.

  • Use the following command to verify the database server disk controller configuration on all systems prior to Oracle Exadata Database Machine X7-8:
    if [[ -d /proc/xen && ! -f /proc/xen/capabilities ]]
    then
      echo -e "\nThis check will not run in a user domain of a virtualized environment.  Execute this check in the management domain.\n"
    else
      if [ -x /opt/MegaRAID/storcli/storcli64 ]
      then
        export CMD=/opt/MegaRAID/storcli/storcli64
      else
        export CMD=/opt/MegaRAID/MegaCli/MegaCli64
      fi
      RAW_OUTPUT=$($CMD AdpAllInfo -aALL -nolog | grep "Device Present" -A 8);
      echo -e "The database server disk controller configuration found is:\n\n$RAW_OUTPUT";
    fi;

    Note:

    This check is not applicable to Oracle Exadata Database Machine X7-8 or later database servers because they do not have any conventional disk drives.

Example 2-1 Checking the disk controller configuration for Oracle Exadata Database Machine 2-socket system (X2-2 or later) without the disk expansion kit

The following is an example of the output from the command for Oracle Exadata Database Machine 2-socket system (X2-2 or later) without the disk expansion kit.

                Device Present
                ================
Virtual Drives    : 1 
  Degraded        : 0 
  Offline         : 0 
Physical Devices  : 5 
  Disks           : 4 
  Critical Disks  : 0 
  Failed Disks    : 0 

Example 2-2 Checking the disk controller configuration for Oracle Exadata Database Machine X4-8 Full Rack

The following is an example of the output from the command for Oracle Exadata Database Machine X4-8 Full Rack:

                Device Present
                ================
Virtual Drives    : 1
  Degraded        : 0
  Offline         : 0
Physical Devices  : 8
  Disks           : 7
  Critical Disks  : 0
  Failed Disks    : 0

Example 2-3 Checking the disk controller configuration for Oracle Exadata Database Machine X5-8 or X6-8 Full Rack

The following is an example of the output from the command for Oracle Exadata Database Machine X5-8 or X6-8 Full Rack:

                Device Present
                ================
Virtual Drives   : 1
  Degraded       : 0
  Offline        : 0
Physical Devices : 9
  Disks          : 8
  Critical Disks : 0
  Failed Disks   : 0
2.2.1.4 Verifying Virtual Drive Configuration

Use the following command to verify the virtual drive configuration:

Note:

If you are running Oracle Exadata System Software 19.1.0 or later, substitute /opt/MegaRAID/storcli/storcli64 for /opt/MegaRAID/MegaCli/MegaCli64 in the following commands:
/opt/MegaRAID/MegaCli/MegaCli64 CfgDsply -aALL | grep "Virtual Drive:";    \
/opt/MegaRAID/MegaCli/MegaCli64 CfgDsply -aALL | grep "Number Of Drives";  \
/opt/MegaRAID/MegaCli/MegaCli64 CfgDsply -aALL | grep "^State" 

The following is an example of the output for Oracle Exadata Database Machine X4-2, Oracle Exadata Database Machine X3-2, and Oracle Exadata Database Machine X2-2. Virtual device 0 should have four drives, and the state should be Optimal.

Virtual Drive                 : 0 (Target Id: 0)
Number Of Drives              : 4
State                         : Optimal

The expected output for Oracle Exadata Database Machine X3-8 Full Rack and Oracle Exadata Database Machine X2-8 Full Rack shows that the virtual device has eight drives and a state of Optimal.

Note:

If a disk replacement was performed on a database server without using the dualboot=no option, then the database server may have three virtual devices. Contact Oracle Support and reference My Oracle Support note 1323309.1 for additional information and corrective steps.

2.2.1.5 Verifying Physical Drive Configuration

Check your system for critical, degraded, or failed disks.

Use the following command to verify the database server physical drive configuration:

Note:

If you are running Oracle Exadata System Software 19.1.0 or later, substitute /opt/MegaRAID/storcli/storcli64 for /opt/MegaRAID/MegaCli/MegaCli64 in the following commands:
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | grep "Firmware state"

The following is an example of the output for Oracle Exadata Database Machine X4-2, Oracle Exadata Database Machine X3-2, and Oracle Exadata Database Machine X2-2:

Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up

The drives should show a state of Online, Spun Up. The order of the output is not important. The output for Oracle Exadata Database Machine X3-8 Full Rack or Oracle Exadata Database Machine X2-8 Full Rack should be eight lines of output showing a state of Online, Spun Up.

If your output is different, then investigate and correct the problem.

Degraded virtual drives usually indicate absent or failed physical disks. Critical disks should be replaced immediately to avoid the risk of data loss if the number of failed disks in the node exceeds the count needed to sustain system operations. Failed disks should be replaced quickly.
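To automate the check above, the "Firmware state" lines can be tallied; all drives are healthy when every line reads Online, Spun Up. A sketch (the helper name is hypothetical):

```python
from collections import Counter

def firmware_state_summary(pdlist_output: str) -> Counter:
    """Tally the 'Firmware state' lines from MegaCli64 -PDList output."""
    states = Counter()
    for line in pdlist_output.splitlines():
        if line.startswith("Firmware state:"):
            states[line.split(":", 1)[1].strip()] += 1
    return states
```

Any key other than "Online, Spun Up" in the returned counter calls for investigation.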

2.2.2 Monitoring a Database Server RAID Set Rebuilding

If a drive in a database server RAID set is replaced, then the progress of the RAID set rebuild should be monitored.

Use the following command on the database server that has the replaced disk. The command is run as the root user.

Note:

If you are running Oracle Exadata System Software 19.1.0 or later, substitute /opt/MegaRAID/storcli/storcli64 for /opt/MegaRAID/MegaCli/MegaCli64 in the following commands:
/opt/MegaRAID/MegaCli/MegaCli64 -pdrbld -showprog -physdrv \
[disk_enclosure:slot_number] -a0

In the preceding command, disk_enclosure and slot_number indicate the replacement disk identified by the MegaCli64 -PDList command. The following is an example of the output from the command:

Rebuild Progress on Device at Enclosure 252, Slot 2 Completed 41% in 13 Minutes.
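Assuming the rebuild proceeds roughly linearly, the progress line above can be extrapolated to an estimated total duration. A small illustrative calculation (not an Oracle utility):

```python
def estimated_total_minutes(percent_done: float, minutes_elapsed: float) -> float:
    """Linear extrapolation of total rebuild time from a progress report."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    return minutes_elapsed * 100.0 / percent_done

# 'Completed 41% in 13 Minutes' extrapolates to roughly 32 minutes total
print(estimated_total_minutes(41, 13))
```

Actual rebuild rates vary with I/O load, so treat this as a rough guide only.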

2.2.3 Reclaiming a Hot Spare Drive After Upgrading to Oracle Exadata System Software Release 12.1.2.1.0 or Later

Oracle Exadata Database Machines upgraded to Oracle Exadata System Software release 12.1.2.1.0 or later that have a hot spare drive cannot use the reclaimdisks.sh script to reclaim the drive. The following procedure describes how to manually reclaim the drive:

Note:

During the procedure, the database server is restarted twice. The steps in the procedure assume that the Oracle Grid Infrastructure restart is disabled after the server restart.

The sample output shows an Oracle Exadata Database Machine X2-2 database server with four disks. The enclosure identifier, slot number, and so on may differ on your system.

Note:

If you are running Oracle Exadata System Software 19.1.0 or later, substitute the string /opt/MegaRAID/storcli/storcli64 for /opt/MegaRAID/MegaCli/MegaCli64 in the following commands:
  1. Identify the hot spare drive.
    # /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL
    

    The following is an example of the output from the command for the hot spare drive:

    ...
    Enclosure Device ID: 252
    Slot Number: 3
    Enclosure position: N/A
    Device Id: 8
    WWN: 5000CCA00A9FAA5F
    Sequence Number: 2
    Media Error Count: 0
    Other Error Count: 0
    Predictive Failure Count: 0
    Last Predictive Failure Event Seq Number: 0
    PD Type: SAS
    Hotspare Information:
    Type: Global, with enclosure affinity, is revertible
     
    Raw Size: 279.396 GB [0x22ecb25c Sectors]
    Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
    Coerced Size: 278.464 GB [0x22cee000 Sectors]
    Sector Size: 0
    Logical Sector Size: 0
    Physical Sector Size: 0
    Firmware state: Hotspare, Spun down
    Device Firmware Level: A2A8
    Shield Counter: 0
    Successful diagnostics completion on : N/A
    ...
    

    The command identified the hot spare drive on enclosure identifier 252, slot 3.

  2. Obtain the virtual drive information.
    # /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -Aall
    

    The following is an example of the output from the command:

    Adapter 0 -- Virtual Drive Information:
    Virtual Drive: 0 (Target Id: 0)
    Name :DBSYS
    RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
    Size : 556.929 GB
    Sector Size : 512
    Is VD emulated : No
    Parity Size : 278.464 GB
    State : Optimal
    Strip Size : 1.0 MB
    Number Of Drives : 3
    Span Depth : 1
    Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
    Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
    Default Access Policy: Read/Write
    Current Access Policy: Read/Write
    Disk Cache Policy : Disabled
    Encryption Type : None
    Is VD Cached: No
    

    The command identified a RAID 5 configuration for virtual drive 0 on adapter 0.

  3. Remove the hot spare drive.
    # /opt/MegaRAID/MegaCli/MegaCli64 -PDHSP -Rmv -PhysDrv[252:3] -a0
    
  4. Add the drive as an active RAID 5 drive.
    # /opt/MegaRAID/MegaCli/MegaCli64 -LDRecon -Start -r5     \
      -Add -PhysDrv[252:3] -L0 -a0
    
    Start Reconstruction of Virtual Drive Success.
    Exit Code: 0x00
    

    Note:

    If the message Failed to Start Reconstruction of Virtual Drive is displayed, then follow the instructions in My Oracle Support note 1505157.1.

  5. Monitor the progress of the RAID reconstruction.
    # /opt/MegaRAID/MegaCli/MegaCli64 -LDRecon -ShowProg -L0 -a0
    
    Reconstruction on VD #0 (target id #0) Completed 1% in 2 Minutes.
    

    The following is the output of the command after the hot spare drive is added to the RAID 5 configuration and the reconstruction is finished:

    Reconstruction on VD #0 is not in Progress.
    
  6. Verify the number of drives.
    # /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -Aall
    

    The following is an example of the output from the command:

    Adapter 0 -- Virtual Drive Information:
    Virtual Drive: 0 (Target Id: 0)
    Name :DBSYS
    RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
    Size : 835.394 GB
    Sector Size : 512
    Is VD emulated : No
    Parity Size : 278.464 GB
    State : Optimal
    Strip Size : 1.0 MB
    Number Of Drives : 4
    Span Depth : 1
    Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
    Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
    Default Access Policy: Read/Write
    Current Access Policy: Read/Write
    Disk Cache Policy : Disabled
    Encryption Type : None
    Is VD Cached: No
    
  7. Check the size of the RAID.
    # parted /dev/sda print
    
    Model: LSI MR9261-8i (scsi)
    Disk /dev/sda: 598GB
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
     
    Number Start End Size Type File system Flags
    1 32.3kB 132MB 132MB primary ext3 boot
    2 132MB 598GB 598GB primary lvm 
    
  8. Restart the server in order for the changes to take effect.
  9. Check the size of the RAID again.
    # parted /dev/sda print
    
    Model: LSI MR9261-8i (scsi)
    Disk /dev/sda: 897GB
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
     
    Number Start End Size Type File system Flags
    1 32.3kB 132MB 132MB primary ext3 boot
    2 132MB 598GB 598GB primary lvm
    

    The increased RAID size allows for extending the volume group. To extend the volume group, you must add an additional partition to the drive.

  10. Obtain the new size, in sectors.
    # parted /dev/sda
    
    GNU Parted 2.1
    Using /dev/sda
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) unit s
    (parted) print
    Model: LSI MR9261-8i (scsi)
    Disk /dev/sda: 1751949312s
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
     
    Number Start End Size Type File system Flags
    1 63s 257039s 256977s primary ext3 boot
    2 257040s 1167957629s 1167700590s primary lvm
    

    In the preceding example, a third partition can be created starting at sector 1167957630, and ending at the end of the disk at sector 1751949311.
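The bounds of the new partition follow directly from the parted output in sector units: the new partition starts one sector after the end of the last partition and ends at the last usable sector, which is the disk's sector count minus one. A sketch of the arithmetic (illustrative only):

```python
def new_partition_bounds(disk_sectors: int, last_partition_end: int) -> tuple:
    """Start and end sectors for a partition filling the free space after the
    last existing partition. parted reports the disk size as a sector count,
    so the last usable sector is disk_sectors - 1."""
    start = last_partition_end + 1
    end = disk_sectors - 1
    if start >= end:
        raise ValueError("no free space after the last partition")
    return start, end

# Matches the example: a 1751949312-sector disk, partition 2 ending at 1167957629
print(new_partition_bounds(1751949312, 1167957629))  # (1167957630, 1751949311)
```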

  11. Create an additional partition on the drive.
    # parted /dev/sda
    
    GNU Parted 2.1
    Using /dev/sda
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) unit s
     
    (parted) mkpart
     
    Partition type? primary/extended? primary
    File system type? [ext2]? ext2 
    Start? 1167957630
    End? 1751949311
    Warning: The resulting partition is not properly aligned for best performance.
    Ignore/Cancel? Ignore
    Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy). As a
    result, it may not reflect all of your changes until after reboot.
    (parted)
     
    (parted) print
    Model: LSI MR9261-8i (scsi)
    Disk /dev/sda: 1751949312s
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
     
    Number Start End Size Type File system Flags
    1 63s 257039s 256977s primary ext3 boot
    2 257040s 1167957629s 1167700590s primary lvm
    3 1167957630s 1751949311s 583991682s primary
     
    (parted) set 3 lvm on 
     
    Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy). As a
    result, it may not reflect all of your changes until after reboot.
    (parted) print
    Model: LSI MR9261-8i (scsi)
    Disk /dev/sda: 1751949312s
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
     
    Number Start End Size Type File system Flags
    1 63s 257039s 256977s primary ext3 boot
    2 257040s 1167957629s 1167700590s primary lvm
    3 1167957630s 1751949311s 583991682s primary lvm
    
  12. Restart the database server.
  13. Create the physical volume.
    # pvcreate /dev/partition_name
    
  14. Add the physical volume to the existing volume group.

    In the following example, substitute the actual names for the volume_group, partition_name, and volume_name.

    # vgextend volume_group /dev/partition_name
     
    Volume group "volume_name" successfully extended 
    
  15. Resize the logical volume and file systems as described in "Extending LVM Partitions."

2.2.4 Understanding Automated File Deletion Policy

Management Server (MS) includes a file deletion policy for the / (root) directory on the database servers that is triggered when file system utilization is high. Deletion of files is triggered when file system utilization reaches 80 percent, and an alert is sent before the deletion begins. The alert includes the name of the directory and space usage for the subdirectories. The deletion policy is as follows:

Files in the following directories are deleted using a policy based on the file modification time stamp.

  • /opt/oracle/dbserver/log
  • /opt/oracle/dbserver/dbms/deploy/config/metrics
  • /opt/oracle/dbserver/dbms/deploy/log

Files that are older than the number of days set by the metricHistoryDays attribute are deleted first, then successive deletions occur for earlier files, down to files with modification time stamps older than or equal to 10 minutes, or until file system utilization is less than 75 percent. The metricHistoryDays attribute applies to files in /opt/oracle/dbserver/dbms/deploy/config/metrics. For the other log and trace files, use the diagHistoryDays attribute.

Starting with Oracle Exadata System Software release 12.1.2.2.0, the maximum amount of space for ms-odl.trc and ms-odl.log files is 100 MB (twenty 5 MB files) for *.trc files and 100 MB (twenty 5 MB files) for *.log files. Previously, it was 50 MB (ten 5 MB files) for both *.trc and *.log files.

The ms-odl generation files are renamed when they reach 5 MB, and the oldest files are deleted when the files consume 100 MB of space.
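The age-ordered policy described above (delete oldest files first, never touching files modified within the last 10 minutes, until utilization drops below 75 percent) can be sketched as follows. This is an illustrative model only, not the actual MS implementation:

```python
import time

def files_to_delete(files, used_bytes, capacity_bytes, target_pct=75.0):
    """Illustrative sketch of the deletion policy: `files` is a list of
    (path, mtime_epoch, size_bytes) tuples. Returns paths in deletion order,
    oldest first, stopping once utilization falls below target_pct or only
    files newer than 10 minutes remain."""
    cutoff = time.time() - 600  # 10-minute safety window
    doomed = []
    for path, mtime, size in sorted(files, key=lambda f: f[1]):  # oldest first
        if 100.0 * used_bytes / capacity_bytes < target_pct:
            break  # utilization target reached
        if mtime > cutoff:
            break  # remaining files are too recent to delete
        doomed.append(path)
        used_bytes -= size
    return doomed
```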

2.3 Maintaining Flash Disks on Exadata Database Servers

Flash disks should be monitored and replaced when necessary.

Starting with Exadata Database Machine X7-8, the database servers contain flash devices instead of hard disks. These flash devices can be replaced without shutting down the server.

2.3.1 Monitoring the Status of Flash Disks

You can monitor the status of a flash disk on the Exadata Database Machine by checking its attributes with the DBMCLI LIST PHYSICALDISK command.

For example, a flash disk with a status of failed has probably encountered a problem and needs to be replaced.

  • Use the DBMCLI command LIST PHYSICALDISK to determine the status of a flash disk:
    DBMCLI> LIST PHYSICALDISK WHERE disktype=flashdisk AND status!=normal DETAIL
             name:               FLASH_1_1
             deviceName:         /dev/nvme0n1
             diskType:           FlashDisk
             luns:               1_1
             makeModel:          "Oracle Flash Accelerator F640 PCIe Card"
             physicalFirmware:   QDV1RD09
             physicalInsertTime: 2017-08-11T12:25:00-07:00
             physicalSerial:     PHLE6514003R6P4BGN-1
             physicalSize:       2.910957656800747T
             slotNumber:         "PCI Slot: 1; FDOM: 1"
             status:             failed - dropped for replacement

The Exadata Database Server flash disk statuses are as follows:

  • normal

  • normal - dropped for replacement

  • failed

  • failed - dropped for replacement

  • failed - rejected due to incorrect disk model

  • failed - rejected due to incorrect disk model - dropped for replacement

  • failed - rejected due to wrong slot

  • failed - rejected due to wrong slot - dropped for replacement

  • warning - peer failure

  • warning - predictive failure, write-through caching

  • warning - predictive failure

  • warning - predictive failure - dropped for replacement

  • warning - write-through caching
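These statuses can be grouped by the action they call for. The following sketch classifies a status string; the grouping is illustrative, not an Oracle API, and the documented replacement procedures always take precedence:

```python
def flash_disk_action(status: str) -> str:
    """Map a physical disk status string to a coarse action category."""
    if "dropped for replacement" in status:
        return "replace now"   # disk is offline and ready for replacement
    if status.startswith("failed"):
        return "replace soon"
    if status.startswith("warning"):
        return "monitor"
    return "ok"                # normal
```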

2.3.2 Performing a Hot-Pluggable Replacement of a Flash Disk

For Oracle Exadata Database Machine X7-8 and X8-8 models, the database server uses hot-pluggable flash disks instead of hard disk drives.

  1. Determine if the flash disk is ready to be replaced.
    When performing a hot-pluggable replacement of a flash device on Oracle Exadata Database Machine X7-8 and X8-8 database servers, the disk status should be Dropped for replacement, which indicates the flash disk is ready for online replacement.
    DBMCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS LIKE '.*dropped 
    for replacement.*' DETAIL
    
             name:               FLASH_1_1
             deviceName:         /dev/nvme0n1
             diskType:           FlashDisk
             luns:               1_1
             makeModel:          "Oracle Flash Accelerator F640 PCIe Card"
             physicalFirmware:   QDV1RD09
             physicalInsertTime: 2017-08-11T12:25:00-07:00
             physicalSerial:     PHLE6514003R6P4BGN-1
             physicalSize:       2.910957656800747T
             slotNumber:         "PCI Slot: 1; FDOM: 1"
             status:             failed - dropped for replacement
    
  2. Locate the failed flash disk based on the PCI number and FDOM number.
    A white Locator LED is lit to help locate the affected database server. An amber Fault-Service Required LED is lit to identify the affected flash card.
  3. Make sure the DPCC OK LED is off on the card.

    Caution:

    Removing a card with the DPCC OK LED on could result in a system crash. If a failed disk has a status of Failed – dropped for replacement but the DPCC OK LED is still on, contact Oracle Support.
  4. Remove and replace the failed flash disk.
    1. Slide out the DPCC and replace the flash card inside.
    2. Slide the DPCC carrier back to the slot.
  5. Use a stylus to press both ATTN buttons on the front of the DPCC.
    • If only a single PCIe card is present, press only the corresponding ATTN button.
    • If you are not performing a hot-pluggable replacement, then this step is not necessary.

    The buttons alert the system to a request to bring the devices online. When the system acknowledges the request, it lights the DPCC OK LED indicators on the DPCC. If you do not press the ATTN buttons, then the flash disk will not be detected by the operating system.

2.4 Adding the Disk Expansion Kit to Database Servers

You can add local storage space to an Oracle Exadata Database Server by using a disk expansion kit.

Note the following restrictions and requirements:

  • The disk expansion kit is supported only on 2-socket Oracle Exadata Database Machine systems, X5-2 and later.

  • Oracle Exadata System Software release 12.1.2.3.0 or later is required.

  • If you are adding the disk expansion kit to an Oracle Exadata Database Machine X7-2 system, and you are using an Oracle Exadata System Software release before 18.1.11, then ensure that the following symbolic link is present on the database server before proceeding:

    # ls -l /opt/MegaRAID/MegaCli/MegaCli64 
    lrwxrwxrwx 1 root root 31 Jun  4 03:40 /opt/MegaRAID/MegaCli/MegaCli64 -> /opt/MegaRAID/storcli/storcli64

    If the symbolic link is not present, then use the following commands to create it:

    # mkdir -p /opt/MegaRAID/MegaCli
    # ln -s /opt/MegaRAID/storcli/storcli64 /opt/MegaRAID/MegaCli/MegaCli64

To add the disk expansion kit to an Oracle Exadata Database Server:

  1. Remove the plastic filler panels that cover the vacant drive bays and insert the four drives that are contained in the disk expansion kit.

    The server should be powered on so that the disk controller can sense the new drives.

    The drives may be installed in any order. All four drives must be installed at the same time (within 30 minutes) so that the disk controller can sense the new drives before any of them enter a power-saving mode.

    When the disk controller senses the new drives, the RAID reconstruction process automatically begins.

  2. Monitor the server alert history. Ensure that the RAID reconstruction process completes successfully before proceeding.

    The RAID reconstruction process may take several hours to complete (7 hours in the following example). Watch for the clear message (message 1_2 below), which indicates that the RAID reconstruction process is complete.

    # dbmcli -e list alerthistory
    
             1_1     2016-02-15T14:01:00-08:00       warning         "A disk
     expansion kit was installed. The additional physical drives were automatically
     added to the existing RAID5 configuration, and reconstruction of the
     corresponding virtual drive was automatically started."
    
             1_2     2016-02-15T21:01:01-08:00       clear           "Virtual drive
     reconstruction due to disk expansion was completed."
    

    At the end of the RAID reconstruction process, the virtual drive at /dev/sda includes the additional storage space from the disk expansion kit.
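A script waiting for the reconstruction to finish could watch the alert history for the clear message shown above. A minimal sketch (illustrative string matching, not a supported interface):

```python
def reconstruction_complete(alert_lines) -> bool:
    """Return True if the alert history text contains the 'clear' alert that
    marks the end of virtual drive reconstruction after disk expansion."""
    text = " ".join(alert_lines)
    return ("clear" in text
            and "reconstruction due to disk expansion was completed" in text)
```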

  3. If you are adding the disk expansion kit as part of deploying a new system, then proceed with this step. Otherwise, skip to the next step.

    This step uses reclaimdisks.sh to extend the VGExaDb volume group so that it consumes the additional storage space provided by the disk expansion kit.

    reclaimdisks.sh works only during initial deployment, before installation of the database software.

    1. Run /opt/oracle.SupportTools/reclaimdisks.sh -extend-vgexadb to extend the VGExaDb volume group.

      If prompted to fix the GUID Partition Table (GPT) or to continue with the current settings, enter F to fix the GPT.

      For example:

      # /opt/oracle.SupportTools/reclaimdisks.sh -extend-vgexadb
      Model is ORACLE SERVER X6-2
      Number of LSI controllers: 1
      Physical disks found: 8 (252:0 252:1 252:2 252:3 252:4 252:5 252:6 252:7)
      Logical drives found: 1
      Linux logical drive: 0
      RAID Level for the Linux logical drive: 5
      Physical disks in the Linux logical drive: 8 (252:0 252:1 252:2 252:3 252:4 252:5 252:6 252:7)
      Dedicated Hot Spares for the Linux logical drive: 0
      Global Hot Spares: 0
      Valid. Disks configuration: RAID5 from 8 disks with no global and dedicated hot spare disks.
      Valid. Booted: Linux. Layout: Linux + DOM0.
      [INFO     ] Size of system block device /dev/sda: 4193GB
      [INFO     ] Last partition on /dev/sda ends on: 1797GB
      [INFO     ] Unused space detected on the system block device: /dev/sda
      [INFO     ] Label of partition table on /dev/sda: gpt
      [INFO     ] Adjust the partition table to use all of the space on /dev/sda
      [INFO     ] Respond to the following prompt by typing 'F'
      Warning: Not all of the space available to /dev/sda appears to be used, you can fix the GPT to use all of the space (an extra 4679680000 blocks) or
      continue with the current setting?
      Fix/Ignore? F
      Model: LSI MR9361-8i (scsi)
      Disk /dev/sda: 4193GB
      Sector size (logical/physical): 512B/512B
      Partition Table: gpt
       
      Number  Start   End     Size    File system  Name     Flags
       1      32.8kB  537MB   537MB   ext4         primary  boot
       2      537MB   123GB   122GB                primary  lvm
       3      123GB   1690GB  1567GB               primary
       4      1690GB  1797GB  107GB                primary  lvm
       
      [INFO     ] Check for Linux with inactive DOM0 system disk
      [INFO     ] Valid Linux with inactive DOM0 system disk is detected
      [INFO     ] Number of partitions on the system device /dev/sda: 4
      [INFO     ] Higher partition number on the system device /dev/sda: 4
      [INFO     ] Last sector on the system device /dev/sda: 8189440000
      [INFO     ] End sector of the last partition on the system device /dev/sda: 3509759000
      [INFO     ] Unmount /u01 from /dev/mapper/VGExaDbOra-LVDbOra1
      [INFO     ] Remove inactive system logical volume /dev/VGExaDb/LVDbSys3
      [INFO     ] Remove xen files from /boot
      [INFO     ] Remove logical volume /dev/VGExaDbOra/LVDbOra1
      [INFO     ] Remove volume group VGExaDbOra
      [INFO     ] Remove physical volume /dev/sda4
      [INFO     ] Remove partition /dev/sda4
      [INFO     ] Remove device /dev/sda4
      [INFO     ] Remove partition /dev/sda3
      [INFO     ] Remove device /dev/sda3
      [INFO     ] Create primary partition 3 using 240132160 8189439966
      [INFO     ] Set lvm flag for the primary partition 3 on device /dev/sda
      [INFO     ] Add device /dev/sda3
      [INFO     ] Primary LVM partition /dev/sda3 has size 7949307807 sectors
      [INFO     ] Create physical volume on partition /dev/sda3
      [INFO     ] LVM Physical Volume /dev/sda3 has size 3654340511 sectors
      [INFO     ] Size of LVM physical volume less than size of device /dev/sda3
      [INFO     ] Remove LVM physical volume /dev/sda3
      [INFO     ] Reboot is required to apply the changes in the partition table
      
    2. Examine the end of the output from the previous command. If a reboot is not required, then skip to the next substep. If a reboot is required, then reboot the server and re-run /opt/oracle.SupportTools/reclaimdisks.sh -extend-vgexadb.

      For example:

      # shutdown -r now

      Then, after system reboot:

      # /opt/oracle.SupportTools/reclaimdisks.sh -extend-vgexadb
    3. Run /opt/oracle.SupportTools/reclaimdisks.sh with no arguments. In the output, confirm that there are no errors and that the output references the additional disks from the disk expansion kit.
      # /opt/oracle.SupportTools/reclaimdisks.sh
      Model is ORACLE SERVER X6-2
      Number of LSI controllers: 1
      Physical disks found: 8 (252:0 252:1 252:2 252:3 252:4 252:5 252:6 252:7)
      Logical drives found: 1
      Linux logical drive: 0
      RAID Level for the Linux logical drive: 5
      Physical disks in the Linux logical drive: 8 (252:0 252:1 252:2 252:3 252:4 252:5 252:6 252:7)
      Dedicated Hot Spares for the Linux logical drive: 0
      Global Hot Spares: 0
      Valid. Disks configuration: RAID5 from 8 disks with no global and dedicated hot spare disks.
      Valid. Booted: Linux. Layout: Linux.
      

    You can now continue with deploying the system and use the additional storage space provided by the disk expansion kit. Do not perform the next step.

  4. If you are adding the disk expansion kit to a previously deployed system, then proceed with this step.

    This step uses operating system commands to consume the additional storage space provided by the disk expansion kit.

    1. Run parted to view the sector information for /dev/sda.

      If you see a request to fix the GPT, respond with F.

      # parted /dev/sda 
      GNU Parted 2.1
      Using /dev/sda
      Welcome to GNU Parted! Type 'help' to view a list of commands.
      (parted) unit s 
      (parted) print
      Warning: Not all of the space available to /dev/sda appears to be used, you can
      fix the GPT to use all of the space (an extra 4679680000 blocks) or continue
      with the current setting? Fix/Ignore? F  
      
      Model: LSI MR9361-8i (scsi) 
      Disk /dev/sda: 8189440000s 
      Sector size (logical/physical): 512B/512B 
      Partition Table: gpt 
      
      Number  Start       End           Size         File system  Name     Flags 
      1       64s         1046591s      1046528s     ext3         primary  boot 
      4       1046592s    1048639s      2048s                     primary  bios_grub
      2       1048640s    240132159s    239083520s                primary  lvm 
      
      (parted) q

      Examine the output and note the disk size. Note also the largest end sector value, which should be the end sector of the last partition. In the preceding example, the disk size is 8189440000 sectors, and the largest end sector value is 240132159. You will use these values in the next step.

    2. Create a new partition in /dev/sda.

      The command requires a start sector and an end sector, which you must derive from the values that you noted previously.

      For the start sector, add 1 to the largest end sector value from the previous step. For example: 240132159 + 1 = 240132160.

      For the end sector, subtract 34 from the disk size value. For example: 8189440000 - 34 = 8189439966.

      # parted -s /dev/sda mkpart primary 240132160s 8189439966s

      This command produces no output.
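      The start and end sector arithmetic above can be sketched in shell. The values are the example figures from the parted listing; substitute the disk size and last end sector reported on your system.

      ```shell
      # Derive the mkpart boundaries from the 'parted ... unit s print' output.
      last_end=240132159       # largest end sector of the existing partitions
      disk_size=8189440000     # total disk size in sectors
      start=$((last_end + 1))  # first free sector after the last partition
      end=$((disk_size - 34))  # reserve the final 34 sectors for the backup GPT
      echo "parted -s /dev/sda mkpart primary ${start}s ${end}s"
      ```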

    3. Review the updated partition table and take note of the partition number for the new partition.

      In this example, the new partition number is 3. You will use this value in the following commands.

      # parted -s /dev/sda unit s print
      Model: LSI MR9361-8i (scsi)
      Disk /dev/sda: 8189440000s
      Sector size (logical/physical): 512B/512B
      Partition Table: gpt 
      Number  Start        End          Size         File system  Name     Flags
      1       64s         1046591s      1046528s     ext4         primary  boot 
      4       1046592s    1048639s      2048s                     primary  bios_grub
      2       1048640s    240132159s    239083520s                primary  lvm 
      3       240132160s  8189439966s   7949307807s               primary  
      
    4. Set the LVM flag for the new partition.

      In this example, the new partition number is 3. Use the partition number that you observed in the previous step.

      You can ignore the warning displayed in the example.

      # parted -s /dev/sda set 3 lvm on
      Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or
       resource busy).  As a result, it may not reflect all of your changes until after reboot.
    5. Create an LVM physical volume (PV) on the newly created partition.

      In this example, the new partition number is 3, so the physical volume is /dev/sda3. Adjust the command based on the partition number that you observed previously.

      # lvm pvcreate --force /dev/sda3
        Physical volume "/dev/sda3" successfully created
    6. Extend the LVM volume group VGExaDb to use the newly created physical volume.
      # lvm vgextend VGExaDb /dev/sda3
        Volume group "VGExaDb" successfully extended

    You can now use the additional storage space provided by the disk expansion kit to extend various storage volumes and file systems on the server.
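    As an illustrative follow-on, the new space in VGExaDb is typically consumed with lvextend and a filesystem resize. This is a hypothetical sketch, not part of this procedure: the volume name LVDbOra1 and the 50 GB figure are examples, and the commands are only echoed rather than executed.

    ```shell
    # Dry-run sketch: print the commands that would grow an example logical
    # volume and its ext4 filesystem (use xfs_growfs instead for XFS).
    run() { echo "$@"; }   # echo instead of executing, for illustration only
    run lvm lvextend -L +50G /dev/VGExaDb/LVDbOra1
    run resize2fs /dev/VGExaDb/LVDbOra1
    ```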

2.5 Adding Memory Expansion Kit to Database Servers

Additional memory can be added to database servers. The following procedure describes how to add the memory:

  1. Power down the database server.
  2. Replace the plastic fillers with the DIMMs.
  3. Power on the database server.
  4. Add the database server back to the cluster.

Additional notes:

  • Memory for Sun Server X4-2 Oracle Database Servers and Sun Server X3-2 Oracle Database Servers can be expanded to a maximum of 512 GB with the memory expansion kit.
  • Memory for Sun Fire X4170 Oracle Database Servers can be expanded to a maximum of 144 GB by removing the existing memory, and replacing it with three X2-2 Memory Expansion Kits.
  • Sun Fire X4170 M2 Oracle Database Servers ship from the factory with 96 GB of memory, with 12 of the 18 DIMM slots populated with 8 GB DIMMs. The optional X2-2 Memory Expansion Kit can be used to populate the remaining 6 empty slots with 16 GB DIMMs to bring the total memory to 192 GB (12 x 8 GB and 6 x 16GB).

    The memory expansion kit is primarily for consolidation workloads where many databases are run on each database server. In this scenario, the CPU usage is often low while the memory usage is very high.

    However, populating all of the memory slots has a downside: the frequency of the memory DIMMs drops from 1333 MHz to 800 MHz. The performance effect of the slower memory appears as increased CPU utilization. The average measured increase in CPU utilization is typically between 5% and 10%, but the increase varies greatly by workload. In test workloads, several workloads showed almost no increase, while one workload showed an increase as high as 20%.

  • When adding memory to Oracle Exadata Database Machines running Oracle Linux, Oracle recommends updating the /etc/security/limits.conf file with the following:

    oracle    soft     memlock 75%
    oracle    hard     memlock 75%
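    As a sketch of what the memlock 75% entry resolves to, assume an example server with 256 GB of RAM (the size is an assumption for illustration; the percentage is expanded against the server's physical memory):

    ```shell
    # Compute 75% of an assumed 256 GB of physical RAM, expressed in KB,
    # which is the value the 'memlock 75%' limit resolves to.
    mem_total_kb=$((256 * 1024 * 1024))        # 256 GB in KB
    memlock_kb=$((mem_total_kb * 75 / 100))    # 75% of physical memory
    echo "$memlock_kb"
    ```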
    

2.6 Verifying and Modifying the Link Speed on the Client Network Ports for X7 and Later Systems

You can configure 10 GbE connections or 25 GbE connections on the client network on Oracle Exadata Database Machine X7 and later database servers.

Note:

You should configure the client network ports using Oracle Exadata Deployment Assistant (OEDA) during system deployment. See Using Oracle Exadata Deployment Assistant.

The following steps may be necessary to configure a client access port if the OEDA deployment was not performed or was performed incorrectly. You can also use these steps to change the client network from 10 GbE to 25 GbE, or from 25 GbE to 10 GbE.

  1. For each network interface (designated by x) that does not have the link detected, run the following commands:
    • For 10 GbE network interfaces:
      # ifdown ethx
      # ethtool -s ethx speed 10000 duplex full autoneg off
      # ifup ethx
      # ethtool ethx

      For 10 Gb/s, you must use SFP+ transceivers; SFP28 transceivers do not support 10 Gb/s traffic.

    • For 25 GbE network interfaces:
      # ifdown ethx
      # ethtool -s ethx speed 25000 duplex full autoneg off
      # ifup ethx
      # ethtool ethx
  2. Confirm that the output from the ethtool command shows yes for Link detected.
            Link detected: yes
  3. Edit the appropriate files in /etc/sysconfig/network-scripts, where x is the number associated with the network interface.
    1. Locate the /etc/sysconfig/network-scripts/ifcfg-ethx file. Add the following lines, if they are not already present in the file:
      • For 10 GbE network interfaces:

        ONBOOT=YES
        ETHTOOL_OPTS="speed 10000 duplex full autoneg off"
      • For 25 GbE network interfaces:

        ONBOOT=YES
        ETHTOOL_OPTS="speed 25000 duplex full autoneg off"
    2. Repeat the previous step for all network interfaces that do not have the ETHTOOL_OPTS setting in the associated ifcfg-ethx file and are connected to 10 GbE or 25 GbE switches.

    The network interface should now show the link as detected. These changes are persistent, and do not need to be repeated after a server reboot.
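    The per-interface ethtool invocation from step 1 can be made explicit with a small helper. This is a hypothetical convenience function for illustration, not part of the Exadata tooling; the interface name and speed are parameters you supply.

    ```shell
    # Print the ethtool settings command for a given interface and speed
    # (10000 or 25000), mirroring the manual steps above.
    ethtool_cmd() {
      printf 'ethtool -s %s speed %s duplex full autoneg off\n' "$1" "$2"
    }
    ethtool_cmd eth1 25000
    ```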

  4. Check the ILOM on each compute node to validate that the LAN on Motherboard is properly configured to detect the 25 GbE transceiver.
    show /HOST/network
      /HOST/network
         Targets:
    
         Properties:
             active_media = none
             auto_media_detection = enabled
             current_active_media = (none)
    
         Commands:
             cd
             set
             show

    If the NIC is not working, change the active_media and current_active_media to the proper values:

    • For 25 GbE transceivers (Fiber or Copper), these parameters should be set to SFP28
    • For 10 GbE networks using RJ-45 terminated CAT6 cables, these parameters should be set to RJ45
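    For reference, a hedged sketch of the corresponding ILOM CLI session for an SFP28 setup. The set command and property names are taken from the show /HOST/network output above; confirm the settable values against your ILOM firmware documentation before applying.

    ```
    -> set /HOST/network active_media=SFP28
    -> set /HOST/network current_active_media=SFP28
    ```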

2.7 Adding and Configuring an Extra Network Card on Oracle Exadata Database Machine X6-2 and Later

You can add an additional network card on Oracle Exadata Database Machine X6-2 and later systems.

Prerequisites

Ensure you are using the correct link speed for Oracle Exadata Database Machine X7-2 and X8-2 compute nodes. Complete the steps in Verifying and Modifying the Link Speed on the Client Network Ports for X7 and Later Systems.

Oracle Exadata Database Machine X6-2

Oracle Exadata Database Machine X6-2 database servers offer a highly available copper 10G network on the motherboard, and an optical 10G network via a PCI card in slot 2. Oracle offers an additional Ethernet card for customers who require additional connectivity. The additional card provides either dual-port 10GE copper connectivity (part number 7100488) or dual-port 10GE optical connectivity (part number X1109A-Z). You install this card in PCIe slot 1 on the Oracle Exadata Database Machine X6-2 database server.

After you install the card and connect it to the network, the Oracle Exadata System Software automatically recognizes the new card and configures the two ports as eth6 and eth7 interfaces on the X6-2 database server. You can use these additional ports to provide an additional client network, or to create a separate backup or data recovery network. On a database server that runs virtual machines, you could use this to isolate traffic from two virtual machines.

Oracle Exadata Database Machine X7-2

Oracle Exadata Database Machine X7-2 and later database servers offer 2 copper (RJ45) or 2 optical (SFP28) network connections on the motherboard plus 2 optical (SFP28) network connections in PCIe card slot 1. Oracle offers an additional 4 copper (RJ45) 10G network connections for customers that require additional connectivity. The additional card is the Oracle Quad Port 10GBase-T card (part number 7111181). You install this card in PCIe slot 3 on the database server.

After you install the card and connect it to the network, the Oracle Exadata System Software automatically recognizes the new card and configures the four ports as eth5 to eth8 interfaces on the database server. You can use these additional ports to provide an additional client network, or to create a separate backup or data recovery network. On a database server that runs virtual machines, you could use this additional client network to isolate traffic from two virtual machines.

After you have added the card to the database server, you need to configure the card. See Configuring the Additional Network Card for a Non-Oracle VM Environment and Configuring the Additional Network Card for an Oracle VM Environment for instructions.

Oracle Exadata Database Machine X8-2

Oracle Exadata Database Machine X8-2 database servers offer 2 copper (RJ45) or 2 copper/optical (SFP28) network connections on the motherboard plus 2 optical (SFP28) network connections in PCIe card slot 1. Oracle offers an additional 4 copper 1/10G (RJ45) or 2 optical 10/25G (SFP28) network connections for customers that require additional connectivity. The two additional cards are:

  • Oracle Quad Port 10GBase-T card (part number 7111181)
  • Oracle Dual Port 25 Gb Ethernet Adapter (part number 7118016)

The additional card is installed in PCIe slot 3 on the database server.

After you install the card and connect it to the network, the Oracle Exadata System Software automatically recognizes the new card and configures the ports on the database server: eth5 to eth8 for the quad-port card, or eth5 and eth6 for the dual-port card. You can use these additional ports to provide an additional client network, or to create a separate backup or data recovery network. On a database server that runs virtual machines, you could use this additional client network to isolate traffic from two virtual machines.

2.7.1 Viewing the Network Interfaces

To view the network interfaces, you can run the ipconf.pl command.

Example 2-4 Viewing the default network interfaces for an Oracle Exadata Database Machine X8M-2 database server

The following example shows the output for an Oracle Exadata Database Machine X8M-2 database server without the additional network card. In addition to the RDMA Network Fabric interfaces, the output shows the interfaces for three network cards:

  • A single port 1/10Gb card, eth0
  • A dual port 10 or 25Gb card, on eth1 and eth2
  • A dual port 10 or 25Gb card, on eth3 and eth4
[root@scaz23adm01 ibdiagtools]# /opt/oracle.cellos/ipconf.pl 
[Info]: ipconf command line: /opt/oracle.cellos/ipconf.pl
Logging started to /var/log/cellos/ipconf.log
Interface re0   is      Linked.    hca: mlx5_0
Interface re1   is      Linked.    hca: mlx5_0
Interface eth0  is      Linked.    driver/mac: igb/00:10:e0:c3:b7:9c
Interface eth1  is      Unlinked.  driver/mac: bnxt_en/00:10:e0:c3:b7:9d (slave of bondeth0)
Interface eth2  is      Linked.    driver/mac: bnxt_en/00:10:e0:c3:b7:9d (slave of bondeth0)
Interface eth3  is      Unlinked.  driver/mac: bnxt_en/00:0a:f7:c3:28:30
Interface eth4  is      Unlinked.  driver/mac: bnxt_en/00:0a:f7:c3:28:38

Example 2-5 Viewing the default network interfaces for an Oracle Exadata Database Machine X7-2 or X8-2 database server

The following example shows the output for an Oracle Exadata Database Machine X7-2 or X8-2 database server without the additional network card. In addition to the RDMA Network Fabric interfaces, the output shows the interfaces for three network cards:

  • A single port 10Gb card, on eth0
  • A dual port 10 or 25Gb card, on eth1 and eth2
  • A dual port 25Gb card, on eth3 and eth4
# /opt/oracle.cellos/ipconf.pl
Logging started to /var/log/cellos/ipconf.log 
Interface ib0   is          Linked.    hca: mlx4_0 
Interface ib1   is          Linked.    hca: mlx4_0 
Interface eth0  is          Linked.    driver/mac: igb/00:10:e0:c3:ba:72 
Interface eth1  is          Linked.    driver/mac: bnxt_en/00:10:e0:c3:ba:73 
Interface eth2  is          Linked.    driver/mac: bnxt_en/00:10:e0:c3:ba:74 
Interface eth3  is          Linked.    driver/mac: bnxt_en/00:0a:f7:c3:14:a0 (slave of bondeth0) 
Interface eth4  is          Linked.    driver/mac: bnxt_en/00:0a:f7:c3:14:a0 (slave of bondeth0)

Example 2-6 Viewing the default network interfaces for an Oracle Exadata Database Machine X6-2 database server

The following example shows the output for an Oracle Exadata Database Machine X6-2 database server without the additional network card. In addition to the RDMA Network Fabric interfaces, the output shows the interfaces for two network cards:

  • A quad port 10Gb card, on eth0 to eth3
  • A dual port 10Gb card, on eth4 and eth5
# cd /opt/oracle.cellos/

# ./ipconf.pl
Logging started to /var/log/cellos/ipconf.log
Interface ib0   is          Linked.    hca: mlx4_0
Interface ib1   is          Linked.    hca: mlx4_0
Interface eth0  is          Linked.    driver/mac: ixgbe/00:10:e0:8b:24:b6
Interface eth1  is .....    Linked.    driver/mac: ixgbe/00:10:e0:8b:24:b7
Interface eth2  is .....    Linked.    driver/mac: ixgbe/00:10:e0:8b:24:b8
Interface eth3  is .....    Linked.    driver/mac: ixgbe/00:10:e0:8b:24:b9
Interface eth4  is          Linked.    driver/mac: ixgbe/90:e2:ba:ac:20:ec (slave of bondeth0)
Interface eth5  is          Linked.    driver/mac: ixgbe/90:e2:ba:ac:20:ec (slave of bondeth0)
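The ipconf.pl listings above can also be filtered for unlinked interfaces. This sketch runs awk over two sample lines copied from the X8M-2 example; on a live server you would pipe the ipconf.pl output instead of using a here-document.

```shell
# Extract the interface names (field 2) from lines reporting 'Unlinked'.
unlinked=$(awk '/Unlinked/ {print $2}' <<'EOF'
Interface eth1  is      Unlinked.  driver/mac: bnxt_en/00:10:e0:c3:b7:9d (slave of bondeth0)
Interface eth3  is      Unlinked.  driver/mac: bnxt_en/00:0a:f7:c3:28:30
EOF
)
echo "$unlinked"
```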

2.7.2 Configuring the Additional Network Card for a Non-Oracle VM Environment

You can configure the additional network card on an Oracle Exadata Database Machine X6-2 or later database server for a non-Oracle VM environment.

This procedure assumes that you have already installed the network card in the Oracle Exadata Database Machine database server but have not yet completed the configuration with Oracle Exadata Deployment Assistant (OEDA).

WARNING:

If you have already installed Oracle Grid Infrastructure on Oracle Exadata Database Machine, then refer to the Oracle Clusterware documentation. Use caution when changing the network interfaces for the cluster.
  1. Ensure you have the following information for the new network card.
    You will need to input this information when you run ipconf.pl.
    • IP address
    • Netmask
    • Gateway
  2. Run the ipconf.pl script to configure the card.

    The following example shows a sample ipconf.pl session. The output shows three network cards:

    • A quad port 10Gb card, on eth0 to eth3
    • A dual port 10Gb card, on eth4 and eth5, with only one port cabled
    • A dual port 10Gb card, on eth6 and eth7, with only one port cabled. This is the new network card.

    For sample output for Oracle Exadata Database Machine X7-2, see Viewing the Network Interfaces.

    # cd /opt/oracle.cellos/
    # ./ipconf.pl
    
    Logging started to /var/log/cellos/ipconf.log
    Interface ib0   is                      Linked.    hca: mlx4_0
    Interface ib1   is                      Linked.    hca: mlx4_0
    Interface eth0  is                      Linked.    driver/mac: 
    ixgbe/00:10:e0:8b:22:e8 (slave of vmeth0)
    Interface eth1  is                      Linked.    driver/mac: 
    ixgbe/00:10:e0:8b:22:e9 (slave of bondeth0)
    Interface eth2  is                      Linked.    driver/mac: 
    ixgbe/00:10:e0:8b:22:e9 (slave of bondeth0)
    Interface eth3  is                      Linked.    driver/mac: 
    ixgbe/00:10:e0:8b:22:eb
    Interface eth4  is                      Linked.    driver/mac: 
    ixgbe/90:e2:ba:ac:1d:e4
    Interface eth5  is .................... Unlinked.  driver/mac: 
    ixgbe/90:e2:ba:ac:1d:e5
    Interface eth6  is ...                  Linked.    driver/mac: 
    ixgbe/90:e2:ba:78:d0:10
    Interface eth7  is .................... Unlinked.  driver/mac: 
    ixgbe/90:e2:ba:78:d0:11
    
    bondeth0 eth1,eth2 UP      vmbondeth0 10.128.1.169  255.255.240.0
    10.128.0.1  SCAN       test08client02.example.com
    bondeth1 None      UNCONF 
    bondeth2 None      UNCONF 
    bondeth3 None      UNCONF 
    Select interface name to configure or press Enter to continue: eth6
    Selected interface. eth6
    IP address or up or none: 10.129.19.34
    Netmask: 255.255.248.0
    Gateway (IP address or none) or none: 10.129.16.0
    
    Select network type for interface from the list below
    1: Management
    2: SCAN
    3: Other
    Network type: 3
    
    Fully qualified hostname or none: test08adm02-bkup.example.com
    Continue configuring or re-configuring interfaces? (y/n) [y]: n
    ...
    Do you want to configure basic ILOM settings (y/n) [y]: n
    [Info]: Custom changes have been detected in /etc/sysconfig/network-script
    s/ifcfg-eth6
    [Info]: Original file /etc/sysconfig/network-scripts/ifcfg-eth6 will be 
    saved in /opt/oracle.cellos/conf/network-scripts/backup_by_Exadata_ipconf
    [Info]: Original file /etc/ssh/sshd_config will be saved in /etc/ssh/sshd_
    config.backupbyExadata
    [Info]: Generate /etc/ssh/sshd_config with ListenAddress(es) 10.128.18.106, 
    10.129.19.34, 10.128.1.169, 192.168.18.44, 192.168.18.45
    Stopping sshd:                                             [  OK  ]
    Starting sshd:                                             [  OK  ]
    [Info]: Save /etc/sysctl.conf in /etc/sysctl.conf.backupbyExadata
    [Info]: Adjust settings for IB interfaces in /etc/sysctl.conf
    Re-login using new IP address 10.128.18.106 if you were disconnected after 
    following commands
    ip addr show vmbondeth0
    ip addr show bondeth0
    ip addr show vmeth0
    ip addr show eth0
    ifup eth6
    sleep 1
    ifup vmeth6
    sleep 1
    ip addr show vmeth6
    ip addr show eth6
    sleep 4
    service sshd condrestart
    
  3. If you need to set up the network card with VLAN, perform these steps:
    1. Add the VLAN ID to the /opt/oracle.cellos/cell.conf file.
      • Locate the Ethernet interface in the file. For example:

        <Interfaces>
          <Gateway>10.129.16.0</Gateway>
          <Hostname>test08adm02-bkup.example.com</Hostname>
          <IP_address>10.129.19.34</IP_address>
          <IP_enabled>yes</IP_enabled>
          <IP_ssh_listen>enabled</IP_ssh_listen>
          <Inet_protocol>IPv4</Inet_protocol>
          <Name>eth6</Name>
          <Net_type>Other</Net_type>
          <Netmask>255.255.248.0</Netmask>
          <State>1</State>
          <Status>UP</Status>
          <Vlan_id>0</Vlan_id>
        </Interfaces>
        
      • Add the VLAN ID to the <Vlan_id> element. The following example shows the interface configured with VLAN ID of 2122.

        <Interfaces>
          <Gateway>10.129.16.0</Gateway>
          <Hostname>test08adm02-bkup.example.com</Hostname>
          <IP_address>10.129.19.34</IP_address>
          <IP_enabled>yes</IP_enabled>
          <IP_ssh_listen>enabled</IP_ssh_listen>
          <Inet_protocol>IPv4</Inet_protocol>
          <Name>eth6</Name>
          <Net_type>Other</Net_type>
          <Netmask>255.255.248.0</Netmask>
          <State>1</State>
          <Status>UP</Status>
          <Vlan_id>2122</Vlan_id>
        </Interfaces>
        
    2. Run the following command to configure the network interface using the modified cell.conf file:
      # /opt/oracle.cellos/ipconf.pl -init -force
      
    3. Validate the interface has the VLAN configured by checking that the /etc/sysconfig/network-scripts directory contains files with the VLAN ID in the filename. For example, if the VLAN ID is 2122, you should see the following files:
      # ls -ltr /etc/sysconfig/network-scripts/*2122*
      
      -rw-r----- 1 root root 250 Sep  7 14:39 /etc/sysconfig/network-scripts/ifcfg-eth6.2122
      -rw-r----- 1 root root  85 Sep  7 14:39 /etc/sysconfig/network-scripts/route-eth6.2122
      -rw-r----- 1 root root  56 Sep  7 14:39 /etc/sysconfig/network-scripts/rule-eth6.2122
  4. Reboot the database server for the changes to take effect.
    # shutdown -r now
  5. Check that the network is working by pinging the gateway. For example:
    # ping 10.129.16.0
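    The validation in step 3 can be scripted. This sketch only checks for the presence of the per-VLAN files, using the example interface name and VLAN ID from the procedure above; adjust both to your environment.

    ```shell
    # Check for the ifcfg/route/rule files created for a VLAN sub-interface.
    iface=eth6
    vlan=2122
    status=""
    for f in ifcfg route rule; do
      p="/etc/sysconfig/network-scripts/${f}-${iface}.${vlan}"
      if [ -e "$p" ]; then status="$status $f=ok"; else status="$status $f=missing"; fi
    done
    echo "$status"
    ```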

2.7.3 Configuring the Additional Network Card for an Oracle VM Environment

You can configure the additional network card on an Oracle Exadata Database Machine X6-2 and later database server for an Oracle VM environment.

This procedure assumes that you have already installed the network card in the Oracle Exadata Database Machine database server but have not yet completed the configuration with Oracle Exadata Deployment Assistant (OEDA).

Caution:

Do not attempt this procedure if you have already installed Oracle Grid Infrastructure on Oracle Exadata Database Machine.
  1. Add a section for the new network in the /EXAVMIMAGES/conf/virtual_machine_config_file configuration file in dom0.

    The following example assumes the bridge is called vmeth6, and the interface is called eth1. The virtual machine configuration file name is /EXAVMIMAGES/conf/test08adm01vm01.example.com-vm.xml.

    <Interfaces>
      <Bridge>vmeth6</Bridge>
      <Gateway>10.129.16.0</Gateway>
      <Hostname>test08adm02-bkup.example.com</Hostname>
      <IP_address>10.129.19.34</IP_address>
      <Name>eth1</Name>
      <IP_enabled>yes</IP_enabled>
      <IP_ssh_listen>disabled</IP_ssh_listen>
      <Net_type>Other</Net_type>
      <Netmask>255.255.248.0</Netmask>
      <Vlan_id>0</Vlan_id>
      <State>1</State>
      <Status>UP</Status>
    </Interfaces>
    

    If you are using VLANs, enter the appropriate VLAN ID [1-4095] in the <Vlan_id> element.

  2. Create the bridge.
    1. To create an unbonded bridge named vmeth6:
      # /opt/exadata_ovm/exadata.img.domu_maker add-single-bridge-dom0 vmeth6
      
    2. To create a bonded bridge, use a command similar to the following:
      # /opt/exadata_ovm/exadata.img.domu_maker add-bonded-bridge-dom0 bridge_name slave1 slave2 [vlan]

      slave1 and slave2 are the names of the bonded interfaces.

      For example:

      # /opt/exadata_ovm/exadata.img.domu_maker add-bonded-bridge-dom0 vmbondeth1 eth6 eth7
  3. (X2 to X8 servers only) Allocate the GUIDs for the InfiniBand Network Fabric:
    # /opt/exadata_ovm/exadata.img.domu_maker allocate-guids virtual_machine_config_file virtual_machine_config_file_final
    

    The virtual machine configuration files are located in the /EXAVMIMAGES/conf directory. For example:

    # /opt/exadata_ovm/exadata.img.domu_maker allocate-guids /EXAVMIMAGES/conf/
    test08adm01vm01.example.com-vm.xml /EXAVMIMAGES/conf/final-test08adm01vm01
    .example.com-vm.xml
    
  4. Shut down the guest and restart it.
    # /opt/exadata_ovm/exadata.img.domu_maker remove-domain /EXAVMIMAGES/conf
    /final-test08adm01vm01.example.com-vm.xml
    
    # /opt/exadata_ovm/exadata.img.domu_maker start-domain /EXAVMIMAGES/conf
    /final-test08adm01vm01.example.com-vm.xml
  5. Once the guest is running, use the ip addr command to verify the interface is valid.

    The following example checks the eth1 interface.

    # ip addr show eth1
    eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
      link/ether 00:16:3e:53:56:00 brd ff:ff:ff:ff:ff:ff
      inet 10.129.19.34/21 brd 10.129.23.255 scope global eth1
         valid_lft forever preferred_lft forever
    

2.8 Increasing the Number of Active Cores on Database Servers

You can increase the number of active cores on Oracle Exadata Database Machine using capacity-on-demand.

The number of active cores on the database servers on Oracle Exadata Database Machine X4-2 and newer systems can be reduced during installation. The number of active cores can be increased when additional capacity is needed. This is known as capacity-on-demand.

Cores are activated in 2-core increments on Oracle Exadata Database Machine X4-2 and newer systems, and in 8-core increments on Oracle Exadata Database Machine X4-8 Full Rack and newer systems. The following table lists the capacity-on-demand core processor configurations.

Table 2-3 Capacity-on-Demand Core Processor Configurations

  • Oracle Exadata Database Machine X7-2, X8-2, and X8M-2 (any configuration except Eighth Rack): 14 to 48 cores per server, in increments of 2 (14, 16, 18, …, 46, 48)
  • Oracle Exadata Database Machine X7-2, X8-2, and X8M-2 (Eighth Rack): 8 to 24 cores per server, in increments of 2 (8, 10, 12, …, 22, 24)
  • Oracle Exadata Database Machine X6-2 (any configuration except Eighth Rack): 14 to 44 cores per server, in increments of 2 (14, 16, 18, …, 42, 44)
  • Oracle Exadata Database Machine X6-2 (Eighth Rack): 8 to 22 cores per server, in increments of 2 (8, 10, 12, …, 20, 22)
  • Oracle Exadata Database Machine X5-2 (any configuration except Eighth Rack): 14 to 36 cores per server, in increments of 2 (14, 16, 18, …, 34, 36)
  • Oracle Exadata Database Machine X5-2 (Eighth Rack): 8 to 18 cores per server, in increments of 2 (8, 10, 12, 14, 16, 18)
  • Oracle Exadata Database Machine X4-2 (Full Rack, Half Rack, or Quarter Rack): 12 to 24 cores per server, in increments of 2 (12, 14, 16, 18, 20, 22, 24)
  • Oracle Exadata Database Machine X7-8, X8-8, and X8M-8 (any configuration): 56 to 192 cores per server, in increments of 8 (56, 64, 72, …, 184, 192)
  • Oracle Exadata Database Machine X6-8 and X5-8 (any configuration): 56 to 144 cores per server, in increments of 8 (56, 64, 72, …, 136, 144)
  • Oracle Exadata Database Machine X4-8 (Full Rack): 48 to 120 cores per server, in increments of 8 (48, 56, 64, …, 112, 120)

Note:

Oracle recommends licensing the same number of cores on each server, in case of failover.
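The increments in Table 2-3 can be checked programmatically before running ALTER DBSERVER. This hypothetical helper validates a requested core count against the X7-2, X8-2, and X8M-2 (non-Eighth Rack) row: 14 to 48 cores, in increments of 2.

```shell
# Return success if the requested core count is valid for an X7-2/X8-2/X8M-2
# server (any configuration except Eighth Rack): 14 to 48, multiples of 2.
valid_core_count() {
  local n=$1
  [ "$n" -ge 14 ] && [ "$n" -le 48 ] && [ $((n % 2)) -eq 0 ]
}
valid_core_count 16 && echo "16: valid"
valid_core_count 15 || echo "15: invalid"
```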

Database servers can be added one at a time, and capacity-on-demand can be applied to the individual database servers. This option includes Oracle Exadata Database Machine X5-2 Eighth Racks.

The database server must be restarted after enabling additional cores. If the database servers are part of a cluster, then the additional cores can be enabled in a rolling fashion.

  1. Verify the number of active physical cores using the following command:
    DBMCLI> LIST DBSERVER attributes coreCount
    
  2. Use the following command to increase the number of active physical cores:
    DBMCLI> ALTER DBSERVER pendingCoreCount = new_number_of_active_physical_cores
    
  3. Verify the pending number of active physical cores using the following command:
    DBMCLI> LIST DBSERVER attributes pendingCoreCount
    
  4. Restart the server.
  5. Verify the number of active physical cores using the following command:
    DBMCLI> LIST DBSERVER attributes coreCount
    

2.9 Extending LVM Partitions

Logical Volume Manager (LVM) provides flexibility to reorganize the partitions in the database servers.

Note:

  • Keep at least 1 GB of free space in the VGExaDb volume group. This space is used for the LVM snapshot created by the dbnodeupdate.sh utility during software maintenance.

  • If you make snapshot-based backups of the / (root) and /u01 directories by following the steps in Creating a Snapshot-Based Backup of Oracle Linux Database Server, then keep at least 6 GB of free space in the VGExaDb volume group.

This section contains the following topics:

2.9.1 Extending the root LVM Partition

The procedure for extending the root LVM partition depends on your Oracle Exadata System Software release.

2.9.1.1 Extending the root LVM Partition on Systems Running Oracle Exadata System Software Release 11.2.3.2.1 or Later

The following procedure describes how to extend the size of the root (/) partition on systems running Oracle Exadata System Software release 11.2.3.2.1 or later:

Note:

  • This procedure does not require an outage on the server.

  • For management domain systems, the active and inactive Sys LVMs are LVDbSys2 and LVDbSys3 instead of LVDbSys1 and LVDbSys2.

  • Make sure that LVDbSys1 and LVDbSys2 are sized the same.

  1. Collect information about the current environment.
    1. Use the df command to identify the amount of free and used space in the root partition (/).
      # df -h /
      

      The following is an example of the output from the command:

      Filesystem                    Size  Used Avail Use% Mounted on
      /dev/mapper/VGExaDb-LVDbSys1   30G   22G  6.2G  79% / 
      

      Note:

      The active root partition may be either LVDbSys1 or LVDbSys2, depending on previous maintenance activities.

    2. Use the lvs command to display the current volume configuration.
      # lvs -o lv_name,lv_path,vg_name,lv_size
      

      The following is an example of the output from the command:

      LV                 Path                            VG       LSize
      LVDbOra1           /dev/VGExaDb/LVDbOra1           VGExaDb  100.00g
      LVDbSwap1          /dev/VGExaDb/LVDbSwap1          VGExaDb  24.00g
      LVDbSys1           /dev/VGExaDb/LVDbSys1           VGExaDb  30.00g
      LVDbSys2           /dev/VGExaDb/LVDbSys2           VGExaDb  30.00g
      LVDoNotRemoveOrUse /dev/VGExaDb/LVDoNotRemoveOrUse VGExaDb  1.00g
      
  2. Use the df command to identify the file system type that is used in the root partition (/).
    # df -hT /
    

    The following is an example of the output from the command:

    Filesystem                    Type  Size  Used Avail Use% Mounted on
    /dev/mapper/VGExaDb-LVDbSys1  ext3   30G   22G  6.2G  79% / 
    

    In this example, the file system type is ext3.

  3. If the file system type is not xfs, use the following tune2fs command to check the online resize option. If the file system type is xfs, then you can skip this step.
    tune2fs -l /dev/mapper/vg_name-lv_name | grep resize_inode
    

    For example:

    tune2fs -l /dev/mapper/VGExaDb-LVDbSys1 | grep resize_inode
    

    The resize_inode option should be listed in the output from the command. If the option is not listed, then the file system must be unmounted before resizing the partition. Refer to Extending the root LVM Partition on Systems Running Oracle Exadata System Software Earlier than Release 11.2.3.2.1 to resize the partition.

  4. Verify there is available space in the volume group VGExaDb using the vgdisplay command.
    # vgdisplay -s
    

    The following is an example of the output from the command:

    "VGExaDb" 834.89 GB [184.00 GB used / 650.89 GB free]
    

    The volume group must contain enough free space to increase the size of both system partitions, and maintain at least 1 GB of free space for the LVM snapshot created by the dbnodeupdate.sh utility during upgrade.

    If there is not enough free space, then verify that the reclaimdisks.sh utility has been run. If the utility has not been run, then use the following command to reclaim disk space:

    # /opt/oracle.SupportTools/reclaimdisks.sh -free -reclaim 
    

    If the utility has been run and there is not enough free space, then the LVM cannot be resized.

    Note:

    reclaimdisks.sh cannot run at the same time as a RAID rebuild (that is, a disk replacement or expansion). Wait until the RAID rebuild is complete, then run reclaimdisks.sh.

  5. Resize both LVDbSys1 and LVDbSys2 logical volumes using the lvextend command.

    In the following example, XG is the amount of space in GB by which the logical volume will be extended. The amount of space added to each system partition must be the same.

    # lvextend -L +XG --verbose /dev/VGExaDb/LVDbSys1
    # lvextend -L +XG --verbose /dev/VGExaDb/LVDbSys2
    

    The following example extends the logical volumes by 10 GB:

    # lvextend -L +10G /dev/VGExaDb/LVDbSys1
    # lvextend -L +10G /dev/VGExaDb/LVDbSys2
    
  6. Resize the file system within the logical volume.
    • For ext3 and ext4 file system types, use the resize2fs command:

      # resize2fs /dev/VGExaDb/LVDbSys1
      # resize2fs /dev/VGExaDb/LVDbSys2
      
    • For the xfs file system type:

      1. Use the xfs_growfs command to resize the active file system:

        # xfs_growfs /
      2. Determine the inactive root partition.

        The inactive root partition is LVDbSys1 or LVDbSys2, whichever is not currently mounted.

        Examine the output from the df command to confirm the active partition. For example:

        # df -hT /
        Filesystem                    Type  Size  Used Avail Use% Mounted on
        /dev/mapper/VGExaDb-LVDbSys1   xfs   30G   22G  6.2G  79% / 
        

        The example shows LVDbSys1 as the active partition. Therefore, the inactive partition is LVDbSys2.

      3. Mount the inactive root partition to a temporary location.

        For example:

        # mkdir -p /tmp/mnt/root
        # mount -t xfs /dev/VGExaDb/LVDbSys2 /tmp/mnt/root
      4. Use the xfs_growfs command to resize the inactive file system:

        # xfs_growfs /tmp/mnt/root
      5. Unmount the inactive root partition.

        For example:

        # umount /tmp/mnt/root
  7. Verify the space was extended for the active system partition using the df command.
    # df -h /
    
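As a pre-flight check for the procedure above: because step 5 grows both LVDbSys1 and LVDbSys2 by the same amount, the volume group needs free space for twice the requested extension, plus the 1 GB reserved for the dbnodeupdate.sh snapshot. A minimal sketch of that check, parsing the vgdisplay -s line shown in step 4 (the function name is illustrative, not an Oracle tool):

```shell
# Given a "vgdisplay -s" line and a requested per-volume extension in GB,
# check that free space covers both system LVs plus a 1 GB snapshot reserve.
can_extend_sys_lvs() {
  vgline=$1; ext_gb=$2
  # Extract the number that precedes "GB free]", e.g. 650.89
  free_gb=$(echo "$vgline" | sed -n 's/.*\/ \([0-9.]*\) GB free.*/\1/p')
  # Need: 2 * ext_gb + 1 <= free_gb
  awk -v free="$free_gb" -v ext="$ext_gb" \
    'BEGIN { exit !(2 * ext + 1 <= free) }'
}

line='"VGExaDb" 834.89 GB [184.00 GB used / 650.89 GB free]'
if can_extend_sys_lvs "$line" 10; then
  echo "OK to extend each system LV by 10 GB"
fi
```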
2.9.1.2 Extending the root LVM Partition on Systems Running Oracle Exadata System Software Earlier than Release 11.2.3.2.1

You can extend the size of the root (/) partition on systems running Oracle Exadata System Software earlier than release 11.2.3.2.1 using this procedure.

Note:

  • This procedure requires the system to be offline and restarted.

  • Keep at least 1 GB of free space in the VGExaDb volume group to be used for the LVM snapshot created by the dbnodeupdate.sh utility during software maintenance. If you make snapshot-based backups of the / (root) and /u01 directories by following the steps in Creating a Snapshot-Based Backup of Oracle Linux Database Server, then keep at least 6 GB of free space in the VGExaDb volume group.

  • For management domain systems, active and inactive Sys LVMs are LVDbSys2 and LVDbSys3 instead of LVDbSys1 and LVDbSys2.

  • Make sure LVDbSys1 and LVDbSys2 are sized the same.

  1. Collect information about the current environment.
    1. Use the df command to identify the mount points for the root partition (/) and the non-root partition (/u01), and their respective LVMs.

      The following is an example of the output from the command:

      # df
      Filesystem                    1K-blocks   Used    Available Use% Mounted on
      /dev/mapper/VGExaDb-LVDbSys1 30963708   21867152   7523692  75%    /
      /dev/sda1                      126427      16355    103648  14%    /boot
      /dev/mapper/VGExaDb-LVDbOra1 103212320  67404336  30565104  69%    /u01
      tmpfs                         84132864   3294608  80838256   4%    /dev/shm
      

      The file system name in the df command output is in the following format:

      /dev/mapper/VolumeGroup-LogicalVolume
      

      The full logical volume name of the root file system in the preceding example is /dev/VGExaDb/LVDbSys1.

    2. Use the lvscan command to display logical volumes.
      # lvm lvscan
      
      ACTIVE            '/dev/VGExaDb/LVDbSys1'  [30.00 GB]  inherit
      ACTIVE            '/dev/VGExaDb/LVDbSwap1' [24.00 GB]  inherit
      ACTIVE            '/dev/VGExaDb/LVDbOra1'  [100.00 GB] inherit
      
    3. Use the lvdisplay command to display the current logical volume and the volume group configuration.
      # lvm lvdisplay /dev/VGExaDb/LVDbSys1
      
      --- Logical volume ---
      LV Name               /dev/VGExaDb/LVDbSys1
      VG Name               VGExaDb
      LV UUID               GScpD7-lKa8-gLg9-oBo2-uWaM-ZZ4W-Keazih
      LV Write Access       read/write
      LV Status             available
      # open                1
      LV Size               30.00 GB
      Current LE            7680
      Segments              1
      Allocation            inherit
      Read ahead sectors    auto
      - currently set to    256
      Block device          253:0
      
    4. Verify there is available space in the volume group VGExaDb so the logical volume can be extended.
      # lvm vgdisplay VGExaDb -s
      "VGExaDb" 556.80 GB [154.00 GB used / 402.80 GB free]
      

      If the command shows there is zero free space, then neither the logical volume nor the file system can be extended.

  2. Restart the server using the diagnostics.iso file.
    See Booting a Server using the Diagnostic ISO File in Oracle Exadata System Software User's Guide.
  3. Log in to the diagnostics shell as the root user.
    When prompted, enter the diagnostics shell.

    For example:

    Choose from following by typing letter in '()':
    (e)nter interactive diagnostics shell. Must use credentials 
    from Oracle support to login (reboot or power cycle to exit
    the shell),
    (r)estore system from NFS backup archive, 

    Type e to enter the diagnostics shell and log in as the root user.
    If prompted, log in to the system as the root user. If you are prompted for the root user password and do not have it, then contact Oracle Support Services.
  4. Unmount the root file system.
    # cd /
    # umount /mnt/cell
    
  5. Verify the logical volume name.
    # lvm lvscan
    ACTIVE '/dev/VGExaDb/LVDbSys1' [30.00 GB] inherit
    ACTIVE '/dev/VGExaDb/LVDbSwap1' [24.00 GB] inherit
    ACTIVE '/dev/VGExaDb/LVDbOra1' [100.00 GB] inherit
    
  6. Resize the LVDbSys1 and LVDbSys2 logical volumes, which hold the current and backup root file systems.

    In the following commands, XG is the amount of space in GB by which the logical volumes will be extended.

    # lvm lvextend -L+XG --verbose /dev/VGExaDb/LVDbSys1
    # lvm lvextend -L+XG --verbose /dev/VGExaDb/LVDbSys2
    

    For example, if the logical volume is expanded by 5 GB, then the commands would be:

    # lvm lvextend -L+5G --verbose /dev/VGExaDb/LVDbSys1
    # lvm lvextend -L+5G --verbose /dev/VGExaDb/LVDbSys2
    
  7. Verify the file system is valid using e2fsck.
    # e2fsck -f /dev/VGExaDb/LVDbSys1
    # e2fsck -f /dev/VGExaDb/LVDbSys2
    
  8. Resize the file system.
    # resize2fs -p /dev/VGExaDb/LVDbSys1
    # resize2fs -p /dev/VGExaDb/LVDbSys2
    
  9. Restart the system in normal mode.
    # shutdown -r now
  10. Log in to the system.
  11. Verify the root file system mounts without issues and shows the new size.

2.9.2 Resizing a Non-root LVM Partition

The procedure for resizing a non-root LVM partition depends on your Oracle Exadata System Software release.

2.9.2.1 Extending a Non-root LVM Partition on Systems Running Oracle Exadata System Software Release 11.2.3.2.1 or Later

This procedure describes how to extend the size of a non-root (/u01) partition on systems running Oracle Exadata System Software release 11.2.3.2.1 or later.

This procedure does not require an outage on the server.

  1. Collect information about the current environment.
    1. Use the df command to identify the amount of free and used space in the /u01 partition.
      # df -h /u01
      

      The following is an example of the output from the command:

      Filesystem                    Size  Used Avail Use% Mounted on
      /dev/mapper/VGExaDb-LVDbOra1   99G   25G  70G   26% /u01
    2. Use the lvs command to display the current logical volume configuration used by the /u01 file system.
      # lvs -o lv_name,lv_path,vg_name,lv_size
      

      The following is an example of the output from the command:

       LV        Path                   VG      LSize
       LVDbOra1  /dev/VGExaDb/LVDbOra1  VGExaDb 100.00G
       LVDbSwap1 /dev/VGExaDb/LVDbSwap1 VGExaDb  24.00G
       LVDbSys1  /dev/VGExaDb/LVDbSys1  VGExaDb  30.00G
       LVDbSys2  /dev/VGExaDb/LVDbSys2  VGExaDb  30.00G
      
  2. Use the df command to identify the file system type that is used in the /u01 partition.
    # df -hT /u01
    

    The following is an example of the output from the command:

    Filesystem                    Type  Size  Used Avail Use% Mounted on
    /dev/mapper/VGExaDb-LVDbOra1   xfs   99G   25G  70G   26% /u01

    In this example, the file system type is xfs.

  3. If the file system type is not xfs, use the following tune2fs command to check the online resize option. If the file system type is xfs, then you can skip this step.
    tune2fs -l /dev/mapper/vg_name-lv_name | grep resize_inode
    

    The resize_inode option should be listed in the output from the command. If the option is not listed, then the file system must be unmounted before resizing the partition. Refer to "Extending a Non-root LVM Partition on Systems Running Oracle Exadata System Software Earlier than Release 11.2.3.2.1" when resizing the partition.

  4. Verify there is available space in the volume group VGExaDb using the vgdisplay command.
    # vgdisplay -s
    

    The following is an example of the output from the command:

    "VGExaDb" 834.89 GB [184.00 GB used / 650.89 GB free]
    

    If the output shows there is less than 1 GB of free space, then neither the logical volume nor file system should be extended. Maintain at least 1 GB of free space in the VGExaDb volume group for the LVM snapshot created by the dbnodeupdate.sh utility during an upgrade.

    If there is not enough free space, then verify that the reclaimdisks.sh utility has been run. If the utility has not been run, then use the following command to reclaim disk space:

    # /opt/oracle.SupportTools/reclaimdisks.sh -free -reclaim 
    

    If the utility has been run and there is not enough free space, then the LVM cannot be resized.

    Note:

    • reclaimdisks.sh cannot run at the same time as a RAID rebuild (that is, a disk replacement or expansion). Wait until the RAID rebuild is complete, then run reclaimdisks.sh.

  5. Resize the logical volume using the lvextend command.
    # lvextend -L +sizeG /dev/VGExaDb/LVDbOra1
    

    In the preceding command, size is the amount of space to be added to the logical volume.

    The following example extends the logical volume by 10 GB:

    # lvextend -L +10G /dev/VGExaDb/LVDbOra1
    
  6. Resize the file system within the logical volume.
    • For ext3 and ext4 file system types, use the resize2fs command:

      # resize2fs /dev/VGExaDb/LVDbOra1
    • For the xfs file system type, use the xfs_growfs command:

      # xfs_growfs /u01
  7. Verify the space was extended using the df command.
    # df -h /u01
    
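The branch in step 6 (resize2fs for ext3 and ext4 versus xfs_growfs for xfs) can be captured in a small dispatcher, driven by the file system type read from df -T in step 2. An illustrative sketch (the function name is made up, and the command is only printed here, not executed):

```shell
# Print the appropriate online-resize command for a mounted file system,
# based on its type: resize2fs takes the device, xfs_growfs the mount point.
resize_cmd() {
  fstype=$1; device=$2; mountpoint=$3
  case "$fstype" in
    ext3|ext4) echo "resize2fs $device" ;;
    xfs)       echo "xfs_growfs $mountpoint" ;;
    *)         echo "unsupported file system type: $fstype" >&2; return 1 ;;
  esac
}

resize_cmd xfs /dev/VGExaDb/LVDbOra1 /u01
# → xfs_growfs /u01
```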
2.9.2.2 Extending a Non-root LVM Partition on Systems Running Oracle Exadata System Software Earlier than Release 11.2.3.2.1

This procedure describes how to extend the size of a non-root (/u01) partition on systems running Oracle Exadata System Software earlier than release 11.2.3.2.1.

In this procedure, /dev/VGExaDb/LVDbOra1 is mounted at /u01.

Note:

  • Keep at least 1 GB of free space in the VGExaDb volume group. This space is used for the LVM snapshot created by the dbnodeupdate.sh utility during software maintenance.

  • If you make snapshot-based backups of the / (root) and /u01 directories by following the steps in Creating a Snapshot-Based Backup of Oracle Linux Database Server, then keep at least 6 GB of free space in the VGExaDb volume group.

  1. Collect information about the current environment.
    1. Use the df command to identify the mount points for the root partition (/) and the non-root partition (/u01), and their respective LVMs.
      # df
      Filesystem                    1K-blocks   Used    Available Use% Mounted on
      /dev/mapper/VGExaDb-LVDbSys1 30963708   21867152   7523692  75%    /
      /dev/sda1                      126427      16355    103648  14%    /boot
      /dev/mapper/VGExaDb-LVDbOra1 103212320  67404336  30565104  69%    /u01
      tmpfs                         84132864   3294608  80838256   4%    /dev/shm
      
    2. Use the lvm lvscan command to display logical volumes.
      ACTIVE            '/dev/VGExaDb/LVDbSys1'  [30.00 GB]  inherit
      ACTIVE            '/dev/VGExaDb/LVDbSwap1' [24.00 GB]  inherit
      ACTIVE            '/dev/VGExaDb/LVDbOra1'  [100.00 GB] inherit
      
    3. Use the lvdisplay command to display the current volume group configuration.
      # lvdisplay /dev/VGExaDb/LVDbOra1
      
      --- Logical volume ---
      LV Name               /dev/VGExaDb/LVDbOra1
      VG Name               VGExaDb
      LV UUID               vzoIE6-uZrX-10Du-UD78-314Y-WXmz-f7SXyY
      LV Write Access       read/write
      LV Status             available
      # open                1
      LV Size               100.00 GB
      Current LE            25600
      Segments              1
      Allocation            inherit
      Read ahead sectors    auto
      - currently set to    256
      Block device          253:2
      
    4. Verify there is available space in the volume group VGExaDb so the logical volume can be extended.

      If the command shows there is zero free space, then neither the logical volume nor the file system can be extended.

      # lvm vgdisplay VGExaDb -s
      
      "VGExaDb" 556.80 GB [154.00 GB used / 402.80 GB free]
      
  2. Shut down any software that uses /u01.

    The following software typically uses /u01:

    • Oracle Clusterware, Oracle ASM, and Oracle Database

      # Grid_home/bin/crsctl stop crs
      
    • Trace File Analyzer

      # Grid_home/bin/tfactl stop
      
    • OS Watcher

      # /opt/oracle.oswatcher/osw/stopOSW.sh
      
    • Oracle Enterprise Manager agent

      (oracle)$ agent_home/bin/emctl stop agent
      
  3. Unmount the partition as the root user.
    # umount /u01
    

    Note:

    If the umount command reports that the file system is busy, then use the fuser(1) command to identify processes still accessing the file system that must be stopped before the umount command will succeed.

    # umount /u01
    umount: /u01: device is busy
    umount: /u01: device is busy
     
    # fuser -mv /u01
     
            USER      PID ACCESS COMMAND
    /u01:   root     6788 ..c..  ssh
            root     8422 ..c..  bash
            root    11444 ..c..  su
            oracle  11445 ..c..  bash
            oracle  11816 ....m  mgr
            root    16451 ..c..  bash
  4. Verify the file system.
    # e2fsck -f /dev/VGExaDb/LVDbOra1
    
  5. Extend the partition.

    Use the lvextend command to extend the logical volume. The file system is checked and resized in the steps that follow.

    # lvextend -L+XG --verbose /dev/VGExaDb/LVDbOra1
    

    In the preceding command, XG is the amount of space in GB by which the logical volume will be extended. The following example shows how to extend the logical volume by an additional 200 GB:

    # lvextend -L+200G --verbose /dev/VGExaDb/LVDbOra1
    

    Caution:

    Use extreme caution when reducing the size. The new size must be large enough to hold all the original content of the partition. To reduce the size, use a command similar to the following:

    lvreduce -L60G --resizefs --verbose /dev/VGExaDb/LVDbOra1
    

    In the preceding command, the size of /u01 is reduced to 60 GB.

  6. Check the /u01 file system using the e2fsck command.
    # e2fsck -f /dev/VGExaDb/LVDbOra1
    
  7. Resize the /u01 file system.
    # resize2fs -p /dev/VGExaDb/LVDbOra1
    
  8. Mount the partition.
    # mount -t ext3 /dev/VGExaDb/LVDbOra1 /u01
    
  9. Verify the space was extended.
    $ df -h /u01
    
  10. Restart any software that was stopped in step 2.
    • Oracle Clusterware, Oracle ASM, and Oracle Database

      # Grid_home/bin/crsctl start crs
      
    • Trace File Analyzer

      # Grid_home/bin/tfactl start
      
    • OS Watcher

      # /opt/oracle.cellos/vldrun -script oswatcher
      
    • Oracle Enterprise Manager agent

      (oracle)$ agent_home/bin/emctl start agent
      
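The lvreduce caution in step 5 can be made concrete: the target size must be larger than the space the file system already uses. A sketch of that check, using the Used column from the df output in step 1, which is in 1K blocks (the function name is illustrative, not an Oracle tool):

```shell
# Return success only if target_gb is big enough to hold used_kb of data.
safe_to_shrink() {
  used_kb=$1; target_gb=$2
  target_kb=$(( target_gb * 1024 * 1024 ))   # GB -> 1K blocks
  [ "$used_kb" -lt "$target_kb" ]
}

# /u01 in the df output of step 1 uses 67404336 1K-blocks (about 65 GB),
# so shrinking it to 60 GB would not leave room for its contents.
if safe_to_shrink 67404336 60; then
  echo "60 GB is large enough"
else
  echo "60 GB would not hold the current contents of /u01"
fi
# → 60 GB would not hold the current contents of /u01
```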
2.9.2.3 Reducing a Non-root LVM Partition on Systems Running Oracle Exadata System Software Release 11.2.3.2.1 or Later

You can reduce the size of a non-root (/u01) partition on systems running Oracle Exadata System Software release 11.2.3.2.1 or later.

Note:

  • This procedure does not require an outage on the server.

  • It is recommended that you back up your file system before performing this procedure.

  1. Use the df command to determine the amount of free and used space in the /u01 partition:
    # df -h /u01
    

    The following is an example of the output:

    Filesystem                    Size  Used Avail Use% Mounted on
    /dev/mapper/VGExaDb-LVDbOra1  193G   25G  159G  14% /u01
    
  2. Use the lvm command to display the current logical volume configuration used by the /u01 file system.

    In this example, the size of the LVDbOra1 partition needs to be reduced so that LVDbSys2 (30.00 GB in size) can be created by the dbserver_backup.sh script.

    # lvm vgdisplay VGExaDb -s
      "VGExaDb" 271.82 GB [250.04 GB used / 21.79 GB free]
    
    # lvm lvscan
      ACTIVE            '/dev/VGExaDb/LVDbSys1' [30.00 GB] inherit
      ACTIVE            '/dev/VGExaDb/LVDbSwap1' [24.00 GB] inherit
      ACTIVE            '/dev/VGExaDb/LVDbOra1' [196.04 GB] inherit
    
  3. Shut down any software that uses /u01.

    The following software typically uses /u01:

    • Oracle Clusterware, Oracle ASM, and Oracle Database

      # Grid_home/bin/crsctl stop crs
      
    • Trace File Analyzer

      # Grid_home/bin/tfactl stop
      
    • OS Watcher (releases earlier than 11.2.3.3.0)

      # /opt/oracle.oswatcher/osw/stopOSW.sh
      
    • ExaWatcher (release 11.2.3.3.0 and later)

      # /opt/oracle.ExaWatcher/ExaWatcher.sh --stop
      
    • Oracle Enterprise Manager agent

      (oracle)$ agent_home/bin/emctl stop agent
      
  4. Unmount the partition as the root user.
    # umount /u01
    

    Note:

    If the umount command reports that the file system is busy, then use the fuser(1) command to identify the processes still accessing the file system that must be stopped before the umount command will succeed.

    # umount /u01
    umount: /u01: device is busy
    umount: /u01: device is busy
    
    # fuser -mv /u01
    
            USER      PID ACCESS COMMAND
    /u01:   root     6788 ..c..  ssh
            root     8422 ..c..  bash
            root    11444 ..c..  su
            oracle  11445 ..c..  bash
            oracle  11816 ....m  mgr
            root    16451 ..c..  bash
  5. Verify the file system.
    # e2fsck -f /dev/VGExaDb/LVDbOra1
    
    fsck 1.39 (29-May-2006)
    e2fsck 1.39 (29-May-2006)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    DBORA: 72831/25706496 files (2.1% non-contiguous), 7152946/51389440 blocks
    
  6. Resize the file system to the required size (120G in the example below).
    # resize2fs /dev/VGExaDb/LVDbOra1 120G
    resize2fs 1.39 (29-May-2006)
    Resizing the filesystem on /dev/VGExaDb/LVDbOra1 to 31457280 (4k) blocks.
    The filesystem on /dev/VGExaDb/LVDbOra1 is now 31457280 blocks long.
    
  7. Resize the LVM to the desired size.
    # lvm lvreduce -L 120G --verbose /dev/VGExaDb/LVDbOra1
        Finding volume group VGExaDb
      WARNING: Reducing active logical volume to 120.00 GB
      THIS MAY DESTROY YOUR DATA (filesystem etc.)
    Do you really want to reduce LVDbOra1? [y/n]: y
        Archiving volume group "VGExaDb" metadata (seqno 8).
      Reducing logical volume LVDbOra1 to 120.00 GB
        Found volume group "VGExaDb"
        Found volume group "VGExaDb"
        Loading VGExaDb-LVDbOra1 table (253:2)
        Suspending VGExaDb-LVDbOra1 (253:2) with device flush
        Found volume group "VGExaDb"
        Resuming VGExaDb-LVDbOra1 (253:2)
        Creating volume group backup "/etc/lvm/backup/VGExaDb" (seqno 9).
      Logical volume LVDbOra1 successfully resized
    
  8. Mount the partition.
    # mount -t ext3 /dev/VGExaDb/LVDbOra1 /u01
    
  9. Verify the space was reduced.
    # df -h /u01
    Filesystem                    Size  Used Avail Use% Mounted on
    /dev/mapper/VGExaDb-LVDbOra1  119G   25G   88G  22% /u01
    
    # lvm vgdisplay -s
      "VGExaDb" 271.82 GB [174.00 GB used / 97.82 GB free]
    
  10. Restart any software that was stopped in step 3.
    • Oracle Clusterware, Oracle ASM, and Oracle Database

      # Grid_home/bin/crsctl start crs
      
    • Trace File Analyzer

      # Grid_home/bin/tfactl start
      
    • OS Watcher (releases earlier than 11.2.3.3.0)

      # /opt/oracle.cellos/vldrun -script oswatcher
      
    • ExaWatcher (release 11.2.3.3.0 to release 18.1.x)

      # /opt/oracle.cellos/vldrun -script oswatcher
      
    • ExaWatcher (release 19.0.0.0 and later)

      # systemctl start ExaWatcher
    • Oracle Enterprise Manager agent

      (oracle)$ agent_home/bin/emctl start agent
      
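The block count reported by resize2fs in step 6 can be sanity-checked with simple arithmetic: resize2fs counts 4 KiB blocks, so a size in GiB converts at 262144 blocks per GiB. A quick illustrative helper (not part of any Oracle tool):

```shell
# Convert a size in GiB to the 4 KiB block count that resize2fs reports.
gb_to_4k_blocks() {
  echo $(( $1 * 1024 * 1024 / 4 ))   # GiB -> KiB -> 4 KiB blocks
}

gb_to_4k_blocks 120
# → 31457280
```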

2.9.3 Extending the Swap Partition

This procedure describes how to extend the size of the swap (/swap) partition.

Note:

This procedure requires the system to be offline and restarted.

Keep at least 1 GB of free space in the VGExaDb volume group to be used for the Logical Volume Manager (LVM) snapshot created by the dbnodeupdate.sh utility during software maintenance. If you make snapshot-based backups of the / (root) and /u01 directories by following the steps in "Creating a Snapshot-Based Backup of Oracle Linux Database Server", then keep at least 6 GB of free space in the VGExaDb volume group.

  1. Collect information about the current environment.
    1. Use the swapon command to identify the swap partition.
      # swapon -s
      Filename    Type        Size       Used   Priority
      /dev/dm-2   partition   25165816   0      -1
      
    2. Use the lvm lvscan command to display the logical volumes.
      # lvm lvscan
      ACTIVE '/dev/VGExaDb/LVDbSys1' [30.00 GiB] inherit
      ACTIVE '/dev/VGExaDb/LVDbSys2' [30.00 GiB] inherit
      ACTIVE '/dev/VGExaDb/LVDbSwap1' [24.00 GiB] inherit
      ACTIVE '/dev/VGExaDb/LVDbOra1' [103.00 GiB] inherit
      ACTIVE '/dev/VGExaDb/LVDoNotRemoveOrUse' [1.00 GiB] inherit
      
    3. Use the vgdisplay command to display the current volume group configuration.
      # vgdisplay
        --- Volume group ---
        VG Name               VGExaDb
        System ID            
        Format                lvm2
        Metadata Areas        1
        Metadata Sequence No  4
        VG Access             read/write
        VG Status             resizable
        MAX LV                0
        Cur LV                3
        Open LV               3
        Max PV                0
        Cur PV                1
        Act PV                1
        VG Size               556.80 GB
        PE Size               4.00 MB
        Total PE              142541
        Alloc PE / Size       39424 / 154.00 GB
        Free  PE / Size       103117 / 402.80 GB
        VG UUID               po3xVH-9prk-ftEI-vijh-giTy-5chm-Av0fBu
      
    4. Use the pvdisplay command to display the name of the physical device created by LVM and used with the operating system.
      # pvdisplay
        --- Physical volume ---
        PV Name               /dev/sda2
        VG Name               VGExaDb
        PV Size               556.80 GB / not usable 2.30 MB
        Allocatable           yes
        PE Size (KByte)       4096
        Total PE              142541
        Free PE               103117
        Allocated PE          39424
        PV UUID               Eq0e7e-p1fS-FyGN-zrvj-0Oqd-oUSb-55x2TX
  2. Restart the server using the diagnostics.iso file.
    See Booting a Server using the Diagnostic ISO File in Oracle Exadata System Software User's Guide.
  3. Log in to the diagnostics shell as the root user.
    When prompted, enter the diagnostics shell.

    For example:

    Choose from following by typing letter in '()':
    (e)nter interactive diagnostics shell. Must use credentials 
    from Oracle support to login (reboot or power cycle to exit
    the shell),
    (r)estore system from NFS backup archive, 

    Type e to enter the diagnostics shell and log in as the root user.
    If prompted, log in to the system as the root user. If you are prompted for the root user password and do not have it, then contact Oracle Support Services.
  4. Verify the file system is valid.

    Use the following command:

    # fsck -f /dev/VGExaDb/LVDbSwap1
    
  5. Extend the partition.

    In this example, the logical volume is expanded to 80% of the physical volume size. At the same time, the file system is resized with this command. In the following command, the value for LogicalVolumePath is obtained by the lvm lvscan command, and the value for PhysicalVolumePath is obtained by the pvdisplay command.

    # lvextend -l+80%PVS --resizefs --verbose LogicalVolumePath PhysicalVolumePath
    
  6. Restart the system in normal mode.

2.10 Creating a Snapshot-Based Backup of Oracle Linux Database Server

A backup should be made before and after every significant change to the software on the database server. For example, a backup should be made before and after the following procedures:

  • Application of operating system patches
  • Application of Oracle patches
  • Reconfiguration of significant operating parameters
  • Installation or reconfiguration of significant non-Oracle software

Starting with Oracle Exadata System Software release 19.1.0, the SSHD ClientAliveInterval defaults to 600 seconds. If the time needed for completing backups exceeds 10 minutes, then you can specify a larger value for ClientAliveInterval in the /etc/ssh/sshd_config file. You must restart the SSH service for changes to take effect. After the long-running operation completes, remove the modification to the ClientAliveInterval parameter and restart the SSH service.
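For example, the interval could be raised before a long backup and restored afterwards along these lines. This sketch runs against a scratch copy of the file; on a database server you would edit /etc/ssh/sshd_config itself, and the 3600-second value is only an example:

```shell
# Demonstrate the ClientAliveInterval change on a scratch copy of the file.
cfg=$(mktemp)
printf 'Port 22\nClientAliveInterval 600\n' > "$cfg"

# Raise the interval before starting the long-running backup ...
sed -i 's/^ClientAliveInterval .*/ClientAliveInterval 3600/' "$cfg"
grep '^ClientAliveInterval' "$cfg"
# → ClientAliveInterval 3600

# ... then restore the default afterwards. In both cases, restart the
# SSH service (for example: systemctl restart sshd).
sed -i 's/^ClientAliveInterval .*/ClientAliveInterval 600/' "$cfg"
rm -f "$cfg"
```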

This section contains the following topics:

2.10.1 Creating a Snapshot-Based Backup of Exadata Database Servers X8M with Uncustomized Partitions

This procedure describes how to take a snapshot-based backup of an Oracle Exadata Database Machine X8M database server with uncustomized storage partitions.

Starting with Oracle Exadata Database Machine X8M and Oracle Exadata System Software release 19.3, the database servers use the following storage partitions:

File System Mount Point    Logical Volume Name

/ (root)                   LVDbSys1 or LVDbSys2, whichever is active.
/u01                       LVDbOra1
/home                      LVDbHome
/var                       LVDbVar1 or LVDbVar2, whichever is active.
/var/log                   LVDbVarLog
/var/log/audit             LVDbVarLogAudit
/tmp                       LVDbTmp

Note:

  • This procedure relies on the exact storage partitions that are originally shipped on the database server. If you modified the storage partitions in any way, then you cannot use this procedure and the associated recovery procedure without modification. Modifications include changing partition sizes, renaming partitions, adding partitions, or removing partitions.
  • All steps must be performed as the root user.
  1. Prepare a destination to hold the backup.

    The destination should reside outside of the local machine, such as a writable NFS location, and be large enough to hold the backup tar files. For non-customized partitions, the space needed for holding the backup is approximately 145 GB.

    You can use the following commands to prepare a backup destination using NFS.

    # mkdir -p /root/remote_FS
    # mount -t nfs -o rw,intr,soft,proto=tcp,nolock ip_address:/nfs_location/ /root/remote_FS

    In the mount command, ip_address is the IP address of the NFS server, and nfs_location is the NFS location holding the backups.

  2. Remove the LVDoNotRemoveOrUse logical volume.

    The logical volume /dev/VGExaDb/LVDoNotRemoveOrUse is a placeholder to make sure there is always free space available to create a snapshot.

    Use the following script to check for the existence of the LVDoNotRemoveOrUse logical volume and remove it if present.

    lvm lvdisplay --ignorelockingfailure /dev/VGExaDb/LVDoNotRemoveOrUse
    if [ $? -eq 0 ]; then 
      # LVDoNotRemoveOrUse logical volume exists. 
      lvm lvremove -f /dev/VGExaDb/LVDoNotRemoveOrUse 
      if [ $? -ne 0 ]; then 
        echo "Unable to remove logical volume: LVDoNotRemoveOrUse. Do not proceed with backup." 
      fi
    fi

    If the LVDoNotRemoveOrUse logical volume does not exist, then do not proceed with the remaining steps and determine the reason.

  3. Determine the active system volume.
    You can use the imageinfo command and examine the device hosting the active system partition.
    # imageinfo
    
    Kernel version: 4.14.35-1902.5.1.4.el7uek.x86_64 #2 SMP Wed Oct 9 19:29:16 PDT 2019 x86_64
    Image kernel version: 4.14.35-1902.5.1.4.el7uek
    Image version: 19.3.1.0.0.191018
    Image activated: 2019-11-04 19:18:32 -0800
    Image status: success
    Node type: KVMHOST
    System partition on device: /dev/mapper/VGExaDb-LVDbSys1

     In the imageinfo output, the system partition device ends with the name of the logical volume that supports the active root (/) file system. Depending on the system image that is in use, the logical volume name is LVDbSys1 or LVDbSys2. Likewise, the logical volume for the /var file system is either LVDbVar1 or LVDbVar2.

    You can also confirm the active devices by using the df -hT command and examining the output associated with the root (/) and /var file systems. For example:

    # df -hT
    Filesystem                          Type      Size  Used Avail Use% Mounted on
    devtmpfs                            devtmpfs  378G     0  378G   0% /dev
    tmpfs                               tmpfs     755G  1.0G  754G   1% /dev/shm
    tmpfs                               tmpfs     378G  4.8M  378G   1% /run
    tmpfs                               tmpfs     378G     0  378G   0% /sys/fs/cgroup
    /dev/mapper/VGExaDb-LVDbSys1        xfs        15G  7.7G  7.4G  52% /
    /dev/sda1                           xfs       510M  112M  398M  22% /boot
    /dev/sda2                           vfat      254M  8.5M  246M   4% /boot/efi
    /dev/mapper/VGExaDb-LVDbHome        xfs       4.0G   33M  4.0G   1% /home
    /dev/mapper/VGExaDb-LVDbVar1        xfs       2.0G  139M  1.9G   7% /var
    /dev/mapper/VGExaDb-LVDbVarLog      xfs        18G  403M   18G   3% /var/log
    /dev/mapper/VGExaDb-LVDbVarLogAudit xfs      1014M  143M  872M  15% /var/log/audit
    /dev/mapper/VGExaDb-LVDbTmp         xfs       3.0G  148M  2.9G   5% /tmp
    /dev/mapper/VGExaDb-LVDbOra1        xfs       100G   32G   69G  32% /u01
    tmpfs                               tmpfs      76G     0   76G   0% /run/user/0

    The remaining examples in the procedure use LVDbSys1 and LVDbVar1, which is consistent with the above imageinfo and df output. However, if the active image is using LVDbSys2, then modify the examples in the following steps to use LVDbSys2 instead of LVDbSys1, and LVDbVar2 instead of LVDbVar1.

  4. Take snapshots of the logical volumes on the server.

    Depending on the active system partition identified in the previous step, remember to use either LVDbSys1 or LVDbSys2 to identify the logical volume for the root (/) file system, and likewise use either LVDbVar1 or LVDbVar2 to identify the logical volume for the /var file system.

    # lvcreate -L1G -s -c 32K -n root_snap /dev/VGExaDb/LVDbSys1
    # lvcreate -L5G -s -c 32K -n u01_snap /dev/VGExaDb/LVDbOra1
    # lvcreate -L1G -s -c 32K -n home_snap /dev/VGExaDb/LVDbHome
    # lvcreate -L1G -s -c 32K -n var_snap /dev/VGExaDb/LVDbVar1
    # lvcreate -L1G -s -c 32K -n varlog_snap /dev/VGExaDb/LVDbVarLog
    # lvcreate -L1G -s -c 32K -n audit_snap /dev/VGExaDb/LVDbVarLogAudit
    # lvcreate -L1G -s -c 32K -n tmp_snap /dev/VGExaDb/LVDbTmp
  5. Label the snapshots.
    # xfs_admin -L DBSYS_SNAP /dev/VGExaDb/root_snap
    # xfs_admin -L DBORA_SNAP /dev/VGExaDb/u01_snap
    # xfs_admin -L HOME_SNAP /dev/VGExaDb/home_snap
    # xfs_admin -L VAR_SNAP /dev/VGExaDb/var_snap
    # xfs_admin -L VARLOG_SNAP /dev/VGExaDb/varlog_snap
    # xfs_admin -L AUDIT_SNAP /dev/VGExaDb/audit_snap
    # xfs_admin -L TMP_SNAP /dev/VGExaDb/tmp_snap
  6. Mount the snapshots.
    Mount all of the snapshots under a common directory location; for example, /root/mnt.
    # mkdir -p /root/mnt
    # mount -t xfs -o nouuid /dev/VGExaDb/root_snap /root/mnt
    # mkdir -p /root/mnt/u01
    # mount -t xfs -o nouuid /dev/VGExaDb/u01_snap /root/mnt/u01
    # mkdir -p /root/mnt/home
    # mount -t xfs -o nouuid /dev/VGExaDb/home_snap /root/mnt/home
    # mkdir -p /root/mnt/var
    # mount -t xfs -o nouuid /dev/VGExaDb/var_snap /root/mnt/var
    # mkdir -p /root/mnt/var/log
    # mount -t xfs -o nouuid /dev/VGExaDb/varlog_snap /root/mnt/var/log
    # mkdir -p /root/mnt/var/log/audit
    # mount -t xfs -o nouuid /dev/VGExaDb/audit_snap /root/mnt/var/log/audit
    # mkdir -p /root/mnt/tmp
    # mount -t xfs -o nouuid /dev/VGExaDb/tmp_snap /root/mnt/tmp
  7. Back up the snapshots.
    Use the following commands to write a backup file as a compressed tar file to your prepared NFS backup destination.
    # cd /root/mnt
    # tar --acls -pjcvf /root/remote_FS/mybackup.tar.bz2 * /boot > /tmp/backup_tar.stdout 2> /tmp/backup_tar.stderr
  8. Check the /tmp/backup_tar.stderr file for any significant errors.
    Errors about tar being unable to archive open sockets, and other similar errors, can be ignored.
  9. Unmount and remove all of the snapshots.
    # cd /
    # umount /root/mnt/tmp
    # umount /root/mnt/var/log/audit
    # umount /root/mnt/var/log
    # umount /root/mnt/var
    # umount /root/mnt/home
    # umount /root/mnt/u01
    # umount /root/mnt
    # lvremove /dev/VGExaDb/tmp_snap
    # lvremove /dev/VGExaDb/audit_snap
    # lvremove /dev/VGExaDb/varlog_snap
    # lvremove /dev/VGExaDb/var_snap
    # lvremove /dev/VGExaDb/home_snap
    # lvremove /dev/VGExaDb/u01_snap
    # lvremove /dev/VGExaDb/root_snap
  10. Unmount the NFS backup destination.
    # umount /root/remote_FS
  11. Remove the mount point directories that you created during this procedure.
    # rm -r /root/mnt
    # rmdir /root/remote_FS
  12. Recreate the /dev/VGExaDb/LVDoNotRemoveOrUse logical volume.
    # lvm lvcreate -n LVDoNotRemoveOrUse -L2G VGExaDb -y
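The identification of the active logical volumes in step 3 can also be scripted. The following sketch assumes the findmnt utility (part of util-linux) is available, as it is on these systems:

```shell
# Capture the devices backing the root (/) and /var file systems.
# Expected values are /dev/mapper/VGExaDb-LVDbSys1 or -LVDbSys2 for root,
# and /dev/mapper/VGExaDb-LVDbVar1 or -LVDbVar2 for /var.
ROOT_LV=$(findmnt -n -o SOURCE /)
VAR_LV=$(findmnt -n -o SOURCE /var)
echo "Active root device: ${ROOT_LV}"
echo "Active /var device: ${VAR_LV}"
```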

2.10.2 Creating a Snapshot-Based Backup of Exadata X8 or Earlier Database Servers with Uncustomized Partitions

This procedure describes how to take a snapshot-based backup. The values shown in the procedure are examples.

If you have not customized the database server partitions from their original shipped configuration, then use the procedures in this section to take a backup and to restore the database server from that backup.

Note:

  • The recovery procedure restores the exact partitions, including the names and sizes, as they were originally shipped. If you modified the partitions in any way, then you cannot use this procedure. Modifications include changing partition sizes, renaming partitions, and adding or removing partitions.

  • All steps must be performed as the root user.

  1. Prepare a destination to hold the backup.

    The destination can be a large, writable NFS location. The NFS location should be large enough to hold the backup tar files. For uncustomized partitions, 145 GB should be adequate.

    1. Create a mount point for the NFS share.
      mkdir -p /root/tar
    2. Mount the NFS location.

      In the following command, ip_address is the IP address of the NFS server, and nfs_location is the NFS location.

      mount -t nfs -o rw,intr,soft,proto=tcp,nolock ip_address:/nfs_location/ /root/tar
      
  2. Take a snapshot-based backup of the / (root) and /u01 directories.
    1. Create a snapshot named root_snap for the root directory.

      LVDbSys1 is used in the example below, but you should use the value based on the output of imageinfo. If the active image is on LVDbSys2, then the command would be: lvcreate -L1G -s -c 32K -n root_snap /dev/VGExaDb/LVDbSys2.

      lvcreate -L1G -s -c 32K -n root_snap /dev/VGExaDb/LVDbSys1
    2. Label the snapshot.
      e2label /dev/VGExaDb/root_snap DBSYS_SNAP
      
    3. Determine the file system type of the / (root) and /u01 directories.
      • InfiniBand Network Fabric-based servers running Oracle Exadata System Software release 12.1.2.1.0 or later use the ext4 file system type.
      • InfiniBand Network Fabric-based servers running a release of Oracle Exadata System Software earlier than 12.1.2.1.0 use the ext3 file system type.
      • Exadata X5 servers or earlier server models that were updated to Oracle Exadata System Software release 12.1.2.1.0 or later also use the ext3 file system type.
      # mount -l
      /dev/mapper/VGExaDb-LVDbSys1 on / type ext4 (rw) [DBSYS]
      ...
      
    4. Mount the snapshot.

      In the mount command below, filesystem_type_of_root_directory is a placeholder for the file system type as determined in the previous step.

      mkdir /root/mnt
      mount /dev/VGExaDb/root_snap /root/mnt -t filesystem_type_of_root_directory
    5. Create a snapshot named u01_snap for the /u01 directory.
      lvcreate -L5G -s -c 32K -n u01_snap /dev/VGExaDb/LVDbOra1
    6. Label the snapshot.
      e2label /dev/VGExaDb/u01_snap DBORA_SNAP
    7. Mount the snapshot.

      In the mount command below, filesystem_type_of_u01_directory is a placeholder for the file system type as determined in step 2.c above.

      mkdir -p /root/mnt/u01
      mount /dev/VGExaDb/u01_snap /root/mnt/u01 -t filesystem_type_of_u01_directory
    8. Change to the directory for the backup.
      cd /root/mnt
    9. Create the backup file using one of the following commands:
      • System does not have NFS mount points:

        # tar -pjcvf /root/tar/mybackup.tar.bz2 * /boot --exclude \
        tar/mybackup.tar.bz2 > /tmp/backup_tar.stdout 2> /tmp/backup_tar.stderr
      • System has NFS mount points:

        In the following command, nfs_mount_points are the NFS mount points. Excluding the mount points prevents the generation of large files and long backup times.

        # tar -pjcvf /root/tar/mybackup.tar.bz2 * /boot --exclude \
        tar/mybackup.tar.bz2 --exclude nfs_mount_points >         \
        /tmp/backup_tar.stdout 2> /tmp/backup_tar.stderr
    10. Check the /tmp/backup_tar.stderr file for any significant errors.
      Errors about tar being unable to archive open sockets, and other similar errors, can be ignored.
  3. Unmount the snapshots and remove the snapshots for the / (root) and /u01 directories.
    cd /
    umount /root/mnt/u01
    umount /root/mnt
    /bin/rm -rf /root/mnt
    lvremove /dev/VGExaDb/u01_snap
    lvremove /dev/VGExaDb/root_snap
  4. Unmount the NFS share.
    umount /root/tar
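When checking /tmp/backup_tar.stderr in the procedures above, it can help to filter out the ignorable messages so that only entries needing review remain. The 'socket ignored' pattern below matches GNU tar's message for skipped sockets; extend the pattern for other messages that you have confirmed are benign.

```shell
# Show only stderr entries that are not ignorable "socket ignored"
# messages; review anything this prints before trusting the backup.
grep -v 'socket ignored' /tmp/backup_tar.stderr
```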

2.10.3 Creating a Snapshot-Based Backup of Oracle Linux Database Server with Customized Partitions

When you have customized the partitions, the backup procedure is generally the same as the procedure used for non-customized database servers, with the following alterations:

  • You must add the commands to back up any additional partitions. Throughout the procedure, use the command relating to the /u01 partition as a template, and modify the arguments to suit.

  • If any partitions are altered, then use the modified attributes in your commands. For example, if /u01 is renamed to /myown_u01, then use /myown_u01 in the commands.
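For example, using the /u01 commands as a template, backing up a hypothetical custom file system /myown_u01 on a logical volume named LVDbMyOwn might look like the following. The logical volume name, snapshot size, label, and ext4 file system type are all assumptions; substitute the values from your own configuration.

```shell
# Hypothetical custom partition; adjust the LV name, snapshot size,
# label, and file system type to match your customization.
lvcreate -L5G -s -c 32K -n myown_u01_snap /dev/VGExaDb/LVDbMyOwn
e2label /dev/VGExaDb/myown_u01_snap MYOWN_SNAP
mkdir -p /root/mnt/myown_u01
mount /dev/VGExaDb/myown_u01_snap /root/mnt/myown_u01 -t ext4
```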

2.11 Recovering Oracle Linux Database Servers Using a Snapshot-Based Backup

You can recover the file systems of a database server running Oracle Linux using a snapshot-based backup after severe disaster conditions occur on the database server, or when the server hardware is replaced to such an extent that it amounts to new hardware.

For example, replacing all hard disks leaves no trace of the original software on the system, which is equivalent to replacing the complete system as far as the software is concerned. In addition, this section provides a method for disaster recovery of the database servers using an LVM snapshot-based backup taken while the database server was healthy, before the disaster condition.

The recovery procedures described in this section do not include backup or recovery of storage servers or the data within the Oracle databases. Oracle recommends testing the backup and recovery procedures on a regular basis.

2.11.1 Overview of Snapshot-Based Recovery of Database Servers

The recovery process consists of a series of tasks.

The recovery procedures use the diagnostics.iso image as a virtual CD-ROM to restart the database server in rescue mode using the ILOM.

Note:

Restoring files from tape may require additional drives to be loaded, and is not covered in this chapter. Oracle recommends backing up files to an NFS location, and using existing tape options to back up and recover from the NFS host.

The general work flow includes the following tasks:

  1. Recreate the following:

    • Boot partitions
    • Physical volumes
    • Volume groups
    • Logical volumes
    • File system
    • Swap partition
  2. Activate the swap partition.
  3. Ensure the /boot partition is the active boot partition.
  4. Restore the data.
  5. Reconfigure GRUB.
  6. Restart the server.

If you use quorum disks, then after recovering the database servers from backup, you must manually reconfigure the quorum disk for the recovered server. See Reconfigure Quorum Disk After Restoring a Database Server for more information.

2.11.2 Recovering Oracle Linux Database Server with Uncustomized Partitions

You can recover the Oracle Linux database server from a snapshot-based backup when using uncustomized partitions.

This procedure is applicable when the layout of the partitions, logical volumes, file systems, and their sizes are equal to the layout when the database server was initially deployed.

Caution:

All existing data on the disks is lost during the procedure.
  1. Prepare an NFS server to host the backup archive file (mybackup.tar.bz2).

    The NFS server must be accessible by IP address.

    For example, on an NFS server with the IP address nfs_ip, where the directory /export is exported as an NFS mount, put the backup file (mybackup.tar.bz2) in the /export directory.

  2. Restart the recovery target system using the diagnostics.iso file.
    See Booting a Server using the Diagnostic ISO File in Oracle Exadata System Software User's Guide.
  3. Answer as indicated in these examples when prompted by the system. The responses are shown in bold.

    Note that for Oracle Exadata System Software release 12.1.2.2.0 or later, DHCP is used and you do not have to manually set up the network.

    • If you are using Oracle Exadata System Software release 18.1 or later, running on Oracle Exadata Database Machine X7 or later, then the prompt looks like the following:


      (The boot screen prompt for these systems is shown in the illustration boot_screen_18.1.jpg.)
    • If you are using Oracle Exadata System Software release 18.1 or later and restoring through one of the 10GbE Ethernet SFP+ ports on Oracle Exadata Database Machine X3-2 or later, then the prompt looks like the following:

      ------------------------------------------------------------------------------ 
               Choose from the following by typing letter in '()': 
                 (e)nter interactive diagnostics shell. 
                   Use diagnostics shell password to login as root user 
                   (reboot or power cycle to exit the shell), 
                 (r)estore system from NFS backup archive, 
       Select: r 
       Continue (y/n) [n]: y 
       Rescue password: 
       [INFO     ] Enter path to the backup file on the NFS server in format: 
               Enter path to the backup file on the NFS server in format: 
               <ip_address_of_the_NFS_share>:/<path>/<archive_file> 
               For example, 10.124.1.15:/export/nfs/share/backup.2010.04.08.tar.bz2 
       NFS line: <nfs_ip>:/export/mybackup.tar.bz2 
       [INFO     ] The backup file could be created either from LVM or non-LVM 
      based COMPUTE node 
       [INFO     ] Versions below 11.2.1.3.0 do not support LVM based partitioning 
       Use LVM based scheme. (y/n) [y]: y 
       Configure network settings on host via DHCP. (y/n) [y]: n 
       Configure bonded network interface. (y/n) [y]: y 
       IP Address of bondeth0 on this host: <IP address of the DB host> 
       
      Netmask of bondeth0 on this host: <netmask for the above IP address>
       Bonding mode:active-backup or 802.3ad [802.3ad]: active-backup 
       Slave interface1 for bondeth0 (ethX) [eth4]: eth4 
       Slave interface2 for bondeth0 (ethX) [eth5]: eth5 
      ...
       [  354.619610] bondeth0: first active interface up!
       [  354.661427] ixgbe 0000:13:00.1 eth5: NIC Link is Up 10 Gbps, Flow Control: RX/TX
       [  354.724414] bondeth0: link status definitely up for interface eth5, 10000 Mbps full duplex
       Default gateway: <Gateway for the above IP address>
      ------------------------------------------------------------------------------ 
    • If you are using Oracle Exadata System Software release 12.1.x or 12.2.x, then the prompts look like the following:

      ------------------------------------------------------------------------------ 
       Use diagnostics shell password to login as root user
                  (reboot or power cycle to exit the shell),
                (r)estore system from NFS backup archive.
      Select: r
      Continue (y/n) [n]: y
      Rescue password:
      [INFO: ] Enter path to the backup file on the NFS server in format:
             Enter path to the backup file on the NFS server in format:
             <ip_address_of_the_NFS_share>:/<path>/<archive_file>
             For example, 10.124.1.15:/export/nfs/share/backup.2010.04.08.tar.bz2
      NFS line: <nfs_ip>:/export/mybackup.tar.bz2
      [INFO: ] The backup file could be created either from LVM or non-LVM based COMPUTE node
      [INFO: ] Versions below 11.2.1.3.0 do not support LVM based partitioning
      Use LVM based scheme. (y/n) [y]: y
      ------------------------------------------------------------------------------ 
    • If you are using Oracle Exadata System Software release earlier than 12.1.2.2.0, then the prompts look like the following:

      ------------------------------------------------------------------------------ 
            Choose from following by typing letter in '()':
         (e)nter interactive diagnostics shell. Must use credentials from Oracle
            support to login (reboot or power cycle to exit the shell),
         (r)estore system from NFS backup archive,
      Select:r
      Are you sure (y/n) [n]:y
       
      The backup file could be created either from LVM or non-LVM based compute node
      versions below 11.2.1.3.1 and 11.2.2.1.0 or higher do not support LVM based partitioning
      use LVM based scheme(y/n):y
       
      Enter path to the backup file on the NFS server in format:
      ip_address_of_the_NFS_share:/path/archive_file
      For example, 10.10.10.10:/export/operating_system.tar.bz2
      NFS line:<nfs_ip>:/export/mybackup.tar.bz2
      IP Address of this host:IP address of the DB host
      Netmask of this host:netmask for the above IP address
      Default gateway:Gateway for the above IP address. If there is no default gateway in your network, enter 0.0.0.0.
      ------------------------------------------------------------------------------ 
      

    When the recovery completes, the log in screen appears.

  4. Log in as the root user.
    If you do not have the password for the root user, then contact Oracle Support Services.
  5. Restart the system.
    # shutdown -r now
    The restoration process is complete.
  6. Verify that all Oracle software can start and function by logging in to the database server.
    The /usr/local/bin/imagehistory command indicates that the database server was reconstructed.

    The following is an example of the output:

    # imagehistory
    
    Version                  : 11.2.2.1.0
    Image activation date    : 2010-10-13 13:42:58 -0700
    Imaging mode             : fresh
    Imaging status           : success
    
    Version                  : 11.2.2.1.0
    Image activation date    : 2010-10-30 19:41:18 -0700
    Imaging mode             : restore from nfs backup
    Imaging status           : success
    
  7. On systems with InfiniBand Network Fabric only, run reclaimdisks.sh on the restored database server.
    /opt/oracle.SupportTools/reclaimdisks.sh -free -reclaim

    Note:

    This command is not required on RoCE-based Exadata database servers.

  8. If the recovery was on Oracle Exadata Database Machine Eighth Rack, then perform the procedure described in Configuring Oracle Exadata Database Machine Eighth Rack Oracle Linux Database Server After Recovery.
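For reference, the NFS server preparation in step 1 might look like the following on the NFS server itself. The /export path and the client network are assumptions; use your own values and your distribution's NFS service names.

```shell
# Hypothetical NFS server setup (run on the NFS server, not on the
# database server): export /export to the client network 203.0.113.0/24.
mkdir -p /export
echo '/export 203.0.113.0/24(rw,no_root_squash)' >> /etc/exports
exportfs -ra
```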

2.11.3 Recovering Exadata X8M Database Servers with Customized Partitions

This procedure describes how to recover an Oracle Exadata Database Machine X8M Oracle Linux database server with RoCE Network Fabric from a snapshot-based backup when using customized partitions.

  1. Prepare an NFS server to host the backup archive file (mybackup.tar.bz2).

    The NFS server must be accessible by IP address.

    For example, on an NFS server with the IP address nfs_ip, where the directory /export is exported as an NFS mount, put the backup file (mybackup.tar.bz2) in the /export directory.

  2. Restart the recovery target system using the diagnostics.iso file.
    See Booting a Server using the Diagnostic ISO File in Oracle Exadata System Software User's Guide.
  3. Log in to the diagnostics shell as the root user.
    When prompted, enter the diagnostics shell.

    For example:

    Choose from following by typing letter in '()':
    (e)nter interactive diagnostics shell. Must use credentials 
    from Oracle support to login (reboot or power cycle to exit
    the shell),
    (r)estore system from NFS backup archive, 
    Type e to enter the diagnostics shell and log in as the root user.
    If prompted, log in to the system as the root user. If you are prompted for the root user password and do not have it, then contact Oracle Support Services.
  4. If required, use /opt/MegaRAID/storcli/storcli64 to configure the disk controller to set up the disks.
  5. If it is mounted, unmount /mnt/cell.
    # umount /mnt/cell
  6. Create the boot partition.
    1. Start an interactive session using the parted command.
      # parted /dev/sda
    2. Assign a disk label.
      (parted) mklabel gpt
    3. Set the unit size as sector.
      (parted) unit s
    4. Check the partition table by displaying the existing partitions.
      (parted) print
    5. Remove the partitions listed in the previous step.
      (parted) rm part#
    6. Create a new first partition.
      (parted) mkpart primary 64s 1048639s
    7. Specify this is a bootable partition.
      (parted) set 1 boot on
  7. Create second primary (boot) and third primary (LVM) partitions.
    1. Create a second primary partition as a UEFI boot partition with fat32.
      (parted) mkpart primary fat32 1048640s 1572927s 
      (parted) set 2 boot on
    2. Create a new third partition.
      (parted) mkpart primary 1572928s -1s
    3. Configure the third partition as a physical volume.
      (parted) set 3 lvm on
    4. Write the information to disk, then quit.
      (parted) quit
  8. Create the physical volume and volume group.
    # lvm pvcreate /dev/sda3
    # lvm vgcreate VGExaDb /dev/sda3

     If the physical volume or volume group already exists, then remove and re-create them as follows:

    # lvm vgremove VGExaDb
    # lvm pvremove /dev/sda3
    # lvm pvcreate /dev/sda3
    # lvm vgcreate VGExaDb /dev/sda3
  9. Re-create the customized LVM partitions, then create and mount the file systems.

    Note:

    Use the following information and examples as guidance for this step. You must make the necessary adjustments for customized LVM partitions and file systems. For example, you may need to adjust the names and sizes of various partitions to match your previous customizations, or you may need to create additional custom partitions.
    1. Create the logical volumes.

      For example, the following commands re-create the logical volumes that exist by default on Oracle Exadata Database Machine X8M systems with Oracle Exadata System Software release 19.3 or later:

      # lvm lvcreate -n LVDbSys1 -L15G VGExaDb
      # lvm lvcreate -n LVDbSys2 -L15G VGExaDb
      # lvm lvcreate -n LVDbOra1 -L100G VGExaDb
      # lvm lvcreate -n LVDbHome -L4G VGExaDb
      # lvm lvcreate -n LVDbVar1 -L2G VGExaDb
      # lvm lvcreate -n LVDbVar2 -L2G VGExaDb
      # lvm lvcreate -n LVDbVarLog -L18G VGExaDb
      # lvm lvcreate -n LVDbVarLogAudit -L1G VGExaDb
      # lvm lvcreate -n LVDbTmp -L3G VGExaDb
    2. Create the file systems.
      # mkfs.xfs -f /dev/VGExaDb/LVDbSys1
      # mkfs.xfs -f /dev/VGExaDb/LVDbSys2
      # mkfs.xfs -f /dev/VGExaDb/LVDbOra1
      # mkfs.xfs -f /dev/VGExaDb/LVDbHome
      # mkfs.xfs -f /dev/VGExaDb/LVDbVar1
      # mkfs.xfs -f /dev/VGExaDb/LVDbVar2
      # mkfs.xfs -f /dev/VGExaDb/LVDbVarLog
      # mkfs.xfs -f /dev/VGExaDb/LVDbVarLogAudit
      # mkfs.xfs -f /dev/VGExaDb/LVDbTmp
      # mkfs.xfs -f /dev/sda1
    3. Label the file systems.
      # xfs_admin -L DBSYS /dev/VGExaDb/LVDbSys1
      # xfs_admin -L DBORA /dev/VGExaDb/LVDbOra1
      # xfs_admin -L HOME /dev/VGExaDb/LVDbHome
      # xfs_admin -L VAR /dev/VGExaDb/LVDbVar1
      # xfs_admin -L DIAG /dev/VGExaDb/LVDbVarLog
      # xfs_admin -L AUDIT /dev/VGExaDb/LVDbVarLogAudit
      # xfs_admin -L TMP /dev/VGExaDb/LVDbTmp
      # xfs_admin -L BOOT /dev/sda1
    4. Create mount points for all the partitions to mirror the original system, and mount the respective partitions.

      For example, assuming that /mnt is used as the top level directory for the recovery operation, you could use the following commands to create the directories and mount the partitions:

      # mount -t xfs /dev/VGExaDb/LVDbSys1 /mnt
      # mkdir -p /mnt/u01
      # mount -t xfs /dev/VGExaDb/LVDbOra1 /mnt/u01
      # mkdir -p /mnt/home
      # mount -t xfs /dev/VGExaDb/LVDbHome /mnt/home
      # mkdir -p /mnt/var
      # mount -t xfs /dev/VGExaDb/LVDbVar1 /mnt/var
      # mkdir -p /mnt/var/log
      # mount -t xfs /dev/VGExaDb/LVDbVarLog /mnt/var/log
      # mkdir -p /mnt/var/log/audit
      # mount -t xfs /dev/VGExaDb/LVDbVarLogAudit /mnt/var/log/audit
      # mkdir -p /mnt/tmp
      # mount -t xfs /dev/VGExaDb/LVDbTmp /mnt/tmp
      # mkdir -p /mnt/boot
      # mount -t xfs /dev/sda1 /mnt/boot
  10. Create the system swap space.
    For Oracle Exadata Database Machine X8M, with Oracle Exadata System Software release 19.3 or later, the default swap size is 16 GB.

    For example:

    # lvm lvcreate -n LVDbSwap1 -L16G VGExaDb
    # mkswap -L SWAP /dev/VGExaDb/LVDbSwap1
    
  11. Create /mnt/boot/efi, label /dev/sda2, and mount /dev/sda2 on /mnt/boot/efi with type vfat.
    # mkdir /mnt/boot/efi
    # dosfslabel /dev/sda2 ESP
    # mount /dev/sda2 /mnt/boot/efi -t vfat
  12. Bring up the network.
    # ip address add ip_address_for_eth0/netmask_for_eth0 dev eth0
    # ip link set up eth0
    # ip route add default via gateway_address dev eth0
  13. Mount the NFS server where you have the backup.

    The following example assumes that the backup is located in the /export directory of the NFS server with IP address nfs_ip.

    # mkdir -p /root/mnt
    # mount -t nfs -o ro,intr,soft,proto=tcp,nolock nfs_ip:/export /root/mnt
  14. Restore from backup.
    # tar --acls -pjxvf /root/mnt/mybackup.tar.bz2 -C /mnt
  15. Unmount the restored file systems.

    For example:

    # umount /mnt/boot/efi
    # umount /mnt/boot
    # umount /mnt/tmp
    # umount /mnt/var/log/audit
    # umount /mnt/var/log
    # umount /mnt/var
    # umount /mnt/home
    # umount /mnt/u01
    # umount /mnt
  16. Detach the diagnostics.iso file.
  17. Check the boot devices and boot order for the ExadataLinux_1 device.
    1. Check the available boot devices.
      # efibootmgr
      BootCurrent: 000C
      Timeout: 1 seconds
      BootOrder: 000C,0001,0002,0003,0004,0005,0007,0008,0009,000A,000B
      Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit  Network Connection
      Boot0002* NET1:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0003* NET2:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0004* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0005* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0007* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0008* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0009* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000A* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000B* Oracle Linux
      Boot000C* USB:SUN

      If the Boot0000* ExadataLinux_1 device is not listed, then create the device:

      # efibootmgr -c -d /dev/sda -p 2 -l '\EFI\REDHAT\SHIM.EFI' -L 'ExadataLinux_1'
      BootCurrent: 000C
      Timeout: 1 seconds
      BootOrder: 0000,000C,0001,0002,0003,0004,0005,0007,0008,0009,000A,000B
      Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit  Network Connection
      Boot0002* NET1:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0003* NET2:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0004* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0005* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0007* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0008* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0009* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000A* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000B* Oracle Linux
      Boot000C* USB:SUN
      Boot0000* ExadataLinux_1
    2. Configure the Boot0000* ExadataLinux_1 device to be first in the boot order.
      # efibootmgr -o 0000
      BootCurrent: 000B
      Timeout: 1 seconds
      BootOrder: 0000
      Boot0000* ExadataLinux_1
      Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit  Network Connection
      Boot0002* NET1:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0003* NET2:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0004* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0005* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0007* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0008* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0009* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000A* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000B* USB:SUN
      Boot000C* UEFI OS
  18. Restart the system and update the boot order in the BIOS.
    # reboot

    Modify the boot order to set the ExadataLinux_1 boot device as the first device.

    1. Press F2 when booting the system.
    2. Go to the Setup Utility.
    3. Select BOOT.
    4. Set ExadataLinux_1 for Boot Option #1.
    5. Exit the Setup Utility.

    This completes the restoration procedure for the server.

  19. If the recovery was on Oracle Exadata Database Machine Eighth Rack, then perform the procedure described in Configuring Oracle Exadata Database Machine Eighth Rack Oracle Linux Database Server After Recovery.
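For scripted recoveries, the interactive parted session in steps 6 and 7 can also be expressed non-interactively with parted -s. This is a sketch that assumes /dev/sda and the same sector boundaries shown above; verify the boundaries against your own system before running it.

```shell
# Non-interactive equivalent of the parted session in steps 6 and 7.
# Assumes /dev/sda and the sector boundaries shown in the steps above.
parted -s /dev/sda mklabel gpt
parted -s /dev/sda mkpart primary 64s 1048639s
parted -s /dev/sda set 1 boot on
parted -s /dev/sda mkpart primary fat32 1048640s 1572927s
parted -s /dev/sda set 2 boot on
parted -s /dev/sda mkpart primary 1572928s -1s
parted -s /dev/sda set 3 lvm on
```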

2.11.4 Recovering Exadata Database Servers X7 or X8 with Customized Partitions

This procedure describes how to recover an Oracle Exadata Database Machine X7 or X8 Oracle Linux database server with InfiniBand Network Fabric from a snapshot-based backup when using customized partitions.

Note:

This task assumes you are running Oracle Exadata System Software release 18c (18.1.0) or greater.
  1. Prepare an NFS server to host the backup archive file (mybackup.tar.bz2).

    The NFS server must be accessible by IP address.

    For example, on an NFS server with the IP address nfs_ip, where the directory /export is exported as an NFS mount, put the backup file (mybackup.tar.bz2) in the /export directory.

  2. Restart the recovery target system using the diagnostics.iso file.
    See Booting a Server using the Diagnostic ISO File in Oracle Exadata System Software User's Guide.
  3. Log in to the diagnostics shell as the root user.
    When prompted, enter the diagnostics shell.

    For example:

    Choose from following by typing letter in '()':
    (e)nter interactive diagnostics shell. Must use credentials 
    from Oracle support to login (reboot or power cycle to exit
    the shell),
    (r)estore system from NFS backup archive, 
    Type e to enter the diagnostics shell and log in as the root user.
    If prompted, log in to the system as the root user. If you are prompted for the root user password and do not have it, then contact Oracle Support Services.
  4. If required, use /opt/MegaRAID/storcli/storcli64 (or /opt/MegaRAID/MegaCli/MegaCli64 for releases earlier than Oracle Exadata System Software 19c) to configure the disk controller to set up the disks.
  5. If it is mounted, unmount /mnt/cell.
    # umount /mnt/cell
  6. Create the boot partition.
    1. Start an interactive session using the parted command.
      # parted /dev/sda
    2. Assign a disk label.
      (parted) mklabel gpt
    3. Set the unit size as sector.
      (parted) unit s
    4. Check the partition table by displaying the existing partitions.
      (parted) print
    5. Remove the partitions listed in the previous step.
      (parted) rm part#
    6. Create a new first partition.
      (parted) mkpart primary 64s 1048639s
    7. Specify this is a bootable partition.
      (parted) set 1 boot on
  7. Create second primary (boot) and third primary (LVM) partitions.
    1. Create a second primary partition as a UEFI boot partition with fat32.
      (parted) mkpart primary fat32 1048640s 1572927s 
      (parted) set 2 boot on
    2. Create a new third partition.
      (parted) mkpart primary 1572928s -1s
    3. Configure the third partition as a physical volume.
      (parted) set 3 lvm on
    4. Write the information to disk, then quit.
      (parted) quit
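    As a sanity check, the sector ranges used above imply fixed partition sizes, assuming 512-byte sectors. The following arithmetic (illustrative only) confirms that the first (boot) partition is 512 MiB and the second (EFI system) partition is 256 MiB:

    ```shell
    # Partition sizes implied by the sector ranges above (512-byte sectors).
    p1_sectors=$(( 1048639 - 64 + 1 ))        # first (boot) partition
    p2_sectors=$(( 1572927 - 1048640 + 1 ))   # second (EFI system) partition
    echo "partition 1: $(( p1_sectors * 512 / 1024 / 1024 )) MiB"   # 512 MiB
    echo "partition 2: $(( p2_sectors * 512 / 1024 / 1024 )) MiB"   # 256 MiB
    ```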
  8. Create the physical volume and volume group.
    # lvm pvcreate /dev/sda3
    # lvm vgcreate VGExaDb /dev/sda3

    If the physical volume or volume group already exists, then remove and re-create them as follows:

    # lvm vgremove VGExaDb
    # lvm pvremove /dev/sda3
    # lvm pvcreate /dev/sda3
    # lvm vgcreate VGExaDb /dev/sda3
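    Optionally, confirm that the physical volume and volume group exist before proceeding. These are read-only queries using the names from this procedure:

    ```shell
    # Read-only verification of the LVM objects created above.
    lvm pvs /dev/sda3     # should list /dev/sda3 as a physical volume
    lvm vgs VGExaDb       # should list the VGExaDb volume group
    ```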
  9. Re-create the customized LVM partitions, then create and mount the file systems.

    Note:

    Use the following information and examples as guidance for this step. You must make the necessary adjustments for customized LVM partitions and file systems. For example, you may need to adjust the names and sizes of various partitions to match your previous customizations, or you may need to create additional custom partitions.
    1. Create the logical volumes.

      For example, to create logical volumes for the / (root) and /u01 file systems:

      # lvm lvcreate -n LVDbSys1 -L40G VGExaDb
      # lvm lvcreate -n LVDbOra1 -L100G VGExaDb
    2. Create the file systems.
      • If your environment uses the xfs file system type, then you could use the following commands to create the / (root), /u01, and /boot file systems:

        # mkfs.xfs /dev/VGExaDb/LVDbSys1 -f
        # mkfs.xfs /dev/VGExaDb/LVDbOra1 -f
        # mkfs.xfs /dev/sda1 -f
      • Alternatively, if your environment uses the ext4 file system type, then you could use the following commands:

        # mkfs.ext4 /dev/VGExaDb/LVDbSys1
        # mkfs.ext4 /dev/VGExaDb/LVDbOra1
        # mkfs.ext4 /dev/sda1
    3. Label the file systems.
      • If your environment uses the xfs file system type, then you could use the following commands to label the / (root), /u01, and /boot file systems:

        # xfs_admin -L DBSYS /dev/VGExaDb/LVDbSys1
        # xfs_admin -L DBORA /dev/VGExaDb/LVDbOra1
        # xfs_admin -L BOOT /dev/sda1
      • Alternatively, if your environment uses the ext4 file system type, then you could use the following commands:

        # e2label /dev/VGExaDb/LVDbSys1 DBSYS
        # e2label /dev/VGExaDb/LVDbOra1 DBORA
        # e2label /dev/sda1 BOOT
    4. Create mount points for all the partitions to mirror the original system, and mount the respective partitions.

      For example, assuming that /mnt is used as the top level directory for the recovery operation, you could use the following commands to create the directories and mount the partitions:

      # mkdir -p /mnt/u01
      # mkdir -p /mnt/boot
      # mount /dev/VGExaDb/LVDbSys1 /mnt -t filesystem_type
      # mount /dev/VGExaDb/LVDbOra1 /mnt/u01 -t filesystem_type
      # mount /dev/sda1 /mnt/boot -t filesystem_type

      In the preceding commands, specify xfs or ext4 as the filesystem_type according to your system configuration.

  10. Create the system swap space.
    For Oracle Exadata Database Machine X7 and X8 systems, the default swap size is 24 GB.

    For example:

    # lvm lvcreate -n LVDbSwap1 -L24G VGExaDb
    # mkswap -L SWAP /dev/VGExaDb/LVDbSwap1
    
  11. Create /mnt/boot/efi, label /dev/sda2, and mount /dev/sda2 on /mnt/boot/efi with type vfat.
    # mkdir /mnt/boot/efi
    # dosfslabel /dev/sda2 ESP
    # mount /dev/sda2 /mnt/boot/efi -t vfat
  12. Bring up the network.
    # ip address add ip_address_for_eth0/netmask_for_eth0 dev eth0
    # ip link set up eth0
    # ip route add default via gateway_address dev eth0
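    For example, with hypothetical values (address 10.128.1.10, a /24 netmask, and gateway 10.128.1.1), the commands might look like the following. Note that the ip address add command takes the netmask as a CIDR prefix length:

    ```shell
    # Illustrative values only; substitute your own management network settings.
    ip address add 10.128.1.10/24 dev eth0       # netmask expressed as /24
    ip link set up eth0
    ip route add default via 10.128.1.1 dev eth0
    ip addr show eth0                            # confirm the configuration
    ```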
  13. Mount the NFS server where you have the backup.

    The following example assumes that the backup is located in the /export directory of the NFS server with IP address nfs_ip.

    # mkdir -p /root/mnt
    # mount -t nfs -o ro,intr,soft,proto=tcp,nolock nfs_ip:/export /root/mnt
  14. Restore from backup.
    # tar -pjxvf /root/mnt/mybackup.tar.bz2 -C /mnt
  15. Unmount the restored file systems.

    For example:

    # umount /mnt/boot/efi
    # umount /mnt/boot
    # umount /mnt/tmp
    # umount /mnt/var/log/audit
    # umount /mnt/var/log
    # umount /mnt/var
    # umount /mnt/home
    # umount /mnt/u01
    # umount /mnt
  16. Detach the diagnostics.iso file.
  17. Check the boot devices and boot order for the ExadataLinux_1 device.
    1. Check the available boot devices.
      # efibootmgr
      BootCurrent: 000C
      Timeout: 1 seconds
      BootOrder: 000C,0001,0002,0003,0004,0005,0007,0008,0009,000A,000B
      Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit  Network Connection
      Boot0002* NET1:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0003* NET2:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0004* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0005* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0007* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0008* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0009* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000A* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000B* Oracle Linux
      Boot000C* USB:SUN

      If the Boot0000* ExadataLinux_1 device is not listed, then create the device.

      # efibootmgr -c -d /dev/sda -p 2 -l '\EFI\REDHAT\SHIM.EFI' -L 'ExadataLinux_1'
      BootCurrent: 000C
      Timeout: 1 seconds
      BootOrder: 0000,000C,0001,0002,0003,0004,0005,0007,0008,0009,000A,000B
      Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit  Network Connection
      Boot0002* NET1:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0003* NET2:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0004* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0005* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0007* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0008* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0009* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000A* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000B* Oracle Linux
      Boot000C* USB:SUN
      Boot0000* ExadataLinux_1
    2. Configure the Boot0000* ExadataLinux_1 device to be first in the boot order.
      # efibootmgr -o 0000
      BootCurrent: 000B
      Timeout: 1 seconds
      BootOrder: 0000
      Boot0000* ExadataLinux_1
      Boot0001* NET0:PXE IP4 Intel(R) I210 Gigabit  Network Connection
      Boot0002* NET1:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0003* NET2:PXE IP4 Oracle Dual Port 10GBase-T Ethernet Controller
      Boot0004* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0005* PCIE1:PXE IP4 Oracle Dual Port 25Gb Ethernet Adapter
      Boot0007* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0008* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot0009* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000A* PCIE3:PXE IP4 Oracle Quad Port 10GBase-T Adapter
      Boot000B* USB:SUN
      Boot000C* UEFI OS
  18. Restart the system and update the boot order in the BIOS.
    # reboot

    Modify the boot order to set the ExadataLinux_1 boot device as the first device.

    1. Press F2 when booting the system.
    2. Go to the Setup Utility.
    3. Select BOOT.
    4. Set ExadataLinux_1 for Boot Option #1.
    5. Exit the Setup Utility.
  19. Run reclaimdisks.sh on the restored database server.
    # /opt/oracle.SupportTools/reclaimdisks.sh -free -reclaim
  20. If the recovery was on Oracle Exadata Database Machine Eighth Rack, then perform the procedure described in Configuring Oracle Exadata Database Machine Eighth Rack Oracle Linux Database Server After Recovery.

2.11.5 Recovering Exadata X6 or Earlier Database Servers with Customized Partitions

This procedure describes how to recover Oracle Exadata Database Servers for Oracle Exadata Database Machine X6-2 or earlier running Oracle Linux from a snapshot-based backup when using customized partitions.

  1. Prepare an NFS server to host the backup archive file (mybackup.tar.bz2).

    The NFS server must be accessible by IP address.

    For example, on an NFS server with the IP address nfs_ip, where the directory /export is exported as an NFS mount, put the backup file (mybackup.tar.bz2) in the /export directory.

  2. Restart the recovery target system using the diagnostics.iso file.
    See Booting a Server using the Diagnostic ISO File in Oracle Exadata System Software User's Guide.
  3. Log in to the diagnostics shell as the root user.
    When prompted, enter the diagnostics shell.

    For example:

    Choose from following by typing letter in '()':
    (e)nter interactive diagnostics shell. Must use credentials 
    from Oracle support to login (reboot or power cycle to exit
    the shell),
    (r)estore system from NFS backup archive, 
    Type e to enter the diagnostics shell and log in as the root user.
    If prompted, log in to the system as the root user. If you are prompted for the root user password and do not have it, then contact Oracle Support Services.
  4. If required, use /opt/MegaRAID/storcli/storcli64 (or /opt/MegaRAID/MegaCli/MegaCli64 for releases earlier than Oracle Exadata System Software 19c) to configure the disk controller to set up the disks.
  5. If it is mounted, unmount /mnt/cell.
    # umount /mnt/cell
  6. Create the boot partition.
    1. Start an interactive session using the parted command.
      # parted /dev/sda
    2. Assign a disk label.
      • If you are running Oracle Exadata System Software release 11.2.3.3.0 or later:

        (parted) mklabel gpt
      • If you are running a release earlier than Oracle Exadata System Software release 11.2.3.3.0:

        (parted) mklabel msdos
    3. Set the unit size as sector.
      (parted) unit s
    4. Check the partition table by displaying the existing partitions.
      (parted) print
    5. Remove the partitions that will be re-created.
      (parted) rm part#
    6. Create a new first partition.
      (parted) mkpart primary 63 1048639
    7. Specify this is a bootable partition.
      (parted) set 1 boot on
  7. Create an additional primary (LVM) partition.
    • If using Oracle Exadata System Software release 18.1.0.0.0 or later — Create second primary (bios_grub) and third primary (LVM) partitions:
      1. Create a new second partition.

        (parted) mkpart primary 1048640 1050687
      2. Specify this is a GRUB BIOS partition.

        (parted) set 2 bios_grub on
      3. Create a new third partition.

        (parted) mkpart primary 1050688 1751949278
      4. Specify this is a physical volume.

        (parted) set 3 lvm on
      5. Write the information to disk, then quit.

        (parted) quit
    • If using a release earlier than Oracle Exadata System Software release 18.1.0.0.0:
      1. Create a new second partition.

        (parted) mkpart primary 1048640 -1
      2. Specify this is a physical volume.

        (parted) set 2 lvm on
      3. Write the information to disk, then quit.

        (parted) quit
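    As a sanity check on the release 18.1.0.0.0 layout above, the second (bios_grub) partition spans exactly 2048 sectors, which is 1 MiB at 512 bytes per sector:

    ```shell
    # Size of the bios_grub partition implied by the sector range above.
    bios_grub_sectors=$(( 1050687 - 1048640 + 1 ))
    echo "bios_grub: ${bios_grub_sectors} sectors = $(( bios_grub_sectors * 512 / 1024 )) KiB"
    ```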
  8. Re-create the customized LVM partitions and create the file systems.
    1. Create the physical volume, volume group, and the logical volumes as follows:
      # lvm pvcreate /dev/sda2
      # lvm vgcreate VGExaDb /dev/sda2
    2. Create the logical volume for the / (root) directory, create a file system on it, and label the file system.
      • Create the logical volume:

        # lvm lvcreate -n LVDbSys1 -L40G VGExaDb
      • If using Oracle Exadata System Software release 12.1.2.2.0 or later, then create the logical volume for the reserved partition.

        # lvm lvcreate -n LVDoNotRemoveOrUse -L1G VGExaDb

        Note:

        Do not create any file system on this logical volume.
      • Create the file system.

        • If you previously had an ext4 file system, use the mkfs.ext4 command:

          # mkfs.ext4 /dev/VGExaDb/LVDbSys1
        • If you previously had an ext3 file system, use the mkfs.ext3 command:

          # mkfs.ext3 /dev/VGExaDb/LVDbSys1
      • Label the file system.

        # e2label /dev/VGExaDb/LVDbSys1 DBSYS
    3. Create the system swap space.
      # lvm lvcreate -n LVDbSwap1 -L24G VGExaDb
      # mkswap -L SWAP /dev/VGExaDb/LVDbSwap1
    4. Create the logical volume for the /u01 directory, create a file system on it, and label the file system.
      • Create the logical volume:

        # lvm lvcreate -n LVDbOra1 -L100G VGExaDb
      • Create the file system.

        • If you previously had an ext4 file system, then use the mkfs.ext4 command:

          # mkfs.ext4 /dev/VGExaDb/LVDbOra1
        • If you previously had an ext3 file system, then use the mkfs.ext3 command:

          # mkfs.ext3 /dev/VGExaDb/LVDbOra1
      • Label the file system.

        # e2label /dev/VGExaDb/LVDbOra1 DBORA
    5. Create a file system on the /boot partition, and label it.
      • Create the file system.

        • If you previously had an ext4 file system, use the mkfs.ext4 command:

          # mkfs.ext4 /dev/sda1
        • If you previously had an ext3 file system, use the mkfs.ext3 command:

          # mkfs.ext3 /dev/sda1
      • Label the file system:

        # e2label /dev/sda1 BOOT

      Note:

      For customized file system layouts, additional logical volumes can be created at this time, and different volume sizes may be used.
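      For example, a layout that previously had a separate /var file system might re-create it with commands like the following. The volume name (LVDbVar1), size, and label here are hypothetical; match them to your original configuration:

      ```shell
      # Hypothetical additional custom volume; adjust the name, size, and
      # label to match the original system.
      lvm lvcreate -n LVDbVar1 -L10G VGExaDb
      mkfs.ext4 /dev/VGExaDb/LVDbVar1    # or mkfs.ext3, per the original layout
      e2label /dev/VGExaDb/LVDbVar1 VAR
      ```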
  9. Create mount points for all the partitions to mirror the original system, and mount the respective partitions.

    For example, assuming /mnt is used as the top-level directory for the recovery operation, the list of mounted partitions may look like the following:

    /dev/VGExaDb/LVDbSys1 on /mnt
    /dev/VGExaDb/LVDbOra1 on /mnt/u01
    /dev/sda1 on /mnt/boot

    Note:

    For customized file system layouts with additional logical volumes, additional mount points need to be created during this step.

    The following example for Oracle Exadata Database Machine X6-2 and earlier systems shows how to mount the root file system and create two additional mount points. In the commands below, filesystem_type specifies the applicable file system type: either ext3 or ext4.

    # mount /dev/VGExaDb/LVDbSys1 /mnt -t filesystem_type
    # mkdir /mnt/u01 /mnt/boot
    # mount /dev/VGExaDb/LVDbOra1 /mnt/u01 -t filesystem_type
    # mount /dev/sda1 /mnt/boot -t filesystem_type
  10. Bring up the network.
    • If the operating system is Oracle Linux 6 or later:
      # ip address add ip_address_for_eth0/netmask_for_eth0 dev eth0
      # ip link set up eth0
      # ip route add default via gateway_address dev eth0
    • If the operating system is Oracle Linux 5:
      # ifconfig eth0 ip_address_for_eth0 netmask netmask_for_eth0 up
  11. Mount the NFS server where you have the backup.

    The following example assumes that the backup is located in the /export directory of the NFS server with IP address nfs_ip.

    # mkdir -p /root/mnt
    # mount -t nfs -o ro,intr,soft,proto=tcp,nolock nfs_ip:/export /root/mnt
  12. Restore from backup.
    # tar -pjxvf /root/mnt/mybackup.tar.bz2 -C /mnt
  13. Unmount the restored file systems, and remount the /boot partition.
    # umount /mnt/u01
    # umount /mnt/boot
    # umount /mnt
    # mkdir /boot
    # mount /dev/sda1 /boot -t filesystem_type
  14. Set up the boot loader.

    In the following instructions, /dev/sda1 is the /boot area.

    • If using Oracle Exadata System Software release 18.1.0.0.0 or later:
      # grub2-install /dev/sda
      
      Installing for i386-pc platform.
      Installation finished. No error reported.
    • If using a release earlier than Oracle Exadata System Software release 18.1.0.0.0:
      # grub
      grub> find /I_am_hd_boot
      grub> root (hdX,0)
      grub> setup (hdX)
      grub> quit

      In the preceding commands, the find command identifies the hard disk that contains the file I_am_hd_boot; for example (hd0,0). Use the value that you observe to specify the hdX value in the GRUB root and setup commands.
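      For example, if the find command reports (hd0,0), the session would proceed as follows. This is an illustrative transcript; the disk designation on your system may differ:

      ```text
      grub> find /I_am_hd_boot
       (hd0,0)
      grub> root (hd0,0)
      grub> setup (hd0)
      grub> quit
      ```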

  15. Detach the diagnostics.iso file.
  16. Unmount the /boot partition.
    # umount /boot
  17. Restart the system.
    # shutdown -r now

    This completes the restoration procedure for the server.

  18. If the recovery was on Oracle Exadata Database Machine Eighth Rack, then perform the procedure described in Configuring Oracle Exadata Database Machine Eighth Rack Oracle Linux Database Server After Recovery.

2.11.6 Configuring Oracle Exadata Database Machine Eighth Rack Oracle Linux Database Server After Recovery

After the Oracle Linux database server in Oracle Exadata Database Machine Eighth Rack has been re-imaged, restored, or rescued, you can then reconfigure the eighth rack.

2.11.6.1 Configuring Eighth Rack On X3-2 or Later Machines Running Oracle Exadata Storage Server Release 12.1.2.3.0 or Later

The following procedure should be performed after the Oracle Linux database server in Oracle Exadata Database Machine Eighth Rack has been re-imaged, restored, or rescued.

For X3-2 systems, use this method only if you are running Oracle Exadata System Software release 12.1.2.3.0 or later.

  1. On the recovered server, check that the resourcecontrol utility exists in the /opt/oracle.SupportTools directory. If not, copy it from another database server to the recovered server.
  2. Ensure proper permissions are set on the resourcecontrol utility.
    # chmod 740 /opt/oracle.SupportTools/resourcecontrol
    
  3. Verify the current configuration.
    # dbmcli -e LIST DBSERVER ATTRIBUTES coreCount
    

    See Table 2-3 for the number of cores allowed for each machine configuration. If the correct value is shown, then no configuration changes are necessary. If that value is not shown, then continue to step 4 of this procedure.

  4. Change the enabled core configuration.
    # dbmcli -e ALTER DBSERVER pendingCoreCount=new_core_count FORCE
    

    new_core_count for an Eighth Rack is:

    • X8-2: 24
    • X7-2: 24
    • X6-2: 22
    • X5-2: 18
    • X4-8: 60
    • X4-2: 12
  5. Restart the server.
    # reboot
    
  6. Verify the changes to the configuration.
    # dbmcli -e LIST DBSERVER ATTRIBUTES coreCount
    
2.11.6.2 Configuring Eighth Rack On X3-2 Machines Running Oracle Exadata Storage Server Release 12.1.2.2.3 or Earlier

The following procedure should be performed after the Oracle Linux database server in Oracle Exadata Database Machine Eighth Rack has been re-imaged, restored, or rescued.

  1. Copy the /opt/oracle.SupportTools/resourcecontrol utility from another database server to the /opt/oracle.SupportTools directory on the recovered server.
  2. Ensure proper permissions are set on the utility.
    # chmod 740 /opt/oracle.SupportTools/resourcecontrol
    
  3. Verify the current configuration.

    The output from the command is shown in this example.

    # /opt/oracle.SupportTools/resourcecontrol -show
    
      Validated hardware and OS. Proceed.
      Number of cores active: 8
    

    For an eighth rack configuration, eight cores should be enabled. If that value is shown, then no configuration changes are necessary. If that value is not shown, then continue to step 4 of this procedure.

    Note:

    If there is an error similar to the following after running the utility, then restarting the server one or more times usually clears the error:

    Validated hardware and OS. Proceed.
    Cannot get ubisconfig export. Cannot Proceed. Exit.
  4. Change the configuration for enabled cores.
    # /opt/oracle.SupportTools/resourcecontrol -cores 8
    
  5. Restart the server.
    # shutdown -r now
    
  6. Verify the changes to the configuration.
    # /opt/oracle.SupportTools/resourcecontrol -show
    

    The following is an example of the expected output from the command for the database server:

    This is a Linux database server.
    Validated hardware and OS. Proceed.
    Number of cores active per socket: 4

2.12 Re-Imaging the Oracle Exadata Database Server

The re-image procedure is necessary when a database server needs to be brought to an initial state for various reasons.

Some example scenarios for re-imaging the database server are:

  • You installed a new server and need to use an earlier release than the image already installed on the server.
  • You need to replace a damaged database server with a new database server.
  • Your database server had multiple disk failures causing local disk storage failure and you do not have a database server backup.
  • You want to repurpose the server to a new rack.

During the re-imaging procedure, the other database servers on Oracle Exadata Database Machine are available. When the new server is added to the cluster, the software is copied from an existing database server to the new server. It is your responsibility to restore scripting, CRON jobs, maintenance actions, and non-Oracle software.

Note:

The procedures in this section assume the database is Oracle Database 11g Release 2 (11.2) or later.

Starting with Oracle Exadata System Software release 19.1.0, Secure Eraser is automatically started during re-imaging if the hardware supports Secure Eraser. This significantly simplifies the re-imaging procedure while maintaining performance. Now, when re-purposing a rack, you only have to image the rack and the secure data erasure is taken care of transparently as part of the process.

The following tasks describe how to re-image an Oracle Exadata Database Server running Oracle Linux:

2.12.1 Contact Oracle Support Services

If a failed server is being replaced, open a support request with Oracle Support Services.

The support engineer will identify the failed server, and send a replacement. The support engineer will ask for the output from the imagehistory command run from a surviving database server. The output provides a link to the computeImageMaker file that was used to image the original database server, and provides a means to restore the system to the same level.

2.12.2 Download Latest Release of Cluster Verification Utility

The latest release of the cluster verification utility (cluvfy) is available from My Oracle Support.

See My Oracle Support note 316817.1 for download instructions and other information.

2.12.3 Remove the Database Server from the Cluster

If you are reimaging a failed server or repurposing a server, follow the steps in this task to remove the server from the cluster before you reimage it. If you are reimaging the server for a different reason, skip this task and proceed to the next task.

The steps in this task are performed using a working database server in the cluster. In the following commands, working_server is a working database server, and failed_server is the database server you are removing, either because it failed or it is being repurposed.

  1. Log in as the oracle or grid user on a database server in the cluster.
    Log in as the user that owns the Oracle Grid Infrastructure software installation.
  2. Disable the listener that runs on the failed server.
    $ srvctl disable listener -n failed_server
    $ srvctl stop listener -n failed_server
    
  3. Delete the Oracle home from the Oracle inventory.

    In the following command, list_of_working_servers is a list of the servers that are still working in the cluster, such as dm01db02, dm01db03, and so on.

    In the following command, replace /u01/app/oracle/product/12.1.0.2/dbhome_1 with the location of your Oracle Database home directory.

    $ cd $ORACLE_HOME/oui/bin
    $ ./runInstaller -updateNodeList ORACLE_HOME= \
    /u01/app/oracle/product/12.1.0.2/dbhome_1 "CLUSTER_NODES=list_of_working_servers"
    
  4. Log in as the grid user on the database server.
    The grid user refers to the operating system user that owns the Oracle Grid Infrastructure software installation. The $ORACLE_HOME variable should point to the location of the Grid home.
  5. Verify the failed server is unpinned.
    $ olsnodes -s -t
    

    The following is an example of the output from the command:

    dm01adm05        Inactive        Unpinned
    dm01adm06        Active          Unpinned
    dm01adm07        Active          Unpinned
    dm01adm08        Active          Unpinned
    
  6. Log in as the root user on the database server.
  7. Stop and delete the VIP resources for the failed database server.
    # srvctl stop vip -i failed_server-vip
    PRCC-1016 : failed_server-vip.example.com was already stopped
    
    # srvctl remove vip -i failed_server-vip
    Please confirm that you intend to remove the VIPs failed_server-vip (y/[n]) y
    
  8. Delete the server from the cluster.
    # crsctl delete node -n failed_server
    CRS-4661: Node dm01db01 successfully deleted.
    

    If you receive an error message similar to the following, then relocate the voting disks.

    CRS-4662: Error while trying to delete node dm01db01.
    CRS-4000: Command Delete failed, or completed with errors.
    

    To relocate the voting disks use the following steps:

    1. Determine the current location of the voting disks.
      # crsctl query css votedisk
      

      The following is an example of the output from the command. The current location is DBFS_DG.

      ##  STATE    File Universal Id          File Name                Disk group
      --  -----    -----------------          ---------                ----------
      1. ONLINE   123456789abab (o/192.168.73.102/DATA_CD_00_dm01cel07) [DBFS_DG]
      2. ONLINE   123456789cdcd (o/192.168.73.103/DATA_CD_00_dm01cel08) [DBFS_DG]
      3. ONLINE   123456789efef (o/192.168.73.100/DATA_CD_00_dm01cel05) [DBFS_DG]
      Located 3 voting disk(s).
      
    2. Relocate the voting disks to another disk group.
      # ./crsctl replace votedisk +DATA
      
      Successful addition of voting disk 2345667aabbdd.
      ...
      CRS-4266: Voting file(s) successfully replaced
      
    3. Relocate the voting disks to the original location using a command similar to the following:
      # ./crsctl replace votedisk +DBFS_DG
      
    4. Delete the server from the cluster.
  9. Log in as the grid user on the database server.
    The grid user refers to the operating system user that owns the Oracle Grid Infrastructure software installation. The $ORACLE_HOME variable should point to the location of the Grid home.
  10. Update the Oracle inventory.

    In the following command, replace /u01/app/12.1.0.2/grid with the location of your Oracle Grid Infrastructure home directory.

    $ cd $ORACLE_HOME/oui/bin
    $ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/12.1.0.2/grid \
      "CLUSTER_NODES=list_of_working_servers" CRS=TRUE
    
  11. Verify the server was deleted successfully.
    $ cluvfy stage -post nodedel -n failed_server -verbose
    

    The following is an example of the output from the command:

    Performing post-checks for node removal
    Checking CRS integrity...
    The Oracle clusterware is healthy on node "dm01db02"
    CRS integrity check passed
    Result:
    Node removal check passed
    Post-check for node removal was successful.

2.12.4 Image the Database Server

After the database server has been installed or replaced, you can image the new database server.

You can use installation media on a USB thumb drive, or a touchless option using PXE or ISO attached to the ILOM. See Imaging a New System in Oracle Exadata Database Machine Installation and Configuration Guide for the details.

2.12.5 Configure the Re-imaged Database Server

The re-imaged database server does not have any host names, IP addresses, DNS or NTP settings. The steps in this task describe how to configure the re-imaged database server.

You need the following information prior to configuring the re-imaged database server:

  • Name servers
  • Time zone, such as Americas/Chicago
  • NTP servers
  • IP address information for the management network
  • IP address information for the client access network
  • IP address information for the RDMA Network Fabric
  • Canonical host name
  • Default gateway

The information should be the same on all database servers in Oracle Exadata Database Machine. The IP addresses can be obtained from DNS. In addition, a document with the information should have been provided when Oracle Exadata Database Machine was installed.

The following procedure describes how to configure the re-imaged database server:

  1. Power on the replacement database server. When the system boots, it automatically runs the Configure Oracle Exadata routine, and prompts for information.
  2. Enter the information when prompted, and confirm the settings. The startup process then continues.

Note:

  • If the database server does not use all network interfaces, then the configuration process stops, and warns that some network interfaces are disconnected. It prompts whether to retry the discovery process. Respond with yes or no, as appropriate for the environment.

  • If bonding is used for the client access network, then it is set in the default active-passive mode at this time.

2.12.6 Prepare the Re-imaged Database Server for the Cluster

This task describes how to re-apply the changes made during the initial installation to the re-imaged, bare metal database server.

Note:

For Oracle VM systems, follow the procedure in Expanding an Oracle RAC Cluster on Oracle VM Using OEDACLI.
  1. Copy or merge the contents of the following files using files on a working database server as reference:
    1. Copy the contents of the /etc/security/limits.conf file.
    2. Merge the contents of the /etc/hosts files.
    3. Copy the /etc/oracle/cell/network-config/cellinit.ora file.
    4. Update the /etc/oracle/cell/network-config/cellinit.ora file with the IP_ADDRESS of the ifcfg-bondib0 interface (in case of active/passive bonding) or ib0 and ib1 interfaces (in case of active/active bonding) of the replacement server.
    5. Copy the /etc/oracle/cell/network-config/cellip.ora file.
      The content of the cellip.ora file should be the same on all database servers.
    6. Configure additional network requirements, such as 10 GbE.
    7. Copy the modprobe configuration.

      The contents of the configuration file should be the same on all database servers.

      • Oracle Linux 5 or 6: The file is located at /etc/modprobe.conf.
      • Oracle Linux 7: The file is located at /etc/modprobe.d/exadata.conf.
    8. Copy the /etc/sysctl.conf file.
      The contents of the file should be the same on all database servers.
    9. Update the cellroute.ora.

      Make a copy of the /etc/oracle/cell/network-config/cellroute.ora file. Modify the contents on the replacement server to use the local InfiniBand interfaces on the new node.

    10. Restart the database server so the network changes take effect.
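The file copies in the preceding substeps can be sketched as follows. Here, working01 is a hypothetical host name for a working database server; remember that /etc/hosts must be merged rather than overwritten, and that cellinit.ora must afterward be updated with the replacement server's own IP address:

```shell
# Sketch: pull shared configuration files from a working database server
# ("working01" is a placeholder host name; run as root on the replacement
# server). BatchMode and ConnectTimeout stop the copy from hanging if the
# host is unreachable.
WORKING=working01
for f in /etc/security/limits.conf \
         /etc/oracle/cell/network-config/cellinit.ora \
         /etc/oracle/cell/network-config/cellip.ora \
         /etc/sysctl.conf; do
  scp -o BatchMode=yes -o ConnectTimeout=5 "root@${WORKING}:${f}" "$f" \
    || echo "copy of $f failed"
done
```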
  2. Set up the users for the software owners on the replacement database server by adding groups.
    If you are using role-separated management, then the users are usually oracle and grid. If you use a single software owner, then the user is usually oracle. The group information is available on a working database server.
    1. Obtain the current group information from a working database server.
      # id oracle
      uid=1000(oracle) gid=1001(oinstall) groups=1001(oinstall),1002(dba),1003(oper),1004(asmdba)
      
    2. Use the groupadd command to add the group information to the replacement database server.
      # groupadd -g 1001 oinstall
      # groupadd -g 1002 dba
      # groupadd -g 1003 oper
      # groupadd -g 1004 asmdba
      
    3. Obtain the current user information from a working database server.
      # id oracle
      uid=1000(oracle) gid=1001(oinstall) groups=1001(oinstall),1002(dba),1003(oper),1004(asmdba)
      
    4. Add the user information to the replacement database server.
      # useradd -u 1000 -g 1001 -G 1001,1002,1003,1004 -m -d /home/oracle -s \
        /bin/bash oracle
      
    5. Create the Oracle Base and Grid home directories, such as /u01/app/oracle and /u01/app/12.2.0.1/grid.
      # mkdir -p /u01/app/oracle
      # mkdir -p /u01/app/12.2.0.1/grid
      # chown -R oracle:oinstall /u01/app
      
    6. Change the ownership on the cellip.ora and cellinit.ora files.

      The ownership is usually oracle:oinstall.

      # chown -R oracle:oinstall /etc/oracle/cell/network-config
      
    7. Secure the restored database server.
      # chmod u+x /opt/oracle.SupportTools/harden_passwords_reset_root_ssh
      # /opt/oracle.SupportTools/harden_passwords_reset_root_ssh
      

      The database server restarts. Log in as the root user when prompted by the system. You are prompted for a new password. Set the password to match the root password of the other database servers.

    8. Set the password for the Oracle software owner.
      The owner is usually oracle.
      # passwd oracle
      
  3. Set up SSH for the oracle account.
    1. Log in to the oracle account on the replacement database server.
      # su - oracle
      
    2. Create the dcli group file on the replacement database server listing the servers in the Oracle cluster.
    3. Run the following command on the replacement database server.
      $ dcli -g dbs_group -l oracle -k
      
    4. Exit and log in again as the oracle user.
      $ exit
      # su - oracle
      
    5. Verify SSH equivalency.
      $ dcli -g dbs_group -l oracle date
      
  4. Set up or copy any custom login scripts from a working database server to the replacement database server.

    In the following command, replacement_server is the name of the new server, such as dm01db01.

    $ scp .bash* oracle@replacement_server:. 
    

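The dcli group file created during the SSH setup is a plain text file listing one database server host name per line. A minimal sketch, with illustrative host names:

```shell
# Sketch: create the dcli group file for a two-node cluster
# (dm01db01 and dm01db02 are illustrative host names)
cat > "${HOME:-/tmp}/dbs_group" <<'EOF'
dm01db01
dm01db02
EOF
# Then push SSH keys and verify equivalency, as shown in the steps above:
#   dcli -g dbs_group -l oracle -k
#   dcli -g dbs_group -l oracle date
```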
2.12.7 Apply Oracle Exadata System Software Patch Bundles to the Replacement Database Server

Oracle periodically releases Oracle Exadata System Software patch bundles for Oracle Exadata Database Machine.

If a patch bundle later than the release in the computeImageMaker file has been applied to the working database servers, then the patch bundle must also be applied to the replacement Oracle Exadata Database Server. Determine whether a patch bundle has been applied as follows:

  • Prior to Oracle Exadata System Software release 11.2.1.2.3, the database servers did not maintain version history information. To determine the release number, log in to Oracle Exadata Storage Server, and run the following command:

    imageinfo -ver
    

    If the command shows a release that is later than the release used by the computeImageMaker file, then an Oracle Exadata System Software patch bundle has been applied to Oracle Exadata Database Machine and must be applied to the replacement Oracle Exadata Database Server.

  • Starting with Oracle Exadata System Software release 11.2.1.2.3, the imagehistory command exists on the Oracle Exadata Database Server. Compare the information on the replacement Oracle Exadata Database Server to the information on a working Oracle Exadata Database Server. If the working database server has a later release, then apply the Oracle Exadata System Software patch bundle to the replacement Oracle Exadata Database Server.
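The comparison can be sketched with the following commands, run as root on both the replacement server and a working server. The command -v guard simply keeps the sketch harmless on hosts without the Exadata image tools:

```shell
# Sketch: display the Oracle Exadata System Software release and history
if command -v imageinfo >/dev/null 2>&1; then
  imageinfo -ver    # current release on this server
  imagehistory      # release history (release 11.2.1.2.3 and later)
else
  echo "Exadata image tools not found on this host"
fi
```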

2.12.8 Clone Oracle Grid Infrastructure to the Replacement Database Server

This procedure describes how to clone Oracle Grid Infrastructure to the replacement database server.

In the following commands, working_server is a working database server, and replacement_server is the replacement database server. The commands in this procedure are run from a working database server as the Grid home owner. When the root user is needed to run a command, it will be called out.

  1. Verify the hardware and operating system installation using the cluster verification utility (cluvfy).
    $ cluvfy stage -post hwos -n replacement_server,working_server -verbose
    

    The phrase Post-check for hardware and operating system setup was successful should appear at the end of the report. If the cluster verification utility fails to validate the storage on the replacement server, you can ignore those messages.

  2. Verify peer compatibility.
    $ cluvfy comp peer -refnode working_server -n replacement_server  \
      -orainv oinstall -osdba dba | grep -B 3 -A 2 mismatched
    

    The following is an example of the output:

    Compatibility check: Available memory [reference node: dm01db02]

    Node Name     Status                   Ref. node status         Comment
    ------------  -----------------------  -----------------------  ----------
    dm01db01      31.02GB (3.2527572E7KB)  29.26GB (3.0681252E7KB)  mismatched
    Available memory check failed

    Compatibility check: Free disk space for "/tmp" [reference node: dm01db02]

    Node Name     Status                   Ref. node status         Comment
    ------------  -----------------------  -----------------------  ----------
    dm01db01      55.52GB (5.8217472E7KB)  51.82GB (5.4340608E7KB)  mismatched
    Free disk space check failed
    

    If the only failed components are related to the physical memory, swap space and disk space, then it is safe to continue.

  3. Perform the requisite checks for adding the server.
    1. Ensure the GRID_HOME/network/admin/samples directory has permissions set to 750.
    2. Validate the addition of the database server.

      Run the following command as the oracle user. The command prompts for the password of the root user.

      $ cluvfy stage -pre nodeadd -n replacement_server -fixup -method root -verbose
      Enter "ROOT" password:

      If the only failed component is related to swap space, then it is safe to continue.

      If the command returns an error, then set the following environment variable and rerun the command:

      $ export IGNORE_PREADDNODE_CHECKS=Y
      
  4. Add the replacement database server to the cluster.

    If you are using Oracle Grid Infrastructure release 12.1 or higher, include the CLUSTER_NEW_NODE_ROLES attribute, as shown in the following example.

    $ cd GRID_HOME/addnode
    
    $ ./addnode.sh -silent "CLUSTER_NEW_NODES={replacement_server}" \
         "CLUSTER_NEW_VIRTUAL_HOSTNAMES={replacement_server-vip}" \
         "CLUSTER_NEW_NODE_ROLES={hub}"
    

    The second command causes Oracle Universal Installer to copy the Oracle Clusterware software to the replacement database server. A message similar to the following is displayed:

    WARNING: A new inventory has been created on one or more nodes in this session.
    However, it has not yet been registered as the central inventory of this
    system. To register the new inventory please run the script at
    '/u01/app/oraInventory/orainstRoot.sh' with root privileges on nodes
    'dm01db01'. If you do not register the inventory, you may not be able to update
    or patch the products you installed.
    
    The following configuration scripts need to be executed as the "root" user in
    each cluster node:
    
    /u01/app/oraInventory/orainstRoot.sh #On nodes dm01db01
    
    /u01/app/12.1.0.2/grid/root.sh #On nodes dm01db01
    
  5. Run the configuration scripts.
    As the root user, first disable HAIP, then run the orainstRoot.sh and root.sh scripts on the replacement database server using the commands shown in the following example.
    # export HAIP_UNSUPPORTED=true
    # /u01/app/oraInventory/orainstRoot.sh
    Creating the Oracle inventory pointer file (/etc/oraInst.loc)
    Changing permissions of /u01/app/oraInventory.
    Adding read,write permissions for group.
    Removing read,write,execute permissions for world.
    Changing groupname of /u01/app/oraInventory to oinstall.
    The execution of the script is complete.
     
    # GRID_HOME/root.sh
    

    Note:

    Check the log files in GRID_HOME/install/ for the output of the root.sh script.

    If you are running Oracle Grid Infrastructure release 11.2, then the output file created by the script reports that the listener resource on the replaced database server failed to start. This is the expected output.

    /u01/app/11.2.0/grid/bin/srvctl start listener -n dm01db01 \
    ...Failed
    /u01/app/11.2.0/grid/perl/bin/perl \
    -I/u01/app/11.2.0/grid/perl/lib \
    -I/u01/app/11.2.0/grid/crs/install \
    /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

    After the scripts are run, the following message is displayed:

    The Cluster Node Addition of /u01/app/12.1.0.2/grid was successful.
    Please check '/tmp/silentInstall.log' for more details.
    
  6. Check the cluster.
    $ GRID_HOME/bin/crsctl check cluster -all
    
    **************************************************************
    node1:
    CRS-4537: Cluster Ready Services is online
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online
    **************************************************************
    node2:
    CRS-4537: Cluster Ready Services is online
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online
    **************************************************************
    node3:
    CRS-4537: Cluster Ready Services is online
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online
  7. If you are running Oracle Grid Infrastructure release 11.2, then re-enable the listener resource.

    Run the following commands on the replacement database server.

    # GRID_HOME/bin/srvctl enable listener -l LISTENER \
      -n replacement_server
    
    # GRID_HOME/bin/srvctl start listener -l LISTENER \
      -n replacement_server
  8. Start the disk groups on the replacement server.
    1. Check disk group status.

      In the following example, notice that disk groups are offline on the replacement server.

      $ crsctl stat res -t
      --------------------------------------------------------------------------------
      Name           Target  State        Server                   State details       
      --------------------------------------------------------------------------------
      Local Resources
      --------------------------------------------------------------------------------
      ora.DATAC1.dg
                     ONLINE  ONLINE       node1              STABLE
                     OFFLINE OFFLINE      node2              STABLE
      ora.DBFS_DG.dg
                     ONLINE  ONLINE       node1              STABLE
                     ONLINE  ONLINE       node2              STABLE
      ora.LISTENER.lsnr
                     ONLINE  ONLINE       node1              STABLE
                     ONLINE  ONLINE       node2              STABLE
      ora.RECOC1.dg
                     ONLINE  ONLINE       node1              STABLE
                     OFFLINE OFFLINE      node2              STABLE
      
    2. For each offline disk group, run the START DISKGROUP command from either the original server or the replacement server.
      $ srvctl start diskgroup -diskgroup dgname
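Based on the status output above, starting every offline disk group can be sketched as follows. The disk group names are illustrative, and the guard keeps the sketch harmless where srvctl is not on the PATH:

```shell
# Sketch: start each disk group reported OFFLINE on the replacement server
# (run as the Grid home owner; DATAC1 and RECOC1 are illustrative names)
for dg in DATAC1 RECOC1; do
  if command -v srvctl >/dev/null 2>&1; then
    srvctl start diskgroup -diskgroup "$dg"
  else
    echo "would run: srvctl start diskgroup -diskgroup $dg"
  fi
done
```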

2.12.9 Clone Oracle Database Homes to the Replacement Database Server

The following procedure describes how to clone the Oracle Database homes to the replacement server.

Run the commands from a working database server as the oracle user. When the root user is needed to run a command, it will be called out.

  1. Add the Oracle Database ORACLE_HOME to the replacement database server using the following commands:
    $ cd /u01/app/oracle/product/12.1.0.2/dbhome_1/addnode
    
    $ ./addnode.sh -silent "CLUSTER_NEW_NODES={replacement_server}"
    

    The second command causes Oracle Universal Installer to copy the Oracle Database software to the replacement database server.

    WARNING: The following configuration scripts need to be executed as the "root"
    user in each cluster node.
    /u01/app/oracle/product/12.1.0.2/dbhome_1/root.sh #On nodes dm01db01
    To execute the configuration scripts:
    Open a terminal window.
    Log in as root.
    Run the scripts on each cluster node.
     

    After the scripts are finished, the following messages appear:

    The Cluster Node Addition of /u01/app/oracle/product/12.1.0.2/dbhome_1 was successful.
    Please check '/tmp/silentInstall.log' for more details.
    
  2. Run the following script on the replacement database server:
    # /u01/app/oracle/product/12.1.0.2/dbhome_1/root.sh
     

    Check the /u01/app/oracle/product/12.1.0.2/dbhome_1/install/root_replacement_server.com_date.log file for the output of the script.

  3. Run the Oracle Database Configuration Assistant (DBCA) in interactive mode to add database instances to the target nodes.
    1. Start up DBCA.

      $ cd /u01/app/oracle/product/12.1.0.2/dbhome_1/bin
      
      $ ./dbca
    2. On the Database Operation screen, select Instance Management. Click Next.

    3. On the Instance Operation screen, select Add an instance. Click Next.

    4. On the Database List screen, select the cluster database to which you want to add an instance.

    5. The List Instance screen displays the current instances. Click Next to add a new instance.

    6. The Add Instance screen displays the default name and the newly added node to the cluster. Accept the defaults and click Next.

    7. On the Summary screen, verify the plan and click Finish.

    8. On the Progress screen, watch for 100% completion.

    9. On the Finish screen, acknowledge the confirmation that the new instance was successfully added.

    Verify that the instance has been added:

    $ srvctl config database -db dbm01
    

    Verify the administrative privileges on the target node:

    $ cd /u01/app/oracle/product/12.1.0.2/dbhome_1/bin
    
    $ ./cluvfy comp admprv -o db_config -d /u01/app/oracle/product/12.1.0.2/dbhome_1 -n new_node
  4. Ensure the instance parameters are set for the replaced database instance. The following is an example for the CLUSTER_INTERCONNECTS parameter.
    SQL> SHOW PARAMETER cluster_interconnects
    
    NAME                                 TYPE        VALUE
    ------------------------------       --------    -------------------------
    cluster_interconnects                string
     
    SQL> ALTER SYSTEM SET cluster_interconnects='192.168.73.90' SCOPE=spfile SID='dbm1';
    
  5. Validate the configuration files as follows:
    • The Oracle_home/dbs/initSID.ora file points to the SPFILE in the Oracle ASM shared storage.

    • The password file that is copied in the Oracle_home/dbs directory has been changed to orapwSID.

  6. Check for any services that incorporated this instance before, and ensure that those services are updated to include this replacement instance.
  7. If this procedure was performed on Oracle Exadata Database Machine Eighth Rack, then perform the procedure described in Configuring Oracle Exadata Database Machine Eighth Rack Oracle Linux Database Server After Recovery.
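Checking and updating services (step 6 above) can be sketched with srvctl. The database, service, and instance names below are illustrative, and the -modifyconfig and -preferred options follow the Oracle Database 12c srvctl syntax:

```shell
# Sketch: ensure services list the replacement instance among their
# preferred instances (dbm01, myservice, dbm1, dbm2 are illustrative)
if command -v srvctl >/dev/null 2>&1; then
  srvctl status service -db dbm01
  srvctl modify service -db dbm01 -service myservice \
    -modifyconfig -preferred "dbm1,dbm2"
else
  echo "srvctl not found on this host"
fi
```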

2.13 Changing Existing Elastic Configurations for Database Servers

Elastic configurations provide a flexible and efficient mechanism to change the server configuration of your Oracle Exadata Database Machine.

2.13.1 Adding a New Database Server to the Cluster

You can add a new database server to an existing Oracle Real Application Clusters (Oracle RAC) cluster running on Oracle Exadata Database Machine.

  1. Determine if the new database server needs to be re-imaged or upgraded.

    Check the image label of the database servers in the cluster to which you want to add the new database server.

  2. Add the database server to the cluster by completing the following tasks:
  3. Download and run the latest version of Oracle EXAchk to ensure that the resulting configuration implements the latest best practices for Oracle Exadata Database Machine.

2.13.2 Moving an Existing Database Server to a Different Cluster

You can repurpose an existing database server and move it to a different cluster within the same Oracle Exadata Rack.

  1. Remove the database server from the existing Oracle Real Application Clusters (Oracle RAC) cluster.
    1. Stop Oracle Grid Infrastructure on the database server.
      Grid_home/bin/crsctl stop crs
      
    2. Remove the database server from the cluster by completing the steps in Remove the Database Server from the Cluster.
  2. Determine if the database server that is being repurposed needs to be reimaged.

    Check the image label of the existing database servers in the cluster to which you want to add the database server. If the image label of the database server being added does not match the image label of the existing database servers in the cluster, then reimage the database server being added. Complete the following tasks:

    If an upgrade is required, the upgrade may be performed using patchmgr. See Updating Exadata Software for details.

  3. Add the database server to the cluster.
  4. Download and run the latest version of Oracle EXAchk to ensure that the resulting configuration implements the latest best practices for Oracle Exadata Database Machine.

2.13.3 Dropping a Database Server from an Oracle RAC Cluster

You can remove a database server that is a member of an Oracle Real Application Clusters (Oracle RAC) cluster.

  1. Stop Oracle Grid Infrastructure on the database server to be removed.
    $ Grid_home/bin/crsctl stop crs
    
  2. Remove the database server from the cluster by completing the steps in Remove the Database Server from the Cluster.
  3. Download and run the latest Oracle EXAchk to ensure that the resulting configuration implements the latest best practices for Oracle Exadata Database Machine.

2.14 Managing Quorum Disks for High Redundancy Disk Groups

Quorum disks allow for the Oracle RAC voting files to be stored in a high redundancy disk group on an Oracle Exadata Rack with fewer than five storage servers due to the presence of two extra failure groups.

2.14.1 Using Quorum Disks to Increase Fault Tolerance

Quorum disks are used to meet the minimum requirement of five failure groups for a high redundancy disk group on a system that does not have five storage servers.

A failure group is a subset of the disks in a disk group, which could fail at the same time because they share hardware. Oracle recommends a minimum of three failure groups for normal redundancy disk groups and five failure groups for high redundancy disk groups to maintain the necessary number of copies of the Partner Status Table (PST) and to ensure robustness with respect to storage hardware failures. On Engineered Systems, these recommendations are enforced to ensure the highest availability of the system.

The PST contains status information about the Oracle Automatic Storage Management (Oracle ASM) disks in a disk group, such as the disk number, status (either online or offline), partner disk number, failure group info, and heartbeat info. To tolerate a single hardware failure, you need 3 total copies of the PST available to form a 2 of 3 majority. If there are two hardware failures, then you need a total of 5 copies of the PST so that after a double failure you still have a 3 of 5 majority.

A quorum failure group is a special type of failure group that does not contain user data. Quorum failure groups are used for storing the PST. A quorum failure group can also be used to store a copy of the voting file for Oracle Clusterware. Because disks in quorum failure groups do not contain user data, a quorum failure group is not considered when determining redundancy requirements with respect to storing user data.

In the event of a system failure, three failure groups in a normal redundancy disk group allow a comparison among three PSTs to accurately determine the most up to date and correct version of the PST, which could not be done with a comparison between only two PSTs. Similarly with a high redundancy disk group, if two failure groups are offline, then Oracle ASM would be able to make a comparison among the three remaining PSTs.

You can create quorum failure groups with Oracle Exadata Deployment Assistant (OEDA) when deploying Exadata or you can add them later using the Quorum Disk Manager Utility. The iSCSI quorum disks are created on database nodes and a voting file is created on those quorum disks. These additional quorum failure groups are used to meet the minimum requirement of five voting files and PSTs for a high redundancy disk group. Quorum disks are required when the following conditions exist:

  • The Oracle Exadata Rack has fewer than five storage servers.
  • The Oracle Exadata Rack has at least two database nodes.
  • The Oracle Exadata Rack has at least one high redundancy disk group.

Quorum failure groups allow for a high redundancy disk group to exist on Oracle Exadata Racks with fewer than five storage servers by creating two extra failure groups. Without this feature, a disk group is vulnerable to a double partner storage server failure that results in the loss of the PST or voting file quorum, which can cause a complete cluster and database outage. Refer to My Oracle Support note 1339373.1 for how to restart the clusterware and databases in this scenario.

The iSCSI quorum disk implementation has high availability because the IP addresses on the RDMA Network Fabric are highly available using RDS.

Each iSCSI device shown in the figure below corresponds to a particular path to the iSCSI target. Each path corresponds to an RDMA Network Fabric port on the database node. For each multipath quorum disk device in an active-active system, there are two iSCSI devices, one for ib0 or re0 and the other for ib1 or re1.

Figure 2-1 Multipath Device Connects to Both iSCSI Devices in an Active-Active System

Description of Figure 2-1 follows
Description of "Figure 2-1 Multipath Device Connects to Both iSCSI Devices in an Active-Active System"

Quorum disks can be used with bare metal Oracle Real Application Clusters (Oracle RAC) clusters and Oracle VM Oracle RAC clusters. For Oracle VM Oracle RAC clusters, the quorum disk devices reside in the Oracle RAC cluster nodes which are Oracle VM user domains as shown in the following figure.

Figure 2-2 Quorum Disk Devices on Oracle VM Oracle RAC Cluster

Description of Figure 2-2 follows
Description of "Figure 2-2 Quorum Disk Devices on Oracle VM Oracle RAC Cluster"

Note:

For pkey-enabled environments, the interfaces used for discovering the targets should be the pkey interfaces used for the Oracle Clusterware communication. These interfaces are listed using the following command:

Grid_home/bin/oifcfg getif | grep cluster_interconnect | awk '{print $1}'

2.14.2 Overview of Quorum Disk Manager

The Quorum Disk Manager utility, introduced in Oracle Exadata System Software release 12.1.2.3.0, helps you to manage the quorum disks.

This utility enables you to create an iSCSI quorum disk on two of the database nodes and store a voting file on those two quorum disks. These two additional voting files are used to meet the minimum requirement of five voting files for a high redundancy disk group.

The Quorum Disk Manager utility (quorumdiskmgr) is used to create and manage all the necessary components including the iSCSI configuration, the iSCSI targets, the iSCSI LUNs, and the iSCSI devices for implementing quorum disks.
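Putting the pieces together, an end-to-end sketch of the workflow on a system with InfiniBand Network Fabric follows. The disk group, interface names, and IP addresses are illustrative; the run wrapper only skips the commands on hosts where quorumdiskmgr is not installed:

```shell
# Sketch: full quorumdiskmgr workflow (run as root on each database server;
# all parameter values are illustrative)
run() { command -v quorumdiskmgr >/dev/null 2>&1 && "$@" || echo "skipped: $*"; }

run quorumdiskmgr --create --config --owner=grid --group=dba \
    --network-iface-list="ib0, ib1"
run quorumdiskmgr --create --target --asm-disk-group=datac1 \
    --visible-to="192.168.10.45, 192.168.10.46"
run quorumdiskmgr --create --device \
    --target-ip-list="192.168.10.45, 192.168.10.46"
run quorumdiskmgr --list --config
```

Each of these command combinations is described in detail in the reference sections that follow.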

Related Topics

2.14.3 Software Requirements for Quorum Disk Manager

You must satisfy the minimum software requirements to use the Quorum Disk Manager utility.

To use this feature, the following releases are required:

  • Oracle Exadata System Software release 12.1.2.3.0 and above

  • Patch 23200778 for all Oracle Database homes

  • Oracle Grid Infrastructure release 12.1.0.2.160119 with patches 22722476 and 22682752, or Oracle Grid Infrastructure release 12.1.0.2.160419 and above

    For new deployments, Oracle Exadata Deployment Assistant (OEDA) installs the patches automatically.

2.14.4 quorumdiskmgr Reference

The quorum disk manager utility (quorumdiskmgr) runs on each database server to enable you to create and manage iSCSI quorum disks on database servers. You use quorumdiskmgr to create, list, alter, and delete iSCSI quorum disks on database servers. The utility is installed on database servers when they are shipped.

2.14.4.1 Syntax for the Quorum Disk Manager Utility

The quorum disk manager utility is a command-line tool. It has the following syntax:

quorumdiskmgr --verb --object [--options] 

verb is an action performed on an object. It is one of: alter, create, delete, list.

object is an object on which the command performs an action.

options extend the use of a command combination to include additional parameters for the command.

When using the quorumdiskmgr utility, the following rules apply:

  • Verbs, objects, and options are case-sensitive except where explicitly stated.

  • Use the double quote character around the value of an option that includes spaces or punctuation.

2.14.4.2 quorumdiskmgr Objects
Object Description

config

The quorum disk configuration includes the owner and group of the Oracle ASM instance to which the iSCSI quorum disks will be added, and the list of network interfaces through which local and remote iSCSI quorum disks will be discovered.

target

A target is an endpoint on each database server that waits for an iSCSI initiator to establish a session and provides the required I/O data transfers.

device

A device is an iSCSI device created by logging into a local or remote target.

2.14.4.3 Creating a Quorum Disk Configuration (--create --config)

The --create --config action creates a quorum disk configuration.

The configuration must be created before any targets or devices can be created.

Syntax

quorumdiskmgr --create --config [--owner owner --group group] 
  --network-iface-list network-iface-list 

Parameters

The following table lists the parameters for the --create --config action:

Parameter Description

--owner

Specifies the owner of the Oracle ASM instance to which the iSCSI quorum disks will be added. This is an optional parameter. The default value is grid.

--group

Specifies the group of the Oracle ASM instance to which the iSCSI quorum disks will be added. This is an optional parameter. The default value is dba.

--network-iface-list

Specifies the list of RDMA Network Fabric interface names through which the local and remote targets will be discovered.

Example 2-7 Create a Quorum Disk Configuration for a System with InfiniBand Network Fabric

quorumdiskmgr --create --config --owner=oracle --group=dba --network-iface-list="ib0, ib1"

Example 2-8 Create a Quorum Disk Configuration for a System with RoCE Network Fabric

quorumdiskmgr --create --config --owner=oracle --group=dba --network-iface-list="re0, re1"
2.14.4.4 Creating a Target (--create --target)

The --create --target action creates a target that will be used to create the devices to add to the specified Oracle ASM disk group.

The --create --target action creates a target that can be accessed by database servers with an RDMA Network Fabric IP address in the specified IP address list.

After a target is created, the asm-disk-group, host-name, and size attributes cannot be changed.

Syntax

quorumdiskmgr --create --target --asm-disk-group asm_disk_group --visible-to ip_list
   [--host-name host_name] [--size size]

Parameters

Parameter Description

--asm-disk-group

Specifies the Oracle ASM disk group to which the device created from the target will be added. The value of asm-disk-group is not case-sensitive.

--visible-to

Specifies a list of RDMA Network Fabric IP addresses. Database servers with an RDMA Network Fabric IP address in the list will have access to the target.

--host-name

Specifies the host name of the database server on which quorumdiskmgr runs. The total length of the values for asm-disk-group and host-name cannot exceed 26 characters. If the host name is too long, a shorter host name can be specified as long as a different host name is specified for each database server in the rack.

This is an optional parameter. The default value is the host name of the database server on which quorumdiskmgr runs. The value of host-name is not case-sensitive.

--size

Specifies the size of the target. This is an optional parameter. The default value is 128 MB.

Example 2-9 Creating a Target For Oracle ASM Disk Group Devices

This example shows how to create a target for devices added to the DATAC1 disk group. That target is only visible to database servers that have an RDMA Network Fabric IP address of 192.168.10.45 or 192.168.10.46.

quorumdiskmgr --create --target --asm-disk-group=datac1 --visible-to="192.168.10.45, 192.168.10.46"
 --host-name=db01
2.14.4.5 Creating a Device (--create --device)

The --create --device action creates devices by discovering and logging into targets on database servers with an RDMA Network Fabric IP address in the specified list of IP addresses.

The created devices will be automatically discovered by the Oracle ASM instance with the owner and group specified during configuration creation.

Syntax

quorumdiskmgr --create --device --target-ip-list target_ip_list

Parameters

  • --target-ip-list: Specifies a list of RDMA Network Fabric IP addresses.

    quorumdiskmgr discovers targets on database servers that have an IP address in the list, then logs in to those targets to create devices.

Example

Example 2-10 Creating Devices From a Target For an Oracle ASM Disk Group

This example shows how to create devices using targets on database servers that have an IP address of 192.168.10.45 or 192.168.10.46.

quorumdiskmgr --create --device --target-ip-list="192.168.10.45, 192.168.10.46"
2.14.4.6 Listing Quorum Disk Configurations (--list --config)

The --list --config action lists the quorum disk configurations.

Syntax

quorumdiskmgr --list --config

Sample Output

Example 2-11 Listing the quorum disk configuration on rack with InfiniBand Network Fabric

$ quorumdiskmgr --list --config
Owner: grid
Group: dba
ifaces: exadata_ib1 exadata_ib0

Example 2-12 Listing the quorum disk configuration on rack with RoCE Network Fabric

$ quorumdiskmgr --list --config
Owner: grid
Group: dba
ifaces: exadata_re1 exadata_re0
2.14.4.7 Listing Targets (--list --target)

The --list --target action lists the attributes of targets.

The target attributes listed include target name, size, host name, Oracle ASM disk group name, the list of IP addresses (a visible-to IP address list) indicating which database servers have access to the target, and the list of IP addresses (a discovered-by IP address list) indicating which database servers have logged into the target.

If an Oracle ASM disk group name is specified, the action lists all local targets created for the specified Oracle ASM disk group. Otherwise, the action lists all local targets created for quorum disks.

Syntax

quorumdiskmgr --list --target [--asm-disk-group asm_disk_group]

Parameters

  • --asm-disk-group: Specifies the Oracle ASM disk group. quorumdiskmgr displays all local targets for this Oracle ASM disk group. The value of asm-disk-group is not case-sensitive.

Example 2-13 Listing the Target Attributes for a Specific Oracle ASM Disk Group

This example shows how to list the attributes of the target for the DATAC1 disk group.

quorumdiskmgr --list --target --asm-disk-group=datac1 
Name: iqn.2015-05.com.oracle:qd--datac1_db01 
Size: 128 MB 
Host name: DB01 
ASM disk group name: DATAC1 
Visible to: iqn.1988-12.com.oracle:192.168.10.23, iqn.1988-12.com.oracle:192.168.10.24,
 iqn.1988-12.com.oracle:1b48248af770, iqn.1988-12.com.oracle:7a4a399566
Discovered by: 192.168.10.47, 192.168.10.46

Note:

For systems installed using a release prior to Oracle Exadata System Software 19.1.0, the Name might appear as iqn.2015-05.com.oracle:QD_DATAC1_DB01. Also, the Visible to field displays IP addresses instead of names.
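When you later need the current access list for a target (for example, before altering it), the addresses can be pulled out of saved listing output. A minimal sketch, assuming the output of quorumdiskmgr --list --target has been saved to a file; the sample lines and the /tmp path are illustrative, not something quorumdiskmgr itself writes:

```shell
# Extract the "Visible to" entries from saved `quorumdiskmgr --list --target`
# output. Sample lines adapted from Example 2-13; /tmp/qd_target.txt is an
# illustrative path used only for this sketch.
cat > /tmp/qd_target.txt <<'EOF'
Name: iqn.2015-05.com.oracle:qd--datac1_db01
Size: 128 MB
Host name: DB01
ASM disk group name: DATAC1
Visible to: 192.168.10.23, 192.168.10.24
Discovered by: 192.168.10.47, 192.168.10.46
EOF
# Print each address in the "Visible to" list on its own line.
grep '^Visible to:' /tmp/qd_target.txt | sed 's/^Visible to: //' | tr -d ' ' | tr ',' '\n'
```

The same pattern works for the Discovered by line by changing the grep prefix.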
2.14.4.8 Listing Devices (--list --device)

The --list --device action lists the attributes of devices, including device path, size, host name and ASM disk group name.

  • If only the Oracle ASM disk group name is specified, then the output includes all the devices that have been added to the Oracle ASM disk group.

  • If only the host name is specified, then the output includes all the devices created from the targets on the host.

  • If both an Oracle ASM disk group name and a host name are specified, then the output includes a single device created from the target on the host that has been added to the Oracle ASM disk group.

  • If neither an Oracle ASM disk group name nor a host name is specified, then the output includes all quorum disk devices.

Syntax

quorumdiskmgr --list --device [--asm-disk-group asm_disk_group] [--host-name host_name]

Parameters

Parameter Description

--asm-disk-group

Specifies the Oracle ASM disk group to which devices have been added. The value of asm-disk-group is not case-sensitive.

--host-name

Specifies the host name of the database server from whose targets devices are created. The value of host-name is not case-sensitive.

Example 2-14 Listing Device Attributes for an Oracle ASM Disk Group

This example shows how to list the attributes for devices used by the DATAC1 disk group.

$ quorumdiskmgr --list --device --asm-disk-group datac1
Device path: /dev/exadata_quorum/QD_DATAC1_DB01 
Size: 128 MB 
Host name: DB01 
ASM disk group name: DATAC1

Device path: /dev/exadata_quorum/QD_DATAC1_DB02 
Size: 128 MB 
Host name: DB02
ASM disk group name: DATAC1
2.14.4.9 Deleting Configurations (--delete --config)

The --delete --config action deletes quorum disk configurations.

The configurations can only be deleted when there are no targets or devices present.

Syntax

quorumdiskmgr --delete --config
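Because the configuration cannot be deleted while targets or devices are still present, a full teardown runs the delete actions in order. A sketch of that ordering, guarded so it is a no-op on machines where quorumdiskmgr is not installed:

```shell
# Ordered teardown: devices first, then targets, then the configuration.
# Guarded so this sketch does nothing on a machine without quorumdiskmgr.
QDM=/opt/oracle.SupportTools/quorumdiskmgr
if [ -x "$QDM" ]; then
    "$QDM" --delete --device    # 1. remove quorum disk devices
    "$QDM" --delete --target    # 2. remove the local targets
    "$QDM" --delete --config    # 3. configuration can now be deleted
    echo "quorum disk teardown complete"
else
    echo "quorumdiskmgr not installed; nothing to do"
fi
```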
2.14.4.10 Deleting Targets (--delete --target)

The --delete --target action deletes the targets created for quorum disks on database servers.

If an Oracle ASM disk group name is specified, then this command deletes all the local targets created for the specified Oracle ASM disk group. Otherwise, this command deletes all local targets created for quorum disks.

Syntax

quorumdiskmgr --delete --target [--asm-disk-group asm_disk_group]

Parameters

  • --asm-disk-group: Specifies the Oracle ASM disk group. Local targets created for this disk group will be deleted.

    The value of asm-disk-group is not case-sensitive.

Example 2-15 Deleting Targets Created for an Oracle ASM Disk Group

This example shows how to delete targets created for the DATAC1 disk group.

quorumdiskmgr --delete --target --asm-disk-group=datac1
2.14.4.11 Deleting Devices (--delete --device)

The --delete --device command deletes quorum disk devices.

  • If only an Oracle ASM disk group name is specified, then the command deletes all the devices that have been added to the Oracle ASM disk group.

  • If only a host name is specified, then the command deletes all the devices created from the targets on the host.

  • If both an Oracle ASM disk group name and a host name are specified, then the command deletes a single device created from the target on the host and that has been added to the Oracle ASM disk group.

  • If neither an Oracle ASM disk group name nor a host name is specified, then the command deletes all quorum disk devices.

Syntax

quorumdiskmgr --delete --device [--asm-disk-group asm_disk_group] [--host-name host_name]

Parameters

Parameter Description

--asm-disk-group

Specifies the Oracle ASM disk group whose device you want to delete. The value of asm-disk-group is not case-sensitive.

--host-name

Specifies the host name of the database server. Devices created from targets on this host will be deleted. The value of host-name is not case-sensitive.

Example 2-16 Deleting Quorum Disk Devices Created from Targets on a Specific Host

This example shows how to delete all the quorum disk devices that were created from the targets on the host DB01.

quorumdiskmgr --delete --device --host-name=db01
2.14.4.12 Changing Owner and Group Values (--alter --config)

The --alter --config action changes the owner and group configurations.

Syntax

quorumdiskmgr --alter --config --owner owner --group group

Parameters

Parameter Description

--owner

Specifies the new owner for the quorum disk configuration. This parameter is optional. If not specified, the owner is unchanged.

--group

Specifies the new group for the quorum disk configuration. This parameter is optional. If not specified, the group is unchanged.

Example 2-17 Changing the Owner and Group Configuration for Quorum Disk Devices

This example shows how to change the assigned owner and group for quorum disk devices.

quorumdiskmgr --alter --config --owner=grid --group=dba
2.14.4.13 Changing the RDMA Network Fabric IP Addresses (--alter --target)

The --alter --target command changes the RDMA Network Fabric IP addresses of the database servers that have access to the local target created for the specified Oracle ASM disk group.

Syntax

quorumdiskmgr --alter --target --asm-disk-group asm_disk_group --visible-to ip_list

Parameters

Parameter Description

--asm-disk-group

Specifies the Oracle ASM disk group to which the device created from the target will be added. The value of asm-disk-group is not case-sensitive.

--visible-to

Specifies a list of RDMA Network Fabric IP addresses. Database servers with an RDMA Network Fabric IP address in the list will have access to the target.

Example 2-18 Changing the RDMA Network Fabric IP Addresses for Accessing Targets

This example shows how to change the RDMA Network Fabric IP address list that determines which database servers have access to the local target created for the DATAC1 disk group.

quorumdiskmgr --alter --target --asm-disk-group=datac1 --visible-to="192.168.10.45, 192.168.10.47"

2.14.5 Add Quorum Disks to Database Nodes

You can add quorum disks to database nodes on an Oracle Exadata Rack with fewer than five storage servers that contains a high redundancy disk group.

Oracle strongly recommends having quorum disks in all high redundancy disk groups that have fewer than five failure groups. Five failure groups are needed to mirror Oracle ASM metadata in any high redundancy disk group, not just in the disk group housing the voting files.

The example in this section creates quorum disks for an Oracle Exadata Rack that has two database servers: db01 and db02.

Typically, there are two RDMA Network Fabric ports on each database server:

  • For systems with InfiniBand Network Fabric the ports are: ib0 and ib1
  • For systems with RoCE Network Fabric the ports are: re0 and re1

On each cluster node, the network interfaces to be used for communication with the iSCSI devices can be found using the following command:

$ oifcfg getif | grep cluster_interconnect | awk '{print $1}'

The IP address of each interface can be found using the following command:

# ip addr show interface_name
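The address itself can be pulled out of the ip addr output with a short awk filter. A minimal sketch over saved sample output; the interface name re0, the sample text, and the /tmp path are illustrative:

```shell
# Parse the IPv4 address out of saved `ip addr show` output.
# The sample lines and the /tmp path below are illustrative only.
cat > /tmp/ipaddr_re0.txt <<'EOF'
5: re0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2300 state UP
    inet 192.168.10.45/22 brd 192.168.11.255 scope global re0
EOF
# Take the address field from the "inet " line and strip the prefix length.
awk '/inet /{split($2, a, "/"); print a[1]}' /tmp/ipaddr_re0.txt
```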

The RDMA Network Fabric IP addresses for this example are as follows:

On db01:

  • Network interface: ib0 or re0, IP address: 192.168.10.45
  • Network interface: ib1 or re1, IP address: 192.168.10.46

On db02:

  • Network interface: ib0 or re0, IP address: 192.168.10.47
  • Network interface: ib1 or re1, IP address: 192.168.10.48

The Oracle ASM disk group to which the quorum disks will be added is DATAC1. The Oracle ASM owner is grid, and the user group is dba.

In this example, the voting files are moved from the normal redundancy disk group RECOC1 to DATAC1 after DATAC1 has been augmented with quorum disks to yield five failure groups. If you are only adding quorum disks to a high redundancy disk group and your voting files already reside in another high redundancy disk group, there is no need to relocate the voting files.

Initially, the voting files reside on a normal redundancy disk group RECOC1:

$ Grid_home/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   21f5507a28934f77bf3b7ecf88b26c47 (o/192.168.76.187;192.168.76.188/RECOC1_CD_00_celadm12) [RECOC1]
 2. ONLINE   387f71ee81f14f38bfbdf0693451e328 (o/192.168.76.189;192.168.76.190/RECOC1_CD_00_celadm13) [RECOC1]
 3. ONLINE   6f7fab62e6054fb8bf167108cdbd2f64 (o/192.168.76.191;192.168.76.192/RECOC1_CD_00_celadm14) [RECOC1]
Located 3 voting disk(s).
  1. Log in to the database servers (for example, db01 and db02) as the root user.
  2. Check if quorum disks are already configured on your system. If so, you can skip the next step.

    Run the following quorumdiskmgr command on the database servers.

    # /opt/oracle.SupportTools/quorumdiskmgr --list --config
    
    If quorum disks are already configured on your system, your output should resemble one of the following:
    • For Oracle Exadata System Software release 18.x or earlier, the output should look like this:

      Owner: grid
      Group: dba
      ifaces: exadata_ib1 exadata_ib0
      
    • If you have upgraded to Oracle Exadata System Software release 19.1.0 or later from an earlier release, then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_ib0 
      Initiator name: iqn.1988-12.com.oracle:7a4a399566
    • If you have a system that was imaged with Oracle Exadata System Software release 19.1.0 or later (not upgraded), then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_ib0 
      Initiator name: iqn.1988-12.com.oracle:192.168.18.205
    • If your rack uses RoCE Network Fabric, then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_re0 
      Initiator name: iqn.1988-12.com.oracle:192.168.18.205
  3. If the previous step didn't show an existing quorum disk configuration, run the quorumdiskmgr command with the --create --config options to create quorum disk configurations on the database servers.
    • For systems with InfiniBand Network Fabric:

      # /opt/oracle.SupportTools/quorumdiskmgr --create --config --owner=grid --group=dba --network-iface-list="ib0, ib1"
    • For systems with RoCE Network Fabric:

      # /opt/oracle.SupportTools/quorumdiskmgr --create --config --owner=grid --group=dba --network-iface-list="re0, re1"
  4. Run the quorumdiskmgr command with the --list --config options to verify that the configurations have been successfully created on the database servers.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --config
    Refer to step 2 for examples of the expected output.
  5. Run the quorumdiskmgr command with the --create --target options to create a target on the database servers for Oracle ASM disk group DATAC1, and make the target visible to the database servers.

    In this example scenario, the following command would be run on the database servers db01 and db02:

    # /opt/oracle.SupportTools/quorumdiskmgr --create --target --asm-disk-group=datac1 
    --visible-to="192.168.10.45, 192.168.10.46, 192.168.10.47, 192.168.10.48"
  6. Run the quorumdiskmgr command with the --list --target options to verify the target has been successfully created on the database servers.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --target
    • If you are running Oracle Exadata System Software release 18.x or earlier, then the output should look like this for each node:

      Name: iqn.2015-05.com.oracle:QD_DATAC1_DB01 
      Size: 128 MB 
      Host name: DB01
      ASM disk group name: DATAC1 
      Visible to: 192.168.10.45, 192.168.10.46, 192.168.10.47, 192.168.10.48
      Discovered by:
      
      Name: iqn.2015-05.com.oracle:QD_DATAC1_DB02 
      Size: 128 MB 
      Host name: DB02
      ASM disk group name: DATAC1 
      Visible to: 192.168.10.45, 192.168.10.46, 192.168.10.47, 192.168.10.48
      Discovered by:
      
    • If you are running Oracle Exadata System Software release 19.x or later, then the output should look like this for each node:

      
      Name: iqn.2015-05.com.oracle:QD_DATAC1_DB01 
      Size: 128 MB 
      Host name: DB01
      ASM disk group name: DATAC1 
      Visible to: 192.168.10.45, 192.168.10.46, 192.168.10.47, 
      192.168.10.48, iqn.1988-12.com.oracle:ee657eb81b53, 
      iqn.1988-12.com.oracle:db357ba82b24
      
      Name: iqn.2015-05.com.oracle:QD_DATAC1_DB02
      Size: 128 MB
      Host name: DB02
      ASM disk group name: DATAC1
      Visible to: 192.168.10.45, 192.168.10.46, 192.168.10.47, 
      192.168.10.48, iqn.1988-12.com.oracle:ee657eb81b53,
      iqn.1988-12.com.oracle:db357ba82b24

      Note:

      The output shows both IP addresses and initiator names in the Visible to list only if you have a system that was upgraded from a release older than Oracle Exadata System Software release 19.1.0. Otherwise, the Visible to list shows only IP addresses in it.

  7. Run the quorumdiskmgr command with the --create --device options to create devices on the database servers using the previously created targets.

    In this example scenario, the following command would be run on the database servers db01 and db02:

    # /opt/oracle.SupportTools/quorumdiskmgr --create --device --target-ip-list="192.168.10.45, 192.168.10.46,
     192.168.10.47, 192.168.10.48"
  8. Run the quorumdiskmgr command with the --list --device options to verify the devices have been successfully created on the database servers.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --device

    For this example scenario, the output should look like:

    Device path: /dev/exadata_quorum/QD_DATAC1_DB01 
    Size: 128 MB 
    Host name: DB01
    ASM disk group name: DATAC1 
    
    Device path: /dev/exadata_quorum/QD_DATAC1_DB02 
    Size: 128 MB 
    Host name: DB02
    ASM disk group name: DATAC1
    
  9. Switch to the grid user on one of the database servers (for example, db01 or db02) and set the environment to access Oracle ASM.
  10. Alter the asm_diskstring initialization parameter and add /dev/exadata_quorum/* to the existing string:

    For example:

    SQL> ALTER SYSTEM SET asm_diskstring='o/*/DATAC1_*','o/*/RECOC1_*','/dev/exadata_quorum/*' scope=both sid='*';
    
  11. Verify that the quorum disk devices are automatically discovered by Oracle ASM.

    For example:

    SQL> set linesize 200
    SQL> col path format a50
    SQL> SELECT inst_id, label, path, mode_status, header_status
         FROM gv$asm_disk WHERE path LIKE '/dev/exadata_quorum/%';

    The output should look like:

    INST_ID LABEL          PATH                                MODE_STATUS HEADER_STATUS
    ------- -------------- ----------------------------------  ----------- -------------
          1 QD_DATAC1_DB01 /dev/exadata_quorum/QD_DATAC1_DB01  ONLINE      CANDIDATE
          1 QD_DATAC1_DB02 /dev/exadata_quorum/QD_DATAC1_DB02  ONLINE      CANDIDATE
          2 QD_DATAC1_DB01 /dev/exadata_quorum/QD_DATAC1_DB01  ONLINE      CANDIDATE
          2 QD_DATAC1_DB02 /dev/exadata_quorum/QD_DATAC1_DB02  ONLINE      CANDIDATE
    
  12. Add the quorum disk devices to a high redundancy Oracle ASM disk group.

    If there is no high redundancy disk group, create a high redundancy disk group and include the new quorum disks. For example:

    SQL> CREATE DISKGROUP DATAC1 HIGH REDUNDANCY ADD 
         QUORUM FAILGROUP db01 DISK '/dev/exadata_quorum/QD_DATAC1_DB01' 
         QUORUM FAILGROUP db02 DISK '/dev/exadata_quorum/QD_DATAC1_DB02' ...
    

    If a high redundancy disk group already exists, add the new quorum disks. For example:

    SQL> ALTER DISKGROUP DATAC1 ADD 
         QUORUM FAILGROUP db01 DISK '/dev/exadata_quorum/QD_DATAC1_DB01' 
         QUORUM FAILGROUP db02 DISK '/dev/exadata_quorum/QD_DATAC1_DB02';
    

    Tip:

    The failure group name should match the last part of the device path following QD_disk_group_name_. In this example, the failure group names are DB01 and DB02.
  13. Verify that the status of the quorum disks in Oracle ASM is changed from CANDIDATE to MEMBER.

    For example:

    SQL> set linesize 200
    SQL> col path format a50
    SQL> SELECT inst_id, label, path, mode_status, header_status
         FROM gv$asm_disk WHERE path LIKE '/dev/exadata_quorum/%';

    The output should look like:

    INST_ID LABEL          PATH                                MODE_STATUS HEADER_STATUS
    ------- -------------- ----------------------------------  ----------- -------------
          1 QD_DATAC1_DB01 /dev/exadata_quorum/QD_DATAC1_DB01  ONLINE      MEMBER
          1 QD_DATAC1_DB02 /dev/exadata_quorum/QD_DATAC1_DB02  ONLINE      MEMBER
          2 QD_DATAC1_DB01 /dev/exadata_quorum/QD_DATAC1_DB01  ONLINE      MEMBER
          2 QD_DATAC1_DB02 /dev/exadata_quorum/QD_DATAC1_DB02  ONLINE      MEMBER
    
  14. If you are moving voting files from a normal redundancy disk group to a high redundancy disk group that now has five failure groups, you can now relocate the voting files.

    For example:

    $ Grid_home/bin/crsctl replace votedisk +DATAC1
  15. Verify that the voting disks have been successfully relocated to the high redundancy disk group and that five voting files exist.
    $ crsctl query css votedisk
    

    In this example, the output shows three voting disks on storage servers and two voting disks on database servers:

    ## STATE File Universal Id File Name Disk group
    -- ----- ----------------- --------- ---------
    1. ONLINE ca2f1b57873f4ff4bf1dfb78824f2912 (o/192.168.10.42/DATAC1_CD_09_celadm12) [DATAC1]
    2. ONLINE a8c3609a3dd44f53bf17c89429c6ebe6 (o/192.168.10.43/DATAC1_CD_09_celadm13) [DATAC1]
    3. ONLINE cafb7e95a5be4f00bf10bc094469cad9 (o/192.168.10.44/DATAC1_CD_09_celadm14) [DATAC1]
    4. ONLINE 4dca8fb7bd594f6ebf8321ac23e53434 (/dev/exadata_quorum/QD_DATAC1_DB01) [DATAC1]
    5. ONLINE 4948b73db0514f47bf94ee53b98fdb51 (/dev/exadata_quorum/QD_DATAC1_DB02) [DATAC1]
    Located 5 voting disk(s).
    
  16. Move the Oracle ASM password file to the high redundancy disk group.
    1. Get the source Oracle ASM password file location.
      $ asmcmd pwget --asm
    2. Move the Oracle ASM password file to the high redundancy disk group.
      $ asmcmd pwmove --asm full_path_of_source_file full_path_of_destination_file

      For example:

      $ asmcmd pwmove --asm +recoc1/ASM/PASSWORD/pwdasm.256.898960531 +datac1/asmpwdfile
  17. Move the Oracle ASM SPFILE to the high redundancy disk group.
    1. Get the Oracle ASM SPFILE in use.
      $ asmcmd spget
    2. Copy the Oracle ASM SPFILE to the high redundancy disk group.
      $ asmcmd spcopy full_path_of_source_file full_path_of_destination_file
    3. Modify the Oracle Grid Infrastructure configuration to use the relocated SPFILE upon next restart.
      $ asmcmd spset full_path_of_destination_file
    4. If it can be tolerated, restart Oracle Grid Infrastructure.
      # Grid_home/bin/crsctl stop crs
      
      # Grid_home/bin/crsctl start crs

      If restarting Oracle Grid Infrastructure is not permitted, repeat this step (17) every time the Oracle ASM SPFILE is modified until Oracle Grid Infrastructure is restarted.

  18. Relocate the MGMTDB to the high redundancy disk group.

    Move the MGMTDB (if running) to the high redundancy disk group using How to Move/Recreate GI Management Repository to Different Shared Storage (Diskgroup, CFS or NFS etc) (My Oracle Support Doc ID 1589394.1).

    Configure the MGMTDB to not use hugepages using the steps below:

    export ORACLE_SID=-MGMTDB
    export ORACLE_HOME=$GRID_HOME
    sqlplus "sys as sysdba"
    SQL> ALTER SYSTEM SET use_large_pages=false scope=spfile sid='*';
    
  19. Optional: Restart Oracle Grid Infrastructure.
    # Grid_home/bin/crsctl stop crs
    
    # Grid_home/bin/crsctl start crs
  20. Optional: Convert the normal redundancy disk group to a high redundancy disk group.
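As a quick check of step 15, the voting file count can be taken directly from the crsctl output. A sketch over a saved sample; the lines are adapted from the example in step 15 and the /tmp path is illustrative:

```shell
# Count ONLINE voting files in saved `crsctl query css votedisk` output.
# Sample lines adapted from step 15; the /tmp path is illustrative.
cat > /tmp/votedisk.txt <<'EOF'
1. ONLINE ca2f1b57873f4ff4bf1dfb78824f2912 (o/192.168.10.42/DATAC1_CD_09_celadm12) [DATAC1]
2. ONLINE a8c3609a3dd44f53bf17c89429c6ebe6 (o/192.168.10.43/DATAC1_CD_09_celadm13) [DATAC1]
3. ONLINE cafb7e95a5be4f00bf10bc094469cad9 (o/192.168.10.44/DATAC1_CD_09_celadm14) [DATAC1]
4. ONLINE 4dca8fb7bd594f6ebf8321ac23e53434 (/dev/exadata_quorum/QD_DATAC1_DB01) [DATAC1]
5. ONLINE 4948b73db0514f47bf94ee53b98fdb51 (/dev/exadata_quorum/QD_DATAC1_DB02) [DATAC1]
EOF
grep -c ' ONLINE ' /tmp/votedisk.txt    # reports the number of ONLINE voting files
```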

2.14.6 Recreate Quorum Disks

In certain circumstances, you might need to recreate a quorum disk.

Some examples of when you might need to recreate a quorum disk are:
  • When recreating a guest domU

  • If you deleted the quorum disks without first dropping the quorum disks from the Oracle ASM disk group

  1. Force drop the lost quorum disk.
    ALTER DISKGROUP dg_name DROP QUORUM DISK disk_name FORCE;
  2. Follow the instructions in "Add Quorum Disks to Database Nodes" to add a new quorum disk.

2.14.7 Use Cases

The following topics describe various configuration cases when using the quorum disk manager utility.

2.14.7.1 New Deployments on Oracle Exadata 12.1.2.3.0 or Later

For new deployments on Oracle Exadata release 12.1.2.3.0 or later, OEDA implements this feature by default when all of the following requirements are satisfied:

  • The system has at least two database nodes and fewer than five storage servers.

  • You are running OEDA release February 2016 or later.

  • You meet the software requirements listed in Software Requirements for Quorum Disk Manager.

  • The Oracle Database version is 11.2.0.4 or later.

  • The system has at least one high redundancy disk group.

If the system has three storage servers in place, then two quorum disks will be created on the first two database nodes of the cluster picked by OEDA.

If the system has four storage servers in place, then one quorum disk will be created on the first database node picked by OEDA.

2.14.7.2 Upgrading to Oracle Exadata Release 12.1.2.3.0 or Later

If the target Exadata system has fewer than five storage servers, at least one high redundancy disk group, and two or more database nodes, you can implement this feature manually using quorumdiskmgr.

2.14.7.3 Downgrading to a Pre-12.1.2.3.0 Oracle Exadata Release

Quorum disks are supported in Oracle Exadata release 12.1.2.3.0 and later. If the environment has quorum disks in place, you must remove the quorum disk configuration before rolling back to a pre-12.1.2.3.0 Oracle Exadata release.

To remove quorum disk configuration, perform these steps:

  1. Ensure there is at least one normal redundancy disk group in place. If not, create one.

  2. Relocate the voting files to a normal redundancy disk group:

    $GI_HOME/bin/crsctl replace votedisk +normal_redundancy_diskgroup
    
  3. Drop the quorum disks from ASM. Run the following command for each quorum disk:

    SQL> alter diskgroup diskgroup_name drop quorum disk quorum_disk_name force;
    

    Wait for the rebalance operation to complete. You can tell it is complete when v$asm_operation returns no rows for the disk group.

  4. Delete the quorum devices. Run the following command from each database node that has quorum disks in place:

    /opt/oracle.SupportTools/quorumdiskmgr --delete --device [--asm-disk-group asm_disk_group] [--host-name host_name]
    
  5. Delete the targets. Run the following command from each database node that has quorum disks in place:

    /opt/oracle.SupportTools/quorumdiskmgr --delete --target [--asm-disk-group asm_disk_group]
    
  6. Delete the configuration. Run the following command from each database node that has quorum disks in place:

    /opt/oracle.SupportTools/quorumdiskmgr --delete --config
    
2.14.7.4 Managing Quorum Disks When Changing Elastic Configurations

When modifying the elastic configuration of an Oracle Exadata Rack, you might have to perform additional actions if you use quorum disks.

2.14.7.4.1 Adding a Database Node if Using Quorum Disks

If the existing Oracle Real Application Clusters (Oracle RAC) cluster has fewer than two database nodes and fewer than five storage servers, and the voting files are not stored in a high redundancy disk group, then Oracle recommends adding quorum disks to the database node(s) and relocating the voting files to a high redundancy disk group.

Note:

The requirements listed in "Software Requirements for Quorum Disk Manager" must be met.

If the existing Oracle RAC cluster already has quorum disks in place, the quorum disks need to be made visible to the newly added node prior to adding the node to the Oracle RAC cluster using the addnode.sh procedure.

  1. Log in to the 2 database nodes that contain the quorum devices as the root user.
  2. Retrieve the quorum disk ISCSI target configuration.
    /opt/oracle.SupportTools/quorumdiskmgr --list --target

    The output of this command should be similar to the following (where the host name is db01 and the diskgroup_name is DATA):

    Name: iqn.2015-05.com.oracle:QD_DATA_DB01
    Host name: DB01
    ASM disk group name: DATA
    Size: 128 MB
    Visible to: IP_address1, IP_address2, IP_address3, IP_address4... IP_address2n
    Discovered by: IP_address1, IP_address2, IP_address3, IP_address4
    

    IP_address1 through IP_address2n above refer to the IP addresses of the RDMA Network Fabric interfaces of all the existing cluster nodes, where n is the number of nodes in the cluster.

  3. Modify the target in each node to make the device target visible to the node being added.

    In this command, use the IP addresses shown in the Visible to field in the previous step in IP_list and append the list with the IP addresses of the node being added.

    /opt/oracle.SupportTools/quorumdiskmgr --alter --target --asm-disk-group asm_diskgroupname --visible-to 'IP_list,
     IP_addressX, IP_addressY'

    IP_addressX and IP_addressY in the previous command refer to the IP addresses of the 2 RDMA Network Fabric interfaces of the node being added.

  4. Run /opt/oracle.SupportTools/quorumdiskmgr --list --target on the two database nodes that contain the quorum devices and make sure the two IP addresses of the node being added appear in the Visible to list.
  5. Log in as the root user on the node being added.
  6. Run the quorumdiskmgr command with the --list --config option to verify that the configurations have been successfully created on the node.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --config
    
    Your output should resemble one of the following:
    • For Oracle Exadata System Software release 18.x or earlier, the output should look like this:

      Owner: grid
      Group: dba
      ifaces: exadata_ib1 exadata_ib0
      
    • If you have upgraded to Oracle Exadata System Software release 19.1.0 or later from an earlier release, then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_ib0 
      Initiator name: iqn.1988-12.com.oracle:7a4a399566
    • If you have a system that was imaged with Oracle Exadata System Software release 19.1.0 or later (not upgraded), then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_ib0 
      Initiator name: iqn.1988-12.com.oracle:192.168.18.205
    • If your rack uses RoCE Network Fabric, then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_re0 
      Initiator name: iqn.1988-12.com.oracle:192.168.18.205
  7. Run the quorumdiskmgr command with the --create --device option to create the quorum devices on the node being added pointing to the targets for the existing quorum devices.

    In the following command, IP_List is a comma-delimited list of the IP addresses obtained in step 4.

    # /opt/oracle.SupportTools/quorumdiskmgr --create --device --target-ip-list='IP_List'
  8. Run the quorumdiskmgr command with the --list --device options to verify the existing quorum devices have been successfully discovered and are visible on the node being added.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --device

    On the newly added node the output should be similar to the following and it should be the same as in any of the existing cluster nodes:

    Device path: /dev/exadata_quorum/QD_DATAC1_DB01 
    Size: 128 MB 
    Host name: DB01
    ASM disk group name: DATA 
    Device path: /dev/exadata_quorum/QD_DATAC1_DB02 
    Size: 128 MB 
    Host name: DB02
    ASM disk group name: DATA
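The --visible-to value used in step 3 is simply the existing list with the new node's two RDMA Network Fabric IP addresses appended. A minimal sketch; all IP addresses shown are illustrative:

```shell
# Build the new --visible-to list for `quorumdiskmgr --alter --target` by
# appending the added node's two RDMA Network Fabric IPs to the existing
# list. All IP addresses here are illustrative.
existing="192.168.10.45, 192.168.10.46, 192.168.10.47, 192.168.10.48"
new_node="192.168.10.49, 192.168.10.50"
visible_to="${existing}, ${new_node}"
echo "$visible_to"
# The combined list is then passed on each node hosting a quorum target:
#   /opt/oracle.SupportTools/quorumdiskmgr --alter --target \
#     --asm-disk-group=datac1 --visible-to="$visible_to"
```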
2.14.7.4.2 Removing a Database Node When Using Quorum Disks

If the database node being removed hosted a quorum disk containing a voting file and there are fewer than five storage servers in the Oracle Real Application Clusters (Oracle RAC) cluster, then a quorum disk must be created on a different database node before the database node is removed.

If the database node being removed did not host a quorum disk, then no action is required. Otherwise, use these steps to create a quorum disk on a database node that does not currently host a quorum disk.

  1. Log in to db01 and db02 as the root user.
  2. Run the quorumdiskmgr command with the --create --config options to create quorum disk configurations on both db01 and db02.
    • For InfiniBand Network Fabric:

      # /opt/oracle.SupportTools/quorumdiskmgr --create --config --owner=grid --group=dba --network-iface-list="ib0, ib1"
      
    • For RoCE Network Fabric:

      # /opt/oracle.SupportTools/quorumdiskmgr --create --config --owner=grid --group=dba --network-iface-list="re0, re1"
      
  3. Run the quorumdiskmgr command with the --list --config options to verify that the configurations have been successfully created on both db01 and db02.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --config
    
    Your output should resemble one of the following:
    • For Oracle Exadata System Software release 18.x or earlier, the output should look like this:

      Owner: grid
      Group: dba
      ifaces: exadata_ib1 exadata_ib0
      
    • If you have upgraded to Oracle Exadata System Software release 19.1.0 or later from an earlier release, then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_ib0 
      Initiator name: iqn.1988-12.com.oracle:7a4a399566
    • If you have a system that was imaged with Oracle Exadata System Software release 19.1.0 or later (not upgraded), then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_ib0 
      Initiator name: iqn.1988-12.com.oracle:192.168.18.205
    • If your rack uses RoCE Network Fabric, then the output should look like this:

      Owner: grid 
      Group: dba 
      ifaces: exadata_re0 
      Initiator name: iqn.1988-12.com.oracle:192.168.18.205
  4. Run the quorumdiskmgr command with the --create --target options to create a target on both db01 and db02 for Oracle ASM disk group DATAC1 and make the target visible to both db01 and db02.

    For example, you would use a command similar to this:

    # /opt/oracle.SupportTools/quorumdiskmgr --create --target --asm-disk-group=datac1 
    --visible-to="192.168.10.45, 192.168.10.46, 192.168.10.47, 192.168.10.48"
    
  5. Run the quorumdiskmgr command with the --list --target options to verify the target has been successfully created on both db01 and db02.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --target
    

    The output shows both IP addresses and initiator names in the Visible to list only if your system was upgraded from a release earlier than Oracle Exadata System Software release 19.1.0. Otherwise, the Visible to list shows only IP addresses.

    • If you are running Oracle Exadata System Software release 18.x or earlier, then the output should look like this for each node:

      Name: iqn.2015-05.com.oracle:QD_DATAC1_DB01 
      Size: 128 MB 
      Host name: DB01
      ASM disk group name: DATAC1 
      Visible to: 192.168.10.45, 192.168.10.46, 192.168.10.47, 192.168.10.48
      Discovered by:
      
      Name: iqn.2015-05.com.oracle:QD_DATAC1_DB02 
      Size: 128 MB 
      Host name: DB02
      ASM disk group name: DATAC1 
      Visible to: 192.168.10.45, 192.168.10.46, 192.168.10.47, 192.168.10.48
      Discovered by:
      
    • If you are running Oracle Exadata System Software release 19.x or later, then the output should look like this for each node:

      
      Name: iqn.2015-05.com.oracle:QD_DATAC1_DB01 
      Size: 128 MB 
      Host name: DB01
      ASM disk group name: DATAC1 
      Visible to: 192.168.10.45, 192.168.10.46, 192.168.10.47, 
      192.168.10.48, iqn.1988-12.com.oracle:ee657eb81b53, 
      iqn.1988-12.com.oracle:db357ba82b24
      
      Name: iqn.2015-05.com.oracle:QD_DATAC1_DB02
      Size: 128 MB
      Host name: DB02
      ASM disk group name: DATAC1
      Visible to: 192.168.10.45, 192.168.10.46, 192.168.10.47, 
      192.168.10.48, iqn.1988-12.com.oracle:ee657eb81b53,
      iqn.1988-12.com.oracle:db357ba82b24
  6. Run the quorumdiskmgr command with the --create --device options to create devices on both db01 and db02 from targets on both db01 and db02.

    For example, you would use a command similar to this:

    # /opt/oracle.SupportTools/quorumdiskmgr --create --device --target-ip-list="192.168.10.45, 192.168.10.46,
     192.168.10.47, 192.168.10.48"
    
  7. Run the quorumdiskmgr command with the --list --device options to verify the devices have been successfully created on both db01 and db02.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --device
    

    On both db01 and db02, the output should look like:

    Device path: /dev/exadata_quorum/QD_DATAC1_DB01 
    Size: 128 MB 
    Host name: DB01
    ASM disk group name: DATAC1 
    
    Device path: /dev/exadata_quorum/QD_DATAC1_DB02 
    Size: 128 MB 
    Host name: DB02
    ASM disk group name: DATAC1
    
  8. Add the two quorum disk devices to a high redundancy Oracle ASM disk group.

    If there is no high redundancy disk group, create a high redundancy disk group and include the two new quorum disks. For example:

    SQL> CREATE DISKGROUP DATAC1 HIGH REDUNDANCY ADD QUORUM FAILGROUP db01 DISK '/dev/exadata_quorum/QD_DATAC1_DB01' 
    QUORUM FAILGROUP db02 DISK '/dev/exadata_quorum/QD_DATAC1_DB02' ...
    

    If a high redundancy disk group already exists, add the two new quorum disks. For example:

    SQL> ALTER DISKGROUP datac1 ADD QUORUM FAILGROUP db01 DISK '/dev/exadata_quorum/QD_DATAC1_DB01' 
    QUORUM FAILGROUP db02 DISK '/dev/exadata_quorum/QD_DATAC1_DB02';
    
  9. Remove the database node.
    After the database node is removed, its voting file is relocated automatically to the quorum disk created in the previous steps.
2.14.7.4.3 Adding an Oracle Exadata Storage Server and Expanding an Existing High Redundancy Disk Group

When you add a storage server that uses quorum disks, Oracle recommends relocating a voting file from a database node to the newly added storage server.

  1. Add the Exadata storage server. See Adding a Cell Node for details.

    In the example below, the new storage server added is called "celadm04".

  2. After the storage server is added, verify the new fail group from v$asm_disk.

    SQL> select distinct failgroup from v$asm_disk;
    FAILGROUP
    ------------------------------
    ADM01
    ADM02
    CELADM01
    CELADM02
    CELADM03
    CELADM04
    
  3. Verify at least one database node has a quorum disk containing a voting file.

    $ crsctl query css votedisk
    ##  STATE    File Universal Id                File Name Disk group
    --  -----    -----------------                --------- ---------
     1. ONLINE   834ee5a8f5054f12bf47210c51ecb8f4 (o/192.168.12.125;192.168.12.126/DATAC5_CD_00_celadm01) [DATAC5]
     2. ONLINE   f4af2213d9964f0bbfa30b2ba711b475 (o/192.168.12.127;192.168.12.128/DATAC5_CD_00_celadm02) [DATAC5]
     3. ONLINE   ed61778df2964f37bf1d53ea03cd7173 (o/192.168.12.129;192.168.12.130/DATAC5_CD_00_celadm03) [DATAC5]
     4. ONLINE   bfe1c3aa91334f16bf78ee7d33ad77e0 (/dev/exadata_quorum/QD_DATAC5_ADM01) [DATAC5]
     5. ONLINE   a3a56e7145694f75bf21751520b226ef (/dev/exadata_quorum/QD_DATAC5_ADM02) [DATAC5]
    Located 5 voting disk(s).
    

    The example above shows there are two quorum disks with voting files on two database nodes.

  4. Drop one of the quorum disks.

    SQL> alter diskgroup datac5 drop quorum disk QD_DATAC5_ADM01;
    

    The voting file on the dropped quorum disk will be relocated automatically to the newly added storage server by the Grid Infrastructure as part of the voting file refresh. You can verify this as follows:

    $ crsctl query css votedisk
    ##  STATE    File Universal Id                File Name Disk group
    --  -----    -----------------                --------- ---------
     1. ONLINE   834ee5a8f5054f12bf47210c51ecb8f4 (o/192.168.12.125;192.168.12.126/DATAC5_CD_00_celadm01) [DATAC5]
     2. ONLINE   f4af2213d9964f0bbfa30b2ba711b475 (o/192.168.12.127;192.168.12.128/DATAC5_CD_00_celadm02) [DATAC5]
     3. ONLINE   ed61778df2964f37bf1d53ea03cd7173 (o/192.168.12.129;192.168.12.130/DATAC5_CD_00_celadm03) [DATAC5]
     4. ONLINE   a3a56e7145694f75bf21751520b226ef (/dev/exadata_quorum/QD_DATAC5_ADM02) [DATAC5]
     5. ONLINE   ab5aefd60cf84fe9bff6541b16e33787 (o/192.168.12.131;192.168.12.132/DATAC5_CD_00_celadm04) [DATAC5]
    
2.14.7.4.4 Removing an Oracle Exadata Storage Server When Using Quorum Disks

If removing a storage server would leave fewer than five storage servers in use by the Oracle RAC cluster, and the voting files reside in a high redundancy disk group, then Oracle recommends adding quorum disks to the database nodes, if they are not already in place.

Prior to removing the storage server, add the quorum disks so that five copies of the voting files are available immediately after removing the storage server.
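As a sanity check before and after removing the storage server, you can count the ONLINE voting files. A minimal sketch, using hypothetical sample output in place of the live crsctl query css votedisk command:

```shell
# On a live cluster: crsctl query css votedisk | grep -c ONLINE
# Hypothetical sample output used here for illustration:
votedisk_sample=' 1. ONLINE id1 (o/192.168.12.125/DATAC5_CD_00_celadm01) [DATAC5]
 2. ONLINE id2 (o/192.168.12.127/DATAC5_CD_00_celadm02) [DATAC5]
 3. ONLINE id3 (o/192.168.12.129/DATAC5_CD_00_celadm03) [DATAC5]
 4. ONLINE id4 (/dev/exadata_quorum/QD_DATAC5_ADM01) [DATAC5]
 5. ONLINE id5 (/dev/exadata_quorum/QD_DATAC5_ADM02) [DATAC5]'

printf '%s\n' "$votedisk_sample" | grep -c ONLINE   # expect 5
```

If the count is below five after the removal, revisit the quorum disk configuration before proceeding.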

2.14.8 Reconfigure Quorum Disk After Restoring a Database Server

After restoring a database server, lvdisplay shows the quorum disk was not restored.

When you restore a database server, Exadata image rescue mode restores the layout of disks and file systems, with the exception of custom partitions, including those used for quorum disks. After the server is restored from backup, these quorum disk devices must be recreated.

The logical volumes created for quorum disks are in /dev/VGExaDb and have the name-prefix LVDbVd*.
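The archived metadata under /etc/lvm/archive records each logical volume's name, so you can locate the quorum-disk entries before recreating them. A sketch of the search (the archive file name and contents below are hypothetical stand-ins; on a real server you would grep the actual files under /etc/lvm/archive):

```shell
# Create a stand-in archive fragment to illustrate the search
# (real archives live in /etc/lvm/archive/VGExaDb_*.vg)
demo_dir=$(mktemp -d)
cat > "$demo_dir/VGExaDb_00001.vg" <<'EOF'
logical_volumes {
    LVDbVd1 {
        id = "hypothetical-id"
    }
}
EOF

# Find which archive files mention quorum-disk LVs (name-prefix LVDbVd)
grep -l LVDbVd "$demo_dir"/*.vg
```

The matching archive file shows the LV name and size to pass to lvcreate in step 1 below.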

  1. Using the configuration backed up under /etc/lvm/archive, make a logical volume (LV) for the quorum disk on every node.

    For example, you would use a command similar to the following, but using values from the backup configuration information.

    # lvcreate -L 128MB -n <LVName> VGExaDb
  2. Reboot all database servers.
    # shutdown -r now
  3. After the server restarts, verify the quorum disks were restored.
    # /opt/oracle.SupportTools/quorumdiskmgr --list --config
    Owner: grid
    Group: dba
    ifaces: exadata_ib1 exadata_ib0
    
    # /opt/oracle.SupportTools/quorumdiskmgr --list --target
    Name: iqn.2015-05.com.oracle:QD_DATAC1_DB01
    Host name: DB01
    ASM disk group name: DATAC1
    Size: 128 MB
    Visible to: 192.168.10.45, 192.168.10.46
    Discovered by: 192.168.10.45, 192.168.10.46
    
    # /opt/oracle.SupportTools/quorumdiskmgr --list --device
    Device path: /dev/exadata_quorum/QD_DATAC1_DB01
    Host name: DB01
    ASM disk group name: DATAC1
    Size: 128 MB
    
    Device path: /dev/exadata_quorum/QD_DATAC1_DB02
    Host name: DB02
    ASM disk group name: DATAC1
    Size: 128 MB
  4. Query the voting disks for the cluster to see if all voting disks are available.
    # crsctl query css votedisk
    ##  STATE    File Universal Id                
      File Name                               Disk group
    --  -----    -----------------                
      ------------------------------------    -----------
     1. ONLINE   ca2f1b57873f4ff4bf1dfb78824f2912 
      (o/192.168.10.42/DATAC1_CD_09_celadm12) [DATAC1]
     2. ONLINE   a8c3609a3dd44f53bf17c89429c6ebe6 
    (o/192.168.10.43/DATAC1_CD_09_celadm13)   [DATAC1]
     3. ONLINE   4948b73db0514f47bf94ee53b98fdb51  
    (/dev/exadata_quorum/QD_DATAC1_DB02) [DATAC1]
     4. ONLINE   cafb7e95a5be4f00bf10bc094469cad9  
    (o/192.168.10.44/DATAC1_CD_09_celadm14) [DATAC1]
    Located 4 voting disk(s).

    Notice that there is one voting disk missing, for the recovered database server (DB01). If you query V$ASM_DISK, you can see that the quorum disk was offlined by the recovery process.

    SQL> set line 200
     col LABEL for a20
     col path for a30
     col mode_status for a20
     col header_status for a30
     SELECT label, path, mode_status, header_status, mount_status 
     FROM v$asm_disk
     WHERE path LIKE '/dev/%';
    
    LABEL                PATH                           MODE_STATUS          
    HEADER_STATUS                  MOUNT_S
    -------------------- ------------------------------ --------------------
    ------------------------------ -------
    QD_DATAC1_DB01       /dev/exadata_quorum/QD_DATAC1_ ONLINE              
    CANDIDATE                      CLOSED
    
    QD_DATAC1_DB02       /dev/exadata_quorum/QD_DATAC1_ ONLINE              
    MEMBER                         CACHED
  5. Drop the unavailable quorum disk from the Oracle ASM disk group using the FORCE option.
    SQL> alter diskgroup DATAC1 drop quorum disk QD_DATAC1_DB01 force;
  6. Add the same quorum disk to the Oracle ASM disk group.
    SQL> alter diskgroup DATAC1 add quorum failgroup DB01 disk 
    '/dev/exadata_quorum/QD_DATAC1_DB01';
  7. Requery V$ASM_DISK to verify both quorum disks are available.
    SQL> SELECT label, path, mode_status, header_status, mount_status 
     FROM v$asm_disk
     WHERE path LIKE '/dev/%';
    
    LABEL                PATH                           MODE_STATUS          
    HEADER_STATUS                  MOUNT_S
    -------------------- ------------------------------ --------------------
    ------------------------------ -------
    QD_DATAC1_DB01       /dev/exadata_quorum/QD_DATAC1_ ONLINE              
    MEMBER                         CACHED
    
    QD_DATAC1_DB02       /dev/exadata_quorum/QD_DATAC1_ ONLINE              
    MEMBER                         CACHED
  8. Query the voting disks for the cluster to verify all voting disks are now available.
    # crsctl query css votedisk
    ##  STATE    File Universal Id                
      File Name                               Disk group
    --  -----    -----------------                
      ------------------------------------    -----------
     1. ONLINE   ca2f1b57873f4ff4bf1dfb78824f2912 
      (o/192.168.10.42/DATAC1_CD_09_celadm12) [DATAC1]
     2. ONLINE   a8c3609a3dd44f53bf17c89429c6ebe6 
    (o/192.168.10.43/DATAC1_CD_09_celadm13)   [DATAC1]
     3. ONLINE   4948b73db0514f47bf94ee53b98fdb51  
    (/dev/exadata_quorum/QD_DATAC1_DB02) [DATAC1]
     4. ONLINE   cafb7e95a5be4f00bf10bc094469cad9  
    (o/192.168.10.44/DATAC1_CD_09_celadm14) [DATAC1]
     5. ONLINE   4dca8fb7bd594f6ebf8321ac23e53434  
    (/dev/exadata_quorum/QD_DATAC1_DB01) [DATAC1]
    Located 5 voting disk(s).

2.15 Using vmetrics

The vmetrics package enables you to display system statistics gathered by the vmetrics service.

2.15.1 About the vmetrics Package

The vmetrics service collects the statistics required for SAP monitoring of Oracle VM domains.

You can access the system statistics from the management domain (dom0) or the user domain (domU). The vmetrics service runs on the management domain, collects the statistics, and pushes them to the xenstore. This allows the user domains to access the statistics.

System statistics collected by the vmetrics service are shown below, with sample values:

com.sap.host.host.VirtualizationVendor=Oracle Corporation;
com.sap.host.host.VirtProductInfo=Oracle VM 3;
com.sap.host.host.PagedInMemory=0;
com.sap.host.host.PagedOutMemory=0;
com.sap.host.host.PageRates=0;
com.sap.vm.vm.uuid=2b80522b-060d-47ee-8209-2ab65778eb7e;
com.sap.host.host.HostName=sc10adm01.example.com;
com.sap.host.host.HostSystemInfo=sc10adm01;
com.sap.host.host.NumberOfPhysicalCPUs=24;
com.sap.host.host.NumCPUs=4;
com.sap.host.host.TotalPhyMem=98295;
com.sap.host.host.UsedVirtualMemory=2577;
com.sap.host.host.MemoryAllocatedToVirtualServers=2577;
com.sap.host.host.FreeVirtualMemory=29788;
com.sap.host.host.FreePhysicalMemory=5212;
com.sap.host.host.TotalCPUTime=242507.220000;
com.sap.host.host.Time=1453150151;
com.sap.vm.vm.PhysicalMemoryAllocatedToVirtualSystem=8192;
com.sap.vm.vm.ResourceMemoryLimit=8192;
com.sap.vm.vm.TotalCPUTime=10160.1831404;
com.sap.vm.vm.ResourceProcessorLimit=4; 

2.15.2 Installing and Starting the vmetrics Service

To install the vmetrics service, run the install.sh script as the root user on dom0:

[root@scac10adm01]# cd /opt/oracle.SupportTools/vmetrics
[root@scac10adm01]# ./install.sh

The install.sh script verifies that it is running on dom0, stops any vmetrics services currently running, copies the package files to /opt/oracle.vmetrics, and copies vmetrics.svc to /etc/init.d.

To start the vmetrics service on dom0, run the following command as the root user on dom0:

[root@scac10adm01 vmetrics]# service vmetrics.svc start

The commands to gather the statistics are run every 30 seconds.

2.15.3 Files in the vmetrics Package

The vmetrics package contains the following files:

File Description

install.sh

This file installs the package.

vm-dump-metrics

This script reads the statistics from the xenstore and displays them in XML format.

vmetrics

This Python script runs the system commands and uploads their output to the xenstore. The system commands are listed in the vmetrics.conf file.

vmetrics.conf

This XML file specifies the metrics that the dom0 should push to the xenstore, and the system commands to run for each metric.

vmetrics.svc

The init.d file that makes vmetrics a Linux service.

2.15.4 Displaying the Statistics

Once the statistics have been pushed to the xenstore, you can view the statistics on dom0 and domU by running either of the following commands:

Note:

On domUs, ensure that the xenstoreprovider and ovmd packages are installed.

xenstoreprovider is the library which communicates with the ovmapi kernel infrastructure.

ovmd is a daemon that handles configuration and reconfiguration events and provides a mechanism to send/receive messages between the VM and the Oracle VM Manager.

The following command installs the necessary packages on Oracle Linux 5 and 6 to support the Oracle VM API.

# yum install ovmd xenstoreprovider
  • The /usr/sbin/ovmd -g vmhost command displays the statistics on one line. The sed command breaks up the line into multiple lines, one statistic per line. You need to run this command as the root user.

    [root@scac10db01vm04 ~]# /usr/sbin/ovmd -g vmhost |sed 's/; */;\n/g;s/:"/:"\n/g'
    com.sap.host.host.VirtualizationVendor=Oracle Corporation;
    com.sap.host.host.VirtProductInfo=Oracle VM 3;
    com.sap.host.host.PagedInMemory=0;
    com.sap.host.host.PagedOutMemory=0;
    com.sap.host.host.PageRates=0;
    com.sap.vm.vm.uuid=2b80522b-060d-47ee-8209-2ab65778eb7e;
    com.sap.host.host.HostName=scac10adm01.example.com;
    com.sap.host.host.HostSystemInfo=scac10adm01;
    com.sap.host.host.NumberOfPhysicalCPUs=24;
    com.sap.host.host.NumCPUs=4;
    ...
    
  • The vm-dump-metrics command displays the metrics in XML format.

    [root@scac10db01vm04 ~]# ./vm-dump-metrics
    <metrics>
    <metric type='real64' context='host'>
    <name>TotalCPUTime</name>
    <value>242773.600000</value>
    </metric>
    <metric type='uint64' context='host'>
    <name>PagedOutMemory</name>
    <value>0</value>
    </metric>
    ...
    

    Note that you must copy the vm-dump-metrics command to the domUs from which you want to run the command.

2.15.5 Adding Metrics to vmetrics

You can add your own metric to be collected by the vmetrics service.

  1. In /opt/oracle.SupportTools/vmetrics/vmetrics.conf, add the new metric and the system commands to retrieve and parse that metric. For example:
    <metric type="uint32" context="host">
     <name>NumCPUs</name>
     <action>grep -c processor /proc/cpuinfo</action>
     <action2>xm list | grep '^Domain-0' |awk '{print $4}'</action2>
    </metric>
    

    In the <name> element, enter the name of the new metric.

    In the <action> and <action2> elements, specify the system command for the new metric. You only need to have <action2>, but you can use <action> as a fallback in case <action2> does not work on some systems.

    Note that any action that needs the name of the VM should use the dummy name scas07client07vm01. When vmetrics runs, it substitutes this dummy name with the actual domU names running in the dom0.

  2. In /opt/oracle.SupportTools/vmetrics/vmetrics, add the metric in the list gFieldsList. Prefix the metric name with "host" if the metric is about the host (dom0) or with "vm" if the metric is about the vm (domU). For example:

    Suppose the gFieldsList looks like this:

    gFieldsList = [ 'host.VirtualizationVendor',
        'host.VirtProductInfo',
        'host.PagedInMemory',
        'vm.ResourceProcessorLimit' ]
    

    If you are adding a new metric called "NumCPUs" (as shown in the example in step 1), and this metric is intended to tell the domU how many CPUs the dom0 has available, then gFieldsList would look like this:

     gFieldsList = [ 'host.VirtualizationVendor',
        'host.VirtProductInfo',
        'host.PagedInMemory',
        'vm.ResourceProcessorLimit',
        'host.NumCPUs']
    
  3. (optional) In /opt/oracle.SupportTools/vmetrics/vm-dump-metrics, add the new metric if you want the new metric to be included in the XML output.

    If you skip this step, you can view the new metric using the ovmd -g vmhost command.

2.16 Using FIPS mode

On database servers running Oracle Linux 7, you can enable the kernel to run in FIPS mode.

Starting with Oracle Exadata System Software release 20.1.0, you can enable and disable the Federal Information Processing Standards (FIPS) compatibility mode on Oracle Exadata Database Machine database servers running Oracle Linux 7.

After you enable or disable FIPS mode, you must reboot the server for the action to take effect.

To enable, disable, and get status information about FIPS mode, use the utility at /opt/oracle.cellos/host_access_control with the fips-mode option:

  • To display the current FIPS mode setting, run:

    # /opt/oracle.cellos/host_access_control fips-mode --status
  • To enable FIPS mode, run:

    # /opt/oracle.cellos/host_access_control fips-mode --enable

    Then, reboot the server to finalize the action.

  • To disable FIPS mode, run:

    # /opt/oracle.cellos/host_access_control fips-mode --disable

    Then, reboot the server to finalize the action.

  • To display information about FIPS mode, run:

    # /opt/oracle.cellos/host_access_control fips-mode --info

The following example shows the typical command sequence and command output for enabling and disabling FIPS mode on a server.

# /opt/oracle.cellos/host_access_control fips-mode --status
[2020-04-14 09:19:45 -0700] [INFO] [IMG-SEC-1101] FIPS mode is disabled

# /opt/oracle.cellos/host_access_control fips-mode --enable
[2020-04-14 09:30:10 -0700] [INFO] [IMG-SEC-1107] Using only FIPS compliant
SSH host keys and sshd configuration updated in /etc/ssh/sshd_config
[2020-04-14 09:30:10 -0700] [INFO] [IMG-SEC-1103] FIPS mode is set to
enabled. A reboot is required to effect this change.

# /opt/oracle.cellos/host_access_control fips-mode --status
[2020-04-14 09:30:14 -0700] [INFO] [IMG-SEC-1101] FIPS mode is configured but
not activated. A reboot is required to activate.

# reboot

...

# /opt/oracle.cellos/host_access_control fips-mode --status
[2020-04-14 09:23:15 -0700] [INFO] [IMG-SEC-1103] FIPS mode is configured and
active

# /opt/oracle.cellos/host_access_control fips-mode --disable
[2020-04-14 09:40:37 -0700] [INFO] [IMG-SEC-1103] FIPS mode is set to
disabled. A reboot is required to effect this change.

# /opt/oracle.cellos/host_access_control fips-mode --status
[2020-04-14 09:40:37 -0700] [INFO] [IMG-SEC-1103] FIPS mode is disabled but
is active. A reboot is required to deactivate FIPS mode.

# reboot

...

# /opt/oracle.cellos/host_access_control fips-mode --status
[2020-04-14 09:46:22 -0700] [INFO] [IMG-SEC-1101] FIPS mode is disabled

2.17 LED Status Descriptions

The LEDs on the Oracle Exadata Rack components help you identify the component that needs servicing.

2.17.1 Exadata Database Server X5-2 and Later LEDs

The following table describes the color codes of the LEDs on Oracle Server X5-2 and later Oracle Database Servers.

Table 2-4 Oracle Server X5-2 and Later Oracle Database Server LED Status Descriptions

Component LED Status

Fan module

  • Top Fan Fault LED is off: The system is powered on and the fan module is functioning correctly.

  • Top Fan Fault LED is amber: The fan module is faulty. The front and rear panel Fault-Service Required LEDs are also lit if the system detects a fan module fault.

Power supply

  • AC OK LED is green: Power supply can be removed during a hot-swap procedure.

  • Fault-Service Required LED is amber: The power supply is faulty. The front and rear panel Fault-Service Required LEDs are also lit if the system detects a power supply fault.

Storage devices

  • OK to Remove LED is blue: The storage device can be removed safely during a hot-swap procedure.

  • Fault-Service Required LED is amber: The system is running but the storage device is faulty. The front and rear panel Fault-Service Required LEDs are also lit if the system detects a storage device fault.

  • OK/Activity LED is green: Data is being read from or written to the storage device.

The following table describes the storage device status based on the LEDs.

Table 2-5 Storage Device Status of Oracle Server X5-2 and later Oracle Database Server Based on LEDs

LED                              Predictive Failure   Critical

Service Action Required (Amber)  On                   On

OK to Remove (Blue)              On                   On

2.17.2 Sun Server X4-2 Oracle Database Server LEDs

Table 2-6 describes the color codes of the LEDs on Sun Server X4-2 Oracle Database Servers.

Table 2-6 Sun Server X4-2 Oracle Database Server LED Status Descriptions

Component LED Status

Sun Server X4-2 Oracle Database Server fan module

  • Fan Status LED is green: The system is powered on and the fan module is functioning correctly.

  • Fan Status LED is amber: The fan module is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a fan module fault.

Sun Server X4-2 Oracle Database Server power supply

  • OK to Remove LED is green: The power supply can be removed safely during a hot-swap procedure.

  • Service Action Required LED is amber: The power supply is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a power supply fault.

  • AC Present LED is green: Power supply can be removed during a hot-swap procedure.

Sun Server X4-2 Oracle Database Server storage drives

  • OK to Remove LED is blue: The storage drive can be removed safely during a hot-swap procedure.

  • Service Action Required LED is amber: The system is running but the storage drive is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a storage drive fault.

  • OK/Activity LED is green: Data is being read from or written to the storage drive.

Table 2-7 describes the disk status based on the LEDs.

Table 2-7 Disk Status of Sun Server X4-2 Oracle Database Server Based on LEDs

LED                              Predictive Failure   Poor Performance   Critical

Service Action Required (Amber)  On                   On                 On

OK to Remove (Blue)              Off                  On                 On

2.17.3 Sun Server X3-2 Oracle Database Server LEDs

Table 2-8 describes the color codes of the LEDs on Sun Server X3-2 Oracle Database Servers.

Table 2-8 Sun Server X3-2 Oracle Database Server LED Status Descriptions

Component LED Status

Sun Server X3-2 Oracle Database Server fan module

  • Fan Status LED is green: The system is powered on and the fan module is functioning correctly.

  • Fan Status LED is amber: The fan module is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a fan module fault.

Sun Server X3-2 Oracle Database Server power supply

  • OK to Remove LED is green: The power supply can be removed safely during a hot-swap procedure.

  • Service Action Required LED is amber: The power supply is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a power supply fault.

  • AC Present LED is green: Power supply can be removed during a hot-swap procedure.

Sun Server X3-2 Oracle Database Server storage drives

  • OK to Remove LED is blue: The storage drive can be removed safely during a hot-swap procedure.

  • Service Action Required LED is amber: The system is running but the storage drive is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a storage drive fault.

  • OK/Activity LED is green: Data is being read from or written to the storage drive.

Table 2-9 describes the disk status based on the LEDs.

Table 2-9 Disk Status of Sun Server X3-2 Oracle Database Server Based on LEDs

LED                              Predictive Failure   Poor Performance   Critical

Service Action Required (Amber)  On                   On                 On

OK to Remove (Blue)              Off                  On                 On

2.17.4 Sun Fire X4170 M2 Oracle Database Server LEDs

Table 2-10 describes the color codes of the LEDs on Sun Fire X4170 M2 Oracle Database Servers.

Table 2-10 Sun Fire X4170 M2 Oracle Database Server LED Status Descriptions

Component LED Status

Sun Fire X4170 M2 Oracle Database Server fan module

  • Fan Status LED is green: The system is powered on and the fan module is functioning correctly.

  • Fan Status LED is amber: The fan module is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a fan module fault.

Sun Fire X4170 M2 Oracle Database Server power supply

  • OK to Remove LED is green: The power supply can be removed safely during a hot-swap procedure.

  • Service Action Required LED is amber: The power supply is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a power supply fault.

  • AC Present LED is green: Power supply can be removed during a hot-swap procedure.

Sun Fire X4170 M2 Oracle Database Server storage drives

  • OK to Remove LED is blue: The storage drive can be removed safely during a hot-swap procedure.

  • Service Action Required LED is amber: The system is running but the storage drive is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a storage drive fault.

  • OK/Activity LED is green: Data is being read from or written to the storage drive.

Table 2-11 describes the disk status based on the LEDs.

Table 2-11 Disk Status of Sun Fire X4170 M2 Oracle Database Server Based on LEDs

LED                              Predictive Failure   Poor Performance   Critical

Service Action Required (Amber)  On                   On                 On

OK to Remove (Blue)              Off                  On                 On

2.17.5 Sun Fire X4170 Oracle Database Server LEDs

Table 2-12 describes the color codes of the LEDs on Sun Fire X4170 Oracle Database Servers.

Table 2-12 Sun Fire X4170 Oracle Database Server LED Status Descriptions

Component LED Status

Sun Fire X4170 Oracle Database Server fan module

  • Power/OK LED is green: The system is powered on and the fan module is functioning correctly.

  • Service Action Required LED is amber: The fan module is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a fan module fault.

Sun Fire X4170 Oracle Database Server power supply

  • OK to Remove LED is green: The power supply can be removed safely during a hot-swap procedure.

  • Service Action Required LED is amber: The power supply is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a power supply fault.

  • AC Present LED is green: Power supply can be removed during a hot-swap procedure.

Sun Fire X4170 Oracle Database Server storage drives

  • OK to Remove LED is blue: A storage drive can be removed safely during a hot-swap procedure.

  • Service Action Required LED is amber: The system is running and the storage drive is faulty. The front and rear panel Service Action Required LEDs are also lit if the system detects a storage drive fault.

  • OK/Activity LED is green: Data is being read from or written to the storage drive.

Table 2-13 describes the disk status based on the LEDs.

Table 2-13 Disk Status of Sun Fire X4170 Oracle Database Server Based on LEDs

LED                              Predictive Failure   Poor Performance   Critical

Service Action Required (Amber)  Flashing             Flashing           On

OK to Remove (Blue)              Off                  Flashing           Flashing

2.17.6 Exadata Database Server X5-8 and Later LEDs

The color codes of the LEDs on Exadata Database Server X5-8 and later models are as follows:

  • Dual PCIe Card Carriers (DPCC):

    • Fault-Service Required LED is amber. If lit, there is a fault condition that requires service. This LED blinks rapidly when you use Oracle ILOM to activate the LED as a locator.
    • OK/Activity LED is green. A steady light indicates the PCIe card is ready and in use by the operating system. A flashing green light indicates the DPCC is booting. If the LED is not lit, then power is not present.
  • Hard drives:

    • OK to Remove LED is blue. If this LED is lit, then the hard drive can safely be removed.
    • Fault-Service Required LED is amber. If this LED is lit, then a fault was detected in the hard drive. A flashing LED indicates that the drive is not functioning correctly, corresponding to a status of predictive failure, poor performance, or critical.
    • OK/Activity LED is green: This LED flashes to indicate drive activity. The rate at which the LED blinks can vary by activity. If the LED is a steady green, then the storage drive is functioning normally. If the LED is not lit, then power is not present or the Oracle ILOM boot is not complete.
  • Power supplies:

    • PSU Fault-Service Required LED is amber. If this LED is lit, then a fault was detected in the power supply.
    • PSU OK LED is green. If the LED is a steady green, then the power supply is functioning normally.
    • AC OK LED is green. If the LED is a steady green, then the power supply is connected to a properly rated AC power source.
  • Service processor (SP) modules:

    • Network Activity LEDs: A steady green light indicates the link is up, and the LED flashes green when there is network traffic. If the LED is not lit, there is no activity and the link is not operational.
    • Network Speed LEDs: If the LED is not lit, the link (if up) is operating at 10BASE-T (10 Mb/s Ethernet). A steady amber light indicates a 100BASE-TX link (Fast Ethernet). A steady green light indicates a 1000BASE-T link (Gigabit Ethernet).
    • Chassis Fault-Service Required LED is amber. If this LED is lit, then a fault was detected in the system module.
    • Power/System OK LED: Green indicates full power mode. It flashes quickly during server start up and when AC power cords are connected to the server. It flashes slowly in standby power mode.
    • Temperature Fault LED is amber. If lit, this LED indicates the internal server temperature exceeds the upper threshold.
    • SP OK LED is green. When this LED is lit, it indicates the service processor and Oracle ILOM are operational. If the LED is flashing, it indicates the SP is booting.
    • Locator LED is white: This LED is lit when activated by Oracle ILOM or the recessed Locate button. This LED enables you to locate a server quickly and easily.
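The Network Speed LED states listed above map directly to negotiated link speeds. The following Python sketch (a hypothetical decoder, not an Oracle utility) summarizes that mapping:

```python
# Hypothetical decoder for the SP module Network Speed LED on
# Exadata Database Server X5-8 and later models.

def link_speed(led_state: str) -> str:
    """Map the Network Speed LED state to the negotiated Ethernet link."""
    states = {
        "off": "10BASE-T (10 Mb/s, if the link is up)",
        "amber": "100BASE-TX (Fast Ethernet)",
        "green": "1000BASE-T (Gigabit Ethernet)",
    }
    return states.get(led_state.lower(), "unknown LED state")

print(link_speed("green"))  # -> 1000BASE-T (Gigabit Ethernet)
```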

2.17.7 Sun Server X4-8 Oracle Database Server LEDs

Table 2-14 describes the color codes of the LEDs on Sun Server X4-8 Oracle Database Server.

Table 2-14 Sun Server X4-8 Oracle Database Server LED Status Descriptions

Component LED Status

Sun Server X4-8 Oracle Database Server Back Panel PCIe Card Carriers (DPCC)

  • ATTN Button LED: Press the ATTN button to logically remove the PCIe card from, or add it to, the operating system.

  • Service Action Required (fault) LED is amber: There is a fault condition.

  • OK LED is green: The PCIe card is ready and in use by the operating system.

Sun Server X4-8 Oracle Database Server hard drives

  • Hot Swap LED is blue: The hard drive can safely be removed.

  • Fault is amber: The hard drive is faulty.

  • Activity LED is green: The LED flashes to indicate drive activity and is steady when the drive is on standby.

Sun Server X4-8 Oracle Database Server power supply

  • PSU Fault LED is amber: The power supply has a fault condition.

  • PSU OK LED is green: The power supply is on.

  • AC LED is green: AC is connected to the power supply.

Sun Server X4-8 Oracle Database Server service processor (SP) modules

  • 10/100/1000Base-T Ethernet LED (left): Green indicates link is established at 1 gigabit. Amber and on indicates link is established at 100 megabits. Amber and off indicates link is established at 10 megabits.

  • 10/100/1000Base-T Ethernet LED (bottom) is green: There is activity on the link.

  • Chassis Service Action Required (fault) LED is amber: There is a fault condition.

  • Chassis Power OK LED: Green indicates full power mode. It flashes quickly during SP boot when AC power cords are connected to the server. It flashes slowly in standby power mode.

  • Chassis Over-temperature LED is amber: The internal server temperature exceeds the upper threshold.

  • SP module OK LED: It is green and on when SP is functional. It flashes green three times when SP module first receives power. It is yellow when SP is not functional.

  • Chassis Locate LED is white: It has been activated by the ILOM or Locate button.

Table 2-15 describes the disk status based on the LEDs.

Table 2-15 Disk Status of Sun Server X4-8 Oracle Database Server Based on LEDs

LED                               Predictive Failure   Poor Performance   Critical

Service Action Required (Amber)   Flashing             Flashing           Flashing

OK to Remove (Blue)               Off                  Off                Off
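Note that in Table 2-15, unlike Table 2-13, all three conditions show the same LED pattern, so on this server the LEDs alone indicate that a drive fault exists without identifying which condition applies. A hypothetical Python sketch makes this explicit:

```python
# Hypothetical helper: on Sun Server X4-8, the drive LEDs narrow a fault
# down to a set of possible conditions (per Table 2-15) rather than one.

def possible_conditions(service_action_required: str, ok_to_remove: str) -> set:
    """Return the set of disk conditions consistent with the LED pattern."""
    leds = (service_action_required.lower(), ok_to_remove.lower())
    if leds == ("flashing", "off"):
        return {"predictive failure", "poor performance", "critical"}
    return set()  # no recognized fault pattern

print(sorted(possible_conditions("flashing", "off")))
# -> ['critical', 'poor performance', 'predictive failure']
```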

2.17.8 Sun Server X2-8 Oracle Database Server LEDs

Table 2-16 describes the color codes of the LEDs on Sun Server X2-8 Oracle Database Servers.

Table 2-16 Sun Server X2-8 Oracle Database Server LED Status Descriptions

Component LED Status

Sun Server X2-8 Oracle Database Server Back Panel PCIe Express Modules (EM)

  • ATTN Button LED: Press the ATTN button to logically remove the EM from, or add it to, the operating system.

  • Service Action Required (fault) LED is amber: There is a fault condition.

  • OK LED is green: The EM is ready and in use by the operating system.

Sun Server X2-8 Oracle Database Server hard drives

  • Hot Swap LED is blue: The hard drive can safely be removed.

  • Fault is amber: The hard drive is faulty.

  • Activity LED is green: The LED flashes to indicate drive activity and is steady when the drive is on standby.

Sun Server X2-8 Oracle Database Server Network Express Modules (NEM)

  • 10 Gb Ethernet port Activity LED: Green indicates link is established at 100 megabits. Amber indicates link is at 10 megabits, and not at full capacity.

  • 10 Gb Ethernet port Link LED is green: There is activity on the link.

  • 10/100/1000Base-T Ethernet LED (top): Green indicates link is established at 1 gigabit. Amber and on indicates link is established at 100 megabits. Amber and off indicates link is established at 10 megabits.

  • 10/100/1000Base-T Ethernet LED (bottom) is green: There is activity on the link.

  • NEM Locate LED is white: It has been activated by the ILOM or Locate button.

  • OK to Remove LED is blue: Not used.

  • Service Action Required (fault) LED is amber: There is a fault condition.

  • Power OK LED is green: The system is powered on.

Sun Server X2-8 Oracle Database Server power supply

  • PSU Fault LED is amber: The power supply has a fault condition.

  • PSU OK LED is green: The power supply is on.

  • AC LED is green: AC is connected to the power supply.

Sun Server X2-8 Oracle Database Server service processor (SP) modules

  • 10/100/1000Base-T Ethernet LED (left): Green indicates link is established at 1 gigabit. Amber and on indicates link is established at 100 megabits. Amber and off indicates link is established at 10 megabits.

  • 10/100/1000Base-T Ethernet LED (bottom) is green: There is activity on the link.

  • Chassis Service Action Required (fault) LED is amber: There is a fault condition.

  • Chassis Power OK LED: Green indicates full power mode. It flashes quickly during SP boot when AC power cords are connected to the server. It flashes slowly in standby power mode.

  • Chassis Over-temperature LED is amber: The internal server temperature exceeds the upper threshold.

  • SP module OK LED: It is green and on when SP is functional. It flashes green three times when SP module first receives power. It is yellow when SP is not functional.

  • Chassis Locate LED is white: It has been activated by the ILOM or Locate button.

Table 2-17 describes the disk status based on the LEDs.

Table 2-17 Disk Status of Sun Server X2-8 Oracle Database Server Based on LEDs

LED                               Predictive Failure   Poor Performance   Critical

Service Action Required (Amber)   Flashing             Flashing           Flashing

OK to Remove (Blue)               Off                  Off                Off

2.17.9 Sun Fire X4800 Oracle Database Server LEDs

Table 2-18 describes the color codes of the LEDs on Sun Fire X4800 Oracle Database Servers.

Table 2-18 Sun Fire X4800 Oracle Database Server LED Status Descriptions

Component LED Status

Sun Fire X4800 Oracle Database Server Back Panel PCIe Express Modules (EM)

  • ATTN Button LED: Press the ATTN button to logically remove the EM from, or add it to, the operating system.

  • Service Action Required (fault) LED is amber: There is a fault condition.

  • OK LED is green: The EM is ready and in use by the operating system.

Sun Fire X4800 Oracle Database Server hard drives

  • Hot Swap LED is blue: The hard drive can safely be removed.

  • Fault is amber: The hard drive is faulty.

  • Activity LED is green: The LED flashes to indicate drive activity and is steady when the drive is on standby.

Sun Fire X4800 Oracle Database Server Network Express Modules (NEM)

  • 10 Gb Ethernet port Activity LED: Green indicates link is established at 100 megabits. Amber indicates link is at 10 megabits, and not at full capacity.

  • 10 Gb Ethernet port Link LED is green: There is activity on the link.

  • 10/100/1000Base-T Ethernet LED (top): Green indicates link is established at 1 gigabit. Amber and on indicates link is established at 100 megabits. Amber and off indicates link is established at 10 megabits.

  • 10/100/1000Base-T Ethernet LED (bottom) is green: There is activity on the link.

  • NEM Locate LED is white: It has been activated by the ILOM or Locate button.

  • OK to Remove LED is blue: Not used.

  • Service Action Required (fault) LED is amber: There is a fault condition.

  • Power OK LED is green: The system is powered on.

Sun Fire X4800 Oracle Database Server power supply

  • PSU Fault LED is amber: The power supply has a fault condition.

  • PSU OK LED is green: The power supply is on.

  • AC LED is green: AC is connected to the power supply.

Sun Fire X4800 Oracle Database Server service processor (SP) modules

  • 10/100/1000Base-T Ethernet LED (left): Green indicates link is established at 1 gigabit. Amber and on indicates link is established at 100 megabits. Amber and off indicates link is established at 10 megabits.

  • 10/100/1000Base-T Ethernet LED (bottom) is green: There is activity on the link.

  • Chassis Service Action Required (fault) LED is amber: There is a fault condition.

  • Chassis Power OK LED: Green indicates full power mode. It flashes quickly during SP boot when AC power cords are connected to the server. It flashes slowly in standby power mode.

  • Chassis Over-temperature LED is amber: The internal server temperature exceeds the upper threshold.

  • SP module OK LED: It is green and on when SP is functional. It flashes green three times when SP module first receives power. It is yellow when SP is not functional.

  • Chassis Locate LED is white: It has been activated by the ILOM or Locate button.

Table 2-19 describes the disk status based on the LEDs.

Table 2-19 Disk Status of Sun Fire X4800 Oracle Database Server Based on LEDs

LED                               Predictive Failure   Poor Performance   Critical

Service Action Required (Amber)   Flashing             Flashing           Flashing

OK to Remove (Blue)               Off                  Off                Off

2.18 Exadata Database Server Images

Each Exadata database server model has a different external layout and physical appearance.

2.18.1 Oracle Server X8-2 Database Server Images

Oracle Server X8-2 is used as the database server in Oracle Exadata Database Machine X8M-2 and X8-2.

The following image shows the front view of Oracle Server X8-2 Database Servers.

Figure 2-3 Front View of Oracle Server X8-2 Database Servers


The following image shows the rear view of Oracle Server X8-2 Database Servers.

Figure 2-4 Rear View of Oracle Server X8-2 Database Servers


2.18.2 Oracle Server X7-2 Oracle Database Server Images

The following image shows the front view of Oracle Server X7-2 Oracle Database Server.

Figure 2-5 Front View of Oracle Server X7-2 Oracle Database Server


The following image shows the rear view of Oracle Server X7-2 Oracle Database Server.

Figure 2-6 Rear View of X7-2 Oracle Database Server


2.18.3 Oracle Server X6-2 Oracle Database Server Images

The following image shows the front view of Oracle Server X6-2 Oracle Database Server.

Figure 2-7 Front View of Oracle Server X6-2 Oracle Database Server


The following image shows the rear view of Oracle Server X6-2 Oracle Database Server.

The top hard disk drives are, from left to right, HDD1 and HDD3. The lower drives are, from left to right, HDD0 and HDD2.

Figure 2-8 Rear View of Oracle Server X6-2 Oracle Database Server


2.18.4 Oracle Server X5-2 Oracle Database Server Images

The following image shows the front view of Oracle Server X5-2 Oracle Database Server.

Figure 2-9 Front View of Oracle Server X5-2 Oracle Database Server


The following image shows the rear view of Oracle Server X5-2 Oracle Database Server.

The top hard disk drives are, from left to right, HDD1 and HDD3. The lower drives are, from left to right, HDD0 and HDD2.

Figure 2-10 Rear View of Oracle Server X5-2 Oracle Database Server


2.18.5 Sun Server X4-2 Oracle Database Server Images

The following image shows the front view of Sun Server X4-2 Oracle Database Server.

Figure 2-11 Front View of Sun Server X4-2 Oracle Database Server


The following image shows the rear view of Sun Server X4-2 Oracle Database Server.

Figure 2-12 Rear View of Sun Server X4-2 Oracle Database Server


2.18.6 Sun Server X3-2 Oracle Database Server Images

The following image shows the front view of Sun Server X3-2 Oracle Database Server.

Figure 2-13 Front View of Sun Server X3-2 Oracle Database Server


The following image shows the rear view of Sun Server X3-2 Oracle Database Server.

Figure 2-14 Rear View of Sun Server X3-2 Oracle Database Server


2.18.7 Sun Fire X4170 M2 Oracle Database Server Images

The following image shows the front view of Sun Fire X4170 M2 Oracle Database Server.

Figure 2-15 Front View of Sun Fire X4170 M2 Oracle Database Server

  1. Hard disk drives. The top drives are, from left to right, HDD1 and HDD3. The lower drives are, from left to right, HDD0 and HDD2.

The following image shows the rear view of Sun Fire X4170 M2 Oracle Database Server.

Figure 2-16 Rear View of Sun Fire X4170 M2 Oracle Database Server

  1. InfiniBand host channel adapter

  2. Gigabit Ethernet ports

2.18.8 Sun Fire X4170 Oracle Database Server Images

The following image shows the front view of Sun Fire X4170 Oracle Database Server.

Figure 2-17 Front View of Sun Fire X4170 Oracle Database Server

  1. Hard disk drives. The top drives are, from left to right, HDD1 and HDD3. The lower drives are, from left to right, HDD0 and HDD2.

The following image shows the rear view of Sun Fire X4170 Oracle Database Server.

Figure 2-18 Rear View of Sun Fire X4170 Oracle Database Server

  1. RDMA Network Fabric host channel adapter

  2. Gigabit Ethernet ports

2.18.9 Oracle Server X8M-8 and X8-8 Database Server Images

The following image shows the front view of Oracle Server X8M-8 and X8-8 Database Server.

Figure 2-19 Front View of Oracle Database Server X8M-8 and X8-8


The following image shows the rear view of Oracle Database Server X8M-8.

Figure 2-20 Rear View of Oracle Database Server X8M-8


The following image shows the rear view of Oracle Server X8-8 Database Server.

Figure 2-21 Rear View of Oracle Server X8-8 Database Server


2.18.10 Oracle Server X7-8 Oracle Database Server Images

The following image shows the front view of Oracle Server X7-8 Oracle Database Server.

Figure 2-22 Front View of Oracle Server X7-8 Oracle Database Server



The following image shows the rear view of Oracle Server X7-8 Oracle Database Server.

Figure 2-23 Rear View of Oracle Server X7-8 Oracle Database Server



2.18.11 Oracle Server X5-8 and X6-8 Oracle Database Server Images

The following image shows the front view of Oracle Server X5-8 Oracle Database Server.

Figure 2-24 Front View of Oracle Server X5-8 Oracle Database Server



The following image shows the back view of Oracle Server X5-8 Oracle Database Server.

Figure 2-25 Back View of Oracle Server X5-8 Oracle Database Server



2.18.12 Sun Server X4-8 Oracle Database Server Images

The following image shows the front view of Sun Server X4-8 Oracle Database Server.

Figure 2-26 Front View of Sun Server X4-8 Oracle Database Server


The following image shows the rear view of Sun Server X4-8 Oracle Database Server.

Figure 2-27 Rear View of Sun Server X4-8 Oracle Database Server


2.18.13 Sun Server X2-8 Oracle Database Server Images

The following image shows the front view of Sun Server X2-8 Oracle Database Server.

Figure 2-28 Front View of Sun Server X2-8 Oracle Database Server

  1. Power supplies.

  2. Hard disk drives. The top drives are, from left to right, XL4, XL5, XL6, and XL7. The lower drives are, from left to right, XL0, XL1, XL2, and XL3.

  3. CPU modules. The modules are, from bottom to top, BL0, BL1, BL2, and BL3.

The following image shows the rear view of Sun Server X2-8 Oracle Database Server.

Figure 2-29 Rear View of Sun Server X2-8 Oracle Database Server

  1. Fan modules.

  2. Network Express Module.

  3. InfiniBand EM (CX2) dual port PCI Express modules.

2.18.14 Sun Fire X4800 Oracle Database Server Images

The following image shows the front view of Sun Fire X4800 Oracle Database Server.

Figure 2-30 Front View of Sun Fire X4800 Oracle Database Server

  1. Power supplies.

  2. Hard disk drives. The top drives are, from left to right, XL4, XL5, XL6, and XL7. The lower drives are, from left to right, XL0, XL1, XL2, and XL3.

  3. CPU modules. The modules are, from bottom to top, BL0, BL1, BL2, and BL3.

The following image shows the rear view of Sun Fire X4800 Oracle Database Server.

Figure 2-31 Rear View of Sun Fire X4800 Oracle Database Server

  1. Fan modules.

  2. Network Express Module.

  3. InfiniBand EM (CX2) dual port PCI Express modules.