3.3 Oracle Exalogic

This section explains the features and tasks specific to Oracle EXAchk on Oracle Exalogic.

3.3.1 Scope and Supported Platforms for Running Oracle EXAchk on Oracle Exalogic

Oracle EXAchk is a health check tool that is designed to audit important configuration settings in an Oracle Exalogic Elastic Cloud machine.

Oracle EXAchk examines the following components:

  • Compute nodes

  • Storage appliance

  • InfiniBand fabric

  • Ethernet network

  • Oracle Exalogic Control vServers, relevant only in virtual configurations

  • Guest vServers, relevant only in virtual configurations

Oracle EXAchk audits the following configuration settings:

  • Hardware and firmware

  • Operating system kernel parameters

  • Operating system packages

You must run Oracle EXAchk for Oracle Exalogic in the following conditions:

  • After deploying the machine.

  • Before and after patching or upgrading the infrastructure.

  • Before and after making any changes in the system configuration.

  • Before and after any planned maintenance activity.

3.3.2 Prerequisites for Running Oracle EXAchk on Oracle Exalogic

Review the list of prerequisites.

Oracle recommends that you install Oracle EXAchk on the pre-existing share /export/common/general on the ZFS storage appliance on the Exalogic machine. You can then run Oracle EXAchk and access the Oracle EXAchk generated HTML reports from a compute node on which the /export/common/general share is mounted.

For Exalogic machines in a virtual configuration, Oracle recommends that you mount the /export/common/general share on the vServer that hosts the Enterprise Controller component of the Exalogic control stack, and run Oracle EXAchk from that vServer.

To install Oracle EXAchk on the /export/common/general share, you must complete the following steps:

  1. Enable NFS on the /export/common/general share.

  2. Mount the /export/common/general share.

3.3.2.1 Enable NFS on the /export/common/general Share

Before installing Oracle EXAchk on the pre-existing share /export/common/general, enable NFS share mode on the share.

  1. In a web browser, enter the IP address or host name of the storage node as follows:
    https://ipaddress:215
    

    or

    https://hostname:215
    
  2. Log in as root.
  3. Click Shares in the top navigation pane.
  4. Place your cursor over the row corresponding to the share /export/common/general.
  5. Click the Edit entry.

    Figure 3-1 Oracle Exalogic - Shares

  6. On the resulting page, select Protocols in the top navigation pane.
  7. In the NFS section, deselect Inherit from project, and click plus (+) located next to NFS Exceptions.

    Figure 3-2 Oracle Exalogic - Edit Protocols

  8. Edit the following in the NFS Exceptions section:

    Table 3-5 NFS Exceptions

    Element        Action/Description
    -----------    ------------------------------------------------------------
    TYPE           Select Network.
    ENTITY         Enter the IP address of the host that gains access to the
                   share. For example: 192.168.10.0/24
    ACCESS MODE    Select Read/write.
    CHARSET        Keep the default setting.
    ROOT ACCESS    Select the check box.

  9. Click Apply.
  10. Log out.

3.3.2.2 Mount the /export/common/general Share

In this section, compute node el01cn01 is used as the example of the host on which the /export/common/general share is mounted.

Note:

  • For an Oracle Exalogic machine in a virtual configuration running EECS 2.0.6, mount the /export/common/general share on the vServer that hosts the Enterprise Controller component of the Exalogic Control stack. Substitute the compute node el01cn01 in this procedure with the host name or IP address of that vServer.

    For an Oracle Exalogic machine running EECS 2.0.4 (virtual), if traffic from the eth-admin network cannot be routed to the EoIB-external-mgmt network, then Oracle EXAchk does not perform health checks for the switches and the storage appliance when you run it from the Enterprise Controller vServer. On such racks, to perform health checks on the physical components, you must also mount the /export/common/general share on a compute node.

  • In a virtual configuration, if you run Oracle EXAchk from a compute node, Oracle EXAchk does not perform health checks for the Exalogic Control components.

  1. Check if the /export/common/general share is already mounted at the /u01/common/general directory on compute node el01cn01.
    You can do this by logging in to el01cn01 as root and running the following command:
    # mount
    
    If the /export/common/general share is already mounted on the compute node, then the output of the mount command contains an entry like the following:
    192.168.10.97:/export/common/general on /u01/common/general ...
    

    In this example, 192.168.10.97 is the IP address of the storage node el01sn01.

    If you see the previous line in the output of the mount command, then skip step 2.

    If the output of the mount command does not contain the previous line, perform step 2.

  2. Mount the /export/common/general share at a directory on compute node el01cn01.
    1. Create the directory /u01/common/general to serve as the mount point on el01cn01 as follows:
      # mkdir -p /u01/common/general
      
    2. Depending on the operating system running on the host on which you want to mount the /export/common/general share, complete the following steps:

      Oracle Linux

      Edit the /etc/fstab file by using a text editor like vi, and add the following entry for the mount point that you just created:
      el01sn01-priv:/export/common/general /u01/common/general nfs4 rw,bg,hard,nointr,rsize=131072,wsize=131072,proto=tcp
      

      Oracle Solaris

      Edit the /etc/vfstab file by using a text editor like vi, and add the following entry for the mount point that you just created:
      el01sn01-priv:/export/common/general - /u01/common/general nfs - yes  rw,bg,hard,nointr,rsize=131072,wsize=131072,proto=tcp
      
    3. Save and close the file.
    4. Mount the volumes by running the following command:
      # mount -a
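
The check in step 1 can also be scripted. The following is a minimal sketch, assuming Linux-style mount output of the form shown in step 1 (server:/share on /mount_point ...). The function name share_is_mounted is hypothetical and is not part of Oracle EXAchk.

```shell
#!/bin/sh
# Sketch: decide whether the share is already mounted by scanning
# `mount`-style output on stdin. Assumes lines of the form
#   <server>:<share> on <mount_point> ...
# share_is_mounted is an illustrative name, not an Oracle-provided command.
share_is_mounted() {
    share=$1
    mount_point=$2
    awk -v share="$share" -v mp="$mount_point" '
        {
            # Field 1 is "server:/share", field 2 is "on", field 3 is the mount point.
            n = index($1, ":")
            if (n > 0 && substr($1, n + 1) == share && $2 == "on" && $3 == mp) {
                found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example use on a compute node (commented out so the sketch has no side effects):
# if mount | share_is_mounted /export/common/general /u01/common/general; then
#     echo "already mounted; skip step 2"
# else
#     echo "not mounted; perform step 2"
# fi
```

The function only reads text, so it can be tried against saved mount output before being used on a live node.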
      

3.3.3 Prerequisite for Viewing Oracle EXAchk HTML Report

Review the prerequisite for viewing the Oracle EXAchk HTML report in a web browser.

Enable access to the /export/common/general share through the HTTP/WebDAV Protocol

To enable access to a share through the HTTP/WebDAV protocol, complete the following steps:

  1. In a web browser, enter the IP address or host name of the storage node as follows:
    https://ipaddress:215
    

    or

    https://hostname:215
    
  2. Log in as root.
  3. Enable the HTTP service on the appliance, by doing the following:
    1. Click Configuration in the top navigation pane.

      Figure 3-3 Oracle Exalogic - Configuration

    2. Click HTTP under Data Services.

      Figure 3-4 Oracle Exalogic - Data Services

    3. Ensure that the Require client login check box is not selected.

      Figure 3-5 Oracle Exalogic - Client Login

    4. Click Apply.
      If the button is disabled, select and deselect the Require client login check box.

      Figure 3-6 Oracle Exalogic - Client Login

  4. Enable read-only HTTP access to the /export/common/general share by doing the following:
    1. Click Shares in the top navigation pane.
    2. Place your cursor over the row corresponding to the /export/common/general share.
    3. Click the Edit entry button (pencil icon) near the right edge of the row.
    4. On the resulting page, click Protocols in the navigation pane.
    5. Scroll down to the HTTP section.
    6. Deselect the Inherit from project check box.
    7. In the Share mode field, select Read only.

      Figure 3-7 Oracle Exalogic - Share Mode

    8. Click Apply.
  5. Log out.

3.3.4 Installing and Upgrading Oracle EXAchk on Oracle Exalogic

Follow these instructions to install and upgrade Oracle EXAchk on Oracle Exalogic.

3.3.4.1 Installing Oracle EXAchk on a Physical Oracle Exalogic Machine

Follow these instructions to install Oracle EXAchk on a physical Oracle Exalogic machine.

Install Oracle EXAchk in the /export/common/general share by completing the following steps:

  1. Ensure that the /export/common/general share is mounted on the compute node el01cn01.
  2. SSH to the compute node el01cn01.
  3. Create a sub-directory named exachk in the /u01/common/general/ directory to hold the Oracle EXAchk binaries:
    # mkdir /u01/common/general/exachk
    
  4. Go to the /u01/common/general/exachk directory.
  5. Download the exachk.zip file.
  6. Extract the contents of the exachk.zip file.
    # unzip exachk.zip
    
    The Oracle EXAchk tool is now available at the following location on compute node el01cn01:
    /u01/common/general/exachk/exachk
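
Steps 3 through 6 can be sketched as a small shell function. This is a minimal sketch under stated assumptions: exachk.zip has already been downloaded (step 5), the unzip utility is available, and install_exachk with its two arguments is a hypothetical helper rather than an Oracle-supplied command.

```shell
#!/bin/sh
# Sketch of steps 3-6. install_exachk and its arguments are hypothetical names.
install_exachk() {
    base_dir=$1      # for example /u01/common/general
    zip_file=$2      # path to the already downloaded exachk.zip

    if [ ! -f "$zip_file" ]; then
        echo "zip file not found: $zip_file" >&2
        return 1
    fi
    # Step 3: create the exachk subdirectory.
    mkdir -p "$base_dir/exachk" || return 1
    # Step 6: extract quietly (-q) into the target directory (-d).
    unzip -q "$zip_file" -d "$base_dir/exachk" || return 1
    # After extraction, the tool is expected at <base_dir>/exachk/exachk.
    [ -f "$base_dir/exachk/exachk" ]
}

# Example (commented out so the sketch has no side effects):
# install_exachk /u01/common/general /tmp/exachk.zip
```

The function stops before creating any directory when the zip file is missing, so a failed download does not leave an empty installation directory behind.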
    

3.3.4.2 Installing Oracle EXAchk on a Virtual Oracle Exalogic Machine

Follow these instructions to install Oracle EXAchk on a virtual Oracle Exalogic machine.

Install Oracle EXAchk in the /export/common/general share by completing the following steps:

  1. Ensure that the /export/common/general share is mounted on the vServer that hosts the Enterprise Controller.

    Note:

    For an Exalogic machine running EECS 2.0.4 (virtual), if traffic from the eth-admin network cannot be routed to the EoIB-external-mgmt network, then Oracle EXAchk does not perform health checks for the switches and the storage appliance when you run it from the Enterprise Controller vServer. On such racks, to perform health checks on the physical components, you must also mount the /export/common/general share on a compute node.

  2. SSH to the vServer.
  3. Create a subdirectory named exachk in /u01/common/general/ to hold the EXAchk binaries:
    # mkdir /u01/common/general/exachk
    

    Note:

    If the vServer is down or otherwise inaccessible, then you can run Oracle EXAchk from a compute node. However, in this case, the health checks are not performed for the Exalogic Control components.
  4. Go to the /u01/common/general/exachk directory.
  5. Download the exachk.zip file.
  6. Extract the contents of the exachk.zip file.
    # unzip exachk.zip
    
    The Oracle EXAchk tool is now available at the following location on the vServer:
    /u01/common/general/exachk/exachk
    

3.3.4.3 Upgrading Oracle EXAchk on Oracle Exalogic

Follow these instructions to upgrade Oracle EXAchk on Oracle Exalogic.

  1. Back up the directory containing the existing Oracle EXAchk binaries by moving it to a new location.
    For example, if the Oracle EXAchk binaries are currently in the directory /u01/common/general/exachk, then move them to a directory named exachk_05302012 by running the following commands:
    # cd /u01/common/general
    # mv exachk exachk_05302012
    

    In this example, the date when Oracle EXAchk is upgraded (05302012) is used to uniquely identify the backup directory. Pick any unique naming format, such as a combination of the backup date and the release number, and use it consistently.

  2. Create the exachk directory afresh.
    # mkdir /u01/common/general/exachk
    
  3. Install the latest version of Oracle EXAchk.
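
Steps 1 and 2 can be combined into one guarded operation. The following is a minimal sketch; backup_exachk is a hypothetical helper, and the tag argument stands for whatever unique suffix you choose (the date format in the commented example is only one possibility).

```shell
#!/bin/sh
# Sketch of steps 1 and 2: move the existing directory aside, then re-create it.
# backup_exachk is an illustrative name, not an Oracle-provided command.
backup_exachk() {
    base_dir=$1   # directory that contains exachk, for example /u01/common/general
    tag=$2        # unique suffix, for example 05302012

    if [ ! -d "$base_dir/exachk" ]; then
        echo "nothing to back up in $base_dir" >&2
        return 1
    fi
    # Refuse to overwrite an earlier backup that used the same tag.
    if [ -e "$base_dir/exachk_$tag" ]; then
        echo "backup already exists: $base_dir/exachk_$tag" >&2
        return 1
    fi
    mv "$base_dir/exachk" "$base_dir/exachk_$tag" || return 1
    mkdir "$base_dir/exachk"
}

# Example (commented out): tag the backup with today's date as MMDDYYYY.
# backup_exachk /u01/common/general "$(date +%m%d%Y)"
```

Refusing to reuse a tag keeps an earlier backup from being silently replaced when the procedure is repeated on the same day.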

3.3.5 Oracle EXAchk on Oracle Exalogic Usage

For optimum performance of the Oracle EXAchk tool, Oracle recommends that you follow these guidelines.

  • Oracle EXAchk is a minimal impact tool. However, Oracle recommends that you run Oracle EXAchk when the load on the system is low. The runtime of Oracle EXAchk depends on the number of nodes to check, CPU load, network latency, and so on.

  • Do not run the scripts in the Oracle EXAchk directory unless specifically documented.

  • To avoid problems while running the tool from terminal sessions on a workstation or laptop, connect to the Exalogic machine and then run Oracle EXAchk by using VNC. Even if a network interruption occurs, Oracle EXAchk continues to run.

  • Run Oracle EXAchk as root.

3.3.5.1 Performing Health Checks for Oracle Exalogic Infrastructure

Perform health checks in a virtual or physical rack.

3.3.5.1.1 Prerequisites for Running Health Checks on Oracle Exalogic Infrastructure

The term infrastructure is used here to indicate the compute nodes, switches, storage appliance, and also the Exalogic Control stack if the machine is in a virtual configuration.

Before running Oracle EXAchk for the Oracle Exalogic infrastructure components, ensure that you meet the following prerequisites:

  • Ensure that Oracle EXAchk is installed as described in Installing Oracle EXAchk.

  • Before running Oracle EXAchk for the first time, make a note of the short names of the storage nodes and switches: el01sn01, el01sw-ib01, and so on. Oracle EXAchk prompts you for these names at the start of the health check process. This is a one time prompt. Oracle EXAchk stores the names you provide, and uses the stored names for subsequent runs.

3.3.5.1.2 Running Oracle EXAchk for Physical Racks

Perform health checks for all the infrastructure components in an Oracle Exalogic machine in a physical Linux or Solaris configuration.

  1. SSH as root to the compute node on which you installed Oracle EXAchk.
  2. Go to the directory where you have installed Oracle EXAchk.
    # cd /u01/common/general/exachk
    
  3. Run the following command:
    # ./exachk
    

When running Oracle EXAchk for the first time, the tool:

  • Detects the size of the Exalogic rack

  • Prompts for the host name or IP address of the switch and storage node

For information about overriding the IP addresses and host names set during the first run, see Overriding Discovered Component Addresses.

3.3.5.1.3 Running Oracle EXAchk for Virtual Racks

Perform health checks for all the infrastructure components in an Oracle Exalogic machine in a virtual configuration.

  1. SSH as root to the vServer that hosts the Enterprise Controller.
  2. Go to the directory where you have installed Oracle EXAchk.
    # cd /u01/common/general/exachk
    
  3. Run the following command:
    # ./exachk
    

Oracle EXAchk automatically discovers the IP addresses or host names of all the components in the machine, and starts performing the health checks.


For an Exalogic machine running EECS 2.0.4 (virtual), if traffic from the eth-admin network is not routed to the EoIB-external-mgmt network when you run Oracle EXAchk from the Enterprise Controller vServer, Oracle EXAchk does not run health checks for the switches and storage heads.

On such racks, do the following to perform health checks on all the components:

  1. Perform health checks for the Oracle Exalogic Control components:
    1. SSH as root to the Enterprise Controller vServer.
    2. Go to the directory where you have installed Oracle EXAchk.
      # cd /u01/common/general/exachk
      
    3. Run the following command:
      # ./exachk -profile control_VM
      

      Oracle EXAchk reports that all the checks on the compute nodes passed. However, this command does not perform any health checks on the compute nodes, the storage appliance, or the switches.

  2. Perform health checks for the physical components, such as compute nodes, storage appliance, and switches:
    1. SSH as root to the compute node on which you installed Oracle EXAchk.
    2. Ensure that passwordless SSH to the Oracle VM Manager CLI shell is enabled.
    3. Go to the directory where you have installed Oracle EXAchk.
      # cd /u01/common/general/exachk
      
    4. Run the following command:
      # ./exachk -profile el_extensive
      

3.3.5.1.4 Running Oracle EXAchk for Hybrid Racks

Perform health checks for all the infrastructure components in an Oracle Exalogic machine in a hybrid configuration, that is, a machine on which half the nodes are running Oracle VM Server and the other half are on Oracle Linux.

  1. SSH as root to the vServer that hosts the Enterprise Controller component of the Exalogic Control stack.
  2. Go to the directory where you have installed Oracle EXAchk.
    # cd /u01/common/general/exachk
    
  3. Run the following command:
    ./exachk -hybrid -phy physical_node_1[,physical_node_2,...]
    

In this command, physical_node_1, physical_node_2, and so on, are the eth-admin IP addresses of the compute nodes running Oracle Linux.

The -phy physical_node_1[,physical_node_2,...] option must be specified only the first time you run Oracle EXAchk with the -hybrid option. Oracle EXAchk stores the host names in the exachk_exalogic.conf file. For subsequent runs, you can run Oracle EXAchk without specifying the -phy option. Oracle EXAchk uses the host names stored in the exachk_exalogic.conf file.
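
The -phy value is a single comma-separated argument with no spaces. The following is a minimal sketch that assembles the command line from a list of addresses. The helper names and the IP addresses in the comment are hypothetical; the sketch prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: build the comma-separated -phy argument for a first hybrid run.
# join_with_commas and hybrid_command are illustrative helper names.

# Joins all arguments with commas. The subshell keeps the IFS change local.
join_with_commas() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Prints (does not run) the first-run command for a hybrid rack.
hybrid_command() {
    printf './exachk -hybrid -phy %s\n' "$(join_with_commas "$@")"
}

# Example with made-up eth-admin addresses:
# hybrid_command 10.0.0.11 10.0.0.12
```

Printing the command first makes it easy to review the node list before the values are stored in the exachk_exalogic.conf file.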

3.3.5.2 Performing Health Checks for Guest vServers

Run Oracle EXAchk to perform health checks for guest vServers.

3.3.5.2.1 Prerequisites for Running Health Checks on Guest vServers

Before running Oracle EXAchk on guest vServers, ensure that you meet the following prerequisites.

  • Install Oracle EXAchk as described in Installing Oracle EXAchk.

  • Install IaaS CLI and API on the vServer that hosts the Enterprise Controller. Note that the IaaS CLI and API are pre-installed on the Enterprise Controller vServer in EECS 2.0.4.

To verify this prerequisite, check whether the /opt/oracle/iaas/cli and /opt/oracle/iaas/api directories exist on the vServer. If the directories exist, then the IaaS CLI and API are installed.
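
The directory check described above can be sketched as follows. The function name iaas_client_installed and its optional prefix argument are hypothetical; the prefix exists only so that the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: succeed only if both IaaS directories exist.
# iaas_client_installed is an illustrative name; the optional first argument
# is a path prefix for trial runs and is empty on a real vServer.
iaas_client_installed() {
    root=${1:-}
    [ -d "$root/opt/oracle/iaas/cli" ] && [ -d "$root/opt/oracle/iaas/api" ]
}

# Example (commented out):
# if iaas_client_installed; then
#     echo "IaaS CLI and API are installed"
# else
#     echo "Install the IaaS CLI and API first"
# fi
```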

3.3.5.2.2 Installing IaaS CLI and API

  1. Go to https://edelivery.oracle.com.
    1. Sign in by using your Oracle account.
  2. Read and accept the Oracle Software Delivery Cloud Trial License Agreement and the Export Restrictions.
    1. Click Continue.
  3. In the Select a Product Pack field, select Oracle Fusion Middleware.
    1. In the Platform field, select Linux x86-64.
    2. Click Go.
  4. In the results displayed, select Oracle Exalogic Elastic Cloud Software 11g Media Pack, and click Continue.
  5. Look for Oracle Exalogic version IaaS Client for Exalogic Linux x86-64 (64-bit), and download the appropriate version (2.0.4.0.0, 2.0.6.0.0, or 2.0.6.0.1), depending on the EECS release installed on the Exalogic machine.
  6. Unzip the downloaded file.
  7. Install both the RPMs by running the following command in the directory in which you unzipped the RPMs:
    rpm -i *.rpm
    

3.3.5.2.3 Additional Prerequisites for STIG-hardened vServers

You can harden guest vServers using the STIGfix tool. The STIGfix tool is packaged as part of the Exalogic Lifecycle Toolkit.

Download the toolkit installer and tar bundle.

Refer to My Oracle Support Note for toolkit install instructions.

To run Oracle EXAchk on STIG-hardened vServers, you must perform the following prerequisites:

  • Run Oracle EXAchk on STIG-hardened vServers separately from other guest vServers.

  • The vServer that hosts the Enterprise Controller and the STIG-hardened guest vServers must have the same user with sudo privileges.

    You can create these users by doing the following.

    Create the account on the vServer hosting Enterprise Controller as follows:

  1. Log in to the vServer hosting Enterprise Controller as root.
  2. Run the following commands to create the account ELAdmin:
    # useradd -d /home/ELAdmin -s /bin/bash -m ELAdmin
    # echo "ELAdmin:<password>"|chpasswd
    # echo "PATH=$PATH.:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin" >>/home/ELAdmin/.bashrc
    # usermod -a -G oinstall ELAdmin
    
    1. Replace password with a password of your choice.
  3. Run the visudo command.
  4. Under ## Allows people in group wheel to run all commands, add the following line:
    %ELAdmin ALL=(ALL) ALL
    
  5. Under ## Same thing without a password, add the following line:
    %ELAdmin ALL=(ALL) NOPASSWD: ALL
    
  6. Save the file.

Create the same ELAdmin account on the STIG-hardened guest vServer as follows:

  1. Log in to the vServer that is STIG-hardened.
  2. Switch to the root user by running the following command:
    su root
    
  3. Run the following commands to create the account ELAdmin:
    # useradd -d /home/ELAdmin -s /bin/bash -m ELAdmin
    # echo "ELAdmin:<password>"|chpasswd
    # echo "PATH=$PATH.:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin" >>/home/ELAdmin/.bashrc
    
    1. Replace password with a password of your choice.
  4. Run the visudo command.
  5. Under ## Allows people in group wheel to run all commands, add the following line:
    %ELAdmin ALL=(ALL) ALL
    
  6. Under ## Same thing without a password, add the following line:
    %ELAdmin ALL=(ALL) NOPASSWD: ALL
    
  7. Save the file.

3.3.5.2.4 Running Oracle EXAchk for vServers That are Not STIG-hardened

Perform health checks for all the guest vServers that are not STIG-hardened, in a vDC on an Oracle Exalogic machine.

  1. SSH as root to the vServer that hosts the Enterprise Controller.
  2. Go to the directory in which you installed Oracle EXAchk.
    # cd /u01/common/general/exachk
    
  3. Create a set of .out files, one for each Cloud User.

    Name the files as, for example, guest_vm_ip_user.out, where user is a Cloud User.

    In each .out file, specify the IP addresses of the guest vServers created by the Cloud User.

    The guest_vm_ip_user.out file has the following format:
    ip_address_of_guest_vserver1
    ip_address_of_guest_vserver2
    ip_address_of_guest_vserver3
    
  4. Run Oracle EXAchk with the -vmguest option, and specify one or more guest_vm_ip_user.out files as arguments depending upon the users for which you want to perform health checks for guest vServers.
    # ./exachk -vmguest guest_vm_ip_user-1.out[,guest_vm_ip_user-2.out,...]
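
Steps 3 and 4 can be sketched as two small helpers: one writes a .out file for a Cloud User, and the other assembles the -vmguest argument. The helper names, Cloud User names, and IP addresses are hypothetical, and the sketch prints the command instead of running it.

```shell
#!/bin/sh
# Sketch of steps 3 and 4. write_guest_list and vmguest_command are
# illustrative helper names, not part of Oracle EXAchk.

# Writes guest_vm_ip_<user>.out in the current directory, one address per
# line, and prints the name of the file it wrote.
write_guest_list() {
    cloud_user=$1
    shift
    out_file="guest_vm_ip_${cloud_user}.out"
    : > "$out_file" || return 1
    for ip in "$@"; do
        printf '%s\n' "$ip" >> "$out_file"
    done
    printf '%s\n' "$out_file"
}

# Prints (does not run) the -vmguest command for the given .out files.
vmguest_command() {
    files=
    for f in "$@"; do
        files=${files:+$files,}$f   # prefix a comma for every file after the first
    done
    printf './exachk -vmguest %s\n' "$files"
}

# Example with made-up users and addresses (run from the exachk directory):
# write_guest_list user-1 10.0.0.5 10.0.0.6
# write_guest_list user-2 10.0.0.7
# vmguest_command guest_vm_ip_user-1.out guest_vm_ip_user-2.out
```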
    

3.3.5.2.5 Running Oracle EXAchk for STIG-hardened vServers

Run Oracle EXAchk for STIG-hardened vServers.

  1. Log in as root on the vServer that hosts the Enterprise Controller.
  2. Switch to the ELAdmin user by running the following command:
    su - ELAdmin
    

    Note:

    When running Oracle EXAchk on STIG-hardened vServers, Oracle recommends using only the ELAdmin user that you created earlier.

    Create the guest_vm_ip_user.out file manually. The guest_vm_ip_user.out file has the following format:

    ip_address_of_stig_hardened_guest_vserver1
    ip_address_of_stig_hardened_guest_vserver2
    ip_address_of_stig_hardened_guest_vserver3
    
  3. Run Oracle EXAchk with the -vmguest option, and specify one or more guest_vm_ip_user.out files as arguments, depending on the users for which you want to perform health checks for guest vServers:
    # ./exachk -vmguest guest_vm_ip_user-1.out[,guest_vm_ip_user-2.out,...]
    

3.3.5.3 About the Oracle EXAchk Health Check Process

Review the Oracle EXAchk start up sequence of events.

  1. At the start of the health check process, Oracle EXAchk prompts you for the names of the storage nodes and switches.

    At the prompt, enter the names or IP addresses of the storage nodes and switches. This is a one time process. Oracle EXAchk remembers these values and uses them for the subsequent health checks.

    $ ./exachk
    
    Could not find infiniband gateway switch names from env or configuration file. Please enter the first gateway infiniband switch name : el01sw-ib02
    Could not find storage node names from env or configuration file. Please enter the first storage server : el01sn01
    
    Checking ssh user equivalency settings on all nodes in cluster
    
    Node el01cn02 is configured for ssh user equivalency for root user
    
    Node el01cn03 is configured for ssh user equivalency for root user
    
    Node el01cn04 is configured for ssh user equivalency for root user
    
    Node el01cn05 is configured for ssh user equivalency for root user
    
    Node el01cn06 is configured for ssh user equivalency for root user 
    

    Note:

    Enter the host names or IP addresses for the nodes, in the sequence in which they are arranged on the machine.

  2. The health check tool checks the SSH user equivalency settings on all the nodes in the cluster.

    Oracle EXAchk is a non-intrusive health check tool. Therefore, it does not change anything in the environment. The tool verifies the SSH user equivalency settings, assuming that it is configured on all the compute nodes on the system:
    • If the tool determines that the user equivalence is not established on the nodes, it provides you an option to set the SSH user equivalency either temporarily or permanently.

    • If you choose to set SSH user equivalence temporarily, then Oracle EXAchk does this during the health check. However, after the completion of the health check, Oracle EXAchk returns the system to the state in which it found it.

    When Oracle EXAchk prompts you, specify your preference and enter the password for the nodes for which you are prompted. The default preference, 1, allows you to enter the root password once for all the nodes on each host of the Oracle Exalogic machine.

    Using cached file /root/exachk/o_ibswitches.out for gateway infiniband switches list ....
    
    Using cached file /root/exachk/o_storage.out for storage nodes list ....
    
    Checking ssh user equivalency settings on all nodes in cluster
    
    Node 0 is configured for ssh user equivalency for root user
    
    Node 0 is configured for ssh user equivalency for root user
    
    root user equivalence is not setup between 2 and STORAGE SERVER.
    
    1. Enter 1 if you will enter root password for each STORAGE SERVER when prompted.
    
    2. Enter 2 to exit and configure root user equivalence manually and re-run exachk.
    
    3. Enter 3 to skip checking best practices on STORAGE SERVER.
    
    Please indicate your selection from one of the above options[1-3][1]:- 1-3
    
    Is root password same on all STORAGE SERVER?[y/n][y]
    

    On confirming the option and entering the credentials to proceed, Oracle EXAchk creates various output files, log files, and collection files for collecting the data required for the health check.

    Preparing to run root privileged commands on INFINIBAND SWITCH el01sw-ib04.
    
    root@el01sw-ib04's password: 
    Collecting -  Environment Test
    Collecting -  Ethernet over infiniband data and control SL
    Collecting -  Free Memory
    Collecting -  Gateway Configuration
    Collecting -  Infiniband status
    Collecting -  List Link Up
    Collecting -  Localhost Configuration in /etc/hosts
    Collecting -  VNICS
    Collecting -  Version
    Collecting -  configvalid
    Collecting -  opensm
    
    Preparing to run root privileged commands on INFINIBAND SWITCH el01sw-ib05.
    
    root@el01sw-ib05's password: 
    Collecting -  Environment Test
    Collecting -  Ethernet over infiniband data and control SL
    Collecting -  Free Memory
    Collecting -  Gateway Configuration
    Collecting -  Infiniband status
    Collecting -  List Link Up
    Collecting -  Localhost Configuration in /etc/hosts
    Collecting -  VNICS
    Collecting -  Version
    Collecting -  configvalid
    Collecting -  opensm 
    
  3. Oracle EXAchk checks the status of the components of the Oracle Exalogic stack, such as compute nodes, storage nodes, and InfiniBand switches. Depending on the status of each component, the tool runs the appropriate collections and audit checks.

    ==================================================================
                                            Node name - 0
    ==================================================================
            WARNING =>   NTP is not synchronized correctly.
            INFO =>      One or more NFS Mount Points don't sue the current recommended NFSv4.
            WARNING =>   One or more NFS Mount Points uses incorrect rsize or wsize.
            WARNING =>   Virtual Memory is not tuned to the recommended configuration.
            WARNING =>   Ypbind is not configured correctly.
            WARNING =>   DNS service is not configured correctly.
            WARNING =>   IP Configuration for eth0 and bond0 are not configured correctly.
            INFO =>              EoIB Setup is not set up.
            INFO =>              Please verify BIOS Setting. See the Action / Repair section for instructions.
            WARNING =>   Lock Daemon Configuration is not configured correctly.
    ==================================================================
                                            Node name - 0
    ==================================================================
            WARNING =>   NTP is not synchronized correctly.
            INFO =>      One or more NFS Mount Points don't sue the current recommended NFSv4.
            WARNING =>   One or more NFS Mount Points uses incorrect rsize or wsize.
            WARNING =>   Virtual Memory is not tuned to the recommended configuration.
            WARNING =>   Ypbind is not configured correctly.
            WARNING =>   DNS service is not configured correctly.
            WARNING =>   IP Configuration for eth0 and bond0 are not configured correctly.
            INFO =>              EoIB Setup is not set up.
            INFO =>              Please verify BIOS Setting. See the Action / Repair section for instructions.
            WARNING =>   Lock Daemon Configuration is not configured correctly.
    
  4. Oracle EXAchk runs in the background monitoring the progress of the command run. If any of the commands times out, Oracle EXAchk either skips or terminates that command so that the process continues. Oracle EXAchk logs such cases in the log files.

    If Oracle EXAchk stops running for any reason, it cannot resume or restart automatically. You must start Oracle EXAchk afresh. However, before running Oracle EXAchk again, complete the following steps:
    • Verify whether the previous Oracle EXAchk process has been terminated, by running the following command:
      # ps -ef | grep exachk
      
      If the Oracle EXAchk process is still running, terminate it by running the following command:
      # kill pid
      

      In this command, pid is the process ID of the Oracle EXAchk process that you want to terminate.

    • Verify whether /tmp/.exachk/, the temporary directory generated by Oracle EXAchk during the previous run, has been deleted. If the directory still exists, delete it.

  5. When Oracle EXAchk completes the health check, it produces an HTML report and a zip file.
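
The two checks in step 4 that precede a fresh run can be sketched as follows. The function names are hypothetical, and the sketch only reports what it finds; terminating a process or deleting the directory is left to the operator.

```shell
#!/bin/sh
# Sketch: pre-flight checks before starting Oracle EXAchk afresh.
# list_exachk_pids and stale_tmp_dir_present are illustrative names.

# Prints the PIDs of running processes whose command line mentions exachk.
# The [e] bracket keeps grep from matching its own command line.
list_exachk_pids() {
    ps -ef | grep '[e]xachk' | awk '{ print $2 }'
}

# Succeeds if the temporary directory from a previous run is still present.
# The optional argument overrides the default path for trial runs.
stale_tmp_dir_present() {
    [ -d "${1:-/tmp/.exachk}" ]
}

# Example report (commented out so the sketch has no side effects):
# pids=$(list_exachk_pids)
# [ -n "$pids" ] && echo "exachk still running, PIDs: $pids"
# stale_tmp_dir_present && echo "/tmp/.exachk still exists; delete it before re-running"
```

Note that list_exachk_pids matches any command line containing the string exachk, so review the reported PIDs before terminating anything.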

3.3.5.4 Running Oracle EXAchk in Silent Mode

When you run Oracle EXAchk in silent mode, it does not perform health checks for storage nodes and InfiniBand switches.

To run root privilege checks, Oracle EXAchk uses the root_exachk.sh script.

Before running Oracle EXAchk in silent mode, ensure that you meet the following prerequisites:

  1. Configure SSH user equivalence for the root user from the compute node on which Oracle EXAchk is staged to all the other compute nodes on which you plan to run the health check tool.

    To verify SSH user equivalence, log in by using the Oracle software owner credentials and run the SSH command.

    For example:
    $ ssh -o NumberOfPasswordPrompts=0 -o StrictHostKeyChecking=no -l oracle el01cn01 "echo \"oracle user equivalence is setup correctly\""
    

    In this example, oracle is the Oracle software owner, and el01cn01 is the compute node host name.

    If SSH user equivalence is not properly configured on the compute nodes, the tool displays the following message:
    Permission denied (publickey,gssapi-with-mic,password)
    

    See the Upgrading Multiple Nodes Simultaneously section in My Oracle Support Note 1446396.1 for more information about configuring passwordless login.

  2. (Required only for the -s option) Add the following line to the sudoers file on each compute node by using the visudo command:
    oracle ALL=(root) NOPASSWD:/tmp/root_exachk.sh
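
The user-equivalence check in prerequisite 1 can be repeated for several compute nodes in one pass. The following is a minimal sketch under stated assumptions: check_user_equivalence is a hypothetical helper, SSH_CMD is a hypothetical variable that only exists so that the ssh binary can be substituted, and the remote command is simply true instead of the echo used in the example above.

```shell
# Hypothetical helper: try a non-interactive SSH login as the given user on
# each node and report the result. Returns non-zero if any node fails.
check_user_equivalence() {
    user=$1
    shift
    failed=0
    for node in "$@"; do
        # Same SSH options as the verification example in the text.
        if ${SSH_CMD:-ssh} -o NumberOfPasswordPrompts=0 \
               -o StrictHostKeyChecking=no \
               -l "$user" "$node" true >/dev/null 2>&1; then
            echo "$node: OK"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Example (node names are placeholders):
#   check_user_equivalence oracle el01cn01 el01cn02
```

A FAILED line corresponds to the Permission denied case described above and indicates that user equivalence still needs to be configured for that node.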
    

3.3.5.5 Overriding Discovered Component Addresses

In a physical environment, the component IP addresses or host names are determined in the first run based on user input. In a virtual environment, Oracle EXAchk has an in-built mechanism to automatically discover the IP addresses or host names of all the components. These features are designed to minimize the need for end-user input.

However, if the components were entered incorrectly during the first run or the auto-discovery mechanism fails to identify the components correctly, then do the following to override the values:
  • If you are running Oracle EXAchk from a compute node, then do the following:

    • To override the names of the IB switches, edit or create the file o_ibswitches.out in the directory that contains the exachk binary. The file should contain a list of host names of the NM2-GW switches, each on a separate line.

    • To override the names of the storage components, edit or create the file o_storage.out in the directory that contains the exachk binary. The file should contain a list of host names of the storage heads, each on a separate line.

    • To override the names of the compute nodes, add the environment variable named RAT_CLUSTERNODES, and specify a list of the host names separated by a space, as the value of the variable.
      export RAT_CLUSTERNODES="el01cn01 el01cn02 el01cn03 el01cn04"
      
  • If you are running Oracle EXAchk from the vServer that hosts the Enterprise Controller component of the Exalogic Control stack, you must use a file named exachk_exalogic.conf to define the names of the components.

    The exachk.zip file contains the following templates for exachk_exalogic.conf in the templates subdirectory:

    • exachk_exalogic.conf.tmpl_full

    • exachk_exalogic.conf.tmpl_half

    • exachk_exalogic.conf.tmpl_quarter

    • exachk_exalogic.conf.tmpl_eight

    Copy the template that corresponds to the size of your Exalogic machine to the directory that contains the exachk binary, and rename the template file to exachk_exalogic.conf.

    Modify exachk_exalogic.conf to match your IP address schema.

    Note:

    Oracle recommends that you create a copy of the exachk_exalogic.conf file that Oracle EXAchk generates the first time when the system is fully populated and functional, so that you can use the file later.
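
For the compute-node case described above, the override files are plain lists with one host name per line. The following is a minimal sketch for creating them: write_host_list is a hypothetical helper, EXACHK_DIR is an assumed variable for the directory that contains the exachk binary, and the host names other than those used elsewhere in this section are placeholders.

```shell
# Hypothetical helper: write each remaining argument on its own line to the
# file named by the first argument, replacing any existing content.
write_host_list() {
    file=$1
    shift
    printf '%s\n' "$@" > "$file"
}

# Assumed variable: the directory that contains the exachk binary.
# Defaults to the current directory.
EXACHK_DIR=${EXACHK_DIR:-.}

# Example (host names are placeholders for your own):
#   write_host_list "$EXACHK_DIR/o_ibswitches.out" el01sw-ib02 el01sw-ib03
#   write_host_list "$EXACHK_DIR/o_storage.out"    el01sn01 el01sn02
#   export RAT_CLUSTERNODES="el01cn01 el01cn02 el01cn03 el01cn04"
```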

3.3.5.6 Setting Environment Variables for Local Issues

Oracle EXAchk attempts to derive all the data it needs from the environment in which it is run.

However, at times, the tool does not work as expected due to local system variations. In such cases, you can use local environment variables to override the default behavior of Oracle EXAchk.

Table 3-6 Oracle EXAchk Environment Variables

Environment Variable | Description | Example

RAT_OS

Enables the utility to verify the platform information.

For a 64-bit Oracle Enterprise Linux 5 machine, with x86 architecture, use the following command to set the RAT_OS variable:
export RAT_OS=LINUXX8664OELRHEL5
For a 64-bit Oracle Solaris 11 machine, with x86 architecture, use the following command to set the RAT_OS variable:
export RAT_OS=SOLARISX866411

RAT_SSHELL

Redirects Oracle EXAchk to the default secure shell location.

export RAT_SSHELL="/usr/bin/ssh -q"

RAT_SCOPY

Redirects Oracle EXAchk to the default secure copy (SCP) location.

export RAT_SCOPY="/usr/bin/scp -q"

RAT_LOCALONLY

If set to 1, directs Oracle EXAchk to perform health checks on only the compute node from which Oracle EXAchk is run; that is, Oracle EXAchk skips the checks for the storage nodes, the switches, and all the compute nodes other than the one from which it is run.

To direct Oracle EXAchk to perform health checks on only the compute node from which Oracle EXAchk is run, use the following command:
export RAT_LOCALONLY=1

RAT_CELLS

Directs Oracle EXAchk to run checks on one of the two storage nodes.

If the names of the storage nodes are non-standard, then edit the o_storage.out file that is located in the same directory where Oracle EXAchk is installed, and specify the name of the storage node.

To direct Oracle EXAchk to run checks on the second storage node, use the following command:
export RAT_CELLS="el01sn02"

RAT_SWITCHES

Directs Oracle EXAchk to run checks on subsets of the InfiniBand switches, in addition to the default checks on the InfiniBand switches.

If the names of the switches are non-standard, then edit the o_ibswitches.out file that is located in the same directory where Oracle EXAchk is installed, and specify the names of the switches.

To direct Oracle EXAchk to run on the InfiniBand switch el01sw-ib02 and its subsets, use the following command:
export RAT_IBSWITCHES="el01sw-ib02"

RAT_CLUSTERNODES

Directs Oracle EXAchk to run checks on specific nodes.

On a quarter rack, which has eight compute nodes, use the following command to list the compute nodes on which the health check needs to be performed:
export RAT_CLUSTERNODES="el01cn01 el01cn02 el01cn03 el01cn04 el01cn05 el01cn06 el01cn07 el01cn08"

RAT_ELRACKTYPE

Indicates whether the machine is an eighth rack (0), quarter rack (1), half rack (2), or full rack (3).

To specify that the system is a full rack, use the following command:
export RAT_ELRACKTYPE="3"

Note:

In a virtual configuration, when running Oracle EXAchk from the vServer that hosts the Enterprise Controller component of the Exalogic Control stack, do not use the RAT_CELLS, RAT_SWITCHES, and RAT_CLUSTERNODES variables to override the storage node, switches, and compute nodes for which Oracle EXAchk should perform health checks. Instead, use the exachk_exalogic.conf file.
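
The numeric codes for RAT_ELRACKTYPE in Table 3-6 are easy to mistype. The following is a minimal sketch that maps a rack size name to its code; rack_type_code is a hypothetical helper, and the size names it accepts (eighth, quarter, half, full) are chosen here for illustration.

```shell
# Hypothetical helper: translate a rack size name into the RAT_ELRACKTYPE
# code listed in Table 3-6 (eighth=0, quarter=1, half=2, full=3).
rack_type_code() {
    case "$1" in
        eighth)  echo 0 ;;
        quarter) echo 1 ;;
        half)    echo 2 ;;
        full)    echo 3 ;;
        *)       echo "unknown rack size: $1" >&2; return 1 ;;
    esac
}

# Declare the machine as a full rack before starting Oracle EXAchk.
RAT_ELRACKTYPE=$(rack_type_code full)
export RAT_ELRACKTYPE
```

Variables set this way apply only to the current shell session and to the commands started from it.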

3.3.5.7 External ZFS Storage Appliance

For Exalogic systems, Oracle EXAchk can also run health checks on external ZFS Storage Appliances. The results of these checks are displayed in the External ZFS Storage Appliance section of the report.

Figure 3-8 External ZFS Storage Appliance


3.3.6 Oracle EXAchk on Oracle Exalogic Output

Identify the checks that require immediate remediation and the checks that require further investigation because they might cause performance or stability issues.

Reading and Interpreting the Oracle EXAchk HTML Report

You can view the Oracle EXAchk HTML report in a browser by using an HTTP URL as shown in the following example:
http://el01sn01/export/common/general/exachk/exachk_el01cn01_053112_101705/exachk_el01cn01_053112_101705.html

In this example, el01sn01 is the name of the storage node, el01cn01 is the name of the compute node on which the share is mounted, and 053112_101705 is the date and time stamp for the report.
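
The report URL follows a regular pattern, so it can be derived from the storage node name and the collection name. The following is a minimal sketch; report_url is a hypothetical helper, and the /export/common/general/exachk path is taken from the example URL above and may differ if Oracle EXAchk is installed elsewhere.

```shell
# Hypothetical helper: build the report URL from the storage node name and
# the collection name, following the layout of the example URL above.
report_url() {
    storage_node=$1
    collection=$2
    printf 'http://%s/export/common/general/exachk/%s/%s.html\n' \
        "$storage_node" "$collection" "$collection"
}

# Reproduces the example URL shown above.
report_url el01sn01 exachk_el01cn01_053112_101705
```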

The following is specific to Oracle EXAchk on Oracle Exalogic:

Table 3-7 Oracle EXAchk on Oracle Exalogic Message Definitions

Message Status | Description or Possible Impact | Action to be Taken

FAIL

Shows checks that did not pass due to issues.

Address the issue immediately.

WARNING

Shows checks that might cause performance or stability issues if not addressed.

Investigate the issue further.

ERROR

Shows errors in system components.

Take corrective measures, and restart Oracle EXAchk.

INFO

Indicates information about the system.

Read the information displayed in these checks, and follow the instructions provided, if any.

System-Wide Firmware and Software Versions

This section lists the firmware and software versions of all the components for which the health check was performed.

Skipped Nodes

This section lists components for which Oracle EXAchk did not perform any health check. Skipped components are those that, typically, Oracle EXAchk cannot access.

The following table lists the typical situations when Oracle EXAchk skips a component and the solutions for each situation:

Table 3-8 Oracle EXAchk on Oracle Exalogic Skipped Nodes

Situation | Solution

The IP address of the component is incorrect or the host name cannot be resolved.

Update exachk_exalogic.conf or the o*.out files, as appropriate, with the correct IP addresses, and run Oracle EXAchk again.

The component is not running.

Ping or SSH to the component. If the ping or SSH command fails, ensure that the component is started. Then, run Oracle EXAchk again.

The network is congested and slow, causing an SSH time-out.

Try increasing the value of the environment variable, RAT_TIMEOUT, and run Oracle EXAchk again.

The component is overloaded and low on memory, causing a password time-out.

Try increasing the value of the environment variable, RAT_PASSWORDCHECK_TIMEOUT, and run Oracle EXAchk again.
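
The two time-out variables named in Table 3-8 are set in the shell before Oracle EXAchk is started. The values below are illustrative assumptions only; this section does not state the defaults or the units, so confirm suitable values for your environment before relying on them.

```shell
# Illustrative values only (assumptions, not documented defaults).
# Raise the SSH time-out for a congested network.
export RAT_TIMEOUT=120
# Raise the password-check time-out for an overloaded component.
export RAT_PASSWORDCHECK_TIMEOUT=10

# Then run Oracle EXAchk again from the same shell session, for example:
#   ./exachk
```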

Comparing Component Versions in Two Oracle EXAchk Collections

You can use the -exadiff option of Oracle EXAchk to compare two Oracle EXAchk collections. When you use this option, Oracle EXAchk generates a comparison report in HTML format, highlighting the differences in the versions of the infrastructure components, hardware, firmware, and software between the two reports. The two Oracle EXAchk reports can be for different Oracle Exalogic racks or at different points in time for the same rack, such as before and after upgrading the rack.

To compare two Oracle EXAchk collections, complete the following steps:

  1. Identify the two Oracle EXAchk collections (zip files) that you want to compare.

  2. If the collections do not exist on the host (compute node or vServer) on which you are running Oracle EXAchk, then copy the collections to the host.

  3. Run the following command:
    ./exachk -exadiff collection_1 collection_2
    

    In this command, collection_1 and collection_2 are the full paths and names of the two collections that you want to compare. You can specify either the collection zip file or the directory in which the zip file has been extracted.

  4. Wait for the command to finish running.

    After comparing the two collections, Oracle EXAchk saves the results of the comparison in an HTML file named rack_comparison_date_time.html, for example, rack_comparison_131219_213435.html.

You can view the HTML report in a browser by using an HTTP URL as shown in the following example:

Example 3-1 Comparing Component Versions in Two Oracle EXAchk Collections

http://el01sn01/export/common/general/exachk/rack_comparison_131219_213435.html

In this example, el01sn01 is the name of the active storage node, /export/common/general is the share in which the Oracle EXAchk reports are stored, and 131219_213435 is the date and time stamp for the report.
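
Steps 1 to 3 of the comparison procedure can be guarded with a simple existence check so that the comparison is started only when both collections are present on the host. The following is a minimal sketch; collections_exist is a hypothetical helper, and the paths in the usage comment are placeholders.

```shell
# Hypothetical helper: succeed only if both arguments name an existing
# collection zip file or an existing directory.
collections_exist() {
    for c in "$1" "$2"; do
        if [ ! -e "$c" ]; then
            echo "collection not found: $c" >&2
            return 1
        fi
    done
}

# Example (paths are placeholders):
#   collections_exist /path/to/collection_1.zip /path/to/collection_2.zip \
#       && ./exachk -exadiff /path/to/collection_1.zip /path/to/collection_2.zip
```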

3.3.7 Oracle EXAchk on Oracle Exalogic Command-Line Options

List of command-line options applicable to Oracle Exalogic.

Command Options Applicable to Oracle Exalogic

Note:

The Oracle EXAchk daemon option -d is not supported on Oracle Exalogic.

Table 3-9 Command Options Applicable to Oracle Exalogic

Option | Purpose and Syntax

-clusternodes

Performs checks on only the specified compute nodes and all the other components, and excludes the unspecified compute nodes.

Syntax:
./exachk -clusternodes cn_1[,cn_2,...]

-diff

Compares two Oracle EXAchk HTML reports and generates an HTML report showing the changes in the health of the Exalogic rack between Oracle EXAchk runs.

Syntax:
# ./exachk -diff report1 report2 [-outfile compared_report.html]

-exadiff

Compares two Oracle EXAchk zip collections and generates an HTML report showing the differences in the versions of the infrastructure components, hardware, firmware, and software between the two reports. The two Oracle EXAchk reports can be for different Exalogic racks or at different points in time for the same rack, such as before and after upgrading the rack.

Syntax:
./exachk -exadiff exachk_collection_zip_1 exachk_collection_zip_2

-f

Performs checks on already collected data.

Syntax:
./exachk -f report_name

-vmguest

Performs checks for guest vServers as well.

Syntax:
./exachk -vmguest conf_file_1[,conf_file_2,...]

-hybrid

Performs checks on physical nodes as well in a hybrid rack.

Syntax:
./exachk -hybrid

-localonly

Performs checks for only the host on which Oracle EXAchk is running.

Syntax:
./exachk -localonly

-nopass

Excludes passed checks from the HTML report.

Syntax:
./exachk -nopass

-o v

Displays results for all checks, including those that passed.

Syntax:
./exachk -o v

-phy

Use this option along with -hybrid, to specify the physical nodes in a hybrid rack.

Syntax:
./exachk -hybrid -phy node_1[,node_2,...]

-profile

Performs specific checks or checks for specific components.

Syntax:
./exachk -profile profile_name

See Supported Profiles for the -profile option, for more details.

-s or -S

Runs Oracle EXAchk in silent mode.

Syntax:
./exachk -s

-v

Displays the version of the tool.

Syntax:
./exachk -v
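
The -clusternodes option takes a comma-separated list, whereas the RAT_CLUSTERNODES variable described earlier uses a space-separated list. The following is a minimal sketch for converting one form to the other; join_nodes is a hypothetical helper, and the node names are placeholders.

```shell
# Hypothetical helper: join all arguments with commas.
join_nodes() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Space-separated list, in the same form as RAT_CLUSTERNODES.
nodes="el01cn01 el01cn02 el01cn03"

# Word splitting of $nodes is intended here, so it is left unquoted.
node_list=$(join_nodes $nodes)

# Print the resulting command line instead of running it.
echo "./exachk -clusternodes $node_list"
```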

Supported Profiles for the -profile Option

Table 3-10 Supported Profiles for the -profile option

Profile | Description

control_VM

Runs health checks for only the Oracle Exalogic control components.

el_extensive

In addition to the standard set of checks, runs the following checks, which are useful for a freshly installed or upgraded machine:

  • Verify whether the BIOS on the compute nodes is configured correctly.

  • Verify whether PCI 64-bit resource allocation setting on the compute nodes is disabled.

  • In Oracle VM Manager, for each server pool name, verify whether VM Start Policy is set to Start on current server.

Note:

Before running Oracle EXAchk with the el_extensive profile, verify whether passwordless SSH has been enabled for the CLI shell of Oracle VM Manager.

switch

Runs checks for the switches.

virtual_infra

Runs checks for the Oracle Exalogic virtual infrastructure. This check is applicable to only Oracle Exalogic machines in a virtual configuration.

zfs

Runs checks for the storage appliance.
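
A wrapper script can reject a mistyped profile name before starting Oracle EXAchk. The following is a minimal sketch that accepts only the profile names listed in Table 3-10; is_supported_profile is a hypothetical helper and is not part of the tool.

```shell
# Hypothetical helper: succeed only for the profile names in Table 3-10.
is_supported_profile() {
    case "$1" in
        control_VM|el_extensive|switch|virtual_infra|zfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   p=zfs
#   is_supported_profile "$p" && ./exachk -profile "$p"
```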

3.3.7.1 Verifying and Enabling Passwordless SSH to the Oracle VM Manager CLI

Before running Oracle EXAchk with the el_extensive profile, you must verify whether passwordless SSH is enabled for the CLI shell of Oracle VM Manager.

To do this, try logging in through SSH to the Oracle VM Manager CLI shell by running the following command on the host running the Oracle VM Manager vServer:
# ssh -l admin host_name_of_localhost -p 10000

In this command, host_name_of_localhost is the host name of the local host.

If you can log in without having to enter a password, that is, if the OVM> prompt is displayed, then passwordless SSH is enabled.

If a password prompt is displayed, do the following:

  1. Enter the password for the admin user (the default is welcome1).
  2. Log out from the OVM> shell, and try logging in again through SSH. If the password prompt continues to display, then passwordless SSH is not enabled. To enable passwordless SSH to the Oracle VM Manager CLI, complete the following steps:
    1. SSH as root to the vServer that hosts the Oracle VM Manager.
    2. Ensure that the ssh agent is running:
      # eval `ssh-agent`
      

      The output is similar to the following example: Agent pid 18529

    3. Generate a public/private key pair:
      # ssh-keygen -t rsa -f ~/.ssh/admin
      

      When prompted for a pass phrase, press Enter.

      The keys are generated and stored in the ~/.ssh/ directory. The admin file contains the private key and the admin.pub file contains the public key.

    4. Add the private key to the authentication agent:
      # ssh-add ~/.ssh/admin
      Identity added: /home/user/.ssh/admin (/home/user/.ssh/admin)

      If the ssh agent is not running, the following error message is displayed: Could not open a connection to your authentication agent.

      Copy the public key to the .ssh directory in the oracle user's home directory:
      # cp ~/.ssh/admin.pub /home/oracle/.ssh/
      
    5. Append the file containing the public key, that is, admin.pub, to the ovmcli_authorized_keys file:
      # cd /home/oracle/.ssh/
      # cat admin.pub >> ovmcli_authorized_keys
      
    6. SSH as the admin user to the Oracle VM Manager CLI:
      # ssh -l admin localhost -p 10000
      

      At the prompt to continue connecting, enter yes.

      At the prompt for the password, enter the admin user's password.

      The following shell is displayed: OVM>

      For subsequent logins, the newly established passwordless SSH channel is used.
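
If the procedure above is repeated, the append in step 5 adds the same key to ovmcli_authorized_keys again. The following is a minimal sketch of an append that skips a key that is already present; authorize_key is a hypothetical helper, and it assumes that the public key file contains a single line.

```shell
# Hypothetical helper: append the public key in $1 to the authorized-keys
# file $2 unless an identical line is already present.
authorize_key() {
    pub=$1
    auth=$2
    # Create the authorized-keys file if it does not exist yet.
    touch "$auth"
    # -x matches whole lines, -F treats the key as a fixed string,
    # -f reads the pattern (the key) from the public key file.
    if ! grep -qxF -f "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
}

# Example, using the paths from the steps above:
#   authorize_key /home/oracle/.ssh/admin.pub /home/oracle/.ssh/ovmcli_authorized_keys
```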

3.3.8 Troubleshooting Oracle EXAchk on Oracle Exalogic

Troubleshoot and fix Oracle EXAchk on Oracle Exalogic issues.

Refer to My Oracle Support Note 1478378.1 for the latest known issues specific to Oracle EXAchk on Oracle Exalogic.

Contacting Support with Oracle EXAchk Report

  1. Run Oracle EXAchk with the -profile el_extensive option to include a larger set of health checks in the generated HTML report:
    ./exachk -profile el_extensive
    

    Contact Support with Oracle EXAchk result bundle as needed for further assistance.

  2. To get assistance from Oracle Support on problems related to running Oracle EXAchk or generating a complete Oracle EXAchk report, run the Oracle EXAchk command with the -debug option:
    ./exachk -debug
    

    Contact Support with the resulting output zip file.