3.6 Oracle Big Data Appliance

This section explains the features and tasks specific to Oracle EXAchk on Oracle Big Data Appliance.

3.6.1 Scope and Supported Platforms for Running Oracle EXAchk on Oracle Big Data Appliance

Oracle EXAchk for Oracle Big Data Appliance supports all Oracle Big Data Appliance versions later than 2.0.1.

Oracle EXAchk for Oracle Big Data Appliance audits important configuration settings within an Oracle Big Data Appliance. Oracle EXAchk examines the following components:

  • CPU

  • Hardware, firmware, and BIOS

  • Operating System kernel parameters, system packages

  • Ethernet network, InfiniBand switches

  • RAM, hard disks

  • Installed software

Goals for Oracle Big Data Appliance Health Checks

  1. Provide a mechanism to check the complete health of an Oracle Big Data Appliance on a proactive and reactive basis.

  2. Provide a “recommendation engine” for best practices and tips to fix Oracle Big Data Appliance known issues.

Recommended Validation Frequency

Oracle recommends validating Oracle Big Data Appliance immediately after initial deployment, before and after any change, and at least once a quarter as part of planned maintenance operations. The runtime duration of Oracle EXAchk depends on the number of nodes to check, CPU load, network latency, and so on.

Note:

Plan to run Oracle EXAchk when there is less load on the Oracle Big Data Appliance. This helps you avoid runtime timeouts during health checks.
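
The following is a minimal sketch of how a wrapper script might check the load on the local node before starting a run. The threshold variable MAX_LOAD and the helper names are hypothetical and are not part of Oracle EXAchk. The check covers only the node where the script runs, and it assumes a Linux system that provides /proc/loadavg.

```shell
# Hypothetical threshold: treat a 1-minute load average above this as "busy".
MAX_LOAD="${MAX_LOAD:-4}"

# Succeeds (exit status 0) when the load value in $1 is at or below the limit in $2.
load_is_low() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load + 0 <= max + 0) }'
}

# Print the 1-minute load average, the first field of /proc/loadavg.
current_load() {
    cut -d ' ' -f 1 /proc/loadavg
}

# Possible use:
#   if load_is_low "$(current_load)" "$MAX_LOAD"; then ./exachk -a; fi
```

Pick a threshold that reflects the number of CPU cores and the normal workload of your nodes. The default of 4 shown here is only a placeholder.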

3.6.2 Installing Oracle EXAchk on the Oracle Big Data Appliance

Follow these procedures to install Oracle EXAchk on the Oracle Big Data Appliance.

  1. As the root user, download the exachk.zip file to a directory on the Oracle Big Data Appliance.
  2. Extract the contents of exachk.zip.
    # unzip exachk.zip
  3. (Recommended) Add the location of the exachk executable to the PATH variable in the /root/.bash_profile file so that you can run Oracle EXAchk from any directory.

    For example:

    From:
    # User specific environment and startup programs
    PATH=$PATH:$HOME/bin
    
    To:
    # User specific environment and startup programs
    PATH=$PATH:$HOME/bin:<path to exachk>
    
    If exachk is installed in /root/exachk_home, then update the /root/.bash_profile file as follows:
    PATH=$PATH:$HOME/bin:/root/exachk_home
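
If you prefer to script this step, the following sketch appends a PATH entry to a profile file only when the entry is not already there. The function name add_exachk_to_path is hypothetical. Unlike the manual edit shown above, it adds a separate export line at the end of the file instead of changing the existing PATH line.

```shell
# Append a PATH entry for the exachk directory to a profile file, once only.
# $1 = profile file (for example, /root/.bash_profile)
# $2 = directory that contains the exachk executable
add_exachk_to_path() {
    profile=$1
    exachk_dir=$2
    line="export PATH=\$PATH:$exachk_dir"
    # Skip the append when this exact line is already present.
    if ! grep -qxF -- "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}

# Possible use:
#   add_exachk_to_path /root/.bash_profile /root/exachk_home
```

Running the function more than once leaves a single entry in the file, so it is safe to include in a repeatable setup script.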

3.6.3 Oracle EXAchk on Oracle Big Data Usage

Run the exachk -h command to view the list of options supported for Oracle Big Data Appliance.

Note:

Run Oracle EXAchk as root from node1 of the Oracle Big Data Appliance cluster.

Most data collection options require the root password for each InfiniBand switch. The password is required only if SSH user equivalency is not configured from the compute node running Oracle EXAchk to the switch.
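
To find out in advance whether Oracle EXAchk is likely to prompt for switch passwords, you can test passwordless SSH yourself. The following sketch uses the standard OpenSSH options BatchMode and ConnectTimeout. The function names and the 10-second connection timeout are assumptions for illustration and are not part of Oracle EXAchk.

```shell
# Succeeds when root can log in to host $1 over SSH without a password.
# BatchMode=yes makes ssh fail instead of prompting for a password.
has_ssh_equivalency() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "root@$1" true >/dev/null 2>&1
}

# Report the result for each switch name passed as an argument.
check_switches() {
    for sw in "$@"; do
        if has_ssh_equivalency "$sw"; then
            echo "$sw: SSH user equivalency is configured"
        else
            echo "$sw: password will be required"
        fi
    done
}

# Possible use (bda1sw-ib1 and bda1sw-ib2 are hypothetical switch names):
#   check_switches bda1sw-ib1 bda1sw-ib2
```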

  1. To view the command options, run the following command as root or non-root user:
    ./exachk -h
      
    Usage : ./exachk [-abvhpfmsuSo:c:t:]
            -a      All (Perform best practice check and recommended patch check)
            -b      Best Practice check only. No recommended patch check
            -h      Show usage
            -v      Show version
             ...
    

List of Oracle EXAchk options supported for Oracle Big Data Appliance:

        -a      (Perform best practice check and recommended patch check.  This is the default option.  If no options are specified exachk runs with -a)
        -b      Best Practice check only. No recommended patch check
        -h      Show usage
        -v      Show version
        -m      exclude checks for Maximum Availability Architecture (MAA) scorecards(see user guide for more details)
        -o      Argument to an option. if -o is followed by v,V,Verbose,VERBOSE or Verbose, it will print checks which passes on the screen
                 if -o option is not specified,it will print only failures on screen. for eg: exachk -a -o v
        -clusternodes
                Pass comma separated node names to run exachk only on subset of nodes.
        -localonly
                Run exachk only on local node.

        -debug  Run exachk in debug mode. Debug log will be generated.
                eg:- ./exachk -debug 
                Output goes to stdout as well as generated log files

        -nopasd  Skip PASS'ed check to print in exachk report and upload to database. 

        -noscore  Do not print healthscore in HTML report.
        -diff <Old Report> <New Report> [-outfile <Output HTML>]
                Diff two exachk reports. Pass directory name or zip file or html report file as <Old Report> & <New Report>
        -<initsetup|initrmsetup|initcheck|initpresetup>
                initsetup       : Setup auto restart. Auto restart functionality automatically brings up exachk daemon when node starts
                initrmsetup   : Remove auto restart functionality
                initcheck       : Check if auto restart functionality is setup or not
         -d <start|start -debug|stop|status|info|stop_client|nextautorun>
                start           : Start the exachk daemon
                start -debug     : Start the exachk daemon in debug mode
                stop            : Stop the exachk daemon
                status          : Check if the exachk daemon is running
        -daemon
                run exachk only if daemon is running

       -nodaemon
                Dont use daemon to run exachk

       -set
                configure exachk daemon parameter like "param1=value1;param2=value2... "

                 Supported parameters are:-

                 AUTORUN_FLAGS <flags> : exachk flags to use for auto runs.

                     example: exachk -set "AUTORUN_FLAGS=-profile sysadmin" to run sysadmin profile every 12 hours

                              exachk -set "AUTORUN_FLAGS=-profile dba" to run dba profile once every 2 days.

                 NOTIFICATION_EMAIL : Comma separated list of email addresses used for notifications by daemon if mail server is configured.

                 PASSWORD_CHECK_INTERVAL <number of hours> : Interval to verify passwords in daemon mode

                 collection_retention <number of days> : Purge exachk collection directories and zip files older than specified days.
       -unset <parameter>
                unset the parameter
                  example: exachk -unset "AUTORUN_SCHEDULE"

       -get parameter | all
                Print the value of parameter
        -excludeprofile
                Pass specific profile.
                List of supported profiles is same as for -profile.

       -merge
                Pass comma separated collection names(directory or zip files) to merge collections and prepare single report.
                eg:- ./exachk -merge exachk_hostname1_db1_120213_163405.zip,exachk_hostname2_db2_120213_164826.zip
       -profile Pass specific profile.
                 List of supported profiles for BDA:
                 switch          Infiniband switch checks
                 sysadmin     sysadmin checks
       
       -ibswitches
                Pass comma separated infiniband switch names to run exachk only on selected infiniband switches.
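
The -clusternodes and -ibswitches options both take a comma separated list of names. The following sketch builds such a list from separate arguments and prints, without running it, a command that checks a subset of nodes. The helper names and the node names are hypothetical, and combining -a and -o v with -clusternodes in a single command is an assumption based on the option list above.

```shell
# Join the arguments with commas, the list format that -clusternodes
# and -ibswitches accept.
comma_join() {
    out=""
    for item in "$@"; do
        out="${out:+$out,}$item"
    done
    printf '%s\n' "$out"
}

# Print (do not run) an exachk command that is limited to the given nodes
# and that also shows passing checks on the screen (-o v).
exachk_cmd_for_nodes() {
    printf './exachk -a -o v -clusternodes %s\n' "$(comma_join "$@")"
}

# Possible use (bdanode01 and bdanode02 are hypothetical node names):
#   exachk_cmd_for_nodes bdanode01 bdanode02
```

Printing the command first lets you review it before you run it as root.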

Note:

If you run a profile that is not listed above, then Oracle EXAchk returns a message similar to the following:

<profile_name> is not supported component. EXAchk will run generic checks for components identified from environment

For example, to perform all checks including best practice checks and recommendations, run:
# ./exachk -a

Note:

If you do not specify any options, then Oracle EXAchk runs with the -a option by default.

Output looks similar to the following:
Checking ssh user equivalency settings on all nodes in cluster

Node <BDANode01> is configured for ssh user equivalency for root user
...

Node <BDANode0n> is configured for ssh user equivalency for root user

Copying plug-ins
. . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . .


9 of the included audit checks require root privileged data collection on INFINIBAND SWITCH .

1. Enter 1 if you will enter root password for each INFINIBAND SWITCH when prompted

2. Enter 2 to exit and to arrange for root access and run the exachk later.

3. Enter 3 to skip checking best practices on INFINIBAND SWITCH

Please indicate your selection from one of the above options for INFINIBAND SWITCH[1-3][1]:- 1

Is root password same on all INFINIBAND SWITCH ?[y/n][y]y

Enter root password for INFINIBAND SWITCH :-

Verifying root password.
. . .

*** Checking Best Practice Recommendations (PASS/WARNING/FAIL) ***

Collections and audit checks log file is
/<dir>/exachk_<BDANode0x>_040414_091246/log/exachk.log
Starting to run exachk in background on <BDANode01>
...
Starting to run exachk in background on <BDANode0n>

=============================================================
                    Node name - <BDANode01>
=============================================================

Collecting - Verify ASR configuration check via ASREXACHECK

Starting to run root privileged commands in background on INFINIBAND SWITCH <RackName>sw-ib1.

Starting to run root privileged commands in background on INFINIBAND SWITCH <RackName>sw-ib2.

Starting to run root privileged commands in background on INFINIBAND SWITCH <RackName>sw-ib3.

Collections from INFINIBAND SWITCH:
------------------------------------
Collecting - Infiniband Switch NTP configuration
Collecting - Infiniband switch HOSTNAME configuration
Data collections completed. Checking best practices on <BDANode01>
--------------------------------------------------------------------------------------
 ...

Copying results from <BDANode02> and generating report. This might take a while. Be patient.

=============================================================
                    Node name - <BDANode02>
=============================================================

Collecting - Verify ASR configuration check via ASREXACHECK

Data collections completed. Checking best practices on <BDANode02>
--------------------------------------------------------------------------------------
...
---------------------------------------------------------------------------------

Detailed report (html) - /<dir>/exachk_<BDANode01>_040414_091246/exachk_<BDANode01>_040414_091246.html


UPLOAD(if required) - /<dir>/exachk_<BDANode01>_040414_091246.zip
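
Each run leaves a collection zip file in the output directory, and the -diff option accepts zip files as the old and new reports. The following sketch finds the two most recent collections by modification time and prints, without running it, the matching -diff command. The function names are hypothetical, and the sketch assumes that the collection file names contain no spaces or newlines.

```shell
# Print the two newest exachk_*.zip files in directory $1, older one first.
two_latest_collections() {
    ls -1tr "$1"/exachk_*.zip 2>/dev/null | tail -n 2
}

# Print (do not run) the -diff command for the two newest collections.
# Returns 1 when fewer than two collections exist.
exachk_diff_cmd() {
    pair=$(two_latest_collections "$1")
    old=$(printf '%s\n' "$pair" | sed -n 1p)
    new=$(printf '%s\n' "$pair" | sed -n 2p)
    [ -n "$new" ] || return 1
    printf './exachk -diff %s %s\n' "$old" "$new"
}

# Possible use, where the argument is the directory that holds the zip files:
#   exachk_diff_cmd /path/to/collections
```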

3.6.4 Oracle EXAchk on Oracle Big Data Output

Review the report to identify the checks that you must remediate immediately and the checks that you must investigate further because they can cause performance or stability issues.

The following message statuses are specific to Oracle EXAchk on Oracle Big Data:

Table 3-13 Oracle EXAchk on Oracle Big Data Message Definitions

Message Status   Description or Possible Impact                                                  Action to be Taken
--------------   -----------------------------------------------------------------------------   ------------------
FAIL             Shows checks that did not pass due to issues.                                   Address the issue immediately.
WARNING          Shows checks that can cause performance or stability issues if not addressed.   Investigate the issue further.
INFO             Indicates information about the system.                                         Read the information displayed in these checks and follow the instructions provided, if any.
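
If you post-process results in a script, the mapping in Table 3-13 can be expressed as a small lookup. The following is only a sketch: the function name action_for_status is hypothetical, and it covers only the three statuses defined in the table, so any other value is reported as unrecognized.

```shell
# Print the recommended action for a message status from Table 3-13.
# Returns 1 for any status that the table does not define.
action_for_status() {
    case "$1" in
        FAIL)    echo "Address the issue immediately." ;;
        WARNING) echo "Investigate the issue further." ;;
        INFO)    echo "Read the information displayed in these checks and follow the instructions provided, if any." ;;
        *)       echo "Unrecognized status: $1" >&2; return 1 ;;
    esac
}

# Possible use:
#   action_for_status WARNING
```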

3.6.5 Troubleshooting Oracle EXAchk on Oracle Big Data Appliance

In addition to the base troubleshooting, the following are also applicable to Oracle EXAchk on Oracle Big Data Appliance.

If you face any problems running Oracle EXAchk, then create a service request through My Oracle Support.

Refer to My Oracle Support Note 1643715.1 for the latest known issues specific to Oracle EXAchk on Oracle Big Data Appliance.

3.6.5.1 Timeouts Checking Switches

If SSH to a given switch is slow, then Oracle EXAchk times out and reports an error similar to the following:

Starting to run root privileged commands in background on INFINIBAND SWITCH <cluster>sw-ib1.

Timed out
Unable to create temp directory on <cluster>sw-ib1

Skipping root privileged commands on INFINIBAND SWITCH <cluster> sw-ib1 is 
available but SSH is blocked.

To resolve the issue, increase the SSH timeout by setting the Oracle EXAchk environment variable RAT_PASSWORDCHECK_TIMEOUT.

  1. Set the environment variable RAT_PASSWORDCHECK_TIMEOUT to a higher value, for example:
    # export RAT_PASSWORDCHECK_TIMEOUT=40
  2. Rerun Oracle EXAchk.
    # ./exachk -a
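
If you prefer not to change the variable for the whole shell session, you can set it for a single run instead. The following sketch wraps any command so that RAT_PASSWORDCHECK_TIMEOUT is set only for that invocation. The function name with_password_timeout is hypothetical, and the value 40 is taken from the example above.

```shell
# Run a command with RAT_PASSWORDCHECK_TIMEOUT set only for that invocation.
# $1 = timeout value; the remaining arguments are the command to run.
with_password_timeout() {
    timeout_value=$1
    shift
    env RAT_PASSWORDCHECK_TIMEOUT="$timeout_value" "$@"
}

# Possible use:
#   with_password_timeout 40 ./exachk -a
```

Because the variable is passed through env, the calling shell keeps its existing value, or none at all, after the command finishes.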