This section explains features and tasks specific to Oracle EXAchk on Oracle Big Data.
Run the exachk -h command to view the list of options supported for Big Data Appliance.
Oracle EXAchk for Big Data Appliance supports all BDA versions later than 2.0.1.
Oracle EXAchk for Big Data Appliance audits important configuration settings within a Big Data Appliance. Oracle EXAchk examines the following components:
CPU
Hardware, firmware, and BIOS
Operating System kernel parameters, system packages
Ethernet network, InfiniBand switches
RAM, hard disks
Software Installed
Goals for Big Data Appliance Health Checks
Provide a mechanism to check the complete health of a Big Data Appliance on a proactive and reactive basis.
Provide a “recommendation engine” for best practices and tips to fix Big Data Appliance known issues.
Recommended Validation Frequency
Note:
Plan to run Oracle EXAchk when the load on the Big Data Appliance is low. This helps you avoid runtime timeouts during health checks.
Run the exachk -h command to view the list of options supported for Big Data Appliance.
Note:
Run Oracle EXAchk as root from node1 of the BDA cluster.
Most data collection options require the root password for each InfiniBand switch. Oracle EXAchk prompts for the password if SSH user equivalency is not configured from the compute node running the tool to the switch.
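Whether a password prompt will appear can be checked before a run. The following is a minimal sketch, not part of Oracle EXAchk: the function name `check_switch_equivalency`, the `SSH_CMD` override, and the switch names in the comment are all assumptions made for illustration. It relies only on the standard OpenSSH `BatchMode` and `ConnectTimeout` options.

```shell
#!/bin/sh
# Sketch: report whether passwordless root SSH works to a switch.
# SSH_CMD is a hypothetical override so the function can be exercised
# without a real switch; it defaults to the normal ssh client.
SSH_CMD="${SSH_CMD:-ssh}"

check_switch_equivalency() {
    host="$1"
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=10 "root@${host}" true >/dev/null 2>&1; then
        echo "${host}: SSH user equivalency is configured"
    else
        echo "${host}: passwordless login failed, expect a password prompt"
    fi
}

# Example (switch names are placeholders, replace with your own):
#   for sw in rack1sw-ib1 rack1sw-ib2 rack1sw-ib3; do
#       check_switch_equivalency "$sw"
#   done
```

A failed login can also mean the switch is unreachable, so treat the second message as a hint rather than a diagnosis.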
    ./exachk -h
    Usage : ./exachk [-abvhpfmsuSo:c:t:]
        -a    All (Perform best practice check and recommended patch check)
        -b    Best Practice check only. No recommended patch check
        -h    Show usage
        -v    Show version
        ...
List of Oracle EXAchk options supported for BDA:
    -a      (Perform best practice check and recommended patch check. This is the default option. If no options are specified exachk runs with -a)
    -b      Best Practice check only. No recommended patch check
    -h      Show usage
    -v      Show version
    -m      exclude checks for Maximum Availability Architecture (MAA) scorecards(see user guide for more details)
    -o      Argument to an option. if -o is followed by v,V,Verbose,VERBOSE or Verbose, it will print checks which passes on the screen
            if -o option is not specified,it will print only failures on screen. for eg: exachk -a -o v
    -clusternodes
            Pass comma separated node names to run exachk only on subset of nodes.
    -localonly
            Run exachk only on local node.
    -debug
            Run exachk in debug mode. Debug log will be generated.
            eg:- ./exachk -debug
            Output goes to stdout as well as generated log files
    -nopasd
            Skip PASS'ed check to print in exachk report and upload to database.
    -noscore
            Do not print healthscore in HTML report.
    -diff <Old Report> <New Report> [-outfile <Output HTML>]
            Diff two exachk reports. Pass directory name or zip file or html report file as <Old Report> & <New Report>
    -<initsetup|initrmsetup|initcheck|initpresetup>
            initsetup    : Setup auto restart. Auto restart functionality automatically brings up exachk daemon when node starts
            initrmsetup  : Remove auto restart functionality
            initcheck    : Check if auto restart functionality is setup or not
            initpresetup : Sets root user equivalency for COMPUTE, STORAGE and IBSWITCHES.(root equivalency for COMPUTE nodes is mandatory for setting up auto restart functionality)
    -d <start|start_debug|stop|status|info|stop_client|nextautorun>
            start       : Start the exachk daemon
            start_debug : Start the exachk daemon in debug mode
            stop        : Stop the exachk daemon
            status      : Check if the exachk daemon is running
    -daemon
            run exachk only if daemon is running
    -nodaemon
            Dont use daemon to run exachk
    -set
            configure exachk daemon parameter like "param1=value1;param2=value2..."
            Supported parameters are:-
            AUTORUN_INTERVAL <n[d|h]> :- Automatic rerun interval in daemon mode.Set it zero to disable automatic rerun which is zero.
            AUTORUN_SCHEDULE * * * * :- Automatic run at specific time in daemon mode.
                             - - - -
                             ¦ ¦ ¦ ¦
                             ¦ ¦ ¦ +----- day of week (0 - 6) (0 to 6 are Sunday to Saturday)
                             ¦ ¦ +---------- month (1 - 12)
                             ¦ +--------------- day of month (1 - 31)
                             +-------------------- hour (0 - 23)
            example: exachk -set "AUTORUN_SCHEDULE=8,20 * * 2,5" will schedule runs on tuesday and friday at 8 and 20 hour.
            AUTORUN_FLAGS <flags> : exachk flags to use for auto runs.
            example: exachk -set "AUTORUN_INTERVAL=12h;AUTORUN_FLAGS=-profile sysadmin" to run sysadmin profile every 12 hours
                     exachk -set "AUTORUN_INTERVAL=2d;AUTORUN_FLAGS=-profile dba" to run dba profile once every 2 days.
            NOTIFICATION_EMAIL : Comma separated list of email addresses used for notifications by daemon if mail server is configured.
            PASSWORD_CHECK_INTERVAL <number of hours> : Interval to verify passwords in daemon mode
            collection_retention <number of days> : Purge exachk collection directories and zip files older than specified days.
    -unset <parameter>
            unset the parameter
            example: exachk -unset "AUTORUN_SCHEDULE"
    -get parameter | all
            Print the value of parameter
    -excludeprofile
            Pass specific profile. List of supported profiles is same as for -profile.
    -merge
            Pass comma separated collection names(directory or zip files) to merge collections and prepare single report.
            eg:- ./exachk -merge exachk_hostname1_db1_120213_163405.zip,exachk_hostname2_db2_120213_164826.zip
    -profile
            Pass specific profile. List of supported profiles for BDA:
            switch      Infiniband switch checks
            sysadmin    sysadmin checks
    -ibswitches
            Pass comma separated infiniband switch names to run exachk only on selected infiniband switches.
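The -set option takes its parameters as a single quoted string with `;` between the `KEY=VALUE` pairs, which is easy to mistype when several parameters are combined. The sketch below assembles that string and prints the resulting command without running it. The helper name `build_set_arg` and the email address are assumptions for illustration; the parameter names and the schedule value come from the help text above.

```shell
#!/bin/sh
# Join KEY=VALUE pairs with ';', the separator shown in the -set help text.
# build_set_arg is a hypothetical helper, not part of Oracle EXAchk.
build_set_arg() {
    # "$*" joins the arguments using the first character of IFS;
    # the subshell keeps the IFS change local to this function.
    ( IFS=';'; printf '%s\n' "$*" )
}

# Schedule value taken from the help text; the address is a placeholder.
arg=$(build_set_arg "AUTORUN_SCHEDULE=8,20 * * 2,5" \
                    "NOTIFICATION_EMAIL=admin@example.com")

# Print the command for review instead of executing it.
echo "./exachk -set \"$arg\""
```

Printing the command first makes it possible to check the quoting, since the schedule contains `*` characters that the shell would otherwise expand.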
Note:
Oracle EXAchk returns the following error if you specify a profile that is not listed above:
<profile_name> is not supported component. EXAchk will run generic checks for components identified from environment
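A wrapper script can reject unsupported profile names before invoking the tool, so that a mistyped profile does not fall back to generic checks. This is a sketch under the assumption that only the two profiles listed for BDA (`switch` and `sysadmin`) are wanted; the function name `is_supported_bda_profile` is hypothetical.

```shell
#!/bin/sh
# Return success only for the profiles listed for BDA in the help text.
is_supported_bda_profile() {
    case "$1" in
        switch|sysadmin) return 0 ;;
        *)               return 1 ;;
    esac
}

# Use the first script argument, defaulting to sysadmin.
profile="${1:-sysadmin}"

if is_supported_bda_profile "$profile"; then
    # Print the command rather than run it, so the sketch has no side effects.
    echo "./exachk -profile $profile"
else
    echo "Profile '$profile' is not listed for BDA" >&2
fi
```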
# ./exachk -a
Note:
By default, Oracle EXAchk runs with the -a option, if you do not provide any options.
    Checking ssh user equivalency settings on all nodes in cluster
    Node <BDANode01> is configured for ssh user equivalency for root user
    ...
    Node <BDANode0n> is configured for ssh user equivalency for root user
    Copying plug-ins
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
    9 of the included audit checks require root privileged data collection on INFINIBAND SWITCH .
    1. Enter 1 if you will enter root password for each INFINIBAND SWITCH when prompted
    2. Enter 2 to exit and to arrange for root access and run the exachk later.
    3. Enter 3 to skip checking best practices on INFINIBAND SWITCH
    Please indicate your selection from one of the above options for INFINIBAND SWITCH[1-3][1]:- 1
    Is root password same on all INFINIBAND SWITCH ?[y/n][y]y
    Enter root password for INFINIBAND SWITCH :-
    Verifying root password.
    . . .
    *** Checking Best Practice Recommendations (PASS/WARNING/FAIL) ***
    Collections and audit checks log file is
    /<dir>/exachk_<BDANode0x>_040414_091246/log/exachk.log
    Starting to run exachk in background on <BDANode01>
    ...
    Starting to run exachk in background on <BDANode0n>
    =============================================================
    Node name - <BDANode01>
    =============================================================
    Collecting - Verify ASR configuration check via ASREXACHECK
    Starting to run root privileged commands in background on INFINIBAND SWITCH <RackName>sw-ib1.
    Starting to run root privileged commands in background on INFINIBAND SWITCH <RackName>sw-ib2.
    Starting to run root privileged commands in background on INFINIBAND SWITCH <RackName>sw-ib3.
    Collections from INFINIBAND SWITCH:
    ------------------------------------
    Collecting - Infiniband Switch NTP configuration
    Collecting - Infiniband switch HOSTNAME configuration
    Data collections completed. Checking best practices on <BDANode01>
    --------------------------------------------------------------------------------------
    ...
    Copying results from <BDANode02> and generating report. This might take a while. Be patient.
    =============================================================
    Node name - <BDANode02>
    =============================================================
    Collecting - Verify ASR configuration check via ASREXACHECK
    Data collections completed. Checking best practices on <BDANode02>
    --------------------------------------------------------------------------------------
    ...
    ---------------------------------------------------------------------------------
    Detailed report (html) - /<dir>/exachk_<BDANode01>_040414_091246/exachk_<BDANode01>_040414_091246.html
    UPLOAD(if required) - /<dir>/exachk_<BDANode01>_040414_091246.zip
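The run ends by printing the location of the HTML report. If the console output is saved to a file, that path can be pulled out for a follow-up step such as copying the report elsewhere. The sketch below assumes the output was captured (for example with `tee`) and that the report line appears on its own line in the form shown above; the function name `report_path` and the sample path are illustrative only.

```shell
#!/bin/sh
# Print the path that follows "Detailed report (html) - " on standard input.
# report_path is a hypothetical helper, not part of Oracle EXAchk.
report_path() {
    # In basic regular expressions the parentheses match literally.
    sed -n 's/^.*Detailed report (html) - *//p'
}

# Demonstration on a made-up line in the same format as the sample output.
printf '%s\n' \
    'Data collections completed.' \
    'Detailed report (html) - /tmp/exachk_node01/exachk_node01.html' \
    | report_path
# prints: /tmp/exachk_node01/exachk_node01.html
```

With a captured log the call would look like `report_path < run.log`, where `run.log` is whatever file name you chose when saving the output.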
Identify the checks that require immediate remediation, and the checks that need further investigation because they might cause performance or stability issues.
The following message statuses are specific to Oracle EXAchk on Oracle Big Data:
Oracle EXAchk on Oracle Big Data Message Definitions
Table 3-15 Oracle EXAchk on Oracle Big Data Message Definitions
Message Status | Description or Possible Impact | Action to be Taken |
---|---|---|
FAIL | Shows checks that did not pass due to issues. | Address the issue immediately. |
WARNING | Shows checks that might cause performance or stability issues if not addressed. | Investigate the issue further. |
INFO | Indicates information about the system. | Read the information displayed in these checks and follow the instructions provided, if any. |
In addition to the base troubleshooting guidance, the following topics apply to Oracle EXAchk on Oracle Big Data.
Create a service request through My Oracle Support if you face any problems running Oracle EXAchk.
See Also:
My Oracle Support Note 1643715.1 for the latest known issues specific to Oracle EXAchk on Oracle Big Data Appliance.
Runtime Command Timeouts
During a health check, if a node or switch does not respond to a command within a predefined duration, Oracle EXAchk terminates that command to prevent the program from freezing. Timeouts are more likely on a busy system, where the target of a check might not respond within the default timeout.
Note:
To avoid runtime command timeouts during health checks, run the tool when the load on the system is lowest.
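One way to act on this note is to look at the load average before starting a run. The following sketch checks only the local node, reads the one-minute value from `/proc/loadavg` (available on Linux), and compares it with a threshold. The threshold of 2.0, the `MAX_LOAD` variable, and the function name `load_ok` are assumptions for illustration; choose a limit that suits your cluster.

```shell
#!/bin/sh
# Hypothetical threshold; not an Oracle recommendation.
MAX_LOAD="${MAX_LOAD:-2.0}"

# Succeed when load ($1) is at or below the threshold ($2).
load_ok() {
    # awk handles the floating-point comparison that sh cannot do natively.
    awk -v l="$1" -v m="$2" 'BEGIN { exit !(l + 0 <= m + 0) }'
}

# First field of /proc/loadavg is the 1-minute load average.
load=$(cut -d' ' -f1 /proc/loadavg 2>/dev/null || echo 0)

if load_ok "$load" "$MAX_LOAD"; then
    echo "Load $load is at or below $MAX_LOAD: reasonable time to run ./exachk -a"
else
    echo "Load $load exceeds $MAX_LOAD: consider running later"
fi
```

Because the check covers a single node, a busy node elsewhere in the cluster can still cause timeouts.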
Timeouts Checking Switches
    Starting to run root privileged commands in background on INFINIBAND SWITCH <cluster>sw-ib1.
    Timed out
    Unable to create temp directory on <cluster>sw-ib1
    Skipping root privileged commands on INFINIBAND SWITCH <cluster> sw-ib1 is available but SSH is blocked.
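If switch checks time out, increase the timeout by setting the RAT_PASSWORDCHECK_TIMEOUT environment variable before running Oracle EXAchk:
# export RAT_PASSWORDCHECK_TIMEOUT=40
# ./exachk -a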