This chapter contains information about how to validate changes and troubleshoot Oracle Database Appliance problems.
Use oakcli validate to check your Oracle Database Appliance configuration and, if necessary, to provide information to Oracle Support Services.
The oakcli validate command is the Oracle Appliance Manager diagnostic and validation utility for identifying and resolving support issues. If you experience problems with Oracle Database Appliance, then use the oakcli validate command options to verify that your environment is properly configured and that best practices are in effect. When placing a service request, also use Oracle Appliance Manager as described in this chapter to prepare the log files to send to Oracle Support Services.
Use the oakcli validate command and options to validate the status of Oracle Database Appliance.
You must run the oakcli validate command as the root user.
Syntax
The oakcli validate command uses the following syntax, where checklist is a single check or a comma-delimited list of checks, and output_file is the name that you designate for a validation output file:
oakcli validate -h
oakcli validate [-V | -l | -h]
oakcli validate [-v] [-f output_file] [-a | -d | -c checklist] [-ver patch_version]
Parameters
Option | Purpose |
---|---|
-a | Run all system checks, including checks that are not part of the default set. |
-c checklist | Run the validation checks for the items identified in checklist, a single check or a comma-delimited list of checks. |
-d | Run only the default checks. The default checks are those marked with an asterisk (*) in the output of oakcli validate -l. |
-f output_file | Send output to a file with a fully qualified file name, output_file. |
-h | Display the online help. |
-l | List the items that can be checked (and their descriptions). |
-v | Show verbose output (must be used with a parameter that generates a validation report). |
-V | Display the version of oakValidation. |
-ver patch_version | Report any reasons for not being able to patch Oracle Database Appliance with the patch named in patch_version. |
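The options above combine in a fixed way: at most one of -a, -d, or -c checklist selects the checks, while -v and -f output_file modify the report. The following Python sketch assembles such an argument list. It is illustrative only: the helper name build_validate_command and its parameters are hypothetical, and only the flags themselves come from the syntax in this section.

```python
from typing import List, Optional


def build_validate_command(
    checks: Optional[List[str]] = None,
    output_file: Optional[str] = None,
    verbose: bool = False,
    run_all: bool = False,
) -> List[str]:
    """Assemble an `oakcli validate` argument list (hypothetical helper).

    Only the flags -a, -d, -c, -f, and -v are taken from the documented
    syntax; everything else here is an illustrative assumption.
    """
    if run_all and checks:
        # -a and -c are alternatives in the syntax: [-a | -d | -c checklist]
        raise ValueError("-a and -c are alternatives; choose one")
    cmd = ["oakcli", "validate"]
    if verbose:
        cmd.append("-v")
    if output_file:
        cmd += ["-f", output_file]
    if run_all:
        cmd.append("-a")
    elif checks:
        cmd += ["-c", ",".join(checks)]  # checklist is comma-delimited
    else:
        cmd.append("-d")  # no selection given: run only the default checks
    return cmd


print(build_validate_command(checks=["SystemComponents", "NetworkComponents"]))
```

On an appliance node, a list built this way could be passed to subprocess.run by a script running as the root user.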
VALIDATE Options
Command | Purpose |
---|---|
asr | Validate Oracle Auto Service Request (Oracle ASR) components based on the Oracle ASR configuration file and Oracle Integrated Lights Out Manager (Oracle ILOM) sensor data. |
DiskCalibration | Preinstallation check for the storage disk performance using orion. Do not run this check after you have deployed Oracle software on Oracle Database Appliance. Use the default check option (-d) to run validation without the disk calibration check. |
NetworkComponents | Validate public and private network hardware connections. |
OSDiskStorage | Validate the operating system disks and file system information. |
ospatch | Validate that the system can complete an upgrade successfully using the named patch. |
SharedStorage | Validate shared storage and multipathing information. |
StorageTopology | Validate the storage shelf connectivity. |
SystemComponents | Validate system components, based on Oracle ILOM sensor data readings. |
Review these examples to see how you can perform validation checks using the oakcli validate command and options.
Listing All Checks and Their Descriptions
oakcli validate -l
Checkname -- Description
========= ===========
*SystemComponents -- Validate system components based on ilom sensor data readings
*OSDiskStorage -- Validate OS disks and filesystem information
*SharedStorage -- Validate Shared storage and multipathing information
DiskCalibration -- Check disk performance with orion
*NetworkComponents -- Validate public and private network components
*StorageTopology -- Validate external JBOD connectivity
asr -- Validate asr components based on asr config file and ilom sensor data readings
* -- These checks are also performed as part of default checks
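The asterisk in the listing marks the checks that the default option runs. As a rough illustration, the Python sketch below reads a listing of this shape and separates default checks from the others. It assumes that each check is printed on its own line in the form "name -- description"; the function name parse_check_listing is hypothetical.

```python
from typing import Dict, Tuple

# Sample text copied from the listing in this section, one check per line.
SAMPLE_LISTING = """\
Checkname -- Description
========= ===========
*SystemComponents -- Validate system components based on ilom sensor data readings
*OSDiskStorage -- Validate OS disks and filesystem information
*SharedStorage -- Validate Shared storage and multipathing information
DiskCalibration -- Check disk performance with orion
*NetworkComponents -- Validate public and private network components
*StorageTopology -- Validate external JBOD connectivity
asr -- Validate asr components based on asr config file and ilom sensor data readings
* -- These checks are also performed as part of default checks
"""


def parse_check_listing(text: str) -> Dict[str, Tuple[str, bool]]:
    """Map each check name to (description, is_default)."""
    checks: Dict[str, Tuple[str, bool]] = {}
    for line in text.splitlines():
        name, sep, description = line.partition(" -- ")
        name = name.strip()
        if not sep or name in ("Checkname", "*"):
            continue  # skip the header, the ruler line, and the footnote
        is_default = name.startswith("*")
        checks[name.lstrip("*")] = (description.strip(), is_default)
    return checks


checks = parse_check_listing(SAMPLE_LISTING)
default_checks = sorted(name for name, (_, flag) in checks.items() if flag)
print(default_checks)
```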
Note:
The NetworkComponents validation check is not available on hardware prior to Oracle Database Appliance X3-2.
Running All Checks
Enter the following command to run all checks:
oakcli validate -a
Validating Storage Cable Connections
Check the cable connections between the system controllers and the storage shelf, as well as the cable connection to the storage expansion shelf (if one is installed):
oakcli validate -c storagetopology
Oracle recommends that you run the oakcli validate -c StorageTopology command before deploying the system. Doing so helps prevent deployment problems caused by incorrect or missing cable connections. The output shown in the following example reports a successful configuration. If the cabling is not correct, you will see errors in your output.
# oakcli validate -c storagetopology
It may take a while. Please wait...
INFO : ODA Topology Verification
INFO : Running on Node0
INFO : Check hardware type
SUCCESS : Type of hardware found : X4-2
INFO : Check for Environment(Bare Metal or Virtual Machine)
SUCCESS : Type of environment found : Virtual Machine(ODA BASE)
SUCCESS : Number of External LSI SAS controller found : 2
INFO : Check for Controllers correct PCIe slot address
SUCCESS : External LSI SAS controller 0 : 00:15.0
SUCCESS : External LSI SAS controller 1 : 00:16.0
INFO : Check if powered on
SUCCESS : 1 : Powered-on
INFO : Check for correct number of EBODS(2 or 4)
SUCCESS : EBOD found : 2
INFO : Check for External Controller 0
SUCCESS : Controller connected to correct ebod number
SUCCESS : Controller port connected to correct ebod port
SUCCESS : Overall Cable check for controller 0
INFO : Check for External Controller 1
SUCCESS : Controller connected to correct ebod number
SUCCESS : Controller port connected to correct ebod port
SUCCESS : Overall Cable check for controller 1
INFO : Check for overall status of cable validation on Node0
SUCCESS : Overall Cable Validation on Node0
INFO : Check Node Identification status
SUCCESS : Node Identification
SUCCESS : Node name based on cable configuration found : NODE0
INFO : Check Nickname
SUCCESS : Nickname set correctly : Oracle Database Appliance - E0
INFO : The details for Storage Topology Validation can also be found in log file=/opt/oracle/oak/log/<hostname>/storagetopology/StorageTopology-2014-07-03-08:57:31_7661_15914.log
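When the report is long, a script can pick out the lines that need attention. The following Python sketch counts lines by their leading level and collects every line whose level is neither INFO nor SUCCESS. It assumes one message per line in the "LEVEL : message" form shown above. Which levels a failed cable check actually prints is not shown in this chapter, so treating every other level as a problem is an assumption, and the function name summarize_validation is hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple

# Matches lines such as "SUCCESS : EBOD found : 2" or "INFO: Running on Node0".
LINE_RE = re.compile(r"^\s*([A-Z]+)\s*:\s*(.*)$")

# Assumption: only these two levels indicate a healthy result.
OK_LEVELS = {"INFO", "SUCCESS"}


def summarize_validation(text: str) -> Tuple[Counter, List[Tuple[str, str]]]:
    """Count lines per level and collect lines that are not INFO or SUCCESS."""
    counts: Counter = Counter()
    problems: List[Tuple[str, str]] = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # for example "It may take a while. Please wait..."
        level, message = match.groups()
        counts[level] += 1
        if level not in OK_LEVELS:
            problems.append((level, message))
    return counts, problems


# Excerpt of the successful report shown in this section.
EXCERPT = """\
It may take a while. Please wait...
INFO : ODA Topology Verification
INFO : Running on Node0
INFO : Check hardware type
SUCCESS : Type of hardware found : X4-2
SUCCESS : Number of External LSI SAS controller found : 2
"""

counts, problems = summarize_validation(EXCERPT)
print(dict(counts), problems)
```

An empty problems list for the full report corresponds to the successful configuration described above.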
Validating Oracle ASR
Enter the following command to validate your Oracle ASR configuration:
# oakcli validate -c asr
INFO: oak Asr information and Validations
RESULT: /opt/oracle/oak/conf/asr.conf exist
RESULT: ASR Manager ip:10.139.154.17
RESULT: ASR Manager port:1162
SUCCESS: ASR configuration file validation successfully completed
RESULT: /etc/hosts has entry 141.146.156.46 transport.oracle.com
RESULT: ilom alertmgmt level is set to minor
RESULT: ilom alertmgmt type is set to snmptrap
RESULT: alertmgmt snmp_version is set to 2c
RESULT: alertmgmt community_or_username is set to public
RESULT: alertmgmt destination is set to 10.139.154.17
RESULT: alertmgmt destination_port is set to 1162
SUCCESS: Ilom snmp confguration for asr set correctly
RESULT: notification trap configured to ip:10.139.154.17
RESULT: notification trap configured to port:1162
SUCCESS: Asr notification trap set correctly
INFO: IP_ADDRESS HOST_NAME SERIAL_NUMBER ASR PROTOCOL SOURCE PRODUCT_NAME
INFO: --------- ---------- ------------- --- -------- ------ ------------
10.170.79.98 oda-02-c 1130FMW00D Enabled SNMP ILOM SUN FIRE X4370 M2 SERVER
10.170.79.97 oda-01-c 1130FMW00D Enabled SNMP ILOM SUN FIRE X4370 M2 SERVER
INFO: Please use My Oracle Support 'http://support.oracle.com' to view the activation status.
SUCCESS: asr log level is already set to Fine.
RESULT: Registered with ASR backend.
RESULT: test connection successfully completed.
RESULT: submitted test event for asset:10.139.154.17
RESULT: bundle com.sun.svc.asr.sw is in active state
RESULT: bundle com.sun.svc.asr.sw-frag is in resolved state
RESULT: bundle com.sun.svc.asr.sw-rulesdefinitions is in resolved state
RESULT: bundle com.sun.svc.ServiceActivation is in active state
SUCCESS: ASR diag successfully completed
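In the sample report, the Oracle ILOM alert destination and port match the ASR Manager address and port. A script that archives these reports could record that relationship explicitly. The following Python sketch extracts the four values from RESULT lines worded as in the sample and compares them. The helper names are hypothetical, and the sketch is only a cross-check on captured text, not a replacement for the SUCCESS lines that oakcli itself prints.

```python
import re
from typing import Dict, Optional

# Patterns follow the wording of the RESULT lines in the sample report.
PATTERNS = {
    "manager_ip": r"ASR Manager ip:(\S+)",
    "manager_port": r"ASR Manager port:(\d+)",
    "trap_destination": r"alertmgmt destination is set to (\S+)",
    "trap_port": r"alertmgmt destination_port is set to (\d+)",
}


def extract_asr_settings(text: str) -> Dict[str, Optional[str]]:
    """Return the ASR Manager and ILOM alert settings found in the report."""
    settings: Dict[str, Optional[str]] = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        settings[key] = match.group(1) if match else None
    return settings


def trap_matches_manager(settings: Dict[str, Optional[str]]) -> bool:
    """True when all four values were found and the ILOM alert target equals the ASR Manager."""
    if any(value is None for value in settings.values()):
        return False
    return (
        settings["manager_ip"] == settings["trap_destination"]
        and settings["manager_port"] == settings["trap_port"]
    )


# Four lines copied from the sample report in this section.
SAMPLE = """\
RESULT: ASR Manager ip:10.139.154.17
RESULT: ASR Manager port:1162
RESULT: alertmgmt destination is set to 10.139.154.17
RESULT: alertmgmt destination_port is set to 1162
"""

print(trap_matches_manager(extract_asr_settings(SAMPLE)))
```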
Checking the Viability of a Patch
Use the oakcli validate -c ospatch -ver patch_version command to report any reasons for not being able to patch Oracle Database Appliance with the patch named in patch_version. Run this command before you attempt to patch Oracle Database Appliance to determine whether the patch can be applied or whether you must make changes first.
# oakcli validate -c ospatch -ver 12.1.2.5.0
INFO: Validating the OS patch for the version 12.1.2.5.0
WARNING: 2015-10-10 06:30:32: Patching sub directory /opt/oracle/oak/pkgrepos/orapkgs/OEL/5.10/Patches/5.10.1 is not existing
INFO: 2015-10-10 06:30:32: May need to unpack the Infra patch bundle for the version: 12.1.2.5.0
ERROR: 2015-10-10 06:30:32: No OS patch directory found in the repository
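A pre-patch script might stop when the report contains an ERROR line. The following Python sketch parses lines of the form "LEVEL: [timestamp:] message", as in the sample above, and lists the ERROR messages. Treating ERROR lines as the blocking condition is an assumption drawn from this one sample, and the names LogEntry, parse_ospatch_output, and patch_blockers are hypothetical.

```python
import re
from datetime import datetime
from typing import List, NamedTuple, Optional


class LogEntry(NamedTuple):
    level: str
    timestamp: Optional[datetime]  # None when the line carries no timestamp
    message: str


ENTRY_RE = re.compile(
    r"^(?P<level>[A-Z]+):\s*"
    r"(?:(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}):\s*)?"
    r"(?P<message>.*)$"
)


def parse_ospatch_output(text: str) -> List[LogEntry]:
    """Turn each 'LEVEL: [timestamp:] message' line into a LogEntry."""
    entries: List[LogEntry] = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match is None:
            continue  # for example the "# oakcli validate ..." command line
        ts = match.group("ts")
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") if ts else None
        entries.append(LogEntry(match.group("level"), when, match.group("message")))
    return entries


def patch_blockers(entries: List[LogEntry]) -> List[str]:
    """Assumption: ERROR lines are the reasons the patch cannot be applied."""
    return [entry.message for entry in entries if entry.level == "ERROR"]


# Output lines copied from the example in this section.
SAMPLE = """\
INFO: Validating the OS patch for the version 12.1.2.5.0
WARNING: 2015-10-10 06:30:32: Patching sub directory /opt/oracle/oak/pkgrepos/orapkgs/OEL/5.10/Patches/5.10.1 is not existing
INFO: 2015-10-10 06:30:32: May need to unpack the Infra patch bundle for the version: 12.1.2.5.0
ERROR: 2015-10-10 06:30:32: No OS patch directory found in the repository
"""

print(patch_blockers(parse_ospatch_output(SAMPLE)))
```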
Validating Hardware System and Network Components
The following command runs system checks to validate hardware system components and Oracle Database Appliance network components:
# oakcli validate -c SystemComponents,NetworkComponents
If you encounter errors while configuring Oracle Database Appliance, then review the following messages and actions:
Cause: This message is most likely to occur when you attempt to redeploy the End-User Bundle without cleaning up a previous deployment. This error occurs because an existing VIP is configured for the addresses assigned to Oracle Database Appliance.
Cause: This error occurs when the Oracle Grid Infrastructure CSS daemon attempts to start the node as a standalone cluster node, but during startup discovers that the other cluster node is running, and changes to cluster mode to join the cluster.
Cause: This message occurs on a node if one of the two operating system disks is not installed, but you are attempting to reimage the operating system.
Cause: Operating system plug-ins required for sound cards for the Oracle ILOM remote redirection console are not installed.
Cause: One or both operating system disks are not available. This message occurs if you select "Default hard disk" while reimaging the system, but that disk is not available.
Cause: If you select "Default (use BIOS settings)" as your imaging option, but one or both of the disks is not available, this message occurs on a node if both operating disks are installed, and you choose to reimage the operating system disks.
Cause: On Windows platforms, the Oracle Appliance Manager configurator uses the echo service on port 7 to contact the gateway. If the echo service is disabled, possibly for security reasons, the ping fails.
Cause: Oracle Database Appliance operating system upgrade includes upgrade of Oracle Linux to Unbreakable Enterprise Kernel (UEK). Because Oracle Automatic Storage Management Cluster File System (Oracle ACFS) is not supported on all versions of Oracle Linux, a successful upgrade of the operating system may effectively disable Oracle ACFS.
Upgrade to Oracle Database Appliance 2.2 has three options: --infra, --gi, and --database. The --infra option includes the upgrade from Oracle Linux to UEK. Before the --infra upgrade to 2.2, the operating system is Oracle Linux with 11.2.0.2.x Grid Infrastructure. After the --infra upgrade, the operating system is UEK with 11.2.0.2.x Oracle ACFS, which is not compatible with UEK.
For example, an upgrade to Oracle Linux 2.6.32-300.11.1.el5uek causes reco.acfsvol.acfs and ora.registry.acfs to temporarily go to an OFFLINE state, because 2.6.32-300.11.1.el5uek does not support Oracle 11.2.0.2.x ACFS. However, when Oracle Grid Infrastructure is upgraded to 11.2.0.3.2, these components are online again.
If necessary, use the oakcli manage diagcollect command to collect diagnostic files to send to Oracle Support Services.
If you have a system fault that requires help from Oracle Support Services, then you may need to provide log records to help Oracle Support diagnose your issue.
Collect log file information by running the oakcli manage diagcollect command. This command consolidates information from log files stored on Oracle Database Appliance into a single log file for use by Oracle Support Services. The location of the file is specified in the command output.
This section describes additional tools and commands for diagnosing and troubleshooting problems with Oracle Database Appliance.
Although some of these tools are specific to Oracle Database Appliance, others are tools for all clustered systems.
Oracle Appliance Manager provides access to a number of sophisticated monitoring and reporting tools, some of them derived from standalone tools that require their own syntax and command sets.
The following list briefly describes the ORAchk command and the Disk Diagnostic Tool:
ORAchk
The ORAchk Configuration Audit Tool audits important configuration settings for Oracle RAC two-node deployments in the following categories:
Operating system kernel parameters and packages
RDBMS
Database parameters, and other database configuration settings
Oracle Grid Infrastructure, which includes Oracle Clusterware and Oracle Automatic Storage Management
ORAchk is aware of the entire system. It checks the configuration to indicate if best practices are being followed. For example, ORAchk reviews the system and identifies best practice issues that are specific to Oracle Database Appliance when ORAchk is run by Oracle Appliance Manager. To explore ORAchk on Oracle Database Appliance, use the following command:
oakcli orachk -h
Also review My Oracle Support note 1268927.2, which is available from My Oracle Support.
Disk Diagnostic Tool
Use the Disk Diagnostic Tool to help identify the cause of disk problems. The tool produces a list of 14 disk checks for each node. To run the tool, enter the following command:
# oakcli stordiag resource_type
Trace File Analyzer (TFA) Collector simplifies diagnostic data collection on Oracle Grid Infrastructure and Oracle Real Application Clusters systems.
TFA behaves in a similar manner to the diagcollection utility packaged with Oracle Clusterware. Both tools collect and package diagnostic data. However, TFA is much more powerful than diagcollection, because TFA centralizes and automates the collection of diagnostic information.
TFA provides the following key benefits and options:
Encapsulation of diagnostic data collection for all Oracle Grid Infrastructure and Oracle RAC components on all cluster nodes into a single command, which you run from a single node
Option to "trim" diagnostic files during data collection to reduce data upload size
Options to isolate diagnostic data collection to a given time period, and to a particular product component, such as Oracle ASM, RDBMS, or Oracle Clusterware
Centralization of collected diagnostic output to a single node in Oracle Database Appliance, if desired
On-Demand Scans of all log and trace files for conditions indicating a problem
Real-Time Scan Alert Logs for conditions indicating a problem (for example, Database Alert Logs, Oracle ASM Alert Logs, and Oracle Clusterware Alert Logs)
See Also:
Refer to My Oracle Support note 1513912.1 "TFA Collector - Tool for Enhanced Diagnostic Gathering" for more information. https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=1513912.1
The Oracle Database Appliance Hardware Monitoring Tool displays the status of different hardware components in Oracle Database Appliance server nodes.
The tool is implemented with the Trace File Analyzer collector. Use the tool both on bare-metal and on virtualized systems.
You can see the list of monitored components by running the command oakcli show -h.
To see information about specific components, use the command syntax oakcli show component, where component is the hardware component that you want to query. For example, the command oakcli show power shows information specifically about the Oracle Database Appliance power supply:
oakcli show power
NAME            HEALTH  HEALTH DETAILS  PART_NO.  SERIAL_NO.          LOCATION  INPUT POWER  OUTPUT POWER  INLET TEMP       EXHAUST TEMP
Power Supply_0  OK      -               7047410   476856F+1242CE0020  PS0       Present      88 watts      31.250 degree C  34.188 degree C
Power Supply_1  OK      -               7047410   476856F+1242CE004J  PS1       Present      66 watts      31.250 degree C  34.188 degree C
Note:
Oracle Database Appliance Server Hardware Monitoring Tool is enabled during initial startup of ODA_BASE on Oracle Database Appliance Virtualized Platform. When it starts, the tool collects base statistics for about 5 minutes. During this time, the tool displays the message "Gathering Statistics…".
The Oracle Database Appliance Hardware Monitoring Tool reports information only for the node on which you run the command. The information it displays in the output depends on the component that you select to review.
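Because the report covers only the local node, a monitoring script would typically run the command on each node and parse the rows. The following Python sketch parses rows shaped like the oakcli show power sample above. It is deliberately narrow: it assumes that the health details column is a single token (as in the sample, where it is "-") and that the units are printed as "watts" and "degree C". The names PowerSupply and parse_power_rows are hypothetical.

```python
import re
from typing import List, NamedTuple


class PowerSupply(NamedTuple):
    name: str
    health: str
    location: str
    output_watts: int
    inlet_temp_c: float
    exhaust_temp_c: float


# One row of the sample report; columns are separated by whitespace.
ROW_RE = re.compile(
    r"^(?P<name>Power Supply_\d+)\s+(?P<health>\S+)\s+(?P<details>\S+)\s+"
    r"(?P<part>\S+)\s+(?P<serial>\S+)\s+(?P<location>\S+)\s+(?P<input>\S+)\s+"
    r"(?P<output>\d+)\s+watts\s+(?P<inlet>[\d.]+)\s+degree C\s+"
    r"(?P<exhaust>[\d.]+)\s+degree C\s*$"
)


def parse_power_rows(text: str) -> List[PowerSupply]:
    """Parse the data rows; the header and any unrecognized lines are skipped."""
    supplies: List[PowerSupply] = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue
        supplies.append(
            PowerSupply(
                name=match.group("name"),
                health=match.group("health"),
                location=match.group("location"),
                output_watts=int(match.group("output")),
                inlet_temp_c=float(match.group("inlet")),
                exhaust_temp_c=float(match.group("exhaust")),
            )
        )
    return supplies


# Rows copied from the sample report in this section.
SAMPLE = """\
NAME HEALTH HEALTH DETAILS PART_NO. SERIAL_NO. LOCATION INPUT POWER OUTPUT POWER INLET TEMP EXHAUST TEMP
Power Supply_0 OK - 7047410 476856F+1242CE0020 PS0 Present 88 watts 31.250 degree C 34.188 degree C
Power Supply_1 OK - 7047410 476856F+1242CE004J PS1 Present 66 watts 31.250 degree C 34.188 degree C
"""

supplies = parse_power_rows(SAMPLE)
unhealthy = [s.name for s in supplies if s.health != "OK"]
print(len(supplies), unhealthy)
```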