The Oracle PCA controller software contains a monitoring service, which is started and stopped with the ovca service on the active management node. When the system runs for the first time, it creates an inventory database and a monitor database. Once these are set up and the monitoring service is active, health information about the hardware components is updated continuously.
The inventory database is populated with information about the various components installed in the rack, including the IP addresses to be used for monitoring. With this information, the ping manager pings all known components every 3 minutes and updates the inventory database to indicate whether a component is pingable and when it was last seen online. When errors occur they are logged in the monitor database. Error information is retrieved from the component ILOMs.
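The ping cycle described above can be illustrated with a minimal Python sketch. It is not the actual ping manager: the record fields `pingable` and `last_seen`, the function names, and the use of the Linux `ping` command with `-c`/`-W` flags are all assumptions made for illustration. Only the 3-minute interval and the two pieces of recorded state come from the text.

```python
import subprocess
from datetime import datetime, timezone

# The text states that all known components are pinged every 3 minutes.
PING_INTERVAL_SECONDS = 180


def is_pingable(ip):
    """Send one ICMP echo request using the system ping (Linux flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(inventory, ping=is_pingable, now=None):
    """Run one ping pass and update each (hypothetical) inventory record in place.

    inventory maps an IP address to a dict with 'pingable' and 'last_seen' keys.
    """
    now = now or datetime.now(timezone.utc)
    for ip, record in inventory.items():
        reachable = ping(ip)
        record["pingable"] = reachable
        if reachable:
            # last_seen is only refreshed when the component answers,
            # so it keeps the time the component was last online.
            record["last_seen"] = now
    return inventory
```

A caller would invoke `sweep(inventory)` in a loop and sleep for `PING_INTERVAL_SECONDS` between passes. Passing the `ping` function as a parameter keeps the update logic testable without network access.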
For troubleshooting purposes, historical health status details can be retrieved through the CLI support mode by an authorized Oracle Field Engineer. In support mode, the CLI offers a number of additional commands, two of which display the contents of the health monitoring databases.
Use show db inventory to display component health status information from the inventory database.
Use show db monitor to display errors logged in the monitor database.
The appliance administrator can retrieve current component health status information through the Oracle PCA CLI at any time by means of the diagnose command.
Checking the Current Health Status of an Oracle PCA Installation
Using SSH and an account with superuser privileges, log in to the active management node.
Note: The default root password is Welcome1.
# ssh root@10.100.1.101
root@10.100.1.101's password:
[root@ovcamn05r1 ~]#
Launch the Oracle PCA command line interface.
# pca-admin
Welcome to PCA! Release: 2.1.1
PCA>
Check the current status of the rack components by querying their ILOMs.
PCA> diagnose ilom
Checking ILOM health............please wait..

IP_Address      Status          Health_Details
----------      ------          --------------
192.168.4.129   Not Connected
192.168.4.128   Not Connected
192.168.4.127   Not Connected
192.168.4.126   Not Connected
192.168.4.125   Not Connected
192.168.4.124   Not Connected
192.168.4.123   Not Connected
192.168.4.122   Not Connected
192.168.4.121   Not Connected
192.168.4.120   Not Connected
192.168.4.101   OK
192.168.4.102   OK
192.168.4.105   Faulty          Mon Nov 25 14:17:37 2013 Power PS1 (Power Supply 1)
                                A loss of AC input to a power supply has occurred.
                                (Probability: 100, UUID: 2c1ec5fc-ffa3-c768-e602-ca12b86e3ea1,
                                Part Number: 07047410, Serial Number: 476856F+1252CE027X,
                                Reference Document: http://www.sun.com/msg/SPX86-8003-73)
192.168.4.107   OK
192.168.4.106   OK
192.168.4.109   OK
192.168.4.108   OK
192.168.4.112   OK
192.168.4.113   Not Connected
192.168.4.110   OK
192.168.4.111   OK
192.168.4.116   Not Connected
192.168.4.117   Not Connected
192.168.4.114   Not Connected
192.168.4.115   Not Connected
192.168.4.118   Not Connected
192.168.4.119   Not Connected
-----------------
27 rows displayed

Status: Success
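When the report is long, it can help to filter it for faulty components. The following Python sketch parses captured diagnose ilom output. It assumes the output has been saved as text with one component per line, that wrapped health details continue on indented lines, and that the status values are limited to the three seen in the sample above (OK, Faulty, Not Connected). The function names are hypothetical and not part of Oracle PCA.

```python
import re

# One table row: IPv4 address, one of the statuses seen in the sample,
# then optional health details.
ROW = re.compile(
    r"^(\d{1,3}(?:\.\d{1,3}){3})\s+(Not Connected|OK|Faulty)\s*(.*)$"
)


def parse_ilom_report(text):
    """Return a list of {'ip', 'status', 'details'} dicts from captured output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            ip, status, details = match.groups()
            rows.append({"ip": ip, "status": status, "details": details.strip()})
        elif rows and line[:1].isspace() and line.strip():
            # Assumed layout: indented lines continue the previous row's details.
            joined = rows[-1]["details"] + " " + line.strip()
            rows[-1]["details"] = joined.strip()
    return rows


def faulty(rows):
    """Keep only the components whose ILOM reports a fault."""
    return [row for row in rows if row["status"] == "Faulty"]
```

Applied to the sample report, `faulty(parse_ilom_report(text))` would return the single entry for 192.168.4.105 with the power supply fault text in `details`.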
Verify that the Oracle PCA controller software is fully operational.
PCA> diagnose software
PCA Software Acceptance Test runner utility
Test - 701 - OpenSSL CVE-2014-0160 Heartbleed bug Acceptance [PASSED]
Test - 785 - PCA package Acceptance [PASSED]
Test - 1083 - Mgmt node xsigo network interface Acceptance [PASSED]
Test - 787 - Shared Storage Acceptance [PASSED]
Test - 973 - Simple connectivity Acceptance [PASSED]
Test - 1078 - Test for ovs-agent service on CNs Acceptance [PASSED]
Test - 1079 - Test for shares mounted on CNs Acceptance [PASSED]
Test - 1080 - ovs-log check Acceptance [PASSED]
Test - 788 - PCA services Acceptance [PASSED]
Test - 789 - PCA config file Acceptance [PASSED]
Test - 1300 - All compute nodes running Acceptance [PASSED]
Test - 1318 - Check support packages in PCA image Acceptance [PASSED]
Test - 928 - Repositories defined in OVM manager Acceptance [PASSED]
Test - 1107 - Compute node xsigo network interface Acceptance [PASSED]
Test - 1316 - PCA version Acceptance [PASSED]
Test - 1117 - Network interfaces check Acceptance [PASSED]
Test - 824 - OVM manager settings Acceptance [PASSED]
Test - 927 - OVM server model Acceptance [PASSED]
Test - 925 - PCA log Acceptance [PASSED]
Test - 926 - Networks defined in OVM manager for CNs Acceptance [PASSED]
Test - 822 - Compute node network interface Acceptance [PASSED]
Status: Success
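The acceptance test listing can be checked programmatically in the same way. This Python sketch assumes each result appears on its own line in the form "Test - <id> - <name> Acceptance [<outcome>]", as in the sample above. Only the PASSED outcome appears in that sample, so the sketch simply treats any other bracketed outcome as a failure; the function names are hypothetical.

```python
import re

# One result line, e.g. "Test - 785 - PCA package Acceptance [PASSED]".
TEST_LINE = re.compile(r"^Test - (\d+) - (.+?) Acceptance \[(\w+)\]$")


def parse_acceptance_tests(text):
    """Return (test_id, name, outcome) tuples from captured output."""
    results = []
    for line in text.splitlines():
        match = TEST_LINE.match(line.strip())
        if match:
            test_id, name, outcome = match.groups()
            results.append((int(test_id), name, outcome))
    return results


def failed_tests(results):
    """Return every test whose outcome is anything other than PASSED."""
    return [result for result in results if result[2] != "PASSED"]
```

An empty list from `failed_tests` corresponds to the fully operational state shown in the sample output.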
Close the CLI.
PCA> exit