The Oracle Private Cloud Appliance Controller Software contains a monitoring service, which is started and stopped with the ovca service on the active management node. When the system runs for the first time, it creates an inventory database and a monitor database. Once these are set up and the monitoring service is active, health information about the hardware components is updated continuously.
The inventory database is populated with information about the various components installed in the rack, including the IP addresses to be used for monitoring. With this information, the ping manager pings all known components every 3 minutes and updates the inventory database to indicate whether a component is pingable and when it was last seen online. When errors occur they are logged in the monitor database. Error information is retrieved from the component ILOMs.
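The ping cycle described above can be illustrated with a minimal sketch. This is not the appliance's actual ping manager: the dictionary-based inventory, the function names `ping_once`, `sweep`, and `run_forever`, and the record fields `pingable` and `last_seen` are all hypothetical, and the `ping` flags assume a Linux host. Only the 3-minute interval and the two recorded facts (reachability and last time seen) come from the text.

```python
import subprocess
import time

PING_INTERVAL_SECONDS = 180  # the text states a 3-minute cycle


def ping_once(ip_address):
    """Return True if a single ICMP echo request is answered (Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", ip_address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(inventory, ping=ping_once, now=time.time):
    """Ping every known component once and update its record in place.

    `inventory` is a hypothetical stand-in for the inventory database:
    a dict mapping IP address -> {"pingable": ..., "last_seen": ...}.
    """
    for ip_address, record in inventory.items():
        reachable = ping(ip_address)
        record["pingable"] = reachable
        if reachable:
            # last_seen only moves forward when the component answers
            record["last_seen"] = now()
    return inventory


def run_forever(inventory):
    """Repeat the sweep at the documented interval."""
    while True:
        sweep(inventory)
        time.sleep(PING_INTERVAL_SECONDS)
```

Passing `ping` and `now` as parameters keeps the sweep logic testable without network access; a component that stops answering keeps its previous `last_seen` value, which matches the idea of recording when it was last seen online.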
For troubleshooting purposes, historical health status details can be retrieved through the CLI support mode by an authorized Oracle Field Engineer. When the CLI is used in support mode, a number of additional commands are available, two of which display the contents of the health monitoring databases.
Use show db inventory to display component health status information from the inventory database.
Use show db monitor to display errors logged in the monitor database.
The appliance administrator can retrieve current component health status information through the Oracle Private Cloud Appliance CLI at any time by means of the diagnose command.
Checking the Current Health Status of an Oracle Private Cloud Appliance Installation
Using SSH and an account with superuser privileges, log in to the active management node.
Note: The default root password is Welcome1. For security reasons, you must set a new password at your earliest convenience.
# ssh root@10.100.1.101
root@10.100.1.101's password:
[root@ovcamn05r1 ~]#
Launch the Oracle Private Cloud Appliance command line interface.
# pca-admin
Welcome to PCA! Release: 2.4.1
PCA>
Check the current status of the rack components by querying their ILOMs.
PCA> diagnose ilom
Checking ILOM health............please wait..

IP_Address       Status           Health_Details
----------       ------           --------------
192.168.4.129    Not Connected    None
192.168.4.128    Not Connected    None
192.168.4.127    Not Connected    None
192.168.4.126    Not Connected    None
192.168.4.125    Not Connected    None
192.168.4.124    Not Connected    None
192.168.4.123    Not Connected    None
192.168.4.122    Not Connected    None
192.168.4.121    Not Connected    None
192.168.4.120    Not Connected    None
192.168.4.101    OK               None
192.168.4.103    OK               None
192.168.4.102    OK               None
192.168.4.105    Faulty           Mon Apr 55 14:17:37 2019  Power    PS1 (Power Supply 1)
                                  A loss of AC input to a power supply has occurred.
                                  (Probability: 100, UUID: 2c1ec5fc-ffa3-c768-e602-ca12b86e3ea1,
                                  Part Number: 07047410, Serial Number: 476856F+1252CE027X,
                                  Reference Document: http://www.sun.com/msg/SPX86-8003-73)
192.168.4.104    OK               None
192.168.4.107    OK               None
192.168.4.106    OK               None
192.168.4.109    OK               None
192.168.4.108    OK               None
192.168.4.112    Not Connected    None
192.168.4.113    OK               None
192.168.4.110    OK               None
192.168.4.111    OK               None
192.168.4.116    OK               None
192.168.4.117    OK               None
192.168.4.114    OK               None
192.168.4.115    OK               None
192.168.4.118    OK               None
192.168.4.119    OK               None
-----------------
29 rows displayed

Status: Success
Verify that the Oracle Private Cloud Appliance controller software is fully operational.
PCA> diagnose software
PCA Software Acceptance Test runner utility
Test - 01 - OpenSSL CVE-2014-0160 Heartbleed bug Acceptance [PASSED]
Test - 02 - PCA package Acceptance [PASSED]
Test - 03 - Shared Storage Acceptance [PASSED]
Test - 04 - PCA services Acceptance [PASSED]
Test - 05 - PCA config file Acceptance [PASSED]
Test - 06 - Check PCA DBs exist Acceptance [PASSED]
Test - 07 - Compute node network interface Acceptance [PASSED]
Test - 08 - OVM manager settings Acceptance [PASSED]
Test - 09 - Check management nodes running Acceptance [PASSED]
Test - 10 - Check OVM manager version Acceptance [PASSED]
Test - 11 - OVM server model Acceptance [PASSED]
Test - 12 - Repositories defined in OVM manager Acceptance [PASSED]
Test - 13 - Management Nodes have IPv6 disabled [PASSED]
Test - 14 - Bash Code Injection Vulnerability bug Acceptance [PASSED]
Test - 15 - Check Oracle VM 3.4 xen security update Acceptance [PASSED]
Test - 16 - Test for ovs-agent service on CNs Acceptance [PASSED]
Test - 17 - Test for shares mounted on CNs Acceptance [PASSED]
Test - 18 - All compute nodes running Acceptance [PASSED]
Test - 19 - PCA version Acceptance [PASSED]
Test - 20 - Check support packages in PCA image Acceptance [PASSED]

Status: Success
Note: For additional information about these diagnostic results, look at /var/log/ovca-diagnosis.log. However, note that this health monitoring status information changes frequently as the appliance environment runs. If the system does not perform as expected, use it only as an indication of where a problem might have occurred.
Close the CLI.
PCA> exit
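When the output of the two diagnose commands in this procedure is captured to text, for example from a saved terminal session, it can be summarized with a short script. The following is a rough sketch and not part of the appliance software. The function names are hypothetical. The row patterns are inferred only from the sample output in this procedure: three ILOM statuses (OK, Not Connected, Faulty), and software test lines ending in a bracketed result. Wrapped continuation lines of a Health_Details entry are ignored, and any software result other than PASSED is assumed to indicate a failure.

```python
import re

# Assumed row shape: IPv4 address, one of the three statuses seen in the
# sample output, then free-form health details on the same line.
ILOM_ROW = re.compile(
    r"^(\d{1,3}(?:\.\d{1,3}){3})\s+(Not Connected|OK|Faulty)\s*(.*)$"
)
# Assumed test line shape: "Test - NN - <name> [RESULT]".
SOFTWARE_TEST = re.compile(r"^Test - (\d+) - (.+?)\s+\[(\w+)\]$")


def parse_ilom_report(text):
    """Return one dict per component row; header and footer lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ILOM_ROW.match(line.strip())
        if match:
            ip_address, status, details = match.groups()
            rows.append({"ip": ip_address, "status": status, "details": details})
    return rows


def parse_software_report(text):
    """Return one dict per acceptance test line."""
    results = []
    for line in text.splitlines():
        match = SOFTWARE_TEST.match(line.strip())
        if match:
            number, name, outcome = match.groups()
            results.append(
                {"number": int(number), "name": name, "outcome": outcome}
            )
    return results


def needs_attention(ilom_rows, software_results):
    """List IPs reported as Faulty and names of tests that did not pass."""
    faulty = [row["ip"] for row in ilom_rows if row["status"] == "Faulty"]
    failed = [
        test["name"] for test in software_results if test["outcome"] != "PASSED"
    ]
    return faulty, failed
```

Applied to the sample output in this procedure, `needs_attention` would report 192.168.4.105 as the only faulty component and no failed software tests. Components shown as Not Connected are deliberately not flagged, because the output alone does not say whether they are expected to be present.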