- 3.2.1 Do Not Reconfigure Network During Compute Node Provisioning or Upgrade
- 3.2.2 Nodes Attempt to Synchronize Time with the Wrong NTP Server
- 3.2.3 Unknown Symbol Warning during InfiniBand Driver Installation
- 3.2.4 Node Manager Does Not Show Node Offline Status
- 3.2.5 Update Functionality Not Available in Dashboard
- 3.2.6 Interrupting Download of Software Update Leads to Inconsistent Image Version and Leaves Image Mounted and Stored in Temporary Location
- 3.2.7 Compute Nodes Lose Oracle VM iSCSI LUNs During Software Update
- 3.2.8 Virtual Machine File Systems Become Read-Only after Storage Head Failover
- 3.2.9 Oracle VM Manager Tuning Settings Are Lost During Software Update
- 3.2.10 Oracle VM Manager Fails to Restart after Restoring a Backup Due to Password Mismatch
- 3.2.11 Oracle VM Java Processes Consume Large Amounts of Resources
- 3.2.12 External Storage Cannot Be Discovered Over Data Center Network
- 3.2.13 High Network Load with High MTU May Cause Time-Out and Kernel Panic in Compute Nodes
- 3.2.14 Oracle PCA Dashboard URL Is Not Redirected
- 3.2.15 User Interface Does Not Support Internet Explorer 10 and 11
- 3.2.16 Authentication Error Prevents Oracle VM Manager Login
- 3.2.17 Error Getting VM Stats in Oracle VM Agent Logs
- 3.2.18 Virtual Machine with High Availability Takes Five Minutes to Restart when Failover Occurs
- 3.2.19 The CLI Command show Accepts Non-Existent Targets As Parameters
- 3.2.20 Deleting a Running Task in the CLI Results in Errors
- 3.2.21 The CLI Command diagnose software Fails To Correctly Run Some Diagnostic Tests
This section describes software-related limitations and workarounds.
In the Oracle PCA Dashboard, the Network Setup tab becomes available when the first compute node has been provisioned successfully. However, when installing and provisioning a new system, you must wait until all nodes have completed the provisioning process before changing the network configuration. Also, when provisioning new nodes at a later time, or when upgrading the environment, do not apply a new network configuration before all operations have completed. Failure to follow these guidelines is likely to leave your environment in an indeterminate state.
Workaround: Before reconfiguring the system network settings, make sure that no provisioning or upgrade processes are running.
Bug 17475738
External time synchronization, based on ntpd, is left in default configuration at the factory. As a result, NTP does not work when you first power on the Oracle PCA, and you may find messages in system logs similar to these:
Oct 1 11:20:33 ovcamn06r1 kernel: o2dlm: Joining domain ovca ( 0 1 ) 2 nodes
Oct 1 11:20:53 ovcamn06r1 ntpd_initres[3478]: host name not found:0.rhel.pool.ntp.org
Oct 1 11:20:58 ovcamn06r1 ntpd_initres[3478]: host name not found:1.rhel.pool.ntp.org
Oct 1 11:21:03 ovcamn06r1 ntpd_initres[3478]: host name not found:2.rhel.pool.ntp.org
Workaround: Apply the appropriate network configuration for your data center environment, as described in the section Network Setup in the Oracle Private Cloud Appliance Administrator's Guide. When the data center network configuration is applied successfully, the default values for NTP configuration are overwritten and components will synchronize their clocks with the source you entered.
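To confirm whether a node is still trying to reach the factory-default NTP pool, the log can be scanned for host names that ntpd failed to resolve. The following is a minimal Python sketch, not an Oracle-provided tool: the function name unresolved_ntp_hosts is hypothetical, and the pattern is derived only from the sample messages shown above, so other ntpd versions may word the message differently.

```python
import re

# Pattern derived from the sample log lines in this section (assumption:
# the message wording is the same on the system being checked).
NTP_LOOKUP_FAILURE = re.compile(
    r"ntpd_initres\[\d+\]: host name not found:\s*(\S+)"
)


def unresolved_ntp_hosts(log_text):
    """Return the NTP host names ntpd could not resolve, in order, without duplicates."""
    hosts = []
    for match in NTP_LOOKUP_FAILURE.finditer(log_text):
        host = match.group(1)
        if host not in hosts:
            hosts.append(host)
    return hosts


if __name__ == "__main__":
    sample = (
        "Oct 1 11:20:53 ovcamn06r1 ntpd_initres[3478]: "
        "host name not found:0.rhel.pool.ntp.org\n"
        "Oct 1 11:20:58 ovcamn06r1 ntpd_initres[3478]: "
        "host name not found:1.rhel.pool.ntp.org\n"
    )
    for host in unresolved_ntp_hosts(sample):
        print(host)
```

An empty result after the data center network configuration has been applied would suggest that the default pool entries are no longer in use.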
Bug 17548941
Towards the end of the management node install.log file, the following warnings appear:
> WARNING:
> /lib/modules/2.6.39-300.32.1.el6uek.x86_64/kernel/drivers/infiniband/ \
> hw/ipath/ib_ipath.ko needs unknown symbol ib_wq
> WARNING:
> /lib/modules/2.6.39-300.32.1.el6uek.x86_64/kernel/drivers/infiniband/ \
> hw/qib/ib_qib.ko needs unknown symbol ib_wq
> WARNING:
> /lib/modules/2.6.39-300.32.1.el6uek.x86_64/kernel/drivers/infiniband/ \
> ulp/srp/ib_srp.ko needs unknown symbol ib_wq
> *** FINISHED INSTALLING PACKAGES ***
These warnings have no adverse effects and may be disregarded.
Bug 16946511
The role of the Node Manager database is to track the various states a compute node goes through during provisioning. After successful provisioning the database continues to list a node as running, even if it is shut down. For nodes that are fully operational, the server status is tracked by Oracle VM Manager. However, the Oracle PCA Dashboard displays status information from the Node Manager. This may lead to inconsistent information between the Dashboard and Oracle VM Manager, but it is not considered a bug.
Workaround: To verify the status of operational compute nodes, use the Oracle VM Manager user interface.
Bug 17456373
The Oracle PCA Dashboard cannot be used to perform an update of the software stack.
Workaround: Use the command line tool pca-updater to update the software stack of your Oracle PCA. For details, refer to the section Oracle Private Cloud Appliance Software Update in the Oracle Private Cloud Appliance Administrator's Guide. For step-by-step instructions, refer to the section Update. You can use SSH to log in to each management node and check /etc/pca-info for log entries indicating restarted services and new software revisions.
Bug 17476010, 17475976 and 17475845
The first step of the software update process is to download an image file, which is unpacked in a particular location on the ZFS storage appliance. When the download is interrupted, the file system is not cleaned up or rolled back to a previous state. As a result, contents from different versions of the software image may end up in the source location from where the installation files are loaded. In addition, the downloaded *.iso file remains stored in /tmp and is not unmounted. If downloads are frequently started and stopped, this could cause the system to run out of free loop devices to mount the *.iso files, or even to run out of free space.
Workaround: The files left behind by previous downloads do not prevent you from running the update procedure again and restarting the download. Download a new software update image. When it completes successfully you can install the new version of the software, as described in the section Update in the Oracle Private Cloud Appliance Administrator's Guide.
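To see how many image files earlier interrupted downloads have left behind, the temporary location can be inspected before starting a new download. The sketch below is illustrative only: the function name leftover_iso_files is hypothetical, it assumes the images sit directly in /tmp as described above, and it only reports files; it does not unmount or delete anything.

```python
import glob
import os


def leftover_iso_files(directory="/tmp"):
    """Return (path, size_in_bytes) for every *.iso file directly inside `directory`."""
    results = []
    for path in sorted(glob.glob(os.path.join(directory, "*.iso"))):
        if os.path.isfile(path):
            results.append((path, os.path.getsize(path)))
    return results


if __name__ == "__main__":
    for path, size in leftover_iso_files():
        # Report the size in MiB so large images are easy to spot.
        print("%s  %.1f MiB" % (path, size / (1024.0 * 1024.0)))
```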
Bug 18352512
Several iSCSI LUNs, including the essential server pool file system, are mapped on each compute node. When you update the appliance software, one or more LUNs may be missing on certain compute nodes. In addition, there may be problems with the configuration of the clustered server pool, preventing the existing compute nodes from joining the pool and resuming correct operation after the software update.
Workaround: To avoid these software update issues, upgrade all previously provisioned compute nodes by following the procedure described in the section Upgrading Existing Compute Node Configuration from Release 1.0.2 in the Oracle Private Cloud Appliance Administrator's Guide.
Bugs 17922555, 18459090, 18433922 and 18397780
When a failover occurs between the storage heads of the Oracle PCA internal ZFS storage appliance, or an externally connected ZFS storage appliance, the file systems used by virtual machines may become read-only, preventing normal VM operation. Compute nodes may also hang or crash as a result.
Workaround: There is no documented workaround to prevent the issue. Once the storage head failover has completed, you can reboot the virtual machines to bring them back online in read-write mode.
Bugs 19324312 and 19670873
During the Oracle PCA software update from Release 1.0.2 to Release 1.1.x, the specific tuning settings for Oracle VM Manager may not be applied correctly, in which case default settings are used instead.
Workaround: Verify the Oracle VM Manager tuning settings and re-apply them if necessary. Follow the instructions in the section Verifying and Re-applying Oracle VM Manager Tuning after Software Update in the Oracle Private Cloud Appliance Administrator's Guide.
Bug 18477228
If you have changed the Oracle PCA password in the Dashboard, and need to restore the Oracle VM Manager from a backup that was made prior to the password change, the passwords will be out of sync, and Oracle VM Manager cannot be started because it cannot connect to its database. In this case, you need to make sure that the actual database password is also restored in the Oracle WebLogic Server JDBC connection configuration used by Oracle VM Manager. It is important to keep the password entries in the Oracle PCA Wallet up-to-date as well, although it is not the cause of this particular bug.
Workaround: After restoring the Oracle VM Manager database, also restore the database connection settings for Oracle VM Manager. The Oracle WebLogic Server JDBC connection must be configured to use the password that was in use at the time of the database backup. Make sure that the database entry in the Oracle PCA Wallet matches this password.
After synchronizing the passwords to access the Oracle VM Manager MySQL database, restart the ovca service from the master management node command line as follows: service ovca restart.
The database restore procedure is described in the section entitled Restoring the MySQL Database for Oracle VM Manager in the Oracle VM Installation and Upgrade Guide.
The database password used by Oracle VM Manager can be restored by extracting this file from the backup: ovmm/wls/config/jdbc/OVMDS-6373-jdbc.xml. The file must be extracted to this location: /nfs/shared_storage/wls/config/jdbc/OVMDS-6373-jdbc.xml. Since the jdbc directory is symlinked from /u01/app/oracle/ovm-manager-3/machine1/base_adf_domain/config/jdbc/, the file only needs to be extracted on one of the management nodes.
For instructions to manually update the passwords stored in the Wallet, refer to the section entitled Replacing Default Passwords Manually in the Oracle Private Cloud Appliance Administrator's Guide.
Bug 19333583
Particularly in environments with a large number of virtual machines, and when many virtual machine operations – such as start, stop, save, restore or migrate – occur in a short time, the Java processes of Oracle VM may consume a lot of CPU and memory capacity on the master management node. Users will notice the browser and command line interfaces becoming very slow or unresponsive. This behavior is likely caused by a memory leak in the Oracle VM CLI.
Workaround: A possible remedy is to restart the Oracle VM CLI from the Oracle Linux shell on the master management node.
# /u01/app/oracle/ovm-manager-3/ovm_cli/bin/stopCLIMain.sh
# nohup /u01/app/oracle/ovm-manager-3/ovm_cli/bin/startCLIMain.sh&
Bug 18965916
The default compute node configuration does not allow connectivity to additional storage resources in the data center network. Compute nodes are connected to the data center subnet to enable public connectivity for the virtual machines they host, but the compute nodes' physical network interfaces have no IP address in that subnet. Consequently, SAN or file server discovery will fail.
Bug 17508885
When network throughput is very high, certain conditions, like a large number of MTU 9000 streams, have been known to cause a kernel panic in a compute node. In that case, /var/log/messages on the affected compute node contains entries like "Task Python:xxxxx blocked for more than 120 seconds". As a result, HA virtual machines may not have been migrated in time to another compute node. Usually compute nodes return to their normal operation automatically.
Workaround: If HA virtual machines have not been live-migrated off the affected compute node, log into Oracle VM Manager and restart the virtual machines manually. If an affected compute node does not return to normal operation, restart it from Oracle VM Manager.
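To find out whether a compute node has logged blocked-task entries of the kind described above, the contents of /var/log/messages can be searched for that message. This is a hedged sketch: the function name blocked_tasks is hypothetical, and the pattern follows the wording quoted in this section, matched case-insensitively because the exact capitalization on a given kernel is an assumption.

```python
import re

# Wording taken from the entry quoted in this section; "xxxxx" there stands
# for a numeric process ID (assumption).
BLOCKED_TASK = re.compile(
    r"task (\S+?):(\d+) blocked for more than (\d+) seconds",
    re.IGNORECASE,
)


def blocked_tasks(log_text):
    """Return (task_name, pid, seconds) for each blocked-task entry in the log text."""
    return [
        (match.group(1), int(match.group(2)), int(match.group(3)))
        for match in BLOCKED_TASK.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = "kernel: Task Python:12345 blocked for more than 120 seconds\n"
    for name, pid, seconds in blocked_tasks(sample):
        print("%s (pid %d) blocked for more than %d seconds" % (name, pid, seconds))
```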
Bugs 20981004 and 21119672
Before the product name change from Oracle Virtual Compute Appliance to Oracle Private Cloud Appliance, the Oracle PCA Dashboard could be accessed at https://<manager-vip>:7002/ovca. As of Release 2.0.5, the URL ends in /dashboard instead. However, there is no redirect from /ovca to /dashboard.
Workaround: Enter the correct URL: https://<manager-vip>:7002/dashboard.
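If old Dashboard links are stored in bookmarks or scripts, they can be rewritten in bulk instead of corrected by hand. The following Python 3 sketch illustrates the idea; the function name to_dashboard_url and the host name manager-vip.example.com are hypothetical, and only a path of exactly /ovca is rewritten.

```python
from urllib.parse import urlsplit, urlunsplit


def to_dashboard_url(url):
    """Rewrite a Dashboard URL ending in /ovca so that it ends in /dashboard.

    URLs with any other path are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.path.rstrip("/") == "/ovca":
        parts = parts._replace(path="/dashboard")
    return urlunsplit(parts)


if __name__ == "__main__":
    # Hypothetical host name standing in for <manager-vip>.
    print(to_dashboard_url("https://manager-vip.example.com:7002/ovca"))
```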
Bug 21199163
Oracle PCA Release 2.0.1 uses the Oracle Application Development Framework (ADF) version 11.1.1.2.0 for both the Dashboard and the Oracle VM Manager user interface. This version of ADF does not support Microsoft Internet Explorer 10 or 11.
Workaround: Use Internet Explorer 9 or a different web browser, such as Mozilla Firefox.
Bug 18791952
In environments with a large number of virtual machines and frequent connections through the VM console of Oracle VM Manager, the browser UI login to Oracle VM Manager may fail with an "unexpected error during login". A restart of the ovmm service is required.
Workaround: From the Oracle Linux shell of the master management node, restart the ovmm service by entering the command service ovmm restart. You should now be able to log into Oracle VM Manager again.
Bug 19562053
During the upgrade to Oracle PCA Software Release 2.0.4 a new version of the Xen hypervisor is installed on the compute nodes. While the upgrade is in progress, entries may appear in the ovs-agent.log files on the compute nodes indicating that xen commands are not executed properly ("Error getting VM stats"). This is a benign and temporary condition resolved by the compute node reboot at the end of the upgrade process. No workaround is required.
Bug 20901778
The compute nodes in an Oracle PCA are all placed in a single clustered server pool, which is created as part of the provisioning process. One of the configuration parameters is the cluster time-out: the time a server is allowed to be unavailable before failover events are triggered. To avoid false positives, and thus unwanted failovers, the Oracle PCA server pool time-out is set to 300 seconds. As a consequence, a virtual machine configured with high availability (HA VM) can be unavailable for 5 minutes when its host fails. After the cluster time-out has passed, the HA VM is automatically restarted on another compute node in the server pool.
This behavior is as designed; it is not a bug. The server pool cluster configuration causes the delay in restarting VMs after a failover has occurred.
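The relationship between the 300-second time-out and the five-minute delay can be worked through in a few lines. The sketch below is only an illustration of the arithmetic: the function name earliest_ha_restart is hypothetical, and it assumes that the restart cannot begin before the full cluster time-out has elapsed after the host failure.

```python
from datetime import datetime, timedelta

CLUSTER_TIMEOUT_SECONDS = 300  # server pool time-out stated in this section


def earliest_ha_restart(host_failure_time, timeout_seconds=CLUSTER_TIMEOUT_SECONDS):
    """Return the earliest time an HA VM can be restarted after its host fails."""
    return host_failure_time + timedelta(seconds=timeout_seconds)


if __name__ == "__main__":
    failure = datetime(2015, 6, 10, 6, 22, 12)  # example timestamp, not from a real log
    print("Time-out in minutes:", CLUSTER_TIMEOUT_SECONDS / 60.0)
    print("Earliest restart:", earliest_ha_restart(failure))
```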
The CLI command show expects a target parameter to be specified to indicate the target object for which information should be displayed. For most commands you can use tab-completion to determine what target objects are available for use as a parameter. However, if you enter a target object that does not exist, the command completes successfully but does not return any useful information. For example:
PCA> show cloud-wwpn blah
----------------------------------------
Cloud_Name           blah
WWPN_List
----------------------------------------
Status: Success
When using the show rack-layout command on an x4-2 rack, tab-completion may indicate that the rack name is 'x3-2_base'. This is a misnomer for the rack; however, the command works as expected.
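Because show reports Status: Success even for a target that does not exist, a script that captures the CLI output cannot rely on the status line alone. One possible check is to look for fields that come back empty, as WWPN_List does in the show cloud-wwpn example. The sketch below is an assumption-laden illustration: the function names are hypothetical, it assumes one field per line between two dashed rules, and an empty field is only a hint, since a valid target may legitimately have empty fields.

```python
def parse_show_output(output):
    """Parse the 'Name value' rows between the dashed rules of captured show output."""
    fields = {}
    inside = False
    for line in output.splitlines():
        stripped = line.strip()
        # A line made up only of dashes opens or closes the field block.
        if stripped and set(stripped) == {"-"}:
            inside = not inside
            continue
        if inside and stripped:
            parts = stripped.split(None, 1)
            fields[parts[0]] = parts[1] if len(parts) > 1 else ""
    return fields


def empty_fields(output):
    """Return the sorted names of fields that have no value."""
    return sorted(name for name, value in parse_show_output(output).items() if not value)


if __name__ == "__main__":
    sample = "\n".join([
        "PCA> show cloud-wwpn blah",
        "----------------------------------------",
        "Cloud_Name           blah",
        "WWPN_List",
        "----------------------------------------",
        "Status: Success",
    ])
    print(empty_fields(sample))
```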
Bug 19679777
A task may take a while to complete, in which case it appears as "Running" if you display the task list in the CLI. While a task is running, it is possible to delete it using the CLI. The delete operation succeeds, but error messages appear at the CLI prompt a few minutes later. For example:
PCA> backup
Task_ID                          Status  Progress Start_Time
-------                          ------  -------- ----------
1a553fae7ede40a7ac110fff557f2590 RUNNING        0 06-10-2015 06:22:12
---------------
1 row displayed
Status: Success
PCA> delete task 1a553fae7ede40a7ac110fff557f2590
************************************************************
WARNING !!! THIS IS A DESTRUCTIVE OPERATION.
************************************************************
Are you sure [y/N]:y
Status: Success
PCA>
PCA>
PCA> Traceback (most recent call last):
  File "/usr/lib64/python2.6/logging/__init__.py", line 776, in emit
    msg = self.format(record)
  File "/usr/lib64/python2.6/logging/__init__.py", line 654, in format
    return fmt.format(record)
  File "/usr/lib64/python2.6/logging/__init__.py", line 436, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.6/logging/__init__.py", line 306, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Do not delete running tasks. Although the risk of irreparable damage is minimal, deleting a running task may have adverse effects.
Bug 21231788
The CLI command diagnose software runs several individual tests, which execute various scripts from the management node where the command is entered. There is a known issue with this mechanism where some of these scripts fail to run due to a missing SSH key. Therefore, it is recommended that you do not run this command for diagnostic purposes until the issue has been resolved in a forthcoming release.
If required, a workaround is available, which involves exchanging the ssh keys between the live management node and itself. This can be achieved by running the following command on the management node:
[root@ovcamn05r1 ~]# ssh-copy-id root@192.168.4.3
The authenticity of host '192.168.4.3 (192.168.4.3)' can't be established.
RSA key fingerprint is 4e:33:d2:d1:2c:43:7f:f1:74:3f:42:b3:83:78:22:78.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.4.3' (RSA) to the list of known hosts.
root@192.168.4.3's password:
If you implement this workaround, you should check that you are able to ssh into the management node from itself, and that /root/.ssh/authorized_keys does not contain any additional keys that you were not expecting to be added.
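The check for unexpected keys can be made less error-prone by comparing the file against a list of keys you know should be present. The sketch below is a minimal illustration with hypothetical function names. It assumes the simple "type key-data comment" line layout with no leading options field, and the key data in the example are placeholders, not real keys.

```python
def parse_authorized_keys(text):
    """Return (key_type, key_data, comment) for each key line in the file text.

    Blank lines and lines starting with '#' are skipped. Lines that carry a
    leading options field are not handled by this sketch (assumption).
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # not a recognizable key line
        comment = fields[2] if len(fields) > 2 else ""
        entries.append((fields[0], fields[1], comment))
    return entries


def unexpected_keys(text, expected_key_data):
    """Return the entries whose key data is not in the set of expected keys."""
    return [
        entry for entry in parse_authorized_keys(text)
        if entry[1] not in expected_key_data
    ]


if __name__ == "__main__":
    # Placeholder key data for illustration only.
    sample = (
        "ssh-rsa AAAAEXPECTED root@ovcamn05r1\n"
        "ssh-rsa AAAAUNKNOWN someone@elsewhere\n"
    )
    for key_type, key_data, comment in unexpected_keys(sample, {"AAAAEXPECTED"}):
        print("Unexpected key:", key_type, comment)
```

On a management node, the input would be the contents of /root/.ssh/authorized_keys, and the expected set would hold the key data you have verified by other means.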
Bug 19667855