3.2 Oracle Private Cloud Appliance Software

3.2.1 Do Not Reconfigure Network During Compute Node Provisioning or Upgrade
3.2.2 Nodes Attempt to Synchronize Time with the Wrong NTP Server
3.2.3 Unknown Symbol Warning during InfiniBand Driver Installation
3.2.4 Node Manager Does Not Show Node Offline Status
3.2.5 Compute Node State Changes Despite Active Provisioning Lock
3.2.6 Virtual Machines Remain in Running Status when Host Compute Node Is Reprovisioned
3.2.7 Update Functionality Not Available in Dashboard
3.2.8 Interrupting Download of Software Update Leads to Inconsistent Image Version and Leaves Image Mounted and Stored in Temporary Location
3.2.9 Compute Nodes Lose Oracle VM iSCSI LUNs During Software Update
3.2.10 Customer Created LUNs Are Mapped to the Wrong Initiator Group
3.2.11 Virtual Machine File Systems Become Read-Only after Storage Head Failover
3.2.12 Oracle VM Manager Tuning Settings Are Lost During Software Update
3.2.13 Oracle VM Manager Fails to Restart after Restoring a Backup Due to Password Mismatch
3.2.14 Oracle VM Java Processes Consume Large Amounts of Resources
3.2.15 External Storage Cannot Be Discovered Over Data Center Network
3.2.16 High Network Load with High MTU May Cause Time-Out and Kernel Panic in Compute Nodes
3.2.17 Oracle PCA Dashboard URL Is Not Redirected
3.2.18 User Interface Does Not Support Internet Explorer 10 and 11
3.2.19 Mozilla Firefox Cannot Establish Secure Connection with User Interface
3.2.20 Authentication Error Prevents Oracle VM Manager Login
3.2.21 Error Getting VM Stats in Oracle VM Agent Logs
3.2.22 Virtual Machine with High Availability Takes Five Minutes to Restart when Failover Occurs
3.2.23 The CLI Command show Accepts Non-Existent Targets As Parameters
3.2.24 Deleting a Running Task in the CLI Results in Errors
3.2.25 CLI Output Misaligned When Listing Tasks With Different UUID Length
3.2.26 The CLI Command list opus-ports Shows Information About Non-existent Switches

This section describes software-related limitations and workarounds.

3.2.1 Do Not Reconfigure Network During Compute Node Provisioning or Upgrade

In the Oracle PCA Dashboard, the Network Setup tab becomes available when the first compute node has been provisioned successfully. However, when installing and provisioning a new system, you must wait until all nodes have completed the provisioning process before changing the network configuration. Also, when provisioning new nodes at a later time, or when upgrading the environment, do not apply a new network configuration before all operations have completed. Failure to follow these guidelines is likely to leave your environment in an indeterminate state.

Workaround: Before reconfiguring the system network settings, make sure that no provisioning or upgrade processes are running.

Bug 17475738

3.2.2 Nodes Attempt to Synchronize Time with the Wrong NTP Server

External time synchronization, based on ntpd, is left in its default configuration at the factory. As a result, NTP does not work when you first power on the Oracle PCA, and you may find messages in system logs similar to these:

Oct  1 11:20:33 ovcamn06r1 kernel: o2dlm: Joining domain ovca ( 0 1 ) 2 nodes
Oct  1 11:20:53 ovcamn06r1 ntpd_initres[3478]: host name not found:0.rhel.pool.ntp.org
Oct  1 11:20:58 ovcamn06r1 ntpd_initres[3478]: host name not found:1.rhel.pool.ntp.org
Oct  1 11:21:03 ovcamn06r1 ntpd_initres[3478]: host name not found:2.rhel.pool.ntp.org

Workaround: Apply the appropriate network configuration for your data center environment, as described in the section Network Setup in the Oracle Private Cloud Appliance Administrator's Guide. When the data center network configuration is applied successfully, the default values for NTP configuration are overwritten and components will synchronize their clocks with the source you entered.
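
To see which server names ntpd failed to resolve, you can filter the system log for the messages shown above. The following is a minimal sketch and not part of the product: the function name unresolved_ntp_hosts is hypothetical, and /var/log/messages as the default log location is an assumption based on a standard Oracle Linux installation.

```shell
# Hypothetical helper: list the NTP server names that ntpd could not resolve.
# Usage: unresolved_ntp_hosts [logfile]
# The default log path is an assumption; pass another file if yours differs.
unresolved_ntp_hosts() {
    grep 'ntpd_initres' "${1:-/var/log/messages}" |
        sed -n 's/.*host name not found: *//p' |
        sort -u
}
```

Entries naming the rhel.pool.ntp.org servers that were logged before the data center network configuration was applied are expected and remain in the log.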

Bug 17548941

3.2.3 Unknown Symbol Warning during InfiniBand Driver Installation

Towards the end of the management node install.log file, the following warnings appear:

> WARNING:
> /lib/modules/2.6.39-300.32.1.el6uek.x86_64/kernel/drivers/infiniband/ \
> hw/ipath/ib_ipath.ko needs unknown symbol ib_wq
> WARNING:
> /lib/modules/2.6.39-300.32.1.el6uek.x86_64/kernel/drivers/infiniband/ \
> hw/qib/ib_qib.ko needs unknown symbol ib_wq
> WARNING:
> /lib/modules/2.6.39-300.32.1.el6uek.x86_64/kernel/drivers/infiniband/ \
> ulp/srp/ib_srp.ko needs unknown symbol ib_wq
> *** FINISHED INSTALLING PACKAGES ***

These warnings have no adverse effects and may be disregarded.

Bug 16946511

3.2.4 Node Manager Does Not Show Node Offline Status

The role of the Node Manager database is to track the various states a compute node goes through during provisioning. After successful provisioning the database continues to list a node as running, even if it is shut down. For nodes that are fully operational, the server status is tracked by Oracle VM Manager. However, the Oracle PCA Dashboard displays status information from the Node Manager. This may lead to inconsistent information between the Dashboard and Oracle VM Manager, but it is not considered a bug.

Workaround: To verify the status of operational compute nodes, use the Oracle VM Manager user interface.

Bug 17456373

3.2.5 Compute Node State Changes Despite Active Provisioning Lock

The purpose of a lock of the type provisioning or all_provisioning is to prevent all compute nodes from starting or continuing a provisioning process. However, when you attempt to reprovision a running compute node from the Oracle PCA CLI while an active lock is in place, the compute node state changes to "reprovision_only" and it is marked as "DEAD". Provisioning of the compute node continues as normal when the provisioning lock is deactivated.

Bug 22151616

3.2.6 Virtual Machines Remain in Running Status when Host Compute Node Is Reprovisioned

Using the Oracle PCA CLI it is possible to force the reprovisioning of a compute node even if it is hosting running virtual machines. The compute node is not placed in maintenance mode when running Oracle VM Server 3.2.10. Consequently, the active virtual machines are not shut down or migrated to another compute node. Instead these VMs remain in running status and Oracle VM Manager reports their host compute node as "N/A".

Workaround: In this particular condition the VMs can no longer be migrated. They must be killed and restarted. After a successful restart they return to normal operation on a different host compute node, in accordance with the start policy defined for the server pool.

Bug 22018046

3.2.7 Update Functionality Not Available in Dashboard

The Oracle PCA Dashboard cannot be used to perform an update of the software stack.

Workaround: Use the command line tool pca-updater to update the software stack of your Oracle PCA. For details, refer to the section Oracle Private Cloud Appliance Software Update in the Oracle Private Cloud Appliance Administrator's Guide. For step-by-step instructions, refer to the section Update. You can use SSH to log in to each management node and check /etc/pca-info for log entries indicating restarted services and new software revisions.

Bugs 17476010, 17475976 and 17475845

3.2.8 Interrupting Download of Software Update Leads to Inconsistent Image Version and Leaves Image Mounted and Stored in Temporary Location

The first step of the software update process is to download an image file, which is unpacked in a particular location on the ZFS storage appliance. When the download is interrupted, the file system is not cleaned up or rolled back to a previous state. As a result, contents from different versions of the software image may end up in the source location from where the installation files are loaded. In addition, the downloaded *.iso file remains stored in /tmp and is not unmounted. If downloads are frequently started and stopped, this could cause the system to run out of free loop devices to mount the *.iso files, or even to run out of free space.

Workaround: The files left behind by previous downloads do not prevent you from running the update procedure again and restarting the download. Download a new software update image. When it completes successfully you can install the new version of the software, as described in the section Update in the Oracle Private Cloud Appliance Administrator's Guide.
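
Before restarting a download, you may want to see what an interrupted attempt left behind. The following is a minimal sketch, assuming a standard Oracle Linux shell on the node where the download ran. The function name leftover_isos is hypothetical, and nothing in it removes or unmounts anything.

```shell
# Hypothetical helper: list *.iso files left in a directory (default: /tmp).
# Usage: leftover_isos [directory]
leftover_isos() {
    find "${1:-/tmp}" -maxdepth 1 -type f -name '*.iso' 2>/dev/null | sort
}

# Related standard commands for a manual check (run as root):
#   losetup -a    # loop devices currently in use
#   df -h /tmp    # free space on the file system holding /tmp
```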

Bug 18352512

3.2.9 Compute Nodes Lose Oracle VM iSCSI LUNs During Software Update

Several iSCSI LUNs, including the essential server pool file system, are mapped on each compute node. When you update the appliance software, it may occur that one or more LUNs are missing on certain compute nodes. In addition, there may be problems with the configuration of the clustered server pool, preventing the existing compute nodes from joining the pool and resuming correct operation after the software update.

Workaround: To avoid these software update issues, upgrade all previously provisioned compute nodes by following the procedure described in the section Upgrading Existing Compute Node Configuration from Release 1.0.2 in the Oracle Private Cloud Appliance Administrator's Guide.

Bugs 17922555, 18459090, 18433922 and 18397780

3.2.10 Customer Created LUNs Are Mapped to the Wrong Initiator Group

When adding LUNs on the Oracle PCA internal ZFS Storage Appliance you must add them under the "OVM" target group, because this target group exists by default and there can be only one. However, if you continue to use default settings the new LUNs are mapped to the "All Initiators" group. This means that the LUNs are mapped to all nodes in the system, and this causes several problems inside the appliance rack. Instead, LUNs must be associated with a different initiator group so that the appliance software can map them correctly at initial setup or during a software update.

Workaround: When creating additional LUNs on the internal ZFS Storage Appliance, do not map to the default All Initiators group but use a separate one instead.

Bugs 22309236 and 18155778

3.2.11 Virtual Machine File Systems Become Read-Only after Storage Head Failover

When a failover occurs between the storage heads of the Oracle PCA internal ZFS storage appliance, or an externally connected ZFS storage appliance, the file systems used by virtual machines may become read-only, preventing normal VM operation. Compute nodes may also hang or crash as a result.

Workaround: There is no documented workaround to prevent the issue. Once the storage head failover has completed, you can reboot the virtual machines to bring them back online in read-write mode.

Bugs 19324312 and 19670873

3.2.12 Oracle VM Manager Tuning Settings Are Lost During Software Update

During the Oracle PCA software update from Release 1.0.2 to Release 1.1.x, it may occur that the specific tuning settings for Oracle VM Manager are not applied correctly, and that default settings are used instead.

Workaround: Verify the Oracle VM Manager tuning settings and re-apply them if necessary. Follow the instructions in the section Verifying and Re-applying Oracle VM Manager Tuning after Software Update in the Oracle Private Cloud Appliance Administrator's Guide.

Bug 18477228

3.2.13 Oracle VM Manager Fails to Restart after Restoring a Backup Due to Password Mismatch

If you have changed the password for Oracle VM Manager or its related components Oracle WebLogic Server and Oracle MySQL database, and you need to restore the Oracle VM Manager from a backup that was made prior to the password change, the passwords will be out of sync. As a result of this password mismatch, Oracle VM Manager cannot connect to its database and cannot be started.

Workaround: Follow the instructions in Restoring a Backup After a Password Change in the Oracle Private Cloud Appliance Administrator's Guide.

Bug 19333583

3.2.14 Oracle VM Java Processes Consume Large Amounts of Resources

Particularly in environments with a large number of virtual machines, and when many virtual machine operations – such as start, stop, save, restore or migrate – occur in a short time, the Java processes of Oracle VM may consume a lot of CPU and memory capacity on the master management node. Users will notice the browser and command line interfaces becoming very slow or unresponsive. This behavior is likely caused by a memory leak in the Oracle VM CLI.

Workaround: A possible remedy is to restart the Oracle VM CLI from the Oracle Linux shell on the master management node.

# /u01/app/oracle/ovm-manager-3/ovm_cli/bin/stopCLIMain.sh
# nohup /u01/app/oracle/ovm-manager-3/ovm_cli/bin/startCLIMain.sh&
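
To judge whether a restart is warranted, you can first check how much CPU and memory the Java processes are using. The sketch below is illustrative only: the function name busy_java and the default threshold of 50 percent are arbitrary assumptions, and the function simply filters the output of the standard ps command.

```shell
# Hypothetical helper: from "ps -eo pid,pcpu,pmem,comm" output on stdin,
# print java processes whose CPU or memory share exceeds a threshold.
# Usage: ps -eo pid,pcpu,pmem,comm | busy_java [threshold_percent]
busy_java() {
    awk -v limit="${1:-50}" '
        NR > 1 && $4 == "java" && ($2 + 0 > limit + 0 || $3 + 0 > limit + 0) {
            printf "%s cpu=%s mem=%s\n", $1, $2, $3
        }'
}
```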

Bug 18965916

3.2.15 External Storage Cannot Be Discovered Over Data Center Network

The default compute node configuration does not allow connectivity to additional storage resources in the data center network. Compute nodes are connected to the data center subnet to enable public connectivity for the virtual machines they host, but the compute nodes' physical network interfaces have no IP address in that subnet. Consequently, SAN or file server discovery will fail.

Bug 17508885

3.2.16 High Network Load with High MTU May Cause Time-Out and Kernel Panic in Compute Nodes

When network throughput is very high, certain conditions, like a large number of MTU 9000 streams, have been known to cause a kernel panic in a compute node. In that case, /var/log/messages on the affected compute node contains entries like "Task Python:xxxxx blocked for more than 120 seconds". As a result, HA virtual machines may not have been migrated in time to another compute node. Usually compute nodes return to their normal operation automatically.

Workaround: If HA virtual machines have not been live-migrated off the affected compute node, log into Oracle VM Manager and restart the virtual machines manually. If an affected compute node does not return to normal operation, restart it from Oracle VM Manager.
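
To check whether a compute node has logged the blocked-task condition described above, you can count the matching entries in its system log. This is a minimal sketch: the function name count_blocked_tasks is hypothetical, and the search string is taken from the log entry quoted above.

```shell
# Hypothetical helper: count hung-task messages in a log file.
# Usage: count_blocked_tasks [logfile]   (default: /var/log/messages)
count_blocked_tasks() {
    grep -c 'blocked for more than 120 seconds' "${1:-/var/log/messages}"
}
```

A count of zero suggests the node has not hit this condition within the period covered by the log file.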

Bugs 20981004 and 21841578

3.2.17 Oracle PCA Dashboard URL Is Not Redirected

Before the product name change from Oracle Virtual Compute Appliance to Oracle Private Cloud Appliance, the Oracle PCA Dashboard could be accessed at https://<manager-vip>:7002/ovca. As of Release 2.0.5, the URL ends in /dashboard instead. However, there is no redirect from /ovca to /dashboard.

Workaround: Enter the correct URL: https://<manager-vip>:7002/dashboard.
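
If old Dashboard URLs are stored in scripts or bookmark lists, they can be rewritten in bulk. The following sketch assumes only the path change described above; the function name dashboard_url is hypothetical.

```shell
# Hypothetical helper: rewrite a Dashboard URL ending in /ovca to /dashboard.
# URLs that do not end in /ovca are printed unchanged.
# Usage: dashboard_url 'https://<manager-vip>:7002/ovca'
dashboard_url() {
    printf '%s\n' "$1" | sed 's#/ovca/*$#/dashboard#'
}
```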

Bug 21199163

3.2.18 User Interface Does Not Support Internet Explorer 10 and 11

Oracle PCA Release 2.1.1 uses the Oracle Application Development Framework (ADF) version 11.1.1.2.0 for both the Dashboard and the Oracle VM Manager user interface. This version of ADF does not support Microsoft Internet Explorer 10 or 11.

Workaround: Use Internet Explorer 9 or a different web browser, for example Mozilla Firefox.

Bug 18791952

3.2.19 Mozilla Firefox Cannot Establish Secure Connection with User Interface

Both the Oracle PCA Dashboard and the Oracle VM Manager user interface run on an architecture based on Oracle WebLogic Server, Oracle Application Development Framework (ADF) and Oracle JDK 6. The cryptographic protocols supported on this architecture are SSLv3 and TLSv1.0. Mozilla Firefox version 38.2.0 or later no longer supports SSLv3 connections with a self-signed certificate. As a result, an error message might appear when you try to open the user interface login page.

In Oracle PCA Release 2.1.1 – with Oracle VM Release 3.2.10 – a server-side fix eliminates these secure connection failures. If secure connection failures occur with future versions of Mozilla Firefox, the workaround below might resolve them.

Workaround: Override the default Mozilla Firefox security protocol as follows:

1. In the Mozilla Firefox address bar, type about:config to access the browser configuration.
2. Acknowledge the warning about changing advanced settings by clicking I'll be careful, I promise!.
3. In the list of advanced settings, use the Search bar to filter the entries and look for the settings to be modified.
4. Double-click the following entries and then enter the new value to change the configuration preferences:
   - security.tls.version.fallback-limit: 1
   - security.ssl3.dhe_rsa_aes_128_sha: false
   - security.ssl3.dhe_rsa_aes_256_sha: false
5. If necessary, also modify the configuration preference security.tls.insecure_fallback_hosts and enter the affected hosts as a comma-separated list, either as domain names or as IP addresses.
6. Close the Mozilla Firefox advanced configuration tab. The pages affected by the secure connection failure should now load normally.
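
The same preferences can also be set without the about:config page, because Firefox reads a user.js file from the profile directory at startup. The sketch below appends the three values listed above to that file. The function name write_firefox_prefs is hypothetical, and you must supply the profile directory yourself, since its location varies by system.

```shell
# Hypothetical helper: append the preferences from the steps above to the
# user.js file of a Firefox profile. The profile path is supplied by the caller.
# Usage: write_firefox_prefs /path/to/firefox/profile
write_firefox_prefs() {
    [ -d "$1" ] || { echo "profile directory not found: $1" >&2; return 1; }
    cat >> "$1/user.js" <<'EOF'
user_pref("security.tls.version.fallback-limit", 1);
user_pref("security.ssl3.dhe_rsa_aes_128_sha", false);
user_pref("security.ssl3.dhe_rsa_aes_256_sha", false);
EOF
}
```

Restart Firefox after writing the file so that the preferences are read. The optional security.tls.insecure_fallback_hosts preference is left out here because its value depends on your hosts.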

Bugs 21622475 and 21803485

3.2.20 Authentication Error Prevents Oracle VM Manager Login

In environments with a large number of virtual machines and frequent connections through the VM console of Oracle VM Manager, the browser UI login to Oracle VM Manager may fail with an "unexpected error during login". A restart of the ovmm service is required.

Workaround: From the Oracle Linux shell of the master management node, restart the ovmm service by entering the command service ovmm restart. You should now be able to log into Oracle VM Manager again.

Bug 19562053

3.2.21 Error Getting VM Stats in Oracle VM Agent Logs

During the upgrade to Oracle PCA Software Release 2.0.4 a new version of the Xen hypervisor is installed on the compute nodes. While the upgrade is in progress, entries may appear in the ovs-agent.log files on the compute nodes indicating that xen commands are not executed properly ("Error getting VM stats"). This is a benign and temporary condition resolved by the compute node reboot at the end of the upgrade process. No workaround is required.

Bug 20901778

3.2.22 Virtual Machine with High Availability Takes Five Minutes to Restart when Failover Occurs

During provisioning, all compute nodes in an Oracle PCA are placed in a single clustered server pool, which is created as part of the provisioning process. One of the configuration parameters is the cluster time-out: the time a server is allowed to be unavailable before failover events are triggered. To avoid false positives, and thus unwanted failovers, the Oracle PCA server pool time-out is set to 300 seconds. As a consequence, a virtual machine configured with high availability (HA VM) can be unavailable for 5 minutes when its host fails. After the cluster time-out has passed, the HA VM is automatically restarted on another compute node in the server pool.

This behavior is as designed; it is not a bug. The server pool cluster configuration causes the delay in restarting VMs after a failover has occurred.

3.2.23 The CLI Command show Accepts Non-Existent Targets As Parameters

The CLI command show expects a target parameter to be specified to indicate the target object for which information should be displayed. For most commands you can use tab-completion to determine what target objects are available for use as a parameter. However, if you enter a target object that does not exist, the command completes successfully but does not return any useful information. For example:

PCA> show cloud-wwpn blah

----------------------------------------
Cloud_Name           blah                 
WWPN_List                                 
----------------------------------------

Status: Success
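
Because the command reports Success even for a target that does not exist, a script cannot rely on the status line alone. The sketch below checks saved command output for a non-empty WWPN_List row instead. It is an illustration only: the function name has_wwpns is hypothetical, and it assumes that a populated list appears on the same line as the WWPN_List label.

```shell
# Hypothetical helper: succeed only if the saved output of "show cloud-wwpn"
# on stdin contains a WWPN_List row with at least one value after the label.
# Usage: has_wwpns < saved_output.txt
has_wwpns() {
    awk '$1 == "WWPN_List" { found = (NF > 1) }
         END { if (found) exit 0; exit 1 }'
}
```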

When using the show rack-layout command on an x4-2 rack, tab-completion may indicate that the rack name is 'x3-2_base'. This is a misnomer for the rack; however, the command works as expected.

Bug 19679777

3.2.24 Deleting a Running Task in the CLI Results in Errors

A task may take a while to complete, in which case it appears as "Running" if you display the task list in the CLI. While a task is running, it is possible to delete it using the CLI. The delete operation succeeds, but error messages appear at the CLI prompt a few minutes later. For example:

PCA> backup
Task_ID                           Status  Progress  Start_Time
-------                           ------  --------  ----------
1a553fae7ede40a7ac110fff557f2590  RUNNING        0  06-10-2015 06:22:12   
--------------- 
1 row displayed
Status: Success

PCA> delete task 1a553fae7ede40a7ac110fff557f2590 
************************************************************
WARNING !!! THIS IS A DESTRUCTIVE OPERATION.
************************************************************
Are you sure [y/N]:y
Status: Success
PCA>
PCA>
PCA> Traceback (most recent call last): 
        File "/usr/lib64/python2.6/logging/__init__.py", line 776, in emit      
          msg = self.format(record)
        File "/usr/lib64/python2.6/logging/__init__.py", line 654, in format
          return fmt.format(record)
        File "/usr/lib64/python2.6/logging/__init__.py", line 436, in format 
          record.message = record.getMessage()
        File "/usr/lib64/python2.6/logging/__init__.py", line 306, in getMessage
          msg = msg % self.args
TypeError: not all arguments converted during string formatting

Caution

It is advised not to delete running tasks. While the risk of irreparable damage is minimal, there may be adverse effects to deleting a running task.
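
In scripted clean-up, one way to follow this advice is to check a task's status before deleting it. The sketch below works on task list output that has been saved to a file or piped in. The function name task_is_finished is hypothetical, and the column layout is assumed to match the example above, with the task ID first and the status second.

```shell
# Hypothetical helper: given saved task list output on stdin and a task ID,
# succeed only if the task is listed and its status is not RUNNING.
# Usage: task_is_finished TASK_ID < saved_task_list.txt
task_is_finished() {
    awk -v id="$1" '
        $1 == id { seen = 1; running = ($2 == "RUNNING") }
        END { if (seen && !running) exit 0; exit 1 }'
}
```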

Bug 21231788

3.2.25 CLI Output Misaligned When Listing Tasks With Different UUID Length

To simplify task management in the CLI, the task identifiers (UUIDs) have been shortened. After an upgrade from Release 2.0.x, the task list may still contain entries from before the upgrade, resulting in misaligned entries due to the longer UUID. The command output then looks similar to this example:

PCA> list task
Task_ID         Status  Progress Start_Time           Task_Name
-------         ------  -------- ----------           ---------
3327cc9b1414e2  RUNNING None     08-18-2015 11:45:54  update_download_image
9df321d37eed4bfea74221d22c26bfce SUCCESS      100 08-18-2015 09:59:08
update_run_ovmm_upgrader
8bcdcdf785ac4dfe96406284f1689802 SUCCESS      100 08-18-2015 08:46:11
update_download_image
f1e6e60351174870b853b24f8eb7b429 SUCCESS      100 08-18-2015 04:00:01  backup
e2e00c638b9e43808623c25ffd4dd42b SUCCESS      100 08-17-2015 16:00:01  backup
d34325e2ff544598bd6dcf786af8bf30 SUCCESS      100 08-17-2015 10:47:20
update_download_image
dd9d1f3b5c6f4bf187298ed9dcffe8f6 SUCCESS      100 08-17-2015 04:00:01  backup
a48f438fe02d4b9baa91912b34532601 SUCCESS      100 08-16-2015 16:00:01  backup
e03c442d27bb47d896ab6a8545482bdc SUCCESS      100 08-16-2015 04:00:01  backup
f1d2f637ad514dce9a3c389e5e7bbed5 SUCCESS      100 08-15-2015 16:00:02  backup
c4bf0d86c7a24a4fb656926954ee6cf2 SUCCESS      100 08-15-2015 04:00:01  backup
016acaf01d154095af4faa259297d942 SUCCESS      100 08-14-2015 16:00:01  backup
-----------------
12 rows displayed

Workaround: It is generally good practice to purge old jobs from time to time. You can remove the old tasks with the command delete task uuid. When all old tasks have been removed the task list is output with correct column alignment.
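
To find the entries that cause the misalignment, you can filter saved list task output for the longer identifiers. This is a minimal sketch: the function name old_task_ids is hypothetical, and it assumes that pre-upgrade identifiers are 32 hexadecimal characters, as in the example above. Tasks in RUNNING status are skipped.

```shell
# Hypothetical helper: from saved "list task" output on stdin, print the
# 32-character task IDs that predate the shortened format, skipping any
# task that is still RUNNING.
# Usage: old_task_ids < saved_task_list.txt
old_task_ids() {
    awk 'length($1) == 32 && $1 ~ /^[0-9a-f]+$/ && $2 != "RUNNING" { print $1 }'
}
```

Each identifier printed can then be passed to the delete task command.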

Bug 21650772

3.2.26 The CLI Command list opus-ports Shows Information About Non-existent Switches

The CLI command list opus-ports lists ports for additional switches that are not present within your environment. These switches are labelled OPUS-3, OPUS-4, OPUS-5 and OPUS-6 and are listed as belonging to rack numbers that are not available in a current deployment. This is due to the design, which caters to the future expansion of an environment. These entries are currently displayed in the listing of port relationships between compute nodes and each Oracle Switch ES1-24, and can be safely ignored.
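
If the extra entries get in the way when reviewing the listing, they can be filtered out of saved command output. The sketch below is an illustration only: the function name real_opus_ports is hypothetical, and it makes no assumption about the column layout beyond the OPUS-3 to OPUS-6 labels appearing somewhere on the affected rows.

```shell
# Hypothetical helper: drop rows for the placeholder switches OPUS-3 to OPUS-6
# from saved "list opus-ports" output on stdin.
# Usage: real_opus_ports < saved_opus_ports.txt
real_opus_ports() {
    grep -Ev 'OPUS-[3-6]([^0-9]|$)'
}
```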

Bug 18904287