Chapter 3 Updating Oracle Private Cloud Appliance
- 3.1 Before You Start Updating
- 3.2 Using the Oracle Private Cloud Appliance Upgrader
- 3.3 Upgrading Component Firmware
- 3.3.1 Firmware Policy
- 3.3.2 Install the Current Firmware on All Compute and Management Nodes
- 3.3.3 Upgrading the Operating Software on the Oracle ZFS Storage Appliance
- 3.3.4 Upgrading the Cisco Switch Firmware
- 3.3.5 Upgrading the NM2-36P Sun Datacenter InfiniBand Expansion Switch Firmware
- 3.3.6 Upgrading the Oracle Fabric Interconnect F1-15 Firmware
- 3.4 Upgrading the Storage Network
- 3.5 Upgrading the Virtualization Platform
Due to the nature of the Oracle Private Cloud Appliance – where the term appliance is key – an update is a delicate and complicated procedure that deals with different hardware and software components at the same time. It is virtually impossible to automate the entire process, and, more importantly, it would be undesirable to take the appliance and the virtual environment it hosts out of service entirely for updating. Instead, updates can be executed in phases and scheduled for minimal downtime. The following table describes the sequence to perform Oracle Private Cloud Appliance updates.
Order | Component | Description
---|---|---
1. | management node software | Install updated management software on both management nodes (see Section 3.2, “Using the Oracle Private Cloud Appliance Upgrader”).
2. | all firmware, in this order | See Section 3.3, “Upgrading Component Firmware”. For InfiniBand-based systems, perform switch upgrades in the order given in that section (Sections 3.3.5 and 3.3.6).
3. | storage network update | See Section 3.4, “Upgrading the Storage Network”.
4. | compute node updates | See Section 3.5, “Upgrading the Virtualization Platform”.
3.1 Before You Start Updating
Please read and observe the critical information in this section before you begin any procedure to update your Oracle Private Cloud Appliance.
All the software included in a given release of the Oracle Private Cloud Appliance software is tested to work together and should be treated as one package. Consequently, no appliance component should be updated individually, unless Oracle provides specific instructions to do so. All Oracle Private Cloud Appliance software releases are downloaded as a single large .iso file, which includes the items listed above.
The appliance update process must always be initiated from the active management node.
To view supported firmware versions for all releases of Oracle Private Cloud Appliance, see support note Doc ID 1610373.1.
3.1.1 Warnings and Cautions
Read and understand these warnings and cautions before you start the appliance update procedure. They help you avoid operational issues including data loss and significant downtime.
In this version of the Oracle Private Cloud Appliance Administrator's Guide, it is assumed that your system is running Controller Software release 2.3.4 or 2.4.1 prior to this software update.
If your system is currently running an earlier version, please refer to the Update chapter of the Administrator's Guide for Release 2.3. Follow the appropriate procedures and make sure that your appliance configuration is valid for the Release 2.4.2 update.
When updating the Oracle Private Cloud Appliance software, make sure that no provisioning operations occur and that any externally scheduled backups are suspended. Such operations could cause a software update or component firmware upgrade to fail and lead to system downtime.
On Oracle Private Cloud Appliance management nodes the YUM repositories have been intentionally disabled and should not be enabled by the customer. Updates and upgrades of the management node operating system and software components must only be applied through the update mechanism described in the documentation.
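As a quick sanity check, you can confirm that no repositories are enabled on a management node using a standard yum query (illustrative output; the exact wording depends on the yum version):

[root@ovcamn05r1 ~]# yum repolist enabled
repolist: 0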
To ensure that your Oracle Private Cloud Appliance configuration remains in a qualified state, take the required firmware upgrades into account when planning the controller software update. For more information, refer to Section 3.3.1, “Firmware Policy”.
During controller software updates, backup operations must be prevented. The Oracle Private Cloud Appliance Upgrader disables crond and blocks backups.
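To confirm that the cron daemon is indeed stopped while the Upgrader runs, you can query its status with the standard Oracle Linux 6 service tools (illustrative output):

[root@ovcamn05r1 ~]# service crond status
crond is stopped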
If you have generated custom keys using ovmkeytool.sh in a previous version of the Oracle Private Cloud Appliance software, you must regenerate the keys prior to updating the Controller Software. For instructions, refer to the support note with Doc ID 2597439.1. See also Section 7.9.1, “Creating a Keystore”.
If direct public access is not available within your data center and you make use of proxy servers to facilitate HTTP, HTTPS and FTP traffic, it may be necessary to edit the Oracle Private Cloud Appliance system properties, using the CLI on each management node, to ensure that the correct proxy settings are specified for a download to succeed from the Internet. This depends on the network location from where the download is served. See Section 7.2, “Adding Proxy Settings for Oracle Private Cloud Appliance Updates” for more information.
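As a sketch of what these settings look like, the proxy properties are set from the Oracle Private Cloud Appliance CLI on each management node; the property names and example values below follow Section 7.2, but verify the exact syntax there before use:

PCA> set system-property http_proxy http://proxy.example.com:8080
PCA> set system-property https_proxy http://proxy.example.com:8080
PCA> set system-property ftp_proxy http://proxy.example.com:8080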
If the internal ZFS Storage Appliance contains customer-created LUNs, make sure they are not mapped to the default initiator group. See “Customer Created LUNs Are Mapped to the Wrong Initiator Group” within the Known Limitations and Workarounds section of the Oracle Private Cloud Appliance Release Notes.
When updating the Oracle Private Cloud Appliance Controller Software to Release 2.4.x, Oracle VM Manager is unavailable for the entire duration of the update. The virtualized environment remains functional, but configuration changes and other management operations are not possible.
Compute nodes cannot be upgraded to the appropriate Oracle VM Server Release 3.4.x with the Oracle VM Manager web UI. You must upgrade them using the update compute-node command within the Oracle Private Cloud Appliance CLI.
To perform this CLI-based upgrade procedure, follow the specific instructions in Section 3.5, “Upgrading the Virtualization Platform”.
As stated in Section 5.1, “Guidelines and Limitations”, at the start of Chapter 5, Managing the Oracle VM Virtual Infrastructure, the settings of the default server pool and custom tenant groups must not be modified through Oracle VM Manager. For compute node upgrade specifically, it is critical that the server pool option "Override Global Server Update Group" remains deselected. The compute node update process must use the repository defined globally, and overriding this will cause the update to fail.
Once you have confirmed that the update process has completed, it is advised that you wait a further 30 minutes before starting another compute node or management node software update. This allows the necessary synchronization tasks to complete.
If you ignore the recommended delay between these update procedures there could be issues with further updating as a result of interference between existing and new tasks.
3.1.2 Backup Prevents Data Loss
An update of the Oracle Private Cloud Appliance software stack may involve a complete re-imaging of the management nodes. Any customer-installed agents or customizations are overwritten in the process. Before applying new appliance software, back up all local customizations and prepare to re-apply them after the update has completed successfully.
Oracle Enterprise Manager Plug-in Users
If you use Oracle Enterprise Manager and the Oracle Enterprise Manager Plug-in to monitor your Oracle Private Cloud Appliance environment, always back up the oraInventory Agent data to /nfs/shared_storage before updating the controller software. You can restore the data after the Oracle Private Cloud Appliance software update is complete.
For detailed instructions, refer to the Agent Recovery section in the Oracle Enterprise Manager Plug-in documentation.
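A minimal sketch of such a backup, assuming a hypothetical oraInventory location (verify the actual path in your Enterprise Manager agent installation before copying):

[root@ovcamn05r1 ~]# cp -a /u01/app/oraInventory /nfs/shared_storage/oraInventory_backup_$(date +%Y%m%d)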
3.1.3 Determine Firmware Versions
Use the following commands to determine the current version of firmware installed on a component.
- Using an account with superuser privileges, log in to the component. For Cisco switches you must log in as admin.
- Use the appropriate command to find the current firmware version of each component.
  - compute/management nodes:
    → fwupdate list sp_bios
  - ZFS Storage Appliances:
    ovcasn02r1:> maintenance system updates show
    (the update entry with status "current" is the installed version)
  - Cisco switches:
    ovcasw21r1# show version
  - Oracle Fabric Interconnect F1-15:
    admin@ovcasw22r1[xsigo] show system version
    Build 4.0.13-XGOS - (sushao) Wed Dec 19 11:28:28 PST 2018
  - NM2-36P Sun Datacenter InfiniBand Expansion Switch:
    [root@ilom-ovcasw19r1 ~]# version
    SUN DCS 36p version: 2.2.13
  - Oracle Switch ES1-24:
    -> cd /SYS/fs_cli
    cd: Connecting to Fabric Switch CLI
    ilom-ovcasw21ar1 SEFOS# show system information
    ...
    Firmware Version :ES1-24-1.3.1.23
3.2 Using the Oracle Private Cloud Appliance Upgrader
With the Oracle Private Cloud Appliance Upgrader, the two management node upgrade processes are theoretically separated. Each management node upgrade is initiated by a single command and managed through the Upgrader, which invokes the native Oracle VM Manager upgrade mechanisms. However, you must treat the upgrade of the two management nodes as a single operation.
During the management node upgrade, the high-availability (HA) configuration of the management node cluster is temporarily broken. To restore HA management functionality and mitigate the risk of data corruption, it is critical that you start the upgrade of the second management node immediately after a successful upgrade of the first management node.
The Oracle Private Cloud Appliance Upgrader manages the entire process to upgrade both management nodes in the appliance. Under no circumstances should you perform any management operations – through the Oracle Private Cloud Appliance Dashboard or CLI, or Oracle VM Manager – while the Upgrader process is running, and until both management nodes have been successfully upgraded through the Upgrader. Although certain management functions cannot be programmatically locked during the upgrade, they are not supported, and are likely to cause configuration inconsistencies and considerable repair downtime.
Once the upgrade has been successfully completed on both management nodes, you can safely execute appliance management tasks and configuration of the virtualized environment.
As of Release 2.3.4, a separate command line tool is provided to manage the Controller Software update process. A version of the Oracle Private Cloud Appliance Upgrader is included in the Controller Software .iso image. However, Oracle recommends that you download and install the latest stand-alone version of the Upgrader tool on the management nodes. The Oracle Private Cloud Appliance Upgrader requires only a couple of commands to execute several sets of tasks, which were scripted or manual steps in previous releases. The Upgrader is more robust and easily extensible, and provides a much better overall upgrade experience.
A more detailed description of the Oracle Private Cloud Appliance Upgrader is included in the introductory chapter of this book. Refer to Section 1.7, “Oracle Private Cloud Appliance Upgrader”.
3.2.1 Rebooting the Management Node Cluster
It is advised to reboot both management nodes before starting the appliance software update. This leaves the management node cluster in the cleanest possible state, ensures that no system resources are occupied unnecessarily, and eliminates potential interference from processes that have not completed properly.
- Using SSH and an account with superuser privileges, log into both management nodes using the IP addresses you configured in the Network Setup tab of the Oracle Private Cloud Appliance Dashboard. If you use two separate consoles you can view both side by side.
  Note: The default root password is Welcome1. For security reasons, you must set a new password at your earliest convenience.
- Run the command pca-check-master on both management nodes to verify which node owns the active role.
- Reboot the management node that is NOT currently the active node. Enter init 6 at the prompt (an illustrative sequence follows this list).
- Ping the machine you rebooted. When it comes back online, reconnect using SSH and monitor system activity to determine when the secondary management node takes over the active role. Enter this command at the prompt: tail -f /var/log/messages. New system activity notifications will be output to the screen as they are logged.
- In the other SSH console, which is connected to the current active management node, enter init 6 to reboot the machine and initiate management node failover.
  The log messages in the other SSH console should now indicate when the secondary management node takes over the active role.
- Verify that both management nodes have come back online after reboot and that the active role has been transferred to the other manager. Run the command pca-check-master on both management nodes.
  If this is the case, proceed with the software update steps below.
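The sequence below illustrates the role check and reboot of the stand-by node, assuming ovcamn05r1 currently holds the active role; the MASTER: False output on the stand-by node is assumed for illustration:

[root@ovcamn05r1 ~]# pca-check-master
NODE: 192.168.4.3 MASTER: True
[root@ovcamn06r1 ~]# pca-check-master
NODE: 192.168.4.4 MASTER: False
[root@ovcamn06r1 ~]# init 6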
3.2.2 Installing the Oracle Private Cloud Appliance Upgrader
The Oracle Private Cloud Appliance Upgrader is a separate application with its own release plan, independent of Oracle Private Cloud Appliance. Always download and install the latest version of the Oracle Private Cloud Appliance Upgrader before you execute any verification or upgrade procedures.
- Log into My Oracle Support and download the latest version of the Oracle Private Cloud Appliance Upgrader.
  The Upgrader can be found under patch ID 31747130, and is included in part 1 of a series of downloadable zip files. Any updated versions of the Upgrader will be made available in the same location.
  To obtain the Upgrader package, download this zip file and extract the file pca_upgrader-<version>.el6.noarch.rpm.
- Copy the downloaded *.rpm package to the active management node and install it.
  [root@ovcamn05r1 ~]# pca-check-master
  NODE: 192.168.4.3 MASTER: True
  [root@ovcamn05r1 tmp]# yum install pca_upgrader-2.0.el<n>.noarch.rpm
  Preparing...            ########################################### [100%]
  1:pca_upgrader          ########################################### [100%]
  Caution: Always download and use the latest available version of the Oracle Private Cloud Appliance Upgrader.
- If the version of the Oracle Private Cloud Appliance Upgrader you downloaded is newer than the version shipped in the Controller Software ISO, then upgrade to the newer version. From the directory where the *.rpm package was saved, run the command rpm -U pca_upgrader-1.2-111.el6.noarch.rpm.
- Repeat the *.rpm upgrade on the second management node.
  The Oracle Private Cloud Appliance Upgrader verifies the version of the Upgrader installed on the second management node. Only if the version in the ISO is newer is the package on the second management node automatically upgraded in the process. If you downloaded a newer version, you must upgrade the package manually on both management nodes (you can confirm the installed version as shown below).
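To confirm that both management nodes now run the same Upgrader version, a standard RPM query suffices:

[root@ovcamn05r1 ~]# rpm -q pca_upgrader
[root@ovcamn06r1 ~]# rpm -q pca_upgrader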
3.2.3 Verifying Upgrade Readiness
The Oracle Private Cloud Appliance Upgrader has a verify-only mode. It allows you to run all the pre-checks defined for a management node upgrade, without proceeding to the actual upgrade steps. The terminal output and log file report any issues you need to fix before the system is eligible for the next Controller Software update.
The Oracle Private Cloud Appliance Upgrader cannot be stopped by means of a keyboard interrupt or by closing the terminal session. After a keyboard interrupt (Ctrl+C) the Upgrader continues to execute all pre-checks. If the terminal session is closed, the Upgrader continues as a background process. If the Upgrader process needs to be terminated, enter this command: pca_upgrader --kill.
- Go to Oracle VM Manager and make sure that all compute nodes are in Running status. If any server is not in Running status, resolve the issue before proceeding. For instructions to correct the compute node status, refer to the support note with Doc ID 2245197.1.
- Perform the required manual pre-upgrade checks. Refer to Section 7.5, “Running Manual Pre- and Post-Upgrade Checks in Combination with Oracle Private Cloud Appliance Upgrader” for instructions.
- Log in to My Oracle Support and download the required Oracle Private Cloud Appliance software update.
  You can find the update by searching for the product name “Oracle Private Cloud Appliance”, or for the Patch or Bug Number associated with the update you need.
  Caution: Read the information and follow the instructions in the readme file very carefully. It is crucial for a successful Oracle Private Cloud Appliance Controller Software update and Oracle VM upgrade.
- Make the update, a zipped ISO, available on an HTTP or FTP server that is reachable from your Oracle Private Cloud Appliance. Alternatively, if upgrade time is a major concern, you can download the ISO file to the local file system on both management nodes. This reduces the upgrade time for the management nodes, but has no effect on the time required to upgrade the compute nodes or the Oracle VM database.
  The Oracle Private Cloud Appliance Upgrader downloads the ISO from the specified location and unpacks it on the management node automatically at runtime. A quick way to serve the file is sketched below.
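If no permanent web server is available, any HTTP server reachable from the appliance will do. For example, a throwaway Python server started on the machine holding the download (an illustration, not part of the documented procedure; the path is a placeholder):

# cd /path/to/download && python3 -m http.server 8080    # serves the directory over HTTP on port 8080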
- Using SSH and an account with superuser privileges, log in to the active management node through its individually assigned IP address, not the shared virtual IP.
  Note: During the upgrade process, the interface with the shared virtual IP address is shut down. Therefore, you must log in using the individually assigned IP address of the management node.
  The default root password is Welcome1. For security reasons, you must set a new password at your earliest convenience.
- From the active management node, run the Oracle Private Cloud Appliance Upgrader in verify-only mode. The target of the command must be the stand-by management node.
Note: The console output below is an example. You may see a different output, depending on the specific architecture and configuration of your appliance.
[root@ovcamn05r1 ~]# pca-check-master
NODE: 192.168.4.3 MASTER: True
[root@ovcamn05r1 ~]# pca_upgrader -V -t management -c ovcamn06r1 -g 2.4.3 \
-l http://<path-to-iso>/ovca-2.4.3-b000.iso.zip

PCA Rack Type: PCA X8_BASE.
Please refer to log file
/nfs/shared_storage/pca_upgrader/log/pca_upgrader_<date>-<time>.log for more details.

Beginning PCA Management Node Pre-Upgrade Checks...
Validate the Image Provided                        1/44
Internal ZFSSA Available Space Check               2/44
MN Disk and Shared Storage Space Check             3/44
[...]
Oracle VM Minimum Version Check                   41/44
OS Check                                          42/44
OSA Disabled Check                                43/44
ZFSSA Network Configuration Check                 44/44
PCA Management Node Pre-Upgrade Checks completed after 0 minutes
---------------------------------------------------------------------------
PCA Management Node Pre-Upgrade Checks                               Passed
---------------------------------------------------------------------------
Validate the Image Provided                                          Passed
Internal ZFSSA Available Space Check                                 Passed
[...]
OS Check                                                             Passed
Password Check                                                       Passed
OSA Disabled Check                                                   Passed
---------------------------------------------------------------------------
Overall Status                                                       Passed
---------------------------------------------------------------------------
- As the verification process runs, check the console output for test progress. When all pre-checks have been completed, a summary is displayed. A complete overview of the verification process is saved in the file /nfs/shared_storage/pca_upgrader/log/pca_upgrader_<date>-<time>.log.
  Some pre-checks may result in a warning. These warnings are unlikely to cause issues, and therefore do not prevent you from executing the upgrade, but they do indicate a situation that should be investigated. When an upgrade command is issued, warnings do cause the administrator to be prompted whether to proceed with the upgrade, or quit and investigate the warnings first.
- If pre-checks have failed, consult the log file for details. Fix the reported problems, then execute the verify command again.
  Note: If errors related to SSL certificates are reported, check whether these have been re-generated using ovmkeytool.sh. This can cause inconsistencies between the information stored in the Wallet and the actual location of your certificate. For detailed information and instructions to resolve the issue, refer to the support note with Doc ID 2597439.1.
- Repeat this process until no more pre-check failures are reported. When the system passes all pre-checks, it is ready for the Controller Software update.
3.2.4 Executing a Controller Software Update
During a Controller Software update, the virtualized environment does not accept any management operations. After successful upgrade of the management node cluster, upgrade the firmware on rack components, then perform the network storage upgrade, and finally, upgrade the compute nodes in phases. When you have planned all these upgrade tasks, and when you have successfully completed the upgrade readiness verification, your environment is ready for a Controller Software update and any additional upgrades.
No upgrade procedure can be executed without completing the pre-checks. Therefore, the upgrade command first executes the same steps as in Section 3.2.3, “Verifying Upgrade Readiness”. After successful verification, the upgrade steps are started.
The console output shown throughout this section is an example. You may see a different output, depending on the specific architecture and configuration of your appliance.
The Oracle Private Cloud Appliance Upgrader cannot be stopped by means of a keyboard interrupt or by closing the terminal session. After a keyboard interrupt (Ctrl+C) the Upgrader continues the current phase of the process. If pre-checks are in progress, they are all completed, but the upgrade phase does not start automatically after successful completion of all pre-checks. If the upgrade phase is in progress at the time of the keyboard interrupt, it continues until the upgrade either completes successfully or fails. If the terminal session is closed, the Upgrader continues as a background process. If the Upgrader process needs to be terminated, enter this command: pca_upgrader --kill.
- Using SSH and an account with superuser privileges, log in to the active management node through its individually assigned IP address, not the shared virtual IP.
  Note: During the upgrade process, the interface with the shared virtual IP address is shut down. Therefore, you must log in using the individually assigned IP address of the management node.
  The default root password is Welcome1. For security reasons, you must set a new password at your earliest convenience.
  NO MANAGEMENT OPERATIONS DURING UPGRADE: Under no circumstances should you perform any management operations – through the Oracle Private Cloud Appliance Dashboard or CLI, or Oracle VM Manager – while the Upgrader process is running, and until both management nodes have been successfully upgraded through the Upgrader.
- From the active management node, run the Oracle Private Cloud Appliance Upgrader with the required upgrade parameters. The target of the command must be the stand-by management node.
  [root@ovcamn05r1 ~]# pca-check-master
  NODE: 192.168.4.3 MASTER: True
  [root@ovcamn05r1 ~]# pca_upgrader -U -t management -c ovcamn06r1 -g 2.4.3 \
  -l http://<path-to-iso>/ovca-2.4.3-b000.iso.zip

  PCA Rack Type: PCA X8_BASE.
  Please refer to log file
  /nfs/shared_storage/pca_upgrader/log/pca_upgrader_<date>-<time>.log for more details.

  Beginning PCA Management Node Pre-Upgrade Checks...
  [...]
  ***********************************************************************
  Warning: The management precheck completed with warnings. It is safe
  to continue with the management upgrade from this point or the upgrade
  can be halted to investigate the warnings.
  ************************************************************************
  Do you want to continue? [y/n]: y

  After successfully completing the pre-checks, the Upgrader initiates the Controller Software update on the other management node. If any errors occur during the upgrade phase, tasks are rolled back and the system is returned to its original state from before the upgrade command.
Rollback works for errors that occur during these steps:
- downloading the ISO
- setting up the YUM repository
- taking an Oracle VM backup
- breaking the Oracle Private Cloud Appliance HA model
Beginning PCA Management Node upgrade for ovcamn06r1
Disable PCA Backups                                1/16
Download ISO                                       2/16
Setup Yum Repo                                     3/16
Take OVM Backup                                    4/16
...
PCA Management Node upgrade of ovcamn06r1 completed after 43 minutes
Beginning PCA Post-Upgrade Checks...
OVM Manager Cache Size Check                       1/1
PCA Post-Upgrade Checks completed after 2 minutes
---------------------------------------------------------------------------
PCA Management Node Pre-Upgrade Checks                               Passed
---------------------------------------------------------------------------
Validate the Image Provided                                          Passed
Internal ZFSSA Available Space Check                                 Passed
[...]
---------------------------------------------------------------------------
PCA Management Node Upgrade                                          Passed
---------------------------------------------------------------------------
Disable PCA Backups                                                  Passed
[...]
Restore PCA Backups                                                  Passed
Upgrade is complete                                                  Passed
[...]
---------------------------------------------------------------------------
Overall Status                                                       Passed
---------------------------------------------------------------------------
PCA Management Node Pre-Upgrade Checks                               Passed
PCA Management Node Upgrade                                          Passed
PCA Post-Upgrade Checks                                              Passed
Tip: When the ISO is copied to the local file system of both management nodes, the management node upgrade time is just over 1 hour each. The duration of the entire upgrade process depends heavily on the size of the environment: the number of compute nodes and their configuration, the size of the Oracle VM database, etc.
If you choose to copy the ISO locally, replace the location URL in the pca_upgrader command with -l file:///<path-to-iso>/ovca-2.4.3-b000.iso.zip.
- Monitor the progress of the upgrade tasks. The console output provides a summary of each executed task. If you need more details on a task, or if an error occurs, consult the log file. You can track the logging activity in a separate console window by entering the command tail -f /nfs/shared_storage/pca_upgrader/log/pca_upgrader_<date>-<time>.log. The example below shows several key sections in a typical log file.
  Note: Once the upgrade tasks have started, it is no longer possible to perform a rollback to the previous state.
# tail -f /nfs/shared_storage/pca_upgrader/log/pca_upgrader_<date>-<time>.log
[2019-09-26 12:10:13 44526] INFO (pca_upgrader:59) Starting PCA Upgrader...
[2019-09-26 12:10:13 44526] INFO (validate_rack:29) Rack Type: hardware_blue
[2019-09-26 12:10:13 44526] INFO (pca_upgrader:62) PCA Rack Type: PCA X8_BASE.
[2019-09-26 12:10:13 44526] DEBUG (util:511) the dlm_locks command output is debugfs.ocfs2 1.8.6
[?1034hdebugfs: dlm_locks -f /sys/kernel/debug/o2dlm/ovca/locking_state
Lockres: master Owner: 0 State: 0x0
Last Used: 0 ASTs Reserved: 0 Inflight: 0 Migration Pending: No
Refs: 3 Locks: 1 On Lists: None
Reference Map: 1
Lock-Queue Node Level Conv Cookie Refs AST BAST Pending-Action
Granted 0 EX -1 0:1 2 No No None
debugfs: quit
[2019-09-26 12:10:13 44526] INFO (util:520) This node (192.168.4.3) is the master
[2019-09-26 12:10:14 44526] DEBUG (run_util:17) Writing 44633 to /var/run/ovca/upgrader.pid
[2019-09-26 12:10:14 44633] DEBUG (process_flow:37) Execute precheck steps for component: management
[2019-09-26 12:10:25 44633] INFO (precheck:159) [Validate the Image Provided (Verify that the image exists and is correctly named)] Passed
[2019-09-26 12:10:25 44633] INFO (precheck_utils:471) Checking the existence of default OVMM networks.
[2019-09-26 12:10:25 44633] INFO (precheck_utils:2248) Checking for PCA services.
[2019-09-26 12:10:25 44633] INFO (precheck_utils:1970) Checking if there are multiple tenant groups.
[...]
[2019-09-26 12:10:32 44633] INFO (precheck_utils:1334) Checking hardware faults on host ovcasw16r1.
[2019-09-26 12:10:32 44633] INFO (precheck_utils:1334) Checking hardware faults on host ilom-ovcasn02r1.
[2019-09-26 12:10:32 44633] INFO (precheck_utils:1334) Checking hardware faults on host ilom-ovcamn06r1.
[2019-09-26 12:10:32 44633] INFO (precheck_utils:1334) Checking hardware faults on host ovcacn09r1.
[2019-09-26 12:10:32 44633] INFO (precheck_utils:1334) Checking hardware faults on host ovcacn08r1.
[2019-09-26 12:10:32 44633] INFO (precheck_utils:1334) Checking hardware faults on host ilom-ovcacn08r1.
[2019-09-26 12:10:32 44633] INFO (precheck:159) [Hardware Faults Check (Verifying that there are no hardware faults on any rack component)] Passed
The check succeeded. There are no hardware faults on any rack component.
[...]
[2019-09-26 12:10:34 44633] INFO (precheck_utils:450) Checking storage...
Checking server pool...
Checking pool filesystem...
Checking rack repository Rack1-Repository...
[...]
[2019-09-26 12:10:53 44633] INFO (precheck:159) [OS Check (Checking the management nodes and compute nodes are running the correct Oracle Linux version)] Passed
The check succeeded on the management nodes. The check succeeded on all the compute nodes.
[...]
****** PCA Management Node Pre-Upgrade Checks Summary ******
[2019-09-26 12:11:00 44633] INFO (precheck:112) [Validate the Image Provided (Verify that the image exists and is correctly named)] Passed
[...]
[2019-09-26 12:11:00 44633] INFO (precheck:112) [OSA Disabled Check (Checking OSA is disabled on all management nodes and compute nodes)] Passed
The check succeeded on the management nodes. The check succeeded on all the compute nodes.
[2019-09-26 12:11:00 44633] INFO (precheck:113) ******************* End of Summary ********************
[2019-09-26 12:11:02 44633] DEBUG (process_flow:98) Successfully completed precheck. Proceeding to upgrade.
[2019-09-26 12:11:02 44633] DEBUG (process_flow:37) Execute upgrade steps for component: management
[...]
[2019-09-26 12:25:04 44633] DEBUG (mn_upgrade_steps:237) Verifying ISO image version [2019-09-26 12:25:04 44633] INFO (mn_upgrade_steps:243) Successfully downloaded ISO [2019-09-26 12:25:17 44633] INFO (mn_upgrade_utils:136) /nfs/shared_storage/pca_upgrader/scripts/ 1.2-89.el6/remote_yum_setup -t 2019_09_26-12.10.13 -f /nfs/shared_storage/pca_upgrader/pca_upgrade.repo Successfully setup the yum config [...] [2019-09-26 12:26:44 44633] DEBUG (mn_upgrade_steps:339) Successfully completed break_ha_model [2019-09-26 12:26:44 44633] INFO (util:184) Created lock: all_provisioning [2019-09-26 12:26:44 44633] INFO (util:184) Created lock: database [2019-09-26 12:26:44 44633] INFO (util:184) Created lock: cn_upgrade [2019-09-26 12:26:44 44633] INFO (util:184) Created lock: mn_upgrade [2019-09-26 12:26:44 44633] DEBUG (mn_upgrade_steps:148) Successfully completed place_pca_locks [2019-09-26 12:26:45 44633] INFO (mn_upgrade_utils:461) Beginning Yum Upgrade [...] Setting up Upgrade Process Resolving Dependencies [...] Transaction Summary ================================================================================ Install 2 Package(s) Upgrade 4 Package(s) Total download size: 47 M Downloading Packages: [...] [2019-09-26 12:35:40 44633] INFO (util:520) This node (192.168.4.3) is the master [2019-09-26 12:35:40 44633] INFO (mn_upgrade_utils:706) Beginning Oracle VM upgrade on ovcamn06r1 [...] Oracle VM upgrade script finished with success STDOUT: Oracle VM Manager Release 3.4.6 Installer Oracle VM Manager Installer log file: /var/log/ovmm/ovm-manager-3-install-2019-09-26-123633.log Verifying upgrading prerequisites ... [...] Running full database backup ... Successfully backed up database to /u01/app/oracle/mysql/dbbackup/3.4.6_preUpgradeBackup-20190926_123644 Running ovm_preUpgrade script, please be patient this may take a long time ... Exporting weblogic embedded LDAP users Stopping service on Linux: ovmcli ... Stopping service on Linux: ovmm ... Exporting core database, please be patient this may take a long time ... [...] Installation Summary -------------------- Database configuration: Database type : MySQL Database host name : localhost Database name : ovs Database listener port : 1521 Database user : ovs Weblogic Server configuration: Administration username : weblogic Oracle VM Manager configuration: Username : admin Core management port : 54321 UUID : 0004fb0000010000f1b07bf678cf43d6 Passwords: There are no default passwords for any users. The passwords to use for Oracle VM Manager, Database, and Oracle WebLogic Server have been set by you during this installation. In the case of a default install, all passwords are the same. [...] Oracle VM Manager upgrade complete. [...] Successfully started Oracle VM services Preparing to install the PCA UI [...] Successfully installed PCA UI [...] [09/26/2019 12:56:56 34239] DEBUG (complete_postupgrade_tasks:228) Copying switch config files [09/26/2019 12:56:56 34239] INFO (complete_postupgrade_tasks:231) [09/26/2019 12:56:56 34239] INFO (update:896) Scheduling post upgrade sync tasks... [...] STDOUT: Looking for [tenant group] [Rack1_ServerPool] Looking for [storage array] [OVCA_ZFSSA_Rack1] ID: 0004fb0000020000c500310305e98353 Server: ovcacn09r1 Server: ovcacn07r1 Server: ovcacn08r1 Finding the LUN that is used as the heartbeat device in tenant group by page83id. page83_ID: 3600144f09c52bc4200005d8c891f0003 [...] 
Successfully completed PCA management node post upgrade tasks

When the upgrade tasks have been completed successfully, the active management node is rebooted, and the upgraded management node assumes the active role. The new active management node's operating system is now up-to-date, and it runs the new Controller Software version and upgraded Oracle VM Manager installation.
Tip: Rebooting the management node is expected to take up to 10 minutes.
To monitor the reboot process and make sure the node comes back online as expected, log in to the rebooting management node ILOM.
Broadcast message from root@ovcamn05r1 (pts/2) (Mon Sep 26 14:48:52 2019): Management Node upgrade succeeded. The master manager will be rebooted to initiate failover in one minute.
- Log into the upgraded management node, which has now become the active management node. Use its individually assigned IP address, not the shared virtual IP.
  [root@ovcamn06r1 ~]# pca-check-master
  NODE: 192.168.4.4 MASTER: True
  [root@ovcamn06r1 ~]# head /etc/ovca-info
  ==== Begin build info ====
  date: 2019-09-30
  release: 2.4.2
  build: 404
  === Begin compute node info ===
  compute_ovm_server_version: 3.4.6
  compute_ovm_server_build: 2.4.2-631
  compute_rpms_added:
  osc-oracle-s7k-2.1.2-4.el6.noarch.rpm
  ovca-support-2.4.2-137.el6.noarch.rpm
  NO MANAGEMENT OPERATIONS DURING UPGRADE: Under no circumstances should you perform any management operations – through the Oracle Private Cloud Appliance Dashboard or CLI, or Oracle VM Manager – while the Upgrader process is running, and until both management nodes have been successfully upgraded through the Upgrader.
- From the new active management node, run the Oracle Private Cloud Appliance Upgrader command again. The target of the command must be the stand-by management node, which is the original active management node from where you executed the command for the first run.
  [root@ovcamn06r1 ~]# pca_upgrader -U -t management -c ovcamn05r1 -g 2.4.3 \
  -l http://<path-to-iso>/ovca-2.4.3-b000.iso.zip

  PCA Rack Type: PCA X8_BASE.
  Please refer to log file
  /nfs/shared_storage/pca_upgrader/log/pca_upgrader_<date>-<time>.log for more details.

  Beginning PCA Management Node Pre-Upgrade Checks...
  [...]
  ***********************************************************************
  Warning: The management precheck completed with warnings. It is safe
  to continue with the management upgrade from this point or the upgrade
  can be halted to investigate the warnings.
  ************************************************************************
  Do you want to continue? [y/n]: y
  Beginning PCA Management Node upgrade for ovcamn05r1
  [...]
  ---------------------------------------------------------------------------
  Overall Status                                                       Passed
  ---------------------------------------------------------------------------
  PCA Management Node Pre-Upgrade Checks                               Passed
  PCA Management Node Upgrade                                          Passed
  PCA Post-Upgrade Checks                                              Passed

  Broadcast message from root@ovcamn05r1 (pts/2) (Mon Sep 26 23:18:27 2019):
  Management Node upgrade succeeded. The master manager will be rebooted to initiate failover in one minute.

  Note: After the first management node is successfully upgraded, a fw_upgrade.LOCK file is created to prevent the use of CLI commands during upgrade, and remains in place until the successful completion of the storage network upgrade.
  The upgrade steps are executed the same way as during the first run. When the second management node is rebooted, the process is complete. At this point, both management nodes run the updated Oracle Linux operating system, Oracle Private Cloud Appliance Controller Software, and Oracle VM Manager. The high-availability cluster configuration of the management nodes is restored, and all Oracle Private Cloud Appliance and Oracle VM Manager management functionality is operational again. However, do not perform any management operations until you have completed the required manual post-upgrade checks.
Tip: If the first management node is inadvertently rebooted at this point, the upgrade fails on the second management node. For more information, see Inadvertent Reboot of Stand-by Management Node During Upgrade Suspends Upgrade.
- Perform the required manual post-upgrade checks on management nodes and compute nodes. Refer to Section 7.5, “Running Manual Pre- and Post-Upgrade Checks in Combination with Oracle Private Cloud Appliance Upgrader” for instructions.

When the management node cluster upgrade is complete, proceed with the following tasks:
- Firmware upgrades, as described in Section 3.3, “Upgrading Component Firmware”.
- Storage network upgrade, as described in Section 3.4, “Upgrading the Storage Network”.
- Compute node upgrades, as described in Section 3.5, “Upgrading the Virtualization Platform”.
3.3 Upgrading Component Firmware
- 3.3.1 Firmware Policy
- 3.3.2 Install the Current Firmware on All Compute and Management Nodes
- 3.3.3 Upgrading the Operating Software on the Oracle ZFS Storage Appliance
- 3.3.4 Upgrading the Cisco Switch Firmware
- 3.3.5 Upgrading the NM2-36P Sun Datacenter InfiniBand Expansion Switch Firmware
- 3.3.6 Upgrading the Oracle Fabric Interconnect F1-15 Firmware
All the software components in a given Oracle Private Cloud Appliance release are designed to work together. As a general rule, no individual appliance component should be upgraded. If a firmware upgrade is required for one or more components, the correct version is distributed inside the Oracle Private Cloud Appliance .iso file you downloaded from My Oracle Support. When the image file is unpacked on the internal shared storage, the firmware files are located in this directory: /nfs/shared_storage/mgmt_image/firmware/.
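Once the image has been unpacked, you can list the available firmware packages from either management node (the contents vary by release):

[root@ovcamn05r1 ~]# ls /nfs/shared_storage/mgmt_image/firmware/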
Do not perform any compute node provisioning operations during firmware upgrades.
For certain services it is necessary to upgrade the Hardware Management Pack after a Controller Software update. For additional information, refer to Some Services Require an Upgrade of Hardware Management Pack in the Oracle Private Cloud Appliance Release Notes.
If a specific or additional procedure to upgrade the firmware of an Oracle Private Cloud Appliance hardware component is available, it appears in this section. For components not listed here, you may follow the instructions provided in the product documentation of the subcomponent. An overview of the documentation for appliance components can be found in the Preface of this book and on the index page of the Oracle Private Cloud Appliance Documentation Library.
3.3.1 Firmware Policy
To improve Oracle Private Cloud Appliance supportability, reliability and security, Oracle has introduced a standardized approach to component firmware. The general rule remains unchanged: components and their respective firmware are designed to work together, and therefore should not be upgraded separately. However, the firmware upgrades, which are provided as part of the .iso file of a given controller software release, are no longer optional.
As part of the test process prior to a software release, combinations of component firmware are tested on all applicable hardware platforms. This allows Oracle to deliver a fully qualified set of firmware for the appliance as a whole, corresponding to a software release. In order to maintain their Oracle Private Cloud Appliance in a qualified state, customers who upgrade to a particular software release are expected to also install all the qualified firmware upgrades delivered as part of the controller software.
The firmware versions that have been qualified by Oracle for a given release are listed in the Oracle Private Cloud Appliance Release Notes for that release. Please refer to the Release Notes for the Oracle Private Cloud Appliance Controller Software release running on your system, and open the chapter Firmware Qualification.
Note that the file names shown in the procedures below may not exactly match the file names in the .iso image on your system.
Oracle periodically releases firmware patches for many products, for example to eliminate security vulnerabilities. It may occur that an important firmware patch is released for a component of Oracle Private Cloud Appliance outside of the normal Controller Software release schedule. When this occurs, the patches go through the same testing as all other appliance firmware, but they are not added to the qualified firmware list or the installation .iso for the affected Controller Software release.

After thorough testing, important firmware patches that cannot be included in the Controller Software .iso image are made available to Oracle Private Cloud Appliance users through My Oracle Support.
3.3.2 Install the Current Firmware on All Compute and Management Nodes
To avoid compatibility issues with newer Oracle Private Cloud Appliance Controller Software and Oracle VM upgrades, you should always install the server ILOM firmware included in the ISO image of the current Oracle Private Cloud Appliance software release. When the ISO image is unpacked on the appliance internal storage, the firmware directory can be reached from the management nodes at this location: /nfs/shared_storage/mgmt_image/firmware/.
For management and compute node firmware upgrade instructions, refer to the Administration Guide of the server series installed in your appliance rack.
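Before and after the upgrade, you can record the installed ILOM and BIOS versions on each node with the command introduced in Section 3.1.3 (run locally on the node; the host name is illustrative):

[root@ovcacn07r1 ~]# fwupdate list sp_bios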
3.3.3 Upgrading the Operating Software on the Oracle ZFS Storage Appliance
The instructions in this section are specific for a component firmware upgrade as part of the Oracle Private Cloud Appliance.
During this procedure, the Oracle Private Cloud Appliance services on the management nodes must be halted for a period of time. Plan this upgrade carefully, so that no compute node provisioning, Oracle Private Cloud Appliance configuration changes, or Oracle VM Manager operations are taking place at the same time.
The statement below regarding the two-phased procedure does not apply to X8-2 or newer systems. The Oracle ZFS Storage Appliance ZS7-2 comes with a more recent firmware version that is not affected by the issue described.
If the Oracle ZFS Storage Appliance is running a firmware version older than 8.7.14, an intermediate upgrade to version 8.7.14 is required. Version 8.7.14 can then be upgraded to the intended newer version. For additional information, refer to Oracle ZFS Storage Appliance Firmware Upgrade 8.7.20 Requires A Two-Phased Procedure in the Oracle Private Cloud Appliance Release Notes.
Detailed information about software upgrades can be found in the Oracle ZFS Storage Appliance Customer Service Manual (document ID: F13771). Refer to the section “Upgrading the Software”.
The Oracle Private Cloud Appliance internal ZFS Storage Appliance contains two clustered controllers in an active/passive configuration. You may disregard the upgrade information for standalone controllers.
- Before initiating the upgrade on the storage controllers, follow the preparation instructions in the Oracle ZFS Storage Appliance Customer Service Manual. Refer to the section entitled “Preparing for a Software Upgrade”.
- Log on to the active management node using SSH and an account with superuser privileges.
- If you are upgrading a ZFS Storage Appliance running firmware version 8.7.14 or newer, skip this step and proceed to the next step.
  If you are upgrading a ZFS Storage Appliance running a firmware version older than 8.7.14, unzip the firmware package p27357887_20131_Generic.zip included in the Oracle Private Cloud Appliance software image.
  [root@ovcamn05r1 ~]# mkdir /nfs/shared_storage/yum/ak
  [root@ovcamn05r1 ~]# cd /nfs/shared_storage/yum/ak
  [root@ovcamn05r1 ak]# unzip /nfs/shared_storage/mgmt_image/firmware/storage/AK_NAS/p27357887_20131_Generic.zip
  Archive: /nfs/shared_storage/mgmt_image/firmware/storage/AK_NAS/p27357887_20131_Generic.zip
  extracting: ak-nas-2013-06-05-7-14-1-1-1-nd.pkg.gz
  inflating: OS8714_Readme.html
- Select the appropriate software update package:
  Caution: The procedure shows the upgrade to version 8.8.20 IDR. For an upgrade to version 8.7.14, substitute the file name in the commands as shown here.
  - Version 8.8.20 IDR: ak-nas-2013.06.05.8.20-1.2.20.4392.1x-nondebug.pkg
  - Version 8.7.14: ak-nas-2013-06-05-7-14-1-1-1-nd.pkg.gz
  Download the software update package to both storage controllers. Their management IP addresses are 192.168.4.1 and 192.168.4.2.
  - Log on to one of the storage controllers using SSH and an account with superuser privileges.
    [root@ovcamn05r1 ~]# ssh root@192.168.4.1
    Password:
    ovcasn01r1:>
  - Enter the following series of commands to download the software update package from the shared storage directory to the controller.
    ovcasn01r1:> maintenance system updates download
    ovcasn01r1:maintenance system updates download (uncommitted)> \
    set url=http://192.168.4.100/shares/export/Yum/ak/ak-nas-2013.06.05.8.20-1.2.20.4392.1x-nondebug.pkg
    url = http://192.168.4.100/shares/export/Yum/ak/ak-nas-2013.06.05.8.20-1.2.20.4392.1x-nondebug.pkg
    ovcasn01r1:maintenance system updates download (uncommitted)> set user=root
    user = root
    ovcasn01r1:maintenance system updates download (uncommitted)> set password
    Enter password:
    password = ********
    ovcasn01r1:maintenance system updates download (uncommitted)> commit
    Transferred 1.90G of 1.90G (100%) ... done
    Unpacking ... done
  - Wait for the package to fully download and unpack before proceeding.
  - Repeat these steps for the second storage controller.
- Check the storage cluster configuration and make sure you are logged on to the standby controller.
ovcasn02r1:> configuration cluster show Properties: state = AKCS_STRIPPED description = Ready (waiting for failback) peer_asn = 80e4823f-1573-caaa-8c44-fb3c8af3b921 peer_hostname = ovcasn01r1 peer_state = AKCS_OWNER peer_description = Active (takeover completed) Children: resources => Configure resources
- Always upgrade the operating software first on the standby controller.
- Display the available operating software versions and select the version you downloaded.
ovcasn02r1:> maintenance system updates ovcasn02r1:maintenance system updates> ls Updates: UPDATE RELEASE DATE STATUS ak-nas@2013.06.05.4.2,1-1.1 2015-6-16 17:03:41 previous ak-nas@2013.06.05.7.14,1-1.1 2018-1-6 17:16:42 previous ak-nas@2013.06.05.7.20,1-1.4 2018-7-17 02:27:51 current ak-nas@2013.06.05.8.20,1-1.3 2020-3-23 16:13:25 waiting * ak-nas@2013.06.05.8.20,1-2.20.4392.1 2020-5-22 00:35:42 waiting [*] : Interim Diagnostics and Relief (IDR) Deferred updates: The appliance is currently configured as part of a cluster. The cluster peer may have shared resources for which deferred updates are available. After all updates are completed, check both cluster peers for any deferred updates. ovcasn02r1:maintenance system updates> select ak-nas@2013.06.05.8.20,1-2.20.4392.1
- Launch the upgrade process with the selected software version.
ovcasn02r1:maintenance system updates> upgrade This procedure will consume several minutes and requires a system reboot upon successful update, but can be aborted with [Control-C] at any time prior to reboot. A health check will validate system readiness before an update is attempted, and may also be executed independently using the check command. Are you sure? (Y/N) Y Updating from ... ak/nas@2013.06.05.7.20,1-1.4 Loading media metadata ... done. Selecting alternate product ... SUNW,maguro_plus Installing Sun ZFS Storage 7320 2013.06.05.8.20,1-2.20.4392.1 pkg://sun.com/ak/SUNW,maguro_plus@2013.06.05.8.20,1-2.20.4392.1:20200522T003535Z Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1 ... done. Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1/install ... done. Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1/boot ... done. Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1/root ... done. Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1/install/svc ... done. Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1/install/var ... done. Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1/install/home ... done. Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1/install/stash ... done. Creating system/ak-nas-2013.06.05.8.20_1-2.20.4392.1/wiki ... done. Extracting os image ... done. Customizing Solaris ... done. Creating driver_aliases.addendum... done. Updating vfstab ... done. Generating usr/man windex ... done. Generating usr/gnu/share/man windex ... done. Generating usr/perl5/man windex ... done. Preserving ssh keys ... done. Generating ssh ecdsa key ... done. Configuring smf(5) ... done. Extracting appliance kit ... Creating private passwd and shadow files ... done. Creating private smbshadow file ... done. Creating product symlink ... done. Registering update job 311edb88-0a8c-62f3-e6c9-b21657c681f6 ... done. Creating install profile ... done. Assigning appliance serial number ... 7c3c83c8-bfb9-6ade-e824-9b77b1682ed7 Determining chassis serial number ... 1311FMM04B Setting appliance product string ... SUNW,maguro_plus Setting appliance product class ... nas Setting install timestamp ... done. Setting virtualization status ... done. Saving SSL keys ... done. Updating phone-home key ... done. Saving currently running profile ... done. Saving apache22 properties ... done. Installing firmware ... done. Installing device links ... done. Installing device files ... done. Updating device links ... done. Updating /etc ... done. Creating /.domainroot ... done. Installing boot amd64/unix ... done. Assembling etc/system.d ... done. Creating factory reset boot archive ... done. Generating GRUB2legacy configuration ... done. Installing GRUB2legacy configuration ... done. Snapshotting zfs filesystems ... done. Installation complete - unmounting datasets ... Creating boot archive ... done. Update completed; rebooting.
- At the end of the upgrade, when the controller has fully rebooted and rejoined the cluster, log back in and check the cluster configuration. The upgraded controller must still be in the state "Ready (waiting for failback)".
ovcasn02r1:> configuration cluster show Properties: state = AKCS_STRIPPED description = Ready (waiting for failback) peer_asn = 8a535bd2-160f-c93b-9575-d29d4c86cac5 peer_hostname = ovcasn01r1 peer_state = AKCS_OWNER peer_description = Active (takeover completed)
- From the Oracle Private Cloud Appliance active management node, stop the Oracle Private Cloud Appliance services.
  Caution: You must perform this step if you are upgrading to Controller Software versions 2.3.1, 2.3.2, or 2.3.3. It is not required when upgrading to Controller Software version 2.3.4 or later. Executing the storage controller operating software upgrade while the Oracle Private Cloud Appliance services are running will result in errors and possible downtime.
[root@ovcamn05r1 ~]# service ovca stop
- Upgrade the operating software on the second storage controller.
- Check the storage cluster configuration. Make sure you are logged on to the active controller.
ovcasn01r1:> configuration cluster show Properties: state = AKCS_OWNER description = Active (takeover completed) peer_asn = 34e4292a-71ae-6ce1-e26c-cc38c2af9719 peer_hostname = ovcasn02r1 peer_state = AKCS_STRIPPED peer_description = Ready (waiting for failback)
- Display the available operating software versions and select the version you downloaded.
ovcasn01r1:> maintenance system updates ovcasn01r1:maintenance system updates> show Updates: UPDATE RELEASE DATE RELEASE NAME STATUS ak-nas@2013.06.05.8.5,1-1.3 2019-3-30 07:27:20 OS8.8.5 previous ak-nas@2013.06.05.8.6,1-1.4 2019-6-21 20:56:45 OS8.8.6 current ak-nas@2013.06.05.8.20,1-1.3 2020-8-16 09:57:19 OS8.8.20 waiting ovcasn01r1:maintenance system updates> select ak-nas@2013.06.05.8.20,1-1.3
- Launch the upgrade process with the selected software version.
ovcasn01r1:maintenance system updates> upgrade This procedure will consume several minutes and requires a system reboot upon successful update, but can be aborted with [Control-C] at any time prior to reboot. A health check will validate system readiness before an update is attempted, and may also be executed independently using the check command. Are you sure? (Y/N) Y
- At the end of the upgrade, when the controller has fully rebooted and rejoined the cluster, log back in and check the cluster configuration.
ovcasn01r1:> configuration cluster show Properties: state = AKCS_STRIPPED description = Ready (waiting for failback) peer_asn = 34e4292a-71ae-6ce1-e26c-cc38c2af9719 peer_hostname = ovcasn02r1 peer_state = AKCS_OWNER peer_description = Active (takeover completed)
The last upgraded controller must now be in the state "Ready (waiting for failback)". The controller that was upgraded first took over the active role during the upgrade and reboot of the second controller, which held the active role originally.
- Now that both controllers have been upgraded, verify that all disks are online.
ovcasn01r1:> maintenance hardware show [...] NAME STATE MANUFACTURER MODEL SERIAL RPM TYPE chassis-000 1906NMQ803 ok Oracle Oracle Storage DE3-24C 1906NMQ803 7200 hdd disk-000 HDD 0 ok WDC W7214A520ORA014T 001851N3VKLT 9JG3VKLT 7200 data disk-001 HDD 1 ok WDC W7214A520ORA014T 001851N5K85T 9JG5K85T 7200 data disk-002 HDD 2 ok WDC W7214A520ORA014T 001851N5MPXT 9JG5MPXT 7200 data disk-003 HDD 3 ok WDC W7214A520ORA014T 001851N5L08T 9JG5L08T 7200 data disk-004 HDD 4 ok WDC W7214A520ORA014T 001851N42KNT 9JG42KNT 7200 data [...]
- Initiate an Oracle Private Cloud Appliance management node failover and wait until all services are restored on the other management node. This helps prevent connection issues between Oracle VM and the ZFS storage.
- Log on to the active management node using SSH and an account with superuser privileges.
- Reboot the active management node.
[root@ovcamn05r1 ~]# pca-check-master NODE: 192.168.4.3 MASTER: True [root@ovcamn05r1 ~]# shutdown -r now
- Log on to the other management node and wait until the necessary services are running.
  Note: Enter this command at the prompt: tail -f /var/log/messages. The log messages should indicate when the management node takes over the active role.
  Verify the status of the services:
[root@ovcamn06r1 ~]# service ovca status Checking Oracle Fabric Manager: Running MySQL running (70254) [ OK ] Oracle VM Manager is running... Oracle VM Manager CLI is running... tinyproxy (pid 71315 71314 71313 71312 71310 71309 71308 71307 71306 71305 71301) is running... dhcpd (pid 71333) is running... snmptrapd (pid 71349) is running... log server (pid 6359) is running... remaster server (pid 6361) is running... http server (pid 71352) is running... taskmonitor server (pid 71356) is running... xmlrpc server (pid 71354) is running... nodestate server (pid 71358) is running... sync server (pid 71360) is running... monitor server (pid 71363) is running...
- When the storage controller cluster has been upgraded, remove the shared storage directory you created to make the unzipped package available.
# cd /nfs/shared_storage/yum/ak # ls ak-nas-2013.06.05.8.20-1.1.3x-nondebug.pkg OS8.8.20_Readme.html # rm ak-nas-2013.06.05.8.20-1.2.20.4392.1x-nondebug.pkg OS8.8.20_Readme.html rm: remove regular file `ak-nas-2013.06.05.8.20-1.1.3x-nondebug.pkg'? yes rm: remove regular file `OS8.8.20_Readme.html'? yes # cd .. # rmdir ak
3.3.4 Upgrading the Cisco Switch Firmware
The instructions in this section are specific for a component firmware upgrade as part of the Oracle Private Cloud Appliance. The Cisco switches require two upgrade procedures: upgrading the Cisco NX-OS software and upgrading the electronic programmable logic device (EPLD). Perform both procedures on each of the switches.
Cisco switches are part of systems with an Ethernet-based network architecture.
When upgrading to Controller Software version 2.4.3, it is critical that you perform the upgrade operations in the correct order. This means the Cisco switch firmware must be upgraded after the management node upgrade, but before the storage network upgrade.
Do not make any spine switch configuration changes until all the upgrade operations have been completed, otherwise you could lose access to the storage network. See Loading Incompatible Spine Switch Configuration Causes Storage Network Outage in the Oracle Private Cloud Appliance Release Notes.
-
Log on to the active management node using SSH and an account with superuser privileges.
-
Verify that the new Cisco NX-OS software image is available on the appliance shared storage. During the Controller Software update, the Oracle Private Cloud Appliance Upgrader copies the file to this location:
/nfs/shared_storage/mgmt_image/firmware/ethernet/Cisco/nxos.7.0.3.I7.8.bin
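If you want to confirm the file is in place before proceeding, a simple listing from the active management node is enough; this is a minimal sketch, and the expected file size is not reproduced here.

ls -lh /nfs/shared_storage/mgmt_image/firmware/ethernet/Cisco/nxos.7.0.3.I7.8.bin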
-
Log on as admin to the switch you wish to upgrade.
[root@ovcamn05r1 ~]# ssh admin@ovcasw15r1
User Access Verification
Password:
ovcasw15r1#
Please upgrade the switches, one at a time, in this order:
-
Leaf Cisco Nexus 9336C-FX2 Switches: ovcasw15r1, ovcasw16r1
-
Spine Cisco Nexus 9336C-FX2 Switches: ovcasw22r1, ovcasw23r1
-
Management Cisco Nexus 9348GC-FXP Switch: ovcasw21r1
-
-
Copy the Cisco NX-OS software file to the bootflash location on the switch.
The copy command for the management switch ovcasw21r1 is slightly different. Select the appropriate option.
-
Leaf and Spine switches:
ovcasw15r1# copy scp://root@192.168.4.216//nfs/shared_storage/mgmt_image/firmware/ethernet \
/Cisco/nxos.7.0.3.I7.8.bin bootflash:nxos.7.0.3.I7.8.bin vrf management
root@192.168.4.216's password:
nxos.7.0.3.I7.8.bin    100%  937MB  16.2MB/s  00:58
Copy complete, now saving to disk (please wait)...
Copy complete.
-
Management switch:
ovcasw21r1# copy scp://root@192.168.4.216//nfs/shared_storage/mgmt_image/firmware/ethernet \
/Cisco/nxos.7.0.3.I7.8.bin bootflash:nxos.7.0.3.I7.8.bin vrf default
root@192.168.4.216's password:
nxos.7.0.3.I7.8.bin    100%  937MB  16.2MB/s  00:58
Copy complete, now saving to disk (please wait)...
Copy complete.
-
-
Verify the impact of the software upgrade.
ovcasw15r1# show install all impact nxos bootflash:nxos.7.0.3.I7.8.bin Installer will perform impact only check. Please wait. Verifying image bootflash:/nxos.7.0.3.I7.8.bin for boot variable "nxos". [####################] 100% -- SUCCESS Verifying image type. [####################] 100% -- SUCCESS Preparing "nxos" version info using image bootflash:/nxos.7.0.3.I7.8.bin. [####################] 100% -- SUCCESS Preparing "bios" version info using image bootflash:/nxos.7.0.3.I7.8.bin. [####################] 100% -- SUCCESS Performing module support checks. Notifying services about system upgrade. Compatibility check is done: Module bootable Impact Install-type Reason ------ -------- -------------- ------------ ------ 1 yes disruptive reset default upgrade is not hitless Images will be upgraded according to following table: Module Image Running-Version(pri:alt) New-Version Upg-Required ------ ---------- ---------------------------------------- -------------------- ------------ 1 nxos 7.0(3)I7(7) 7.0(3)I7(8) yes 1 bios v05.38(06/12/2019):v05.33(09/08/2018) v05.38(06/12/2019) no
-
Save the current running configuration as the startup configuration.
ovcasw15r1# copy running-config startup-config
[########################################] 100%
Copy complete, now saving to disk (please wait)...
Copy complete.
-
Install the Cisco NX-OS software that was copied to the bootflash location. When prompted about the disruptive upgrade, enter y to continue with the installation.
ovcasw15r1# install all nxos bootflash:nxos.7.0.3.I7.8.bin Installer will perform compatibility check first. Please wait. Installer is forced disruptive Verifying image bootflash:/nxos.7.0.3.I7.8.bin for boot variable "nxos". [####################] 100% -- SUCCESS Verifying image type. [####################] 100% -- SUCCESS Preparing "nxos" version info using image bootflash:/nxos.7.0.3.I7.8.bin. [####################] 100% -- SUCCESS Preparing "bios" version info using image bootflash:/nxos.7.0.3.I7.8.bin. [####################] 100% -- SUCCESS Performing module support checks. Notifying services about system upgrade. Compatibility check is done: Module bootable Impact Install-type Reason ------ -------- -------------- ------------ ------ 1 yes disruptive reset default upgrade is not hitless Images will be upgraded according to following table: Module Image Running-Version(pri:alt) New-Version Upg-Required ------ ---------- ---------------------------------------- -------------------- ------------ 1 nxos 7.0(3)I7(7) 7.0(3)I7(8) yes 1 bios v05.38(06/12/2019):v05.33(09/08/2018) v05.38(06/12/2019) no Switch will be reloaded for disruptive upgrade. Do you want to continue with the installation (y/n)? [n] y Install is in progress, please wait. Performing runtime checks. [####################] 100% -- SUCCESS Setting boot variables. [####################] 100% -- SUCCESS Performing configuration copy. [####################] 100% -- SUCCESS Module 1: Refreshing compact flash and upgrading bios/loader/bootrom. Warning: please do not remove or power off the module at this time. [####################] 100% -- SUCCESS Finishing the upgrade, switch will reboot in 10 seconds.
-
After the switch reboots, confirm that the installation succeeded.
ovcasw15r1# show install all status This is the log of last installation. Verifying image bootflash:/nxos.7.0.3.I7.8.bin for boot variable "nxos". -- SUCCESS Verifying image type. -- SUCCESS Preparing "nxos" version info using image bootflash:/nxos.7.0.3.I7.8.bin. -- SUCCESS Preparing "bios" version info using image bootflash:/nxos.7.0.3.I7.8.bin. -- SUCCESS Performing module support checks. -- SUCCESS Notifying services about system upgrade. -- SUCCESS Compatibility check is done: Module bootable Impact Install-type Reason ------ -------- -------------- ------------ ------ 1 yes disruptive reset default upgrade is not hitless Images will be upgraded according to following table: Module Image Running-Version(pri:alt) New-Version Upg-Required ------ ---------- ---------------------------------------- -------------------- ------------ 1 nxos 7.0(3)I7(7) 7.0(3)I7(8) yes 1 bios v05.38(06/12/2019):v05.33(09/08/2018) v05.38(06/12/2019) no Switch will be reloaded for disruptive upgrade. Install is in progress, please wait. Performing runtime checks. -- SUCCESS Setting boot variables. -- SUCCESS Performing configuration copy. -- SUCCESS Module 1: Refreshing compact flash and upgrading bios/loader/bootrom. Warning: please do not remove or power off the module at this time. -- SUCCESS Finishing the upgrade, switch will reboot in 10 seconds.
-
Verify that the correct software version is active on the switch.
ovcasw15r1# show version
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (C) 2002-2020, Cisco and/or its affiliates. All rights reserved.
[...]
Software
  BIOS: version 05.38
  NXOS: version 7.0(3)I7(8)
  BIOS compile time: 06/12/2019
  NXOS image file is: bootflash:///nxos.7.0.3.I7.8.bin
  NXOS compile time: 3/3/2020 20:00:00 [03/04/2020 04:49:49]
[...]
ovcasw15r1#
-
Verify the VPC status.
Note: This step does not apply to the appliance internal management network switch (Cisco Nexus 9348GC-FXP Switch). Proceed to the next step.
Use the command shown below. The output values should match this example.
ovcasw15r1# show vpc brief
Legend:
                (*) - local vPC is down, forwarding via vPC peer-link

vPC domain id                     : 2
Peer status                       : peer adjacency formed ok      <---- verify this field
vPC keep-alive status             : peer is alive                 <----- verify this field
Configuration consistency status  : success                       <----- verify this field
Per-vlan consistency status       : success                       <----- verify this field
Type-2 consistency status         : success                       <----- verify this field
vPC role                          : primary, operational secondary
Number of vPCs configured         : 27
Peer Gateway                      : Enabled
Dual-active excluded VLANs        : -
Graceful Consistency Check        : Enabled
Auto-recovery status              : Disabled
Delay-restore status              : Timer is off.(timeout = 30s)
Delay-restore SVI status          : Timer is off.(timeout = 10s)
Operational Layer3 Peer-router    : Enabled
-
Log out of the switch. The firmware has been upgraded successfully.
-
Proceed to the next Cisco switch in the appliance. Upgrade the switches, one at a time, in this order:
-
Leaf Cisco Nexus 9336C-FX2 Switches: ovcasw15r1, ovcasw16r1
-
Spine Cisco Nexus 9336C-FX2 Switches: ovcasw22r1, ovcasw23r1
-
Management Cisco Nexus 9348GC-FXP Switch: ovcasw21r1
Note: During the upgrade of the switch software on the management switch, there is network disruption between compute nodes, management nodes, the storage node, and the Leaf and Spine switch management connections, because the switch reboots as part of the upgrade process.
Caution: Once an upgrade to Controller Software release 2.4.3 is complete on the spine switches, do not attempt to reload a spine switch backup from a prior software release. Doing so could cause the management nodes to lose access to the storage network.
-
The following procedure describes the second of the two Cisco switch upgrades: upgrading the electronic programmable logic device (EPLD) firmware.
-
Log on to the active management node using SSH and an account with superuser privileges.
-
Verify that the new Cisco NX-OS EPLD firmware image is available on the appliance shared storage. During the Controller Software update, the Oracle Private Cloud Appliance Upgrader copies the file to this location:
/nfs/shared_storage/mgmt_image/firmware/ethernet/Cisco/n9000-epld.7.0.3.I7.8.img
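Besides listing the file, you can record a checksum on the management node and compare it with the value published for the image, if you have one; a hedged sketch, with the reference checksum itself not reproduced here.

md5sum /nfs/shared_storage/mgmt_image/firmware/ethernet/Cisco/n9000-epld.7.0.3.I7.8.img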
-
Log on as admin to the switch you wish to upgrade.
[root@ovcamn05r1 ~]# ssh admin@ovcasw15r1
User Access Verification
Password:
ovcasw15r1#
Please upgrade the switches, one at a time, in this order:
-
Leaf Cisco Nexus 9336C-FX2 Switches: ovcasw15r1, ovcasw16r1
-
Spine Cisco Nexus 9336C-FX2 Switches: ovcasw22r1, ovcasw23r1
-
Management Cisco Nexus 9348GC-FXP Switch: ovcasw21r1
-
-
Copy the firmware file to the bootflash location on the switch.
The copy command for the management switch ovcasw21r1 is slightly different. Select the appropriate option.
-
Leaf and Spine switches:
ovcasw15r1# copy scp://root@192.168.4.216//nfs/shared_storage/mgmt_image/firmware/ethernet \
/Cisco/n9000-epld.7.0.3.I7.8.img bootflash:n9000-epld.7.0.3.I7.8.img vrf management
root@192.168.4.216's password:
n9000-epld.7.0.3.I7.8.img    100%  142MB  15.8MB/s  00:09
Copy complete, now saving to disk (please wait)...
Copy complete.
-
Management switch:
ovcasw21r1# copy scp://root@192.168.4.216//nfs/shared_storage/mgmt_image/firmware/ethernet \
/Cisco/n9000-epld.7.0.3.I7.8.img bootflash:n9000-epld.7.0.3.I7.8.img vrf default
root@192.168.4.216's password:
n9000-epld.7.0.3.I7.8.img    100%  142MB  15.8MB/s  00:09
Copy complete, now saving to disk (please wait)...
Copy complete.
-
-
Verify the impact of the EPLD upgrade.
ovcasw15r1# show install all impact epld bootflash:n9000-epld.7.0.3.I7.8.img
Retrieving EPLD versions.... Please wait.
Images will be upgraded according to following table:
Module  Type  EPLD     Running-Version  New-Version  Upg-Required
------  ----  -------  ---------------  -----------  ------------
     1  SUP   MI FPGA  0x04             0x05         Yes
     1  SUP   IO FPGA  0x09             0x11         Yes
Compatibility check:
Module  Type  Upgradable  Impact      Reason
-------------------------------------------------
     1  SUP   Yes         disruptive  Module Upgradable
-
Save the current running configuration as the startup configuration.
ovcasw15r1# copy running-config startup-config
[########################################] 100%
Copy complete, now saving to disk (please wait)...
Copy complete.
Note: You must upgrade both the primary and golden regions of the FPGA; however, only one upgrade is allowed per reload to avoid programming errors. The next steps describe upgrading both regions of the FPGA.
-
Install the Cisco EPLD software that was copied to the bootflash location to the primary region of the FPGA. When prompted about the switch reload, enter y to continue with the installation.
ovcasw15r1# install epld bootflash:n9000-epld.7.0.3.I7.8.img module 1 Digital signature verification is successful Compatibility check: Module Type Upgradable Impact Reason ------ ----------------- ---------- ---------- ------ 1 SUP Yes disruptive Module Upgradable Retrieving EPLD versions.... Please wait. Images will be upgraded according to following table: Module Type EPLD Running-Version New-Version Upg-Required ------ ---- ------------- --------------- ----------- ------------ 1 SUP MI FPGA 0x04 0x05 No 1 SUP IO FPGA 0x09 0x11 Yes The above modules require upgrade. The switch will be reloaded at the end of the upgrade Do you want to continue (y/n) ? [n] y Proceeding to upgrade Modules. Starting Module 1 EPLD Upgrade Module 1 : IO FPGA [Programming] : 100.00% ( 64 of 64 sectors) Module 1 EPLD upgrade is successful. Module Type Upgrade-Result ------ ------------------ -------------- 1 SUP Success EPLDs upgraded. Module 1 EPLD upgrade is successful. Reseting Active SUP (Module 1) FPGAs. Please wait...
Caution: Do not interrupt, power cycle, or reload the switch during the upgrade.
-
The switch reloads automatically and boots from the backup FPGA. Confirm the primary module upgrade succeeded.
ovcasw15r1# show version module 1 epld
EPLD Device     Version
---------------------------------------
MI FPGA         0x5
IO FPGA         0x9
At this point, the MI FPGA version is upgraded, but the IO FPGA version is not upgraded.
-
Install the Cisco EPLD software that was copied to the bootflash location to the golden region of the FPGA. When prompted about the switch reload, enter y to continue with the installation.
ovcasw15r1# install epld bootflash:n9000-epld.7.0.3.I7.8.img module 1 golden Digital signature verification is successful Compatibility check: Module Type Upgradable Impact Reason ------ ----------------- ---------- ---------- ------ 1 SUP Yes disruptive Module Upgradable Retrieving EPLD versions.... Please wait. The above modules require upgrade. The switch will be reloaded at the end of the upgrade Do you want to continue (y/n) ? [n] y Proceeding to upgrade Modules. Starting Module 1 EPLD Upgrade Module 1 : MI FPGA [Programming] : 100.00% ( 64 of 64 sectors) Module 1 : IO FPGA [Programming] : 100.00% ( 64 of 64 sectors) Module 1 EPLD upgrade is successful. Module Type Upgrade-Result ------ ------------------ -------------- 1 SUP Success EPLDs upgraded. Module 1 EPLD upgrade is successful. Reseting Active SUP (Module 1) FPGAs. Please wait...
Caution: Do not interrupt, power cycle, or reload the switch during the upgrade.
-
The switch reloads automatically and boots from the backup FPGA. Confirm that both upgrades succeeded.
ovcasw15r1# show version module 1 epld
EPLD Device     Version
---------------------------------------
MI FPGA         0x5
IO FPGA         0x11
-
Verify the VPC status.
Note: This step does not apply to the appliance internal management network switch (Cisco Nexus 9348GC-FXP Switch). Proceed to the next step.
Use the command shown below. The output values should match this example.
ovcasw15r1# show vpc brief
Legend:
                (*) - local vPC is down, forwarding via vPC peer-link

vPC domain id                     : 2
Peer status                       : peer adjacency formed ok      <---- verify this field
vPC keep-alive status             : peer is alive                 <----- verify this field
Configuration consistency status  : success                       <----- verify this field
Per-vlan consistency status       : success                       <----- verify this field
Type-2 consistency status         : success                       <----- verify this field
vPC role                          : primary, operational secondary
Number of vPCs configured         : 27
Peer Gateway                      : Enabled
Dual-active excluded VLANs        : -
Graceful Consistency Check        : Enabled
Auto-recovery status              : Disabled
Delay-restore status              : Timer is off.(timeout = 30s)
Delay-restore SVI status          : Timer is off.(timeout = 10s)
Operational Layer3 Peer-router    : Enabled
-
Log out of the switch. The firmware has been upgraded successfully.
-
Proceed to the next Cisco switch in the appliance. Upgrade the switches, one at a time, in this order:
-
Leaf Cisco Nexus 9336C-FX2 Switches: ovcasw15r1, ovcasw16r1
-
Spine Cisco Nexus 9336C-FX2 Switches: ovcasw22r1, ovcasw23r1
-
Management Cisco Nexus 9348GC-FXP Switch: ovcasw21r1
Note: During the upgrade of the switch software on the management switch, there is network disruption between compute nodes, management nodes, the storage node, and the Leaf and Spine switch management connections, because the switch reboots as part of the upgrade process.
-
3.3.5 Upgrading the NM2-36P Sun Datacenter InfiniBand Expansion Switch Firmware
The instructions in this section are specific for a component firmware upgrade as part of the Oracle Private Cloud Appliance.
InfiniBand switches are part of systems with an InfiniBand-based network architecture.
For firmware upgrades to version 2.2.8 or newer, an intermediate upgrade to unsigned version 2.2.7-2 is required. Version 2.2.7-2 can then be upgraded to the intended newer version. For additional information, refer to NM2-36P Sun Datacenter InfiniBand Expansion Switch Firmware Upgrade 2.2.9-3 Requires A Two-Phased Procedure in the Oracle Private Cloud Appliance Release Notes.
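As an illustration of this rule only, the following shell sketch decides whether the intermediate 2.2.7-2 step applies; the current version string is assumed to have been read manually from the switch with the version command, and the values shown are examples.

current="2.2.5-3"   # example value, as reported by the switch 'version' command
low=$(printf '%s\n%s\n' "$current" "2.2.8" | sort -V | head -n1)
if [ "$low" = "$current" ] && [ "$current" != "2.2.7-2" ]; then
    echo "two-phased upgrade: install unsigned 2.2.7-2 first, then the intended version"
else
    echo "direct upgrade to the intended version should be possible"
fi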
Detailed information about firmware upgrades can be found in the Sun Datacenter InfiniBand Switch 36 Product Notes for Firmware Version 2.2 (document ID: E76431). Refer to the section “Upgrading the Switch Firmware”.
It is recommended that you back up the current configuration of the NM2-36P Sun Datacenter InfiniBand Expansion Switches prior to performing a firmware upgrade.
Backup and restore instructions are provided in the maintenance and configuration management sections of the Oracle ILOM Administration Guide that corresponds with the ILOM version currently running on the switch.
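As a hedged sketch of such a backup (the backup user, host, and destination path below are hypothetical, and the exact ILOM property names can vary between ILOM versions), the switch ILOM configuration can typically be dumped to a remote host from the ILOM prompt:

-> set /SP/config dump_uri=scp://backupuser@192.168.4.216/backups/ovcasw19r1-ilom-config.xml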
-
Log on to the active management node using SSH and an account with superuser privileges.
-
Unzip the firmware package included in the Oracle Private Cloud Appliance software image.
[root@ovcamn05r1 ~]# mkdir /nfs/shared_storage/yum/nm2
[root@ovcamn05r1 ~]# cd /nfs/shared_storage/yum/nm2
[root@ovcamn05r1 nm2]# unzip /nfs/shared_storage/mgmt_image/firmware/IB_gateway/NM2-36P/p29623156_2213_Generic.zip
Archive:  /nfs/shared_storage/mgmt_image/firmware/IB_gateway/NM2-36P/p29623156_2213_Generic.zip
  inflating: license.txt
  inflating: readme_SUN_DCS_36p_2.2.13-2.txt
   creating: SUN_DCS_36p_2.2.13-2/
   creating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/
  inflating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/sundcs_36p_repository_2.2.13_2.pkg
  inflating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/SUN-FABRIC-MIB.mib
  inflating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/SUN-PLATFORM-MIB.mib
  inflating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/SUN-ILOM-CONTROL-MIB.mib
  inflating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/ENTITY-MIB.mib
  inflating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/SUN-DCS-IB-MIB.txt
  inflating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/SUN-HW-TRAP-MIB.mib
  inflating: SUN_DCS_36p_2.2.13-2/SUN_DCS_36p_2.2.13-2_metadata.xml
  inflating: SUN_DCS_36p_2.2.13-2/README_pkey_filter
  inflating: SUN_DCS_36p_2.2.13-2/pkey_filter.pl
  inflating: SUN_DCS_36p_2.2.13-2_THIRDPARTYLICENSE.pdf
  inflating: written_offer_source_code.pdf
-
Log on to one of the InfiniBand switches as root.
[root@ovcamn05r1 ~]# ssh root@192.168.4.202
root@192.168.4.202's password:
FW upgrade completed successfully on Thu Sep 3 14:11:16 UTC 2020.
Please run the "fwverify" CLI command to verify the new image.
This message will be cleared on next reboot.
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented config files.
To view the list of documented commands, use "help" at linux prompt.
-
Check the active configuration and the state of the SubnetManager.
[root@ilom-ovcasw19r1 ~]# getmaster
Local SM not enabled
Last change in Master SubnetManager status detected at: Thu Sep 3 14:12:03 UTC 2020
Master SubnetManager on sm lid 3 sm guid 0x139702010013ac : ovcasw22r1
Master SubnetManager Activity Count: 51664 Priority: 0
Warning: The command output must read Local SM not enabled. If this is not the case, abort this procedure and contact Oracle Support.
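If you script this check, a simple guard along the following lines can enforce the warning; this is a sketch only, using the getmaster command shown above:

getmaster | grep -q "Local SM not enabled" || echo "ABORT: local SM appears enabled; contact Oracle Support"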
-
List the details of the current firmware version.
[root@ilom-ovcasw19r1 ~]# version
SUN DCS 36p version: 2.2.9-3
Build time: Mar 2 2018 12:54:44
SP board info:
Manufacturing Date: 2012.06.23
Serial Number: "NCD9A0231"
Hardware Revision: 0x0007
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
-
Connect to the ILOM and start the firmware upgrade procedure. Press "Y" when prompted to load the file.
[root@ilom-ovcasw19r1 ~]# spsh Oracle(R) Integrated Lights Out Manager Version 2.2.9-3 ILOM 3.2.11 r124039 Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. Warning: HTTPS certificate is set to factory default. Hostname: nm2-36p-a -> load -source http://192.168.4.100/shares/export/Yum/nm/SUN_DCS_36p_2.2.13-2/SUN_DCS_36p/sundcs_36p_repository_2.2.13_2.pkg Downloading firmware image. This will take a few minutes. SUN DCS 36p version: 2.2.9-3 Build time: Mar 2 2018 12:54:44 SP board info: Manufacturing Date: 2012.06.23 Serial Number: "NCD9A0231" Hardware Revision: 0x0007 Firmware Revision: 0x0000 BIOS version: SUN0R100 BIOS date: 06/22/2010 FROM_VERSION: 2.2.9-3 TO_VERSION: 2.2.13-2 NOTE: Firmware upgrade will upgrade the SUN DCS 36p firmware. ILOM will enter a special mode to load new firmware. No other tasks should be performed in ILOM until the firmware upgrade is complete. Are you sure you want to load the specified file (y/n)? y Setting up environment for firmware upgrade. This will take a few minutes. Starting SUN DCS 36p FW update ========================== Performing operation: I4 A ========================== I4 A: I4 is already at the given version. ========================================= Performing operation: SUN DCS 36p firmware update ========================================= SUN DCS 36p Kontron module fw upgrade from 2.2.9-3 to 2.2.13-2: Please reboot the system to enable firmware update of Kontron module. The download of the Kontron firmware image happens during reboot. After system reboot, Kontron FW update progress can be monitored in browser using URL [http://system] OR at OS command line prompt by using command [telnet system 1234] where system is the hostname or IP address of SUN DCS 36P or GW. Firmware update is complete.
-
Reboot the system.
-> reset /SP
Are you sure you want to reset /SP (y/n)? y
Performing reset on /SP

-> Broadcast message from root@nm2-36p-a (unknown) at 17:15 ...
The system is going down for reboot NOW!
Connection to 192.168.4.202 closed by remote host.
Connection to 192.168.4.202 closed.
-
Reconnect to the InfiniBand switch to verify that the new firmware is running and to confirm that the SubnetManager remains disabled.
[root@ovcamn05r1 ~]# ssh root@192.168.4.202
root@192.168.4.202's password:
FW upgrade completed successfully on Thu Sep 3 17:19:06 UTC 2020.
Please run the "fwverify" CLI command to verify the new image.
This message will be cleared on next reboot.
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented config files.
To view the list of documented commands, use "help" at linux prompt.
[root@nm2-36p-a ~]# version
SUN DCS 36p version: 2.2.13-2
Build time: Mar 26 2019 10:31:08
SP board info:
Manufacturing Date: 2012.06.23
Serial Number: "NCD9A0231"
Hardware Revision: 0x0007
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
[root@nm2-36p-a ~]# getmaster
Local SM not enabled
Last change in Master SubnetManager status detected at: Thu Sep 3 17:20:35 UTC 2020
Master SubnetManager on sm lid 3 sm guid 0x139702010013ac : ovcasw22r1
Master SubnetManager Activity Count: 53518 Priority: 0
Warning: The command output must read Local SM not enabled. If this is not the case, abort this procedure and contact Oracle Support.
-
When the first InfiniBand switch has completed the upgrade successfully and has come back online, connect to the other InfiniBand switch, with IP address 192.168.4.203, and execute the same procedure.
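For example, assuming the same credentials are valid on both switches, the session on the second switch starts the same way:

[root@ovcamn05r1 ~]# ssh root@192.168.4.203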
-
When both InfiniBand switches have been upgraded, remove the shared storage directory you created to make the unzipped package available.
[root@ovcamn05r1 ~]# cd /nfs/shared_storage/yum/
[root@ovcamn05r1 yum]# ls -al
total 323
drwxr-xr-x   8 root root   8 Mar 26 07:57 .
drwxrwxrwx  31 root root  31 Mar 13 13:38 ..
drwxr-xr-x   2 root root   5 Mar 13 12:04 backup_COMPUTE
drwxr-xr-x   2 root root   5 Mar 13 13:19 current_COMPUTE
drwxr-xr-x   3 root root   6 Mar 26 07:58 nm2
drwxr-xr-x   3 root root 587 Mar 13 12:04 OVM_3.4.4_1735_server
drwxr-xr-x   3 root root  18 Mar 13 12:03 OVM_3.4.4_1735_transition
drwxr-xr-x   4 root root   9 Mar 13 12:03 OVM_3.4.4_1735_update
[root@ovcamn05r1 yum]# rm -rf nm2/
3.3.6 Upgrading the Oracle Fabric Interconnect F1-15 Firmware
The instructions in this section are specific for a component firmware upgrade as part of the Oracle Private Cloud Appliance.
Fabric Interconnects are part of systems with an InfiniBand-based network architecture.
Detailed information about firmware upgrades can be found in the XgOS User's Guide (document ID: E53170). Refer to the section “System Image Upgrades”.
It is recommended that you back up the current configuration of the Fabric Interconnects prior to performing a firmware upgrade. Store the backup configuration on another server and add a time stamp to the file name for future reference.
For detailed information, refer to the section “Saving and Restoring Your Configuration” in the XgOS User's Guide (document ID: E53170).
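As a hedged sketch of the time-stamping recommendation only (the file names and backup host below are hypothetical, and the configuration export itself is done with the XgOS commands documented in the guide), the saved file can be stamped and copied off the appliance like this:

cp xsigo-config.xml xsigo-config-$(date +%Y%m%d-%H%M%S).xml
scp xsigo-config-*.xml backupuser@backuphost:/backups/pca/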
-
Log on to the active management node using SSH and an account with superuser privileges.
-
Copy the firmware package from the Oracle Private Cloud Appliance software image to the Yum repository share.
[root@ovcamn05r1 ~]# cp /nfs/shared_storage/mgmt_image/firmware/IB_gateway/ \
OFI/p29180828_4013_Generic.zip /nfs/shared_storage/yum/
-
Log on to one of the Fabric Interconnects as admin.
[root@ovcamn05r1 ~]# ssh admin@192.168.4.205
Password:
Last login: Thu Oct 15 10:57:23 2015 from 192.168.4.4
Welcome to XgOS
Copyright (c) 2007-2012 Xsigo Systems, Inc. All rights reserved.
Enter "help" for information on available commands.
Enter the command "show system copyright" for licensing information
admin@ovcasw22r1[xsigo]
-
List the details of the current firmware version.
admin@ovcasw22r1[xsigo] show system version
Build 4.0.12-XGOS - (buildsys) Thu May 10 22:12:05 PDT 2018
admin@ovcasw22r1[xsigo]
-
Check the active configuration and the state of the SubnetManager. Optionally run the additional diagnostics command for more detailed information.
admin@ovcasw22r1[xsigo] show diagnostics sm-info - SM is running on ovcasw22r1 - SM Lid 3 - SM Guid 0x139702010013ac - SM key 0x1 - SM priority 0 - SM State MASTER admin@ovcasw22r1[xsigo] show diagnostics opensm-param OpenSM $ Current log level is 0x83 OpenSM $ Current sm-priority is 0 OpenSM $ OpenSM Version : OpenSM 3.3.13 SM State : Master SM Priority : 0 SA State : Ready Routing Engine : minhop Loaded event plugins : <none> PerfMgr state/sweep state : Disabled/Sleeping MAD stats --------- QP0 MADs outstanding : 0 QP0 MADs outstanding (on wire) : 0 QP0 MADs rcvd : 3080 QP0 MADs sent : 3080 QP0 unicasts sent : 343 QP0 unknown MADs rcvd : 0 SA MADs outstanding : 0 SA MADs rcvd : 9171 SA MADs sent : 10335 SA unknown MADs rcvd : 0 SA MADs ignored : 0 Subnet flags ------------ Sweeping enabled : 1 Sweep interval (seconds) : 10 Ignore existing lfts : 0 Subnet Init errors : 0 In sweep hop 0 : 0 First time master sweep : 0 Coming out of standby : 0 Known SMs --------- Port GUID SM State Priority --------- -------- -------- 0x139702010013ac Master 0 SELF 0x139702010013ec Standby 0 OpenSM $ admin@ovcasw22r1[xsigo]
-
Start the system upgrade procedure.
admin@ovcasw22r1[xsigo] system upgrade http://192.168.4.100/shares/export/Yum/xsigo-4.0.13-XGOS.xpf forcebaseos Copying... ############################################################################################################################## [100%] You have begun to upgrade the system software. Please be aware that this will cause an I/O service interruption and the system may be rebooted. The following software will be installed: 1. XgOS Operating System software including SCP Base OS 2. XgOS Front-panel software 3. XgOS Common Chassis Management software on IOC 4. XgOS VNIC Manager and Agent software 5. XgOS VN10G and VN10x1G Manager and Agent software 6. XgOS VHBA and VHBA-2 Manager and Agent software 7. XgOS VN10G and VN10x1G Manager and Agent software with Eth/IB Interfaces 8. XgOS VN4x10G and VN2x10G Manager and Agent software with Eth/IB Interfaces 9. XgOS VHBA-3 Manager and Agent software 10. XgOS VHBA 2x 8G FC Manager and Agent software Are you sure you want to update the software (y/n)?y Running verify scripts... Running preunpack scripts... Installing... ####################################################################################################### [ 82%] Installing... ############################################################################################################################## [100%] Verifying... ############################################################################################################################## [100%] Running preinstall scripts... Installing Base OS - please wait... LABEL=/dev/uba /mnt/usb vfat rw 0 0 Rootfs installation successful The installer has determined that a reboot of the Base OS is required (HCA driver changed) The installer has determined that a cold restart of the Director is necessary Installing package... Running postinstall scripts... Installation successful. Please stand by for CLI restart. admin@ovcasw22r1[xsigo] Rebooting OS. Please log in again in a couple of minutes... *********************************** Xsigo system is being shut down now *********************************** Connection to 192.168.4.205 closed.
After reboot, it takes approximately 10 minutes before you can log back in. The upgrade resets the admin user's password to the default "admin". It may take several attempts, but login with the default password eventually succeeds.
-
Reconnect to the Fabric Interconnect to change the admin and root passwords back to the setting from before the firmware upgrade.
Note: When you log back in after the firmware upgrade, you may encounter messages similar to this example:
Message from syslogd@ovcasw22r1 at Fri Jun 22 09:49:33 2018 ...
ovcasw22r1 systemcontroller[2713]: [EMERG] ims::IMSService [ims::failedloginattempt] user admin has tried to log on for 5 times in a row without success !!
These messages indicate failed login attempts from other Oracle Private Cloud Appliance components. They disappear after you set the passwords back to their original values.
Modify the passwords for users root and admin as follows:
admin@ovcasw22r1[xsigo] set system root-password
Administrator's password: admin
New password: myOriginalRootPassword
New password again: myOriginalRootPassword
admin@ovcasw22r1[xsigo] set user admin -password
New password: myOriginalAdminPassword
New password again: myOriginalAdminPassword
-
Reconnect to the Fabric Interconnect to verify that the new firmware is running and to confirm that all vNICs and vHBAs are in up/up state.
[root@ovcamn05r1 ~]# ssh admin@192.168.4.205
admin@ovcasw22r1[xsigo] show system version
Build 4.0.13-XGOS - (sushao) Wed Dec 19 11:28:28 PST 2018
admin@ovcasw22r1[xsigo] show diagnostics sm-info
 - SM is running on ovcasw22r1
 - SM Lid 3
 - SM Guid 0x139702010013ac
 - SM key 0x0
 - SM priority 0
 - SM State MASTER
admin@ovcasw22r1[xsigo] show vnic
name               state  mac-addr           ipaddr      if               if-state
----------------------------------------------------------------------------------------------
eth4.ovcacn08r1    up/up  00:13:97:59:90:11  0.0.0.0/32  mgmt_pvi(64539)  up
eth4.ovcacn09r1    up/up  00:13:97:59:90:0D  0.0.0.0/32  mgmt_pvi(64539)  up
eth4.ovcacn10r1    up/up  00:13:97:59:90:09  0.0.0.0/32  mgmt_pvi(64539)  up
eth4.ovcacn11r1    up/up  00:13:97:59:90:1D  0.0.0.0/32  mgmt_pvi(64539)  up
eth4.ovcacn12r1    up/up  00:13:97:59:90:19  0.0.0.0/32  mgmt_pvi(64539)  up
[...]
eth7.ovcacn29r1    up/up  00:13:97:59:90:28  0.0.0.0/32  5/1              up
eth7.ovcamn05r1    up/up  00:13:97:59:90:04  0.0.0.0/32  4/1              up
eth7.ovcamn06r1    up/up  00:13:97:59:90:08  0.0.0.0/32  5/1              up
40 records displayed
admin@ovcasw22r1[xsigo] show vhba
name               state  fabric-state  if    if-state  wwnn
------------------------------------------------------------------------------------------------
vhba03.ovcacn07r1  up/up  down(Down)    12/1  down      50:01:39:71:00:58:B1:0A
vhba03.ovcacn08r1  up/up  down(Down)    3/1   down      50:01:39:71:00:58:B1:08
vhba03.ovcacn09r1  up/up  down(Down)    12/1  down      50:01:39:71:00:58:B1:06
vhba03.ovcacn10r1  up/up  down(Down)    3/1   down      50:01:39:71:00:58:B1:04
[...]
vhba04.ovcacn29r1  up/up  down(Down)    12/2  down      50:01:39:71:00:58:B1:13
vhba04.ovcamn05r1  up/up  down(Down)    3/2   down      50:01:39:71:00:58:B1:01
vhba04.ovcamn06r1  up/up  down(Down)    12/2  down      50:01:39:71:00:58:B1:03
20 records displayed
-
When the first Fabric Interconnect has completed the upgrade successfully and has come back online, connect to the other Fabric Interconnect, with IP address 192.168.4.204, and execute the same procedure.
3.4 Upgrading the Storage Network
The Oracle Private Cloud Appliance Storage Network is available with Controller Software release 2.4.3 for Ethernet-based systems. This feature enables virtual machines to access the internal ZFS storage appliance, and requires 60TB of space on the ZFS storage appliance.
Make sure you perform the upgrades in this order before you proceed with the Storage Network upgrade:
-
Upgrade the management nodes with the controller software upgrade
-
Upgrade firmware on all components including the Cisco switches
-
Upgrade the storage network
-
Upgrade the compute nodes
Functionality is built in to ensure the upgrade process works properly. This includes three lock files that are set during the storage network upgrade and are designed to prevent specific behaviors that can interrupt the upgrade. The all_provisioning.LOCK prevents provisioning actions on compute nodes during upgrade. The fw_upgrade.LOCK is placed immediately following the successful completion of the first management node upgrade, and prevents the use of CLI commands before the storage network upgrade is complete. The storage_network_upgrade.LOCK prevents any customer-initiated changes to the spine or leaf switches while the upgrade is taking place. The locks are removed at the completion of the storage network upgrade, regardless of success or failure.
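If you want to see whether any of these locks are currently in place, a simple search from the active management node will find them; the sketch below assumes the lock files are plain files under the appliance shared storage, which may not match your release exactly.

find /nfs/shared_storage -maxdepth 3 -name '*.LOCK' 2>/dev/null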
-
Log in to the management node and run the verify command.
pca_upgrader -V -t storage-network
-
If you use the optional ASRM and OEM agents, stop them.
service asrm stop
service gcstartup stop
-
Upgrade the storage network.
pca_upgrader -U -t storage-network

PCA Rack Type: PCA X8_BASE.
Please refer to log file /var/log/pca_upgrader_<date>-<time>.log for more details.

Beginning PCA Storage Network Pre-Upgrade Checks...
Rack Type Check                                      1/14
PCA Version Check                                    2/14
Upgrade Locks Check                                  3/14
Backup Tasks Check                                   4/14
Storage Port Channel Status Check                    5/14
Spine Switch Firmware Check                          6/14
Check ZFSSA MGMT Network                             7/14
Check Cluster Status                                 8/14
AK Firmware Version Check                            9/14
ZFSSA Resilvering Jobs Check                        10/14
iSCSI Target Check                                  11/14
ZFSSA Hardware Error Check                          12/14
Check ZFSSA Default Shares                          13/14
ZFSSA Network Configuration Check                   14/14
PCA Storage Network Pre-Upgrade Checks completed after 0 minutes

Beginning PCA Storage Network Upgrade
Disable PCA Backups                                  1/6
Take Spine Switch Backup                             2/6
Take ZFSSA Configuration Backup                      3/6
Place Storage Network Upgrade Locks                  4/6
Perform Storage Network Upgrade                      5/6
Remove Firmware Upgrade Lock                         6/6
Remove PCA Upgrade Locks                             1
Re-enable PCA Backups                                2
PCA Storage Network Upgrade completed after 6 minutes

Beginning PCA Storage Network Post-Upgrade Checks...
PCA Storage Network Post-Upgrade Checks completed after 0 minutes

---------------------------------------------------------------------------
PCA Storage Network Pre-Upgrade Checks                               Passed
---------------------------------------------------------------------------
Rack Type Check                                                      Passed
PCA Version Check                                                    Passed
Upgrade Locks Check                                                  Passed
Backup Tasks Check                                                   Passed
Storage Port Channel Status Check                                    Passed
Spine Switch Firmware Check                                          Passed
Check ZFSSA MGMT Network                                             Passed
Check Cluster Status                                                 Passed
AK Firmware Version Check                                            Passed
ZFSSA Resilvering Jobs Check                                         Passed
iSCSI Target Check                                                   Passed
ZFSSA Hardware Error Check                                           Passed
Check ZFSSA Default Shares                                           Passed
ZFSSA Network Configuration Check                                    Passed
---------------------------------------------------------------------------
PCA Storage Network Upgrade                                          Passed
---------------------------------------------------------------------------
Disable PCA Backups                                                  Passed
Take Spine Switch Backup                                             Passed
Take ZFSSA Configuration Backup                                      Passed
Place Storage Network Upgrade Locks                                  Passed
Perform Storage Network Upgrade                                      Passed
Remove Firmware Upgrade Lock                                         Passed
---------------------------------------------------------------------------
PCA Storage Network Post-Upgrade Checks                              Passed
---------------------------------------------------------------------------
---------------------------------------------------------------------------
Overall Status                                                       Passed
---------------------------------------------------------------------------
PCA Storage Network Pre-Upgrade Checks                               Passed
PCA Storage Network Upgrade                                          Passed
PCA Storage Network Post-Upgrade Checks                              Passed

Please refer to log file /var/log/pca_upgrader_<date>-<time>.log for more details.

Note: After the successful upgrade of the management nodes, an upgrade lock is left in place. This lock is intentional to ensure that the storage network upgrade is performed before attempting to upgrade the compute nodes.
-
If you use the optional ASRM and OEM agents, restart them.
service asrm start
service gcstartup start
3.5 Upgrading the Virtualization Platform
Some releases of the Oracle Private Cloud Appliance Controller Software include a new version of Oracle VM, the virtualization platform used in Oracle Private Cloud Appliance. As part of the controller software update, the new Oracle VM Manager Release is automatically installed on both management nodes.
After the controller software update on the management nodes, Oracle VM Manager displays events indicating that the compute nodes are running an outdated version of Oracle VM Server. These events are informational and do not prevent any operations, but it is recommended that you upgrade all compute nodes to the new Oracle VM Server release at your earliest convenience.
The Oracle VM Server upgrade was intentionally decoupled from the automated controller software update process. This allows you to plan the compute node upgrades and the migration or downtime of your virtual machines in steps and outside peak hours. As a result, service interruptions for users of the Oracle VM environment can be minimized or even eliminated. By following the instructions in this section, you also make sure that previously deployed virtual machines remain fully functional when the appliance update to the new software release is complete.
During an upgrade of Oracle VM Server, no virtual machine can be running on a given compute node. VMs using resources on a shared storage repository can be migrated to other running compute nodes. If a VM uses resources local to the compute node you want to upgrade, it must be shut down and returned to service after the Oracle VM Server upgrade.
When you install Oracle Private Cloud Appliance Controller Software Release 2.4.x, the management nodes are set up to run Oracle VM Manager 3.4.x. Compute nodes cannot be upgraded to the corresponding Oracle VM Server Release with the Oracle VM Manager web UI. You must upgrade them using the update compute-node command within the Oracle Private Cloud Appliance CLI.
Execute this procedure on each compute node after the software update on the management nodes has completed successfully.
If compute nodes are running other packages that are not part of Oracle Private Cloud Appliance, these must be uninstalled before the Oracle VM Server upgrade.
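A hedged way to review what is installed on a compute node before the upgrade is to list the most recently installed packages; the node name below is an example, and which packages count as non-appliance software is something you must judge from the listing.

[root@ovcamn05r1 ~]# ssh root@ovcacn09r1 "rpm -qa --last | head -n 20"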
-
Make sure that the appliance software has been updated successfully to the new release.
You can verify this by logging into the active management node and entering the following command in the Oracle Private Cloud Appliance CLI:
# pca-admin
Welcome to PCA! Release: 2.4.3
PCA> show version
----------------------------------------
Version    2.4.3
Build      000
Date       2020-08-06
----------------------------------------
Status: Success
Leave the console and CLI connection open. You need to run the update command later in this procedure.
-
Log in to Oracle VM Manager.
For details, see Section 5.2, “Logging in to the Oracle VM Manager Web UI”.
-
Migrate all running virtual machines away from the compute node you want to upgrade.
Information on migrating virtual machines is provided in the Oracle VM Manager User's Guide section entitled Migrate or Move Virtual Machines.
-
Place the compute node in maintenance mode.
Information on using maintenance mode is provided in the Oracle VM Manager User's Guide section entitled Edit Server.
-
In the Servers and VMs tab, select the Oracle VM Server in the navigation pane. Click Edit Server in the management pane toolbar.
The Edit Server dialog box is displayed.
-
Select the Server in Maintenance Mode check box to place the Oracle VM Server into maintenance mode. Click OK.
The Oracle VM Server is in maintenance mode and ready for servicing.
-
-
Run the Oracle VM Server update for the compute node in question.
-
Return to the open management node console window with active CLI connection.
-
Run the update compute-node command for the compute nodes you wish to update at this time. Run this command for one compute node at a time.
Warning: Running the update compute-node command with multiple servers as arguments is not supported. Neither is running the command concurrently in separate terminal windows.
PCA> update compute-node ovcacn09r1
************************************************************
WARNING !!! THIS IS A DESTRUCTIVE OPERATION.
************************************************************
Are you sure [y/N]:y
Status: Success
This CLI command invokes a validation mechanism, which verifies critical requirements that a compute node must meet to qualify for the Oracle VM Server 3.4.x upgrade. It also ensures that all the necessary packages are installed from the correct source location, and configured properly.
-
Wait for the command to complete successfully. The update takes approximately 30 minutes for each compute node.
As part of the update procedure, the Oracle VM Server is restarted but remains in maintenance mode.
Warning: If the compute node does not reboot during the update, you must restart it from within Oracle VM Manager.
-
-
Return to Oracle VM Manager to take the compute node out of maintenance mode.
-
In the Servers and VMs tab, select the Oracle VM Server in the navigation pane. Click Edit Server in the management pane toolbar.
The Edit Server dialog box is displayed.
-
Clear the Server in Maintenance Mode check box. Click OK.
The Oracle VM Server rejoins the server pool as a fully functioning member.
-
-
Repeat this procedure for each compute node in your Oracle Private Cloud Appliance.
The appliance software update is now complete. Next, perform the required post-upgrade verification steps. The procedure for those additional manual verification tasks is documented in the Post Upgrade section of the support note with Doc ID 2701377.1.
After successful completion of the post-upgrade verification steps, the Oracle Private Cloud Appliance is ready to resume all normal operations.