1.4 Provisioning and Orchestration

1.4.1 Appliance Management Initialization
1.4.2 Compute Node Discovery and Provisioning
1.4.3 Server Pool Readiness

As a converged infrastructure solution, the Oracle Virtual Compute Appliance X3-2 aims to eliminate many of the intricacies of optimizing the system configuration. Hardware components are installed and cabled at the factory, and configuration settings and installation software are preloaded onto the system. Once the appliance is connected to the data center power source and public network, the entire provisioning process is orchestrated by the master management node, from the moment the administrator presses the power button of the first management node until the appliance reaches its Deployment Readiness state. This section explains what happens behind the scenes as the Oracle Virtual Compute Appliance X3-2 is initialized and all nodes are provisioned.

1.4.1 Appliance Management Initialization

Boot Sequence and Health Checks

When power is applied to the first management node, it takes approximately five minutes for the server to boot. While the Oracle Linux 6 operating system is loading, an Apache web server is started, serving a static welcome page that the administrator can browse to from a workstation connected to the appliance management network.

The necessary Oracle Linux services are started as the server comes up to runlevel 3 (multi-user mode with networking). At this point, the management node executes a series of system health checks. It verifies that all expected infrastructure components are present on the appliance management network and at their predefined locations, identified by rack unit number and fixed IP address. Next, the management node probes the ZFS storage appliance for a management NFS export and a management iSCSI LUN with an OCFS2 file system. The storage and its access groups have been configured at the factory. If the health checks reveal no problems, the ocfs2 and o2cb services are started automatically.
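Conceptually, the health checks amount to verifying an expected-component inventory against what is actually reachable on the management network. The following Python sketch illustrates the pattern only; the component names, addresses and helper functions are hypothetical and are not part of the appliance software.

    import socket

    # Hypothetical inventory: each infrastructure component is expected at a
    # fixed, factory-assigned IP address on the appliance management network.
    EXPECTED_COMPONENTS = {
        "zfs-storage": "192.168.4.10",   # illustrative addresses only; the
        "ib-switch-1": "192.168.4.20",   # real assignments are factory-defined
    }

    def is_reachable(ip, port=22, timeout=5):
        # A component counts as present if it answers on the management network.
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return True
        except OSError:
            return False

    def run_health_checks():
        missing = [name for name, ip in EXPECTED_COMPONENTS.items()
                   if not is_reachable(ip)]
        if missing:
            raise RuntimeError("missing components: " + ", ".join(missing))
        # Probing the ZFS storage appliance for the management NFS export and
        # the iSCSI LUN with its OCFS2 file system would follow here, before
        # the ocfs2 and o2cb services are started.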

Management Cluster Setup

When the OCFS2 file system on the shared iSCSI LUN is ready and the o2cb services have started successfully, the management nodes can join the cluster. In the meantime, the first management node has also powered on the second management node, which comes up with an identical configuration. Both management nodes eventually join the cluster, but the first management node takes an exclusive lock on the shared OCFS2 file system using Distributed Lock Management (DLM). The second management node remains in permanent standby and takes over the lock only if the first management node goes down or otherwise releases it.
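Because OCFS2 backs file locks with DLM, cluster-wide mutual exclusion can be expressed as an ordinary exclusive lock on a file that lives on the shared file system. The sketch below shows this master/standby pattern in Python; the lock file path and function names are assumptions for illustration, not the appliance's actual implementation.

    import fcntl
    import time

    # Hypothetical path on the shared OCFS2 file system; because OCFS2 backs
    # file locks with DLM, an exclusive lock here is cluster-wide.
    LOCK_FILE = "/shared/ocfs2/master.lock"

    def run_as_master():
        # Placeholder: load the appliance services, bring the virtual IP
        # online, and so on. The master normally never returns from here.
        while True:
            time.sleep(60)

    def management_node_main():
        with open(LOCK_FILE, "w") as lock:
            while True:
                try:
                    # Non-blocking exclusive lock: at most one management node
                    # in the cluster can hold it at any time.
                    fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    run_as_master()
                except BlockingIOError:
                    # Standby node: retry until the master releases the lock
                    # or goes down, at which point the lock is dropped.
                    time.sleep(5)

Failover falls out of the locking semantics: if the master node crashes, its lock is released, and the standby node's next retry succeeds.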

With mutual exclusion established between both members of the management cluster, the master management node continues to load the remaining Oracle Virtual Compute Appliance services, including tftp, dhcpd, Oracle VM Manager and the Oracle Virtual Compute Appliance databases. The virtual IP address of the management cluster is also brought online, and the Oracle Virtual Compute Appliance Dashboard is started within WebLogic. The static Apache welcome page now redirects to the Dashboard at the virtual IP address, where the administrator can access a live view of the appliance rack component status.
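The ordering matters: each service depends on the ones before it, so the master brings them up sequentially and fails fast if any step breaks. A minimal sketch of that pattern follows; the service names, virtual IP address and interface name are illustrative assumptions, not the appliance's real configuration.

    import subprocess

    # Illustrative startup order only; the real appliance manages many more
    # services and dependencies than shown here.
    MASTER_SERVICES = ["tftpd", "dhcpd", "mysqld", "ovmm"]
    VIRTUAL_IP = "192.168.4.100"   # hypothetical management cluster virtual IP

    def start_master_services():
        for name in MASTER_SERVICES:
            # check=True aborts the sequence on the first failing service, so
            # the appliance never ends up in a half-started state.
            subprocess.run(["service", name, "start"], check=True)
        # Bring the management cluster's virtual IP online (interface name
        # and address are examples only).
        subprocess.run(
            ["ip", "addr", "add", VIRTUAL_IP + "/24", "dev", "bond0"],
            check=True)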

Once the dhcpd service is started, the system state changes to Provision Readiness, which means the appliance is ready to discover the non-infrastructure components: the compute nodes.

1.4.2 Compute Node Discovery and Provisioning

Node Manager

To discover compute nodes, the Node Manager on the master management node uses a DHCP server and the node database. The node database is a single-access MySQL database, located on the management NFS share, that contains the state and configuration details of each node in the system, including MAC addresses, IP addresses and host names. The discovery process of a node begins with a DHCP request from its ILOM; all discovery and provisioning actions are asynchronous and run in parallel. The DHCP server hands out pre-assigned IP addresses on the appliance management network (192.168.4.0/24). When the Node Manager has verified that a node has a valid service tag for use with Oracle Virtual Compute Appliance, it launches a series of provisioning tasks. All required software resources have been loaded onto the ZFS storage appliance at the factory.
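The discovery flow reduces to a small event handler: a DHCP request arrives, the node is looked up or created in the node database, its service tag is validated, and provisioning is kicked off. The Python sketch below is a simplified illustration; the in-memory dictionary stands in for the MySQL node database, and the helper functions are hypothetical.

    # Stand-in for the MySQL node database: MAC address -> node record.
    node_db = {}

    def has_valid_service_tag(mac):
        # Hypothetical check that the server carries a service tag valid for
        # use with Oracle Virtual Compute Appliance.
        return True

    def start_provisioning(mac):
        # Provisioning is asynchronous; many nodes proceed in parallel.
        print("provisioning started for", mac)

    def on_dhcp_request(mac, assigned_ip, hostname):
        """Handle a DHCP request from a node's ILOM on 192.168.4.0/24."""
        if mac in node_db:
            return                      # already discovered
        if not has_valid_service_tag(mac):
            return                      # not an appliance node; ignore it
        node_db[mac] = {"ip": assigned_ip, "hostname": hostname,
                        "state": "new"}
        start_provisioning(mac)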

Provisioning Tasks

The provisioning process is tracked in the node database by means of status changes: the next provisioning task can only be started if the node status indicates that the previous task has completed successfully. For each valid node, the Node Manager begins by building a PXE configuration and forcing the node to boot using the Oracle Virtual Compute Appliance runtime services. After the hardware RAID-1 configuration is applied, the node is restarted to perform a kickstart installation of Oracle VM Server. Crucial kernel modules and host drivers for InfiniBand and IO Director support are added to the installation. At the end of the installation process, the network configuration files are updated so that all necessary network interfaces and bonds can be brought up.
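This makes the node database the single source of truth for provisioning progress: each task reads the node's status, runs, and records a new status. The sketch below shows that status-driven chain with simplified task names; the executor and persistence details are assumptions for illustration.

    # Simplified task chain; the names loosely follow the steps described above.
    TASK_ORDER = ["pxe_boot", "raid1_config", "ovm_install", "network_config"]

    def run_task(node, task):
        # Hypothetical executor; the real tasks drive PXE booting, RAID setup,
        # the kickstart installation and the network configuration.
        print("running", task, "on", node["hostname"])

    def advance_node(node):
        """Start the next task only if the previous one completed successfully."""
        done = node["state"]
        next_index = TASK_ORDER.index(done) + 1 if done in TASK_ORDER else 0
        if next_index == len(TASK_ORDER):
            return False                 # node is fully provisioned
        task = TASK_ORDER[next_index]
        run_task(node, task)
        node["state"] = task             # status change recorded in the node DB
        return True

A failed task simply leaves the status unchanged, so the chain stalls at the failing step rather than running later tasks against a broken node.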

Now that the private virtual interconnect (PVI) for the Oracle VM management network exists, the compute node is rebooted one last time to reconfigure the Oracle VM Agent to communicate over the PVI. At this point, the node is ready for Oracle VM Manager discovery.

1.4.3 Server Pool Readiness

Oracle VM Server Pool

When the Node Manager detects a fully installed compute node that is ready to join the Oracle VM environment, it issues the necessary Oracle VM CLI commands to add the new node to the Oracle VM server pool. With the discovery of the first node, the system also configures the clustered Oracle VM server pool with the appropriate networking, access to the shared storage, and a virtual IP. Oracle Virtual Compute Appliance X3-2 expects all compute nodes in one rack to belong to a single clustered server pool with High Availability (HA) and Distributed Resource Scheduling (DRS) enabled. When all compute nodes have joined the Oracle VM server pool, the appliance is in its Ready state, meaning virtual machines (VMs) can be deployed.
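Because Oracle VM Manager 3 exposes its CLI over an ssh connection (port 10000 by default), adding a node can be scripted. The sketch below shows the idea; the endpoint, credentials, pool name and exact command strings are simplified examples rather than the appliance's actual invocations.

    import subprocess

    OVM_CLI = ["ssh", "-p", "10000", "admin@localhost"]   # illustrative endpoint

    def ovm(command):
        # Send one command to the Oracle VM CLI and fail loudly on errors.
        subprocess.run(OVM_CLI + [command], check=True)

    def add_compute_node(node_ip, node_name, agent_password, pool_name):
        # Discover the freshly installed Oracle VM Server, then add it to the
        # clustered server pool (parameter values are examples only).
        ovm("discoverServer ipAddress=%s password=%s takeOwnership=Yes"
            % (node_ip, agent_password))
        ovm("add Server name=%s to ServerPool name=%s" % (node_name, pool_name))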

Expansion Compute Nodes

When an expansion compute node is installed, its presence is detected based on the DHCP request from its ILOM. If the new server is identified as an Oracle Virtual Compute Appliance node, an entry is added to the node database in the "new" state, which triggers the same initialization and provisioning process. New compute nodes are integrated seamlessly to expand the capacity of the running system, without any manual reconfiguration by an administrator.