2 Installing CNE
This chapter provides information about installing Oracle Communications Cloud Native Core, Cloud Native Environment (CNE). CNE can be deployed onto dedicated hardware, referred to as a baremetal CNE, or deployed onto Virtual Machines, referred to as a virtualized CNE.
Regardless of which deployment platform is selected, CNE installation is highly automated. A collection of container-based utilities automate the provisioning, installation, and configuration of CNE. These utilities are based on the following automation tools:
- PXE helps to reliably automate the process of provisioning the hosts with a minimal operating system.
- Terraform creates the virtual resources that are used to host the virtualized CNE.
- Kubespray helps reliably install a base Kubernetes cluster, including all dependencies such as etcd, using the Ansible provisioning tool.
- Ansible is used to deploy and manage a collection of operational tools (Common Services) that are:
- provided by open source third party products such as Prometheus and Grafana
- built from source and packaged as part of CNE releases, such as Oracle OpenSearch and Oracle OpenSearch Dashboard
- Kyverno Policy management is used to enforce security posture in CNE.
- Helm is used to deploy and configure common services such as Prometheus, Grafana, and OpenSearch.
Note:
- Ensure that the shell is configured with keepalive to avoid unexpected timeouts (see the example after this note).
- The 'X' in Oracle Linux X or OLX in the installation procedures indicates the latest version of Oracle Linux supported by CNE.
- CNE 24.2.0 replaces Terraform with OpenTofu when you freshly install CNE with Cloud Native Load Balancer (CNLB). For vCNE instances deployed using Terraform, CNE 24.2.0 continues to use and support Terraform for upgrade and maintenance.
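For the keepalive recommendation above, a minimal client-side sketch using standard OpenSSH options (the 60-second interval and 3-probe limit are illustrative assumptions, not CNE requirements):
$ tee -a ~/.ssh/config <<'EOF'
Host *
    ServerAliveInterval 60
    ServerAliveCountMax 3
EOF
With this in place, the client sends a keepalive probe every 60 seconds and closes the session only after three consecutive unanswered probes, which avoids idle-timeout disconnects during long-running installation steps.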
Preinstallation Tasks
This section describes the procedures that you must run before installing Oracle Communications Cloud Native Environment, referred to in these installation procedures as CNE.
Sizing Kubernetes Cluster
CNE deploys a Kubernetes cluster to host application workloads and common services. The following table provides the minimum Kubernetes cluster sizing (node counts) required for production deployments of CNE.
Note:
Deployments that do not meet the minimum sizing requirements may not operate correctly when maintenance operations (including upgrades) are performed on the CNE.

Table 2-1 Kubernetes Cluster Sizing
| Node Type | Minimum Required | Maximum Allowed |
|---|---|---|
| Kubernetes controller node | 3 | 3 |
| Kubernetes node | 6 | 100 |
Sizing Prometheus Persistent Storage
Prometheus stores metrics in Kubernetes persistent storage. Use the following calculations to reserve the correct amount of persistent storage during installation so that Prometheus can store metrics for the desired retention period.
total_metrics_storage = (nf_metrics_daily_growth + occne_metrics_daily_growth) * metrics_retention_period * 1.2

Note:
- An extra 20% storage is reserved to allow for a future increase in metrics growth.
- It is recommended to maintain the metrics_retention_period at the default value of 14 days.
- If the resulting storage size as per the above formula is greater than 500GB, then the retention period must be reduced until the resulting value is less than 500GB.
- Ensure that the retention period is set to more than 3 days.
- The default value for Prometheus persistent storage is 8GB.
- If the total_metrics_storage value calculated as per the above formula is less than 8GB, then use the default value.
For example, suppose one NF is installed on the OCCNE instance. From the NF's documentation, it is known that the NF generates 150 MB of metrics data per day at the expected ingress signaling traffic rate. Using the formula above:
metrics_retention_period = 14 days
nf_metrics_daily_growth = 150 MB/day (from NF documentation)
occne_metrics_daily_growth = 144 MB/day (from calculation below)
(0.15 GB/day + 0.144 GB/day) * 14 days * 1.2 = 5 GB (rounded up)
Since this is less than the default value of 8 GB, use 8 GB as the total_metrics_storage value.

Note:
After determining the required metrics storage, record the total_metrics_storage value for later use in the installation procedures.
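The sizing arithmetic above can be scripted. A minimal sketch using bc, with the example's assumptions (150 MB/day NF growth, 4 Kubernetes nodes, 14-day retention, decimal GB of 1000 MB):
$ nf_metrics_daily_growth=150                                # MB/day, from the NF documentation
$ num_kubernetes_nodes=4
$ occne_metrics_daily_growth=$((36 * num_kubernetes_nodes))  # MB/day, formula in the next section
$ echo "($nf_metrics_daily_growth + $occne_metrics_daily_growth) * 14 * 1.2 / 1000" | bc -l
4.93920000000000000000
Rounded up, this matches the 5 GB result above; because it is below the 8 GB default, 8 GB is used.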
Calculating CNE metrics daily storage growth requirements
CNE stores varying amounts of metrics data each day depending on the size of the Kubernetes cluster deployed in the CNE instance. To determine the correct occne_metrics_daily_growth value for the CNE instance, use the formula:
occne_metrics_daily_growth = 36 MB * num_kubernetes_nodes

Sizing Oracle OpenSearch Persistent Storage
Note:
OpenSearch master nodes do not store any user data (logs/traces). Data ingestion has been explicitly disabled on master nodes, and the PVCs attached to them are no longer used for data storage. As a result:
- Master PVC size is fixed at 1Gi by default.
- Master PVCs cannot be resized, as they no longer serve a data ingestion purpose.
- All log and trace data is now stored exclusively on hot and warm data nodes.
- It is recommended to provision at least 5 OpenSearch data nodes to ensure adequate storage and high availability.
log_trace_daily_growth = (nf_logs_daily_growth + nf_trace_daily_growth)
log_trace_active_storage = log_trace_daily_growth * (log_trace_retention_period + 1) * 1.2

Note:
- An extra day's worth of storage is allocated on the hot data nodes; it is used when deactivating the old daily indices.
- An extra 20% storage is reserved to allow for a future increase in logging growth.
- It is recommended to maintain the log_trace_retention_period at the default value of 7 days.
- If the resulting storage size as per the above formula is greater than 500GB, then the retention period must be reduced until the resulting value is less than 500GB.
- Ensure that the retention period is set to more than 3 days.
- The default value for Oracle OpenSearch persistent storage for log_trace_active_storage is 10Gi.
- If the log_trace_active_storage value calculated as per the above formula is less than 10Gi, then use the default value.
log_trace_retention_period = 7
nf_mps_rate = 200 msgs/sec
nf_logs_daily_growth = 150 MB/day (from NF documentation)
nf_trace_daily_growth = 500 MB/day (from section below)
log_trace_daily_growth = (0.15 GB/day + 0.50 GB/day) = 0.65 GB/day
log_trace_active_storage = 0.65 * (7+1) * 1.2 = 6.24 GB

Note:
After determining the required logs and trace storage, record the log_trace_active_storage value for later use in the installation procedures.
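The same calculation can be checked with a one-liner; the inputs below are the example's values:
$ log_trace_daily_growth=0.65   # GB/day = nf_logs_daily_growth + nf_trace_daily_growth
$ log_trace_retention_period=7  # days
$ echo "$log_trace_daily_growth * ($log_trace_retention_period + 1) * 1.2" | bc -l
6.240
Because 6.24 GB is below the 10Gi default, the default value is used.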
Calculating NF trace data daily storage growth requirements
NFs store varying amounts of trace data each day, depending on the ingress traffic rate, the trace sampling rate, and the error rate for handling ingress traffic. The default trace sampling rate is 0.01%. Space is reserved for 10M trace records per NF per day (an amount equivalent to a 1% trace sampling rate), using 50 bytes as the average record size (as measured during testing). 1% is used instead of 0.01% to account for the capture of error scenarios and overhead.
nf_trace_daily_num_records = 10M records/day
nf_trace_avg_record_size = 50 bytes/record
nf_trace_daily_growth = 10M records/day * 50 bytes/record = 500 MB/day

Record the log_trace_daily_growth value for later use in the installation procedures.
Note:
- Ensure that the trace sampling rate is set to less than 1% under normal circumstances. Collecting a higher percentage of traces causes Oracle OpenSearch to respond more slowly and impacts the performance of the CNE instance. If you want to collect a higher percentage of traces, contact My Oracle Support.
- CNE does not generate any traces, so no additional storage is reserved for CNE trace data.
- CNE platform logs are disabled by default, so no additional storage is reserved for CNE logs data.
- Master PVC size is fixed at 1Gi by default and must not be modified or resized.
Configuring OpenStack LB Controller Environment Variables
This section provides information about configuring the OpenStack Load Balancer (LB) controller environment variables. These environment variables are set on the occne-lb-controller-server deployment to adjust how a port recovery takes place, how often the health check runs, and how to set the log level. The following table provides the recommended default values that are set prior to installation for a standard deployment.
Note:
Change these variables after a deployment only if required in case of issues during port recovery (after a switchover) or as recommended by Oracle support.

Table 2-2 Openstack LB Controller Variables
| Variable name | Description | Default Value |
|---|---|---|
| OPENSTACK_MAX_PARALLEL | The number of OpenStack API port calls made in parallel. The default value 0 indicates that the OpenStack API calls are run for all ports at once. This variable is helpful when the rate limiting is enabled at the OpenStack LB controller level or the OpenStack LB controller cannot process multiple API requests in a short period of time. | 0 |
| OPENSTACK_PORT_API_RETRY | This variable controls the number of retries the system attempts for the port detach API call on a given port before deleting the port, creating a new port (using the same name and IP address), and attaching that port to the newly ACTIVE LBVM during a switchover. If this variable is set to 0, the ports are deleted immediately and recreated without trying to detach them (this is referred to as forced detachment and must be avoided unless it is absolutely necessary due to underlying issues with the OpenStack LB controller). | 5 |
| OPENSTACK_PORT_API_TIMEOUT | The time (in seconds) between the attempts to detach a port using the OpenStack API call. | 2 |
| LB_MON_REQ_TIMEOUT | The time (in seconds) between the LB controller monitor health checks on the LBVMs across all pools. | 2 |
| LOG_LEVEL | This variable sets the log level of the LB controller. Additional level: DEBUG. | INFO |
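If a change is required after deployment (per the note above), one possible way to adjust a variable on the running deployment is kubectl set env; the occne-infra namespace is assumed here based on the other CNE common services:
$ kubectl -n occne-infra set env deployment/occne-lb-controller-server LOG_LEVEL=DEBUG
$ kubectl -n occne-infra set env deployment/occne-lb-controller-server --list   # confirm the current values
Changing the deployment's environment triggers a rolling restart of the LB controller pod, so make such changes only as recommended by Oracle support.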
Generating Root CA Certificate
To use an intermediate Certificate Authority (CA) as an issuer for Istio service mesh mTLS certificates, a signing certificate and key from an external CA must be generated. The generated certificate and key values must be base64 encoded.
For more information about generating the required certificate and key, see the Certificate Authority documentation.
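As a hedged illustration (the file names and subject string are placeholders; consult your CA documentation for the exact profile required), a self-signed root CA and base64-encoded copies can be produced with openssl:
$ openssl genrsa -out root-key.pem 4096
$ openssl req -x509 -new -key root-key.pem -days 3650 -subj "/O=Example Org/CN=Example Root CA" -out root-cert.pem
$ base64 -w0 root-cert.pem > root-cert.pem.b64   # base64-encoded values as required above
$ base64 -w0 root-key.pem > root-key.pem.b64
In production, the signing certificate is typically an intermediate issued by the operator's existing CA rather than a self-signed root generated on the host.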
Configuring GRUB Password
This section provides information about configuring the GRUB password on all hosts of a cluster.
To configure the GRUB password, add the occne_grub_password variable to the [occne:vars] section of the occne.ini or hosts.ini file corresponding to the Ansible configuration, and set the variable to the required password. Before setting a password, ensure that the password you choose complies with the following conditions:
- The password must contain at least eight characters.
- The password must contain uppercase and lowercase characters.
- The password must contain at least one special character, except single and double quotes. For example: ~ @ # ^ * - _ + [ { } ] : . / ? % = !
- The password must contain at least two digits.
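A quick sketch to check a candidate password against these rules before writing it into the configuration (the sample password is illustrative only):
$ p='Example#Pass24'
$ re='[][~@#^*_+{}:./?%=!-]'
$ [ ${#p} -ge 8 ] || echo 'too short'
$ [[ $p =~ [A-Z] && $p =~ [a-z] ]] || echo 'needs uppercase and lowercase characters'
$ [[ $p =~ [0-9].*[0-9] ]] || echo 'needs at least two digits'
$ [[ $p =~ $re ]] || echo 'needs a special character'
Each check prints a message only when the corresponding rule is violated.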
The following example shows the occne_grub_password variable in the occne.ini file:
################################################################################
# #
# Copyright (c) 2024 Oracle and/or its affiliates. All rights reserved. #
# #
################################################################################
################################################################################
# OCCNE Cluster occne.ini file. Defines OCCNE deployment variables
[occne:vars]
occne_grub_password=TheGrubPassword2024
...

BareMetal Installation
Note:
- Before installing CNE in BareMetal, you must complete the preinstallation tasks.
- CNE supports the following Load Balancers for traffic segregation:
- Standard MetalLB
- Cloud Native Load Balancer (CNLB)
Overview
Frame and Component
The initial release of the CNE system provides support for on-prem deployment to a very specific target environment consisting of a frame holding switches and servers. This section describes the layout of the frame and the roles performed by the racked equipment.
Frame
The physical frame comprises DL380 or DL360 rack mount servers and two Top of Rack (ToR) Cisco switches. The frame components are added from the bottom up; thus, the designations found in the next section number from the bottom of the frame to the top of the frame.
Figure 2-1 Frame Overview

Host Designations
Each physical server has a specific role designation within the CNE solution.
Figure 2-2 Host Designations

Node Roles
Along with the primary role of each host, you can assign a secondary role. The secondary role can be software related or, in the case of the Bootstrap Host, hardware related, as there are unique out-of-band (OOB) connections to the ToR switches.
Figure 2-3 Node Roles

Transient Roles
RMS1 has unique out-of-band (OOB) connections to the ToR switches, which gives it the designation of management host. This role is only relevant during initial switch configuration and fault recovery of the switch. RMS1 also has a transient role as the Installer Bootstrap Host, which is applicable only during the initial installation of the frame and is subsequently used to get an official install on RMS2. Later, this host is re-paved to its K8s Master role.
Figure 2-4 Transient Roles

Creating CNE Instance
This section describes the procedures to create the CNE instance at a customer site. The following diagram shows the installation context:
Figure 2-5 CNE Installation Overview

Following is the basic installation flow to understand the overall effort:
- Check that the hardware is on-site, properly cabled, and powered up.
- Pre-assemble the basic equipment needed to perform a successful install:
- Identify:
- Download and stage software and other configuration files using the manifests.
- Identify the layer 2 (MAC) and layer 3 (IP) addresses for the equipment in the target frame.
- Identify the addresses of key external network services, for example, NTP, DNS, and so on.
- Verify or set all of the credentials for the target frame hardware to known settings.
- Prepare:
- Software Repositories: Load the various SW repositories (YUM, Helm, Docker, and so on) using the downloaded software and configuration files.
- Configuration Files: Populate the host's inventory file with credentials, layer 2 and layer 3 network information, switch configuration files with assigned IP addresses, and yaml files with appropriate information.
- Install:
- Bootstrap the System:
- Manually configure a Minimal Bootstrapping Environment (MBE): perform the minimal set of manual operations to enable networking and initial loading of a single Rack Mount Server (RMS1), the transient Installer Bootstrap Host. In this procedure, a minimal set of packages needed to configure switches, iLOs, and the PXE boot environment, and to provision RMS2 as a CNE Storage Host, are installed.
- Using the newly constructed MBE, automatically create the first Bastion Host on RMS2.
- Using the newly constructed Bastion Host on RMS2, automatically deploy and configure the CNE on the other servers in the frame.
- Final Steps:
- Perform postinstallation checks.
- Perform recommended security hardening steps.
Cluster Bootstrapping Overview
The following install procedure describes how to install the CNE onto new hardware that does not contain any networking configurations on the switches or provisioned operating systems. Therefore, the initial step in the installation process is to provision RMS1 (see Installing CNE) as a temporary Installer Bootstrap Host. The Bootstrap Host is configured with a minimal set of packages to configure switches, iLOs, and Boot Firmware. From the Bootstrap Host, a virtual Bastion Host is provisioned on RMS2. The Bastion Host is then used to provision (and in the case of the Bootstrap Host, re-provision) the remaining CNE hosts, install Kubernetes, and install the Common Services that run within the Kubernetes cluster.
Prerequisites
Before installing and configuring CNE on BareMetal, ensure that the following prerequisites are met.
Prerequisites for Oracle X Servers:
Ensure that the Integrated Lights Out Manager (ILOM) firmware of the Oracle X8-2 or X9-2 server is up to date. The ILOM firmware is crucial for the seamless functioning of CNE and is essential for optimal performance, security, and compatibility. To update the ILOM firmware, perform the steps outlined in the Oracle documentation or contact the system administrator.
- Ensure that the preprovisioned nodes are installed with Oracle Linux 9, with packages from the @core and @base groupings.
- Ensure that the KVM host nodes for Bastions and k8s-control nodes have sufficient space (at least 300GB per hosted VM) in their /var volume for the virtual machine drives.
- Ensure that the worker nodes have an unallocated storage device to be used for rook or ceph volume allocation of Kubernetes persistent volumes.
- Ensure that the initial network setup is complete on the preprovisioned nodes, so that the installer can reach the nodes using ssh with a required interface name of bond0. The following example provides a sample configuration to create a bond0 interface out of two Ethernet interfaces:
Note:
Run the following commands as a root user.
  - Run the following commands to create interfaces for bonding:
    nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad;
    nmcli con add type bond-slave con-name bond0-slave-0 ifname ens1f0 master bond0;
    nmcli con add type bond-slave con-name bond0-slave-1 ifname ens1f1 master bond0;
  - On the KVM host nodes, run the following commands to create a bridge interface named bondbr0 and assign it the host node's cluster IP address. This allows the KVM host node to connect with its VM guests.
    nmcli con add type bridge ifname bondbr0 con-name bondbr0 bridge.stp no ipv4.method manual ipv4.addresses ${IP};
    nmcli con mod bond0 connection.slave-type bridge connection.master bondbr0;
  - A bridge interface is not required for the nodes that are not KVM hosts. On these nodes, add the IP address and gateway directly to the bond0 interface:
    nmcli con mod bond0 ipv4.method manual ipv4.addresses ${IP} ipv4.gateway ${GATEWAY};
  - On the bootstrap node, set up a connection to an OAM vlan with an appropriate IP address and gateway:
    nmcli connection add type bridge ifname vlan${OAM_VLAN}-br con-name vlan${OAM_VLAN}-br connection.autoconnect yes bridge.stp no ipv4.method manual ipv4.addresses ${OAM_IP} ipv4.gateway ${OAM_GATEWAY};
    nmcli connection add type vlan con-name bond0.${OAM_VLAN} dev bond0 id ${OAM_VLAN} master vlan${OAM_VLAN}-br connection.autoconnect yes;
  - Stop any prior interface that may have the node's cluster IP address from the OS installation, and start the new bond0 interface:
    nmcli con down eno1;
    nmcli con up bond0;
- Ensure that all the preprovisioned nodes that are to be included in the cluster have a user account (default admusr) with password-less sudo access:
    useradd admusr;
    usermod -aG wheel admusr;
    echo "%admusr ALL=(ALL) NOPASSWD: ALL" | tee -a /etc/sudoers;
    passwd admusr;
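After completing these prerequisites, a short verification pass (a sketch; the interface names, admusr account, and node address follow the steps above) can confirm the bond and sudo setup before the installer runs:
$ cat /proc/net/bonding/bond0 | grep -i 'bonding mode'   # expect IEEE 802.3ad Dynamic link aggregation
$ nmcli -g GENERAL.STATE con show bond0                  # expect 'activated'
$ ssh admusr@<node_ip> 'sudo -n true && echo passwordless sudo OK'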
Configuring Artifact Acquisition and Hosting
CNE requires artifacts from Oracle Software Delivery Cloud (OSDC), Oracle Support (MOS), the Oracle YUM repository, and certain open-source projects. CNE deployment environments are not expected to have direct internet access; therefore, customer-provided intermediate repositories are necessary for the CNE installation process, and the CNE dependencies must be loaded into these repositories. This section provides the list of artifacts needed in the repositories.
Oracle eDelivery Artifact Acquisition
Table 2-3 CNE Artifacts
| Artifact | Description | File Type | Destination repository |
|---|---|---|---|
| occne-images-25.2.1xx.tgz | CNE Installers (Docker images) from OSDC/MOS | Tar GZ | Docker Registry |
| Templates | Configuration file templates from OSDC/MOS | Config files (.conf, .ini, .yaml, .mib, .sh, .txt) | Local media |
Third Party Artifacts
CNE dependencies needed from open-source software must be available in repositories that are reachable by the CNE installation tools. For an accounting of third party artifacts needed for this installation, see the Artifact Acquisition and Hosting chapter.
Populating MetalLB Configuration
Introduction
The MetalLB resources file (mb_resources.yaml) defines the Border Gateway Protocol (BGP) peers and address pools for MetalLB. The mb_resources.yaml file must be placed in the same directory (/var/occne/<cluster_name>) as the hosts.ini file. This section provides information about configuring the MetalLB resource file.
Note:
The mb_resources.yaml MetalLB resources file is applicable for MetalLB based deployments only; it is not applicable for CNLB based deployments.
Limitations
The IP addresses listed below can have three possible formats. Each peer address pool can use a different format from the others if desired.
- IP List: A list of IPs (each on a single line) in single quotes in the following format: 'xxx.xxx.xxx.xxx/32'. The IPs do not have to be sequential. The number of IPs must cover the number of IPs needed by the application.
- IP Range: A range of IPs, separated by a dash, in the following format: 'xxx.xxx.xxx.xxx - xxx.xxx.xxx.xxx'. The range must cover the number of IPs needed by the application.
- CIDR (IP-slash notation): A single subnet defined in the following format: 'xxx.xxx.xxx.xxx/nn'. The CIDR must cover the number of IPs needed by the application.
- The peer-address IP must be on a different subnet from the IP subnets used to define the IPs for each peer address pool.
These limitations apply when populating the mb_resources.yaml file.
Configuring MetalLB Pools and Peers
- Add BGP peers and address groups: Referring to the data collected in the Installation Preflight Checklist, add BGP peers (ToRswitchA_Platform_IP, ToRswitchB_Platform_IP) and address groups for each address pool. The Address-pools list the IP addresses that metalLB is allowed to allocate.
- Edit the mb_resources.yaml file with the site-specific values found in the Installation Preflight Checklist.
Note:
The oam peer address pool is required for defining the IPs. Other pools are application specific and can be named to best fit the application they apply to. The following examples show how oam and signaling are used to define the IPs, each using a different method.
Example for oam:
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  creationTimestamp: null
  name: peer1
  namespace: occne-infra
spec:
  holdTime: 1m30s
  keepaliveTime: 0s
  myASN: 64512
  passwordSecret: {}
  peerASN: 64501
  peerAddress: <ToRswitchA_Platform_IP>
status: {}
---
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  creationTimestamp: null
  name: peer2
  namespace: occne-infra
spec:
  holdTime: 1m30s
  keepaliveTime: 0s
  myASN: 64512
  passwordSecret: {}
  peerASN: 64501
  peerAddress: <ToRswitchB_Platform_IP>
status: {}
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  creationTimestamp: null
  name: oam
  namespace: occne-infra
spec:
  addresses:
  - '<MetalLB_oam_Subnet_IPs>'
  autoAssign: false
status: {}
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  creationTimestamp: null
  name: <application_specific_peer_address_pool_name>
  namespace: occne-infra
spec:
  addresses:
  - '<MetalLB_app_Subnet_IPs>'
  autoAssign: false
status: {}
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  creationTimestamp: null
  name: bgpadvertisement1
  namespace: occne-infra
spec:
  ipAddressPools:
  - oam
  - <application_specific_peer_address_pool_name>
status: {}

Example for signaling:
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  creationTimestamp: null
  name: peer1
  namespace: occne-infra
spec:
  holdTime: 1m30s
  keepaliveTime: 0s
  myASN: 64512
  passwordSecret: {}
  peerASN: 64501
  peerAddress: 172.16.2.3
status: {}
---
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  creationTimestamp: null
  name: peer2
  namespace: occne-infra
spec:
  holdTime: 1m30s
  keepaliveTime: 0s
  myASN: 64512
  passwordSecret: {}
  peerASN: 64501
  peerAddress: 172.16.2.2
status: {}
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  creationTimestamp: null
  name: oam
  namespace: occne-infra
spec:
  addresses:
  - '10.75.200.22/32'
  - '10.75.200.23/32'
  - '10.75.200.24/32'
  - '10.75.200.25/32'
  - '10.75.200.26/32'
  autoAssign: false
status: {}
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  creationTimestamp: null
  name: signalling
  namespace: occne-infra
spec:
  addresses:
  - '10.75.200.30 - 10.75.200.40'
  autoAssign: false
status: {}
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  creationTimestamp: null
  name: bgpadvertisement1
  namespace: occne-infra
spec:
  ipAddressPools:
  - oam
  - signalling
status: {}
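Once the cluster is installed, the resources defined in mb_resources.yaml can be inspected with kubectl (a verification sketch; the CRD names follow MetalLB's metallb.io API group used in the examples above):
$ kubectl -n occne-infra get bgppeers.metallb.io
$ kubectl -n occne-infra get ipaddresspools.metallb.io
$ kubectl -n occne-infra get bgpadvertisements.metallb.io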
Predeployment Configuration - Preparing a Minimal Boot Strapping Environment
The steps in this section provide the details to establish a minimal bootstrap environment (to support the automated installation of the CNE environment) on the Installer Bootstrap Host using a Keyboard, Video, Mouse (KVM) connection.
Installing Oracle Linux X.x on Bootstrap Host
This procedure defines the steps to install Oracle Linux X.x onto the CNE Installer Bootstrap Host. This host is used to configure the networking throughout the system and install OLX. After OLX installation, the host is repaved as a Kubernetes Master Host in a later procedure.
Note:
Skip this section if you are installing CNE on servers other than HP Gen10 and Oracle X, and run the bootstrap host procedures on the first k8s-host node, which will be the KVM host of the first Kubernetes master and Bastion VMs. The topology in this case remains the same; however, when you are installing CNE on other servers, the system assumes the following:
- The OS is already preinstalled and the network is configured as specified in Prerequisites for Servers Other than HP and Oracle X.
- ToR switches are already configured as specified in Configuring Top of Rack Switches.
Prerequisites
- USB drive of sufficient size to contain the ISO (approximately 5 GB)
- Oracle Linux X.x iso (For example: Oracle Linux 9.x iso) is available
- YUM repository file is available
- Keyboard, Video, Mouse (KVM) are available
Limitations and Expectations
- The configuration of the Installer Bootstrap Host has to be quick and easy. The Installer Bootstrap Host is re-paved with the appropriate OS configuration for cluster and DB operation at a later installation stage. The Installer Bootstrap Host needs a Linux OS and some basic network to start the installation process.
- All steps in this procedure are performed using Keyboard, Video, Mouse (KVM).
Bootstrap Install Procedure
- Create Bootable USB Media:
- On the installer's notebook, download the OLX ISO from the customer's repository.
- Push the OLX ISO image onto the USB Flash Drive.
Since the installer's notebook can run a Windows or Linux OS, you must determine the appropriate details to run this task. For a Linux based notebook, insert a USB Flash Drive of the appropriate size into a laptop (or some other Linux host on which you can copy the ISO), and run the dd command to create a bootable USB drive with the Oracle Linux X ISO (for example: Oracle Linux 9 ISO).
$ dd if=<path to ISO> of=<USB device path> bs=1M
- Install Oracle Linux on the Installer Bootstrap Host:
Note:
The following procedure considers installing OL9 and provides the options and commands accordingly. The procedure varies for other versions.
- Connect a Keyboard, Video, and Mouse (KVM) into the Installer Bootstrap Host's monitor and USB ports.
- Plug the USB flash drive containing the bootable ISO into an available USB port on the Bootstrap host (usually in the front panel).
- Reboot the host by momentarily
pressing the power button on the host's front panel. The button turns
yellow. If the button stays yellow, press the button again. The host
automatically boots onto the USB flash drive.
Note:
If the host was configured previously, and the USB is unavailable in the bootable path as per the boot order, the booting process will be unsuccessful. - If the host is unable to boot
onto the USB, repeat step 2c, and interrupt the boot process by pressing
F11 button
which displays the Boot Menu.
If the host has been recently booted with an OL, the Boot Menu displays Oracle Linux at the top of the list. Select Generic USB Boot as the first boot device and proceed.
- The host attempts to boot from
the USB. The Boot Menu is displayed on the screen. Select
Install Oracle Linux 9.x.y and click ENTER. This
begins the boot process and the system displays the Welcome screen.
When prompted for the language to use, select the default setting: English (United States) and click Continue in the lower left corner.
Note:
You can also select the second option Test this media & install Oracle Linux 9.x.y. This option first runs the media verification process. - The system displays the INSTALLATION
SUMMARY page. The system expects the following settings on the
page. If any of these are not set correctly, then select that menu item
and make the appropriate changes.
- LANGUAGE SUPPORT: English (United States)
- KEYBOARD: English (US)
- INSTALLATION SOURCE: Local Media
- SOFTWARE SELECTION: Minimal Install
- INSTALLATION DESTINATION: This must display
No disks selected.
- Select INSTALLATION DESTINATION to indicate on which drive to install the OS.
- Select the drives where the OS is installed.
Note:
- The system displays a dialog box if there is no space in the
selected drives to install Oracle Linux. When you encounter
such a scenario, perform the following steps to clear up
space for Oracle Linux installation:
- Click Reclaim space.
- Click Delete all.
- Click Reclaim space.
- Be aware that the data in the selected drives is lost.
- Select DONE. This returns to the INSTALLATION SUMMARY screen.
- At the INSTALLATION
SUMMARY screen, select ROOT PASSWORD.
Enter a root password appropriate for this installation.
It is recommended to use the secure password that the customer provides. This helps to minimize the host from being compromised during installation.
- At the INSTALLATION SUMMARY screen, select Begin Installation. The INSTALLATION PROGRESS screen is displayed.
- After completing the installation process, remove the USB and select Reboot System to complete the installation and booting to the OS on the Bootstrap Host. At the end of the boot process, the Log in prompt appears.
Configuring Host BIOS
Introduction
The following procedure defines the steps to set up the Basic Input Output System (BIOS) changes on the following server types:
- Bootstrap host: uses the KVM. If you are using a previously configured Bootstrap host that can be accessed through the remote HTML5 console, follow the procedure according to the remaining servers.
- All the remaining servers: use the remote HTML5 console.
The steps can vary based on the server type. Follow the appropriate steps specific to the configured server. Some of the steps require a system reboot, as indicated in the procedure.
Prerequisites
- For the Bootstrap host, the procedure Installing Oracle Linux X.x on Bootstrap Host is complete.
- For all other servers, the procedures Configuring Top of Rack 93180YC-EX Switches and Configuring Addresses for RMS iLOs are complete.
Limitations and Expectations
- Applies to HP Gen10 iLO 5 and Netra X8-2 server only.
- Procedures listed here apply to both Bootstrap Host and other servers unless indicated explicitly.
Procedure for Netra X8-2 server
By default, BIOS of the Netra X8-2 server is set to the factory settings with predefined default values. Do not change BIOS of a new X8-2 server. If any issue occurs in the new X8-2 server, reset BIOS to the default factory settings.
- Log in to https://<netra ilom address>.
- Navigate to System Management, select BIOS.
- Set the value to Factory from the drop-down list for Reset to Defaults under Settings.
- Click Save.
Exposing the System Configuration Utility on an RMS Host
Perform the following steps to launch the HP iLO 5 System Configuration Utility main page from the KVM. This procedure does not provide instructions to connect the console, as this can differ on each installation.
- After providing connections for the KVM to access the console, you must reboot the host by momentarily pressing the power button on the front of the Bootstrap host.
- Navigate to the HP ProLiant DL380 Gen10 System Utilities.
Once the remote console has been exposed, reset the system to force it through a restart. When the initial window is displayed, keep pressing the F9 key repeatedly. Once the F9 key is highlighted at the lower left corner of the remote console, it eventually brings up the main System Utilities screen.
- The System Utilities screen is displayed in the remote console.
Note:
As CNE 23.4.0 upgraded Oracle Linux to version 9, some of the OS capabilities are removed for security reasons. This includes the removal of older insecure cryptographic policies as well as shorter RSA lengths that are no longer supported. For more information, see Step c.
- Perform the following steps to launch the system utility for other RMS servers:
- SSH to the RMS using the iLO IP address, and the root user and password
previously assigned at the Installation Preflight Checklist. This displays the HP iLO
prompt.
$ ssh root@<rms_ilo_ip_address>
Using username "root".
Last login: Fri Apr 19 12:24:56 2019 from 10.39.204.17
[root@localhost ~]# ssh root@192.168.20.141
root@192.168.20.141's password:
User:root logged-in to ILO2M2909004F.labs.nc.tekelec.com(192.168.20.141 / FE80::BA83:3FF:FE47:649C)
Integrated Lights-Out 5
iLO Advanced 2.30 at Aug 24 2020
Server Name:
Server Power: On
</>hpiLO->
remote
console:
</>hpiLO->vsp - Power cycle the blade to bring up the System
Utilities for that blade.
Note:
The System Utility is a text based version of that exposed on the RMS via the KVM. You must use the directional (arrow) keys to manipulate between selections, ENTER key to select, and ESC to return from the current selection. - Access the System Utility by pressing ESC 9.
- SSH to the RMS using the iLO IP address, and the root user and password
previously assigned at the Installation Preflight Checklist. This displays the HP iLO
prompt.
- [Optional]: If you are using OL9, depending on the host that is used to
connect the RMS, you may encounter the following error messages when you
connect to iLOM. These errors are encountered in OL9 due to the change in
security:
Error in Oracle X8-2 or X9-2 server when the RSA key length is too short:
$ ssh root@172.10.10.10
Bad server host key: Invalid key length
Error in HP server when the legacy crypto policy is not enabled:
$ ssh root@172.11.11.10
Unable to negotiate with 172.11.11.10 port 22: no matching key exchange method found. Their offer: diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
Perform the following steps to resolve these iLOM connectivity issues:
Run the following command on the connecting host only if the host experiences the aforementioned errors while connecting to HP iLO or Oracle iLOM through SSH.- Perform the following steps to reenable Legacy crypto policies to
connect to HP iLO using SSH:
- Run the following command to enable Legacy crypto
policies:
$ sudo update-crypto-policies --set LEGACY - Run the following command to revert the policies to
default:
$ sudo update-crypto-policies --set DEFAULT
- Run the following command to enable Legacy crypto
policies:
- Run the following command to allow short RSA key lengths while connecting to an Oracle X8-2 or X9-2 server using SSH:
$ ssh -o RSAMinSize=1024 root@172.10.10.10
- Perform the following steps to reenable Legacy crypto policies to
connect to HP iLO using SSH:
- Navigate to the System Utility as per step 1.
- Select System Configuration.
- Select BIOS/Platform Configuration (RBSU).
- Select Boot Options: If the Boot Mode is
currently UEFI Mode and you decide to use BIOS Mode, use this procedure to
change to Legacy BIOS Mode.
Note:
The server reset must go through an attempt to boot before the changes apply. - Select the Reboot Required dialog window to drop back into the boot process. The boot must go into the process of actually attempting to boot from the boot order. Attempting to boot fails as disks are not installed at this point. The System Utility can be accessed again.
- After the reboot and you re-enter the System Utility, the Boot Options page appears.
- Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
- Navigate to the System Utility as per step 1.
- Select System Configuration.
- Select BIOS/Platform Configuration (RBSU).
- Select Boot Options. Click the drop-down; when the Warning prompt appears, click OK.
Note:
The server reset must go through an attempt to boot before the changes apply. - Select the Reboot Required dialog window. Click OK for the warning reboot window.
- After the reboot and you re-enter the System
Utility, the Boot Options page appears.
The Boot Mode is changed to UEFI Mode and the UEFI Optimized Boot has changed to Enabled automatically.
- Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
Note:
Ensure that the pxe_install_lights_out_usr and pxe_install_lights_out_passwd fields match those provided in the hosts inventory files created using the template. For more information about inventory file preparation, see Inventory File Preparation.
- Navigate to the System Utility as per step 1.
- Select System Configuration.
- Select iLO 5 Configuration Utility.
- Select User Management → Add User.
- Select the appropriate permissions. For the root user, set all permissions to YES. Enter root in the New User Name and Login Name fields, then select the Password field, press the Enter key, and type <password> twice.
- Navigate to the System Utility as per step 1.
- Select System Configuration.
- Select BIOS/Platform Configuration (RBSU).
- Select Boot Options.
- Perform the following steps depending on the current Boot Mode:
- Perform the following steps if the current Boot Mode is Legacy BIOS Mode:
- Ensure that the following options are configured
properly:
- UEFI Optimized Boot must be set to disabled
- Boot Order Policy must be set to Retry Boot Order Indefinitely. This means that the systems will keep trying to boot without ever going to disk.
- Legacy BIOS Boot Order must be selected by default.
- If Legacy BIOS Mode is not selected, then follow the "Changing from UEFI Booting Mode to Legacy BIOS Booting Mode" procedure in this section to set the configuration utility to Legacy BIOS Mode.
- Select
Legacy BIOS Boot Order.
This page defines the legacy BIOS boot order. This includes the list of devices from which the server will listen for the DHCP OFFER (including the reserved IPv4) after the PXE DHCP DISCOVER message is broadcast from the server.
- In the default view, 10Gb Embedded FlexibleLOM 1 Port 1 is at the bottom of the list. When the server begins the scan for the response, it scans down this list until it receives the response. Each NIC takes a finite amount of time before the server gives up on that NIC and attempts another in the list. Moving 10Gb Embedded FlexibleLOM 1 Port 1 up this list decreases the time required to finally process the DHCP OFFER. To move an entry, drag and drop it to the required position in the list.
- Ensure that the following options are configured
properly:
- If the current Boot Mode is
UEFI BIOS Mode, then perform the
following steps:
- Ensure that the following options are configured
properly:
- UEFI Optimized Boot must be set to enabled
- Boot Order Policy must be set to Retry Boot Order Indefinitely. This means that the systems will keep trying to boot without ever going to disk.
- UEFI Boot Settings must be selected by default.
- Click UEFI Boot Settings and select UEFI Boot Order.
- Move the 10 Gb Embedded FlexibleLOM 1 Port 1 entry above the 1Gb Embedded LOM 1 Port 1 entry.
- Ensure that the following options are configured
properly:
- Select F10: Save to save and stay in the utility, or select F12: Save and Exit to save and exit and complete the current boot process.
- Verifying Default Settings
- Navigate to the System Configuration Utility as per step 1.
- Select System Configuration.
- Select BIOS/Platform Configuration (RBSU).
- Select
Virtualization Options.
This screen displays the settings for the Intel(R) Virtualization Technology (IntelVT), Intel(R) VT-d, and SR-IOV options (Enabled or Disabled). The default value for each option is Enabled.
- Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
- Navigate to the System Configuration Utility as per step 1.
- Select System Configuration.
- Select Embedded RAID 1 : HPE Smart Array P408i-a SR Gen10.
- Select Array Configuration.
- Select Manage Arrays.
- Select Array A (or any designated Array Configuration if there are more than one).
- Select Delete Array.
- Select Submit Changes.
- Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
- Navigate to the System Configuration Utility as per step 1.
- Select System Configuration.
- Select Embedded RAID 1 : HPE Smart Array P408i-a SR Gen10.
- Select Set Bootable Device(s) for Legacy Boot Mode. If the boot devices are not set then it will display Not Set for the primary and secondary devices.
- Select Select Bootable Physical Drive.
- Select Port 1| Box:3 Bay:1
Size:1.8 TB SAS HP EG00100JWJNR.
Note:
This example includes two HDDs and two SSDs. The actual configuration can be different. - Select Set as Primary Bootable Device.
- Select Back to Main
Menu.
This returns to the HPE Smart Array P408i-a SR Gen10 menu. The secondary bootable device is left as Not Set.
- Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
Note:
This step requires a reboot after completion.- Navigate to the System Configuration Utility as per step 1.
- Select System Configuration.
- Select iLO 5 Configuration Utility.
- Select Network Options.
- Enter the IP Address, Subnet Mask, and Gateway IP Address fields provided in Installation PreFlight Checklist.
- Select F12: Save and Exit to complete the current boot process. A reboot is required when setting the static IP for the iLO 5. A warning appears indicating that you must wait 30 seconds for the iLO to reset. A prompt requesting a reboot appears. Select Reboot.
- Once the reboot is complete, you can re-enter the System Utility and verify the settings if necessary.
Configuring Top of Rack Switches
Before installing CNE on BareMetal clusters, you must configure at least two ToR switches to support CNE installation. Though CNE primarily uses Cisco Nexus C93180YC-EX switches, it allows you to use any ToR switch to support a BareMetal CNE cluster. However, it is your responsibility to configure and manage the ToR switch in your domain. This section provides an overview of the generic requirements, capabilities, and configuration of a ToR switch to support BareMetal CNE clusters.
For the procedure to configure Cisco Nexus 93180YC-EX switch, see Configuring Top of Rack 93180YC-EX Switches.
Prerequisites
Before configuring your ToR switch to support BareMetal CNE, ensure that you meet the following prerequisites:
- You must have the network topology design that specifies the roles and connections of each switch.
- You must have Console or SSH access to the ToR switches.
- You must have the administrative access to configure the switches.
- The switches must be connected as per Installation PreFlight Checklist. The customer uplinks must not be active before the outside traffic is necessary.
- The ToR switch must support user creation for secure access to the switches.
Features Required in ToR Switches
Ensure that the following features are available in the ToR switch to support CNE installation on BareMetal:
- Border Gateway Protocol (BGP)
- interface-vlan
- Link Aggregation Control Protocol (LACP)
- Virtual Port Channel (VPC) or Intelligent Resilient Framework (IRF)
- Virtual Router Redundancy Protocol (VRRP v3) or Hot Standby Router Protocol (HSRP)
- Open Shortest Path First (OSPF). This feature is optional.
Configurations
This section provides information about the generic configurations that are required in the ToR switch to support CNE installation on BareMetal.
The mgmt0 port of the ToR switch must be assigned an IP address within the management Virtual Routing and Forwarding (VRF) instance. This ensures that the management traffic is routed independently of data traffic, thereby enhancing security.
Maximum Transmission Unit (MTU) is the largest size of a packet or frame that can be sent in a single transmission on a network interface. On Cisco switches, MTU settings determine the maximum size of packets that can be transmitted over the network. The default MTU size on most switches is typically 1500 bytes, which is the standard size for Ethernet frames in most network environments. If you want to use a larger MTU, ensure that your ToR switch supports a larger MTU (jumbomtu) and configure the larger MTU on all interfaces (VLAN, port-channel, and physical interfaces).
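To confirm that a larger MTU actually survives the path end to end, a hedged check from any Linux host is a do-not-fragment ping sized to the candidate MTU minus the 28 bytes of IP and ICMP headers (9000 is an illustrative jumbo value):
$ ip link show bond0 | grep -o 'mtu [0-9]*'     # current interface MTU
$ ping -M do -s $((9000 - 28)) -c 3 <peer_ip>   # fails if any hop cannot carry 9000-byte frames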
HP switches use the IRF protocol to combine two or more switches into a single logical device.
Cisco Nexus switches use the vPC protocol, which allows links that are physically connected to two different Cisco Nexus devices to appear as a single port channel to a third device. This improves redundancy and load balancing, and eliminates the ports blocked by Spanning Tree Protocol (STP). The vPC protocol uses mgmt0 for the vPC peer keep-alive link, which is used to monitor the health of the vPC peer link. The keep-alive link sends heartbeat messages between the vPC peer switches to ensure both are operational and synchronized. This link helps prevent split-brain scenarios where both switches assume the active role due to a peer link failure.
VRRP v3 and HSRP protocols provide high availability and redundancy for IP routing. This is done by allowing multiple routers to work together to present a single virtual router to end devices. It enhances network reliability by ensuring continuous availability of routing paths even if one of the routers fails. Configure these protocols depending on your requirement. For more information about configuring these protocols, see "Configuring VLAN".
Tracking object (Object-Track) monitors the status of the line protocol on the uplink interface. This tracking object is used for routing through VRRPv3/HSRP, based on the interface status.
Configuring a VLAN on the ToR switch involves:
- Defining the VLAN.
- Configuring the VLAN interface (also known as Switched Virtual Interface (SVI)).
- Assigning switch ports to VLAN.
- Each RMS server must have two eNet ports. Each eNet port must be connected to a separate ToR Switch.
- The first two RMS systems must be configured as k8s-host-1/k8s-host-2 and bastion-1/bastion-2. VLANs 2, 3, and 4 must be allowed to enable external access to the Bastion Hosts and to facilitate their communication with all other nodes and ILOs.
- RMS3 is dedicated to k8s-host-3 and k8s-master-3, where access to VLAN 3 is sufficient. However, to ensure redundancy in the event of an issue with RMS1 or RMS2, RMS3 must also be configured to allow VLANs 2, 3, and 4.
- All nodes starting from RMS4 are worker nodes. When configuring the worker nodes, extend the commands to all of them.
- Run the following or equivalent commands to
create the port-channel and allow appropriate
VLANS:
- Allow quick convergence of the
servers:
spanning-tree port type edge trunk - Allow Pre-boot Execution Environment (PXE)
boot on the first Network Interface Card
(NIC):
no lacp suspend-individual
- Allow quick convergence of the
servers:
- Run the following or equivalent commands to
configure physical interface into port channel.
Allow same VLANS as in the port-channel:
- Allow quick convergence of the servers.
Configure this on both port-channel and physical
ports.
spanning-tree port type edge trunk - Configure a physical interface to be part of
an EtherChannel (or Port Channel) in active mode
using the Link Aggregation Control Protocol
(LACP):
channel-group <group-id> force mode active
- Allow quick convergence of the servers.
Configure this on both port-channel and physical
ports.
Ensure that each server has the iLO or iLOM port to connect to one of the switches, with access mode configured on the switch port.
- The vPC peer link is a special link that connects two Cisco Nexus switches configured as vPC peers. It serves as the communication backbone between the two switches, allowing them to synchronize state and configuration information. This link is essential for the operation of vPCs and ensures that both switches operate together.
- The inter-switch link facilitates communication between two ToR switches for VRRPv3 or HSRP. This is used to advertise the link and negotiate the controller or backup relationship.
- Do not connect the cables between the ToR switch and customer network physically.
- Do not run the "
no shutdown" or equivalent command which enables the uplink.
Border Gateway Protocol (BGP) is used to exchange routing information between different autonomous systems. When MetalLB is used in the cluster, BGP configurations are required on the switches to access LoadBalancer IPs.
Calico (used in the cluster) uses 64512 as the default AS number. The ToR switches must use this number to establish a peer relationship with the cluster. The router ID must be unique for each switch in the connections, including the two ToR switches and the connected customer switches.
The routing between ToR switches and customer switches can differ and is decided based on the user network. BGP and OSPF are the most commonly used routing protocols for this purpose. Open Shortest Path First (OSPF) is a widely used Interior Gateway Protocol (IGP) designed for routing within an autonomous system. OSPF is based on a link-state routing algorithm, which provides fast convergence and scalability, making it suitable for large and complex network topologies.
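Once the cluster is running, the BGP sessions toward the ToR switches can be checked from a Kubernetes node with calicoctl, assuming the tool is installed there; this is a verification sketch, not part of the switch configuration itself:
$ sudo calicoctl node status
# The IPv4 BGP status table should list each ToR peer address with state 'Established'.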
Support for CNLB Switch
- CNLB bond0 version: Add secondary IP addresses for all external subnets on private VLAN interface.
- CNLB VLAN version: Perform the following steps to
configure CNLB VLAN for each CNLB internal and
external subnet:
- Add VLAN.
- Add VLAN interface. IPv6 is required to keep worker node interfaces up and running.
- Add allowed VLAN on each port assigned to the worker nodes.
Configuring Top of Rack 93180YC-EX Switches
Introduction
This section provides the steps to initialize and configure Cisco 93180YC-EX switches as per your topology design.
Note:
Run all instructions in this procedure from the Bootstrap Host.
Prerequisites
- The procedure Installing Oracle Linux X.x on Bootstrap Host is complete.
- The switches are in a factory default state. If the switches are out of the box or preconfigured, run write erase and reload to factory default.
- The switches are connected as per the Installation PreFlight Checklist. The customer uplinks are not active before outside traffic is necessary.
- DHCP, XINETD, and TFTP are already installed on the Bootstrap host but not configured.
- Available Utility USB contains all the necessary files according to the Installation PreFlight checklist: Create Utility USB.
Limitations/Expectations
All steps are run from a Keyboard, Video, Mouse (KVM) connection.
Configuration Procedure
Following is the procedure to configure Top of Rack 93180YC-EX Switches:
- Use KVM to log in to the Bootstrap Host as the root user.
- Insert and mount the Utility USB that contains the configuration and script files. Verify that the files are listed in the USB using the ls /media/usb command. To mount the USB, perform steps 2 and 3 of Installation of Oracle Linux X.X on Bootstrap Host.
- Create a bridge interface to connect both management ports and set up the management bridge to support switch initialization.
Note:
The names of interface 1 and interface 2 depend on the version of Linux that is being run. You can obtain the names of the interfaces by running the ip a command.
Commands for mgmtBridge:
$ nmcli con add con-name mgmtBridge type bridge ifname mgmtBridge
$ nmcli con add type bridge-slave ifname <interface 1> master mgmtBridge
$ nmcli con add type bridge-slave ifname <interface 2> master mgmtBridge
$ nmcli con mod mgmtBridge ipv4.method manual ipv4.addresses <mgmtBridge_IP>
$ nmcli con up mgmtBridge
For example:
$ nmcli con add con-name mgmtBridge type bridge ifname mgmtBridge
$ nmcli con add type bridge-slave ifname eno5np0 master mgmtBridge
$ nmcli con add type bridge-slave ifname eno6np1 master mgmtBridge
$ nmcli con mod mgmtBridge ipv4.method manual ipv4.addresses 192.168.2.11/24
$ nmcli con up mgmtBridge
Commands for bond:
$ nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad
$ nmcli con add type bond-slave con-name bond0-slave-1 ifname <interface 1> master bond0
$ nmcli con add type bond-slave con-name bond0-slave-2 ifname <interface 2> master bond0
For example:
$ nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad
$ nmcli con add type bond-slave con-name bond0-slave-1 ifname eno5np0 master bond0
$ nmcli con add type bond-slave con-name bond0-slave-2 ifname eno6np1 master bond0
Note:
- The <CNE_Management_IP_With_Prefix> value in the following commands is from Installation PreFlight Checklist: Complete Site Survey Host IP Table (Row 1, CNE Management IP Addresses (VLAN 4) column).
- The <ToRSwitch_CNEManagementNet_VIP> value in the following commands is from Installation PreFlight Checklist: Complete OA and Switch IP Table.
$ nmcli con mod bond0 ipv4.method manual ipv4.addresses <bootstrap_bond0_address_with_prefix>
$ nmcli con add con-name bond0.<mgmt_vlan_id> type vlan id <mgmt_vlan_id> dev bond0
$ nmcli con mod bond0.<mgmt_vlan_id> ipv4.method manual ipv4.addresses <CNE_Management_IP_Address_With_Prefix> ipv4.gateway <ToRswitch_CNEManagementNet_VIP>
$ nmcli con up bond0.<mgmt_vlan_id>
For example:
$ nmcli con mod bond0 ipv4.method manual ipv4.addresses 172.16.3.4/24
$ nmcli con add con-name bond0.4 type vlan id 4 dev bond0
$ nmcli con mod bond0.4 ipv4.method manual ipv4.addresses 10.7.5.22/28 ipv4.gateway 10.7.5.17
$ nmcli con up bond0.4
- This step is applicable for the Netra X8-2 server only. Due to the limitation of Ethernet ports, only three out of five ports can be enabled. Therefore, there are not enough ports to configure mgmtBridge and bond0 at the same time. Connect NIC1 or NIC2 to the ToR switch mgmt ports in this step to configure the ToR switches. After that, reconnect the ports to the Ethernet ports on the ToR switches.
$ nmcli con add con-name mgmtBridge type bridge ifname mgmtBridge
$ nmcli con add type bridge-slave ifname eno5np0 master mgmtBridge
$ nmcli con add type bridge-slave ifname eno6np1 master mgmtBridge
$ nmcli con mod mgmtBridge ipv4.method manual ipv4.addresses 192.168.2.11/24
$ nmcli con up mgmtBridge
ol9.repo) from USB to/etc/yum.repo.ddirectory. Move the origin to backup file (For exmaple:oracle-linux-ol9.repo,uek-ol9.repo, andvirt-ol9.repo):$ cd /etc/yum.repos.d $ mv oracle-linux-ol9.repo oracle-linux-ol9.repo.bkp $ mv virt-ol9.repo virt-ol9.repo.bkp $ mv uek-ol9.repo uek-ol9.repo.bkp $ cp /media/usb/<central_repo>.repo ./ - Install and set up tftp server on bootstrap
host:
$ dnf install -y tftp-server tftp $ cp /usr/lib/systemd/system/tftp.service /etc/systemd/system/tftp-server.service $ cp /usr/lib/systemd/system/tftp.socket /etc/systemd/system/tftp-server.socket $ tee /etc/systemd/system/tftp-server.service<<'EOF' [Unit] Description=Tftp Server Requires=tftp-server.socket Documentation=man:in.tftpd [Service] ExecStart=/usr/sbin/in.tftpd -c -p -s /var/lib/tftpboot StandardInput=socket [Install] WantedBy=multi-user.target Also=tftp-server.socket EOF - Enable tftp on the Bootstrap
host:
$ systemctl daemon-reload $ systemctl enable --now tftp-servertftp is active and enabled:$ systemctl status tftp $ ps -elf | grep tftp - Install and setup dhcp server on bootstrap
host:
$ dnf -y install dhcp-server - Copy the
dhcpd.conffile from the Utility USB in Installation PreFlight checklist : Create the dhcpd.conf File to the/etc/dhcp/directory.$ cp /media/usb/dhcpd.conf /etc/dhcp/ - Restart and enable dhcpd
service.
Use the following command to verify the active and enabled state:$ systemctl enable --now dhcpd$ systemctl status dhcpd - Depending on the type of Load Balancer being configured (CNLB or Metallb), copy
the switch configuration and script files from the Utility USB to
/var/lib/tftpboot/directory as follows:- If you are using MetalLB, use the following
commands:
$ cp /media/usb/93180_switchA.cfg /var/lib/tftpboot/. $ cp /media/usb/93180_switchB.cfg /var/lib/tftpboot/. $ cp /media/usb/poap_nexus_script.py /var/lib/tftpboot/. - If you are using CNLB, you can either use
bond0orvlanin the CNLB network configuration:- Example for CNLB using
bond0:
$ cp /media/usb/93180_switchA_cnlb_bond0.cfg /var/lib/tftpboot/. $ cp /media/usb/93180_switchB_cnlb_bond0.cfg /var/lib/tftpboot/. $ cp /media/usb/poap_nexus_script.py /var/lib/tftpboot/. - Example for CNLB using
vlan:
$ cp /media/usb/93180_switchA_cnlb_vlan.cfg /var/lib/tftpboot/. $ cp /media/usb/93180_switchB_cnlb_vlan.cfg /var/lib/tftpboot/. $ cp /media/usb/poap_nexus_script.py /var/lib/tftpboot/.
- Example for CNLB using
bond0:
- If you are using MetalLB, use the following
commands:
- To modify the POAP script file change username and password credentials used to
log in to the Bootstrap host.
$ vi /var/lib/tftpboot/poap_nexus_script.py # Host name and user credentials options = { "username": "<username>", "password": "<password>", "hostname": "192.168.2.11", "transfer_protocol": "scp", "mode": "serial_number", "target_system_image": "nxos.9.2.3.bin", }Note:
The versionnxos.9.2.3.binis used by default. If different version is to be used, modifytarget_system_imagewith new version. - Run the
md5Poap.shscript from the Utility USB created from Installation PreFlight checklist: Create the md5Poap Bash Script, to modify the POAP script file md5sum as follows:$ cd /var/lib/tftpboot/ $ /bin/bash md5Poap.sh - Create the files necessary to configure the ToR switches using the serial
number from the switch.

Note:
The serial number is located on a pullout card on the back of the switch in the left most power supply of the switch. Be careful in interpreting the exact letters. If the switches are preconfigured, then you can even verify the serial numbers usingshow license host-idcommand. - Depending on the type of Load Balancer, copy the switch configuration files into
a new file renamed according to the switch A or B serial number:
- For standard MetalLB, copy the
/var/lib/tftpboot/93180_switchA.cfgfile into a file named/var/lib/tftpboot/conf.<switchA serial number>. - For CNLB:
- If you are using
bond0, copy the/var/lib/tftpboot/93180_switchA_bond0.cfgfile into a file named/var/lib/tftpboot/conf.<switchA serial number>. - If you are using
vlan, copy the/var/lib/tftpboot/93180_switchA_vlan.cfgfile into a file named/var/lib/tftpboot/conf.<switchA serial number>.
- If you are using
- For standard MetalLB, copy the
- Modify the switch specific values in the
/var/lib/tftpboot/conf.<switchA serial number>file, including all the values in the curly braces as shown in the following code block:These values are available in Installation PreFlight checklist : ToR and Enclosure Switches Variables Table (Switch Specific) and Installation PreFlight Checklist : Complete OA and Switch IP Table. Modify these values with the followingsedcommands or use an editor such asvito modify the commands.Note:
The template supports 12 RMS servers. If there are less than 12 servers, then the extra configurations may not work without physical connections and will not affect the first number of servers. If there are more than 12 servers, simulate the pattern to add for more servers.$ sed -i 's/{switchname}/<switch_name>/' conf.<switchA serial number> $ sed -i 's/{admin_password}/<admin_password>/' conf.<switchA serial number> $ sed -i 's/{user_name}/<user_name>/' conf.<switchA serial number> $ sed -i 's/{user_password}/<user_password>/' conf.<switchA serial number> $ sed -i 's/{ospf_md5_key}/<ospf_md5_key>/' conf.<switchA serial number> $ sed -i 's/{OSPF_AREA_ID}/<ospf_area_id>/' conf.<switchA serial number> $ sed -i 's/{NTPSERVER1}/<NTP_server_1>/' conf.<switchA serial number> $ sed -i 's/{NTPSERVER2}/<NTP_server_2>/' conf.<switchA serial number> $ sed -i 's/{NTPSERVER3}/<NTP_server_3>/' conf.<switchA serial number> $ sed -i 's/{NTPSERVER4}/<NTP_server_4>/' conf.<switchA serial number> $ sed -i 's/{NTPSERVER5}/<NTP_server_5>/' conf.<switchA serial number> Note: If less than 5 ntp servers available, delete the extra ntp server lines such as command: $ sed -i 's/{NTPSERVER5}/d' conf.<switchA serial number> Note: different delimiter is used in next two commands due to '/' sign in the variables $ sed -i 's#{ALLOW_5G_XSI_LIST_WITH_PREFIX_LEN}#<MetalLB_Signal_Subnet_With_Prefix>#g' conf.<switchA serial number> $ sed -i 's#{CNE_Management_SwA_Address}#<ToRswitchA_CNEManagementNet_IP>#g' conf.<switchA serial number> $ sed -i 's#{CNE_Management_SwB_Address}#<ToRswitchB_CNEManagementNet_IP>#g' conf.<switchA serial number> $ sed -i 's#{CNE_Management_Prefix}#<CNEManagementNet_Prefix>#g' conf.<switchA serial number> $ sed -i 's/{CNE_Management_VIP}/<ToRswitch_CNEManagementNet_VIP>/g' conf.<switchA serial number> $ sed -i 's/{OAM_UPLINK_CUSTOMER_ADDRESS}/<ToRswitchA_oam_uplink_customer_IP>/' conf.<switchA serial number> $ sed -i 's/{OAM_UPLINK_SwA_ADDRESS}/<ToRswitchA_oam_uplink_IP>/g' conf.<switchA serial number> $ sed -i 's/{SIGNAL_UPLINK_SwA_ADDRESS}/<ToRswitchA_signaling_uplink_IP>/g' conf.<switchA serial number> $ sed -i 's/{OAM_UPLINK_SwB_ADDRESS}/<ToRswitchB_oam_uplink_IP>/g' conf.<switchA serial number> $ sed -i 's/{SIGNAL_UPLINK_SwB_ADDRESS}/<ToRswitchB_signaling_uplink_IP>/g' conf.<switchA serial number> $ ipcalc -n <ToRswitchA_signaling_uplink_IP>/30 | awk -F'=' '{print $2}' $ sed -i 's/{SIGNAL_UPLINK_SUBNET}/<output from ipcalc command as signal_uplink_subnet>/' conf.<switchA serial number> $ ipcalc -n <ToRswitchA_SQLreplicationNet_IP> | awk -F'=' '{print $2}' $ sed -i 's/{MySQL_Replication_SUBNET}/<output from the above ipcalc command appended with prefix >/' conf.<switchA serial number> Note: The version nxos.9.2.3.bin is used by default and hard-coded in the conf files. If different version is to be used, run the following command: $ sed -i 's/nxos.9.2.3.bin/<nxos_version>/' conf.<switchA serial number> Note: access-list Restrict_Access_ToR The following line allow one access server to access the switch management and SQL vlan addresses while other accesses are denied. If no need, delete this line. If need more servers, add similar line. $ sed -i 's/{Allow_Access_Server}/<Allow_Access_Server>/' conf.<switchA serial number>If you are using CNLB deployment, run the following commands in addition to the commands in the previous codeblock. 
This is applicable for bothbond0andvlanconfigurations:$ sed -i 's#{CNLB_OAM_EXT_SwA_Address}#<CNLB_OAM_EXT_SwA_Address>#g' conf.<switchA serial number> $ sed -i 's#{CNLB_OAM_EXT_VIP}#<CNLB_OAM_EXT_VIP>#g' conf.<switchA serial number> $ sed -i 's#{CNLB_OAM_EXT_Prefix}#<CNLB_OAM_EXT_Prefix>#g' conf.<switchA serial number> $ sed -i 's#{CNLB_SIG_EXT_SwA_Address}#<CNLB_SIG_EXT_SwA_Address>#g' conf.<switchA serial number> $ sed -i 's#{CNLB_SIG_EXT_VIP}#<CNLB_SIG_EXT_VIP>#g' conf.<switchA serial number> $ sed -i 's#{CNLB_SIG_EXT_Prefix}#<CNLB_SIG_EXT_Prefix>#g' conf.<switchA serial number> - Copy the
/var/lib/tftpboot/93180_switchB.cfginto a/var/lib/tftpboot/conf.<switchB serial number>file:Modify the switch specific values in the
/var/lib/tftpboot/conf.<switchA serial number>file, including: hostname, username/password, oam_uplink IP address, signaling_uplink IP address, access-list ALLOW_5G_XSI_LIST permit address, prefix-list ALLOW_5G_XSI.These values are available in Installation PreFlight checklist : ToR and Enclosure Switches Variables Table and Installation PreFlight Checklist : Complete OA and Switch IP Table.Note:
The template supports 12 RMS servers. If there are less than 12 servers, then the extra configurations may not work without physical connections and will not affect the first number of servers. If there are more than 12 servers, simulate the pattern to add for more servers.$ sed -i 's/{switchname}/<switch_name>/' conf.<switchB serial number> $ sed -i 's/{admin_password}/<admin_password>/' conf.<switchB serial number> $ sed -i 's/{user_name}/<user_name>/' conf.<switchB serial number> $ sed -i 's/{user_password}/<user_password>/' conf.<switchB serial number> $ sed -i 's/{ospf_md5_key}/<ospf_md5_key>/' conf.<switchB serial number> $ sed -i 's/{OSPF_AREA_ID}/<ospf_area_id>/' conf.<switchB serial number> $ sed -i 's/{NTPSERVER1}/<NTP_server_1>/' conf.<switchB serial number> $ sed -i 's/{NTPSERVER2}/<NTP_server_2>/' conf.<switchB serial number> $ sed -i 's/{NTPSERVER3}/<NTP_server_3>/' conf.<switchB serial number> $ sed -i 's/{NTPSERVER4}/<NTP_server_4>/' conf.<switchB serial number> $ sed -i 's/{NTPSERVER5}/<NTP_server_5>/' conf.<switchB serial number> Note: If less than 5 ntp servers available, delete the extra ntp server lines such as command: $ sed -i 's/{NTPSERVER5}/d' conf.<switchB serial number> Note: different delimiter is used in next two commands due to '/' sign in in the variables $ sed -i 's#{ALLOW_5G_XSI_LIST_WITH_PREFIX_LEN}#<MetalLB_Signal_Subnet_With_Prefix>#g' conf.<switchB serial number> $ sed -i 's#{CNE_Management_SwA_Address}#<ToRswitchA_CNEManagementNet_IP>#g' conf.<switchB serial number> $ sed -i 's#{CNE_Management_SwB_Address}#<ToRswitchB_CNEManagementNet_IP>#g' conf.<switchB serial number> $ sed -i 's#{CNE_Management_Prefix}#<CNEManagementNet_Prefix>#g' conf.<switchB serial number> $ sed -i 's/{CNE_Management_VIP}/<ToRswitch_CNEManagementNet_VIP>/' conf.<switchB serial number> $ sed -i 's/{OAM_UPLINK_CUSTOMER_ADDRESS}/<ToRswitchB_oam_uplink_customer_IP>/' conf.<switchB serial number> $ sed -i 's/{OAM_UPLINK_SwA_ADDRESS}/<ToRswitchA_oam_uplink_IP>/g' conf.<switchB serial number> $ sed -i 's/{SIGNAL_UPLINK_SwA_ADDRESS}/<ToRswitchA_signaling_uplink_IP>/g' conf.<switchB serial number> $ sed -i 's/{OAM_UPLINK_SwB_ADDRESS}/<ToRswitchB_oam_uplink_IP>/g' conf.<switchB serial number> $ sed -i 's/{SIGNAL_UPLINK_SwB_ADDRESS}/<ToRswitchB_signaling_uplink_IP>/g' conf.<switchB serial number> $ ipcalc -n <ToRswitchB_signaling_uplink_IP>/30 | awk -F'=' '{print $2}' $ sed -i 's/{SIGNAL_UPLINK_SUBNET}/<output from ipcalc command as signal_uplink_subnet>/' conf.<switchB serial number> Note: The version nxos.9.2.3.bin is used by default and hard-coded in the conf files. If different version is to be used, run the following command: $ sed -i 's/nxos.9.2.3.bin/<nxos_version>/' conf.<switchB serial number> Note: access-list Restrict_Access_ToR The following line allow one access server to access the switch management and SQL vlan addresses while other accesses are denied. If no need, delete this line. If need more servers, add similar line. $ sed -i 's/{Allow_Access_Server}/<Allow_Access_Server>/' conf.<switchB serial number>If you are using a CNLB deployment, run the following commands in addition to the commands in the previous codeblock. 
This is applicable for bothbond0andvlanconfigurations:$ sed -i 's#{CNLB_OAM_EXT_SwB_Address}#<CNLB_OAM_EXT_SwB_Address>#g' conf.<switchB serial number> $ sed -i 's#{CNLB_OAM_EXT_VIP}#<CNLB_OAM_EXT_VIP>#g' conf.<switchB serial number> $ sed -i 's#{CNLB_OAM_EXT_Prefix}#<CNLB_OAM_EXT_Prefix>#g' conf.<switchB serial number> $ sed -i 's#{CNLB_SIG_EXT_SwB_Address}#<CNLB_SIG_EXT_SwB_Address>#g' conf.<switchB serial number> $ sed -i 's#{CNLB_SIG_EXT_VIP}#<CNLB_SIG_EXT_VIP>#g' conf.<switchB serial number> $ sed -i 's#{CNLB_SIG_EXT_Prefix}#<CNLB_SIG_EXT_Prefix>#g' conf.<switchB serial number> - Generate the md5 checksum for each conf file in
/var/lib/tftpbootand copy that into a new file calledconf.<switchA/B serial number>.md5.$ md5sum conf.<switchA serial number> > conf.<switchA serial number>.md5 $ md5sum conf.<switchB serial number> > conf.<switchB serial number>.md5 - Verify that the
/var/lib/tftpbootdirectory has the correct files.Ensure that the file permissions are set as follows:Note:
The ToR switches constantly attempts to find and run thepoap_nexus_script.pyscript which usestftpto load and install the configuration files.$ ls -l /var/lib/tftpboot/ total 1305096 -rw-r--r--. 1 root root 7161 Mar 25 15:31 conf.<switchA serial number> -rw-r--r--. 1 root root 51 Mar 25 15:31 conf.<switchA serial number>.md5 -rw-r--r--. 1 root root 7161 Mar 25 15:31 conf.<switchB serial number> -rw-r--r--. 1 root root 51 Mar 25 15:31 conf.<switchB serial number>.md5 -rwxr-xr-x. 1 root root 75856 Mar 25 15:32 poap_nexus_script.py - Enable
tftp-serveragain and verify the status.Note:
The status oftftp-serverremains active for only 15 minutes after enabling the server. Therefore, you must enable thetftp-serveragain before configuring the switches.$ systemctl enable --now tftp-serverVerify tftp is active and enabled:$ systemctl status tftp-server $ ps -elf | grep tftp - Disable and verify the firewalId service
status:
$ systemctl stop firewalld $ systemctl disable firewalld $ systemctl status firewalldAfter completing the above steps, the ToR Switches will attempt to boot from the tftpboot files automatically.
- Unmount the Utility USB and remove it as follows:
umount /media/usb
Verification
- After the configuration of ToR switches, ping the switches from bootstrap
server. The switches mgmt0 interfaces are configured with the IP addresses
that are in the conf files.
Example to ping switch 1:
Note:
Wait for the device to respond.$ ping 192.168.2.1Sample output:PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data. 64 bytes from 192.168.2.1: icmp_seq=1 ttl=255 time=0.419 ms 64 bytes from 192.168.2.1: icmp_seq=2 ttl=255 time=0.496 ms 64 bytes from 192.168.2.1: icmp_seq=3 ttl=255 time=0.573 ms 64 bytes from 192.168.2.1: icmp_seq=4 ttl=255 time=0.535 ms ^C --- 192.168.2.1 ping statistics --- 4 packets transmitted, 4 received, 0% packet loss, time 3000ms rtt min/avg/max/mdev = 0.419/0.505/0.573/0.063 msExample to ping switch 2:
Sample output:$ ping 192.168.2.2PING 192.168.2.2 (192.168.2.2) 56(84) bytes of data. 64 bytes from 192.168.2.2: icmp_seq=1 ttl=255 time=0.572 ms 64 bytes from 192.168.2.2: icmp_seq=2 ttl=255 time=0.582 ms 64 bytes from 192.168.2.2: icmp_seq=3 ttl=255 time=0.466 ms 64 bytes from 192.168.2.2: icmp_seq=4 ttl=255 time=0.554 ms ^C --- 192.168.2.2 ping statistics --- 4 packets transmitted, 4 received, 0% packet loss, time 3001ms rtt min/avg/max/mdev = 0.466/0.543/0.582/0.051 ms - Attempt to SSH to the switches with the username and password provided in the
configuration
files.
$ ssh plat@192.168.2.1Sample output:The authenticity of host '192.168.2.1 (192.168.2.1)' can't be established. RSA key fingerprint is SHA256:jEPSMHRNg9vejiLcEvw5qprjgt+4ua9jucUBhktH520. RSA key fingerprint is MD5:02:66:3a:c6:81:65:20:2c:6e:cb:08:35:06:c6:72:ac. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.2.1' (RSA) to the list of known hosts. User Access Verification Password: Cisco Nexus Operating System (NX-OS) Software TAC support: http://www.cisco.com/tac Copyright (C) 2002-2019, Cisco and/or its affiliates. All rights reserved. The copyrights to certain works contained in this software are owned by other third parties and used and distributed under their own licenses, such as open source. This software is provided "as is," and unless otherwise stated, there is no warranty, express or implied, including but not limited to warranties of merchantability and fitness for a particular purpose. Certain components of this software are licensed under the GNU General Public License (GPL) version 2.0 or GNU General Public License (GPL) version 3.0 or the GNU Lesser General Public License (LGPL) Version 2.1 or Lesser General Public License (LGPL) Version 2.0. A copy of each such license is available at http://www.opensource.org/licenses/gpl-2.0.php and http://opensource.org/licenses/gpl-3.0.html and http://www.opensource.org/licenses/lgpl-2.1.php and http://www.gnu.org/licenses/old-licenses/library.txt. - Verify that the running-config contains all expected configurations in the conf
file using the
show running-configcommand as follows:
Sample output:$ show running-config!Command: show running-config !Running configuration last done at: Mon Apr 8 17:39:38 2019 !Time: Mon Apr 8 18:30:17 2019 version 9.2(3) Bios:version 07.64 hostname 12006-93108A vdc 12006-93108A id 1 limit-resource vlan minimum 16 maximum 4094 limit-resource vrf minimum 2 maximum 4096 limit-resource port-channel minimum 0 maximum 511 limit-resource u4route-mem minimum 248 maximum 248 limit-resource u6route-mem minimum 96 maximum 96 limit-resource m4route-mem minimum 58 maximum 58 limit-resource m6route-mem minimum 8 maximum 8 feature scp-server feature sftp-server cfs eth distribute feature ospf feature interface-vlan feature lacp feature vpc feature bfd feature vrrpv3 .... .... - In case some of the above features are missing, verify license on the switches
and at least NXOS_ADVANTAGE level license is in use. If the license is not
installed or too low level, contact the vendor for correct license key file.
Then run
write eraseandreloadto set back to factory default. The switches will go to POAP configuration again.# show licenseExample output:# show license MDS20190215085542979.lic: SERVER this_host ANY VENDOR cisco INCREMENT NXOS_ADVANTAGE_XF cisco 1.0 permanent uncounted \ VENDOR_STRING=<LIC_SOURCE>MDS_SWIFT</LIC_SOURCE><SKU>NXOS-AD-XF</SKU> \ HOSTID=VDH=FDO22412J2F \ NOTICE="<LicFileID>20190215085542979</LicFileID><LicLineID>1</LicLineID> \ <PAK></PAK>" SIGN=8CC8807E6918# show license usageExample output:# show license usage Feature Ins Lic Status Expiry Date Comments Count -------------------------------------------------------------------------------- ... NXOS_ADVANTAGE_M4 No - Unused - NXOS_ADVANTAGE_XF Yes - In use never - NXOS_ESSENTIALS_GF No - Unused - ... # - For Netra X8-2 server, reconnect the cable on mgmt ports to the Ethernet ports
for RMS1, delete mgmtBridge, and configure bond0 and management vlan
interface:
$ nmcli con delete con-name mgmtBridge $ nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad $ nmcli con add type bond-slave con-name bond0-slave-1 ifname eno2np0 master bond0 $ nmcli con add type bond-slave con-name bond0-slave-2 ifname eno3np1 master bond0The following commands are related to the VLAN and IP address for this bootstrap server, the<mgmt_vlan_id>is the same as inhosts.iniand the<bootstrap bond0 address>is same asansible_hostIP for this bootstrap server:$ nmcli con mod bond0 ipv4.method manual ipv4.addresses <bootstrap bond0 address> $ nmcli con add con-name bond0.<mgmt_vlan_id> type vlan id <mgmt_vlan_id> dev bond0 $ nmcli con mod bond0.<mgmt_vlan_id> ipv4.method manual ipv4.addresses <CNE_Management_IP_Address_With_Prefix> ipv4.gateway <ToRswitch_CNEManagementNet_VIP> $ nmcli con up bond0.<mgmt_vlan_id>For example:$ nmcli con mod bond0 ipv4.method manual ipv4.addresses 172.16.3.4/24 $ nmcli con add con-name bond0.4 type vlan id 4 dev bond0 $ nmcli con mod bond0.4 ipv4.method manual ipv4.addresses <CNE_Management_IP_Address_With_Prefix> ipv4.gateway <ToRswitch_CNEManagementNet_VIP> $ nmcli con up bond0.4 - Verify if the RMS1 can ping the CNE_Management VIP.
Sample output:$ ping <ToRSwitch_CNEManagementNet_VIP>PING <ToRSwitch_CNEManagementNet_VIP> (<ToRSwitch_CNEManagementNet_VIP>) 56(84) bytes of data. 64 bytes from <ToRSwitch_CNEManagementNet_VIP>: icmp_seq=2 ttl=255 time=1.15 ms 64 bytes from <ToRSwitch_CNEManagementNet_VIP>: icmp_seq=3 ttl=255 time=1.11 ms 64 bytes from <ToRSwitch_CNEManagementNet_VIP>: icmp_seq=4 ttl=255 time=1.23 ms ^C --- 10.75.207.129 ping statistics --- 4 packets transmitted, 3 received, 25% packet loss, time 3019ms rtt min/avg/max/mdev = 1.115/1.168/1.237/0.051 ms - Connect or enable customer uplink.
- Verify if the RMS1 can be accessed from laptop. Use application such as Putty
to ssh to RMS1.
Sample output:$ ssh root@<CNE_Management_IP_Address>Using username "root". root@<CNE_Management_IP_Address>'s password:<root password> Last login: Mon May 6 10:02:01 2019 from 10.75.9.171 [root@RMS1 ~]#
SNMP Trap Configuration
- SNMPv2c Configuration.
When SNMPv2c configuration is needed, ssh to the two switches and run the following commands:
These values
<SNMP_Trap_Receiver_Address>and<SNMP_Community_String>are from Installation Preflight Checklist.
Sample output:$ ssh <user_name>@<ToRswitchA_CNEManagementNet_IP># configure terminal (config)# snmp-server host <SNMP_Trap_Receiver_Address> traps version 2c <SNMP_Community_String> (config)# snmp-server host <SNMP_Trap_Receiver_Address> use-vrf default (config)# snmp-server host <SNMP_Trap_Receiver_Address> source-interface Ethernet1/51 (config)# snmp-server enable traps (config)# snmp-server community <SNMP_Community_String> group network-admin - To restrict the direct access to ToR switches, create IP access list and apply
on the uplink interfaces. Use the following commands on ToR switches:
Sample output:$ ssh <user_name>@<ToRswitchA_CNEManagementNet_IP># configure terminal (config)# ip access-list Restrict_Access_ToR permit ip {Allow_Access_Server}/32 any permit ip {NTPSERVER1}/32 {OAM_UPLINK_SwA_ADDRESS}/32 permit ip {NTPSERVER2}/32 {OAM_UPLINK_SwA_ADDRESS}/32 permit ip {NTPSERVER3}/32 {OAM_UPLINK_SwA_ADDRESS}/32 permit ip {NTPSERVER4}/32 {OAM_UPLINK_SwA_ADDRESS}/32 permit ip {NTPSERVER5}/32 {OAM_UPLINK_SwA_ADDRESS}/32 deny ip any {CNE_Management_VIP}/32 deny ip any {CNE_Management_SwA_Address}/32 deny ip any {CNE_Management_SwB_Address}/32 deny ip any {SQL_replication_VIP}/32 deny ip any {SQL_replication_SwA_Address}/32 deny ip any {SQL_replication_SwB_Address}/32 deny ip any {OAM_UPLINK_SwA_ADDRESS}/32 deny ip any {OAM_UPLINK_SwB_ADDRESS}/32 deny ip any {SIGNAL_UPLINK_SwA_ADDRESS}/32 deny ip any {SIGNAL_UPLINK_SwB_ADDRESS}/32 permit ip any any interface Ethernet1/51 ip access-group Restrict_Access_ToR in interface Ethernet1/52 ip access-group Restrict_Access_ToR in - Traffic egress out of cluster (including snmptrap traffic to SNMP trap
receiver) and traffic goes to signal server:
Sample output:$ ssh <user_name>@<ToRswitchA_CNEManagementNet_IP># configure terminal (config)# feature nat ip access-list host-snmptrap 10 permit udp 172.16.3.0/24 <snmp trap receiver>/32 eq snmptrap log ip access-list host-sigserver 10 permit ip 172.16.3.0/24 <signal server>/32 ip nat pool sig-pool 10.75.207.211 10.75.207.222 prefix-length 27 ip nat inside source list host-sigserver pool sig-pool overload add-route ip nat inside source list host-snmptrap interface Ethernet1/51 overload interface Vlan3 ip nat inside interface Ethernet1/51 ip nat outside interface Ethernet1/52 ip nat outside Run the same commands on ToR switchB
Configuring Addresses for RMS iLOs
Introduction
Note:
Skip this procedure if the iLO network is controlled by lab network or customer network that is beyond the ToR switches. The iLO network can be accessed from the bastion host management interface. Perform this procedure only if the iLO network is local on the ToR switches and iLO addresses are not configured on the servers.Prerequisites
Ensure that the procedure Configure Top of Rack 93180YC-EX Switches has been completed.
Limitations
All steps must be run from the SSH session of the Bootstrap server.
References
Procedure
Following is the procedure to configure addresses for RMS iLOs:
Setting up interface on bootstrap server and find iLO DHCP address- Setup the VLAN interface to access ILO subnet. The
ilo_vlan_idandilo_subnet_cidrare the same value as inhosts.ini:$ nmcli con add con-name bond0.<ilo_vlan_id> type vlan id <ilo_vlan_id> dev bond0 $ nmcli con mod bond0.<ilo_vlan_id> ipv4.method manual ipv4.addresses <unique ip in ilo subnet>/<ilo_subnet_cidr> $ nmcli con up bond0.<ilo_vlan_id>Example:
$ nmcli con add con-name bond0.2 type vlan id 2 dev bond0 $ nmcli con mod bond0.2 ipv4.method manual ipv4.addresses 192.168.20.11/24 $ nmcli con up bond0.2 - Subnet and conf file address.
The
/etc/dhcp/dhcpd.conffile is already configured as per the OCCNE Configure Top of Rack 93180YC-EX Switches procedure and DHCP started or enabled on the bootstrap server. The second subnet 192.168.20.0 is used to assign addresses for OA and RMS iLOs. - Display the DHCPD leases file at
/var/lib/dhcpd/dhcpd.leases. The DHCPD lease file displays the DHCP addresses for all RMS iLOs:
Sample output:$ cat /var/lib/dhcpd/dhcpd.leases# The format of this file is documented in the dhcpd.leases(5) manual page. # This lease file was written by isc-dhcp-4.2.5 ... lease 192.168.20.106 { starts 5 2019/03/29 18:10:04; ends 5 2019/03/29 21:10:04; cltt 5 2019/03/29 18:10:04; binding state active; next binding state free; rewind binding state free; hardware ethernet b8:83:03:47:5f:14; uid "\000\270\203\003G_\024\000\000\000"; client-hostname "ILO2M2909004B"; } lease 192.168.20.104 { starts 5 2019/03/29 18:10:35; ends 5 2019/03/29 21:10:35; cltt 5 2019/03/29 18:10:35; binding state active; next binding state free; rewind binding state free; hardware ethernet b8:83:03:47:64:9c; uid "\000\270\203\003Gd\234\000\000\000"; client-hostname "ILO2M2909004F"; } lease 192.168.20.105 { starts 5 2019/03/29 18:10:40; ends 5 2019/03/29 21:10:40; cltt 5 2019/03/29 18:10:40; binding state active; next binding state free; rewind binding state free; hardware ethernet b8:83:03:47:5e:54; uid "\000\270\203\003G^T\000\000\000"; client-hostname "ILO2M29090048";
- Access RMS iLO from the DHCP address with default Administrator password.
From the above
dhcpd.leasesfile. Find the IP address for the iLO name, the default username is Administrator, the password is on the label that can be pulled out from front of server.Note:
The DNS Name is on the pull-out label. Use the DNS Name on the pull-out label to match the physical machine with the iLO IP. The same default DNS Name from the pull-out label is displayed upon logging in to the iLO command line interface, as shown in the following example:
Sample output:$ ssh Administrator@192.168.20.104Administrator@192.168.20.104's password: User:Administrator logged-in to ILO2M2909004F.labs.nc.tekelec.com(192.168.20.104 / FE80::BA83:3FF:FE47:649C) iLO Standard 1.37 at Oct 25 2018 Server Name: Server Power: On - Create an RMS iLO new user with customized username and password.
</>hpiLO-> create /map1/accounts1 username=root password=TklcRoot group=admin,config,oemHP_rc,oemHP_power,oemHP_vm status=0 status_tag=COMMAND COMPLETED Tue Apr 2 20:08:30 2019 User added successfully. - Disable the DHCP before you are able to setup the static IP. The setup of
static IP failed before DHCP is
disabled.
</>hpiLO-> set /map1/dhcpendpt1 EnabledState=NO status=0 status_tag=COMMAND COMPLETED Tue Apr 2 20:04:53 2019 Network settings change applied. Settings change applied, iLO 5 will now be reset. Logged Out: It may take several minutes before you can log back in. CLI session stopped packet_write_wait: Connection to 192.168.20.104 port 22: Broken pipe - Setup RMS iLO static IP address.
After the previous step, log in back with the same address (which is static IP now), and enter new username and password. Go to next step to change the IP address, if required.
Sample output:$ ssh <new username>@192.168.20.104<new username>@192.168.20.104's password: <new password> User: logged-in to ILO2M2909004F.labs.nc.tekelec.com(192.168.20.104 / FE80::BA83:3FF:FE47:649C) iLO Standard 1.37 at Oct 25 2018 Server Name: Server Power: On </>hpiLO-> set /map1/enetport1/lanendpt1/ipendpt1 IPv4Address=192.168.20.122 SubnetMask=255.255.255.0 status=0 status_tag=COMMAND COMPLETED Tue Apr 2 20:22:23 2019 Network settings change applied. Settings change applied, iLO 5 will now be reset. Logged Out: It may take several minutes before you can log back in. CLI session stopped packet_write_wait: Connection to 192.168.20.104 port 22: Broken pipe - Setup RMS iLO default
gateway.
$ ssh <new username>@192.168.20.122Sample output:<new username>@192.168.20.122's password: <new password> User: logged-in to ILO2M2909004F.labs.nc.tekelec.com(192.168.20.104 / FE80::BA83:3FF:FE47:649C) iLO Standard 1.37 at Oct 25 2018 Server Name: Server Power: On </>hpiLO-> set /map1/gateway1 AccessInfo=192.168.20.1 status=0 status_tag=COMMAND COMPLETED Fri Oct 8 16:10:27 2021 Network settings change applied. Settings change applied, iLO will now be reset. Logged Out: It may take several minutes before you can log back in. CLI session stopped Received disconnect from 192.168.20.122 port 22:11: Client Disconnect Disconnected from 192.168.20.122 port 22
- Access RMS iLO from the DHCP address with default root password. From the
above
dhcpd.leasesfile, find the IP address for the iLO name. The default username is root and the password is changeme. At the same time, note the DNS Name on the pull-out label.Note:
The DNS Name is on the pull-out label. Use the DNS Name on the pull-out label to match the physical machine with the iLO IP. The same default DNS Name from the pull-out label is displayed upon logging in to the iLO command line interface, as shown in the following example:Using username "root". Using keyboard-interactive authentication. Password: Oracle(R) Integrated Lights Out Manager Version 5.0.1.28 r140682 Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved. Warning: password is set to factory default. Warning: HTTPS certificate is set to factory default. Hostname: ORACLESP-2117XLB00V - Netra server has the default root user. To change the default password, run the
following set
command:
-> set /SP/users/root password Enter new password: ******** Enter new password again: ******** - Create an RMS iLO new user with customized username and
password:
-> create /SP/users/<username> Creating user... Enter new password: **** create: Non compliant password. Password length must be between 8 and 16 characters. Enter new password: ******** Enter new password again: ******** Created /SP/users/<username> - Setup RMS iLO static IP address.
After the previous step, log in with the same address (which is a static IP now), and the new username and password. If not using the same address, go to next step to change the IP address:
- Check the current state before
change:
# show /SP/network - Set command to
configure:
# set /SP/network state=enabled|disabled ipdiscovery=static|dhcp ipaddress=value ipgateway=value ipnetmask=valueExample command:# set /SP/network state=enabled ipdiscovery=static ipaddress=172.16.9.13 ipgateway=172.16.9.1 ipnetmask=255.255.255.0
- Check the current state before
change:
- Commit the changes to implement the updates
performed:
# set /SP/network commitpending=true
Generating SSH Key on Oracle Servers with Oracle ILOM
This section provides the procedures to generate a new SSH key on an Oracle X8-2 and Oracle X9-2 server using the Oracle Integrated Lights Out Manager (ILOM) web interface. The new SSH key is created at the service level and has a length of 3072 bits. It is automatically managed by the firmware.
- Oracle ILOM: Version 5.1.3.20, revision 153596
- BIOS (Basic Input/Output System): Version 51.11.02.00
Oracle X9-2 server is compatible with firmware 5.1. For more details, contact My Oracle Support.
Before craeting a new SSH key, ensure that you have the necessary access permissions to log in to the Oracle ILOM web interface.
- Open a web browser and access the Oracle ILOM user interface by entering the corresponding IP address in the address bar.
- Enter your login credentials for Oracle ILOM.
- Perform the following steps to generate the SSH key:
- Navigate to the SSH or security configuration section in the following path: ILOM Administration → Managament Access → SSH Server
- Click Generate Key to generate a
new SSH key.
The system generates a new SSH key of 3072 bits length.
- Run the following command on the CLI to validate the
generated
key:
-> show -d properties /SP/services/ssh/keys/rsaSample output:/SP/services/ssh/keys/rsa Properties: fingerprint = 53:66:65:85:45:ba:4e:63:2d:aa:ab:8b:ef:fa:95:ac:9e:17:8e:92 fingerprint_algorithm = SHA1 length = 3072Note:
- The length of the SSH key is managed by the firmware and set to 3072 bits. There are no options to configure it to 1024 or 2048 bits.
- Ensure that the client's configuration is compatible with 3072 bit SSH keys.
- After making the changes to SSH keys or user configuration in the ILOM web interface, log out from Oracle ILOM and then log back in. This applies the changes without having to restart the entire ILOM.
- [Optional]: You can also restart ILOM using the ILOM
command line interface by running the following command. This
command applies any configuration changes that you've made and
initiates a restart of the
ILOM:
-> reset /SP
Use the following properties and commands in the Oracle ILOM CLI to configure and manage SSH settings on the X9-2 server. Refer to the specific documentation for Oracle ILOM version 5.1 on the X9-2 server for any updates or changes to the commands (https://docs.oracle.com/en/servers/management/ilom/5.1/admin-guide/modifying-default-management-access-configuration-properties.html#GUID-073D4AA6-E5CC-45B5-9CF4-28D60B56B548).
- CLI path:
/SP/services/ssh - Web path: ILOM Administration > Management Access > SSH Server > SSH Server Settings
- User Role:
admin(a). Required for all property modifications.
Table 2-4 SSH Configuration Properties
| Property | Description |
|---|---|
| State |
Parameter: Description: Determines whether the SSH server is enabled or disabled. When enabled, the SSH server uses the server side keys to allow remote clients to securely connect to the Oracle ILOM SP using a command-line interface. On disabling or restarting, the SSH server automatically terminates all connected SP CLI sessions over SSH. Default Value: Enabled CLI
Syntax:
Note: If you are using a web interface, the changes you made to the SSH Server State in the web interface takes effect in Oracle ILOM only after clicking Save. Restarting the SSH server is not required for this property. |
| Restart Button |
Parameter:
Description: This property allows you to restart the SSH server by terminating all connected SP CLI sessions and activating the newly generated server-side keys. Default Value: NA Available Options: CLI
Syntax:
|
| Generate RSA Key Button |
Parameter:
Description: This property provides the ability to generate a new RSA SSH key. This action is used for creating a new key pair for SSH authentication. Default Value: NA CLI
Syntax:
|
Note:
- Periodic firmware updates for Oracle ILOM are crucial. Regularly check for updates to access the new features, improvements, or security enhancements in the Firmware Downloads and Release History for Oracle Systems page.
- Verify that the clients connecting to the Oracle X8-2 server support 3072 bit SSH keys.
- For detailed information about SSH key generation and management in your specific environment, refer to the official Oracle ILOM documentation.
Installing Bastion Host
This section describes the use of Installer Bootstrap Host to provision RMS2 with an operating system and creating VM guest to fulfill the role of Bastion Host. After the Bastion Host is created, it is used to complete the installation of CNE.
Provisioning Second Kubernetes Host (RMS2) from Installer Bootstrap Host (RMS1)
Table 2-5 Terminology used in Procedure
| Name | Description |
|---|---|
| bastion_full_name | This is the full name of the Bastion Host as defined in the hosts.ini
file.
Example: bastion-2.rainbow.lab.us.oracle.com |
| bastion_kvm_host_full_name | This is the full name of the KVM server (usually RMS2/db-2) that hosts
the Bastion Host VM.
Example: k8s-host-2.rainbow.lab.us.oracle.com |
| bastion_short_name | This is the name of the Bastion Host derived
from the bastion_full_name up to the first ".".
Example: bastion-2 |
| bastion_external_ip_address | This is the external address for the Bastion Host
Example: 10.75.148.5 for bastion-2 |
| bastion_ip_address |
This is the internal IPv4 "ansible_host" address of the Bastion Host as defined within the hosts.ini file. Example: 172.16.3.100 for bastion-2 |
| cluster_full_name | This is the name of the cluster as defined in the hosts.ini file field:
occne_cluster_name.
Example: rainbow.us.lab.us.oracle.com |
| cluster_short_name | This is the short name of the cluster derived from the
cluster_full_name up to the first ".".
Note: Following are the specifications for cluster_short_name value:
Example: rainbow |
Note:
- Setup the Bootstrap Host to use
root/<customer_specific_root_password>as the credentials to access it. For the procedure to configure the user and password, see Installation of Oracle Linux X.X on Bootstrap Host. - The commands and examples in this procedure assume that Bastion Host is installed on Oracle Linux 9. The procedure vary for other versions.
Procedure
- Run the following commands to create a user
(
<user-name>) and edit thesudoersfile forno-passwordsudo.Note:
Skip this step and proceed to the next step if you are installing CNE on servers other than HP Gen10 and Oracle X as the user creation is already taken care in the Prerequisites for Servers Other than HP and Oracle X.$ groupadd <user-name> $ useradd -g <user-name> <user-name> $ passwd <user-name> <Enter new password twice> $ usermod -aG wheel <user-name> $ echo "%<user-name> ALL=(ALL) NOPASSWD: ALL" | tee -a /etc/sudoers - Log in as
<user-name>with the newly created password and perform the following steps in this procedure as a<user-name>. - Set the
cluster_short_namefor use in the bootstrap environment, and load it into the current environment. Enter the user name when prompted.$ echo 'export OCCNE_CLUSTER=<cluster_short_name>' | sudo tee -a /etc/profile.d/occne.sh $ echo 'export OCCNE_USER=<user-name>' | sudo tee -a /etc/profile.d/occne.sh $ source /etc/profile.d/occne.shExample:$ echo 'export OCCNE_CLUSTER=rainbow' | sudo tee -a /etc/profile.d/occne.sh $ echo 'export OCCNE_USER=admusr' | sudo tee -a /etc/profile.d/occne.sh $ source /etc/profile.d/occne.sh
Note:
After running this step, the bash variable references such as${OCCNE_CLUSTER}expands to thecluster_short_namein this shell and subsequent ones. - Configure the central repository access on Bootstrap by
performing the following steps:
- Mount the Utility USB. For information about mounting a USB in Linux, see Installation of Oracle Linux X.X on Bootstrap Host.
- Create the cluster specific
directory:
$ sudo mkdir -p -m 0750 /var/occne/cluster/${OCCNE_CLUSTER}/ $ sudo chown -R ${OCCNE_USER}:${OCCNE_USER} /var/occne/ - Configure the central repository access for Bootstrap by performing the Configuring Central Repository Access on Bootstrap procedure.
- Copy the OLx ISO to the Installer Bootstrap Host:
The
isofile must be accessible from a Customer Site Specific repository. This file must be accessible because the ToR switch configurations were completed in procedure: Configure Top of Rack 93180YC-EX SwitchesCreate the
/var/occne/os/directory and copy the OLX ISO file to the directory. The following example usesOracleLinux-R9-U5-x86_64-dvd.iso. If this file was copied to the utility USB, it can be copied from the utility USB into the same directory on the Bootstrap Host.Note:
If the user copies this ISO from their laptop then they must use an application like WinSCP pointing to the Management Interface IP.$ mkdir /var/occne/os $ scp <usr>@<site_specific_address>:/<path_to_iso>/OracleLinux-R9-U5-x86_64-dvd.iso /var/occne/os/ - Install packages onto the Installer Bootstrap Host: Use DNF to
install
podman,httpd, andsshpassonto the installer Bootstrap Host:$ sudo dnf install -y podman httpd sshpass - Setup HTTPD on the Installer Bootstrap Host: Run the following commands to
mount ISO and enable httpd service.
Note:
Before running the following commands, ensure thathttpdis already installed in step 7 and the OLX ISO file is namedOracleLinux-R9-UX-x86_64-dvd.iso.$ sudo mkdir -p -m 0755 /var/www/html/occne/pxe $ sudo mkdir -p -m 0755 /var/www/html/os/OL9 $ sudo mount -t iso9660 -o loop /var/occne/os/OracleLinux-R9-UX-x86_64-dvd.iso /var/www/html/os/OL9 $ sudo systemctl enable --now httpd - Disable SELINUX:
- Set SELINUX to permissive mode. To successfully set the
SELINUX mode, a reboot of the system is required. The
getenforcecommand is used to determine the status of SELINUX.$ getenforce Enforcing - If the output of this command does not display
PermissiveorDisabled, change it toPermissiveorDisabledby running the following command (This step must be redone if the system reboots before the installation process completes):$ sudo setenforce 0
- Set SELINUX to permissive mode. To successfully set the
SELINUX mode, a reboot of the system is required. The
- Run the following commands on Bootstrap Host to generate the
SSH private and public keys on Bootstrap Host. These keys are passed to the
Bastion Host and used to communicate to other nodes from that Bastion Host.
Note:
- Do not supply a passphrase when the system asks for one and click Enter.
- The private key (
occne_id_rsa) must be copied to a server that is going to access the Bastion Host because the Bootstrap Host is repaved. This key is used later in the procedure to access the Bastion Host after it has been created. The user can also use an SSH client like Putty (keyGen and Pagent) to generate their own key pair and place the public key into the Bastion Hostauthorized_keysfile to provide access. Pagent can also be used to convert the occne_id_rsa .pem key format to .ppk format using putty to access the Bastion Host.
If you are installing CNE on HP Gen10 or Oracle X, then the public key is passed to each node during OS installation. However, if you are installing CNE on other servers, copy the public key to the rest of the cluster nodes that already have an OS installed by performing the following step.$ mkdir -p -m 0700 /var/occne/cluster/${OCCNE_CLUSTER}/.ssh $ ssh-keygen -b 4096 -t rsa -C "occne installer key" -f "/var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa" -q -N "" $ mkdir -p -m 0700 ~/.ssh $ cp /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa ~/.ssh/id_rsa - If you are installing CNE on servers other than HP Gen10 or Oracle X, ensure
that you are able to run SSH commands on other node from the bootstrap
successfully without typing in passwords for SSH login or
sudoaccess:Note:
Skip this step if you are installing CNE on HP Gen10 or Oracle X.for NODE_ADDR in ${IP1} ${IP2} ... ; do echo Copy Key to $NODE_ADDR ssh $NODE_ADDR "sudo hostname" done
Performing Automated Installation
This section details the steps to run the automated configuration of the Bastion Host VM.
- Setting up and running the
deploy.shscript on the Bootstrap Host. - Accessing the Bastion Host and implementing the
final commands to run the
pipeline.shscript to complete the Bastion Host configuration and deploy the CNE cluster.
- Set up and run the
deploy.shscript on the Bootstrap Host:The deploy.sh script performs the initial configuration of the Bastion host. This includes installing the OS on the bastion and its kvm-host, populating the Bastion with repositories, and verifying that everything is up to date. The script is run on the Bootstrap Host using a set of environment variables that can be initialized on the command line along with thedeploy.shscript. These variables include the following:Table 2-6 Environmental Variables
Name Comment Example usage OCCNE_BASTION The full name of Bastion Host OCCNE_BASTION=bastion-2.rainbow.us.labs.oracle.com OCCNE_VERSION The version tag of the image releases OCCNE_VERSION=25.2.1xx - Copy necessary files from CNE provision container and configure
deployment:
Note:
If you want to save any of the files after configuration for future use, then copy those files from Bootstrap Host.- Run the following command to set the CNE
version:
$ export OCCNE_VERSION=25.2.1xx - Depending on the type of Load Balancer (MetalLB or
CNLB) you want to use for traffic segregation, use one of the
following command to run the podman to copy all necessary files to
configure and run BareMetal deployment:
Note:
Ensure that you copy and run the command as such with out making any changes.- Run the following command for
MetalLB:
$ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in config_files/. scripts/. ../common/scripts/. ../../common/scripts/. ; do cp -r /platform/bare_metal/metallb/"$source" /host; done' - There are two sets of files that must be copied to the Bootstrap Host.
These include the common scripts and those needed that are
contained within the CNLB installer:
Run the following commands for CNLB:
- Files copied from the provision
container:
$ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in switch_install cluster/. templates/. scripts/. ../common/scripts/. ../../common/scripts/. ../common/config_files/. ; do cp -r /platform/bare_metal/cnlb/"$source" /host; done' - Files copied from the CNLB installer
container:
$ mkdir /var/occne/cluster/${OCCNE_CLUSTER}/installer $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/cnlb_installer:${OCCNE_VERSION} -c 'for source in installer/validateCnlbIni.py installer/cnlb_logger.py installer/utils.py; do cp -r "$source" /host/installer/; done'
- Files copied from the provision
container:
- Run the following command for
MetalLB:
- Copy the
hosts_sample.iniorhosts_sample_remoteilo.inifile to the cluster directory ashosts.inifile and configure the file by performing the CNE Inventory File Preparation procedure.$ cp /var/occne/cluster/${OCCNE_CLUSTER}/hosts_sample.ini /var/occne/cluster/${OCCNE_CLUSTER}/hosts.iniNote:
If you are installing CNE on servers other than HP Gen10 or Oracle X, when you are configuringhosts.ini, ensure that all systems that already have an OS preprovisioned have thepre_provisionedattribute set to True appended to their inventory declaration line.For example:[host_baremetal] k8s-host-1.airraid.lab.us.oracle.com ansible_host=172.16.3.4 oam_host=10.148.217.4 pre_provisioned=True k8s-host-2.airraid.lab.us.oracle.com ansible_host=172.16.3.5 oam_host=10.148.217.5 pre_provisioned=True k8s-host-3.airraid.lab.us.oracle.com ansible_host=172.16.3.6 pre_provisioned=True - Verify the
occne_repo_host_addressfiled in thehosts.inifile and check if thebond0IP address is configured as per Configuring Top of Rack 93180YC-EX Switches. If not, modify the file to configure the required value:$ vi /var/occne/cluster/${OCCNE_CLUSTER}/hosts.iniSample output:occne_repo_host_address = <bootstrap server bond0 address>
- Run the following command to set the CNE
version:
- Depending on the type of Load Balancer (MetalLB or CNLB) you
want to use for traffic segregation, use one of the following steps to
configure the MetalLB or CNLB configuration file:
- If you want to use MetalLB for traffic segregation, configure the MetalLB configuration file by performing the Populate the MetalLB Configuration File procedure.
- If you want to use CNLB for traffic segregation,
configure the CNLB configuration file by performing the following
steps:
- Copy the
cnlb.ini.templatefile to the cluster directory ascnlb.ini:$ cp /var/occne/cluster/${OCCNE_CLUSTER}/cnlb.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/cnlb.ini - Configure CNLB by performing the Configuring Cloud Native Load Balancer (CNLB) procedure.
- Copy the
- Run the
deploy.shscript from the cluster directory by setting the required parameters:$ cd /var/occne/cluster/${OCCNE_CLUSTER} $ export OCCNE_BASTION=<bastion_full_name> $ ./deploy.shFor example:$ cd /var/occne/cluster/${OCCNE_CLUSTER} $ export OCCNE_BASTION=bastion-2.rainbow.lab.us.oracle.com $ ./deploy.shNote:
The release version defaults to the current GA release. If you want to use a different version, specify the version by adding theOCCNE_VERSION=<release>variable to the command line. - Run the following commands from the Bastion Host to complete
the Bastion Host configuration and deploy CNE on the BareMetal system.
Note:
The Bootstrap Host cannot be used to access the Bastion Host as it is repaved from running this command.- Log in to the Bastion Host
as
<user-name>. - Use the private key that was saved earlier to access
the Bastion Host from a server other than the Bootstrap
Host.
$ ssh <user-name>@<bastion_external_ip_address> - Copy this private key to the
/home/<user-name>/.sshdirectory on that server asid_rsausing SCP or winSCP from a desktop PC. Set the permissions of the key to 0600 using the following command:chmod 0600 ~/.ssh/id_rsa - Customize the common services and cnDBTier installation if required by using the Common Installation Configuration section.
- Run the following command to complete the deployment of
CNE from the Bastion Host (excluding re-install on the Bastion Host
and its KVM host, which are already setup). This action will repave
the Bootstrap Host
RMS.
$ /var/occne/cluster/${OCCNE_CLUSTER}/artifacts/pipeline.sh
Note:
The release version defaults to the current GA release. If you want to use a different version, specify the version by adding theOCCNE_VERSION=<release>variable to the command line. - Log in to the Bastion Host
as
Installing BareMetal CNE using Bare Minimum Servers
This section provides the procedure to configure and install CNE on a BareMetal deployment using bare minimum servers (three worker nodes).
Prerequisites
Before performing this procedure, ensure that you meet the following prerequisites:- This procedure must be followed only for a fresh installation of BareMetal CNE with minimal resources.
- Ensure that you have performed all the Preinstallation Tasks.
- Ensure that you have read and understood all the instructions provided in the BareMetal Installation section.
- Ensure that you have configured the worker nodes
with at least the following resources:
- vCPU: 32 GB
- RAM: 65 GB
- Disk: 80 GB
- Ensure that you have configured the controller nodes
with at least the following resources:
- vCPU: 2 GB
- RAM: 7.5 GB
- Disk: 40 GB
Procedure
- Run the following command to edit the
hosts.inifile:$ vi /var/occne/cluster/${OCCNE_CLUSTER}/hosts.ini - Add the
opensearch_data_replicas_countvariable under the[occne:vars]section and set the value to 3. The value indicates the number of controller and worker nodes:... [occne:vars] ... opensearch_data_replicas_count=3 [openstack:vars] ... - Follow the Baremetal installation procedure and run the
deploy.shscript when indicated.
Virtualized CNE Installation
Note:
Before installing CNE on vCNE, you must complete the preinstallation tasks.Installing vCNE on OpenStack Environment
- Custom Configurable Volumes (CCVs) as block devices for each VM resource: The Custom Configurable Volumes (CCVs) allows the customers to configure the volume cinder type on their OpenStack cloud. These configuration can be used to create the hard disk for each VM resource.
- Non-Configureable volumes that are created by Openstack using the flavors assigned to each VM resource. This is a standard configuration.
- Load Balancer VM (LBVM)
- Cloud Native Load Balancer (CNLB)
Prerequisites
Before installing and configuring CNE on OpenStack, ensure that the following prerequisites are met.
- You must have access to an existing OpenStack environment and OpenStack Dashboard (web-based user interface).
- Ensure that the Nova, Neutron, and Cinder modules are configured. The OpenStack environment uses these modules to manage compute, networking, and storage, respectively.
- Ensure that the OpenStack environment is configured with appropriate resource flavors and network resources for resource allocation to the VMs.
- The DHCP Enabled value for OpenStack subnet in each MetalLB pool must be set as "Yes".
- Ensure that all the required images, binaries, and files are downloaded from the Oracle OSDC before running this procedure. Ensure that these resources are available for use in this procedure. For instructions on how to generate the lists of images and binaries, see the Artifact Acquisition and Hosting chapter.
- You must obtain a public key (that
can be configured) for logging into the Bootstrap Host. Before running this
procedure, you must place the public key into the customer's OpenStack
Environment as follows:
Use the Import Key tab on the Launch Instance→Key Pair dialog or via the Compute→Access and Security screen.
- Ensure that there is a default or custom security group which allows the Bootstrap instance to reach the central repository.
Expectations
- You must be familiar with the use of OpenStack as a
virtualized provider including the use of the OpenStack Client and OpenStack
Dashboard.
Note:
The Bootstrap host doesn't provide OpenStack CLI tools. Therefore, you must use OpenStack Dashboard or an external Linux instance with access to OpenStack CLI to fetch the data or values from OpenStack. - You must make a central repository available for all resources like images, binaries, and helm charts before running this procedure.
- The default user on installation is "cloud-user" and this is still
recommended. However, starting 23.4.0, it is possible to define a different
default user. This user will be used to access the VMs, configure, run tasks,
and manage the cluster.
Note:
You can change the default user only during installation and not during an upgrade. When you change the default user, ensure that you use the changed user to run all the commands in the entire procedure. Refrain from using the root user. - You must define all the necessary networking protocols, such as using fixed IPs or floating IPs, for use on the OpenStack provider.
- If you select Custom Configurable Volumes (CCVs), the size defined for each volume must match the size of the disk as defined in the flavor used for the given VM resource.
- When using CCV, you must be fully aware of the volume storage used. If there is insufficient volume storage on the cloud on which CNE is deployed, the deployment will fail while applying the Terraform (MetalLB) or OpenTofu (CNLB).
Downloading OLX Image
Download the OLX image by following the procedure
in the Downloading Oracle Linux section.
Note:
The letter 'X' in Oracle Linux X or OLX in this procedure indicates the latest version of Oracle Linux supported by CNE.Uploading Oracle Linux X to OpenStack
This procedure describes the process to upload the qcow2 image, that is obtained using the Downloading OLX Image procedure, to an OpenStack environment.
Note:
Run this procedure from the OpenStack Dashboard.- Log in to OpenStack Dashboard using your credentials.
- Select Compute → Images.
- Click the +Create Image button. This displays the OpenStack Create Image dialog.
- In the OpenStack Create Image dialog, enter a name for
the image.
Use a name similar to the name of the qcow2 image at the time of download. It's recommended to include at least the OS version and the update version as part of the name. For example: ol9u5.
- Under Image Source, select File.
This enables the File* → Browse button to search for the
image file.
Click the File* → Browse button to display the Windows Explorer dialog.
- From the Windows dialog, select the qcow2 image that is
downloaded in the Downloading OLX Image section. This inserts the file name and set the
Format option to QCOW2 - QEMU
Emulator automatically.
Note:
If Format isn't set automatically, use the drop-down to select QCOW2 - QEMU Emulator. - Retain the default values for the other options. However, you can adjust the Visibility and Protected options according to your requirement.
- Click the Create Image button at the bottom right corner
of the dialog. This starts the process to upload image.
It takes a while for the system to complete uploading the image. During the upload process, the system doesn't display any progress bar or final confirmation.
- Navigate to Compute → Images to verify the uploaded image.
Creating Bootstrap Host Using OpenStack Dashboard
- Creating Bootstrap Host Using Non-Configurable volumes: These non-configurable volumes are created by Openstack using the flavors assigned to each VM resource (standard configuration).
- Creating Bootstrap Host Using Custom Configurable Volumes (CCV): The CCVs are created as block devices for each VM resource. This allows you to configure the volume cinder type on your Openstack environment with the configuration that is used for creating the hard disk for each VM resource.
Note:
- A separate bootstrap image (qcow2) isn't required and not provided as part of the artifacts. The Bootstrap VM is created as an instance of a regular base OS image, similar to the other VM instances on the cluster.
- Use the following examples for reference only. The actual values differ from the example values.
- Perform the following procedures manually on the customer specific OpenStack environment.
Creating Bootstrap Host Using Nonconfigurable Volumes
Note:
These tools are installed as part of the previous step and are not bundled along with the base image.
- Log in to the OpenStack Dashboard using your credentials.
- Select Compute→Instances.
- Select the Launch Instance button on the upper right. A dialog box appears to configure a VM instance.
- In the dialog box, enter a VM instance name. For example,
occne-<cluster-name>-bootstrap. Retain the Availability Zone and Count values as is.
- Perform the following steps to select Source from the left pane:
Note:
There can be a long list of available images to choose from. Ensure that you choose the correct image.
- Ensure that the Select Boot Source drop-down is set to Image.
- Enter the OLX image name that you created using the Uploading Oracle Linux X to OpenStack procedure. You can also use the Available search filter to search for the required image. For example, ol9u5.
- Enter occne-bootstrap in the Available filter. This displays the occne-bootstrap-<x.y.z> image uploaded earlier.
Note:
Do not use a Bootstrap image from any earlier versions of CNE.
- Select the OLX image by clicking "↑" on the right side of the image listing. This adds the image as the source for the current VM.
- Perform the following steps to select Flavor from the left pane:
- Enter a string (not case-sensitive) which best describes the flavor that is used for this customer specific OpenStack Environment in the Available search filter. This reduces the list of possible choices.
- Select the appropriate customer specific flavor (for example,
OCCNE-Bootstrap-host) by clicking "↑" on the right side of the
flavor listings. This adds the resources to the Launch Instance
dialog.
Note:
The Bootstrap requires a flavor with a disk size of 40GB or higher and a RAM size of 8GB or higher.
- Perform the following steps to select Networks from the left pane:
- Enter the appropriate network name as defined by the
customer with the OpenStack Environment (for example,
ext-net) in the Available search filter. This reduces the list of possible choices.
- Select the appropriate network by clicking "↑" on the right side of the network listings. This adds the external network interface to the Launch Instance dialog.
- Perform the following step to select Key Pair from the left pane. This
dialog assumes you have already uploaded a public key to OpenStack. For more
information, see Prerequisites:
- Choose the appropriate key by clicking "↑" on the
right side of the key pair listings. This adds the public key to the
authorized_keysfile on the Bootstrap Host.
- Select Configuration from the left pane. This screen allows you to add
configuration data that is used by
cloud-init to set the initial username on the VM, and hostname or FQDN additions to the /etc/hosts file.
Copy the following configuration into the Customization Script text box:
Note:
- Ensure that the fields marked as
<instance_name_from_details_screen> are updated with the instance name provided as per step 4 in this procedure.
- Ensure that the <user-name> field is updated. The recommended value for this field is "cloud-user".
#cloud-config
hostname: <instance_name_from_details_screen>
fqdn: <instance_name_from_details_screen>
system_info:
  default_user:
    name: <user-name>
    lock_passwd: false
write_files:
  - content: |
      127.0.0.1 localhost localhost4 localhost4.localdomain4 <instance_name_from_details_screen>
      ::1 localhost localhost6 localhost6.localdomain6 <instance_name_from_details_screen>
    path: /etc/hosts
    owner: root:root
    permissions: '0644'
- Select Launch Instance at the bottom right of the Launch Instance window. This initiates the creation of the VM. After the VM creation process is complete, you can see the VM instance on the Compute→Instances screen.
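For illustration only, the following shows the customization script from the Configuration step above with hypothetical values substituted (instance name occne-mycluster-bootstrap and the recommended cloud-user user); your actual instance name and user differ:
#cloud-config
hostname: occne-mycluster-bootstrap
fqdn: occne-mycluster-bootstrap
system_info:
  default_user:
    name: cloud-user
    lock_passwd: false
write_files:
  - content: |
      127.0.0.1 localhost localhost4 localhost4.localdomain4 occne-mycluster-bootstrap
      ::1 localhost localhost6 localhost6.localdomain6 occne-mycluster-bootstrap
    path: /etc/hosts
    owner: root:root
    permissions: '0644'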
Creating Bootstrap Host Using Custom Configurable Volumes (CCV)
The CCV deployment includes additional steps to create the volume for the Bootstrap host prior to creating the Bootstrap host instance.
Note:
The Bootstrap host drives the creation of the virtualized cluster using Terraform (MetalLB) or OpenTofu (CNLB), the OpenStack Client, and Ansible Playbooks.
To create the Bootstrap host using Custom Configurable Volumes (CCV), perform the following steps manually on the customer-specific OpenStack environment:
Note:
Make sure you know the volume size defined by the flavor that is used to create the Bootstrap host. This information is required to create the Bootstrap host volume.
- Log in to the OpenStack Dashboard using your credentials.
- Select Compute→Volumes.
- Select the + Create Volume button on the top right.
The system displays a dialog box to configure the volume.
- Enter the volume name. Example:
occne-bootstrap-host.
- Enter a description. Example: Customer Configurable Volume for the Bootstrap host.
- From the Volume Source drop-down list, select
image.
- Select the OLX image that you created using the Uploading Oracle Linux X to OpenStack procedure. For example, ol9u5.
Note:
Do not use a Bootstrap image from any earlier versions of CNE. - From the Type drop-down list, select the image type. Example: nfs or as configured on the cloud.
- In the Size (GiB) field, enter the size of the image. This size must match the size defined for the Bootstrap host flavor that is used.
- From the Availability Zone drop-down list, select an availability zone.
- From the Group drop-down list, select
No group. - Click Create Volume.
The system creates the Custom Configurable Volume (CCV) that is used to create the Bootstrap host VM.
- Log in to the OpenStack Environment using your credentials.
- Select Compute→Instances.
- Select the Launch Instance button on the top right. A dialog box appears to configure a VM instance.
- In the dialog box, enter a VM instance name.
For example,
occne-<cluster-name>-bootstrap. Retain the Availability Zone and Count values as is.
- Perform the following steps to select Source from the
left pane:
Note:
There can be a long list of available images to choose from. Ensure that you choose the correct image.
- Ensure that the Select Boot Source drop-down is set to Volume.
- Ensure that the Delete Volume on Instance Delete is set to Yes.
- Enter the volume name in the Available search filter. The system displays the volume that you created in the previous section. For example, occne-bootstrap-host.
- Select the volume by clicking "↑" on the right side of the volume listing. This adds the volume as the source for the current VM.
- Perform the following steps to select Flavor from the
left pane:
- Enter a string (not case-sensitive) which best describes the flavor that is used for this customer specific OpenStack Environment in the Available search filter. This reduces the list of possible choices.
- Select the appropriate customer specific flavor (for example,
OCCNE-Bootstrap-host) by clicking "↑" on the right side of
the flavor listings. This adds the resources to the Launch
Instance dialog.
Note:
The Bootstrap requires a flavor with a disk size of 40GB or higher and a RAM size of 8GB or higher.
- Perform the following steps to select Networks from the
left pane:
- Enter the appropriate network name as defined by the
customer with the OpenStack Environment (for example,
ext-net) in the Available search filter. This reduces the list of possible choices.
- Select the appropriate network by clicking "↑" on the right side of the network listings. This adds the external network interface to the Launch Instance dialog.
- Perform the following step to select Key Pair from the left pane. This
dialog assumes that you have already uploaded a public key to OpenStack. For
more information, see Prerequisites:
- Choose the appropriate key by clicking "↑" on
the right side of the key pair listings. This adds the public key to
the
authorized_keysfile on the Bootstrap host.
- Select Configuration from the left pane. This screen allows you to add
configuration data that is used by
cloud-init to set the initial username on the VM, and hostname or FQDN additions to the /etc/hosts file.
Copy the following configuration into the Customization Script text box:
Note:
- Ensure that the fields marked as
<instance_name_from_details_screen> are updated with the instance name provided as per step 4 in this procedure.
- Ensure that the <user-name> field is updated. The recommended value for this field is "cloud-user".
#cloud-config
hostname: <instance_name_from_details_screen>
fqdn: <instance_name_from_details_screen>
system_info:
  default_user:
    name: <user-name>
    lock_passwd: false
write_files:
  - content: |
      127.0.0.1 localhost localhost4 localhost4.localdomain4 <instance_name_from_details_screen>
      ::1 localhost localhost6 localhost6.localdomain6 <instance_name_from_details_screen>
    path: /etc/hosts
    owner: root:root
    permissions: '0644'
- Select Launch Instance at the bottom right of the Launch Instance window. This initiates the creation of the VM. After the VM creation process is complete, you can see the VM instance on the Compute→Instances screen.
Predeployment Configuration for OpenStack
Note:
Run all the commands in this section from the Bootstrap host.
Logging in to Bootstrap VM
Use SSH to log in to Bootstrap using the private key uploaded to OpenStack. For more information about the private key, see Prerequisites.
$ ssh -i $BOOTSTRAP_PRIVATE_KEY $USER@$BOOTSTRAP_EXT_IP
The values used in the example are for reference only. You must obtain the
Bootstrap external IP from Compute -> Instances on the
OpenStack Dashboard. The $USER parameter is the same as <user-name>.
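As a purely hypothetical example, assuming the Bootstrap external IP is 10.75.200.10, the private key is stored locally as ~/occne_key.pem, and the recommended cloud-user user was configured, the login command resembles:
$ ssh -i ~/occne_key.pem cloud-user@10.75.200.10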
Setting the Cluster Short-Name and User Variables
- Use the following commands for LBVM:
$ echo 'export OCCNE_CLUSTER=<cluster_short_name>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_USER=<user-name>' | sudo tee -a /etc/profile.d/occne.sh
$ . /etc/profile.d/occne.sh
For example:
$ echo 'export OCCNE_CLUSTER=occne1-rainbow' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_USER=cloud-user' | sudo tee -a /etc/profile.d/occne.sh
$ . /etc/profile.d/occne.sh
$ echo 'export OCCNE_CLUSTER=<cluster_short_name>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_USER=<user-name>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_vCNE=openstack' | sudo tee -a /etc/profile.d/occne.sh
$ . /etc/profile.d/occne.sh
For example:
$ echo 'export OCCNE_CLUSTER=occne1-rainbow' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_USER=cloud-user' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_vCNE=openstack' | sudo tee -a /etc/profile.d/occne.sh
$ . /etc/profile.d/occne.sh
In this step, Bash variable references such as ${OCCNE_CLUSTER} expand to the cluster short-name in this shell and in subsequent ones.
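A quick way to confirm the expansion is to echo the variables after sourcing the profile script; with the example values above, the output resembles:
$ echo ${OCCNE_CLUSTER} ${OCCNE_USER}
occne1-rainbow cloud-user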
Creating Cluster Specific Directories
Create the base occne directory, cluster directory (using the
cluster short-name), and YUM local repo directories.
- Use the following commands to create the base directory:
$ sudo mkdir -p -m 0750 /var/occne
$ sudo chown -R ${USER}:${USER} /var/occne
- Use the following command to create the cluster directory:
$ mkdir -p -m 0750 /var/occne/cluster/${OCCNE_CLUSTER}/
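If needed, verify the ownership and permissions of the newly created directories; the owner must be the non-root installation user and the mode must be 0750 (drwxr-x---):
$ ls -ld /var/occne /var/occne/cluster/${OCCNE_CLUSTER}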
Obtaining TLS Certificate for OpenStack
Note:
Perform this step only if your OpenStack environment requires a TLS certificate to access the controller from the cluster nodes and the Bootstrap host (deployment only).
- Contact the OpenStack admin for the required TLS
certificate to access the client commands. For example, in an Oracle
OpenStack system installed with kolla, the certificate is available at
/etc/kolla/certificates/openstack-cacert.pem. - Copy the entire certificate (including the
intermediate and root CA, if provided) to the
/var/occne/cluster/${OCCNE_CLUSTER}/openstack-cacert.pem file on the Bootstrap Host. Run this step after creating the cluster-specific directory.
Ensure that the certificate file name is
openstack-cacert.pem when you copy the file to the /var/occne/cluster/${OCCNE_CLUSTER}/ directory.
If the certificate file name is different, then rename it to
openstack-cacert.pem before copying it to the /var/occne/cluster/${OCCNE_CLUSTER}/ directory.
- Set the
OS_CACERT environment variable to /var/occne/cluster/${OCCNE_CLUSTER}/openstack-cacert.pem using the following command:
export OS_CACERT=/var/occne/cluster/${OCCNE_CLUSTER}/openstack-cacert.pem
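To confirm that the copied certificate is readable and not expired, you can optionally inspect it with openssl (assuming openssl is available on the Bootstrap host; for a bundle, this prints the first certificate only):
$ openssl x509 -in ${OS_CACERT} -noout -subject -issuer -dates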
Getting the OpenStack RC (API v3) File
The OpenStack RC (API v3) file exports several environment variables on the Bootstrap host. Terraform (MetalLB) or OpenTofu (CNLB) uses these environment variables to communicate with OpenStack and create different cluster resources.
Note:
The following instructions may vary slightly depending on the version of the OpenStack Dashboard you're using.
- From the OpenStack Dashboard, go to Project → API Access.
- From the Download OpenStack RC File
drop-down menu on the right side, choose OpenStack RC File (Identity API
v3).
This downloads an
openrc.sh file prefixed with the OpenStack project name (for example, OCCNE-openrc.sh) to your local system.
the Bootstrap host in the /home/${USER} directory as .<project_name>-openrc.sh
Note:
In order for SCP or WinSCP to work properly, use the key mentioned in the Prerequisites to access the Bootstrap host. Also, it may be necessary to add the appropriate Security Group Rules to support SSH (Rule: SSH, Remote: CIDR, CIDR: 0.0.0.0/0) under the Network → Security Groups page in the OpenStack Environment. If required, contact the OpenStack administrator to add the correct rules.
RC file:
source .<project_name>-openrc.sh
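Sourcing the RC file typically prompts for the OpenStack password and exports a set of OS_* environment variables. As a quick sanity check (the exact variable names can vary slightly between OpenStack releases):
$ env | grep '^OS_'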
Creating SSH Key on Bootstrap Host
Create the private and public keys to access the other VMs. The following command generates the keys that are passed to the Bastion Host and are used to communicate to other nodes from that Bastion Host.
$ mkdir -p -m 0700 /var/occne/cluster/${OCCNE_CLUSTER}/.ssh
$ ssh-keygen -m PEM -t rsa -b 2048 -f /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa -N ""
$ cp /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa ~/.ssh/id_rsa
Note:
The Bootstrap host is transient. Therefore, create a backup of the occne_id_rsa private key and copy it to a safe location so that you can still access the Bastion Host in case of Bootstrap failures. This key is used to access the Bastion Host when performing other maintenance actions such as upgrades.
You can also use an SSH client like PuTTY (PuTTYgen and Pageant) to generate your own key pair and place the public key into the Bastion Host authorized_keys file to provide access.
PuTTYgen can also be used to convert the .pem format of the occne_id_rsa private key to .ppk format for accessing the Bastion Host using PuTTY.
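To confirm that the key under ~/.ssh/id_rsa and the cluster copy are the same key pair, you can compare their fingerprints; both commands must print identical output:
$ ssh-keygen -lf ~/.ssh/id_rsa
$ ssh-keygen -lf /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa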
Configuring Central Repository Access on Bootstrap
Configure the central repository access on Bootstrap by following the steps provided in the Configuring Central Repository Access on Bootstrap section.
Verifying Central Repository and Installing Required Packages
Most of the required packages and tools are retrieved by the deploy.sh script and are installed directly from the central repository. However, it's necessary to manually install Podman, which is downloaded from the central repository.
- [Optional]: Allow access to the central repository.
Ensure that a security group that allows access to the central repository is created. For more information, see Prerequisites. If necessary, add a security group to the Bootstrap instance by performing the following steps:
- Navigate to Compute → Instances → occne-<cluster-name>-bootstrap.
- On the right most drop-down menu, select Edit Instance.
- Select Security Groups.
- Click the plus symbol on the default or custom security group to add it to the Bootstrap image.
- The security group may already be allocated depending on the OpenStack environment.
- Perform the following steps to test the central repository:
- Run the following command to perform a simple ping
test:
$ ping -c 3 ${CENTRAL_REPO}
Sample output:
PING winterfell (0.0.0.0) 56(84) bytes of data.
64 bytes from winterfell (128.128.128.128): icmp_seq=1 ttl=60 time=0.448 ms
64 bytes from winterfell (128.128.128.128): icmp_seq=2 ttl=60 time=0.478 ms
64 bytes from winterfell (128.128.128.128): icmp_seq=3 ttl=60 time=0.287 ms
--- winterfell ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2060ms
rtt min/avg/max/mdev = 0.287/0.404/0.478/0.085 ms
repositories:
$ dnf repolist
Sample output:
repo id             repo name
ol9_UEKR7           Unbreakable Enterprise Kernel Release 7 for Oracle Linux 9 (x86_64)
ol9_addons          Oracle Linux 9 Addons (x86_64)
ol9_appstream       Application packages released for Oracle Linux 9 (x86_64)
ol9_baseos_latest   Oracle Linux 9 Latest (x86_64)
ol9_developer       Packages for creating test and development environments for Oracle Linux 9 (x86_64)
ol9_developer_EPEL  EPEL Packages for creating test and development environments for Oracle Linux 9 (x86_64)
- Run the following command to install
Podman:
$ sudo dnf install -y podman
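As an optional sanity check, confirm that Podman is installed and functional before proceeding:
$ podman --version
$ podman info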
Copying Necessary Files to Bootstrap
- Copy the
<project_name>-openrc.sh script, created in a previous section, from the user's home directory to the cluster directory:
$ cp ~/<project_name>-openrc.sh /var/occne/cluster/${OCCNE_CLUSTER}/openrc.sh
of
OCCNE_VERSION. It is necessary to temporarily set the CNE version before copying or downloading the cluster files. Use the following command to replace the CNE version:
$ export OCCNE_VERSION=<occne_version>
For example:
$ export OCCNE_VERSION=25.2.100
This value is permanently set by the deploy script in a later step.
scriptsdirectory in the cluster directory. Copy the scripts to the newly created directory and the Terraform (MetalLB) or OpenTofu (CNLB) templates into the cluster directory:Use the following command for LBVM:Note:
The location to the templates and the cluster directory vary depending on the type of Load Balancer (MetalLB or CNLB) used. Use one of the following commands to copy the relevant templates to the relevant cluster directory.$ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -v /var/occne:/var/occne:z --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in ../../scripts modules misc/. templates/. tffiles/. scripts/. ../common/templates/. ../../../../common/scripts ; do cp -r /platform/vcne/lbvm/terraform/openstack/"$source" /host; done'Use the following command for CNLB:$ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -v /var/occne:/var/occne:z --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in modules ../scripts misc/. templates/. tffiles/. ../cluster/. ../../common/opentofu/templates/. ../../common/scripts ../../../../common/scripts ; do cp -r /platform/vcne/cnlb/openstack/opentofu/"$source" /host; done'$ mkdir /var/occne/cluster/${OCCNE_CLUSTER}/installer $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/cnlb_installer:${OCCNE_VERSION} -c 'for source in installer/validateCnlbIni.py installer/cnlb_logger.py installer/utils.py; do cp -r "$source" /host/installer/; done' - Run the following command to update the ownership of the files in the cluster
directory. Ensure that you update the ownership of all the copied
files.
$ sudo chown -R ${USER}:${USER} /var/occne/cluster/${OCCNE_CLUSTER}
Updating the
cluster.tfvars File
The cluster.tfvars file contains all the variables required by
Terraform (MetalLB) or OpenTofu (CNLB) to implement the cluster. Depending on the
type of Load Balancer you want to use for traffic segregation, use one of the
following sections to modify and complete a copy of the
occne_example/cluster.tfvars template (copied from the
provisioning container as part of the previous step):
- If you want to use MetalLB for traffic segregation, follow the information in the Updating cluster.tfvars for MetalLB section.
- If you want to use Cloud Native Load Balancer (CNLB) for traffic segregation, follow the information in the Updating cluster.tfvars for CNLB section.
Note:
- You must configure the fields in the
cluster.tfvars file to adapt to the current OpenStack Environment used. This requires some of the fields and information directly from the OpenStack Dashboard or the OpenStack CLI (not bundled with Bootstrap). The given procedures provide details on how to collect and set the fields that must be changed, and don't provide examples of OpenStack CLI usage.
- All the fields in the
cluster.tfvars file must be unique. Ensure that there are no duplicate fields in the cluster.tfvars file, and do not duplicate any fields as comments (using the "#" tag), as this can cause parsing errors.
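Since duplicate fields in cluster.tfvars can cause parsing errors, a quick way to spot repeated assignments before deploying is to list duplicate keys; a minimal sketch (run from the directory containing your edited cluster.tfvars; it ignores commented lines and assumes GNU grep):
$ grep -v '^\s*#' cluster.tfvars | grep -oP '^\s*\K\w+(?=\s*=)' | sort | uniq -d
Any name printed by this command is assigned more than once and must be deduplicated.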
Deploying CNE Cluster in OpenStack Environment
This section describes the procedure to deploy the VMs in the OpenStack Environment, configure the Bastion Host, and deploy and configure the Kubernetes clusters.
Running Deploy Command
Note:
The Environment Variables section describes the list of possible environment variables that can be combined with the deploy.sh command. Ensure that you refer
to this section before you proceed with the deployment.
- Run the following command to copy the
occne.ini.template file to define the required Ansible variables for the deployment:
$ cp /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini
occne.inifile:$ vi /var/occne/cluster/${OCCNE_CLUSTER}/occne.iniSee the "occne.ini Variables" table in the Environment Variables section for details about the
occne.inivariables that can be combined with thedeploy.shcommand to further define the implementation of the deployment. - Create a copy of the
secrets.ini.templatefile:$ cp /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini - For LBVM deployments, edit the
occne.inifile updated in Step 2 with the requiredsecrets.inivariables:$ vi /var/occne/cluster/${OCCNE_CLUSTER}/secrets.iniSee the "secrets.ini Variables" table in the Environment Variables section for details about the
secrets.inivariables that are required by thedeploy.shcommand to install the cluster. - Customize the common services by referring to the Common Installation Configuration section.
- Run the following command from the /var/occne/cluster/${OCCNE_CLUSTER}/
directory on the Bootstrap Host. This command can take a while to run. It can
take 2 to 4 hours, depending on the machines it's running on.
$ ./deploy.sh
Note:
If the deploy.sh command fails during installation, you can troubleshoot as follows:
- Contact Oracle support for assistance.
The system displays a message similar to the following when the CNE cluster is deployed successfully in an OpenStack Environment:
Sample output for LBVM:
Sample output for CNLB:... -POST Post Processing Finished |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Fri Sep 6 06:51:52 PM UTC 2024 Connection to 10.x.x.x closed. /var/occne/cluster/$OCCNE_CLUSTER/artifacts/pipeline.sh completed successfully ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||... -POST Post Processing Finished |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Mon Jun 17 17:52:13 UTC 2024 /var/occne/cluster/<cluster-name>/artifacts/pipeline.sh completed successfully ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Installing vCNE on VMware Environment
The installation procedure details the steps necessary to configure and install a CNE cluster in a VMware environment using one of the following load balancer options:
- Load Balancer VM (LBVM)
- Cloud Native Load Balancer (CNLB)
Prerequisites
Before installing CNE on VMware, ensure that the following prerequisites are met.
- A vSphere vCenter account must be available with the following
permissions:
- Datastore
- Low level file operations
- Host
- Configuration
- Storage partition configuration
- Configuration
- Cns
- Searchable
- Profile-driven storage
- Profile-driven storage view
- Virtual machine
- Change configuration
- Add existing disk
- Add or remove a device
- Advanced configuration
- Change configuration
- Datastore
- Configure the following minimal versions of both vCloud Director and
vSphere to allow correct functioning of vCNE:
- vCloud Director (VCD) = 10.2.1
- vSphere or vCenter = 7.0
- A vCloud Director account with "vApp Author" and
"Organization Network View" rights must be available.
Note:
The recommended approach is to create a role with all the required rights and assign the account to this role. Check the vApp Author default rights in the VMware Cloud Director Documentation 10.2.
- Configure the vCloud Director environment with the backend.cloneBiosUuidOnVmCopy parameter set to 0 to ensure that VMs created from a vApp template have a different BIOS UUID than the BIOS UUID of the template.
- Configure the VMware environment with appropriate CPU, disk, RAM, and Network required for resource allocation to the VMs.
- Set up the central repository by following the Setting Up a Central Repository procedure.
- Download all the necessary images, binaries, and files. Also, set up and configure the central and YUM repository. For procedures, see the Artifact Acquisition and Hosting section.
- Users must be familiar with using VMware as a virtual provider, vCloud Director, and vSphere.
- All necessary networking must be defined for use on the VMware.
For more information, refer to the VMware product documentation.
Downloading Oracle Linux X Image
Download the OLX image by following the procedure
in the Downloading Oracle Linux section.
Note:
The 'X' in Oracle Linux X or OLX in the procedure
indicates the latest version of Oracle Linux supported by CNE.
Uploading OLX Image as VMware Media
This section describes the procedure to upload OLX image as
VMware media.
- Log in to VMware GUI using your credentials.
- Click Libraries on the top navigation bar.
- From the left panel, select Media & Other.
- Click Add.
- Select ISOS from the Select a catalog drop-down.
- Click the up arrow on Select media to upload to select a file.
- Select the OLX image (for example, OL9) downloaded in the Downloading Oracle Linux X Image section and click OK.
Creating a Template
This procedure describes the process to create a template.
Note:
The following procedure considers creating a template for OL9 and provides the configurations accordingly. The options vary for other versions.
Procedure
- Log in to the vCloud Director Environment using your credentials.
- Select the Virtual DataCenter where you want to create the template.
- On the left panel, under Compute section, click vApps.
- Click on the NEW drop-down list and select New vApp.
- Enter the vAPP name and click CREATE.
For example:
Name: bootstrap-VERSION-template - Open the newly created vAPP.
- From the ALL-ACTIONS drop-down, select Add → Add VM.
- On the pop-up window that appears, click ADD VIRTUAL
MACHINE and fill the fields as given in the following
example:
- Refer to the following example to fill name, description, and
type:
Name: CLUSTER-bootstrap Computer Name: CLUSTER-bootstrap Description: (Optional) Type: select New. - Refer to the following example to fill the Operating System
section:
Operating System:
  OS family: Linux
  Operating System: Oracle Linux 9 (64-bit)
  Boot image: OracleLinux-R9-UX-x86_64-dvd.iso
Note:
The specific OS version for the Operating System field may not be present. In such cases, select an older Oracle Linux distribution.
section:
Compute:
  Virtual CPUs: 1
  Cores per socket: 1
  Number of sockets: 1
  Memory: 4GB
section:
Storage:
  Size: 32GB (type the desired value in GB, for example: 32GB, 64GB, 128GB)
- Click ADD NETWORK TO VAPP.
- Check Type Direct.
- From the Org VDC Network Connection table, select a network to add.
- Click ADD.
- In the Networks table,
select the following options:
Network: <Select the network added in the previous section>
Network Adapter Type: VMXNET3
IP Mode: Static - IP Pool
IP Address: Auto-assigned
Primary NIC: Selected
Note:
- IP Address is auto-assigned. This default IP address cannot be changed.
- Primary NIC cannot be deselected when there is only one NIC.
- Refer to the following example to fill name, description, and
type:
- Click OK and then click
ADD.
The system creates the VM. Wait until the VM is created.
- Perform the following steps to connect to the new VM:
- Select the newly created VM and click ALL ACTIONS.
- Select Media.
- Click Insert Media and select OracleLinux-R9-UX-x86_64-dvd.iso.
- Click INSERT at the bottom right corner.
- Click POWER ON and wait for the VM to start.
- When the VM is available, connect to the VM by choosing
one of the following options:
- Click LAUNCH WEB CONSOLE (recommended option).
- Click LAUNCH REMOTE CONSOLE (this option requires installing an external software which is not covered in this procedure).
- Perform the following steps to install Oracle Linux X:
Note:
The following steps provide the procedure to install Oracle Linux 9.X. The options may vary depending on the Linux version you are installing. Therefore, select the options as per your Linux version.
- Select Test this media & install Oracle
Linux 9.X.X.
The system displays the installation window after running the test.
- On the Welcome to Oracle Linux X
screen, select your preferred language and click
Continue.
The system displays the Installation Summary page. Configure each section by performing the following steps.
- [Optional]: From Localization, choose your desired options for Keyboard, Language Support and Time & Date.
- Perform the following steps to configure the Software
section:
- Select Software → Installation Source and ensure that Auto Detect installation media is checked.
- Click Done.
- Select Software → Software
Selection and choose Minimal
Installation from the Base
Environment panel on the left.
Note:
Don't select any items from the Additional software for Selected Environment panel on the right. - Click Done.
- Perform the following steps to configure the System
section:
- Select System → Installation Destination.
- Select Custom under Storage Configuration and click Done.
- On the Manual
Partitioning screen that appears, create
three partitions by clicking the +
symbol and selecting the following options per partition:
- Mount Point: /boot - Desired Capacity: 1024
- Mount Point: swap - Desired Capacity: 4G
- Mount Point: / - Desired Capacity: (Leave this field blank to use the rest of the available space)
Click Add Mount Point to confirm each partition.
- Click Done.
- On the Summary of Changes screen that appears, click Accept Changes.
- Perform the following steps to configure the User
Settings section:
- Select User Settings → Root Password.
- In the Root Password field, type a strong root password.
- Select Allow root SSH login with password.
- Click Done.
- Retain the default values for the rest of the System settings (KDUMP, Network & Host Name, Security Profile).
- Click Begin Installation when you complete all the previous steps.
- Select Test this media & install Oracle
Linux 9.X.X.
- Once the installation is completed, wait for the VM to reboot and log in using LAUNCH WEB CONSOLE or LAUNCH REMOTE CONSOLE.
- Make sure one of the network interfaces is set to the same IP address that vCloud Director assigned.
- From the remote console, run the following command to configure
network:
$ ip address
If there is no IP address on the new VM, run the following nmcli commands:
$ nmcli con mod <interface name> ipv4.method manual ipv4.addresses <ip-address/prefix> ipv4.gateway <gateway> connection.autoconnect yes
$ nmcli con up <interface name>
Run the ip address command again to verify that the IP address has changed.
Use the IP address in the "NICs" section and get the gateway and prefix from the VMware GUI: Networking → <select networking name> → Static IP Pools → Gateway CIDR.
proxy = parameterto the/etc/dnf/dnf.conffile. - Check that the
nameserversare indicated in/etc/resolv.conffile. If empty, fill in the required values. - Run the following command to update all the packages to the
latest versions.
$ dnf update -y - Run the following command to install the packages required for
CNE
Installation.
$ dnf install -y perl-interpreter cloud-utils-growpart - Change the following line in
/etc/sudoersusingvicommand as a root user:- Run the following commands to open the file in edit
mode:
$ chmod 640 /etc/sudoers
$ sudo vi /etc/sudoers
line:
%wheel ALL=(ALL) ALL - Uncomment the following
line:
# %wheel ALL=(ALL) NOPASSWD: ALL
- Run the following commands to open the file in edit
mode:
- Run the following commands to enable VMware customization
tools:
$ vmware-toolbox-cmd config set deployPkg enable-customization true
$ vmware-toolbox-cmd config set deployPkg enable-custom-scripts true
files:
$ dnf clean all
$ logrotate -f /etc/logrotate.conf
$ find /var/log/ -type f -iname '*gz' -delete
$ find /var/log/ -type f -iname *$(date +%Y)* -delete
$ for log in $(find /var/log/ -type f -size +0) ; do echo " " > $log ; done
/etc/dnf/dnf.conf. - Unmount the media from the VM.
- Power off the VM from vCloud Director.
- Log in to vSphere vCenter and search for the name given to the created VM.
- Right click on the VM and click Edit Settings.
- Click the VM Options tab.
- Expand the Advanced drop-down list.
- Search for Configuration Parameters and select Edit Configuration.
- Add the
disk.EnableUUIDparameter and set it toTRUE. - Go back to the vCloud Director GUI and search for the vApp created previously.
- Click the Actions drop-down and select Create Template.
- Add template to a catalog.
- Enter a name for the template in the Name field.
- Select Make identical
copy and click OK.
The new template is stored in
Libraries / vApp Templates.
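The nmcli step earlier in this procedure assigns a static IP when the VM does not pick one up automatically. As a purely hypothetical filled-in example, assuming an interface named ens192, the address 192.168.10.25/24, and the gateway 192.168.10.1 (take the actual values from the VMware GUI; on fresh installations the connection name commonly matches the interface name):
$ nmcli con mod ens192 ipv4.method manual ipv4.addresses 192.168.10.25/24 ipv4.gateway 192.168.10.1 connection.autoconnect yes
$ nmcli con up ens192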
Configuring Bootstrap VM
This procedure describes the process to configure a Bootstrap Virtual Machine (VM).
Procedure
- Create a new VM from CNE template:
- Log in to the VMware GUI using your credentials.
- Click Data Centers on the top of the page.
- Select Virtual Machines from the left panel.
- Click New VM and perform the following
steps:
- Input the name of the new VM in Name.
- Update the name of the computer in Computer Name if you want a different name than the default one.
- Select the From Template option if it is not preselected by default.
- Click Power on if it is not preselected by default.
- Select the template from the available list.
- Click OK
- When the VM is powered on, select the newly created VM and perform the following
steps:
- From the left pane, select NICs.
- Click Edit.
- Select the Connected checkbox.
- Select the Network dropdown.
- From the IP Mode dropdown, select Static - IP Pool.
- Click Save.
- Check if the values of Network, IP Mode are set per the values provided in the previous step. Also, check if the IP Address column displays the valid IP address.
- Connect to the VM by using either LAUNCH WEB CONSOLE (recommended option) or LAUNCH REMOTE CONSOLE (this option requires installing an external software which is not covered in this procedure).
- From the remote console, log in with the root user and password:
- Run the following command to get the IP
address:
$ ip address - Run the following command if there is no IP address on the new VM yet
or if the IP address does not match the IP address in the "NICs" section of VMware
GUI:
$ nmcli con mod <interface name> ipv4.method manual ipv4.addresses <ip-address/prefix> ipv4.gateway <gateway> connection.autoconnect yes
where,
- <interface name> is the interface name obtained in the previous step.
- <ip-address/prefix> and <gateway> are the IP address, its prefix, and the gateway details obtained from the VMware GUI (Networking → <select networking name> → Static IP Pools → Gateway CIDR).
$ nmcli con up <interface name> - Verify if the IP reflected in GUI is correctly assigned on the
interface:
$ ip address
- Run the following command to get the IP
address:
- If reaching yum.oracle.com requires a proxy, add the proxy = parameter to the /etc/dnf/dnf.conf file.
proxy = parameterto the/etc/dnf/dnf.conffile. - Check if the nameservers are indicated in the
/etc/resolv.conf file. If it's empty, populate it with the required values.
<user-name> user. The recommended value for <user-name> is cloud-user. When prompted, enter a new password twice for the previously created user:
$ dnf update -y
$ dnf install -y oraclelinux-developer-release-el9 oracle-epel-release-el9
$ dnf install -y rsync podman python3-pip
$ groupadd -g 1000 <user-name>
$ useradd -g <user-name> <user-name>
$ passwd <user-name>
$ echo "<user-name> ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
$ su - <user-name>
$ mkdir -m 700 /home/<user-name>/.ssh
/home/<user-name>/.ssh/configfile:
Add the following content to the file:$ vi /home/<user-name>/.ssh/configServerAliveInterval 10 TCPKeepAlive yes StrictHostKeyChecking=no UserKnownHostsFile=/dev/null - Perform the following steps to set the cluster short name and central
repository variables:
- Set the following cluster variables for the bootstrap environment and
load them into the current environment:
Note:
The <cluster name> and <user-name> parameters must contain lowercase alphanumeric characters, '.' or '-', and must start with an alphanumeric character.
$ echo 'export LANG=en_US.utf-8' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export LC_ALL=en_US.utf-8' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_VERSION=25.2.100' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_PREFIX=' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_CLUSTER=<cluster name>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_USER=<user-name>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_vCNE=vcd' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export OCCNE_TFVARS_DIR=/var/occne/cluster/<cluster name>/<cluster name>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export VCD_USERNAME=<username>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export VCD_PASSWORD=<password>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export VCD_AUTH_URL=https://<vcd IP address>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export VCD_ORG=<org>' | sudo tee -a /etc/profile.d/occne.sh
$ echo 'export VCD_VDC=<virtual data center>' | sudo tee -a /etc/profile.d/occne.sh
$ source /etc/profile.d/occne.sh
Run the following command to confirm that all cluster variables are loaded to occne.sh:
$ cat /etc/profile.d/occne.sh
- Set the following cluster variables for the bootstrap environment and
load them into the current environment:
- Run the following commands to create a directory specific to your cluster
(using the cluster
short-name):
$ sudo mkdir /var/occne/
$ sudo chown -R <user-name>:<user-name> /var/occne/
$ mkdir -p -m 0750 /var/occne/cluster/${OCCNE_CLUSTER}/
Run the following commands to generate the keys that are passed to the Bastion Host and
used to communicate to other nodes from that Bastion Host.
$ mkdir -p -m 0700 /var/occne/cluster/${OCCNE_CLUSTER}/.ssh
$ ssh-keygen -m PEM -t rsa -b 2048 -f /tmp/occne_id_rsa -N ""
$ cp /tmp/occne_id_rsa ~/.ssh/id_rsa
$ cp /tmp/occne_id_rsa /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa
$ sudo chmod 600 /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa
$ ssh-keygen -f ~/.ssh/id_rsa -y > ~/.ssh/id_rsa.pub
$ ssh-keygen -f /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa -y > /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa.pub
$ rm /tmp/occne_id_rsa
Note:
The private key (occne_id_rsa) must be backed up and copied to a server that is going to be used, longer-term, to access the Bastion Host, because the Bootstrap Host is transient. The key is used to access the Bastion Host when performing other maintenance actions such as upgrades. You can also use an SSH client like PuTTY (PuTTYgen and Pageant) to generate your own key pair and place the public key into the Bastion Host authorized_keys file to provide the access. PuTTYgen can also be used to convert the .pem format of the occne_id_rsa private key to .ppk format to access the Bastion Host using PuTTY.
- Perform the following steps to copy necessary files to Bootstrap Host:
- Copy the templates to cluster directory:
Use the following command for LBVM:
Note:
- The locations of the templates and the cluster directory vary depending on the type of Load Balancer (LBVM or CNLB) used. Use one of the following commands to copy the relevant templates to the relevant cluster directory.
- Copy the for loop completely and use it in the terminal as a single command.
$ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -v /var/occne:/var/occne:z --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in ../../scripts modules misc/. templates/. tffiles/. scripts/. ../common/templates/. ../../../../common/scripts; do cp -r /platform/vcne/lbvm/terraform/vcd/"$source" /host; done'
Use the following commands for CNLB:
$ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -v /var/occne:/var/occne:z --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in modules ../scripts misc/. templates/. tffiles/. ../cluster/. ../../common/opentofu/templates/. ../../common/scripts ../../../../common/scripts ; do cp -r /platform/vcne/cnlb/vcd/opentofu/"$source" /host; done'
$ mkdir /var/occne/cluster/${OCCNE_CLUSTER}/installer
$ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/cnlb_installer:${OCCNE_VERSION} -c 'for source in installer/validateCnlbIni.py installer/cnlb_logger.py installer/utils.py; do cp -r "$source" /host/installer/; done'
directory:
$ sudo chown -R ${USER}:${USER} /var/occne/cluster/${OCCNE_CLUSTER}
- Copy the templates to cluster directory:
Predeployment Configuration for VMware
This section describes the procedure to deploy the CNE cluster in a VMware Cloud Director (VCD) Environment.
- Customize the
occne.ini file: This file contains important information about the vSphere account (different from VCD) that is needed to allow the cluster to run. You can use the occne.ini.template file available in the same directory to copy and create the initial file, and then customize the file.
- Create a copy of the
occne.ini file from its template:
$ cp /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini
given in the following sample:
Sample
Note:
external_vsphere_versionis a mandatory parameter.external_vsphere_datacenteris not the same asVCD_VDC.- Ensure that you retrieve all the information from the VCD administrator, otherwise, the cluster will not work correctly.
- The
occne.inifile configuration varies depending on the type of Load Balancer (LBVM or CNLB) used. Refer to the relevant example and configure the file depending on your Load Balancer type.
occne.inifile for LBVM:
Sample occne.ini file for LBVM:
[occne:vars]
occne_cluster_name= <cluster-name>
####
# The ipvs scheduler type when proxy mode is ipvs
# this variable will be added with default value of "rr" (round robin).
# rr: round-robin
# sh: source hashing
# kube_proxy_scheduler=rr
####
## Central repository for bastion to retrieve YUM, Docker registry, and HTTP access to files (helm charts, etc)
central_repo_host=
central_repo_host_address=
# central_repo_protocol=http
####
## Auto filled by deploy.sh from values in TFVARS file
occne_ntp_server =
occne_cluster_network =
## Indicate DNS nameservers, comma separated
name_server = <name_server1>,<name_server2>
# Specify 'True' (case matters, no quotes) to deploy services as ClusterIP instead of LoadBalancer. Default is 'False'
# cncc_enabled=False
# Below is the default calico_mtu value. Change if needed.
# The value should be a number, not a string.
calico_mtu = 1500
# Below is the default kube_network_node_prefix value. Change if needed.
# Default value has room for 128 nodes. The value should be a number, not a string.
# kube_network_node_prefix_value=25
[vcd:vars]
## Specify the vSphere information of the External vSphere Controller/CSI Cinder plugin accounts,
## account must be provided by OCCNE personnel and it is needed for deployment.
## All the rest of the information is the same as used before.
external_vsphere_version = <6.7u3> or <7.0u1> or <7.0u2>
external_vsphere_vcenter_ip = <Need to be completed>
external_vsphere_vcenter_port = <Need to be completed>
external_vsphere_insecure = <Need to be completed>
external_vsphere_user = <Need to be completed>
external_vsphere_password = <Need to be completed>
external_vsphere_datacenter = <Need to be completed>
external_vsphere_kubernetes_cluster_id = <cluster-name>
# vCloud Director Information required for LB Controller.
# User must have catalog author permissions + Org Network View
# User and password must use alphanumeric characters, it can be uppercase or lowercase
# The password cannot contain "/" or "\", neither contains or be contained in "()"
vcd_user = <Need to be completed>
vcd_passwd = <Need to be completed>
org_name = <Need to be completed>
org_vdc = <Need to be completed>
vcd_url = <Need to be completed>
Sample occne.ini file for CNLB:
[occne:vars]
occne_cluster_name = <cluster-name>
####
## Central repository for bastion to retrieve YUM, Docker registry, and HTTP access to files (helm charts, etc)
central_repo_host = <Need to be completed>
central_repo_host_address = <Need to be completed>
# central_repo_protocol=http
####
## Auto filled by deploy.sh from values in TFVARS file
occne_ntp_server =
occne_cluster_network =
# See section 5.4.8 to fill in the following fields.
occne_prom_cnlb = <Need to be completed>
occne_alert_cnlb = <Need to be completed>
occne_graf_cnlb = <Need to be completed>
occne_nginx_cnlb = <Need to be completed>
occne_jaeger_cnlb = <Need to be completed>
occne_opensearch_cnlb = <Need to be completed>
## Indicate DNS nameservers, comma separated
name_server = <name_server1>,<name_server2>
# Specify 'True' (case matters, no quotes) to deploy services as ClusterIP instead of LoadBalancer. Default is 'False'
# cncc_enabled=False
# Below is the default calico_mtu value. Change if needed.
# The value should be a number, not a string.
calico_mtu = 1500
# Below is the default kube_network_node_prefix value. Change if needed.
# Default value has room for 128 nodes. The value should be a number, not a string.
# kube_network_node_prefix_value=25
[vcd:vars]
## Specify the vSphere information of the External vSphere Controller/CSI Cinder plugin accounts,
## account must be provided by OCCNE personnel and it is needed for deployment.
## All the rest of the information is the same as used before.
external_vsphere_version = <6.7u3> or <7.0u1> or <7.0u2>
external_vsphere_vcenter_ip = <Need to be completed>
external_vsphere_vcenter_port = <Need to be completed>
external_vsphere_insecure = <Need to be completed>
external_vsphere_datacenter = <Need to be completed>
external_vsphere_kubernetes_cluster_id = <cluster-name>
# vCloud Director Information required for LB Controller.
# User must have catalog author permissions + Org Network View
# User and password must use alphanumeric characters, it can be uppercase or lowercase
# The password cannot contain "/" or "\", neither contains or be contained in "()"
org_name = <Need to be completed>
org_vdc = <Need to be completed>
vcd_url = <Need to be completed>
# The ipvs scheduler type when proxy mode is ipvs
# this variable will be added with default value of "rr" (round robin).
# rr: round-robin
# sh: source hashing
ipvs_scheduler=rr
- Configure the
secrets.ini file: The secrets.ini file contains information about the cluster and vSphere account (different from VCD) credentials. This information is required to allow the cluster to run correctly.
- Create a copy of the
secrets.ini file from its template:
$ cp /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini
given in the following
sample:
[occne:vars] # Set grub password occne_grub_password= # Username and password for Bastion container registry occne_registry_user= occne_registry_pass= [vcd:vars] ## Specify the vSphere information of the External vSphere Controller/CSI Cinder plugin accounts ## needed for deployment. external_vsphere_user = external_vsphere_password = vcd_user= vcd_passwd=
- Configure the
cluster.tfvars file: The cluster.tfvars file contains all the variables required by Terraform (LBVM) or OpenTofu (CNLB) to configure the cluster.
CNE supports multiple pool addresses for LBVM. Therefore, the range of IP addresses and the ports must be specified in the file, along with the networks to be used, the template or catalogs for VM creation, and all the parameters that allow the process to access VCD. The following sample shows a fully set up
cluster.tfvars file. Some of the parameters that are completed by default are not shown in the following sample.
Note:
The occne_metallb_peer_addr_pool_names name cannot be a substring of the cluster_name. For example, if the cluster_name is mysignal1, then occne_metallb_peer_addr_pool_names cannot be sig.
- Create the directory where the
cluster.tfvars file must be copied:
$ mkdir -p -m 0750 /var/occne/cluster/${OCCNE_CLUSTER}/${OCCNE_CLUSTER}
cluster.tfvarsfile from its template to the new directory:$ cp /var/occne/cluster/${OCCNE_CLUSTER}/occne_example/cluster.tfvars /var/occne/cluster/${OCCNE_CLUSTER}/${OCCNE_CLUSTER}/cluster.tfvars - Open the copied file and fill all the required parameters as
given in the following samples:
Note:
Thecluster.tfvarsfile configuration varies depending on the type of Load Balancer (LBVM or CNLB) used. Refer to the relevant example and configure the file depending on your Load Balancer type.Samplecluster.tfvarsfile for LBVM:# vCloud Director Information required to create resources. # User must have catalog author permissions + Org Network View # User and password must use alphanumeric characters, it can be uppercase or lowercase # The password cannot contain "/" or "\", neither contains or be contained in "()" vcd_user = "<Need to be completed>" vcd_passwd = "<Need to be completed>" org_name = "<Need to be completed>" org_vdc = "<Need to be completed>" vcd_url = "<Need to be completed>" allow_unverified_ssl = true cluster_name = "<customer_specific_short_cluster_name>" #if affinity rules will be created set polarity to "Affinity" if anti-affinity rules set polarity to "Anti-Affinity" polarity = "<Anti-Affinity or Affinity>" # specify if the affinity/anti-affinity rule will be hard or soft if hard set variable to true if soft set variable to false. hard_rule = <true or false> # Network used for cluster communication, this network must have the feature to create SNAT and DNAT rules private_net_name = "<Need to be completed>" # Networks used for external network communication, normally used for Bastion and LBVMs ext_net1_name = "<Need to be completed>" ext_net2_name = "<Need to be completed>" # Catalog and template name where the vApp template is stored. This template will be used for all the VM's catalog_name = "<Need to be completed>" template_name = "<Need to be completed>" # number of hosts number_of_bastions = 2 number_of_k8s_ctrls_no_floating_ip = 3 number_of_k8s_nodes = <number of worker nodes> # Amount of RAM assigned to VM's, expressed in MB memory_bastion = "4096" memory_k8s_node = "32768" memory_k8s_ctrl = "8192" memory_lbvm = "2048" # Amount of CPU assigned to VM's cpu_bastion = "2" cpu_k8s_node = "8" cpu_k8s_ctrl = "2" cpu_lbvm = "4" # Amount of cores assigned to VM's. Terraform has a bug related to templates created with a different # core number of the ones assigned here, it is suggested to use the same number as the template cores_bastion = 1 cores_k8s_ctrl = 1 cores_k8s_node = 1 cores_lbvm = 1 # Disk size, expressed in MB. Minimum disk size is 25600. disk_bastion = "102400" disk_k8s_node = "40960" disk_k8s_ctrl = "40960" disk_lbvm = "40960" # <user-name> used to deploy your cluster, should be same as $OCCNE_USER # Uncomment the below line only if $OCCNE_USER is NOT "cloud-user" # ssh_user= "<user-name>" # Update this list with the names of the pools. # It can take any value as an address pool name e.g. "oam", "signaling", "random_pool_name_1", etc. # This field should be set depending on what peer address pools the user wishes to configure. # eg ["oam"] or ["oam", "signaling"] or ["oam", "signaling", "random_pool_name_1"] # Note : "oam" is required while other network pools are application specific. # occne_metallb_peer_addr_pool_names = ["oam"] # Use the following for creating the Metallb peer address pool object: # # (A) num_pools = number of network pools as defined in occne_metallb_peer_addr_pool_names # # (B) Configuring pool_object list. # Each object in this list must have only 4 input fields: # 1. pool_name : Its name must match an existing pool defined in occne_metallbaddr_pool_names list # 2. num_ports = number of ips needed for this address pool object # 3. 
Configuring 3rd input field : use only one of the three ip address input fields for # each peer address pool. The other two input fields should be commented out or deleted. # # - .. cidr : A string representing a range of IPs from the same subnet/network. # - .. ip_list : A random list of IPs from the same subnet/network. Must be # defined within brackets []. # - .. ip_range: A string representing a range of IPs from the same subnet/network. # This range is converted to a list before input to terraform. # IPs will be selected starting at the beginning of the range. # # WARNING: The cidr/list/range must include the number of IPs equal to or # greater than the number of ports defined for that peer address # pool. # # NOTE: Below variables are not relevant to vCloud Director, but they must have # a value, for example: nework_id = net-id # 4. network_id : this input field specifies network id of current pool object # 5. subnet_id : this input field specifies subnet id of current pool object # # Make sure all fields within the selected # input objects are set correctly. occne_metallb_list = { num_pools = 1 pool_object = [ { pool_name = "<pool_name>" num_ports = <no_of_ips_needed_for_this_addrs_pool_object> ip_list = ["<ip_0>","<ip_(num_ports-1)>"] ip_range = "<ip_n> - <ip_(n + num_ports - 1)>" cidr = "<0.0.0.0/29>" subnet_id = "<subnet UUID for the given network>" network_id = "<network_id>" egress_ip_addr = "<IP address for egress port>" } ] }Samplecluster.tfvarsfor CNLB:# vCloud Director Information required to create resources. # User must have catalog author permissions + Org Network View # User and password must use alphanumeric characters, it can be uppercase or lowercase # The password cannot contain "/" or "\", neither contains or be contained in "()" vcd_user = "<Need to be completed>" vcd_passwd = "<Need to be completed>" org_name = "<Need to be completed>" org_vdc = "<Need to be completed>" vcd_url = "<Need to be completed>" allow_unverified_ssl = true cluster_name = "<customer_specific_short_cluster_name>" #if affinity rules will be created set polarity to "Affinity" if anti-affinity rules set polarity to "Anti-Affinity" polarity = "<Anti-Affinity or Affinity>" # specify if the affinity/anti-affinity rule will be hard or soft if hard set variable to true if soft set variable to false. hard_rule = <true or false> # Network used for cluster communication, this network must have the feature to create SNAT and DNAT rules private_net_name = "<Need to be completed>" # Networks used for external network communication, normally used for Bastion ext_net1_name = "<Need to be completed>" ext_net2_name = "<Need to be completed>" # Catalog and template name where the vApp template is stored. This template will be used for all the VM's catalog_name = "<Need to be completed>" template_name = "<Need to be completed>" # number of hosts number_of_bastions = 2 number_of_k8s_ctrls_no_floating_ip = 3 number_of_k8s_nodes = <number of worker nodes> # Amount of RAM assigned to VM's, expressed in MB memory_bastion = "4096" memory_k8s_node = "32768" memory_k8s_ctrl = "8192" # Amount of CPU assigned to VM's cpu_bastion = "2" cpu_k8s_node = "8" cpu_k8s_ctrl = "2" # Amount of cores assigned to VM's. OpenTofu has a bug related to templates created with a different # core number of the ones assigned here, it is suggested to use the same number as the template cores_bastion = 1 cores_k8s_ctrl = 1 cores_k8s_node = 1 # Disk size, expressed in MB. Minimum disk size is 25600. 
disk_bastion = "102400" disk_k8s_node = "40960" disk_k8s_ctrl = "40960" # <user-name> used to deploy your cluster, should be same as $OCCNE_USER # Uncomment the below line only if $OCCNE_USER is NOT "cloud-user" # ssh_user= "<user-name>" occne_bastion_names = ["1", "2"] occne_control_names = ["1", "2", "3"] occne_node_names = ["1", "2", "3", "4"]
- Create the directory where the
- If you are installing CNE with CNLB for traffic segregation, then enable and configure CNLB by performing the procedure in the Configuring Cloud Native Load Balancer (CNLB) section.
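Before deploying, it can help to confirm that no placeholder values remain in cluster.tfvars. A minimal sketch, assuming the file resides in the cluster directory; the grep pattern matches the "<Need to be completed>" markers used in the samples above:
$ grep -n "Need to be completed" /var/occne/cluster/${OCCNE_CLUSTER}/cluster.tfvars || echo "no placeholders remain"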
Deploying CNE Cluster in VMware Environment
This section describes the procedure to deploy the CNE cluster in a VMware environment.
- Ensure that the following mandatory components are set up:
- occne.ini file, with the credentials for the vSphere account.
- secrets.ini file, with the secret credentials for the vSphere account.
- occne.sh profile file, with the credentials for logging in to the VCD datacenter and the CNE version.
- cluster.tfvars, with the range of IP addresses, ports, VCD login information, and other parameters needed for the cluster to run.
- repositories (repos), to distribute software across the cluster, set up by the bootstrap Ansible file or task.
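As a quick sanity check, you can verify that each of these files is present before continuing. A minimal sketch, assuming the files reside in the /var/occne/cluster/${OCCNE_CLUSTER}/ directory (adjust paths to your layout):
$ for f in occne.ini secrets.ini occne.sh cluster.tfvars; do test -f /var/occne/cluster/${OCCNE_CLUSTER}/$f && echo "$f: OK" || echo "$f: MISSING"; done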
- Check if the networks defined in the cluster.tfvars file are reflected in the VMware GUI:
- From the VMware GUI, go to Applications → <new vApp> → Networks.
- Check if the networks defined in the cluster.tfvars file (ext_net1_name/ext_net2_name) are present in the vApp.
- If the networks are not present, perform the following steps to add the networks:
- Click NEW.
- Under Type, select Direct.
- Select the network.
- Click ADD.
- Run the following command from the
/var/occne/cluster/${OCCNE_CLUSTER}/ directory on the Bootstrap Host. This command may take a while to run (up to 2 to 4 hours depending on the machines it is run on):
$ ./deploy.sh
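Because deploy.sh can run for hours, it is worth protecting the run from SSH disconnects. A minimal sketch using nohup (screen or tmux work equally well); the log file name is illustrative:
$ nohup ./deploy.sh > deploy.log 2>&1 &
$ tail -f deploy.log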
Postinstallation Tasks
This section explains the postinstallation tasks for CNE.
Verifying and Configuring Common Services
Introduction
This section describes the steps to verify and configure CNE Common services hosted on the cluster. There are various UI endpoints that are installed with common services, such as OpenSearch Dashboards, Grafana, Prometheus Server, and Alert Manager. The following sub-sections provide information about launching, verifying, and configuring the UI endpoints.
Prerequisites
- Ensure that all the Common services are installed.
- Gather the cluster names and version tags that are used during the installation.
- All the commands in this section must be run on a Bastion Host.
- Ensure you have an HTML5 compliant web browser with network connectivity to CNE.
Common Services Release Information
The release files are located in the /var/occne/cluster/${OCCNE_CLUSTER}/artifacts directory:
Kubernetes Release File: K8S_container_images.txt
Common Services Release File: CFG_container_images.txt
Disabling Bastion HTTP Server Service
Stop and disable the bastion_http_server service
to avoid DNS conflicts, regardless of whether local DNS is enabled or
not.
- Run the following command on the Bastion Host to stop the bastion_http_server service:
$ sudo systemctl stop bastion_http_server.service
- Run the following command on the Bastion Host to disable the bastion_http_server service:
$ sudo systemctl disable bastion_http_server.service
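To confirm that the service is both stopped and disabled, you can query systemd. A minimal sketch with the expected responses:
$ systemctl is-active bastion_http_server.service
inactive
$ systemctl is-enabled bastion_http_server.service
disabled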
Verifying LBVM Enabled Cluster
Note:
The procedures provided in this section are applicable to LBVM based deployments only.
The following procedure provides the verification steps for a non-CNC Console authenticated environment. For a CNC Console authenticated environment, the same verification procedure must be followed, except for the step to get the URLs for the common services user interfaces. This is because the CNC Console provides direct links to access the common services user interfaces.
Verify if OpenSearch Dashboards are Running and Accessible
- Run the following commands to get the LoadBalancer IP address for the occne-opensearch-dashboards web interface:
- To retrieve the LoadBalancer IP address of the
occne-opensearch-dashboards
service:
$ export OSD_LOADBALANCER_IP=$(kubectl get services occne-opensearch-dashboards --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}") - To print the complete URL to access
occne-opensearch-dashboardsin an external browser:$ echo http://${OSD_LOADBALANCER_IP}/${OCCNE_CLUSTER}/dashboardSample output:http://10.75.34.2/occne-example/dashboard
- Launch the browser and navigate to OpenSearch Dashboards at http://$OSD_LOADBALANCER_IP/$OCCNE_CLUSTER/dashboard/app/home#.
- From the welcome screen that appears, choose OpenSearch Dashboards to navigate to OpenSearch's home screen.
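For a non-interactive check, you can confirm that the Dashboards endpoint answers over HTTP before launching the browser. A minimal sketch using the variable exported in step 1; a 200 (or a 3xx redirect) indicates the endpoint is reachable:
$ curl -s -o /dev/null -w "%{http_code}\n" http://${OSD_LOADBALANCER_IP}/${OCCNE_CLUSTER}/dashboard/app/home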
Create an Index Pattern Using OpenSearch Dashboards
- On OpenSearch Dashboards GUI, click the Hamburger icon on the top left corner to open the sidebar menu.
- Expand the Management section and select Dashboards Management.
- Select Index Patterns and click Create index pattern.
- Enter "occne-logstash-*" in the Index pattern name field.
- Verify that you get a "Your index pattern matches <n> sources" message.
- Click Next step.
- Select I don't want to use the time filter and click Create index pattern.
- Ensure that the web page containing the indexes appears on the main viewer frame.
- Click the Hamburger icon on the top left corner to open the sidebar menu and select Discover under OpenSearch Dashboard.
- Select your index from the drop-down. Additionally, you can use the
Search field next to the drop-down to filter the key arguments.
The system displays the raw log records.
- To create another index pattern, repeat Steps 3 to 8 using another
valid pattern name instead of
occne-logstash-* and verify that it matches at least one index name.
Verify OpenSearch Dashboards Cluster Health
- On the OpenSearch Dashboards' home page, click Interact with the OpenSearch API to navigate to Dev Tools.
- Enter the command
GET _cluster/health and send the request by clicking the Play icon.
- On the right panel, verify that the value of status is green:
{
"cluster_name": "occne-opensearch-cluster",
"status": "green", # <----- Verify that status is green
"timed_out": false,
"number_of_nodes": 9,
"number_of_data_nodes": 3,
"discovered_master": true,
"discovered_cluster_manager": true,
"active_primary_shards": 13,
"active_shards": 26,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 100
}
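The same health check can be run from a Bastion Host without the GUI by port-forwarding to the OpenSearch HTTP port. A minimal sketch; the service name used here is an assumption, so list the services in occne-infra first to find the correct one:
$ kubectl get svc -n occne-infra | grep opensearch
$ kubectl -n occne-infra port-forward svc/occne-opensearch-cluster-master 9200:9200 &
$ curl -s http://localhost:9200/_cluster/health | python3 -m json.tool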
Verify if Prometheus Alert Manager is Accessible
- Run the following commands to get the
LoadBalancer IP address and port number for Prometheus Alertmanager web
interface:
- Run the following command to retrieve the LoadBalancer IP
address of the Alertmanager
service:
$ export ALERTMANAGER_LOADBALANCER_IP=$(kubectl get services occne-kube-prom-stack-kube-alertmanager --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}") - Run the following command to retrieve the LoadBalancer port
number of the Alertmanager
service:
$ export ALERTMANAGER_LOADBALANCER_PORT=$(kubectl get services occne-kube-prom-stack-kube-alertmanager --namespace occne-infra -o jsonpath="{.spec.ports[*].port}") - Run the following command to print the complete URL for
accessing Alertmanager in an external
browser:
$ echo http://$ALERTMANAGER_LOADBALANCER_IP:$ALERTMANAGER_LOADBALANCER_PORT/$OCCNE_CLUSTER/alertmanagerSample output:http://10.75.34.9/occne-example/alertmanager
- Launch the browser and navigate to the http://$ALERTMANAGER_LOADBALANCER_IP:$ALERTMANAGER_LOADBALANCER_PORT/$OCCNE_CLUSTER/alertmanager URL received in the output of the above commands. Ensure that the Alertmanager GUI is accessible.
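You can also confirm that Alertmanager responds without a browser by querying its v2 API. A minimal sketch; the path prefix mirrors the URL printed above, and a JSON document with cluster and version information is expected:
$ curl -s http://$ALERTMANAGER_LOADBALANCER_IP:$ALERTMANAGER_LOADBALANCER_PORT/$OCCNE_CLUSTER/alertmanager/api/v2/status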
Verify if Alerts are Configured Properly
- Navigate to the Alerts tab of the Prometheus server GUI. Alternatively, you can access the Prometheus Alerts tab using the http://$PROMETHEUS_LOADBALANCER_IP:$PROMETHEUS_LOADBALANCER_PORT/$OCCNE_CLUSTER/prometheus/alerts URL. For the <PROMETHEUS_LOADBALANCER_IP> and <PROMETHEUS_LOADBALANCER_PORT> values, refer to Step 1 of the Verify metrics are scraped and stored in Prometheus section.
properly, check if the following alerts are displayed in the Alerts tab
of the Prometheus
GUI.
BASTION_HOST_ALERTS # ------------------------------------------------- BASTION_HOST_FAILED (0 active) ALL_BASTION_HOSTS_FAILED (0 active) # ------------------------------------------------- CERT_EXPIRATION_ALERTS # ------------------------------------------------- APISERVER_CERTIFICATE_EXPIRATION_90D (0 active) APISERVER_CERTIFICATE_EXPIRATION_60D (0 active) APISERVER_CERTIFICATE_EXPIRATION_30D (0 active) # ------------------------------------------------- COMMON_SERVICES_STATUS_ALERTS # ------------------------------------------------- PROMETHEUS_NODE_EXPORTER_NOT_RUNNING (0 active) OPENSEARCH_CLUSTER_HEALTH_RED (0 active) OPENSEARCH_CLUSTER_HEALTH_YELLOW (0 active) OPENSEARCH_TOO_FEW_DATA_NODES_RUNNING (0 active) OPENSEARCH_DOWN (0 active) PROMETHEUS_DOWN (0 active) ALERT_MANAGER_DOWN (0 active) SNMP_NOTIFIER_DOWN (0 active) JAEGER_DOWN (0 active) METALLB_SPEAKER_DOWN (0 active) METALLB_CONTROLLER_DOWN (0 active) GRAFANA_DOWN (0 active) PROMETHEUS_NO_HA (0 active) ALERT_MANAGER_NO_HA (0 active) PROMXY_METRICS_AGGREGATOR_DOWN (0 active) VCNE_LB_CONTROLLER_FAILED (0 active) VSPHERE_CSI_CONTROLLER_FAILED (0 active) # ------------------------------------------------- HOST_ALERTS # ------------------------------------------------- DISK_SPACE_LOW (0 active) CPU_LOAD_HIGH (0 active) LOW_MEMORY (0 active) OUT_OF_MEMORY (0 active) NTP_SANITY_CHECK_FAILED (0 active) NETWORK_INTERFACE_FAILED (2 active) PVC_NEARLY_FULL (0 active) PVC_FULL (0 active) NODE_UNAVAILABLE (0 active) ETCD_NODE_DOWN (0 active) CEPH_OSD_NEARLY_FULL (0 active) CEPH_OSD_FULL (0 active) CEPH_OSD_DOWN (0 active) # ------------------------------------------------- LOAD_BALANCER_ALERTS # ------------------------------------------------- LOAD_BALANCER_NO_HA (0 active) LOAD_BALANCER_NO_SERVICE (0 active) LOAD_BALANCER_FAILED (0 active) EGRESS_CONTROLLER_NOT_AVAILABLE (0 active) OPENSEARCH_DASHBOARD_DOWN (0 active) FLUENTD_OPENSEARCH_NOT_AVAILABLE (0 active) OPENSEARCH_DATA_PVC_NEARLY_FULL (0 active) # -------------------------------------------------If no alerts are configured, you can manually configure the alerts by creating the Prometheusrule CRD:$ cd /var/occne/cluster/$OCCNE_CLUSTER/artifacts/alerts $ kubectl apply -f occne-alerts.yaml --namespace occne-infra
Verify if Grafana is Accessible
- Run the following commands to get the LoadBalancer IP address for the Grafana web interface:
- Retrieve the LoadBalancer IP address of the Grafana
service:
$ export GRAFANA_LOADBALANCER_IP=$(kubectl get services occne-kube-prom-stack-grafana --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}") - Run the following command to print the complete URL for
accessing Grafana in an external
browser:
$ echo http://$GRAFANA_LOADBALANCER_IP:$GRAFANA_LOADBALANCER_PORT/$OCCNE_CLUSTER/grafanaSample output:http://10.75.34.24/occne-example/grafana
- Launch the browser and navigate to the
http://$GRAFANA_LOADBALANCER_IP/$OCCNE_CLUSTER/grafana URL retrieved in the output of the previous step. Ensure that the Grafana GUI is accessible. The default username and password are admin/admin for first-time access.
- Log in to Grafana.
Create a Dashboard to Visualize the CPU Usage of the Kubernetes Nodes
- Log in to Grafana.
- Click Create your first dashboard and choose + Add visualization.
- On the Query tab, select the
Promxy datasource option from the Data source
drop-down. Promxy is the default metrics aggregator for
Prometheus time series database.
Note:
If Promxy is down, then Prometheus data source can be used temporarily to obtain metric information.
- Click the Code button and enter the following query in
the "Enter a PromQL query..."
textbox:
round((1 - (sum by(kubernetes_node, instance) (node_cpu_seconds_total{mode="idle"}) / sum by(kubernetes_node, instance) (node_cpu_seconds_total))) * 100, 0.01)
round((1 - (sum by(kubernetes_node, instance) (node_cpu_seconds_total{mode="idle"}) / sum by(kubernetes_node, instance) (node_cpu_seconds_total))) * 100, 0.01) - Click the Run queries button. This query displays the CPU usage of all the Kubernetes nodes.
Verifying CNLB Enabled Cluster
Note:
The procedures provided in this section are applicable to CNLB based deployments only.
Verify the Status of CNLB Components
Run the
following commands to confirm if the deployed CNLB application pods
(cnlb-apps) and CNLB manager pods
(cnlb-manager) are in healthy
state:
$ kubectl get all -n <occne-namespace> -l app=cnlb-app
Sample output:
NAME READY STATUS RESTARTS AGE
pod/cnlb-app-5467fb4f6d-gw5sx 1/1 Running 0 20h
pod/cnlb-app-5467fb4f6d-q2nlp 1/1 Running 0 20h
pod/cnlb-app-5467fb4f6d-rljhh 1/1 Running 0 20h
pod/cnlb-app-5467fb4f6d-thrz6 1/1 Running 0 20h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cnlb-app 4/4 4 4 9d
NAME DESIRED CURRENT READY AGE
replicaset.apps/cnlb-app-5467fb4f6d 4 4 4 20h
replicaset.apps/cnlb-app-65b4b7b55c 0 0 0 7d2h
replicaset.apps/cnlb-app-675dbfdfb9 0 0 0 9d
replicaset.apps/cnlb-app-776c797d 0 0 0 7d2h
replicaset.apps/cnlb-app-7c49684f79 0 0 0 9d
Command for CNLB manager pods:
$ kubectl get all -n <occne-namespace> -l app=cnlb-manager
Sample output:
NAME READY STATUS RESTARTS AGE
pod/cnlb-manager-64b9744876-9cns2 1/1 Running 0 7d2h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cnlb-manager 1/1 1 1 9d
NAME DESIRED CURRENT READY AGE
replicaset.apps/cnlb-manager-6d589fbcbb 1 1 1 4d
Verify the Accessibility of Common Services from CNLB Based Deployments
Note:
This step is applicable to CNLB based deployments only.
Verify that the common services are accessible using the CNLB IP addresses defined under [occne:vars] in the occne.ini file while configuring CNLB:
Note:
- Use either curl or a browser to attempt access and verify the service URL.
- Replace http in the following commands with https as applicable.
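In addition to checking each URL individually as shown below, you can loop over all the CNLB addresses from occne.ini. A minimal sketch, assuming GNU grep is available:
$ for ip in $(grep -oP '^occne_\w+_cnlb\s*=\s*\K[0-9.]+' /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini); do echo -n "$ip -> "; curl -s -o /dev/null -w "%{http_code}\n" http://$ip/; done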
$ head -n 20 /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini
...
occne_prom_cnlb = 10.5.18.151
occne_alert_cnlb = 10.5.18.230
occne_graf_cnlb = 10.5.18.199
occne_nginx_cnlb = 10.5.18.253
occne_jaeger_cnlb = 10.5.18.53
occne_opensearch_cnlb = 10.5.18.182
$ curl 10.5.18.151
<a href="/occne-example/prometheus">Found</a>.
$ curl 10.5.18.230
<a href="/occne-example/alertmanager">Found</a>.
$ curl 10.5.18.199
<a href="/occne-example/grafana/login">Found</a>.
$ curl 10.5.18.253
...
<code>/etc/nginx/nginx.conf</code>.</p>
...
$ curl 10.5.18.53
...
// Jaeger version data is embedded by the query-service via search/replace.
...
$ curl 10.5.18.182/${OCCNE_CLUSTER}/dashboard/app/home
...
<title>OpenSearch Dashboards</title>
...
Performing Security Hardening
Introduction
After installation, perform an audit of the CNE system security stance before placing the system into service. The audit primarily consists of changing credentials and sequestering SSH keys to the trusted servers. The following table lists all the credentials to be checked, changed, or retained:
Note:
Refer to this section if you are performing bare metal installation.
Table 2-7 Credentials
| Credential Name | Type | Associated Resource | Initial Setting | Credential Rotation |
|---|---|---|---|---|
| TOR Switch | username/password | Cisco Top of Rack Switch | username/password from PreFlight Checklist | Reset post-install |
| HP ILO Admin | username/password | HP Integrated Lights Out Manager | username/password from PreFlight Checklist | Reset post-install |
| Oracle ILOM user | username/password | Oracle Integrated Lights-out Manager | username/password from PreFlight Checklist | Reset post-install |
| Server Super User (root) | username/password | Server Super User | Set to well-known Oracle default during server installation | Reset post-install |
| Server Admin User (admusr) | username/password | Server Admin User | Set to well-known Oracle default during server installation | Reset post-install |
| Server Admin User SSH | SSH Key Pair | Server Admin User | Key Pair generated at install time | Can rotate keys at any time; key distribution manual procedure |
If Factory or Oracle defaults were used for any of these credentials, they must be changed before placing the system into operation. You must then store these credentials in a safe and secure way, off site. It is recommended to plan a regular schedule for updating (rotating) these credentials.
Prerequisites
This procedure is performed after the site has been deployed and prior to placing the site into service.
Limitations and Expectations
The focus of this procedure is to secure the various credentials used or created during the install procedure. There are additional security audits that the CNE operator must perform, such as scanning repositories for vulnerabilities, monitoring the system for anomalies, and regularly checking security logs. These audits are outside the scope of this post-installation procedure.
References
- Nexus commands to configure Top of Rack switch username and password: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/6-x/security/configuration/guide/b_Cisco_Nexus_9000_Series_NX-OS_Security_Configuration_Guide/b_Cisco_Nexus_9000_Series_NX-OS_Security_Configuration_Guide_chapter_01001.html
- See ToR switch procedure for initial username and password configuration: Configure Top of Rack 93180YC-EX Switches
- See procedure to configure initial iLO/OA username and password: Configure Addresses for RMS iLOs
Procedure
- Reset credentials on the TOR Switch:
Note:
The following commands were tested in a laboratory environment on Cisco switches and may differ from other versions of Cisco IOS/NX-OS and other brands.
- From the Bastion Host, log
in to the switch with username and password from the
procedure:
[bastion host]$ ssh <username>@<switch IP address>
User Access Verification
Password: <password>
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
...
...
switch> enable
switch#
current username:
switch# configure terminal Enter configuration commands, one per line. End with CNTL/Z. switch(config)# username <username> password <newpassword> - Create a new username:
switch(config)# username <new-username> password <new-password> role [network-operator|network-admin|vdc-admin|vdc-operator] - Save the changes and exit from the switch and
log in with the new username and password to verify if the new user
was created:
switch(config)# exit switch# write memory Building configuration [OK] switch# exit Connection to <switch IP address> closed. [bastion host]$ [some server]$ ssh <new-username>@<switch IP address> User Access Verification Password: <new-password> Cisco Nexus Operating System (NX-OS) Software TAC support: http://www.cisco.com/tac ... ... switch# - Delete the previous old
username if it is not needed:
switch# configure terminal Enter configuration commands, one per line. End with CNTL/Z. switch(config)# no username <username> switch(config)# - Set the secret for the exec mode:
switch(config)# enable secret <new-enable-password> switch(config)# exit switch# - Save the above
configuration:
switch# copy running-config startup-config Building configuration [OK] switch#
- Reset credentials for the HP ILO Admin Console:
- From the Bastion Host, log
in to the iLO with username and password from the procedure:
[bastion host]$ ssh <username>@<iLO address>
<username>@<iLO address>'s password: <password>
User:<username> logged-in to ...(<iLO address> / <ipv6 address>)
Integrated Lights-Out 5
iLO Advanced 3.02 at Feb 22 2024
Server Name: <server name>
Server Power: On
</>hpiLO->
the current username:
</>hpiLO-> set /map1/accounts1/<username> password=<newpassword> status=0 status_tag=COMMAND COMPLETED Tue Aug 20 13:27:08 2019 </>hpiLO-> - Create a new user:
</>hpiLO-> create /map1/accounts1 username=<newusername> password=<newpassword> group=admin,config,oemHP_rc,oemHP_power,oemHP_vm status=0 status_tag=COMMAND COMPLETED Tue Aug 20 13:47:56 2019 User added successfully. - Exit from the iLOM and log
in with the new username and password to verify if the new change
works:
</>hpiLO-> exit status=0 status_tag=COMMAND COMPLETED Thu Jun 19 21:56:31 2025 CLI session stopped Received disconnect from <iLO address> port 22:11: Client Disconnect Disconnected from <iLO address> port 22 [bastion host]$ ssh <newusername>@<iLO address> <newusername>@<iLO address>'s password: <newpassword> User:<newusername> logged-in to ...(<iLO address> / <ipv6 address>) iLO Advanced 2.61 at Jul 27 2018 Server Name: <server name> Server Power: On </>hpiLO-> - Delete the previous old
username if it is not needed:
</>hpiLO-> delete /map1/accounts1/<username> status=0 status_tag=COMMAND COMPLETED Tue Aug 20 13:59:04 2019 User deleted successfully.
- Reset credentials for the Netra iLOM user console:
- From the Bastion Host, log in to the
iLOM:
[bastion host]$ ssh <username>@<iLOM address>
Password:
Oracle(R) Integrated Lights Out Manager
Version 4.0.4.51 r134837
Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
Warning: password is set to factory default.
Warning: HTTPS certificate is set to factory default.
Hostname: ORACLESP-2114XLB026
->
password:
-> set /SP/users/<currentuser> password Enter new password: ******** Enter new password again: ******** - Create a new
user:
-> create /SP/users/<username> Creating user... Enter new password: **** create: Non compliant password. Password length must be between 8 and 16 characters. Enter new password: ******** Enter new password again: ******** Created /SP/users/<username> - Exit from the iLO and log in as the new user (created
in step c) with the new username and password to verify if the new
change
works:
-> exit Connection to <iLOM address> closed. [bastion host]$ ssh <newusername>@<iLOM address> Password: Oracle(R) Integrated Lights Out Manager Version 4.0.4.51 r134837 Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved. Warning: password is set to factory default. Warning: HTTPS certificate is set to factory default. Hostname: ORACLESP-2114XLB026 -> - Delete the previous user if not
needed:
-> delete /SP/users/<non-needed-user> Are you sure you want to delete /SP/users/<non-needed-user> (y/n)? y Deleted /SP/users/<non-needed-user> ->
- Procedure for vCNE and BareMetal:
Reset credentials for the root account on each server. To reset the credential for the root account, log in to each server in the cluster (ssh root@cluster_host) and run the following command:
Note:
The password must be at least 14 characters long and must contain 1 uppercase letter, 1 digit, and 1 non-alphanumeric character.
$ sudo passwd root
Changing password for user root.
New password:
Retype new password:
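To apply the change across all hosts from the Bastion Host, you can loop over the nodes. A minimal sketch; the host list follows the ${OCCNE_CLUSTER}-<node> naming convention used elsewhere in this procedure and must be adjusted to your deployment:
$ for x in k8s-ctrl-1 k8s-ctrl-2 k8s-ctrl-3 k8s-node-1 k8s-node-2 k8s-node-3 k8s-node-4; do ssh -t ${OCCNE_CLUSTER}-"$x" 'sudo passwd root'; done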
- Regenerate or redistribute SSH key credentials for the user account:
- Log in to the Active Bastion Host VM and run the following command to
generate a new cluster-wide key-pair in the cluster directory as
user:
> ssh <user>@<bastion-host-IP>
$ is_active_bastion
IS active-bastion
$ ssh-keygen -b 4096 -t rsa -C "New SSH Key" -f /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa -q -N ""
authorized_keyfile of every node:- For LBVM deployment, run the following
command:
$ for x in bastion-1 bastion-2 k8s-ctrl-1 k8s-ctrl-2 k8s-ctrl-3 k8s-node-1 k8s-node-2 k8s-node-3 k8s-node-4 oam-lbvm1 oam-lbvm2; do ssh-copy-id -i /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa ${OCCNE_CLUSTER}-"$x"; done - For CNLB deployment, run the following
command:
$ for x in bastion-1 bastion-2 k8s-ctrl-1 k8s-ctrl-2 k8s-ctrl-3 k8s-node-1 k8s-node-2 k8s-node-3 k8s-node-4; do ssh-copy-id -i /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa ${OCCNE_CLUSTER}-"$x"; done
- For LBVM deployment, run the following
command:
- Perform the following steps from Bastion to copy the content of
new_occne_id_rsa.pub to id_rsa.pub and new_occne_id_rsa to id_rsa. Also, rename the new_occne_id_rsa and new_occne_id_rsa.pub keys to occne_id_rsa and occne_id_rsa.pub respectively.
- Copy the new public key to the
$home/.ssh directory:
$ cp /var/occne/cluster/<cluster-name>/.ssh/new_occne_id_rsa.pub /home/<user>/.ssh/id_rsa.pub
- Copy the new private key to the
$home/.ssh directory:
$ cp /var/occne/cluster/<cluster-name>/.ssh/new_occne_id_rsa /home/<user>/.ssh/id_rsa
- Replace old key with the new key in the cluster
directory:
$ mv /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa
$ mv /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa.pub /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa.pub
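After the rotation, you can confirm that the new key is accepted on every node. A minimal sketch; BatchMode makes ssh fail loudly instead of prompting if key authentication does not work:
$ for x in bastion-1 bastion-2 k8s-ctrl-1 k8s-ctrl-2 k8s-ctrl-3 k8s-node-1 k8s-node-2 k8s-node-3 k8s-node-4; do ssh -o BatchMode=yes -i /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa ${OCCNE_CLUSTER}-"$x" hostname; done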
- Modify SSH key for LBVM:
- Run the following command to delete the lb-controller-ssh-key secret:
$ kubectl -n occne-infra delete secret lb-controller-ssh-key
lb-controller-ssh-keydeleted secret using the private key created in the previous step.kubectl -n occne-infra create secret generic lb-controller-ssh-key --from-file=/var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa - Run the following commands to restart the
occne-lb-controller-serverdeployment:$ kubectl rollout restart deployment occne-lb-controller-server -n occne-infraSample output:deployment.apps/occne-lb-controller-server restarted
- Run the following command to to delete the
Activating Optional Features Post Installation
This section provides information about activating optional features, such as Velero, Local DNS, and floating IP post installation.
Dedicated CNLB nodes
Kubernetes supports taints on nodes. This step is needed when CNLB pods are to be run exclusively on specific nodes. Executing this procedure taints those nodes, and only CNLB pods (with the matching toleration) will be allowed to run on them. For more information about dedicating CNLB nodes, see Dedicating CNLB Pods to Specific Worker Nodes.
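For reference, the procedure is built on standard Kubernetes taints and tolerations; the following generic sketch illustrates the mechanism only (the taint key and value here are illustrative, not the exact ones CNE uses; see the referenced procedure for the correct values):
$ kubectl taint nodes <worker-node-name> cnlb=dedicated:NoSchedule
$ kubectl describe node <worker-node-name> | grep -A1 Taints
Only pods that carry a matching toleration (and, typically, a node selector or affinity for those nodes) are then scheduled on the tainted node.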
Activating Velero
Velero is used for performing on-demand backups and restoring CNE cluster data. Velero is an optional feature and has an extra set of hardware and networking requirements. You can activate Velero after installing CNE. For more information about activating Velero, see Activating Velero.
Activating Local DNS
The Local DNS feature is a reconfiguration of core DNS (CoreDNS) to support external hostname resolution. When Local DNS is enabled, CNE routes the connection to external hosts through core DNS rather than the nameservers on the Bastion Hosts. For information about activating this feature, see the "Activating Local DNS" section in Oracle Communications Cloud Native Core, Cloud Native Environment User Guide.
To stop DNS forwarding to Bastion DNS, you must define the DNS details through 'A' records and SRV records. A records and SRV records are added to CNE cluster using Local DNS API calls. For more information about adding and deleting DNS records, see the "Adding and Removing DNS Records" section in Oracle Communications Cloud Native Core, Cloud Native Environment User Guide.
Enabling or Disabling Floating IP in OpenStack
Floating IPs are additional public IP addresses that are associated with instances such as control nodes, worker nodes, Bastion Host, and LBVMs. Floating IPs can be quickly reassigned and switched from one instance to another using an API interface, thereby ensuring high availability and less maintenance. You can activate the Floating IP feature after installing CNE. For information about enabling or disabling the Floating IP feature, see Enabling or Disabling Floating IP in OpenStack.
Verifying LBVM HTTP Server
This section provides information about verifying the ports in LBVM HTTP server.
CNE runs an HTTP server service (lbvm_http_server.service) on port 8887
of each LBVM. Ensure that you don’t deploy any LoadBalancer service using the TCP port
8887 on the LB as lbvm_http_server.service listens on this port.
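You can confirm that no LoadBalancer service claims TCP port 8887. A minimal sketch:
$ kubectl get svc --all-namespaces -o jsonpath='{range .items[?(@.spec.type=="LoadBalancer")]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.spec.ports[*].port}{"\n"}{end}' | grep -w 8887 || echo "port 8887 is free"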
Upgrading Grafana Post Installation
This section provides information about upgrading Grafana to a custom version post installation.
After installing CNE, depending on your requirement, you can upgrade Grafana to a custom version (for example, 11.2.x). To do so, perform the procedure in the Upgrading Grafana section.