2 Installing CNE

This chapter provides information about installing Oracle Communications Cloud Native Core, Cloud Native Environment (CNE). CNE can be deployed onto dedicated hardware, referred to as a baremetal CNE, or deployed onto Virtual Machines, referred to as a virtualized CNE.

Regardless of which deployment platform is selected, CNE installation is highly automated. A collection of container-based utilities automate the provisioning, installation, and configuration of CNE. These utilities are based on the following automation tools:

  • PXE helps to reliably automate the process of provisioning the hosts with a minimal operating system.
  • Terraform creates the virtual resources that are used to host the virtualized CNE.
  • Kubespray helps reliably install a base Kubernetes cluster, including all dependencies such as etcd, using the Ansible provisioning tool.
  • Ansible is used to deploy and manage a collection of operational tools (Common Services) that are:
    • provided by open source third party products such as Prometheus and Grafana
    • built from source and packaged as part of CNE releases such as Oracle OpenSearch and Oracle OpenSearch Dashboard
  • Kyverno Policy management is used to enforce security posture in CNE.
  • Helm is used to deploy and configure common services such as Prometheus, Grafana, and OpenSearch.

Note:

  • Ensure that the shell is configured with Keepalive to avoid unexpected timeout.
  • The 'X' in Oracle Linux X or OLX in the installation procedures indicates the latest version of Oracle Linux supported by CNE.
  • CNE 24.2.0 replaces Terraform with OpenTofu when you freshly install CNE with Cloud Native Load Balancer (CNLB). For vCNE instances deployed using Terraform, CNE 24.2.0 continues to use and support Terraform for upgrade and maintenance.

Preinstallation Tasks

This section describes the procedures to perform before installing an Oracle Communications Cloud Native Environment, referred to in these installation procedures as CNE.

Sizing Kubernetes Cluster

CNE deploys a Kubernetes cluster to host application workloads and common services. The following table provides the minimum Kubernetes cluster sizing (node counts) required for production deployments of CNE.

Note:

Deployments that do not meet the minimum sizing requirements may not operate correctly when maintenance operations (including upgrades) are performed on the CNE.

Table 2-1 Kubernetes Cluster Sizing

Node Type                      Minimum Required    Maximum Allowed
Kubernetes controller node     3                   3
Kubernetes node                6                   100

Sizing Prometheus Persistent Storage

Prometheus stores metrics in Kubernetes persistent storage. Use the following calculations to reserve the correct amount of persistent storage during installation so that Prometheus can store metrics for the desired retention period.

Calculating metrics storage requirements
Prometheus creates a Kubernetes Persistent Volume Claim (PVC) to reserve persistent storage for metrics. To reserve the correct amount of persistent storage, determine the size of the metrics stored by Prometheus each day. This value varies based on the deployed NFs. For each NF's daily metrics storage requirement at various ingress traffic rates, see the related NF documentation. Record the daily metrics storage requirement (at the expected traffic rate) for each NF deployed in the CNE instance. This value, plus the daily CNE metrics storage requirement (given below), provides the expected metrics storage growth per day. To determine the total amount of storage that must be reserved for Prometheus, use the following formula:
total_metrics_storage = (nf_metrics_daily_growth + occne_metrics_daily_growth) * metrics_retention_period * 1.2

Note:

  • An extra 20% storage is reserved to allow for a future increase in metrics growth.
  • It is recommended to maintain the metrics_retention_period at the default value of 14 days.
  • If the resulting storage size as per the above formula is greater than 500GB, then the retention period must be reduced until the resulting value is less than 500GB.
  • Ensure that the retention period is set to more than 3 days.
  • The default value for Prometheus persistent storage is 8GB.
  • If the total_metrics_storage value calculated using the above formula is less than 8GB, then use the default value.
Example:
One NF is installed on the CNE instance. From the NF's documentation, it is known that the NF generates 150 MB of metrics data per day at the expected ingress signaling traffic rate. Using the formula above:

metrics_retention_period = 14 days

nf_metrics_daily_growth = 150 MB/day (from NF documentation)

occne_metrics_daily_growth = 144 MB/day (from calculation below)

(0.15 GB/day + 0.144 GB/day) * 14 days * 1.2 = 5 GB (rounded up)

Since this is less than the default value of 8 GB, use 8 GB as the total_metrics_storage value.

Note:

After determining the required metrics storage, record the total_metrics_storage value for later use in the installation procedures.

Calculating CNE metrics daily storage growth requirements

CNE stores varying amounts of metrics data each day depending on the size of the Kubernetes cluster deployed in the CNE instance. To determine the correct occne_metrics_daily_growth value for the CNE instance, use the formula:

occne_metrics_daily_growth = 36 MB * num_kubernetes_nodes
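
As an illustration only, the following shell sketch (not part of the CNE tooling) applies the formulas above using the example values from this section; replace the input values with those for your deployment.

  # Inputs (example values from this section)
  nf_metrics_daily_growth_gb=0.15     # from the NF documentation
  num_kubernetes_nodes=4              # 36 MB * 4 nodes = 144 MB/day
  metrics_retention_period_days=14    # recommended default

  awk -v nf="$nf_metrics_daily_growth_gb" \
      -v nodes="$num_kubernetes_nodes" \
      -v days="$metrics_retention_period_days" 'BEGIN {
    occne = 0.036 * nodes                  # occne_metrics_daily_growth in GB
    total = (nf + occne) * days * 1.2      # total_metrics_storage formula
    if (total < 8) total = 8               # use the 8 GB default as a floor
    printf "total_metrics_storage = %.0f GB\n", total
  }'

With the example values, the sketch prints 8 GB, matching the worked example above.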

Sizing Oracle OpenSearch Persistent Storage

Oracle OpenSearch data nodes store logs and traces in Kubernetes persistent storage. Use the following calculations to reserve the correct amount of persistent storage during installation so that Oracle OpenSearch can store logs and traces for the desired retention period.

Note:

OpenSearch master nodes do not store any user data (logs/traces). Data ingestion has been explicitly disabled on master nodes, and PVCs attached to them are no longer used for data storage.

As a result:

  • Master PVC size is fixed at 1Gi by default.
  • Master PVC size cannot be resized, as these PVCs no longer serve a data ingestion purpose.
  • All log and trace data is now stored exclusively on hot and warm data nodes.
  • It is recommended to provision at least five OpenSearch data nodes to ensure adequate storage and high availability.
Calculating log+trace storage requirements
Oracle OpenSearch creates a Kubernetes Persistent Volume Claim (PVC) to reserve persistent storage for logs and traces. OpenSearch master nodes do not require PVCs for storing logs or traces. To reserve the correct amount of persistent storage, determine the size of the logs and traces stored by Oracle OpenSearch each day. This value varies based on the deployed NFs and the ingress signaling traffic rate. For each NF's daily log storage requirement at various ingress traffic rates, see the related NF documentation. Record the daily log storage requirement (at the expected traffic rate) for each NF deployed in the CNE instance. This value, plus the daily CNE log storage requirement (given below), provides the expected log storage growth per day. The amount of trace storage also needs to be calculated (see below). To determine the amount of storage that must be reserved for Oracle OpenSearch, use the following formula:
log_trace_daily_growth = (nf_logs_daily_growth + nf_trace_daily_growth)
Every day, CNE creates a new set of indices for CNE logs, NF logs, and traces. Data nodes (hot and warm) store these indices; data for the active indices is stored on hot data nodes. OpenSearch master nodes are stateless with respect to data storage and do not require any PVCs. Use the following formula to determine the storage requirement for active and inactive index storage.
log_trace_active_storage = log_trace_daily_growth * (log_trace_retention_period + 1) * 1.2

Note:

  • An extra day's worth of storage is allocated on the hot data nodes; it is used when deactivating the old daily indices.
  • An extra 20% storage is reserved to allow for a future increase in logging growth.
  • It is recommended to maintain the log_trace_retention_period at the default value of 7 days.
  • If the resulting storage size as per the above formula is greater than 500GB, then the retention period must be reduced until the resulting value is less than 500GB.
  • Ensure that the retention period is set to more than 3 days.
  • The default value of log_trace_active_storage for Oracle OpenSearch persistent storage is 10Gi.
  • If the log_trace_active_storage value calculated using the above formula is less than 10Gi, then use the default value.
For example, one NF is installed on the CNE instance and generates 150 MB of log data per day at the expected ingress signaling traffic rate. The following are the storage requirements using the given formula:
log_trace_retention_period = 7

nf_mps_rate = 200 msgs/sec

nf_logs_daily_growth = 150 MB/day (from NF documentation)

nf_trace_daily_growth = 500 MB/day (from section below)

log_trace_daily_growth = (0.15 GB/day + 0.50 GB/day) = 0.65 GB/day

log_trace_active_storage = 0.65*(7+1)*1.2 = 6.24 GB

Note:

After determining the required logs and trace storage, record the log_trace_active_storage value for later use in the installation procedures.
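
As an illustration only, the following shell sketch (not part of the CNE tooling) applies the log and trace storage formulas using the example values above; replace the inputs with the values for your deployment.

  nf_logs_daily_growth_gb=0.15        # from the NF documentation
  nf_trace_daily_growth_gb=0.50       # from the trace calculation below
  log_trace_retention_period_days=7   # recommended default

  awk -v logs="$nf_logs_daily_growth_gb" \
      -v traces="$nf_trace_daily_growth_gb" \
      -v days="$log_trace_retention_period_days" 'BEGIN {
    daily  = logs + traces                 # log_trace_daily_growth
    active = daily * (days + 1) * 1.2      # log_trace_active_storage formula
    printf "log_trace_active_storage = %.2f GB\n", active
    if (active < 10) print "below the 10Gi default: use the default value"
  }'

With these values the sketch prints 6.24 GB, matching the worked example, so the 10Gi default applies.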

Calculating NF trace data daily storage growth requirements

NFs store varying amounts of trace data each day, depending on the ingress traffic rate, the trace sampling rate, and the error rate for handling ingress traffic. The default trace sampling rate is 0.01%. Space is reserved for 10M trace records per NF per day (an amount equivalent to a 1% trace sampling rate), using 50 bytes as the average record size (as measured during testing). 1% is used instead of 0.01% to account for the capture of error scenarios and overhead.

nf_trace_daily_num_records = 10M records/day

nf_trace_avg_record_size = 50 bytes/record

nf_trace_daily_growth = 10M records/day * 50 bytes/record = 500 MB/day

Record the value log_trace_daily_growth for later use in the installation procedures.

Note:

  • Ensure that the trace sampling rate is set to less than 1% under normal circumstances. Collecting a higher percentage of traces causes Oracle OpenSearch to respond more slowly and impacts the performance of the CNE instance. If you want to collect a higher percentage of traces, contact My Oracle Support.
  • CNE does not generate any traces, so no additional storage is reserved for CNE trace data.
  • CNE platform logs are disabled by default, so no additional storage is reserved for CNE logs data.
  • Master PVC size is fixed at 1Gi by default and must not be modified or resized.

Configuring OpenStack LB Controller Environment Variables

This section provides information about configuring the OpenStack Load Balancer (LB) controller environment variables.

You can update the following variables on an OpenStack occne-lb-controller-server deployment to adjust how port recovery takes place, how often the health check runs, and the log level. The following table provides the recommended default values that are set before installation for a standard deployment.

Note:

Change these variables after a deployment only if required, such as when issues occur during port recovery (after a switchover) or when recommended by Oracle Support.

Table 2-2 OpenStack LB Controller Variables

OPENSTACK_MAX_PARALLEL (default: 0)
  The number of OpenStack API port calls made in parallel. The default value 0 indicates that the OpenStack API calls are run for all ports at once. This variable is helpful when rate limiting is enabled at the OpenStack LB controller level or when the OpenStack LB controller cannot process multiple API requests in a short period of time.

OPENSTACK_PORT_API_RETRY (default: 5)
  The number of times the system retries the port detach API call on a given port before deleting the port, creating a new port (using the same name and IP address), and attaching that port to the newly ACTIVE LBVM during a switchover. If this variable is set to 0, the ports are deleted and recreated immediately without attempting to detach them (this is referred to as forced detachment and must be avoided unless it is absolutely necessary due to underlying issues with the OpenStack LB controller).

OPENSTACK_PORT_API_TIMEOUT (default: 2)
  The time (in seconds) between attempts to detach a port using the OpenStack API call.

LB_MON_REQ_TIMEOUT (default: 2)
  The time (in seconds) between the LB controller monitor health checks on the LBVMs across all pools.

LOG_LEVEL (default: INFO)
  Sets the log level of the LB controller. Additional level: DEBUG.
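
As an illustration only (the occne-infra namespace is an assumption; use the namespace in which the occne-lb-controller-server deployment actually runs), these variables can be listed and updated on a running deployment with kubectl:

  # List the current values of the LB controller environment variables
  kubectl -n occne-infra set env deployment/occne-lb-controller-server --list

  # Update selected variables; the deployment rolls out new pods with the change
  kubectl -n occne-infra set env deployment/occne-lb-controller-server \
      LOG_LEVEL=DEBUG OPENSTACK_PORT_API_RETRY=5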

Generating Root CA Certificate

To use an intermediate Certificate Authority (CA) as an issuer for Istio service mesh mTLS certificates, a signing certificate and key from an external CA must be generated. The generated certificate and key values must be base64 encoded.

For more information about generating the required certificate and key, see the Certificate Authority documentation.
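
The exact commands depend on the external CA in use. As an illustrative sketch only, the following OpenSSL commands generate a self-signed root CA, issue an intermediate signing certificate and key from it, and base64-encode the results; all file names and subject fields are placeholders.

  # Root CA (placeholder subject)
  openssl genrsa -out root-ca.key 4096
  openssl req -x509 -new -key root-ca.key -sha256 -days 3650 \
      -subj "/O=Example/CN=Example Root CA" -out root-ca.crt

  # Intermediate signing certificate and key, signed by the root CA
  openssl genrsa -out intermediate-ca.key 4096
  openssl req -new -key intermediate-ca.key \
      -subj "/O=Example/CN=Example Intermediate CA" -out intermediate-ca.csr
  printf "basicConstraints=critical,CA:TRUE\nkeyUsage=critical,keyCertSign,cRLSign\n" > ca-ext.cnf
  openssl x509 -req -in intermediate-ca.csr -CA root-ca.crt -CAkey root-ca.key \
      -CAcreateserial -days 1825 -sha256 -extfile ca-ext.cnf -out intermediate-ca.crt

  # Base64 encode (single line) for use in the installation configuration
  base64 -w0 intermediate-ca.crt > intermediate-ca.crt.b64
  base64 -w0 intermediate-ca.key > intermediate-ca.key.b64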

Configuring GRUB Password

This section provides information about configuring GRUB password in all hosts of a cluster.

To configure the GRUB password, add the occne_grub_password variable to the [occne:vars] section of the occne.ini or hosts.ini file used for the Ansible configuration.

Set the value of the occne_grub_password variable to the required password. Before setting a password, ensure that the password you choose complies with the following conditions (a quick verification sketch follows the list):
  • The password must contain at least eight characters.
  • The password must contain uppercase and lowercase characters.
  • The password must contain at least one special character, excluding single and double quotes. For example: ~ @ # ^ * - _ + [ { } ] : . / ? % = !
  • The password must contain at least two digits.
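As a quick, hypothetical sanity check (not part of the CNE tooling), a shell helper such as the following can verify a candidate password against these rules before it is placed in the occne.ini file:

  check_grub_password() {
    local pw="$1"
    [ "${#pw}" -ge 8 ]                         || { echo "too short"; return 1; }
    printf '%s' "$pw" | grep -q '[A-Z]'        || { echo "needs an uppercase letter"; return 1; }
    printf '%s' "$pw" | grep -q '[a-z]'        || { echo "needs a lowercase letter"; return 1; }
    printf '%s' "$pw" | grep -q "[\"']"        && { echo "quotes are not allowed"; return 1; }
    printf '%s' "$pw" | grep -q '[^A-Za-z0-9]' || { echo "needs a special character"; return 1; }
    [ "$(printf '%s' "$pw" | grep -o '[0-9]' | wc -l)" -ge 2 ] || { echo "needs at least two digits"; return 1; }
    echo "password meets the listed rules"
  }
  check_grub_password 'Example#Passw0rd24'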
The following example shows the occne_grub_password variable in the occne.ini file:
################################################################################
#                                                                              #
# Copyright (c) 2024 Oracle and/or its affiliates. All rights reserved.        #
#                                                                              #
################################################################################

################################################################################
# OCCNE Cluster occne.ini file. Defines OCCNE deployment variables

[occne:vars]
occne_grub_password=TheGrubPassword2024

...

BareMetal Installation

This section provides information about installing CNE on dedicated BareMetal hardware.

Note:

  • Before installing CNE in BareMetal, you must complete the preinstallation tasks.
  • CNE supports the following Load Balancers for traffic segregation:
    • Standard MetalLB
    • Cloud Native Load Balancer (CNLB)
    You can choose the type of Load Balancer used for traffic segregation while installing CNE on BareMetal. The configurations for installing CNE vary depending on the type of Load Balancer you choose. Therefore, read the procedure carefully and configure CNE depending on the Load Balancer type.

Overview

This section provides information about frames, components, and creating CNE instances.
Frame and Component

The initial release of the CNE system provides support for on-prem deployment to a very specific target environment consisting of a frame holding switches and servers. This section describes the layout of the frame and the roles performed by the racked equipment.

Frame

The physical frame comprises DL380 or DL360 rack mount servers and two Top of Rack (ToR) Cisco switches. The frame components are added from the bottom up; thus, the designations in the next section are numbered from the bottom of the frame to the top.

Figure 2-1 Frame Overview



Host Designations

Each physical server has a specific role designation within the CNE solution.

Figure 2-2 Host Designations



Node Roles

Along with the primary role of each host, you can assign a secondary role. The secondary role can be software related or, in the case of the Bootstrap Host, hardware related, as it has unique out-of-band (OOB) connections to the ToR switches.

Figure 2-3 Node Roles



Transient Roles

RMS1 has unique out-of-band (OOB) connections to the ToR switches, which gives it the designation of Management Host. This role is relevant only during initial switch configuration and fault recovery of a switch. RMS1 also has a transient role as the Installer Bootstrap Host, which applies only during the initial installation of the frame and is then used to perform the official install on RMS2. Later, this host is re-paved to its K8s Master role.

Figure 2-4 Transient Roles



Creating CNE Instance

This section describes the procedures to create the CNE instance at a customer site. The following diagram shows the installation context:

Figure 2-5 CNE Installation Overview



Following is the basic installation flow to understand the overall effort:

  1. Check that the hardware is on-site, properly cabled, and powered up.
  2. Pre-assemble the basic equipment needed to perform a successful install:
    1. Identify
      1. Download and stage software and other configuration files using the manifests.
      2. Identify the layer 2 (MAC) and layer 3 (IP) addresses for the equipment in the target frame.
      3. Identify the addresses of key external network services, for example, NTP, DNS, and so on.
      4. Verify or set all of the credentials for the target frame hardware to known settings.
    2. Prepare
      1. Software Repositories: Load the various SW repositories (YUM, Helm, Docker, and so on) using the downloaded software and configuration files.
      2. Configuration Files: Populate the hosts inventory file with credentials and layer 2 and layer 3 network information, the switch configuration files with assigned IP addresses, and the YAML files with appropriate information.
  3. Bootstrap the System:
    1. Manually configure a Minimal Bootstrapping Environment (MBE): perform the minimal set of manual operations to enable networking and the initial loading of a single Rack Mount Server (RMS1), the transient Installer Bootstrap Host. In this procedure, the minimal set of packages needed to configure switches, iLOs, and the PXE boot environment, and to provision RMS2 as a CNE Storage Host, is installed.
    2. Using the newly constructed MBE, automatically create the first Bastion Host on RMS2.
    3. Using the newly constructed Bastion Host on RMS2, automatically deploy and configure the CNE on the other servers in the frame.
  4. Final Steps:
    1. Perform postinstallation checks.
    2. Perform recommended security hardening steps.

Cluster Bootstrapping Overview

The following install procedure describes how to install CNE onto new hardware that does not contain any switch networking configuration or provisioned operating systems. Therefore, the initial step in the installation process is to provision RMS1 (see Installing CNE) as a temporary Installer Bootstrap Host. The Bootstrap Host is configured with a minimal set of packages to configure switches, iLOs, and boot firmware. From the Bootstrap Host, a virtual Bastion Host is provisioned on RMS2. The Bastion Host is then used to provision (and in the case of the Bootstrap Host, re-provision) the remaining CNE hosts, install Kubernetes, and install the Common Services running within the Kubernetes cluster.

Prerequisites

Before installing and configuring CNE on BareMetal, ensure that the following prerequisites are met.

Prerequisites for Oracle X Servers:

Ensure that the Integrated Lights Out Manager (ILOM) firmware of Oracle X8-2 or X9-2 server is up to date. The ILOM firmware is crucial for seamless functioning of CNE and is essential for optimal performance, security, and compatibility. To update the ILOM firmware, perform the steps outlined in the Oracle documentation or contact the system administrator.
Prerequisites for Servers Other than HP and Oracle X:
  • Ensure that the preprovisioned nodes are installed with Oracle Linux 9, with packages from the @core and @base groupings.
  • Ensure that the KVM host nodes for Bastions and k8s-control nodes have sufficient space (at least 300 GB per hosted VM) in their /var volume for the virtual machine drives.
  • Ensure that the worker nodes have an unallocated storage device to be used by Rook/Ceph for the allocation of Kubernetes persistent volumes.
  • Ensure that the initial network setup is complete on the preprovisioned nodes, so that the installer can reach the nodes using ssh with a required interface name of bond0. The following example provides a sample configuration to create a bond0 interface out of two Ethernet interfaces:

    Note:

    Run the following commands as a root user.
    1. Run the following commands to create interfaces for bonding:
      nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad;
      nmcli con add type bond-slave con-name bond0-slave-0 ifname ens1f0 master bond0;
      nmcli con add type bond-slave con-name bond0-slave-1 ifname ens1f1 master bond0;
    2. On the KVM host nodes, run the following commands to create a bridge interface named bondbr0 and assign it to the host node's cluster IP address. This allows the KVM host node to connect with its VM guests.
      nmcli con add type bridge ifname bondbr0 con-name bondbr0 bridge.stp no ipv4.method manual ipv4.addresses ${IP};
      nmcli con mod bond0 connection.slave-type bridge connection.master bondbr0;
    3. A bridge interface is not required for nodes that are not KVM hosts. On these nodes, add the IP address and gateway directly to the bond0 interface:
      nmcli con mod bond0 ipv4.method manual ipv4.addresses ${IP} ipv4.gateway ${GATEWAY};
    4. On the bootstrap node, set up a connection to an OAM vlan with an appropriate IP address and gateway:
      nmcli connection add type bridge ifname vlan${OAM_VLAN}-br con-name vlan${OAM_VLAN}-br connection.autoconnect yes bridge.stp no ipv4.method manual ipv4.addresses ${OAM_IP} ipv4.gateway ${OAM_GATEWAY};
      nmcli connection add type vlan con-name bond0.${OAM_VLAN} dev bond0 id ${OAM_VLAN} master vlan${OAM_VLAN}-br connection.autoconnect yes;
    5. Stop any prior interface that may have the node's cluster IP address from OS installation, and start the new bond0 interface:
      nmcli con down eno1;   
      nmcli con up bond0;
  • Ensure that all the preprovisioned nodes that are to be included in the cluster have a user account (default admusr) with password-less sudo access (a verification sketch follows this list):
    useradd admusr;
    usermod -aG wheel admusr;
    echo "%admusr ALL=(ALL) NOPASSWD: ALL" | tee -a /etc/sudoers;
    passwd admusr;
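Optionally, a quick check like the following (run from the installer host; the node address is a placeholder) confirms that password-less sudo works for admusr on a preprovisioned node:

  ssh admusr@<node-ip> 'sudo -n true && echo "password-less sudo OK"'
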
Configuring Artifact Acquisition and Hosting

CNE requires artifacts from Oracle Software Delivery Cloud (OSDC), Oracle Support (MOS), the Oracle YUM repository, and certain open-source projects. CNE deployment environments are not expected to have direct internet access. Thus, customer-provided intermediate repositories are necessary for the CNE installation process. These intermediate repositories need CNE dependencies to be loaded into them. This section provides the list of artifacts needed in the repositories.

Oracle eDelivery Artifact Acquisition
The CNE artifacts are posted on Oracle Software Delivery Cloud (OSDC) or MOS.

Table 2-3 CNE Artifacts

Artifact: occne-images-25.2.1xx.tgz
Description: CNE Installers (Docker images) from OSDC/MOS
File Type: Tar GZ
Destination Repository: Docker Registry

Artifact: Templates from OSDC/MOS
Description:
  • Switch config files
  • metallb config file
  • snmp mib files
  • sample hosts.ini file
  • deploy.sh script
  • configuration files from MOS
File Type: Config files (.conf, .ini, .yaml, .mib, .sh, .txt)
Destination Repository: Local media
Third Party Artifacts

CNE dependencies needed from open-source software must be available in repositories that are reachable by the CNE installation tools. For an accounting of third party artifacts needed for this installation, see the Artifact Acquisition and Hosting chapter.

Populating MetalLB Configuration

Introduction

The MetalLB resources file (mb_resources.yaml) defines the Border Gateway Protocol (BGP) peers and address pools for MetalLB. The mb_resources.yaml file must be placed in the same directory (/var/occne/<cluster_name>) as the hosts.ini file. This section provides information about configuring the MetalLB resources file.

Note:

The mb_resources.yaml MetalLB resources file is applicable only to MetalLB based deployments; it is not applicable to CNLB based deployments.

Limitations

The IP addresses in the address pools can be specified in three possible formats. Each peer address pool can use a different format from the others if desired.
  • IP List

    A list of IPs (each on a single line) in single quotes in the following format: 'xxx.xxx.xxx.xxx/32'. The IPs do not have to be sequential. The list must contain enough IPs to cover the application's needs.

  • IP Range

    A range of IPs, separated by a dash in the following format: 'xxx.xxx.xxx.xxx - xxx.xxx.xxx.xxx'. The range must cover the number of IPs as needed by the application.

  • CIDR (IP-slash notation)

    A single subnet defined in the following format: 'xxx.xxx.xxx.xxx/nn'. The CIDR must cover the number of IPs as needed by the application.

  • The peer-address IP must be a different subnet from the IP subnets used to define the IPs for each peer address pool.
There is a limit of 10 peer address pools that can be managed within the mb_resources.yaml file.

Configuring MetalLB Pools and Peers

Following is the procedure to configure MetalLB pools and peers:
  1. Add BGP peers and address groups: Referring to the data collected in the Installation Preflight Checklist, add BGP peers (ToRswitchA_Platform_IP, ToRswitchB_Platform_IP) and address groups for each address pool. The address pools list the IP addresses that MetalLB is allowed to allocate.
  2. Edit the mb_resources.yaml file with the site-specific values found in the Installation Preflight Checklist.

    Note:

    The oam peer address pool is required for defining the IPs. Other pools are application specific and can be named to best fit the applications they apply to. The following examples show how oam and signaling are used to define the IPs, each using a different method.

    Example for oam:

    apiVersion: metallb.io/v1beta2
    kind: BGPPeer
    metadata:
      creationTimestamp: null
      name: peer1
      namespace: occne-infra
    spec:
      holdTime: 1m30s
      keepaliveTime: 0s
      myASN: 64512
      passwordSecret: {}
      peerASN: 64501
      peerAddress: <ToRswitchA_Platform_IP>
    status: {}
    ---
    apiVersion: metallb.io/v1beta2
    kind: BGPPeer
    metadata:
      creationTimestamp: null
      name: peer2
      namespace: occne-infra
    spec:
      holdTime: 1m30s
      keepaliveTime: 0s
      myASN: 64512
      passwordSecret: {}
      peerASN: 64501
      peerAddress: <ToRswitchB_Platform_IP>
    status: {}
    ---
    apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    metadata:
      creationTimestamp: null
      name: oam
      namespace: occne-infra
    spec:
      addresses:
      - '<MetalLB_oam_Subnet_IPs>'
      autoAssign: false
    status: {}
    ---
    apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    metadata:
      creationTimestamp: null
      name: <application_specific_peer_address_pool_name>
      namespace: occne-infra
    spec:
      addresses:
      - '<MetalLB_app_Subnet_IPs>'
      autoAssign: false
    status: {}
    ---
    apiVersion: metallb.io/v1beta1
    kind: BGPAdvertisement
    metadata:
      creationTimestamp: null
      name: bgpadvertisement1
      namespace: occne-infra
    spec:
      ipAddressPools:
      - oam
      - <application_specific_peer_address_pool_name>
    status: {}
    Example for signaling:
    apiVersion: metallb.io/v1beta2
    kind: BGPPeer
    metadata:
      creationTimestamp: null
      name: peer1
      namespace: occne-infra
    spec:
      holdTime: 1m30s
      keepaliveTime: 0s
      myASN: 64512
      passwordSecret: {}
      peerASN: 64501
      peerAddress: 172.16.2.3
    status: {}
    ---
    apiVersion: metallb.io/v1beta2
    kind: BGPPeer
    metadata:
      creationTimestamp: null
      name: peer2
      namespace: occne-infra
    spec:
      holdTime: 1m30s
      keepaliveTime: 0s
      myASN: 64512
      passwordSecret: {}
      peerASN: 64501
      peerAddress: 172.16.2.2
    status: {}
    ---
    apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    metadata:
      creationTimestamp: null
      name: oam
      namespace: occne-infra
    spec:
      addresses:
        - '10.75.200.22/32'
        - '10.75.200.23/32'
        - '10.75.200.24/32'
        - '10.75.200.25/32'
        - '10.75.200.26/32'
      autoAssign: false
    status: {}
    ---
    apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    metadata:
      creationTimestamp: null
      name: signalling
      namespace: occne-infra
    spec:
      addresses:
        - '10.75.200.30 - 10.75.200.40'
      autoAssign: false
    status: {}
    ---
    apiVersion: metallb.io/v1beta1
    kind: BGPAdvertisement
    metadata:
      creationTimestamp: null
      name: bgpadvertisement1
      namespace: occne-infra
    spec:
      ipAddressPools:
      - oam
      - signalling
    status: {}

Predeployment Configuration - Preparing a Minimal Bootstrapping Environment

The steps in this section provide the details to establish a minimal bootstrap environment (to support the automated installation of the CNE environment) on the Installer Bootstrap Host using a Keyboard, Video, Mouse (KVM) connection.

Installing Oracle Linux X.x on Bootstrap Host

This procedure defines the steps to install Oracle Linux X.x onto the CNE Installer Bootstrap Host. This host is used to configure the networking throughout the system and install OLX. After OLX installation, the host is repaved as a Kubernetes Master Host in a later procedure.

Note:

Skip this section if you are installing CNE on servers other than HP Gen10 and Oracle X; in that case, run the bootstrap host procedures on the first k8s-host node, which will be the KVM host of the first Kubernetes master and Bastion VMs. The topology in this case remains the same; however, when you are installing CNE on other servers, the system assumes the following:

Prerequisites

  1. USB drive of sufficient size to contain the ISO (approximately 5Gb)
  2. Oracle Linux X.x iso (For example: Oracle Linux 9.x iso) is available
  3. YUM repository file is available
  4. Keyboard, Video, Mouse (KVM) are available

Limitations and Expectations

  1. The configuration of the Installer Bootstrap Host has to be quick and easy. The Installer Bootstrap Host is re-paved with the appropriate OS configuration for cluster and DB operation at a later installation stage. The Installer Bootstrap Host needs a Linux OS and some basic network to start the installation process.
  2. All steps in this procedure are performed using Keyboard, Video, Mouse (KVM).

Bootstrap Install Procedure

  1. Create Bootable USB Media:
    1. On the installer's notebook, download the OLX ISO from the customer's repository.
    2. Push the OLX ISO image onto the USB Flash Drive.

      Since the installer's notebook can be Windows or Linux OS, you must determine the appropriate details to run this task. For a Linux based notebook, insert a USB Flash Drive of the appropriate size into a Laptop (or some other Linux host on which you can copy the ISO), and run the dd command to create a bootable USB drive with the Oracle Linux X ISO (For example: Oracle Linux 9 ISO).

      $ dd if=<path to ISO> of=<USB device path> bs=1M
  2. Install Oracle Linux on the Installer Bootstrap Host:

    Note:

    The following procedure considers installing OL9 and provides the options and commands accordingly. The procedure varies for other versions.
    1. Connect a Keyboard, Video, and Mouse (KVM) into the Installer Bootstrap Host's monitor and USB ports.
    2. Plug the USB flash drive containing the bootable ISO into an available USB port on the Bootstrap host (usually in the front panel).
    3. Reboot the host by momentarily pressing the power button on the host's front panel. The button turns yellow. If the button stays yellow, press the button again. The host automatically boots onto the USB flash drive.

      Note:

      If the host was configured previously and the USB is not available in the boot order's bootable path, the booting process will be unsuccessful.
    4. If the host is unable to boot onto the USB, repeat step 2c, and interrupt the boot process by pressing the F11 key, which displays the Boot Menu.

      If the host has been recently booted with an OL, the Boot Menu displays Oracle Linux at the top of the list. Select Generic USB Boot as the first boot device and proceed.

    5. The host attempts to boot from the USB. The Boot Menu is displayed on the screen. Select Install Oracle Linux 9.x.y and press ENTER. This begins the boot process and the system displays the Welcome screen.

      When prompted for the language to use, select the default setting: English (United States) and click Continue in the lower left corner.

      Note:

      You can also select the second option Test this media & install Oracle Linux 9.x.y. This option first runs the media verification process.
    6. The system displays the INSTALLATION SUMMARY page. The system expects the following settings on the page. If any of these are not set correctly, then select that menu item and make the appropriate changes.
      1. LANGUAGE SUPPORT: English (United States)
      2. KEYBOARD: English (US)
      3. INSTALLATION SOURCE: Local Media
      4. SOFTWARE SELECTION: Minimal Install
      5. INSTALLATION DESTINATION: This must display No disks selected.
    7. Select INSTALLATION DESTINATION to indicate on which drive to install the OS.
    8. Select the drives where the OS is installed.

      Note:

      • The system displays a dialog box if there is no space in the selected drives to install Oracle Linux. When you encounter such a scenario, perform the following steps to clear up space for Oracle Linux installation:
        • Click Reclaim space.
        • Click Delete all.
        • Click Reclaim space.
      • Be aware that the data in the selected drives is lost.
    9. Select DONE. This returns to the INSTALLATION SUMMARY screen.
    10. At the INSTALLATION SUMMARY screen, select ROOT PASSWORD.

      Enter a root password appropriate for this installation.

      It is recommended to use the secure password that the customer provides. This helps minimize the risk of the host being compromised during installation.

    11. At the INSTALLATION SUMMARY screen, select Begin Installation. The INSTALLATION PROGRESS screen is displayed.
    12. After completing the installation process, remove the USB and select Reboot System to complete the installation and boot to the OS on the Bootstrap Host. At the end of the boot process, the Log in prompt appears.
Configuring Host BIOS

Introduction

The following procedure defines the steps to set up the Basic Input Output System (BIOS) changes on the following server types:
  • Bootstrap host uses the KVM. If you are using a previously configured Bootstrap host that can be accessed through the remote HTML5 console, follow the procedure for the remaining servers.
  • All the remaining servers use remote HTML5 console.

The steps can vary based on the server type. Follow the steps specific to the server being configured. Some of the steps require a system reboot; these are indicated in the procedure.

Prerequisites

Limitations and Expectations

  1. Applies to HP Gen10 iLO 5 and Netra X8-2 server only.
  2. Procedures listed here apply to both Bootstrap Host and other servers unless indicated explicitly.

Procedure for Netra X8-2 server

By default, the BIOS of the Netra X8-2 server is set to the factory settings with predefined default values. Do not change the BIOS of a new X8-2 server. If any issue occurs on a new X8-2 server, reset the BIOS to the default factory settings.

To reset the BIOS to the factory settings, do the following:
  1. Log in to https://<netra ilom address>.
  2. Navigate to System Management, select BIOS.
  3. Set the value to Factory from the drop-down list for Reset to Defaults under Settings.
  4. Click Save.

Exposing the System Configuration Utility on a RMS Host

Steps to Configure the Installer Bootstrap Host BIOS

Perform the following steps to launch the HP iLO 5 System Configuration Utility main page from the KVM. This procedure does not provide instructions for connecting the console, as this can differ for each installation.

  1. After providing connections for the KVM to access the console, you must reboot the host by momentarily pressing the power button on the front of the Bootstrap host.
  2. Navigate to the HP ProLiant DL380 Gen10 System Utilities.

    Once the remote console is exposed, reset the system to force it through a restart. When the initial window is displayed, keep pressing the F9 key repeatedly. Once the F9 key is highlighted at the lower left corner of the remote console, the main System Utilities screen eventually appears.

  3. The System Utilities screen is displayed in the remote console.
Launching the System Utility for other RMS servers
Each RMS iLO is assigned an IP address from the installation prerequisites process. Each server can be reached using SSH from the Bootstrap Host login shell on the KVM.

Note:

As CNE 23.4.0 upgraded Oracle Linux to version 9, some OS capabilities were removed for security reasons. This includes the removal of older, insecure cryptographic policies as well as shorter RSA key lengths that are no longer supported. For more information, see Step c.
  1. Perform the following steps to launch the system utility for other RMS servers:
    1. SSH to the RMS using the iLO IP address, and the root user and password previously assigned at the Installation Preflight Checklist. This displays the HP iLO prompt.
      $ ssh root@<rms_ilo_ip_address>
      Using username "root".
      Last login: Fri Apr 19 12:24:56 2019 from 10.39.204.17
      [root@localhost ~]# ssh root@192.168.20.141
      root@192.168.20.141's password:
      User:root logged-in to ILO2M2909004F.labs.nc.tekelec.com(192.168.20.141 / FE80::BA83:3FF:FE47:649C)
      Integrated Lights-Out 5
      iLO Advanced 2.30 at  Aug 24 2020
      Server Name:
      Server Power: On
      </>hpiLO->
    2. Use Virtual Serial Port (VSP) to connect to the blade remote console:
      </>hpiLO->vsp
    3. Power cycle the blade to bring up the System Utilities for that blade.

      Note:

      The System Utility is a text-based version of the one exposed on the RMS through the KVM. You must use the directional (arrow) keys to move between selections, the ENTER key to select, and ESC to return from the current selection.
    4. Access the System Utility by pressing ESC 9.
  2. [Optional]: If you are using OL9, depending on the host that is used to connect to the RMS, you may encounter the following error messages when you connect to the iLOM. These errors occur on OL9 due to the change in security policies:
    Error in Oracle X8-2 or X9-2 server when RSA key length is too short:
    $ ssh root@172.10.10.10
    Bad server host key: Invalid key length
    Error in HP server when legacy crypto policy is not enabled:
    $ ssh root@172.11.11.10
    Unable to negotiate with 172.11.11.10 port 22: no matching key exchange method found. Their offer: diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
    Perform the following steps to resolve these iLOM connectivity issues:

    Note:

    Run the following command on the connecting host only if the host experiences the aforementioned errors while connecting to HP iLO or Oracle iLOM through SSH.
    1. Perform the following steps to reenable Legacy crypto policies to connect to HP iLO using SSH:
      1. Run the following command to enable Legacy crypto policies:
        $ sudo update-crypto-policies --set LEGACY
      2. Run the following command to revert the policies to default:
        $ sudo update-crypto-policies --set DEFAULT
    2. Run the following command to allow a short RSA key length while connecting to an Oracle X8-2 or X9-2 server using SSH:
      $ ssh -o RSAMinSize=1024 root@172.10.10.10
Changing from UEFI Booting Mode to Legacy BIOS Booting Mode
  1. Navigate to the System Utility as per step 1.
  2. Select System Configuration.
  3. Select BIOS/Platform Configuration (RBSU).
  4. Select Boot Options: If the Boot Mode is currently UEFI Mode and you decide to use BIOS Mode, use this procedure to change to Legacy BIOS Mode.

    Note:

    The server reset must go through an attempt to boot before the changes apply.
  5. Select the Reboot Required dialog window to drop back into the boot process. The boot must go into the process of actually attempting to boot from the boot order. Attempting to boot fails as disks are not installed at this point. The System Utility can be accessed again.
  6. After the reboot, when you re-enter the System Utility, the Boot Options page appears.
  7. Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
Changing the System Utility Mode from Legacy BIOS Booting Mode to UEFI Booting Mode
The system supports both Legacy BIOS Booting Mode and UEFI Booting Mode. If you plan to use UEFI Booting mode and the current System Utility is Legacy BIOS Booting Mode, then use the following procedure to switch the booting mode to UEFI Booting Mode:
  1. Navigate to the System Utility as per step 1.
  2. Select System Configuration.
  3. Select BIOS/Platform Configuration (RBSU).
  4. Select Boot Options, then select UEFI Mode from the Boot Mode drop-down list. When the Warning prompt appears, click OK.

    Note:

    The server reset must go through an attempt to boot before the changes apply.
  5. Select the Reboot Required dialog window. Click OK for the warning reboot window.
  6. After the reboot, when you re-enter the System Utility, the Boot Options page appears.

    The Boot Mode is changed to UEFI Mode and the UEFI Optimized Boot has changed to Enabled automatically.

  7. Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
Adding a New User Account to the Server iLO 5 Interface

Note:

Ensure the pxe_install_lights_out_usr and pxe_install_lights_out_passwd fields match as provided in the hosts inventory files created using the template. For more information about inventory file preparation, see Inventory File Preparation.
  1. Navigate to the System Utility as per step 1.
  2. Select System Configuration.
  3. Select iLO 5 Configuration Utility.
  4. Select User Management → Add User.
  5. Select the appropriate permissions. For the root user, set all permissions to YES. Enter root in the New User Name and Login Name fields, then select the Password field and press the Enter key to enter <password> twice.
  6. Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
Forcing PXE to Boot from the First Embedded FlexibleLOM HPE Ethernet 10Gb 2-port Adapter
During host PXE, the DHCP DISCOVER requests from the hosts must be broadcast over the 10Gb port. This step provides details to configure the broadcast to use the 10Gb ports before using the 1Gb ports. Moving the 10Gb port up in the search order speeds up the response from the host servicing the DHCP DISCOVER. Enclosure blades have 2 10GE NICs which default to being configured for PXE booting. The RMS is re-configured to use the PCI NICs using this step.
  1. Navigate to the System Utility as per step 1.
  2. Select System Configuration.
  3. Select BIOS/Platform Configuration (RBSU).
  4. Select Boot Options.
  5. Perform the following steps depending on the current Boot Mode:
    1. Perform the following steps if the current Boot Mode is Legacy BIOS Mode:
      1. Ensure that the following options are configured properly:
        • UEFI Optimized Boot must be set to disabled
        • Boot Order Policy must be set to Retry Boot Order Indefinitely. This means that the systems will keep trying to boot without ever going to disk.
        • Legacy BIOS Boot Order must be selected by default.
        • If Legacy BIOS Mode is not selected, then follow the "Changing from UEFI Booting Mode to Legacy BIOS Booting Mode" procedure in this section to set the configuration utility to Legacy BIOS Mode.
      2. Select Legacy BIOS Boot Order.

        This page defines the legacy BIOS boot order. This includes the list of devices from which the server will listen for the DHCP OFFER (including the reserved IPv4) after the PXE DHCP DISCOVER message is broadcast from the server.

      3. In the default view, 10Gb Embedded FlexibleLOM 1 Port 1 is at the bottom of the list. When the server begins the scan for the response, it scans down this list until it receives the response. Each NIC takes a finite amount of time before the server gives up on that NIC and attempts another in the list. Moving 10Gb Embedded FlexibleLOM 1 Port 1 up in this list decreases the time required to finally process the DHCP OFFER. To move an entry, drag and drop it to the required position in the list.
    2. If the current Boot Mode is UEFI BIOS Mode, then perform the following steps:
      1. Ensure that the following options are configured properly:
        • UEFI Optimized Boot must be set to enabled
        • Boot Order Policy must be set to Retry Boot Order Indefinitely. This means that the systems will keep trying to boot without ever going to disk.
        • UEFI Boot Settings must be selected by default.
      2. Click UEFI Boot Settings and select UEFI Boot Order.
      3. Move the 10 Gb Embedded FlexibleLOM 1 Port 1 entry above the 1Gb Embedded LOM 1 Port 1 entry.
  6. Select F10: Save to save and stay in the utility or select F12: Save and Exit to save and exit and complete the current boot process.
Enabling Virtualization on a Given BareMetal Server
You can configure virtualization using the default settings or via the Workload Profiles.
  1. Verifying Default Settings
    1. Navigate to the System Configuration Utility as per step 1.
    2. Select System Configuration.
    3. Select BIOS/Platform Configuration (RBSU).
    4. Select Virtualization Options.

      This screen displays the settings for the Intel(R) Virtualization Technology (IntelVT), Intel(R) VT-d, and SR-IOV options (Enabled or Disabled). The default value for each option is Enabled.

    5. Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
Disabling RAID Configurations
  1. Navigate to the System Configuration Utility as per step 1.
  2. Select System Configuration.
  3. Select Embedded RAID 1 : HPE Smart Array P408i-a SR Gen10.
  4. Select Array Configuration.
  5. Select Manage Arrays.
  6. Select Array A (or any designated Array Configuration if there are more than one).
  7. Select Delete Array.
  8. Select Submit Changes.
  9. Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
Enabling Primary Boot Device
This section provides details to configure the primary bootable device for a given Gen10 Server. In this case, the RMS includes two devices as Hard Drives (HDDs). Some configurations can also have two Solid State Drives (SSDs). Do not select SSDs for this configuration. Only the primary bootable device is set in this procedure since RAID is disabled. The secondary bootable device remains as Not Set.
  1. Navigate to the System Configuration Utility as per step 1.
  2. Select System Configuration.
  3. Select Embedded RAID 1 : HPE Smart Array P408i-a SR Gen10.
  4. Select Set Bootable Device(s) for Legacy Boot Mode. If the boot devices are not set, the screen displays Not Set for the primary and secondary devices.
  5. Select Select Bootable Physical Drive.
  6. Select Port 1| Box:3 Bay:1 Size:1.8 TB SAS HP EG00100JWJNR.

    Note:

    This example includes two HDDs and two SSDs. The actual configuration can be different.
  7. Select Set as Primary Bootable Device.
  8. Select Back to Main Menu.

    This returns to the HPE Smart Array P408i-a SR Gen10 menu. The secondary bootable device is left as Not Set.

  9. Select F10: Save to save and stay in the utility or select the F12: Save and Exit to save and exit to complete the current boot process.
Configuring the iLO 5 Static IP Address

Note:

This step requires a reboot after completion.
  1. Navigate to the System Configuration Utility as per step 1.
  2. Select System Configuration.
  3. Select iLO 5 Configuration Utility.
  4. Select Network Options.
  5. Enter the IP Address, Subnet Mask, and Gateway IP Address fields provided in Installation PreFlight Checklist.
  6. Select F12: Save and Exit to complete the current boot process. A reboot is required when setting the static IP for the iLO 5. A warning appears indicating that you must wait 30 seconds for the iLO to reset. A prompt requesting a reboot appears. Select Reboot.
  7. Once the reboot is complete, you can re-enter the System Utility and verify the settings if necessary.
Configuring Top of Rack Switches

Before installing CNE on BareMetal clusters, you must configure at least two ToR switches to support CNE installation. Though CNE primarily uses Cisco Nexus C93180YC-EX switches, it allows you to use any ToR switch to support a BareMetal CNE cluster. However, it is your responsibility to configure and manage the ToR switch in your domain. This section provides an overview of the generic requirements, capabilities, and configuration of a ToR switch to support BareMetal CNE clusters.

For the procedure to configure Cisco Nexus 93180YC-EX switch, see Configuring Top of Rack 93180YC-EX Switches.

Prerequisites

Before configuring your ToR switch to support BareMetal CNE, ensure that you meet the following prerequisites:
  • You must have the network topology design that specifies the roles and connections of each switch.
  • You must have Console or SSH access to the ToR switches.
  • You must have the administrative access to configure the switches.
  • The switches must be connected as per the Installation PreFlight Checklist. The customer uplinks must not be active until outside traffic is necessary.
  • The ToR switch must support user creation for secure access to the switches.

Features Required in ToR Switches

Ensure that the following features are available in the ToR switch to support CNE installation on BareMetal:
  • Border Gateway Protocol (BGP)
  • interface-vlan
  • Link Aggregation Control Protocol (LACP)
  • Virtual Port Channel (VPC) or Intelligent Resilient Framework (IRF)
  • Virtual Router Redundancy Protocol (VRRP v3) or Hot Standby Router Protocol (HSRP)
  • Open Shortest Path First (OSPF). This feature is optional.

Configurations

This section provides information about the generic configurations that are required in the ToR switch to support CNE installation on BareMetal.
Configuring mgmt0 Port:

The mgmt0 port of the ToR switch must be assigned an IP address within the management Virtual Routing and Forwarding (VRF) instance. This ensures that the management traffic is routed independently of data traffic, thereby enhancing security.

Configuring MTU:

Maximum Transmission Unit (MTU) is the largest size of a packet or frame that can be sent in a single transmission on a network interface. On Cisco switches, MTU settings determine the maximum size of packets that can be transmitted over the network. The default MTU size on most switches is typically 1500 bytes, which is the standard size for Ethernet frames in most network environments. If you want to use a larger MTU, ensure that your ToR switch supports a larger MTU (jumbomtu) and configure the larger MTU on all interfaces (VLAN, port-channel, and physical interfaces), as shown in the sketch below.
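
As an illustrative sketch only (Cisco NX-OS style syntax; the interface names and the 9216-byte value are placeholders, so consult your switch documentation), jumbo frames can be enabled with configuration similar to the following:

  ! Global maximum frame size allowed on the switch (placeholder value)
  system jumbomtu 9216

  ! Apply the larger MTU on VLAN, port-channel, and physical interfaces
  interface Vlan3
    mtu 9216
  interface port-channel 5
    mtu 9216
  interface Ethernet1/5
    mtu 9216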

Configuring Redundancy protocols:

HP switches use the IRF protocol to combine two or more switches into a single logical device.

Cisco Nexus switches use the vPC protocol, which allows links that are physically connected to two different Cisco Nexus devices to appear as a single port channel to a third device. This improves redundancy and load balancing, and eliminates the ports blocked by Spanning Tree Protocol (STP). The vPC protocol uses mgmt0 for the vPC peer keep-alive link, which is used to monitor the health of the vPC peer link. The keep-alive link sends heartbeat messages between the vPC peer switches to ensure both are operational and synchronized. This link helps prevent split-brain scenarios where both switches assume the active role due to a peer link failure.

Configuring VRRP v3 or HSRP:

VRRP v3 and HSRP protocols provide high availability and redundancy for IP routing. This is done by allowing multiple routers to work together to present a single virtual router to end devices. It enhances network reliability by ensuring continuous availability of routing paths even if one of the routers fails. Configure these protocols depending on your requirement. For more information about configuring these protocols, see "Configuring VLAN".

Configuring Object-Track:

A tracking object (Object-Track) monitors the status of the line protocol on the uplink interface. VRRPv3/HSRP uses this tracking object to make routing decisions based on the interface status.

Configuring VLAN
A Virtual Local Area Network (VLAN) is a broadcast domain that is partitioned and isolated in a network at the data link layer. A VLAN ensures that communication occurs only within that network. Configuring VLANs on a ToR switch involves the following key steps (a brief sketch follows the list):
  1. Defining the VLAN.
  2. Configuring the VLAN interface (also known as Switched Virtual Interface (SVI)).
  3. Assigning switch ports to VLAN.
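As an illustrative sketch only (Cisco NX-OS style syntax; the VLAN ID, SVI address, and interface are placeholders), these steps translate to configuration similar to the following:

  ! 1. Define the VLAN
  vlan 3
    name cne-cluster

  ! 2. Configure the VLAN interface (SVI)
  interface Vlan3
    no shutdown
    ip address 172.16.3.2/24

  ! 3. Assign a switch port to the VLAN
  interface Ethernet1/10
    switchport mode trunk
    switchport trunk allowed vlan 3
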
Configuring Port Channel to Server
A port channel, also known as an EtherChannel or Link Aggregation Group (LAG), is a technology used to combine multiple physical network interfaces into a single logical interface. This setup increases bandwidth, provides redundancy, and improves reliability between network devices. While configuring the port channel to server, ensure that you meet the following requirements:
  • Each RMS server must have two eNet ports. Each eNet port must be connected to a separate ToR Switch.
  • The first two RMS systems must be configured as k8s-host-1/k8s-host-2 and bastion-1/bastion-2. VLANs 2, 3, and 4 must be allowed to enable external access to the Bastion Hosts and to facilitate their communication with all other nodes and ILOs.
  • RMS3 is dedicated to k8s-host-3 and k8s-master-3, where access to VLAN 3 is sufficient. However, to ensure redundancy in the event of an issue with RMS1 or RMS2, RMS3 must also be configured to allow VLANs 2, 3, and 4.
  • All nodes starting from RMS4 are worker nodes. When configuring the worker nodes, extend the commands to all of them.
Perform the following steps to create the port channel and configure the physical interfaces into it (a combined sketch follows this list):
  • Run the following or equivalent commands to create the port-channel and allow appropriate VLANS:
    1. Allow quick convergence of the servers:
      spanning-tree port type edge trunk
    2. Allow Pre-boot Execution Environment (PXE) boot on the first Network Interface Card (NIC):
      no lacp suspend-individual
  • Run the following or equivalent commands to configure each physical interface into the port channel. Allow the same VLANs as in the port-channel:
    1. Allow quick convergence of the servers. Configure this on both port-channel and physical ports.
      spanning-tree port type edge trunk
    2. Configure a physical interface to be part of an EtherChannel (or Port Channel) in active mode using the Link Aggregation Control Protocol (LACP):
      channel-group <group-id> force mode active
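As an illustrative sketch only (Cisco NX-OS style syntax; the interface, port-channel, and VLAN numbers are placeholders, and the vPC domain is assumed to be configured already), a server-facing port-channel and its member interface may look similar to the following:

  interface port-channel 5
    switchport mode trunk
    switchport trunk allowed vlan 2,3,4
    spanning-tree port type edge trunk
    no lacp suspend-individual
    vpc 5

  interface Ethernet1/5
    description to-RMS-k8s-node
    switchport mode trunk
    switchport trunk allowed vlan 2,3,4
    spanning-tree port type edge trunk
    channel-group 5 force mode active
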
Configuring Interfaces to Connect to iLO or iLOMs:

Ensure that each server's iLO or iLOM port is connected to one of the switches, with access mode configured on the switch port.

Configuring Inter-Switch Link:
The Inter-Switch Links (ISL) are used for the following purposes:
  • The vPC peer link is a special link that connects two Cisco Nexus switches configured as vPC peers. It serves as the communication backbone between the two switches, allowing them to synchronize state and configuration information. This link is essential for the operation of vPCs and ensures that both switches operate together.
  • The inter-switch link facilitates communication between two ToR switches for VRRPv3 or HSRP. This is used to advertise the link and negotiate the controller or backup relationship.
Configuring Uplink
The uplinks are the connections to the user network. The IP addresses on the uplink ports must correspond to the customer switch configuration. To avoid sending traffic prematurely or creating loops toward the user network, keep the uplink down using one of the following techniques (see the sketch after this list):
  1. Do not connect the cables between the ToR switch and customer network physically.
  2. Do not run the "no shutdown" or equivalent command which enables the uplink.
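For the second technique, a minimal NX-OS sketch keeps the port administratively down (the uplink interface number is an illustrative assumption). Run the "no shutdown" command on the uplink interfaces only when the customer uplinks are ready to carry traffic:
    interface Ethernet1/51
      shutdown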
Configuring BGP

Border Gateway Protocol (BGP) is used to exchange routing information between different autonomous systems. When MetalLB is used in the cluster, BGP configurations are required on the switches to access LoadBalancer IPs.

Calico (used in the cluster) provides 64512 as the default autonomous system (AS) number. The ToR switches must use this number to establish a peer relationship with the cluster. The router ID must be unique for each switch in the connections, including the two ToR switches and the connected customer switches.
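A minimal NX-OS sketch of such a BGP peering is shown below. The local AS number, router ID, and neighbor address are illustrative assumptions; only the remote AS 64512 comes from the Calico default described above:
    feature bgp
    router bgp 64501
      router-id 10.10.10.1
      neighbor 172.16.3.5
        remote-as 64512
        address-family ipv4 unicast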

Configuring OSPF

The routing between the ToR switches and the customer switches can differ and is decided based on the user network. BGP and OSPF are the most commonly used dynamic routing protocols. Open Shortest Path First (OSPF) is a widely used Interior Gateway Protocol (IGP) designed for routing within an autonomous system. OSPF is based on a link-state routing algorithm, which provides fast convergence and scalability, making it suitable for large and complex network topologies.
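A minimal NX-OS sketch for enabling OSPF toward the customer network is shown below; the process tag, router ID, area, and uplink interface are illustrative assumptions:
    feature ospf
    router ospf 1
      router-id 10.10.10.1
    interface Ethernet1/51
      ip router ospf 1 area 0.0.0.0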

Support for CNLB Switch

Cloud Native Load Balancer (CNLB) is a crucial component in modern cloud architectures. CNLB enables efficient, reliable, and scalable distribution of network traffic across multiple servers or services. To support the CNLB feature, the ToR switches require the following configurations:
  • CNLB bond0 version: Add secondary IP addresses for all external subnets on private VLAN interface.
  • CNLB VLAN version: Perform the following steps to configure a CNLB VLAN for each CNLB internal and external subnet (see the sketch after this list):
    • Add VLAN.
    • Add VLAN interface. IPv6 is required to keep worker node interfaces up and running.
    • Add allowed VLAN on each port assigned to the worker nodes.
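A minimal NX-OS sketch of these steps for one CNLB subnet is shown below. The VLAN ID, IPv6 address, and port-channel are illustrative assumptions; the actual values come from the sample configurations referenced below:
    vlan 101
      name cnlb-oam-ext
    interface Vlan101
      no shutdown
      ipv6 address 2001:db8:0:65::1/64
    interface port-channel11
      switchport trunk allowed vlan add 101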
For sample ToR switch configurations, see Sample ToR Switch Configurations.
Configuring Top of Rack 93180YC-EX Switches

Introduction

This section provides the steps to initialize and configure Cisco 93180YC-EX switches as per your topology design.

Note:

Run all instructions in this procedure from the Bootstrap Host.

Prerequisites

Ensure that the following tasks are performed before configuration:
  1. Procedure Installation of Oracle Linux X.X on Bootstrap Host is complete.
  2. The switches are in a factory default state. If the switches are out of the box or preconfigured, run write erase and reload to factory default.
  3. The switches are connected as per the Installation PreFlight Checklist. The customer uplinks must not be active until outside traffic is necessary.
  4. DHCP, XINETD, and TFTP are already installed on the Bootstrap host but not configured.
  5. Available Utility USB contains all the necessary files according to the Installation PreFlight checklist: Create Utility USB.

Limitations/Expectations

All steps are run from a Keyboard, Video, Mouse (KVM) connection.

Configuration Procedure

Following is the procedure to configure Top of Rack 93180YC-EX Switches:

  1. Use KVM to log in to the Bootstrap Host as the root user.
  2. Insert and mount the Utility USB that contains the configuration and script files. Verify that the files are listed on the USB using the ls /media/usb command. To mount the USB, perform steps 2 and 3 of Installation of Oracle Linux X.X on Bootstrap Host.
  3. Create a bridge interface to connect both management ports and set up the management bridge to support switch initialization:

    Note:

    The names of interface 1 and interface 2 depend on the version of Linux that is being run. You can obtain the names of the interfaces by running the ip a command.
    Commands for mgmtBridge:
    $ nmcli con add con-name mgmtBridge type bridge ifname mgmtBridge
    $ nmcli con add type bridge-slave ifname <interface 1> master mgmtBridge
    $ nmcli con add type bridge-slave ifname <interface 2> master mgmtBridge
    $ nmcli con mod mgmtBridge ipv4.method manual ipv4.addresses <mgmtBridge_IP>
    $ nmcli con up mgmtBridge
    For example:
    $ nmcli con add con-name mgmtBridge type bridge ifname mgmtBridge
    $ nmcli con add type bridge-slave ifname eno5np0 master mgmtBridge
    $ nmcli con add type bridge-slave ifname eno6np1 master mgmtBridge
    $ nmcli con mod mgmtBridge ipv4.method manual ipv4.addresses 192.168.2.11/24
    $ nmcli con up mgmtBridge
    Commands for bond:
    $ nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad
    $ nmcli con add type bond-slave con-name bond0-slave-1 ifname <interface 1> master bond0
    $ nmcli con add type bond-slave con-name bond0-slave-2 ifname <interface 2> master bond0
    For example:
    $ nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad
    $ nmcli con add type bond-slave con-name bond0-slave-1 ifname eno5np0 master bond0
    $ nmcli con add type bond-slave con-name bond0-slave-2 ifname eno6np1 master bond0

    Note:

    The following commands configure the bond0 IP address and the management VLAN interface for the Bootstrap Host. The <mgmt_vlan_id> is the same value as in hosts.ini.
    $ nmcli con mod bond0 ipv4.method manual ipv4.addresses <bootstrap_bond0_address_with_prefix>
    $ nmcli con add con-name bond0.<mgmt_vlan_id> type vlan id <mgmt_vlan_id> dev bond0
    $ nmcli con mod bond0.<mgmt_vlan_id> ipv4.method manual ipv4.addresses <CNE_Management_IP_Address_With_Prefix> ipv4.gateway <ToRswitch_CNEManagementNet_VIP>
    $ nmcli con up bond0.<mgmt_vlan_id>
    For example:
    $ nmcli con mod bond0 ipv4.method manual ipv4.addresses 172.16.3.4/24
    $ nmcli con add con-name bond0.4 type vlan id 4 dev bond0
    $ nmcli con mod bond0.4 ipv4.method manual ipv4.addresses 10.7.5.22/28 ipv4.gateway 10.7.5.17
    $ nmcli con up bond0.4
  4. This step is applicable only for the Netra X8-2 server. Due to the limited number of Ethernet ports, only three out of five ports can be enabled. Therefore, there are not enough ports to configure mgmtBridge and bond0 at the same time. Connect NIC1 or NIC2 to the ToR switch mgmt ports in this step to configure the ToR switches. After that, reconnect the ports to the Ethernet ports on the ToR switches.
    $ nmcli con add con-name mgmtBridge type bridge ifname mgmtBridge
    $ nmcli con add type bridge-slave ifname eno5np0 master mgmtBridge
    $ nmcli con add type bridge-slave ifname eno6np1 master mgmtBridge
    $ nmcli con mod mgmtBridge ipv4.method manual ipv4.addresses 192.168.2.11/24
    $ nmcli con up mgmtBridge
  5. Copy the customer Oracle Linux repo file (for example, ol9.repo) from the USB to the /etc/yum.repos.d directory. Move the original repo files to backup files (for example: oracle-linux-ol9.repo, uek-ol9.repo, and virt-ol9.repo):
    $ cd /etc/yum.repos.d
    $ mv oracle-linux-ol9.repo oracle-linux-ol9.repo.bkp
    $ mv virt-ol9.repo virt-ol9.repo.bkp
    $ mv uek-ol9.repo uek-ol9.repo.bkp
    $ cp /media/usb/<central_repo>.repo ./
    
  6. Install and set up tftp server on bootstrap host:
    $ dnf install -y tftp-server tftp
    $ cp /usr/lib/systemd/system/tftp.service /etc/systemd/system/tftp-server.service
    $ cp /usr/lib/systemd/system/tftp.socket /etc/systemd/system/tftp-server.socket
     
    $ tee /etc/systemd/system/tftp-server.service<<'EOF'
    [Unit]
    Description=Tftp Server
    Requires=tftp-server.socket
    Documentation=man:in.tftpd
     
    [Service]
    ExecStart=/usr/sbin/in.tftpd -c -p -s /var/lib/tftpboot
    StandardInput=socket
     
    [Install]
    WantedBy=multi-user.target
    Also=tftp-server.socket
    EOF
  7. Enable tftp on the Bootstrap host:
    $ systemctl daemon-reload
    $ systemctl enable --now tftp-server
    Verify that tftp is active and enabled:
    $ systemctl status tftp-server
    $ ps -elf | grep tftp
    
  8. Install and set up the dhcp server on the Bootstrap Host:
    $ dnf -y install dhcp-server
  9. Copy the dhcpd.conf file (created in the Installation PreFlight checklist: Create the dhcpd.conf File) from the Utility USB to the /etc/dhcp/ directory.
    $ cp /media/usb/dhcpd.conf /etc/dhcp/
  10. Enable and start the dhcpd service.
    $ systemctl enable --now dhcpd
    Use the following command to verify the active and enabled state:
    $ systemctl status dhcpd
  11. Depending on the type of Load Balancer being configured (CNLB or Metallb), copy the switch configuration and script files from the Utility USB to /var/lib/tftpboot/ directory as follows:
    • If you are using MetalLB, use the following commands:
      $ cp /media/usb/93180_switchA.cfg /var/lib/tftpboot/.
      $ cp /media/usb/93180_switchB.cfg /var/lib/tftpboot/.
      $ cp /media/usb/poap_nexus_script.py /var/lib/tftpboot/.
    • If you are using CNLB, you can either use bond0 or vlan in the CNLB network configuration:
      • Example for CNLB using bond0:
        $ cp /media/usb/93180_switchA_cnlb_bond0.cfg /var/lib/tftpboot/.
        $ cp /media/usb/93180_switchB_cnlb_bond0.cfg /var/lib/tftpboot/.
        $ cp /media/usb/poap_nexus_script.py /var/lib/tftpboot/.
      • Example for CNLB using vlan:
        $ cp /media/usb/93180_switchA_cnlb_vlan.cfg /var/lib/tftpboot/.
        $ cp /media/usb/93180_switchB_cnlb_vlan.cfg /var/lib/tftpboot/.
        $ cp /media/usb/poap_nexus_script.py /var/lib/tftpboot/.
  12. Modify the POAP script file to change the username and password credentials used to log in to the Bootstrap Host:
    
    $ vi /var/lib/tftpboot/poap_nexus_script.py
    # Host name and user credentials
    options = {
       "username": "<username>",
       "password": "<password>",
       "hostname": "192.168.2.11",
       "transfer_protocol": "scp",
       "mode": "serial_number",
       "target_system_image": "nxos.9.2.3.bin",
    }
    

    Note:

    The version nxos.9.2.3.bin is used by default. If a different version is to be used, modify target_system_image with the new version.
  13. Run the md5Poap.sh script from the Utility USB (created in the Installation PreFlight checklist: Create the md5Poap Bash Script) to modify the POAP script file md5sum as follows:
    $ cd /var/lib/tftpboot/
    $ /bin/bash md5Poap.sh
    
  14. Create the files necessary to configure the ToR switches using the serial number from the switch.



    Note:

    The serial number is located on a pull-out card on the back of the switch, in the leftmost power supply slot. Be careful when interpreting the exact characters. If the switches are preconfigured, you can also verify the serial numbers using the show license host-id command.
  15. Depending on the type of Load Balancer, copy the switch configuration files into a new file renamed according to the switch A or B serial number:
    • For standard MetalLB, copy the /var/lib/tftpboot/93180_switchA.cfg file into a file named /var/lib/tftpboot/conf.<switchA serial number>.
    • For CNLB:
      • If you are using bond0, copy the /var/lib/tftpboot/93180_switchA_cnlb_bond0.cfg file into a file named /var/lib/tftpboot/conf.<switchA serial number>.
      • If you are using vlan, copy the /var/lib/tftpboot/93180_switchA_cnlb_vlan.cfg file into a file named /var/lib/tftpboot/conf.<switchA serial number>.
  16. Modify the switch specific values in the /var/lib/tftpboot/conf.<switchA serial number> file, including all the values in the curly braces as shown in the following code block:
    These values are available in the Installation PreFlight checklist: ToR and Enclosure Switches Variables Table (Switch Specific) and the Installation PreFlight Checklist: Complete OA and Switch IP Table. Modify these values with the following sed commands or use an editor such as vi to modify the values.

    Note:

    The template supports 12 RMS servers. If there are fewer than 12 servers, the extra configurations have no effect without physical connections and do not affect the configured servers. If there are more than 12 servers, follow the same pattern to add configurations for the additional servers.
    $ sed -i 's/{switchname}/<switch_name>/' conf.<switchA serial number>
    $ sed -i 's/{admin_password}/<admin_password>/' conf.<switchA serial number>
    $ sed -i 's/{user_name}/<user_name>/' conf.<switchA serial number>
    $ sed -i 's/{user_password}/<user_password>/' conf.<switchA serial number>
    $ sed -i 's/{ospf_md5_key}/<ospf_md5_key>/' conf.<switchA serial number>
    $ sed -i 's/{OSPF_AREA_ID}/<ospf_area_id>/' conf.<switchA serial number>
     
    $ sed -i 's/{NTPSERVER1}/<NTP_server_1>/' conf.<switchA serial number>
    $ sed -i 's/{NTPSERVER2}/<NTP_server_2>/' conf.<switchA serial number>
    $ sed -i 's/{NTPSERVER3}/<NTP_server_3>/' conf.<switchA serial number>
    $ sed -i 's/{NTPSERVER4}/<NTP_server_4>/' conf.<switchA serial number>
    $ sed -i 's/{NTPSERVER5}/<NTP_server_5>/' conf.<switchA serial number>
     
    Note: If fewer than 5 NTP servers are available, delete the extra NTP server lines with a command such as:
    $ sed -i '/{NTPSERVER5}/d' conf.<switchA serial number>
     
     Note: A different delimiter (#) is used in the following commands because the variables contain the '/' character.
    $ sed -i 's#{ALLOW_5G_XSI_LIST_WITH_PREFIX_LEN}#<MetalLB_Signal_Subnet_With_Prefix>#g' conf.<switchA serial number>
    $ sed -i 's#{CNE_Management_SwA_Address}#<ToRswitchA_CNEManagementNet_IP>#g' conf.<switchA serial number>
    $ sed -i 's#{CNE_Management_SwB_Address}#<ToRswitchB_CNEManagementNet_IP>#g' conf.<switchA serial number>
    $ sed -i 's#{CNE_Management_Prefix}#<CNEManagementNet_Prefix>#g' conf.<switchA serial number>
    $ sed -i 's/{CNE_Management_VIP}/<ToRswitch_CNEManagementNet_VIP>/g' conf.<switchA serial number>
     
    $ sed -i 's/{OAM_UPLINK_CUSTOMER_ADDRESS}/<ToRswitchA_oam_uplink_customer_IP>/' conf.<switchA serial number>
    $ sed -i 's/{OAM_UPLINK_SwA_ADDRESS}/<ToRswitchA_oam_uplink_IP>/g' conf.<switchA serial number>
    $ sed -i 's/{SIGNAL_UPLINK_SwA_ADDRESS}/<ToRswitchA_signaling_uplink_IP>/g' conf.<switchA serial number>
    $ sed -i 's/{OAM_UPLINK_SwB_ADDRESS}/<ToRswitchB_oam_uplink_IP>/g' conf.<switchA serial number>
    $ sed -i 's/{SIGNAL_UPLINK_SwB_ADDRESS}/<ToRswitchB_signaling_uplink_IP>/g' conf.<switchA serial number>
    $ ipcalc -n  <ToRswitchA_signaling_uplink_IP>/30 | awk -F'=' '{print $2}' 
    $ sed -i 's/{SIGNAL_UPLINK_SUBNET}/<output from ipcalc command as signal_uplink_subnet>/' conf.<switchA serial number>
     
    $ ipcalc -n  <ToRswitchA_SQLreplicationNet_IP> | awk -F'=' '{print $2}'
    $ sed -i 's/{MySQL_Replication_SUBNET}/<output from the above ipcalc command appended with prefix >/' conf.<switchA serial number>
     
    Note: The version nxos.9.2.3.bin is used by default and is hard-coded in the conf files. If a different version is to be used, run the following command:
    $ sed -i 's/nxos.9.2.3.bin/<nxos_version>/' conf.<switchA serial number>
     
    Note: access-list Restrict_Access_ToR
    The following line allows one access server to reach the switch management and SQL VLAN addresses while all other access is denied. If this is not needed, delete the line. If more servers need access, add similar lines.
    $ sed -i 's/{Allow_Access_Server}/<Allow_Access_Server>/' conf.<switchA serial number>
    If you are using CNLB deployment, run the following commands in addition to the commands in the previous codeblock. This is applicable for both bond0 and vlan configurations:
    $ sed -i 's#{CNLB_OAM_EXT_SwA_Address}#<CNLB_OAM_EXT_SwA_Address>#g' conf.<switchA serial number>
    $ sed -i 's#{CNLB_OAM_EXT_VIP}#<CNLB_OAM_EXT_VIP>#g' conf.<switchA serial number>
    $ sed -i 's#{CNLB_OAM_EXT_Prefix}#<CNLB_OAM_EXT_Prefix>#g' conf.<switchA serial number>
     
    $ sed -i 's#{CNLB_SIG_EXT_SwA_Address}#<CNLB_SIG_EXT_SwA_Address>#g' conf.<switchA serial number>
    $ sed -i 's#{CNLB_SIG_EXT_VIP}#<CNLB_SIG_EXT_VIP>#g' conf.<switchA serial number>
    $ sed -i 's#{CNLB_SIG_EXT_Prefix}#<CNLB_SIG_EXT_Prefix>#g' conf.<switchA serial number>
  17. Copy the /var/lib/tftpboot/93180_switchB.cfg file (or, for CNLB deployments, the corresponding 93180_switchB_cnlb_bond0.cfg or 93180_switchB_cnlb_vlan.cfg file) into a /var/lib/tftpboot/conf.<switchB serial number> file:

    Modify the switch specific values in the /var/lib/tftpboot/conf.<switchB serial number> file, including: hostname, username/password, oam_uplink IP address, signaling_uplink IP address, access-list ALLOW_5G_XSI_LIST permit address, and prefix-list ALLOW_5G_XSI.

    These values are available in Installation PreFlight checklist : ToR and Enclosure Switches Variables Table and Installation PreFlight Checklist : Complete OA and Switch IP Table.

    Note:

    The template supports 12 RMS servers. If there are fewer than 12 servers, the extra configurations have no effect without physical connections and do not affect the configured servers. If there are more than 12 servers, follow the same pattern to add configurations for the additional servers.
    $ sed -i 's/{switchname}/<switch_name>/' conf.<switchB serial number>
    $ sed -i 's/{admin_password}/<admin_password>/' conf.<switchB serial number>
    $ sed -i 's/{user_name}/<user_name>/' conf.<switchB serial number>
    $ sed -i 's/{user_password}/<user_password>/' conf.<switchB serial number>
    $ sed -i 's/{ospf_md5_key}/<ospf_md5_key>/' conf.<switchB serial number>
    $ sed -i 's/{OSPF_AREA_ID}/<ospf_area_id>/' conf.<switchB serial number>
     
    $ sed -i 's/{NTPSERVER1}/<NTP_server_1>/' conf.<switchB serial number>
    $ sed -i 's/{NTPSERVER2}/<NTP_server_2>/' conf.<switchB serial number>
    $ sed -i 's/{NTPSERVER3}/<NTP_server_3>/' conf.<switchB serial number>
    $ sed -i 's/{NTPSERVER4}/<NTP_server_4>/' conf.<switchB serial number>
    $ sed -i 's/{NTPSERVER5}/<NTP_server_5>/' conf.<switchB serial number>
     
    Note: If fewer than 5 NTP servers are available, delete the extra NTP server lines with a command such as:
    $ sed -i '/{NTPSERVER5}/d' conf.<switchB serial number>
     
     Note: A different delimiter (#) is used in the following commands because the variables contain the '/' character.
    $ sed -i 's#{ALLOW_5G_XSI_LIST_WITH_PREFIX_LEN}#<MetalLB_Signal_Subnet_With_Prefix>#g' conf.<switchB serial number>
    $ sed -i 's#{CNE_Management_SwA_Address}#<ToRswitchA_CNEManagementNet_IP>#g' conf.<switchB serial number>
    $ sed -i 's#{CNE_Management_SwB_Address}#<ToRswitchB_CNEManagementNet_IP>#g' conf.<switchB serial number>
    $ sed -i 's#{CNE_Management_Prefix}#<CNEManagementNet_Prefix>#g' conf.<switchB serial number>
    $ sed -i 's/{CNE_Management_VIP}/<ToRswitch_CNEManagementNet_VIP>/' conf.<switchB serial number>
     
    $ sed -i 's/{OAM_UPLINK_CUSTOMER_ADDRESS}/<ToRswitchB_oam_uplink_customer_IP>/' conf.<switchB serial number>
    $ sed -i 's/{OAM_UPLINK_SwA_ADDRESS}/<ToRswitchA_oam_uplink_IP>/g' conf.<switchB serial number>
    $ sed -i 's/{SIGNAL_UPLINK_SwA_ADDRESS}/<ToRswitchA_signaling_uplink_IP>/g' conf.<switchB serial number>
    $ sed -i 's/{OAM_UPLINK_SwB_ADDRESS}/<ToRswitchB_oam_uplink_IP>/g' conf.<switchB serial number>
    $ sed -i 's/{SIGNAL_UPLINK_SwB_ADDRESS}/<ToRswitchB_signaling_uplink_IP>/g' conf.<switchB serial number>
    $ ipcalc -n  <ToRswitchB_signaling_uplink_IP>/30 | awk -F'=' '{print $2}'
    $ sed -i 's/{SIGNAL_UPLINK_SUBNET}/<output from ipcalc command as signal_uplink_subnet>/' conf.<switchB serial number>
     
    Note: The version nxos.9.2.3.bin is used by default and is hard-coded in the conf files. If a different version is to be used, run the following command:
    $ sed -i 's/nxos.9.2.3.bin/<nxos_version>/' conf.<switchB serial number>
     
    Note: access-list Restrict_Access_ToR
    The following line allows one access server to reach the switch management and SQL VLAN addresses while all other access is denied. If this is not needed, delete the line. If more servers need access, add similar lines.
    $ sed -i 's/{Allow_Access_Server}/<Allow_Access_Server>/' conf.<switchB serial number>
    If you are using a CNLB deployment, run the following commands in addition to the commands in the previous codeblock. This is applicable for both bond0 and vlan configurations:
    $ sed -i 's#{CNLB_OAM_EXT_SwB_Address}#<CNLB_OAM_EXT_SwB_Address>#g' conf.<switchB serial number>
    $ sed -i 's#{CNLB_OAM_EXT_VIP}#<CNLB_OAM_EXT_VIP>#g' conf.<switchB serial number>
    $ sed -i 's#{CNLB_OAM_EXT_Prefix}#<CNLB_OAM_EXT_Prefix>#g' conf.<switchB serial number>
     
    $ sed -i 's#{CNLB_SIG_EXT_SwB_Address}#<CNLB_SIG_EXT_SwB_Address>#g' conf.<switchB serial number>
    $ sed -i 's#{CNLB_SIG_EXT_VIP}#<CNLB_SIG_EXT_VIP>#g' conf.<switchB serial number>
    $ sed -i 's#{CNLB_SIG_EXT_Prefix}#<CNLB_SIG_EXT_Prefix>#g' conf.<switchB serial number>
  18. Generate the md5 checksum for each conf file in /var/lib/tftpboot and write it to a new file named conf.<switchA/B serial number>.md5.
    
    $ md5sum conf.<switchA serial number> > conf.<switchA serial number>.md5
    $ md5sum conf.<switchB serial number> > conf.<switchB serial number>.md5
    
  19. Verify that the /var/lib/tftpboot directory has the correct files.
    Ensure that the file permissions are set as follows:

    Note:

    The ToR switches constantly attempt to find and run the poap_nexus_script.py script, which uses tftp to load and install the configuration files.
    
    $ ls -l /var/lib/tftpboot/
    total 1305096
    -rw-r--r--. 1 root root       7161 Mar 25 15:31 conf.<switchA serial number>
    -rw-r--r--. 1 root root         51 Mar 25 15:31 conf.<switchA serial number>.md5
    -rw-r--r--. 1 root root       7161 Mar 25 15:31 conf.<switchB serial number>
    -rw-r--r--. 1 root root         51 Mar 25 15:31 conf.<switchB serial number>.md5
    -rwxr-xr-x. 1 root root      75856 Mar 25 15:32 poap_nexus_script.py
    
  20. Enable tftp-server again and verify the status.

    Note:

    The status of tftp-server remains active for only 15 minutes after enabling the server. Therefore, you must enable the tftp-server again before configuring the switches.
    $ systemctl enable --now tftp-server
    
    Verify tftp is active and enabled:
    $ systemctl status tftp-server
    $ ps -elf | grep tftp
  21. Stop, disable, and verify the status of the firewalld service:
    $ systemctl stop firewalld
    $ systemctl disable firewalld
    $ systemctl status firewalld

    After completing the above steps, the ToR Switches will attempt to boot from the tftpboot files automatically.

  22. Unmount the Utility USB and remove it as follows:
    $ umount /media/usb

Verification

Following is the procedure to verify Top of Rack 93180YC-EX Switches:
  1. After the ToR switches are configured, ping the switches from the Bootstrap server. The switches' mgmt0 interfaces are configured with the IP addresses that are in the conf files.

    Note:

    Wait for the device to respond.
    Example to ping switch 1:
    $ ping 192.168.2.1
    Sample output:
    
    PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=255 time=0.419 ms
    64 bytes from 192.168.2.1: icmp_seq=2 ttl=255 time=0.496 ms
    64 bytes from 192.168.2.1: icmp_seq=3 ttl=255 time=0.573 ms
    64 bytes from 192.168.2.1: icmp_seq=4 ttl=255 time=0.535 ms
    ^C
    --- 192.168.2.1 ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3000ms
    rtt min/avg/max/mdev = 0.419/0.505/0.573/0.063 ms
    Example to ping switch 2:
    $ ping 192.168.2.2
    Sample output:
    
    PING 192.168.2.2 (192.168.2.2) 56(84) bytes of data.
    64 bytes from 192.168.2.2: icmp_seq=1 ttl=255 time=0.572 ms
    64 bytes from 192.168.2.2: icmp_seq=2 ttl=255 time=0.582 ms
    64 bytes from 192.168.2.2: icmp_seq=3 ttl=255 time=0.466 ms
    64 bytes from 192.168.2.2: icmp_seq=4 ttl=255 time=0.554 ms
    ^C
    --- 192.168.2.2 ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3001ms
    rtt min/avg/max/mdev = 0.466/0.543/0.582/0.051 ms
    
  2. Attempt to SSH to the switches with the username and password provided in the configuration files.
    $ ssh plat@192.168.2.1
    Sample output:
    The authenticity of host '192.168.2.1 (192.168.2.1)' can't be established.
    RSA key fingerprint is SHA256:jEPSMHRNg9vejiLcEvw5qprjgt+4ua9jucUBhktH520.
    RSA key fingerprint is MD5:02:66:3a:c6:81:65:20:2c:6e:cb:08:35:06:c6:72:ac.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '192.168.2.1' (RSA) to the list of known hosts.
    User Access Verification
    Password:
     
    Cisco Nexus Operating System (NX-OS) Software
    TAC support: http://www.cisco.com/tac
    Copyright (C) 2002-2019, Cisco and/or its affiliates.
    All rights reserved.
    The copyrights to certain works contained in this software are
    owned by other third parties and used and distributed under their own
    licenses, such as open source.  This software is provided "as is," and unless
    otherwise stated, there is no warranty, express or implied, including but not
    limited to warranties of merchantability and fitness for a particular purpose.
    Certain components of this software are licensed under
    the GNU General Public License (GPL) version 2.0 or
    GNU General Public License (GPL) version 3.0  or the GNU
    Lesser General Public License (LGPL) Version 2.1 or
    Lesser General Public License (LGPL) Version 2.0.
    A copy of each such license is available at
    http://www.opensource.org/licenses/gpl-2.0.php and
    http://opensource.org/licenses/gpl-3.0.html and
    http://www.opensource.org/licenses/lgpl-2.1.php and
    http://www.gnu.org/licenses/old-licenses/library.txt.
  3. Verify that the running-config contains all expected configurations in the conf file using the show running-config command as follows:
    $ show running-config
    Sample output:
    
    !Command: show running-config
    !Running configuration last done at: Mon Apr  8 17:39:38 2019
    !Time: Mon Apr  8 18:30:17 2019
    version 9.2(3) Bios:version 07.64
    hostname 12006-93108A
    vdc 12006-93108A id 1
      limit-resource vlan minimum 16 maximum 4094
      limit-resource vrf minimum 2 maximum 4096
      limit-resource port-channel minimum 0 maximum 511
      limit-resource u4route-mem minimum 248 maximum 248
      limit-resource u6route-mem minimum 96 maximum 96
      limit-resource m4route-mem minimum 58 maximum 58
      limit-resource m6route-mem minimum 8 maximum 8
    feature scp-server
    feature sftp-server
    cfs eth distribute
    feature ospf
    feature interface-vlan
    feature lacp
    feature vpc
    feature bfd
    feature vrrpv3
    ....
    ....
    
  4. If some of the above features are missing, verify the license on the switches and ensure that at least an NXOS_ADVANTAGE level license is in use. If the license is not installed or is of a lower level, contact the vendor for the correct license key file. Then, run write erase and reload to reset the switch to factory default. The switches go through POAP configuration again.
    # show license
    Example output:
    
    # show license
    MDS20190215085542979.lic:
    SERVER this_host ANY
    VENDOR cisco
    INCREMENT NXOS_ADVANTAGE_XF cisco 1.0 permanent uncounted \
            VENDOR_STRING=<LIC_SOURCE>MDS_SWIFT</LIC_SOURCE><SKU>NXOS-AD-XF</SKU> \
            HOSTID=VDH=FDO22412J2F \
            NOTICE="<LicFileID>20190215085542979</LicFileID><LicLineID>1</LicLineID> \
            <PAK></PAK>" SIGN=8CC8807E6918
    # show license usage
    Example output:
    
    # show license usage
    Feature                      Ins  Lic   Status Expiry Date Comments
                                     Count
    --------------------------------------------------------------------------------
    ...
    NXOS_ADVANTAGE_M4             No    -   Unused             -
    NXOS_ADVANTAGE_XF             Yes   -   In use never       -
    NXOS_ESSENTIALS_GF            No    -   Unused             -
    ...
    #
  5. For the Netra X8-2 server, reconnect the cable on the mgmt ports to the Ethernet ports for RMS1, delete mgmtBridge, and configure bond0 and the management VLAN interface:
    $ nmcli con delete mgmtBridge
     
    $ nmcli con add type bond con-name bond0 ifname bond0 mode 802.3ad
    $ nmcli con add type bond-slave con-name bond0-slave-1 ifname  eno2np0 master bond0
    $ nmcli con add type bond-slave con-name bond0-slave-2 ifname  eno3np1 master bond0
    The following commands are related to the VLAN and IP address for this Bootstrap server. The <mgmt_vlan_id> is the same as in hosts.ini, and the <bootstrap bond0 address> is the same as the ansible_host IP for this Bootstrap server:
    $ nmcli con mod bond0 ipv4.method manual ipv4.addresses <bootstrap bond0 address>
    $ nmcli con add con-name bond0.<mgmt_vlan_id> type vlan id <mgmt_vlan_id> dev bond0
    $ nmcli con mod bond0.<mgmt_vlan_id> ipv4.method manual ipv4.addresses <CNE_Management_IP_Address_With_Prefix> ipv4.gateway <ToRswitch_CNEManagementNet_VIP>
    $ nmcli con up bond0.<mgmt_vlan_id>
    For example:
    $ nmcli con mod bond0 ipv4.method manual ipv4.addresses 172.16.3.4/24
    $ nmcli con add con-name bond0.4 type vlan id 4 dev bond0
    $ nmcli con mod bond0.4 ipv4.method manual ipv4.addresses <CNE_Management_IP_Address_With_Prefix> ipv4.gateway <ToRswitch_CNEManagementNet_VIP>
    $ nmcli con up bond0.4
  6. Verify that RMS1 can ping the CNE_Management VIP:
    $ ping <ToRSwitch_CNEManagementNet_VIP>
    Sample output:
    PING <ToRSwitch_CNEManagementNet_VIP> (<ToRSwitch_CNEManagementNet_VIP>) 56(84) bytes of data.
    64 bytes from <ToRSwitch_CNEManagementNet_VIP>: icmp_seq=2 ttl=255 time=1.15 ms
    64 bytes from <ToRSwitch_CNEManagementNet_VIP>: icmp_seq=3 ttl=255 time=1.11 ms
    64 bytes from <ToRSwitch_CNEManagementNet_VIP>: icmp_seq=4 ttl=255 time=1.23 ms
    ^C
    --- 10.75.207.129 ping statistics ---
    4 packets transmitted, 3 received, 25% packet loss, time 3019ms
    rtt min/avg/max/mdev = 1.115/1.168/1.237/0.051 ms
  7. Connect or enable customer uplink.
  8. Verify that RMS1 can be accessed from a laptop. Use an application such as PuTTY to SSH to RMS1:
    $ ssh root@<CNE_Management_IP_Address>
    Sample output:
    
    Using username "root".
    root@<CNE_Management_IP_Address>'s password:<root password>
    Last login: Mon May  6 10:02:01 2019 from 10.75.9.171
    [root@RMS1 ~]#
    

SNMP Trap Configuration

The following procedure details the steps to configure SNMP Trap:
  1. SNMPv2c Configuration.

    When SNMPv2c configuration is needed, ssh to the two switches and run the following commands:

    These values <SNMP_Trap_Receiver_Address> and <SNMP_Community_String> are from Installation Preflight Checklist.

    $ ssh <user_name>@<ToRswitchA_CNEManagementNet_IP>
    Then run the following commands on the switch:
    
    # configure terminal
    (config)# snmp-server host <SNMP_Trap_Receiver_Address> traps version 2c <SNMP_Community_String>
    (config)# snmp-server host <SNMP_Trap_Receiver_Address> use-vrf default
    (config)# snmp-server host <SNMP_Trap_Receiver_Address> source-interface Ethernet1/51
    (config)# snmp-server enable traps
    (config)# snmp-server community <SNMP_Community_String> group network-admin
  2. To restrict the direct access to ToR switches, create IP access list and apply on the uplink interfaces. Use the following commands on ToR switches:
    $ ssh <user_name>@<ToRswitchA_CNEManagementNet_IP>
    Then run the following commands on the switch:
    # configure terminal
    (config)#
    ip access-list Restrict_Access_ToR
      permit ip {Allow_Access_Server}/32 any
      permit ip {NTPSERVER1}/32 {OAM_UPLINK_SwA_ADDRESS}/32
      permit ip {NTPSERVER2}/32 {OAM_UPLINK_SwA_ADDRESS}/32
      permit ip {NTPSERVER3}/32 {OAM_UPLINK_SwA_ADDRESS}/32
      permit ip {NTPSERVER4}/32 {OAM_UPLINK_SwA_ADDRESS}/32
      permit ip {NTPSERVER5}/32 {OAM_UPLINK_SwA_ADDRESS}/32
      deny ip any {CNE_Management_VIP}/32
      deny ip any {CNE_Management_SwA_Address}/32
      deny ip any {CNE_Management_SwB_Address}/32
      deny ip any {SQL_replication_VIP}/32
      deny ip any {SQL_replication_SwA_Address}/32
      deny ip any {SQL_replication_SwB_Address}/32
      deny ip any {OAM_UPLINK_SwA_ADDRESS}/32
      deny ip any {OAM_UPLINK_SwB_ADDRESS}/32
      deny ip any {SIGNAL_UPLINK_SwA_ADDRESS}/32
      deny ip any {SIGNAL_UPLINK_SwB_ADDRESS}/32
      permit ip any any
     
    interface Ethernet1/51
      ip access-group Restrict_Access_ToR in
     
    interface Ethernet1/52
      ip access-group Restrict_Access_ToR in
  3. Configure NAT for traffic egressing the cluster (including SNMP trap traffic to the SNMP trap receiver) and for traffic going to the signaling server:
    $ ssh <user_name>@<ToRswitchA_CNEManagementNet_IP>
    Then run the following commands on the switch:
    
    # configure terminal
    (config)#
    feature nat
    ip access-list host-snmptrap
     10 permit udp 172.16.3.0/24 <snmp trap receiver>/32 eq snmptrap log
     
    ip access-list host-sigserver
     10 permit ip 172.16.3.0/24 <signal server>/32
     
    ip nat pool sig-pool 10.75.207.211 10.75.207.222 prefix-length 27
    ip nat inside source list host-sigserver pool sig-pool overload add-route
    ip nat inside source list host-snmptrap interface Ethernet1/51 overload
     
    interface Vlan3
     ip nat inside
     
    interface Ethernet1/51
     ip nat outside
     
    interface Ethernet1/52
     ip nat outside
     
     
    Run the same commands on ToR switchB
Configuring Addresses for RMS iLOs

Introduction

This section provides the procedure to configure RMS iLO addresses and add a new user account for each RMS other than the Bootstrap Host. When the RMSs are shipped and taken out of the box, after hardware installation and power-up they are in a factory default state with the iLO in Dynamic Host Configuration Protocol (DHCP) mode waiting for DHCP service. DHCP is used to configure the ToR switches, OAs, Enclosure switches, and blade server iLOs. It can also be used to configure the RMS iLOs.

Note:

Skip this procedure if the iLO network is controlled by lab network or customer network that is beyond the ToR switches. The iLO network can be accessed from the bastion host management interface. Perform this procedure only if the iLO network is local on the ToR switches and iLO addresses are not configured on the servers.

Prerequisites

Ensure that the procedure Configure Top of Rack 93180YC-EX Switches has been completed.

Limitations

All steps must be run from the SSH session of the Bootstrap server.

Procedure

Following is the procedure to configure addresses for RMS iLOs:

Setting up the interface on the Bootstrap server and finding the iLO DHCP address
  1. Set up the VLAN interface to access the iLO subnet. The ilo_vlan_id and ilo_subnet_cidr values are the same as in hosts.ini:
    $ nmcli con add con-name bond0.<ilo_vlan_id> type vlan id <ilo_vlan_id> dev bond0
    $ nmcli con mod bond0.<ilo_vlan_id> ipv4.method manual ipv4.addresses <unique ip in ilo subnet>/<ilo_subnet_cidr>
    $ nmcli con up bond0.<ilo_vlan_id>

    Example:

    $ nmcli con add con-name bond0.2 type vlan id 2 dev bond0
    $ nmcli con mod bond0.2 ipv4.method manual ipv4.addresses 192.168.20.11/24
    $ nmcli con up bond0.2
  2. Subnet and conf file address.

    The /etc/dhcp/dhcpd.conf file is already configured as per the OCCNE Configure Top of Rack 93180YC-EX Switches procedure, and DHCP is started and enabled on the Bootstrap server. The second subnet, 192.168.20.0, is used to assign addresses for the OA and RMS iLOs.
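    For reference, an illustrative ISC dhcpd subnet declaration of the kind that serves this range is shown below; the actual dhcpd.conf contents come from the Installation PreFlight checklist and may differ:
    subnet 192.168.20.0 netmask 255.255.255.0 {
      range 192.168.20.100 192.168.20.200;
      option routers 192.168.20.1;
      default-lease-time 10800;
    }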

  3. Display the DHCPD leases file at /var/lib/dhcpd/dhcpd.leases. The DHCPD lease file displays the DHCP addresses for all RMS iLOs:
    $ cat /var/lib/dhcpd/dhcpd.leases
    Sample output:
    
    # The format of this file is documented in the dhcpd.leases(5) manual page.
    # This lease file was written by isc-dhcp-4.2.5
    ...
    lease 192.168.20.106 {
      starts 5 2019/03/29 18:10:04;
      ends 5 2019/03/29 21:10:04;
      cltt 5 2019/03/29 18:10:04;
      binding state active;
      next binding state free;
      rewind binding state free;
      hardware ethernet b8:83:03:47:5f:14;
      uid "\000\270\203\003G_\024\000\000\000";
      client-hostname "ILO2M2909004B";
    }
    lease 192.168.20.104 {
      starts 5 2019/03/29 18:10:35;
      ends 5 2019/03/29 21:10:35;
      cltt 5 2019/03/29 18:10:35;
      binding state active;
      next binding state free;
      rewind binding state free;
      hardware ethernet b8:83:03:47:64:9c;
      uid "\000\270\203\003Gd\234\000\000\000";
      client-hostname "ILO2M2909004F";
    }
    lease 192.168.20.105 {
      starts 5 2019/03/29 18:10:40;
      ends 5 2019/03/29 21:10:40;
      cltt 5 2019/03/29 18:10:40;
      binding state active;
      next binding state free;
      rewind binding state free;
      hardware ethernet b8:83:03:47:5e:54;
      uid "\000\270\203\003G^T\000\000\000";
      client-hostname "ILO2M29090048";
HP iLO address setup and user account setup
  1. Access the RMS iLO at the DHCP address using the default Administrator password. From the above dhcpd.leases file, find the IP address for the iLO name. The default username is Administrator and the password is on the label that can be pulled out from the front of the server.

    Note:

    The DNS Name is on the pull-out label. Use it to match the physical machine with the iLO IP. The same default DNS Name is displayed upon logging in to the iLO command-line interface, as shown in the following example:
    $ ssh Administrator@192.168.20.104
    Sample output:
    
    Administrator@192.168.20.104's password:
    User:Administrator logged-in to ILO2M2909004F.labs.nc.tekelec.com(192.168.20.104 / FE80::BA83:3FF:FE47:649C)
    iLO Standard 1.37 at  Oct 25 2018
    Server Name:
    Server Power: On
  2. Create an RMS iLO new user with customized username and password.
    </>hpiLO-> create /map1/accounts1 username=root password=TklcRoot group=admin,config,oemHP_rc,oemHP_power,oemHP_vm
    
    status=0
    status_tag=COMMAND COMPLETED
    Tue Apr  2 20:08:30 2019
    User added successfully.
  3. Disable DHCP before setting up the static IP. Setting a static IP fails until DHCP is disabled.
    </>hpiLO-> set /map1/dhcpendpt1 EnabledState=NO
    status=0
    status_tag=COMMAND COMPLETED
    Tue Apr  2 20:04:53 2019
    Network settings change applied.
    Settings change applied, iLO 5 will now be reset.
    Logged Out: It may take several minutes before you can log back in.
    CLI session stopped
    packet_write_wait: Connection to 192.168.20.104 port 22: Broken pipe
  4. Setup RMS iLO static IP address.

    After the previous step, log back in with the same address (which is now a static IP) and enter the new username and password. Change the IP address in this step, if required.

    $ ssh <new username>@192.168.20.104
    Sample output:
    <new username>@192.168.20.104's password: <new password>
    User: logged-in to ILO2M2909004F.labs.nc.tekelec.com(192.168.20.104 / FE80::BA83:3FF:FE47:649C)
    iLO Standard 1.37 at  Oct 25 2018
    Server Name:
    Server Power: On
     
    </>hpiLO-> set /map1/enetport1/lanendpt1/ipendpt1 IPv4Address=192.168.20.122 SubnetMask=255.255.255.0
     
    status=0
    status_tag=COMMAND COMPLETED
    Tue Apr  2 20:22:23 2019
     
    Network settings change applied.
    Settings change applied, iLO 5 will now be reset.
    Logged Out: It may take several minutes before you can log
    back in.
      
    CLI session stopped
     
    packet_write_wait: Connection to 192.168.20.104 port 22:
    Broken pipe
  5. Setup RMS iLO default gateway.
    $ ssh <new username>@192.168.20.122
    Sample output:
    <new username>@192.168.20.122's password: <new password>
    User: logged-in to ILO2M2909004F.labs.nc.tekelec.com(192.168.20.104 / FE80::BA83:3FF:FE47:649C)
    iLO Standard 1.37 at  Oct 25 2018
    Server Name:
    Server Power: On
    
    </>hpiLO-> set /map1/gateway1 AccessInfo=192.168.20.1
    
    status=0
    status_tag=COMMAND COMPLETED
    Fri Oct  8 16:10:27 2021
    
    Network settings change applied.
    
    
    Settings change applied, iLO will now be reset.
    Logged Out: It may take several minutes before you can log back in.
    
    CLI session stopped
    Received disconnect from 192.168.20.122 port 22:11:  Client Disconnect
    Disconnected from 192.168.20.122 port 22
Netra X8-2 iLO address setup and user account setup
  1. Access the RMS iLO at the DHCP address using the default root password. From the above dhcpd.leases file, find the IP address for the iLO name. The default username is root and the password is changeme. At the same time, note the DNS Name on the pull-out label.

    Note:

    The DNS Name is on the pull-out label. Use the DNS Name on the pull-out label to match the physical machine with the iLO IP. The same default DNS Name from the pull-out label is displayed upon logging in to the iLO command line interface, as shown in the following example:
    Using username "root".
    Using keyboard-interactive authentication.
    Password:
     
    Oracle(R) Integrated Lights Out Manager
     
    Version 5.0.1.28 r140682
     
    Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
     
    Warning: password is set to factory default.
     
    Warning: HTTPS certificate is set to factory default.
     
    Hostname: ORACLESP-2117XLB00V
  2. The Netra server has a default root user. To change the default password, run the following set command:
    -> set /SP/users/root password
    Enter new password: ********
    Enter new password again: ********
  3. Create an RMS iLO new user with customized username and password:
    -> create /SP/users/<username>
    Creating user...
    Enter new password: ****
    
    create: Non compliant password. Password length must be between 8 and 16 characters.
    Enter new password: ********
    Enter new password again: ********
    Created /SP/users/<username>
  4. Setup RMS iLO static IP address.

    After the previous step, log in with the same address (which is now a static IP) and the new username and password. If you are not using the same address, go to the next step to change the IP address:

    1. Check the current state before change:
      # show /SP/network
    2. Set command to configure:
      # set /SP/network state=enabled|disabled ipdiscovery=static|dhcp
      ipaddress=value ipgateway=value ipnetmask=value
      Example command:
      # set /SP/network state=enabled ipdiscovery=static
      ipaddress=172.16.9.13 ipgateway=172.16.9.1 ipnetmask=255.255.255.0
  5. Commit the changes to implement the updates performed:
    # set /SP/network commitpending=true
Generating SSH Key on Oracle Servers with Oracle ILOM

This section provides the procedures to generate a new SSH key on an Oracle X8-2 and Oracle X9-2 server using the Oracle Integrated Lights Out Manager (ILOM) web interface. The new SSH key is created at the service level and has a length of 3072 bits. It is automatically managed by the firmware.

Firmware Update for Oracle X8-2 Server:
The following firmware versions are updated on the Oracle X8-2 server. This update brings enhancements to the management and operation of the system. For more details, contact My Oracle Support.
  • Oracle ILOM: Version 5.1.3.20, revision 153596
  • BIOS (Basic Input/Output System): Version 51.11.02.00
Firmware Update for Oracle X9-2 Server:

Oracle X9-2 server is compatible with firmware 5.1. For more details, contact My Oracle Support.

Prerequisites

Before creating a new SSH key, ensure that you have the necessary access permissions to log in to the Oracle ILOM web interface.

Generating SSH Key on Oracle X8-2 Server Using iLOM
  1. Open a web browser and access the Oracle ILOM user interface by entering the corresponding IP address in the address bar.
  2. Enter your login credentials for Oracle ILOM.
  3. Perform the following steps to generate the SSH key:
    1. Navigate to the SSH or security configuration section in the following path: ILOM Administration → Management Access → SSH Server
    2. Click Generate Key to generate a new SSH key.

      The system generates a new SSH key of 3072 bits length.

    3. Run the following command on the CLI to validate the generated key:
      -> show -d properties /SP/services/ssh/keys/rsa
      Sample output:
      /SP/services/ssh/keys/rsa
          Properties:
              fingerprint = 53:66:65:85:45:ba:4e:63:2d:aa:ab:8b:ef:fa:95:ac:9e:17:8e:92
              fingerprint_algorithm = SHA1
              length = 3072

      Note:

      • The length of the SSH key is managed by the firmware and set to 3072 bits. There are no options to configure it to 1024 or 2048 bits.
      • Ensure that the client's configuration is compatible with 3072 bit SSH keys.
    4. After making the changes to SSH keys or user configuration in the ILOM web interface, log out from Oracle ILOM and then log back in. This applies the changes without having to restart the entire ILOM.
    5. [Optional]: You can also restart ILOM using the ILOM command line interface by running the following command. This command applies any configuration changes that you've made and initiates a restart of the ILOM:
      -> reset /SP
Generating SSH Key on Oracle X9-2 Server Using iLOM

Use the following properties and commands in the Oracle ILOM CLI to configure and manage SSH settings on the X9-2 server. Refer to the specific documentation for Oracle ILOM version 5.1 on the X9-2 server for any updates or changes to the commands (https://docs.oracle.com/en/servers/management/ilom/5.1/admin-guide/modifying-default-management-access-configuration-properties.html#GUID-073D4AA6-E5CC-45B5-9CF4-28D60B56B548).

The following list provides details about the configurable target and user role requirements:
  • CLI path: /SP/services/ssh
  • Web path: ILOM Administration > Management Access > SSH Server > SSH Server Settings
  • User Role: admin(a). Required for all property modifications.

Table 2-4 SSH Configuration Properties

Property Description
State

Parameter: state=

Description: Determines whether the SSH server is enabled or disabled. When enabled, the SSH server uses the server side keys to allow remote clients to securely connect to the Oracle ILOM SP using a command-line interface. On disabling or restarting, the SSH server automatically terminates all connected SP CLI sessions over SSH.

Default Value: Enabled

CLI Syntax:
-> set /SP/services/ssh state=enabled|disabled

Note: If you are using a web interface, the changes you made to the SSH Server State in the web interface takes effect in Oracle ILOM only after clicking Save. Restarting the SSH server is not required for this property.

Restart Button

Parameter: restart_sshd_action=

Description: This property allows you to restart the SSH server by terminating all connected SP CLI sessions and activating the newly generated server-side keys.

Default Value: NA

Available Options: True, False

CLI Syntax:
-> set /SP/services/ssh restart_sshd_action=true
Generate RSA Key Button

Parameter: generate_new_key_type=rsa generate_new_key_action=true

Description: This property provides the ability to generate a new RSA SSH key. This action is used for creating a new key pair for SSH authentication.

Default Value: NA

CLI Syntax:
-> set /SP/services/ssh generate_new_key_type=rsa generate_new_key_action=true

Note:

  • Periodic firmware updates for Oracle ILOM are crucial. Regularly check for updates to access the new features, improvements, or security enhancements in the Firmware Downloads and Release History for Oracle Systems page.
  • Verify that the clients connecting to the Oracle X8-2 server support 3072 bit SSH keys.
  • For detailed information about SSH key generation and management in your specific environment, refer to the official Oracle ILOM documentation.

Installing Bastion Host

This section describes how to use the Installer Bootstrap Host to provision RMS2 with an operating system and create a VM guest to fulfill the role of Bastion Host. After the Bastion Host is created, it is used to complete the installation of CNE.

Provisioning Second Kubernetes Host (RMS2) from Installer Bootstrap Host (RMS1)

Table 2-5 Terminology used in Procedure

Name Description
bastion_full_name This is the full name of the Bastion Host as defined in the hosts.ini file.

Example: bastion-2.rainbow.lab.us.oracle.com

bastion_kvm_host_full_name This is the full name of the KVM server (usually RMS2/db-2) that hosts the Bastion Host VM.

Example: k8s-host-2.rainbow.lab.us.oracle.com

bastion_short_name This is the name of the Bastion Host derived from the bastion_full_name up to the first ".".

Example: bastion-2

bastion_external_ip_address This is the external address for the Bastion Host

Example: 10.75.148.5 for bastion-2

bastion_ip_address

This is the internal IPv4 "ansible_host" address of the Bastion Host as defined within the hosts.ini file.

Example: 172.16.3.100 for bastion-2

cluster_full_name This is the name of the cluster as defined in the hosts.ini file field: occne_cluster_name.

Example: rainbow.us.lab.us.oracle.com

cluster_short_name This is the short name of the cluster derived from the cluster_full_name up to the first ".".

Note:

Following are the specifications for cluster_short_name value:
  • only lowercase letters and numbers are allowed.
  • underscore and spaces are not allowed.

Example: rainbow

Note:

  • Setup the Bootstrap Host to use root/<customer_specific_root_password> as the credentials to access it. For the procedure to configure the user and password, see Installation of Oracle Linux X.X on Bootstrap Host.
  • The commands and examples in this procedure assume that the Bastion Host is installed on Oracle Linux 9. The procedure varies for other versions.

Procedure

Following is the procedure to install Bastion Host:
  1. Run the following commands to create a user (<user-name>) and edit the sudoers file for no-password sudo.

    Note:

    Skip this step and proceed to the next step if you are installing CNE on servers other than HP Gen10 and Oracle X, as the user creation is already taken care of in the Prerequisites for Servers Other than HP and Oracle X.
    $ groupadd <user-name>
    $ useradd -g <user-name> <user-name>
    $ passwd <user-name>
    <Enter new password twice>    
    
    $ usermod -aG wheel <user-name>
    $ echo "%<user-name> ALL=(ALL) NOPASSWD: ALL" | tee -a /etc/sudoers
  2. Log in as <user-name> with the newly created password and perform the following steps in this procedure as a <user-name>.
  3. Set the cluster_short_name for use in the bootstrap environment, and load it into the current environment. Enter the user name when prompted.
    $ echo 'export OCCNE_CLUSTER=<cluster_short_name>' | sudo tee -a /etc/profile.d/occne.sh
    $ echo 'export OCCNE_USER=<user-name>' | sudo tee -a /etc/profile.d/occne.sh
    $ source /etc/profile.d/occne.sh
    Example:
    $ echo 'export OCCNE_CLUSTER=rainbow' | sudo tee -a /etc/profile.d/occne.sh
    $ echo 'export OCCNE_USER=admusr' | sudo tee -a /etc/profile.d/occne.sh
    $ source /etc/profile.d/occne.sh

    Note:

    After running this step, the bash variable references such as ${OCCNE_CLUSTER} expands to the cluster_short_name in this shell and subsequent ones.
  4. Configure the central repository access on Bootstrap by performing the following steps:
    1. Mount the Utility USB. For information about mounting a USB in Linux, see Installation of Oracle Linux X.X on Bootstrap Host.
    2. Create the cluster specific directory:
      $ sudo mkdir -p -m 0750 /var/occne/cluster/${OCCNE_CLUSTER}/
      $ sudo chown -R ${OCCNE_USER}:${OCCNE_USER} /var/occne/
    3. Configure the central repository access for Bootstrap by performing the Configuring Central Repository Access on Bootstrap procedure.
  5. Copy the OLx ISO to the Installer Bootstrap Host:

    The ISO file must be accessible from a Customer Site Specific repository. The file is reachable because the ToR switch configurations were completed in the procedure: Configure Top of Rack 93180YC-EX Switches.

    Create the /var/occne/os/ directory and copy the OLX ISO file to the directory. The following example uses OracleLinux-R9-U5-x86_64-dvd.iso. If this file was copied to the utility USB, it can be copied from the utility USB into the same directory on the Bootstrap Host.

    Note:

    If the user copies this ISO from their laptop then they must use an application like WinSCP pointing to the Management Interface IP.
    $ mkdir /var/occne/os
    $ scp <usr>@<site_specific_address>:/<path_to_iso>/OracleLinux-R9-U5-x86_64-dvd.iso /var/occne/os/
  6. Install packages onto the Installer Bootstrap Host. Use DNF to install podman, httpd, and sshpass:
    $ sudo dnf install -y podman httpd sshpass
  7. Set up HTTPD on the Installer Bootstrap Host. Run the following commands to mount the ISO and enable the httpd service.

    Note:

    Before running the following commands, ensure that httpd is already installed in the previous step and the OLX ISO file is named OracleLinux-R9-UX-x86_64-dvd.iso.
    $ sudo mkdir -p -m 0755 /var/www/html/occne/pxe
    $ sudo mkdir -p -m 0755 /var/www/html/os/OL9
    $ sudo mount -t iso9660 -o loop /var/occne/os/OracleLinux-R9-UX-x86_64-dvd.iso /var/www/html/os/OL9
    $ sudo systemctl enable --now httpd
  8. Disable SELINUX:
    1. Set SELINUX to permissive mode. A permanent change to the SELINUX mode requires a reboot of the system; the setenforce command in the next step applies only until the next reboot. The getenforce command is used to determine the status of SELINUX.
      $ getenforce
      Enforcing
    2. If the output of this command does not display Permissive or Disabled, change it to Permissive or Disabled by running the following command (This step must be redone if the system reboots before the installation process completes):
      $ sudo setenforce 0
  9. Run the following commands on Bootstrap Host to generate the SSH private and public keys on Bootstrap Host. These keys are passed to the Bastion Host and used to communicate to other nodes from that Bastion Host.

    Note:

    • Do not supply a passphrase when the system asks for one; press Enter instead.
    • The private key (occne_id_rsa) must be copied to a server that is going to access the Bastion Host because the Bootstrap Host is repaved. This key is used later in the procedure to access the Bastion Host after it has been created. The user can also use an SSH client like Putty (keyGen and Pagent) to generate their own key pair and place the public key into the Bastion Host authorized_keys file to provide access. Pagent can also be used to convert the occne_id_rsa .pem key format to .ppk format using putty to access the Bastion Host.
    $ mkdir -p -m 0700 /var/occne/cluster/${OCCNE_CLUSTER}/.ssh
    $ ssh-keygen -b 4096 -t rsa -C "occne installer key" -f "/var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa" -q -N ""
    $ mkdir -p -m 0700 ~/.ssh
    $ cp /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa ~/.ssh/id_rsa
    If you are installing CNE on HP Gen10 or Oracle X, then the public key is passed to each node during OS installation. However, if you are installing CNE on other servers, copy the public key to the rest of the cluster nodes that already have an OS installed by performing the following step.
  10. If you are installing CNE on servers other than HP Gen10 or Oracle X, ensure that you can run SSH commands on the other nodes from the Bootstrap Host without typing passwords for SSH login or sudo access:

    Note:

    Skip this step if you are installing CNE on HP Gen10 or Oracle X.
    # Replace ${IP1} ${IP2} ... with the addresses of the pre-provisioned nodes.
    # The ssh command verifies passwordless SSH login and passwordless sudo on each node.
    for NODE_ADDR in ${IP1} ${IP2} ... ; do
       echo Checking access to $NODE_ADDR
       ssh $NODE_ADDR "sudo hostname"
    done
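    If the public key has not yet been placed on the pre-provisioned nodes, one way to distribute it is with ssh-copy-id, as in the following sketch. This assumes password-based SSH is still enabled on those nodes and is not the only way to distribute the key:
    for NODE_ADDR in ${IP1} ${IP2} ... ; do
       # Copies the occne_id_rsa.pub key into the node's authorized_keys file
       ssh-copy-id -i /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa.pub ${NODE_ADDR}
    done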

Performing Automated Installation

This section details the steps to run the automated configuration of the Bastion Host VM.

This procedure involves two stages:
  1. Setting up and running the deploy.sh script on the Bootstrap Host.
  2. Accessing the Bastion Host and implementing the final commands to run the pipeline.sh script to complete the Bastion Host configuration and deploy the CNE cluster.
To run the automated configuration of the Bastion Host VM:
  1. Set up and run the deploy.sh script on the Bootstrap Host:
    The deploy.sh script performs the initial configuration of the Bastion Host. This includes installing the OS on the Bastion and its kvm-host, populating the Bastion with repositories, and verifying that everything is up to date. The script is run on the Bootstrap Host using a set of environment variables that can be initialized on the command line along with the deploy.sh script (a sample invocation follows the table). These variables include the following:

    Table 2-6 Environmental Variables

    Name Comment Example usage
    OCCNE_BASTION The full name of Bastion Host OCCNE_BASTION=bastion-2.rainbow.us.labs.oracle.com
    OCCNE_VERSION The version tag of the image releases OCCNE_VERSION=25.2.1xx
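    A possible invocation pattern is sketched below. This is only an illustration: it assumes that deploy.sh has been copied to the cluster directory by the copy steps that follow and that the environment variables are supplied inline; the actual variables and values depend on your deployment:
    $ cd /var/occne/cluster/${OCCNE_CLUSTER}
    $ OCCNE_VERSION=25.2.1xx OCCNE_BASTION=bastion-2.rainbow.us.labs.oracle.com ./deploy.sh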
  2. Copy necessary files from CNE provision container and configure deployment:

    Note:

    If you want to save any of the files after configuration for future use, then copy those files from Bootstrap Host.
    1. Run the following command to set the CNE version:
      $ export OCCNE_VERSION=25.2.1xx
    2. Depending on the type of Load Balancer (MetalLB or CNLB) you want to use for traffic segregation, use one of the following commands to run podman and copy all the necessary files to configure and run the BareMetal deployment:

      Note:

      Ensure that you copy and run the command as is, without making any changes.
      1. Run the following command for MetalLB:
        $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in config_files/. scripts/. ../common/scripts/. ../../common/scripts/. ; do cp -r /platform/bare_metal/metallb/"$source" /host; done'
      2. There are two sets of files that must be copied to the Bootstrap Host: the common scripts and the files contained within the CNLB installer container.

        Run the following commands for CNLB:

        1. Files copied from the provision container:
          $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in switch_install cluster/. templates/. scripts/. ../common/scripts/. ../../common/scripts/. ../common/config_files/. ; do cp -r /platform/bare_metal/cnlb/"$source" /host; done'
        2. Files copied from the CNLB installer container:
          $ mkdir /var/occne/cluster/${OCCNE_CLUSTER}/installer
          $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/cnlb_installer:${OCCNE_VERSION} -c 'for source in installer/validateCnlbIni.py installer/cnlb_logger.py installer/utils.py; do cp -r "$source" /host/installer/; done'
    3. Copy the hosts_sample.ini or hosts_sample_remoteilo.ini file to the cluster directory as hosts.ini file and configure the file by performing the CNE Inventory File Preparation procedure.
      $ cp /var/occne/cluster/${OCCNE_CLUSTER}/hosts_sample.ini /var/occne/cluster/${OCCNE_CLUSTER}/hosts.ini

      Note:

      If you are installing CNE on servers other than HP Gen10 or Oracle X, then while configuring hosts.ini, ensure that every system that already has a preprovisioned OS has pre_provisioned=True appended to its inventory declaration line.
      For example:
      [host_baremetal]
      k8s-host-1.airraid.lab.us.oracle.com ansible_host=172.16.3.4 oam_host=10.148.217.4 pre_provisioned=True
      k8s-host-2.airraid.lab.us.oracle.com ansible_host=172.16.3.5 oam_host=10.148.217.5 pre_provisioned=True
      k8s-host-3.airraid.lab.us.oracle.com ansible_host=172.16.3.6 pre_provisioned=True
    4. Verify the occne_repo_host_address field in the hosts.ini file and check whether the bond0 IP address is configured as per Configuring Top of Rack 93180YC-EX Switches. If not, modify the file to configure the required value:
      $ vi /var/occne/cluster/${OCCNE_CLUSTER}/hosts.ini
      Sample output:
      occne_repo_host_address = <bootstrap server bond0 address>
  3. Depending on the type of Load Balancer (MetalLB or CNLB) you want to use for traffic segregation, use one of the following steps to configure the MetalLB or CNLB configuration file:
    • If you want to use MetalLB for traffic segregation, configure the MetalLB configuration file by performing the Populate the MetalLB Configuration File procedure.
    • If you want to use CNLB for traffic segregation, configure the CNLB configuration file by performing the following steps:
      1. Copy the cnlb.ini.template file to the cluster directory as cnlb.ini:
        $ cp /var/occne/cluster/${OCCNE_CLUSTER}/cnlb.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/cnlb.ini
      2. Configure CNLB by performing the Configuring Cloud Native Load Balancer (CNLB) procedure.
  4. Run the deploy.sh script from the cluster directory by setting the required parameters:
    $ cd /var/occne/cluster/${OCCNE_CLUSTER}
    $ export OCCNE_BASTION=<bastion_full_name>
    $ ./deploy.sh
    For example:
    $ cd /var/occne/cluster/${OCCNE_CLUSTER}
    $ export OCCNE_BASTION=bastion-2.rainbow.lab.us.oracle.com
    $ ./deploy.sh

    Note:

    The release version defaults to the current GA release. If you want to use a different version, specify it by adding the OCCNE_VERSION=<release> variable to the command line (see the example at the end of this procedure).
  5. Run the following commands from the Bastion Host to complete the Bastion Host configuration and deploy CNE on the BareMetal system.

    Note:

    The Bootstrap Host cannot be used to access the Bastion Host because it is repaved when this command runs.
    1. Log in to the Bastion Host as <user-name>.
    2. Use the private key that was saved earlier to access the Bastion Host from a server other than the Bootstrap Host.
      $ ssh <user-name>@<bastion_external_ip_address>
    3. Copy this private key to the /home/<user-name>/.ssh directory on that server as id_rsa using SCP or winSCP from a desktop PC. Set the permissions of the key to 0600 using the following command:
      chmod 0600 ~/.ssh/id_rsa
    4. Customize the common services and cnDBTier installation if required by using the Common Installation Configuration section.
    5. Run the following command to complete the deployment of CNE from the Bastion Host (excluding re-install on the Bastion Host and its KVM host, which are already set up). This action repaves the Bootstrap Host RMS.
      $ /var/occne/cluster/${OCCNE_CLUSTER}/artifacts/pipeline.sh

    Note:

    The release version defaults to the current GA release. If you want to use a different version, specify the version by adding the OCCNE_VERSION=<release> variable to the command line.
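
For reference, the following shows how the OCCNE_VERSION and OCCNE_BASTION variables can be supplied on the command line as described in the notes above. The values are examples only; substitute the release and Bastion Host name for your deployment.
  • On the Bootstrap Host:
    $ OCCNE_VERSION=25.2.1xx OCCNE_BASTION=bastion-2.rainbow.us.labs.oracle.com ./deploy.sh
  • On the Bastion Host:
    $ OCCNE_VERSION=25.2.1xx /var/occne/cluster/${OCCNE_CLUSTER}/artifacts/pipeline.sh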

Installing BareMetal CNE using Bare Minimum Servers

This section provides the procedure to configure and install CNE on a BareMetal deployment using bare minimum servers (three worker nodes).

Prerequisites

Before performing this procedure, ensure that you meet the following prerequisites:
  • This procedure must be followed only for a fresh installation of BareMetal CNE with minimal resources.
  • Ensure that you have performed all the Preinstallation Tasks.
  • Ensure that you have read and understood all the instructions provided in the BareMetal Installation section.
  • Ensure that you have configured the worker nodes with at least the following resources:
    • vCPU: 32
    • RAM: 65 GB
    • Disk: 80 GB
  • Ensure that you have configured the controller nodes with at least the following resources:
    • vCPU: 2
    • RAM: 7.5 GB
    • Disk: 40 GB

Procedure

  1. Run the following command to edit the hosts.ini file:
    $ vi /var/occne/cluster/${OCCNE_CLUSTER}/hosts.ini
  2. Add the opensearch_data_replicas_count variable under the [occne:vars] section and set the value to 3. The value indicates the number of controller and worker nodes (a verification example is provided after this procedure):
    ...
     
    [occne:vars]
     
    ...
      
    opensearch_data_replicas_count=3
     
    [openstack:vars]
     
    ...
  3. Follow the BareMetal installation procedure and run the deploy.sh script when indicated.
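
After step 2, you can confirm that the variable was added with a quick check similar to the following:
$ grep opensearch_data_replicas_count /var/occne/cluster/${OCCNE_CLUSTER}/hosts.ini
Sample output:
opensearch_data_replicas_count=3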

Virtualized CNE Installation

This section explains how to install CNE in an OpenStack environment and a VMware Cloud Director environment.

Note:

Before installing CNE on vCNE, you must complete the preinstallation tasks.

Installing vCNE on OpenStack Environment

This installation procedure details the steps necessary to configure and install a CNE cluster in an OpenStack environment. Currently, two disk configurations are supported for installing vCNE on OpenStack:
  • Custom Configurable Volumes (CCVs) as block devices for each VM resource: CCVs allow customers to configure the Cinder volume type on their OpenStack cloud. This configuration is used to create the hard disk for each VM resource.
  • Non-configurable volumes that are created by OpenStack using the flavors assigned to each VM resource. This is the standard configuration.
CNE supports the following Load Balancers for traffic segregation:
  • Load Balancer VM (LBVM)
  • Cloud Native Load Balancer (CNLB)
You can choose the type of Load Balancer used for traffic segregation while installing CNE on OpenStack. The predeployment configurations for CNE installation vary depending on the type of Load Balancer you choose. Therefore, read the Predeployment Configuration for OpenStack procedure carefully and choose the configurations depending on your Load Balancer type.
Prerequisites

Before installing and configuring CNE on OpenStack, ensure that the following prerequisites are met.

  1. You must have access to an existing OpenStack environment and OpenStack Dashboard (web-based user interface).
  2. Ensure that the Nova, Neutron, and Cinder modules are configured. The OpenStack environment uses these modules to manage compute, networking, and storage, respectively.
  3. Ensure that the OpenStack environment is configured with appropriate resource flavors and network resources for resource allocation to the VMs.
  4. The DHCP Enabled value for the OpenStack subnet in each MetalLB pool must be set to "Yes".
  5. Ensure that all the required images, binaries, and files are downloaded from the Oracle OSDC before running this procedure. Ensure that these resources are available for use in this procedure. For instructions on how to generate the lists of images and binaries, see the Artifact Acquisition and Hosting chapter.
  6. You must obtain a public key (that can be configured) for logging into the Bootstrap Host. Before running this procedure, you must place the public key into the customer's OpenStack Environment as follows:

    Use the Import Key tab on the Launch Instance→Key Pair dialog or the Compute→Access and Security screen.

  7. Ensure that there is a default or custom security group which allows the Bootstrap instance to reach the central repository.
Expectations
  1. You must be familiar with the use of OpenStack as a virtualized provider including the use of the OpenStack Client and OpenStack Dashboard.

    Note:

    The Bootstrap host doesn't provide OpenStack CLI tools. Therefore, you must use OpenStack Dashboard or an external Linux instance with access to OpenStack CLI to fetch the data or values from OpenStack.
  2. You must make a central repository available for all resources like images, binaries, and helm charts before running this procedure.
  3. The default user on installation is "cloud-user" and this is still the recommended value. However, starting with release 23.4.0, it is possible to define a different default user. This user is used to access the VMs, run configuration tasks, and manage the cluster.

    Note:

    You can change the default user only during installation and not during an upgrade. When you change the default user, ensure that you use the changed user to run all the commands in the entire procedure. Refrain from using the root user.
  4. You must define all the necessary networking protocols, such as using fixed IPs or floating IPs, for use on the OpenStack provider.
  5. If you select Custom Configurable Volumes (CCVs), the size defined for each volume must match the size of the disk as defined in the flavor used for the given VM resource.
  6. When using CCV, you must be fully aware of the volume storage used. If there is insufficient volume storage on the cloud on which CNE is deployed, the deployment fails while applying the Terraform (MetalLB) or OpenTofu (CNLB) configuration.
Downloading OLX Image

Download the OLX image by following the procedure in the Downloading Oracle Linux section.

Note:

The letter 'X' in Oracle Linux X or OLX in this procedure indicates the latest version of Oracle Linux supported by CNE.
Uploading Oracle Linux X to OpenStack

This procedure describes the process to upload the qcow2 image obtained in the Downloading OLX Image procedure to an OpenStack environment.

Note:

Run this procedure from the OpenStack Dashboard.
  1. Log in to OpenStack Dashboard using your credentials.
  2. Select Compute → Images.
  3. Click the +Create Image button. This displays the OpenStack Create Image dialog.
  4. In the OpenStack Create Image dialog, enter a name for the image.

    Use a name similar to the name of the qcow2 image at the time of download. It's recommended to include at least the OS version and the update version as part of the name. For example: ol9u5.

  5. Under Image Source, select File. This enables the File* → Browse button to search for the image file.

    Click the File* → Browse button to display the Windows Explorer dialog.

  6. From the Windows dialog, select the qcow2 image that is downloaded in the Downloading OLX Image section. This inserts the file name and sets the Format option to QCOW2 - QEMU Emulator automatically.

    Note:

    If Format isn't set automatically, use the drop-down to select QCOW2 - QEMU Emulator.
  7. Retain the default values for the other options. However, you can adjust the Visibility and Protected options according to your requirement.
  8. Click the Create Image button at the bottom right corner of the dialog. This starts the image upload process.

    It takes a while for the system to complete uploading the image. During the upload process, the system doesn't display any progress bar or final confirmation.

  9. Navigate to Compute → Images to verify the uploaded image.
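
If you have access to an external Linux instance with the OpenStack CLI configured (the Bootstrap host does not provide the CLI), the image can typically also be uploaded from the command line. The following is a sketch only; the qcow2 path is a placeholder and ol9u5 is the example image name used above:
$ openstack image create --disk-format qcow2 --container-format bare --file <path-to-olx-qcow2> ol9u5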
Creating Bootstrap Host Using OpenStack Dashboard
This section describes the procedures to create the Bootstrap host using the OpenStack Dashboard. There are two ways to create the Bootstrap host; select one depending on your requirement:

Note:

  • A separate bootstrap image (qcow2) isn't required and isn't provided as part of the artifacts. The Bootstrap VM is created as an instance of a regular base OS image, similar to the other VM instances in the cluster.
  • Use the following examples for reference only. The actual values differ from the example values.
  • Perform the following procedures manually on the customer specific OpenStack environment.

Creating Bootstrap Host Using Nonconfigurable Volumes

The Bootstrap host drives the creation of the virtualized cluster using Terraform (MetalLB) or OpenTofu (CNLB), OpenStack Client, and Ansible Playbooks.

Note:

These tools are installed as part of the previous step and are not bundled along with the base image.
  1. Log in to the OpenStack Dashboard using your credentials.
  2. Select Compute→Instances.
  3. Select the Launch Instance button on the upper right. A dialog box appears to configure a VM instance.
  4. In the dialog box, enter a VM instance name. For example, occne-<cluster-name>-bootstrap. Retain the Availability Zone and Count values as is.
  5. Perform the following steps to select Source from the left pane:

    Note:

    There can be a long list of available images to choose from. Ensure that you choose the correct image.
    1. Ensure that the Select Boot Source drop-down is set to Image.
    2. Enter the OLX image name that you created using the Uploading Oracle Linux X to OpenStack procedure in the Available search filter to find the required image. For example, ol9u5.

      Note:

      Do not use a Bootstrap image from any earlier versions of CNE. As noted above, a separate Bootstrap image is no longer provided; the Bootstrap VM is created from the base OLX image.
    3. Select the OLX image by clicking "↑" on the right side of the image listing. This adds the image as the source for the current VM.
  6. Perform the following steps to select Flavor from the left pane:
    1. In the Available search filter, enter a string (not case-sensitive) that best describes the flavor used for this customer-specific OpenStack environment. This reduces the list of possible choices.
    2. Select the appropriate customer specific flavor (for example, OCCNE-Bootstrap-host) by clicking "↑" on the right side of the flavor listings. This adds the resources to the Launch Instance dialog.

      Note:

      The Bootstrap requires a flavor with a disk size of 40 GB or higher and a RAM size of 8 GB or higher.
  7. Perform the following steps to select Networks from the left pane:
    1. In the Available search filter, enter the appropriate network name as defined by the customer in the OpenStack environment (for example, ext-net). This reduces the list of possible choices.
    2. Select the appropriate network by clicking "↑" on the right side of the network listings. This adds the external network interface to the Launch Instance dialog.
  8. Perform the following step to select Key Pair from the left pane. This dialog assumes you have already uploaded a public key to OpenStack. For more information, see Prerequisites:
    • Choose the appropriate key by clicking "↑" on the right side of the key pair listings. This adds the public key to the authorized_keys file on the Bootstrap Host.
  9. Select Configuration from the left pane. This screen allows you to add configuration data that cloud-init uses to set the initial username and hostname on the VM and to add FQDN entries to the /etc/hosts file (a filled-in example is provided after this procedure).
    Copy the following configuration into the Customization Script text box:

    Note:

    • Ensure that the fields marked as <instance_name_from_details_screen> are updated with the instance name provided as per step 4 in this procedure.
    • Ensure that the <user-name> field is updated. The recommended value for this field is "cloud-user".
    #cloud-config
       hostname: <instance_name_from_details_screen>
       fqdn: <instance_name_from_details_screen>
       system_info:
         default_user:
           name: <user-name>
           lock_passwd: false
       write_files:
         - content: |
             127.0.0.1  localhost localhost4 localhost4.localdomain4 <instance_name_from_details_screen>
             ::1        localhost localhost6 localhost6.localdomain6 <instance_name_from_details_screen>
           path: /etc/hosts
           owner: root:root
           permissions: '0644'
  10. Select Launch Instance at the bottom right of the Launch Instance window. This initiates the creation of the VM. After the VM creation process is complete, you can see the VM instance on the Compute→Instances screen.
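
For reference, a filled-in version of the customization script from step 9 might look like the following, assuming an instance name of occne-rainbow-bootstrap and the recommended cloud-user user:
#cloud-config
   hostname: occne-rainbow-bootstrap
   fqdn: occne-rainbow-bootstrap
   system_info:
     default_user:
       name: cloud-user
       lock_passwd: false
   write_files:
     - content: |
         127.0.0.1  localhost localhost4 localhost4.localdomain4 occne-rainbow-bootstrap
         ::1        localhost localhost6 localhost6.localdomain6 occne-rainbow-bootstrap
       path: /etc/hosts
       owner: root:root
       permissions: '0644'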

Creating Bootstrap Host Using Custom Configurable Volumes (CCV)

The CCV deployment includes additional steps to create the volume for the Bootstrap host prior to creating the Bootstrap host instance.

Note:

The Bootstrap host drives the creation of the virtualized cluster using Terraform (MetalLB) or OpenTofu (CNLB), OpenStack Client, and Ansible Playbooks.

To create Bootstrap host using Custom Configurable Volumes (CCV), perform the following steps manually on the customer specific OpenStack environment:

Creating Custom Configurable Volume (CCV)

Note:

Make sure you know the volume size defined by the flavor that is used to create the Bootstrap host. This information is required to create the Bootstrap host volume.
  1. Login to the OpenStack Dashboard using your credentials.
  2. Select Compute→Volumes.
  3. Select the + Create Volume button on the top right.

    The system displays a dialog box to configure the volume.

  4. Enter the volume name. Example: occne-bootstrap-host.
  5. Enter a description. Example: Customer Configurable Volume for the Bootstrap host.
  6. From the Volume Source drop-down list, select image.
  7. Select the OLX image that you created using the Uploading Oracle Linux X to OpenStack procedure. For example, ol9u5.

    Note:

    Do not use a Bootstrap image from any earlier versions of CNE.
  8. From the Type drop-down list, select the volume type. Example: nfs, or as configured on the cloud.
  9. In the Size (GiB) field, enter the size of the image. This size must match the size defined for the Bootstrap host flavor that is used.
  10. From the Availability Zone drop-down list, select an availability zone.
  11. From the Group drop-down list, select No group.
  12. Click Create Volume.

    The system creates the Custom Configurable Volume (CCV) that is used to create the Bootstrap host VM.

Creating Bootstrap Host VM
  1. Log in to the OpenStack Environment using your credentials.
  2. Select Compute→Instances.
  3. Select the Launch Instance button on the top right. A dialog box appears to configure a VM instance.
  4. In the dialog box, enter a VM instance name. For example, occne-<cluster-name>-bootstrap. Retain Availability Zone and Count values as is.
  5. Perform the following steps to select Source from the left pane:

    Note:

    There can be a long list of available images to choose from. Ensure that you choose the correct image.
    1. Ensure that the Select Boot Source drop-down is set to Volume.
    2. Ensure that the Delete Volume on Instance Delete is set to Yes.
    3. Enter the volume name in the Available search filter. The system displays the volume that you created in the previous section. For example, occne-bootstrap-host.
    4. Select the volume by clicking "↑" on the right side of the volume listing. This adds the volume as the source for the current VM.
  6. Perform the following steps to select Flavor from the left pane:
    1. In the Available search filter, enter a string (not case-sensitive) that best describes the flavor used for this customer-specific OpenStack environment. This reduces the list of possible choices.
    2. Select the appropriate customer specific flavor (for example, OCCNE-Bootstrap-host) by clicking "↑" on the right side of the flavor listings. This adds the resources to the Launch Instance dialog.

      Note:

      The Bootstrap requires a flavor with a disk size of 40 GB or higher and a RAM size of 8 GB or higher.
  7. Perform the following steps to select Networks from the left pane:
    1. In the Available search filter, enter the appropriate network name as defined by the customer in the OpenStack environment (for example, ext-net). This reduces the list of possible choices.
    2. Select the appropriate network by clicking "↑" on the right side of the network listings. This adds the external network interface to the Launch Instance dialog.
  8. Perform the following step to select Key Pair from the left pane. This dialog assumes that you have already uploaded a public key to OpenStack. For more information, see Prerequisites:
    • Choose the appropriate key by clicking "↑" on the right side of the key pair listings. This adds the public key to the authorized_keys file on the Bootstrap host.
  9. Select Configuration from the left pane. This screen allows you to add configuration data that cloud-init uses to set the initial username and hostname on the VM and to add FQDN entries to the /etc/hosts file.
    Copy the following configuration into the Customization Script text box:

    Note:

    • Ensure that the fields marked as <instance_name_from_details_screen> are updated with the instance name provided as per step 4 in this procedure.
    • Ensure that the <user-name> field is updated. The recommended value for this field is "cloud-user".
    #cloud-config
       hostname: <instance_name_from_details_screen>
       fqdn: <instance_name_from_details_screen>
       system_info:
         default_user:
           name: <user-name>
           lock_passwd: false
       write_files:
         - content: |
             127.0.0.1  localhost localhost4 localhost4.localdomain4 <instance_name_from_details_screen>
             ::1        localhost localhost6 localhost6.localdomain6 <instance_name_from_details_screen>
           path: /etc/hosts
           owner: root:root
           permissions: '0644'
  10. Select Launch Instance at the bottom right of the Launch Instance window. This initiates the creation of the VM. After the VM creation process is complete, you can see the VM instance on the Compute→Instances screen.
Predeployment Configuration for OpenStack
This section provides information about the configurations that are performed before installing CNE in an OpenStack deployment.

Note:

Run all the commands in this section from the Bootstrap host.

Logging in to Bootstrap VM

Use SSH to log in to Bootstrap using the private key uploaded to OpenStack. For more information about the private key, see Prerequisites.

For example:
$ ssh -i $BOOTSTRAP_PRIVATE_KEY $USER@$BOOTSTRAP_EXT_IP

The values used in the example are for reference only. You must obtain the Bootstrap external IP from Compute → Instances on the OpenStack Dashboard. The $USER parameter is the same as <user-name>.

Setting the Cluster Short-Name and User Variables

Depending on the type of Load Balancer you want to use for network segregation, use one of the following options to set the cluster short-name and other necessary variables for use in the Bootstrap environment, and load them into the current environment:
  • Use the following commands for LBVM:
    $ echo 'export OCCNE_CLUSTER=<cluster_short_name>' | sudo tee -a  /etc/profile.d/occne.sh
    $ echo 'export OCCNE_USER=<user-name>' | sudo tee -a /etc/profile.d/occne.sh
    $ . /etc/profile.d/occne.sh
    For example:
    $ echo 'export OCCNE_CLUSTER=occne1-rainbow' | sudo tee -a  /etc/profile.d/occne.sh
    $ echo 'export OCCNE_USER=cloud-user' | sudo tee -a /etc/profile.d/occne.sh
    $ . /etc/profile.d/occne.sh
  • Use the following commands for CNLB:
    $ echo 'export OCCNE_CLUSTER=<cluster_short_name>' | sudo tee -a  /etc/profile.d/occne.sh
    $ echo 'export OCCNE_USER=<user-name>' | sudo tee -a /etc/profile.d/occne.sh
    $ echo 'export OCCNE_vCNE=openstack' | sudo tee -a /etc/profile.d/occne.sh
    $ . /etc/profile.d/occne.sh
    For example:
    $ echo 'export OCCNE_CLUSTER=occne1-rainbow' | sudo tee -a  /etc/profile.d/occne.sh
    $ echo 'export OCCNE_USER=cloud-user' | sudo tee -a /etc/profile.d/occne.sh
    $ echo 'export OCCNE_vCNE=openstack' | sudo tee -a /etc/profile.d/occne.sh
    $ . /etc/profile.d/occne.sh

In this step, Bash variable references such as ${OCCNE_CLUSTER} expand to the cluster short-name in this shell and in subsequent shells.
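
To confirm that the variables are loaded into the current shell, you can run a quick check such as the following (output shown for the CNLB example values above; for LBVM, only OCCNE_CLUSTER and OCCNE_USER are set):
$ env | grep OCCNE
Sample output:
OCCNE_CLUSTER=occne1-rainbow
OCCNE_USER=cloud-user
OCCNE_vCNE=openstack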

Creating Cluster Specific Directories

Create the base occne directory, cluster directory (using the cluster short-name), and YUM local repo directories.

For example:
  • Use the following commands to create the base directory:
    $ sudo mkdir -p -m 0750 /var/occne
    $ sudo chown -R ${USER}:${USER} /var/occne
  • Use the following command to create the cluster directory:
    $ mkdir -p -m 0750 /var/occne/cluster/${OCCNE_CLUSTER}/

Obtaining TLS Certificate for OpenStack

Depending on your environment, OpenStack most likely uses a TLS certificate for accessing the OpenStack controller API. Without the certificate, Terraform (MetalLB) or OpenTofu (CNLB) cannot communicate with the OpenStack environment and fails. Therefore, you must obtain the TLS certificate before using OpenStack.

Note:

Perform this step only if your OpenStack environment requires a TLS certificate to access the controller from the cluster nodes and the Bootstrap host (deployment only).
  1. Contact the OpenStack admin for the required TLS certificate to access the client commands. For example, in an Oracle OpenStack system installed with kolla, the certificate is available at /etc/kolla/certificates/openstack-cacert.pem.
  2. Copy the entire certificate (including the intermediate and root CA, if provided) to /var/occne/cluster/${OCCNE_CLUSTER}/openstack-cacert.pem on the Bootstrap Host. Run this step after creating the cluster-specific directories.

    Ensure that the certificate file name is openstack-cacert.pem, when you copy the file to the /var/occne/cluster/${OCCNE_CLUSTER}/ directory.

    If the certificate file name is different, then rename it to openstack-cacert.pem before copying to the /var/occne/cluster/${OCCNE_CLUSTER}/ directory.

  3. Set the OS_CACERT environment variable to /var/occne/cluster/${OCCNE_CLUSTER}/openstack-cacert.pem using the following command:
    export OS_CACERT=/var/occne/cluster/${OCCNE_CLUSTER}/openstack-cacert.pem
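
    As an optional sanity check, you can inspect the copied certificate with openssl to confirm that it is a valid PEM certificate and has not expired. This reads only the first certificate in the file:
    $ openssl x509 -in ${OS_CACERT} -noout -subject -issuer -dates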

Getting the OpenStack RC (API v3) File

The OpenStack RC (API v3) file exports several environment variables on the Bootstrap host. Terraform (MetalLB) or OpenTofu (CNLB) uses these environment variables to communicate with OpenStack and create different cluster resources.

Note:

The following instructions may slightly vary depending on the version of OpenStack Dashboard you're using.
  1. From the OpenStack Dashboard, go to Project → API Access.
  2. From the Download OpenStack RC File drop-down menu on the right side, choose OpenStack RC File (Identity API v3).

    This downloads an openrc.sh file prefixed with the OpenStack project name (for example, OCCNE-openrc.sh) to your local system.

  3. Copy the file securely (using SCP or WinSCP) to the Bootstrap host in the /home/${USER} directory as .<project_name>-openrc.sh

    Note:

    In order for SCP or WinSCP to work properly, use the key mentioned in the Prerequisites to access the Bootstrap host. Also, it may be necessary to add the appropriate Security Group Rules to support SSH (Rule: SSH, Remote: CIDR, CIDR: 0.0.0.0/0) under the Network → Security Groups page in the OpenStack Environment. If required, contact the OpenStack administrator to add the correct rules.
  4. Run the following command to source the OpenStack RC file:
    source .<project_name>-openrc.sh
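
    Sourcing the file typically prompts for your OpenStack password and exports the OS_* environment variables. A quick check (the variable values vary by environment):
    $ env | grep '^OS_'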

Creating SSH Key on Bootstrap Host

Create the private and public keys to access the other VMs. The following commands generate the keys that are passed to the Bastion Host and used to communicate with other nodes from that Bastion Host.

$ mkdir -p -m 0700 /var/occne/cluster/${OCCNE_CLUSTER}/.ssh
$ ssh-keygen -m PEM -t rsa -b 2048 -f /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa -N ""
$ cp /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa ~/.ssh/id_rsa

Note:

The Bootstrap host is transient. Therefore, create a backup of the occne_id_rsa private key and copy it to a safe location so that you can still access the Bastion Host in case of Bootstrap failures. This key is also used to access the Bastion Host when performing other maintenance actions such as upgrades.

You can also use an SSH client like Putty (keyGen and Pagent) to generate your own key pair and place the public key into the Bastion Host authorized_keys file to provide access.

Pagent can also be used to convert the .pem format of the occne_id_rsa private key to .ppk format for accessing Bastion Host using Putty.

Configuring Central Repository Access on Bootstrap

Configure the central repository access on Bootstrap by following the steps provided in the Configuring Central Repository Access on Bootstrap section.

Verifying Central Repository and Install Required Packages

In previous versions of CNE, the Bootstrap image came with the prerequisite YUM packages and their dependencies built in. This is no longer the case. When you use the Oracle Linux base OS image, most of the packages and dependencies are handled by the deploy.sh script and are installed directly from the central repository. However, you must manually install Podman from the central repository.
  1. [Optional]: Allow access to the central repository. Ensure that a proper security group that allows access to the central repository is created. For more information, see Prerequisites. If necessary, add a security group to the Bootstrap instance by performing the following steps:
    1. Navigate to Compute → Instances → occne-<cluster-name>-bootstrap.
    2. On the right most drop-down menu, select Edit Instance.
    3. Select Security Groups.
    4. Click the plus symbol on the default or custom security group to add it to the Bootstrap image.
    5. The security group may already be allocated, depending on the OpenStack environment.
  2. Perform the following steps to test the central repository:
    1. Run the following command to perform a simple ping test:
      $ ping -c 3 ${CENTRAL_REPO}
      Sample output:
      PING winterfell (0.0.0.0) 56(84) bytes of data.
      64 bytes from winterfell (128.128.128.128): icmp_seq=1 ttl=60 time=0.448 ms
      64 bytes from winterfell (128.128.128.128): icmp_seq=2 ttl=60 time=0.478 ms
      64 bytes from winterfell (128.128.128.128): icmp_seq=3 ttl=60 time=0.287 ms
       
      --- winterfell ping statistics ---
      3 packets transmitted, 3 received, 0% packet loss, time 2060ms
      rtt min/avg/max/mdev = 0.287/0.404/0.478/0.085 ms
    2. Run the following command to list the repositories:
      $ dnf repolist
      Sample output:
      repo id                                    repo name
      ol9_UEKR7                                  Unbreakable Enterprise Kernel Release 7 for Oracle Linux 9 (x86_64)
      ol9_addons                                 Oracle Linux 9 Addons (x86_64)
      ol9_appstream                              Application packages released for Oracle Linux 9 (x86_64)
      ol9_baseos_latest                          Oracle Linux 9 Latest (x86_64)
      ol9_developer                              Packages for creating test and development environments for Oracle Linux 9 (x86_64)
      ol9_developer_EPEL                         EPEL Packages for creating test and development environments for Oracle Linux 9 (x86_64)
  3. Run the following command to install Podman:
    $ sudo dnf install -y podman
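
    To verify the installation, you can check the installed Podman version (the exact version reported depends on the Oracle Linux release):
    $ podman --version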

Copying Necessary Files to Bootstrap

  1. Copy the <project_name>-openrc.sh script, created in a previous section, from the user's home directory to the cluster directory:
    $ cp ~/<project_name>-openrc.sh /var/occne/cluster/${OCCNE_CLUSTER}/openrc.sh
  2. The Bootstrap is not preloaded with default settings, including OCCNE_VERSION. It is necessary to temporarily set the CNE version before copying or downloading the cluster files. Use the following command to set the CNE version:
    $ export OCCNE_VERSION=<occne_version>
    For example:
    $ export OCCNE_VERSION=25.2.100
    This value is permanently set by the deploy script in a later step.
  3. Create a scripts directory in the cluster directory. Copy the scripts to the newly created directory, and copy the Terraform (MetalLB) or OpenTofu (CNLB) templates into the cluster directory:

    Note:

    The location to the templates and the cluster directory vary depending on the type of Load Balancer (MetalLB or CNLB) used. Use one of the following commands to copy the relevant templates to the relevant cluster directory.
    Use the following command for LBVM:
    $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -v /var/occne:/var/occne:z --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in ../../scripts modules misc/. templates/. tffiles/. scripts/. ../common/templates/. ../../../../common/scripts ; do cp -r /platform/vcne/lbvm/terraform/openstack/"$source" /host; done'
    Use the following command for CNLB:
    $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -v /var/occne:/var/occne:z --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in modules ../scripts misc/. templates/. tffiles/. ../cluster/. ../../common/opentofu/templates/. ../../common/scripts ../../../../common/scripts ; do cp -r /platform/vcne/cnlb/openstack/opentofu/"$source" /host; done'
    $ mkdir /var/occne/cluster/${OCCNE_CLUSTER}/installer
    $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/cnlb_installer:${OCCNE_VERSION} -c 'for source in installer/validateCnlbIni.py installer/cnlb_logger.py installer/utils.py; do cp -r "$source" /host/installer/; done'
  4. Run the following command to update the ownership of the files in the cluster directory. Ensure that you update the ownership of all the copied files.
    $ sudo chown -R ${USER}:${USER} /var/occne/cluster/${OCCNE_CLUSTER}
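
    As a quick check, you can list the cluster directory to confirm that the files were copied and the ownership was updated. The exact contents vary with the Load Balancer type, but the copied scripts (for example, deploy.sh) and templates should be present and owned by ${USER}:
    $ ls -l /var/occne/cluster/${OCCNE_CLUSTER}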

Updating the cluster.tfvars File

The cluster.tfvars file contains all the variables required by Terraform (MetalLB) or OpenTofu (CNLB) to implement the cluster. Depending on the type of Load Balancer you want to use for traffic segregation, use one of the following sections to modify and complete a copy of the occne_example/cluster.tfvars template (copied from the provisioning container as part of the previous step):

Note:

  • You must configure the fields in the cluster.tfvars file to adapt to the current OpenStack environment. Some of the fields and information must be obtained directly from the OpenStack Dashboard or OpenStack CLI (not bundled with the Bootstrap). The given procedures provide details on how to collect and set the fields that must be changed, but don't provide examples of OpenStack CLI usage.
  • All the fields in the cluster.tfvars file must be unique. Ensure that there are no duplicate fields in the file, and refrain from duplicating fields even as commented lines (using the "#" tag), as this can cause parsing errors. A quick duplicate check is shown below.
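
The following is one way to check for accidentally duplicated variable names; it prints nothing when all fields are unique. The cluster.tfvars path is a placeholder for the location of your edited copy:
$ grep -vE '^[[:space:]]*#' <path-to>/cluster.tfvars | grep '=' | cut -d '=' -f1 | tr -d '[:space:]' | sort | uniq -d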
Deploying CNE Cluster in OpenStack Environment

This section describes the procedure to deploy the VMs in the OpenStack Environment, configure the Bastion Host, and deploy and configure the Kubernetes clusters.

Running Deploy Command

Note:

The Environment Variables section describes the list of possible environment variables that can be combined with the deploy.sh command. Ensure that you refer to this section before you proceed with the deployment.
  1. Run the following command to copy the occne.ini.template file to define the required Ansible variables for the deployment:
    $ cp /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini
    
  2. Run the following command to edit the newly created occne.ini file:
    $ vi /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini

    See the "occne.ini Variables" table in the Environment Variables section for details about the occne.ini variables that can be combined with the deploy.sh command to further define the implementation of the deployment.

  3. Create a copy of the secrets.ini.template file:
    $ cp /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini
    
  4. For LBVM deployments, edit the secrets.ini file created in Step 3 and set the required secrets.ini variables:
    $ vi /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini
    

    See the "secrets.ini Variables" table in the Environment Variables section for details about the secrets.ini variables that are required by the deploy.sh command to install the cluster.

  5. Customize the common services by referring to the Common Installation Configuration section.
  6. Run the following command from the /var/occne/cluster/${OCCNE_CLUSTER}/ directory on the Bootstrap Host. This command can take a while to run, typically 2 to 4 hours depending on the machines it's running on.
    $ ./deploy.sh

    Note:

    If the deploy.sh command fails during installation, you can troubleshoot as follows:
    • Check the configuration parameter values and perform reinstallation.
    • Contact Oracle support for assistance.

    The system displays a message similar to the following when the CNE cluster is deployed successfully in an OpenStack Environment:

    Sample output for LBVM:
    ...
    -POST Post Processing Finished
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    Fri Sep 6 06:51:52 PM UTC 2024
    Connection to 10.x.x.x closed.
    /var/occne/cluster/$OCCNE_CLUSTER/artifacts/pipeline.sh completed successfully
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    Sample output for CNLB:
    ...
    -POST Post Processing Finished
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    Mon Jun  17 17:52:13 UTC 2024
    /var/occne/cluster/<cluster-name>/artifacts/pipeline.sh completed successfully
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
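
After the pipeline completes, a basic post-deployment check is to log in to the Bastion Host using the private key saved earlier and confirm that all Kubernetes nodes are in the Ready state. This assumes kubectl is configured on the Bastion Host, which is the standard CNE setup:
$ ssh <user-name>@<bastion_external_ip_address>
$ kubectl get nodes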

Installing vCNE on VMware Environment

This installation procedure details the steps necessary to configure and install a CNE cluster in a VMware environment.

CNE supports the following Load Balancers for traffic segregation:
  • Load Balancer VM (LBVM)
  • Cloud Native Load Balancer (CNLB)
You can choose the type of Load Balancer used for traffic segregation while installing CNE on VMware. The configurations for CNE installation vary depending on the type of Load Balancer you choose. Therefore, read the Configuring Bootstrap VM and Predeployment Configuration for VMware procedures carefully and configure CNE depending on your Load Balancer type.
Prerequisites

Before installing CNE on VMware, ensure that the following prerequisites are met.

  1. A vSphere vCenter account must be available with the following permissions:
    1. Datastore
      • Low level file operations
    2. Host
      • Configuration
        • Storage partition configuration
    3. Cns
      • Searchable
    4. Profile-driven storage
      • Profile-driven storage view
    5. Virtual machine
      • Change configuration
        • Add existing disk
        • Add or remove a device
        • Advanced configuration
  2. Configure the following minimal versions of both vCloud Director and vSphere to allow correct functioning of vCNE:
    • vCloud Director (VCD) = 10.2.1
    • vSphere or vCenter = 7.0
  3. A vCloud Director account with "vApp Author" and "Organization Network View" rights must be available.

    Note:

    The solution is to create a role with all the required rights and assign the account to this role. Check the vApp Author default rights in VMware Cloud Director Documentation 10.2.
  4. Configure the vCloud Director environment with the backend.cloneBiosUuidOnVmCopy parameter set to 0 to ensure that VMs created from a vApp template have a different BIOS UUID than the BIOS UUID of the template.
  5. Configure the VMware environment with appropriate CPU, disk, RAM, and Network required for resource allocation to the VMs.
  6. Set up the central repository by following the Setting Up a Central Repository procedure.
  7. Download all the necessary images, binaries, and files. Also, set up and configure the central and YUM repository. For procedures, see the Artifact Acquisition and Hosting section.
  8. Users must be familiar with using VMware as a virtual provider, vCloud Director, and vSphere.
  9. All necessary networking must be defined for use on the VMware.

For more information, refer to the VMware Cloud Director and vSphere documentation.

Downloading Oracle Linux X Image

Download the OLX image by following the procedure in the Downloading Oracle Linux section.

Note:

The 'X' in Oracle Linux X or OLX in the procedure indicates the version of Oracle Linux supported by CNE.
Uploading OLX Image as VMware Media

This section describes the procedure to upload OLX image as VMware media.

  1. Log in to VMware GUI using your credentials.
  2. Click Libraries on the top navigation bar.
  3. From the left panel, select Media & Other.
  4. Click Add.
  5. Select ISOS from the Select a catalog drop-down.
  6. Click the up arrow on Select media to upload to select a file.
  7. Select the OLX image (for example, OL9) downloaded in the Downloading Oracle Linux X Image section and click OK.
Creating a Template

This procedure describes the process to create a template.

Use this template to clone all the virtual machines (VMs) for CNE installation.

Note:

The following procedure considers creating a template for OL9 and provides the configurations accordingly. The options vary for other versions.

Procedure

  1. Log in to the vCloud Director Environment using your credentials.
  2. Select the Virtual DataCenter where you want to create the template.
  3. On the left panel, under Compute section, click vApps.
  4. Click on the NEW drop-down list and select New vApp.
  5. Enter the vAPP name and click CREATE.

    For example: Name: bootstrap-VERSION-template

  6. Open the newly created vAPP.
  7. From the ALL-ACTIONS drop-down, select Add → Add VM.
  8. On the pop-up window that appears, click ADD VIRTUAL MACHINE and fill the fields as given in the following example:
    • Refer to the following example to fill name, description, and type:
      Name: CLUSTER-bootstrap
      Computer Name: CLUSTER-bootstrap
      Description: (Optional)
      Type: select New.
    • Refer to the following example to fill the Operating System section:
      Operating System:
          OS family: Linux
          Operating System: Oracle Linux 9 (64-bit)
          Boot image: OracleLinux-R9-UX-x86_64-dvd.iso

      Note:

      The specific OS version for the Operating System field may not be present. In such cases, select an older Oracle Linux distribution.
    • Refer to the following example to fill the Compute section:
      Compute:
          Virtual CPUs: 1
          Cores per socket: 1
          Number of sockets: 1
          Memory: 4GB
    • Refer to the following example to fill the Storage section:
      Storage:
          Size: 32GB (type the desired value in GB, E.g: 32GB, 64GB, 128GB)
      
    • Perform the following steps to configure the Networking section.
      1. Click ADD NETWORK TO VAPP.
      2. Check Type Direct.
      3. From the Org VDC Network Connection table, select a network to add.
      4. Click ADD.
      5. In the Networks table, select the following options:
        Network: <Select the network added in the previous section>
        Network Adapter Type: VMXNET3
        IP Mode: Static - IP Pool
        IP Address: Auto-assigned
        Primary NIC: Selected

        Note:

        • IP Address is auto-assigned. This default IP address cannot be changed.
        • Primary NIC cannot be deselected with only one NIC.
  9. Click OK and then click ADD.

    The system creates the VM. Wait until the VM is created.

  10. Perform the following steps to connect to the new VM:
    1. Select the newly created VM and click ALL ACTIONS.
    2. Select Media.
    3. Click Insert Media and select OracleLinux-R9-UX-x86_64-dvd.iso.
    4. Click INSERT at the bottom right corner.
    5. Click POWER ON and wait for the VM to start.
    6. When the VM is available, connect to the VM by choosing one of the following options:
      • Click LAUNCH WEB CONSOLE (recommended option).
      • Click LAUNCH REMOTE CONSOLE (this option requires installing external software, which is not covered in this procedure). Once active, the remote console opens.
  11. Perform the following steps to install Oracle Linux X:

    Note:

    The following steps provide the procedure to install Oracle Linux 9.X. The options may vary depending on the Linux version you are installing. Therefore, select the options as per your Linux version.
    1. Select Test this media & install Oracle Linux 9.X.X.

      The system displays the installation window after running the test.

    2. On the Welcome to Oracle Linux X screen, select your preferred language and click Continue.

      The system displays the Installation Summary page. Configure each section by performing the following steps.

    3. [Optional]: From Localization, choose your desired options for Keyboard, Language Support and Time & Date.
    4. Perform the following steps to configure the Software section:
      1. Select Software → Installation Source and ensure that Auto Detect installation media is checked.
      2. Click Done.
      3. Select Software → Software Selection and choose Minimal Installation from the Base Environment panel on the left.

        Note:

        Don't select any items from the Additional software for Selected Environment panel on the right.
      4. Click Done.
    5. Perform the following steps to configure the System section:
      1. Select System → Installation Destination.
      2. Select Custom under Storage Configuration and click Done.
      3. On the Manual Partitioning screen that appears, create three partitions by clicking the + symbol and selecting the following options per partition:
        • Mount Point: /boot - Desired Capacity: 1024
        • Mount Point: swap - Desired Capacity: 4G
        • Mount Point: / - Desired Capacity: (Leave this field blank to use the rest of the available space)

        Click Add Mount Point to confirm each partition.

      4. Click Done.
      5. On the Summary of Changes screen that appears, click Accept Changes.
    6. Perform the following steps to configure the User Settings section:
      1. Select User Settings → Root Password.
      2. On the Root Password field, type a strong root password.
      3. Select Allow root SSH login with password.
      4. Click Done.
    7. Retain the default values for the rest of the System settings (KDUMP, Network & Host Name, Security Profile).
    8. Click Begin Installation when you complete all the previous steps.
  12. Once the installation is completed, wait for the VM to reboot and log in using LAUNCH WEB CONSOLE or LAUNCH REMOTE CONSOLE.
  13. Make sure that one of the network interfaces is set to the same IP address that was assigned by vCloud Director.
  14. From the remote console, run the following command to configure network:
    $ ip address
    If there is no IP address on the new VM, run the following nmcli commands (a filled-in example is provided after this procedure):
    $ nmcli con mod <interface name> ipv4.method manual ipv4.addresses <ip-address/prefix> ipv4.gateway <gateway> connection.autoconnect yes
    $ nmcli con up <interface name>

    Run the $ ip address command again to verify if the IP address has changed.

    Use the IP address shown in the "NICs" section, and get the gateway and prefix from the VMware GUI: Networking → <select networking name> → Static IP Pools → Gateway CIDR.

  15. If a proxy is required to reach yum.oracle.com, add the proxy= parameter to the /etc/dnf/dnf.conf file.
  16. Check that the nameservers are listed in the /etc/resolv.conf file. If the file is empty, fill in the required values.
  17. Run the following command to update all the packages to the latest versions.
    $ dnf update -y
  18. Run the following command to install the packages required for CNE Installation.
    $ dnf install -y perl-interpreter cloud-utils-growpart
  19. Change the following line in /etc/sudoers using the vi command as the root user:
    1. Run the following commands to open the file in edit mode:
      $ chmod 640 /etc/sudoers
      $ sudo vi /etc/sudoers
    2. Search for and comment out the following line:
      %wheel ALL=(ALL) ALL
    3. Uncomment the following line:
      # %wheel ALL=(ALL) NOPASSWD: ALL
  20. Run the following commands to enable VMware customization tools:
    $ vmware-toolbox-cmd config set deployPkg enable-customization true
    $ vmware-toolbox-cmd config set deployPkg enable-custom-scripts true
  21. Run the following commands to clean the temporary files:
    $ dnf clean all
    $ logrotate -f /etc/logrotate.conf
    $ find /var/log/ -type f -iname '*gz' -delete
    $ find /var/log/ -type f -iname "*$(date +%Y)*" -delete
    $ for log in $(find /var/log/ -type f -size +0) ; do echo " " > $log ; done
  22. Remove any proxy parameter added to /etc/dnf/dnf.conf.
  23. Unmount the media from the VM.
  24. Power off the VM from vCloud Director.
  25. Log in to vSphere vCenter and search for the name given to the created VM.
  26. Right click on the VM and click Edit Settings.
  27. Click the VM Options tab.
  28. Expand the Advanced drop-down list.
  29. Search for Configuration Parameters and select Edit Configuration.
  30. Add the disk.EnableUUID parameter and set it to TRUE.
  31. Go back to the vCloud Director GUI and search for the vApp created previously.
  32. Click the Actions drop-down and select Create Template.
  33. Add template to a catalog.
  34. Enter a name for the template in the Name field.
  35. Select Make identical copy and click OK.

    The new template is stored in Libraries / vApp Templates.
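
For reference, a filled-in example of the nmcli commands from step 14 is shown below. The interface name (ens192) and the addresses are placeholders; use the interface name from the ip address output and the values shown in the VMware GUI for your network:
$ nmcli con mod ens192 ipv4.method manual ipv4.addresses 192.168.20.10/24 ipv4.gateway 192.168.20.1 connection.autoconnect yes
$ nmcli con up ens192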

Configuring Bootstrap VM

This procedure describes the process to configure a Bootstrap Virtual Machine (VM).

Procedure

  1. Create a new VM from CNE template:
    1. Log in to the VMware GUI using your credentials.
    2. Click Data Centers on the top of the page.
    3. Select Virtual Machines from the left panel.
    4. Click New VM and perform the following steps:
      1. Input the name of the new VM in Name.
      2. Update the name of the computer in Computer Name if you want a different name than the default one.
      3. Select the From Template option if it is not preselected by default.
      4. Click Power on if it is not preselected by default.
      5. Select the template from the available list.
      6. Click OK.
  2. When the VM is powered on, select the newly created VM and perform the following steps:
    1. From the left pane, select NICs.
    2. Click Edit.
    3. Select the Connected checkbox.
    4. From the Network dropdown, select the required network.
    5. From the IP Mode dropdown, select Static - IP Pool.
    6. Click Save.
  3. Check if the values of Network and IP Mode are set as per the values provided in the previous step. Also, check if the IP Address column displays a valid IP address.
  4. Connect to the VM by using either LAUNCH WEB CONSOLE (recommended option) or LAUNCH REMOTE CONSOLE (this option requires installing external software, which is not covered in this procedure).
  5. From the remote console, log in with the root user and password:
    1. Run the following command to get the IP address:
      $ ip address
    2. Run the following command if there is no IP address on the new VM yet or if the IP address does not match the IP address in the "NICs" section of VMware GUI:
      $ nmcli con mod <interface name> ipv4.method manual ipv4.addresses <ip-address/prefix> ipv4.gateway <gateway> connection.autoconnect yes
      where,
      • <interface name> is the interface name obtained in the previous step.
      • <ip-address/prefix> and <gateway> are the IP address, its prefix and gateway details obtained from the VMware GUI (Networking → <select networking name> → Static IP Pools → Gateway CIDR).
      Run the following command to restart the interface:
      $ nmcli con up <interface name>
    3. Verify if the IP reflected in GUI is correctly assigned on the interface:
      $ ip address 
  6. If reaching yum.oracle.com requires a proxy, add the proxy= parameter to the /etc/dnf/dnf.conf file.
  7. Check if the nameservers are indicated in the /etc/resolv.conf file. If it's empty, populate it with the required values.
  8. Run the following commands to install packages and create a <user-name> user. The recommended value for <user-name> is cloud-user. When prompted, enter a new password twice for the previously created user:
    $ dnf update -y
    $ dnf install -y oraclelinux-developer-release-el9 oracle-epel-release-el9
    $ dnf install -y rsync podman python3-pip
     
    $ groupadd -g 1000 <user-name>
    $ useradd -g <user-name> <user-name>
    $ passwd <user-name>
    $ echo "<user-name> ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
     
    $ su - <user-name>
    $ mkdir -m 700 /home/<user-name>/.ssh
  9. Create the /home/<user-name>/.ssh/config file:
    $ vi /home/<user-name>/.ssh/config
    Add the following content to the file:
    ServerAliveInterval 10
    TCPKeepAlive yes
    StrictHostKeyChecking=no
    UserKnownHostsFile=/dev/null
  10. Perform the following steps to set the cluster short name and central repository variables:
    1. Set the following cluster variables for the bootstrap environment and load them into the current environment:

      Note:

      The <cluster name> and <user-name> parameters must contain lowercase alphanumeric characters, '.' or '-', and must start with an alphanumeric character.
      $ echo 'export LANG=en_US.utf-8' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export LC_ALL=en_US.utf-8' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export OCCNE_VERSION=25.2.100' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export OCCNE_PREFIX=' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export OCCNE_CLUSTER=<cluster name>' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export OCCNE_USER=<user-name>' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export OCCNE_vCNE=vcd' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export OCCNE_TFVARS_DIR=/var/occne/cluster/<cluster name>/<cluster name>' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export VCD_USERNAME=<username>' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export VCD_PASSWORD=<password>' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export VCD_AUTH_URL=https://<vcd IP address>' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export VCD_ORG=<org>' | sudo tee -a  /etc/profile.d/occne.sh
      $ echo 'export VCD_VDC=<virtual data center>' | sudo tee -a  /etc/profile.d/occne.sh
      $ source /etc/profile.d/occne.sh
      Run the following command to confirm that all cluster variables are loaded to occne.sh:
      $ cat /etc/profile.d/occne.sh
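      Optionally, you can also confirm that the variables are exported in the current shell, for example:
      $ env | grep -E 'OCCNE|VCD'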
  11. Run the following commands to create a directory specific to your cluster (using the cluster short-name):
    $ sudo mkdir /var/occne/
    $ sudo chown -R <user-name>:<user-name> /var/occne/
    $ mkdir -p -m 0750 /var/occne/cluster/${OCCNE_CLUSTER}/
  12. Create the private and public keys that are used to access the other VMs. Run the following commands to generate the keys that are passed to the Bastion Host and used to communicate with other nodes from that Bastion Host.
    $ mkdir -p -m 0700 /var/occne/cluster/${OCCNE_CLUSTER}/.ssh
    $ ssh-keygen -m PEM -t rsa -b 2048 -f /tmp/occne_id_rsa -N ""
    $ cp /tmp/occne_id_rsa ~/.ssh/id_rsa
    $ cp /tmp/occne_id_rsa /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa
    $ sudo chmod 600 /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa
    $ ssh-keygen -f ~/.ssh/id_rsa -y > ~/.ssh/id_rsa.pub
    $ ssh-keygen -f /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa -y > /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa.pub
    $ rm /tmp/occne_id_rsa

    Note:

    The private key (occne_id_rsa) must be backed up and copied to a server that will be used, longer term, to access the Bastion Host, because the Bootstrap Host is transient. The key is used to access the Bastion Host when performing other maintenance actions such as upgrade. You can also use an SSH client such as PuTTY (PuTTYgen and Pageant) to generate your own key pair and place the public key in the Bastion Host authorized_keys file to provide access. PuTTYgen can also be used to convert the occne_id_rsa private key from .pem format to .ppk format so that PuTTY can access the Bastion Host.
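    For example, one way to back up the private key is to copy it to a long-lived server with scp (the destination host and path shown here are placeholders):
      $ scp /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa <user>@<backup-server>:/path/to/backup/occne_id_rsa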
  13. Configure the central repository access on Bootstrap by following the steps provided in the Configuring Central Repository Access on Bootstrap section.
  14. Perform the following steps to copy necessary files to Bootstrap Host:
    1. Copy the templates to cluster directory:

      Note:

      • The location to the templates and the cluster directory vary depending on the type of Load Balancer (LBVM or CNLB) used. Use one of the following commands to copy the relevant templates to the relevant cluster directory.
      • Copy the entire for loop and run it in the terminal as a single command.
      Use the following command for LBVM:
      $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -v /var/occne:/var/occne:z --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in ../../scripts modules misc/. templates/. tffiles/. scripts/. ../common/templates/. ../../../../common/scripts; do cp -r /platform/vcne/lbvm/terraform/vcd/"$source" /host; done'
      Use the following commands for CNLB:
      $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -v /var/occne:/var/occne:z --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/provision:${OCCNE_VERSION} -c 'for source in modules ../scripts misc/. templates/. tffiles/. ../cluster/. ../../common/opentofu/templates/. ../../common/scripts ../../../../common/scripts ; do cp -r /platform/vcne/cnlb/vcd/opentofu/"$source" /host; done'
      
      $ mkdir /var/occne/cluster/${OCCNE_CLUSTER}/installer
      $ podman run -it --rm -v /var/occne/cluster/${OCCNE_CLUSTER}:/host --entrypoint bash ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/cnlb_installer:${OCCNE_VERSION} -c 'for source in installer/validateCnlbIni.py installer/cnlb_logger.py installer/utils.py; do cp -r "$source" /host/installer/; done'
    2. Update the ownership of the files in the cluster directory:
      $ sudo chown -R ${USER}:${USER} /var/occne/cluster/${OCCNE_CLUSTER}
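      As a quick sanity check, you can list the cluster directory to confirm that the templates and scripts were copied (the exact contents vary by Load Balancer type):
      $ ls /var/occne/cluster/${OCCNE_CLUSTER}/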
Predeployment Configuration for VMware

This section describes the procedure to deploy the CNE cluster in a VMware Cloud Director (VCD) Environment.

  1. Customize the occne.ini file: This file contains important information about the vSphere account (different from VCD) that is needed for the cluster to run. You can copy the occne.ini.template file available in the same directory to create the initial file, and then customize it.
    1. Create a copy of the occne.ini file from its template:
      $ cp /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini
    2. Open the copied file and fill out all required parameters as given in the following sample:

      Note:

      • external_vsphere_version is a mandatory parameter.
      • external_vsphere_datacenter is not the same as VCD_VDC.
      • Ensure that you retrieve all the information from the VCD administrator, otherwise, the cluster will not work correctly.
      • The occne.ini file configuration varies depending on the type of Load Balancer (LBVM or CNLB) used. Refer to the relevant example and configure the file depending on your Load Balancer type.
      Sample occne.ini file for LBVM:
      [occne:vars]
      occne_cluster_name= <cluster-name>
       
      ####
      # The ipvs scheduler type when proxy mode is ipvs
      # this variable will be added with default value of "rr"( round robin).
      # rr: round-robin
      # sh: source hashing
      # kube_proxy_scheduler=rr
       
      ####
      ## Central repository for bastion to retrieve YUM, Docker registry, and HTTP access to files (helm charts, etc)
      central_repo_host=
      central_repo_host_address=
      # central_repo_protocol=http
       
      ####
      ## Auto filled by deploy.sh from values in TFVARS file
      occne_ntp_server =
      occne_cluster_network =
       
      ## Indicate DNS nameservers, comma separated
      name_server = <name_server1>,<name_server2>
       
      # Specify 'True' (case matters, no quotes) to deploy services as ClusterIP instead of LoadBalancer. Default is 'False'
      # cncc_enabled=False
       
      # Below is the default calico_mtu value. Change if needed.
      # The value should be a number, not a string.
      calico_mtu = 1500
       
      # Below is the default kube_network_node_prefix value. Change if needed.
      # Default value has room for 128 nodes. The value should be a number, not a string.
      # kube_network_node_prefix_value=25
       
      # [vcd:vars]
       ## Specify the vSphere information of the External vSphere Controller/CSI Cinder plugin accounts,
      ## account must be provided by OCCNE personel and it is needed for deployment.
      ## All the rest of the information is the same as used before.
      external_vsphere_version = <6.7u3> or <7.0u1> or <7.0u2>
      external_vsphere_vcenter_ip = <Need to be completed>
      external_vsphere_vcenter_port = <Need to be completed>
      external_vsphere_insecure = <Need to be completed>
      external_vsphere_user = <Need to be completed>
      external_vsphere_password = <Need to be completed>
      external_vsphere_datacenter = <Need to be completed>
      external_vsphere_kubernetes_cluster_id = <cluster-name>
       
      # vCloud Director Information required for LB Controller.
      # User must have catalog author permissions + Org Network View
      # User and password must use alphanumeric characters, it can be uppercase or lowercase
      # The password cannot contain "/" or "\", neither contains or be contained in "()"
      vcd_user             = <Need to be completed>
      vcd_passwd           = <Need to be completed>
      org_name             = <Need to be completed>
      org_vdc              = <Need to be completed>
      vcd_url              = <Need to be completed>
      Sample occne.ini file for CNLB:
      [occne:vars]
      occne_cluster_name = <cluster-name>
       
      ####
      ## Central repository for bastion to retrieve YUM, Docker registry, and HTTP access to files (helm charts, etc)
      central_repo_host = <Need to be completed>
      central_repo_host_address = <Need to be completed>
      # central_repo_protocol=http
        
      ####
      ## Auto filled by deploy.sh from values in TFVARS file
      occne_ntp_server =
      occne_cluster_network =
       
      # See section 5.4.8 to fill in the following fields.
      occne_prom_cnlb = <Need to be completed>
      occne_alert_cnlb = <Need to be completed>
      occne_graf_cnlb = <Need to be completed>
      occne_nginx_cnlb = <Need to be completed>
      occne_jaeger_cnlb = <Need to be completed>
      occne_opensearch_cnlb = <Need to be completed>
       
      ## Indicate DNS nameservers, comma separated
      name_server = <name_server1>,<name_server2>
       
      # Specify 'True' (case matters, no quotes) to deploy services as ClusterIP instead of LoadBalancer. Default is 'False'
      # cncc_enabled=False
       
      # Below is the default calico_mtu value. Change if needed.
      # The value should be a number, not a string.
      calico_mtu = 1500
       
      # Below is the default kube_network_node_prefix value. Change if needed.
      # Default value has room for 128 nodes. The value should be a number, not a string.
      # kube_network_node_prefix_value=25
       
      [vcd:vars]
      ## Specify the vSphere information of the External vSphere Controller/CSI Cinder plugin accounts,
      ## account must be provided by OCCNE personel and it is needed for deployment.
      ## All the rest of the information is the same as used before.
      external_vsphere_version = <6.7u3> or <7.0u1> or <7.0u2>
      external_vsphere_vcenter_ip = <Need to be completed>
      external_vsphere_vcenter_port = <Need to be completed>
      external_vsphere_insecure = <Need to be completed>
      external_vsphere_datacenter = <Need to be completed>
      external_vsphere_kubernetes_cluster_id = <cluster-name>
        
      # vCloud Director Information required for LB Controller.
      # User must have catalog author permissions + Org Network View
      # User and password must use alphanumeric characters, it can be uppercase or lowercase
      # The password cannot contain "/" or "\", neither contains or be contained in "()"
      org_name             = <Need to be completed>
      org_vdc              = <Need to be completed>
      vcd_url              = <Need to be completed>
       
      # The ipvs scheduler type when proxy mode is ipvs
      # this variable will be added with default value of "rr"( round robin).
      # rr: round-robin
      # sh: source hashing
       
      ipvs_scheduler=rr
  2. Configure the secrets.ini file: The secrets.ini file contains the cluster and vSphere account (different from VCD) credentials. This information is required to allow the cluster to run correctly.
    1. Create a copy of the secrets.ini file from its template:
      $ cp /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/secrets.ini
    2. Open the copied file and fill all the required parameters as given in the following sample:
      [occne:vars]
       
      # Set grub password
      occne_grub_password=
      
      # Username and password for Bastion container registry
      occne_registry_user=
      occne_registry_pass=
       
      [vcd:vars]
      ## Specify the vSphere information of the External vSphere Controller/CSI Cinder plugin accounts
      ## needed for deployment.
      external_vsphere_user =
      external_vsphere_password =
      
      vcd_user=
      vcd_passwd=
  3. Configure the cluster.tfvars file: The cluster.tfvars file contains all the variables required by Terraform (LBVM) or OpenTofu (CNLB) to configure the cluster.

    CNE supports multiple address pools for LBVM. Therefore, the range of IP addresses and the ports must be specified in the file, along with the networks to be used, the template or catalog for VM creation, and all the parameters that allow the process to access VCD. The following sample shows a fully set up cluster.tfvars file. Some of the parameters that are completed by default are not shown in the following sample.

    Note:

    The occne_metallb_peer_addr_pool_names name cannot be a substring of the cluster_name. For example, if the cluster_name is mysignal1, then occne_metallb_peer_addr_pool_names cannot be sig.
    1. Create the directory where the cluster.tfvars file must be copied:
      $ mkdir -p -m 0750 /var/occne/cluster/${OCCNE_CLUSTER}/${OCCNE_CLUSTER}
    2. Copy the cluster.tfvars file from its template to the new directory:
      $ cp /var/occne/cluster/${OCCNE_CLUSTER}/occne_example/cluster.tfvars /var/occne/cluster/${OCCNE_CLUSTER}/${OCCNE_CLUSTER}/cluster.tfvars
    3. Open the copied file and fill all the required parameters as given in the following samples:

      Note:

      The cluster.tfvars file configuration varies depending on the type of Load Balancer (LBVM or CNLB) used. Refer to the relevant example and configure the file depending on your Load Balancer type.
      Sample cluster.tfvars file for LBVM:
      # vCloud Director Information required to create resources.
      # User must have catalog author permissions + Org Network View
      # User and password must use alphanumeric characters, it can be uppercase or lowercase
      # The password cannot contain "/" or "\", neither contains or be contained in "()"
      vcd_user             = "<Need to be completed>"
      vcd_passwd           = "<Need to be completed>"
      org_name             = "<Need to be completed>"
      org_vdc              = "<Need to be completed>"
      vcd_url              = "<Need to be completed>"
      allow_unverified_ssl = true
      cluster_name         = "<customer_specific_short_cluster_name>"
      #if affinity rules will be created set polarity to "Affinity"  if anti-affinity rules set polarity to "Anti-Affinity"
      polarity             = "<Anti-Affinity or Affinity>"
      # specify if the affinity/anti-affinity rule will be hard or soft  if hard set variable to true if soft set variable to false.
      hard_rule            = <true or false>
      # Network used for cluster communication, this network must have the feature to create SNAT and DNAT rules
      private_net_name     = "<Need to be completed>"
      # Networks used for external network communication, normally used for Bastion and LBVMs
      ext_net1_name        = "<Need to be completed>"
      ext_net2_name        = "<Need to be completed>"
      # Catalog and template name where the vApp template is stored. This template will be used for all the VM's
      catalog_name         = "<Need to be completed>"
      template_name        = "<Need to be completed>"
       
      # number of hosts
      number_of_bastions = 2
      number_of_k8s_ctrls_no_floating_ip = 3
      number_of_k8s_nodes = <number of worker nodes>
       
      # Amount of RAM assigned to VM's, expressed in MB
      memory_bastion = "4096"
      memory_k8s_node = "32768"
      memory_k8s_ctrl = "8192"
      memory_lbvm = "2048"
      # Amount of CPU assigned to VM's
      cpu_bastion = "2"
      cpu_k8s_node = "8"
      cpu_k8s_ctrl = "2"
      cpu_lbvm = "4"
      # Amount of cores assigned to VM's. Terraform has a bug related to templates created with a different
      # core number of the ones assigned here, it is suggested to use the same number as the template
      cores_bastion = 1
      cores_k8s_ctrl = 1
      cores_k8s_node = 1
      cores_lbvm = 1
      # Disk size, expressed in MB. Minimum disk size is 25600.
      disk_bastion = "102400"
      disk_k8s_node = "40960"
      disk_k8s_ctrl = "40960"
      disk_lbvm = "40960"
       
      # <user-name> used to deploy your cluster, should be same as $OCCNE_USER
      # Uncomment the below line only if $OCCNE_USER is NOT "cloud-user"
      # ssh_user= "<user-name>"
       
      # Update this list with the names of the pools.
      # It can take any value as an address pool name e.g. "oam", "signaling", "random_pool_name_1", etc.
      # This field should be set depending on what peer address pools the user wishes to configure.
      # eg ["oam"]  or ["oam", "signaling"] or ["oam", "signaling", "random_pool_name_1"]
      # Note : "oam" is required while other network pools are application specific.
      #
      occne_metallb_peer_addr_pool_names = ["oam"]
        
      # Use the following for creating the Metallb peer address pool object:
      #
      # (A) num_pools = number of network pools as defined in occne_metallb_peer_addr_pool_names
      #
      # (B) Configuring pool_object list.
      # Each object in this list must have only 4 input fields:
      # 1. pool_name : Its name must match an existing pool defined in occne_metallbaddr_pool_names list
      # 2. num_ports = number of ips needed for this address pool object
      # 3. Configuring 3rd input field : use only one of the three ip address input fields for
      #    each peer address pool. The other two input fields should be commented out or deleted.
      #
      # - .. cidr : A string representing a range of IPs from the same subnet/network.
      # - .. ip_list : A random list of IPs from the same subnet/network. Must be
      #                defined within brackets [].
      # - .. ip_range: A string representing a range of IPs from the same subnet/network.
      #                This range is converted to a list before input to terraform.
      #                IPs will be selected starting at the beginning of the range.
      #
      # WARNING: The cidr/list/range must include the number of IPs equal to or
      #          greater than the number of ports defined for that peer address
      #          pool.
      #
      # NOTE: Below variables are not relevant to vCloud Director, but they must have
      #       a value, for example: nework_id = net-id
      # 4. network_id : this input field specifies network id of current pool object
      # 5. subnet_id  : this input field specifies subnet id of current pool object
      #
        
      # Make sure all fields within the selected
      # input objects are set correctly.
        
        
      occne_metallb_list = {
        num_pools = 1
        pool_object = [
          {
            pool_name  = "<pool_name>"
            num_ports  = <no_of_ips_needed_for_this_addrs_pool_object>
            ip_list    = ["<ip_0>","<ip_(num_ports-1)>"]
            ip_range   = "<ip_n> - <ip_(n + num_ports - 1)>"
            cidr       = "<0.0.0.0/29>"
            subnet_id  = "<subnet UUID for the given network>"
            network_id = "<network_id>"
            egress_ip_addr = "<IP address for egress port>"
           }
        ]
      }
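      For reference, the following is a hypothetical filled-in occne_metallb_list for a single "oam" pool that uses only the ip_range input field (the unused cidr and ip_list fields are removed; the IP addresses and IDs shown are placeholder values):
      occne_metallb_list = {
        num_pools = 1
        pool_object = [
          {
            pool_name  = "oam"
            num_ports  = 4
            ip_range   = "10.75.200.10 - 10.75.200.13"
            subnet_id  = "subnet-id"
            network_id = "net-id"
            egress_ip_addr = "10.75.200.14"
          }
        ]
      }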
      Sample cluster.tfvars for CNLB:
      # vCloud Director Information required to create resources.
      # User must have catalog author permissions + Org Network View
      # User and password must use alphanumeric characters, it can be uppercase or lowercase
      # The password cannot contain "/" or "\", neither contains or be contained in "()"
      vcd_user             = "<Need to be completed>"
      vcd_passwd           = "<Need to be completed>"
      org_name             = "<Need to be completed>"
      org_vdc              = "<Need to be completed>"
      vcd_url              = "<Need to be completed>"
      allow_unverified_ssl = true
      cluster_name         = "<customer_specific_short_cluster_name>"
       
      #if affinity rules will be created set polarity to "Affinity"  if anti-affinity rules set polarity to "Anti-Affinity"
      polarity             = "<Anti-Affinity or Affinity>"
      # specify if the affinity/anti-affinity rule will be hard or soft  if hard set variable to true if soft set variable to false.
      hard_rule            = <true or false>
      # Network used for cluster communication, this network must have the feature to create SNAT and DNAT rules
      private_net_name     = "<Need to be completed>"
      # Networks used for external network communication, normally used for Bastion
      ext_net1_name        = "<Need to be completed>"
      ext_net2_name        = "<Need to be completed>"
      # Catalog and template name where the vApp template is stored. This template will be used for all the VM's
      catalog_name         = "<Need to be completed>"
      template_name        = "<Need to be completed>"
       
       
      # number of hosts
      number_of_bastions = 2
      number_of_k8s_ctrls_no_floating_ip = 3
      number_of_k8s_nodes = <number of worker nodes>
       
      # Amount of RAM assigned to VM's, expressed in MB
      memory_bastion = "4096"
      memory_k8s_node = "32768"
      memory_k8s_ctrl = "8192"
      # Amount of CPU assigned to VM's
      cpu_bastion = "2"
      cpu_k8s_node = "8"
      cpu_k8s_ctrl = "2"
       
      # Amount of cores assigned to VM's. OpenTofu has a bug related to templates created with a different
      # core number of the ones assigned here, it is suggested to use the same number as the template
      cores_bastion = 1
      cores_k8s_ctrl = 1
      cores_k8s_node = 1
      # Disk size, expressed in MB. Minimum disk size is 25600.
      disk_bastion = "102400"
      disk_k8s_node = "40960"
      disk_k8s_ctrl = "40960"
       
      # <user-name> used to deploy your cluster, should be same as $OCCNE_USER
      # Uncomment the below line only if $OCCNE_USER is NOT "cloud-user"
      # ssh_user= "<user-name>"
       
      occne_bastion_names = ["1", "2"]
      occne_control_names = ["1", "2", "3"]
      occne_node_names    = ["1", "2", "3", "4"]
  4. If you are installing CNE with CNLB for traffic segregation, then enable and configure CNLB by performing the procedure in the Configuring Cloud Native Load Balancer (CNLB) section.
Deploying CNE Cluster in VMware Environment

This section describes the procedure to deploy the CNE cluster in a VMware environment.

  1. Ensure that the following mandatory components are set up:
    1. occne.ini file, with the credentials for the vSphere account.
    2. secrets.ini file, with the secret credentials for the vSphere account.
    3. occne.sh profile file, with the VCD login credentials, data center details, and the CNE version.
    4. cluster.tfvars, with the range of IP addresses, ports, VCD login information, and other parameters needed for the cluster to run.
    5. repositories (repos), set up by the bootstrap Ansible file or task, to distribute software across the cluster.
  2. Check if the networks defined in the cluster.tfvars file are reflected in the VMware GUI:
    1. From VMware GUI, go to Applications → <new vApp> → Networks.
    2. Check if the networks defined in the cluster.tfvars file (ext_net1_name/ext_net2_name) are present in the vApp.
    3. If the networks are not present, perform the following steps to add the networks:
      1. Click NEW.
      2. Under Type, select Direct.
      3. Select the network.
      4. Click ADD.
  3. Run the following command from the /var/occne/cluster/${OCCNE_CLUSTER}/ directory on the Bootstrap Host. This command may take a while to run (up to 2 to 4 hours, depending on the machines it runs on):
    $ ./deploy.sh

Postinstallation Tasks

This section explains the postinstallation tasks for CNE.

Verifying and Configuring Common Services

Introduction

This section describes the steps to verify and configure CNE Common services hosted on the cluster. There are various UI endpoints that are installed with common services, such as OpenSearch Dashboards, Grafana, Prometheus Server, and Alert Manager. The following sub-sections provide information about launching, verifying, and configuring the UI endpoints.

Prerequisites

  1. Ensure that all the Common services are installed.
  2. Gather the cluster names and version tags that are used during the installation.
  3. All the commands in this section must be run on a Bastion Host.
  4. Ensure you have an HTML5 compliant web browser with network connectivity to CNE.

Common Services Release Information

On successful installation, CNE generates a set of files on the Bastion Host that store the release details of Kubernetes and all the common services. You can refer to these files for release information. These files are updated during upgrades. You can locate the files in the following directory on the Bastion Host:
/var/occne/cluster/${OCCNE_CLUSTER}/artifacts

Kubernetes Release File: K8S_container_images.txt

Common Services Release File: CFG_container_images.txt

Disabling Bastion HTTP Server Service

Stop and disable the bastion_http_server service to avoid DNS conflicts, regardless of whether local DNS is enabled or not.

  1. Run the following command on Bastion host to stop the bastion_http_server service:
    $ sudo systemctl stop bastion_http_server.service
  2. Run the following command on Bastion host to disable the bastion_http_server service:
    $ sudo systemctl disable bastion_http_server.service
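  To confirm the result, you can check the service state (is-active should report "inactive" and is-enabled should report "disabled"):
    $ systemctl is-active bastion_http_server.service
    $ systemctl is-enabled bastion_http_server.service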

Verifying LBVM Enabled Cluster

Note:

The procedures provided in this section are applicable to LBVM based deployments only.

The following procedure provides the verification steps for a non-CNC Console authenticated environment. For a CNC Console authenticated environment, the same verification procedure must be followed except for the step to get the URL for all the common service user interface. This is because the CNC Console provides the direct link to access the common services user interface.

Verify if OpenSearch Dashboards are Running and Accessible

  1. Run the following commands to get the LoadBalancer IP address for the occne-opensearch-dashboards web interface:
    1. To retrieve the LoadBalancer IP address of the occne-opensearch-dashboards service:
      $ export OSD_LOADBALANCER_IP=$(kubectl get services occne-opensearch-dashboards --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
      
    2. To print the complete URL to access occne-opensearch-dashboards in an external browser:
      $ echo http://${OSD_LOADBALANCER_IP}/${OCCNE_CLUSTER}/dashboard
      Sample output:
      http://10.75.34.2/occne-example/dashboard
  2. Launch the browser and navigate to OpenSearch Dashboards at http://$OSD_LOADBALANCER_IP/$OCCNE_CLUSTER/dashboard/app/home#.
  3. From the welcome screen that appears, choose OpenSearch Dashboards to navigate to OpenSearch's home screen.
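  If a browser is not immediately available, you can first check reachability from the Bastion Host with curl (an HTTP 200 or 302 response code typically indicates that the service is reachable):
    $ curl -s -o /dev/null -w "%{http_code}\n" http://${OSD_LOADBALANCER_IP}/${OCCNE_CLUSTER}/dashboard/app/home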

Create an Index Pattern Using OpenSearch Dashboards

  1. On OpenSearch Dashboards GUI, click the Hamburger icon on the top left corner to open the sidebar menu.
  2. Expand the Management section and select Dashboards Management.
  3. Select Index Patterns and click Create index pattern.
  4. Enter "occne-logstash-*" in the Index pattern name field.
  5. Verify that you get a "Your index pattern matches <n> sources" message.
  6. Click Next step.
  7. Select I don't want to use the time filter and click Create index pattern.
  8. Ensure that the web page containing the indexes appears in the main viewer frame.
  9. Click the Hamburger icon on the top left corner to open the sidebar menu and select Discover under OpenSearch Dashboard.
  10. Select your index from the drop-down. Additionally, you can use the Search field next to the drop-down to filter the key arguments.

    The system displays the raw log records.

  11. To create another index pattern, repeat Steps 3 to 8 using another valid pattern name instead of occne-logstash-* and verify that it matches at least one index name.

Verify OpenSearch Dashboards Cluster Health

  1. On the OpenSearch Dashboards' home page, click Interact with the OpenSearch API to navigate to Dev Tools.
  2. Enter the command GET _cluster/health and send the request by clicking the Play icon.
  3. On the right panel, verify that the value of Status is green.
{
  "cluster_name": "occne-opensearch-cluster",
  "status": "green",    # <----- Verify that status is green
  "timed_out": false,
  "number_of_nodes": 9,
  "number_of_data_nodes": 3,
  "discovered_master": true,
  "discovered_cluster_manager": true,
  "active_primary_shards": 13,
  "active_shards": 26,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}

Verify if Prometheus Alert Manager is Accessible

  1. Run the following commands to get the LoadBalancer IP address and port number for Prometheus Alertmanager web interface:
    1. Run the following command to retrieve the LoadBalancer IP address of the Alertmanager service:
      
      $ export ALERTMANAGER_LOADBALANCER_IP=$(kubectl get services occne-kube-prom-stack-kube-alertmanager --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
    2. Run the following command to retrieve the LoadBalancer port number of the Alertmanager service:
      $ export ALERTMANAGER_LOADBALANCER_PORT=$(kubectl get services occne-kube-prom-stack-kube-alertmanager --namespace occne-infra -o jsonpath="{.spec.ports[*].port}")
    3. Run the following command to print the complete URL for accessing Alertmanager in an external browser:
      $ echo http://$ALERTMANAGER_LOADBALANCER_IP:$ALERTMANAGER_LOADBALANCER_PORT/$OCCNE_CLUSTER/alertmanager
      Sample output:
      http://10.75.34.9/occne-example/alertmanager
  2. Launch the Browser and navigate to http://$ALERTMANAGER_LOADBALANCER_IP:$ALERTMANAGER_LOADBALANCER_PORT/$OCCNE_CLUSTER/alertmanager received in the output of the above commands. Ensure that the Alertmanager GUI is accessible.

Verify if Alerts are Configured Properly

  1. Navigate to Alerts tab of the Prometheus server GUI.

    Alternatively, you can access the Prometheus Alerts tab using the http://$PROMETHEUS_LOADBALANCER_IP:$PROMETHEUS_LOADBALANCER_PORT/$OCCNE_CLUSTER/prometheus/alerts URL. For the <PROMETHEUS_LOADBALANCER_IP> and <PROMETHEUS_LOADBALANCER_PORT> values, refer to Step 1 of the Verify metrics are scraped and stored in Prometheus section.
  2. To verify if the Alerts are configured properly, check if the following alerts are displayed in the Alerts tab of the Prometheus GUI.
    BASTION_HOST_ALERTS
    # -------------------------------------------------
    BASTION_HOST_FAILED (0 active)
    ALL_BASTION_HOSTS_FAILED (0 active)
    # -------------------------------------------------
    
    CERT_EXPIRATION_ALERTS
    # -------------------------------------------------
    APISERVER_CERTIFICATE_EXPIRATION_90D (0 active)
    APISERVER_CERTIFICATE_EXPIRATION_60D (0 active)
    APISERVER_CERTIFICATE_EXPIRATION_30D (0 active)
    # -------------------------------------------------
    
    COMMON_SERVICES_STATUS_ALERTS
    # -------------------------------------------------
    PROMETHEUS_NODE_EXPORTER_NOT_RUNNING (0 active) 
    OPENSEARCH_CLUSTER_HEALTH_RED (0 active) 
    OPENSEARCH_CLUSTER_HEALTH_YELLOW (0 active)
    OPENSEARCH_TOO_FEW_DATA_NODES_RUNNING (0 active)
    OPENSEARCH_DOWN (0 active)
    PROMETHEUS_DOWN (0 active)
    ALERT_MANAGER_DOWN (0 active)
    SNMP_NOTIFIER_DOWN (0 active)
    JAEGER_DOWN (0 active)
    METALLB_SPEAKER_DOWN (0 active)
    METALLB_CONTROLLER_DOWN (0 active)
    GRAFANA_DOWN (0 active)
    PROMETHEUS_NO_HA (0 active)
    ALERT_MANAGER_NO_HA (0 active)
    PROMXY_METRICS_AGGREGATOR_DOWN (0 active)
    VCNE_LB_CONTROLLER_FAILED (0 active)
    VSPHERE_CSI_CONTROLLER_FAILED (0 active)
    # -------------------------------------------------
     
    HOST_ALERTS
    # -------------------------------------------------
    DISK_SPACE_LOW (0 active)
    CPU_LOAD_HIGH (0 active)
    LOW_MEMORY (0 active)
    OUT_OF_MEMORY (0 active)
    NTP_SANITY_CHECK_FAILED (0 active)
    NETWORK_INTERFACE_FAILED (2 active)
    PVC_NEARLY_FULL (0 active)
    PVC_FULL (0 active)
    NODE_UNAVAILABLE (0 active)
    ETCD_NODE_DOWN (0 active)
    CEPH_OSD_NEARLY_FULL (0 active)
    CEPH_OSD_FULL (0 active)
    CEPH_OSD_DOWN (0 active)
    # -------------------------------------------------
     
    LOAD_BALANCER_ALERTS
    # -------------------------------------------------
    LOAD_BALANCER_NO_HA (0 active)
    LOAD_BALANCER_NO_SERVICE (0 active)
    LOAD_BALANCER_FAILED (0 active)
    EGRESS_CONTROLLER_NOT_AVAILABLE (0 active)
    OPENSEARCH_DASHBOARD_DOWN (0 active)
    FLUENTD_OPENSEARCH_NOT_AVAILABLE (0 active)
    OPENSEARCH_DATA_PVC_NEARLY_FULL (0 active)
    # -------------------------------------------------
    If no alerts are configured, you can manually configure the alerts by creating the PrometheusRule custom resource:
    $ cd /var/occne/cluster/$OCCNE_CLUSTER/artifacts/alerts
    $ kubectl apply -f occne-alerts.yaml --namespace occne-infra
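    To confirm that the alert rules are loaded, you can list the PrometheusRule resources in the occne-infra namespace (this assumes the kube-prometheus-stack CRDs, which CNE's Prometheus deployment provides):
    $ kubectl get prometheusrules --namespace occne-infra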

Verify if Grafana is Accessible

  1. Run the following commands to get the LoadBalancer IP address for the Grafana web interface:
    1. Retrieve the LoadBalancer IP address of the Grafana service:
      $ export GRAFANA_LOADBALANCER_IP=$(kubectl get services occne-kube-prom-stack-grafana --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
    2. Run the following command to print the complete URL for accessing Grafana in an external browser:
      $ echo http://$GRAFANA_LOADBALANCER_IP/$OCCNE_CLUSTER/grafana
      Sample output:
      http://10.75.34.24/occne-example/grafana
  2. Launch the browser and navigate to the http://$GRAFANA_LOADBALANCER_IP/$OCCNE_CLUSTER/grafana URL retrieved in the output of the previous step. Ensure that the Grafana GUI is accessible. The default username and password are admin/admin for first-time access.
  3. Log in to Grafana.
  4. Click Create your first dashboard and choose + Add visualization.
  5. On the Query tab, use the Data source drop-down to select Promxy as data source. Promxy is the default metrics aggregator for Prometheus time series database.

    Note:

    If Promxy is down, then use Prometheus data source temporarily to obtain metric information.
  6. Select Code and enter the following query in the Enter a PromQL query... textbox:
    round((1 - (sum by(kubernetes_node, instance) (node_cpu_seconds_total{mode="idle"}) / sum by(kubernetes_node, instance)  (node_cpu_seconds_total))) * 100, 0.01)
    
  7. Click Run queries.

    The system displays the CPU usage of all the Kubernetes nodes.

Create a Dashboard to Visualize the CPU Usage of the Kubernetes Nodes

  1. Log in to Grafana.
  2. Click Create your first dashboard and choose + Add visualization.
  3. On the Query tab, select the Promxy datasource option from the Data source drop-down. Promxy is the default metrics aggregator for Prometheus time series database.

    Note:

    If Promxy is down, then Prometheus data source can be used temporarily to obtain metric information.
  4. Click the code button and enter the following query in the "Enter a PromQL query..." textbox:
    round((1 - (sum by(kubernetes_node, instance) (node_cpu_seconds_total{mode="idle"}) / sum by(kubernetes_node, instance) (node_cpu_seconds_total))) * 100, 0.01)
    
  5. Click the Run queries button. This query displays the CPU usage of all the Kubernetes nodes.
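  As a further example (not part of the standard verification), a similar PromQL expression can be used to chart per-node memory usage, assuming the standard node_exporter metric names:
    round((1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100, 0.01)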

Verifying CNLB Enabled Cluster

This section provides information about the verifications to be performed on a CNLB enabled cluster post successful installation. Perform the following steps from the active Bastion of the deployed cluster.

Note:

The procedures provided in this section are applicable to CNLB based deployments only.

Verify the Status of CNLB Components

Run the following commands to confirm if the deployed CNLB application pods (cnlb-apps) and CNLB manager pods (cnlb-manager) are in healthy state:

Command for CNLB application pods:
$ kubectl get all -n <occne-namespace> -l app=cnlb-app
Sample output:
NAME                            READY   STATUS    RESTARTS   AGE
pod/cnlb-app-5467fb4f6d-gw5sx   1/1     Running   0          20h
pod/cnlb-app-5467fb4f6d-q2nlp   1/1     Running   0          20h
pod/cnlb-app-5467fb4f6d-rljhh   1/1     Running   0          20h
pod/cnlb-app-5467fb4f6d-thrz6   1/1     Running   0          20h

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cnlb-app   4/4     4            4           9d

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/cnlb-app-5467fb4f6d   4         4         4       20h
replicaset.apps/cnlb-app-65b4b7b55c   0         0         0       7d2h
replicaset.apps/cnlb-app-675dbfdfb9   0         0         0       9d
replicaset.apps/cnlb-app-776c797d     0         0         0       7d2h
replicaset.apps/cnlb-app-7c49684f79   0         0         0       9d
Command for CNLB manager pods:
$ kubectl get all -n <occne-namespace> -l app=cnlb-manager
Sample output:
NAME                                READY   STATUS    RESTARTS   AGE
pod/cnlb-manager-64b9744876-9cns2   1/1     Running   0          7d2h

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cnlb-manager   1/1     1            1           9d

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/cnlb-manager-6d589fbcbb   1         1         1       4d

Verify the Accessibility of Common Services from CNLB Based Deployments

Note:

This step is applicable to CNLB based deployments only.
Run the following command to verify if all CNE common services are accessible on the respective IP configured under [occne:vars] in the occne.ini file while configuring CNLB:

Note:

  • Use either CURL or browser to attempt access and verify the service URL.
  • Replace http in the following commands with https as applicable.
$ head -n 20 /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini
Sample output:
...
occne_prom_cnlb = 10.5.18.151
occne_alert_cnlb = 10.5.18.230
occne_graf_cnlb = 10.5.18.199
occne_nginx_cnlb = 10.5.18.253
occne_jaeger_cnlb = 10.5.18.53
occne_opensearch_cnlb = 10.5.18.182
 
$ curl 10.5.18.151
<a href="/occne-example/prometheus">Found</a>.
 
$ curl 10.5.18.230
<a href="/occne-example/alertmanager">Found</a>.
 
$ curl 10.5.18.199
<a href="/occne-example/grafana/login">Found</a>.
 
$ curl 10.5.18.253
...
<code>/etc/nginx/nginx.conf</code>.</p>
...
 
$ curl 10.5.18.53
...
// Jaeger version data is embedded by the query-service via search/replace.
...
 
$ curl 10.5.18.182/${OCCNE_CLUSTER}/dashboard/app/home
...
<title>OpenSearch Dashboards</title>
...

Performing Security Hardening

Introduction

After installation, perform an audit of the CNE system security stance before placing the system into service. The audit primarily consists of changing credentials and sequestering SSH keys to the trusted servers. The following table lists all the credentials to be checked, changed, or retained:

Note:

Refer to this section if you are performing bare metal installation.

Table 2-7 Credentials

Credential Name Type Associated Resource Initial Setting Credential Rotation
TOR Switch username/password Cisco Top of Rack Switch username/password from PreFlight Checklist Reset post-install
HP ILO Admin username/password HP Integrated Lights Out Manger username/password from PreFlight Checklist Reset post-install
Oracle ILOM user username/password Oracle Integrated Lights-out Manager username/password from PreFlight Checklist Reset post-install
Server Super User (root) username/password Server Super User Set to well-known Oracle default during server installation Reset post-install

Server Admin User (admusr) username/password Server Admin User Set to well-known Oracle default during server installation Reset post-install
Server Admin User SSH SSH Key Pair Server Admin User Key Pair generated at install time Can rotate keys at any time; key distribution manual procedure

If Factory or Oracle defaults were used for any of these credentials, they must be changed before placing the system into operation. You must then store these credentials in a safe and secure way, off site. It is recommended to plan a regular schedule for updating (rotating) these credentials.

Prerequisites

This procedure is performed after the site has been deployed and prior to placing the site into service.

Limitations and Expectations

The focus of this procedure is to secure the various credentials used or created during the install procedure. There are additional security audits that the CNE operator must perform, such as scanning repositories for vulnerabilities, monitoring the system for anomalies, and regularly checking security logs. These audits are outside the scope of this post-installation procedure.

References

  1. Nexus commands to configure Top of Rack switch username and password: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/6-x/security/configuration/guide/b_Cisco_Nexus_9000_Series_NX-OS_Security_Configuration_Guide/b_Cisco_Nexus_9000_Series_NX-OS_Security_Configuration_Guide_chapter_01001.html
  2. See ToR switch procedure for initial username and password configuration: Configure Top of Rack 93180YC-EX Switches
  3. See procedure to configure initial iLO/OA username and password: Configure Addresses for RMS iLOs

Procedure

  1. Reset credentials on the TOR Switch:

    Note:

    The following commands were tested in a laboratory environment on Cisco switches and may differ for other versions of Cisco IOS/NX-OS and for other brands.
    1. From the Bastion Host, log in to the switch with username and password from the procedure:
      [bastion host]$ ssh <username>@<switch IP address>
      User Access Verification
      Password: <password>
       
      Cisco Nexus Operating System (NX-OS) Software
      TAC support: http://www.cisco.com/tac
      ...
      ...
      switch> enable
      switch#
    2. Change the password for current username:
      switch# configure terminal
      Enter configuration commands, one per line. End with CNTL/Z.
      switch(config)# username <username> password <newpassword>
    3. Create a new username:
      switch(config)# username <new-username> password <new-password> role
                      [network-operator|network-admin|vdc-admin|vdc-operator]
    4. Save the changes and exit from the switch and log in with the new username and password to verify if the new user was created:
      switch(config)# exit
      switch# write memory
      Building configuration
      [OK]
      switch# exit
      Connection to <switch IP address> closed.
      [bastion host]$
       
      [some server]$ ssh <new-username>@<switch IP address>
      User Access Verification
      Password: <new-password>
       
      Cisco Nexus Operating System (NX-OS) Software
      TAC support: http://www.cisco.com/tac
      ...
      ...
      switch#
    5. Delete the previous old username if it is not needed:
      switch# configure terminal
      Enter configuration commands, one per line. End with CNTL/Z.
      switch(config)# no username <username>
      switch(config)#
    6. Set the secret for the exec mode:
      switch(config)# enable secret <new-enable-password>
      switch(config)# exit
      switch#
    7. Save the above configuration:
      switch# copy running-config startup-config
      Building configuration
      [OK]
      switch#
  2. Reset credentials for the HP ILO Admin Console:
    1. From the Bastion Host, log in to the iLO with username and password from the procedure:
      [bastion host]$ ssh <username>@<iLO address>
      <username>@<iLO address>'s password: <password>
      User:<username> logged-in to ...(<iLO address> / <ipv6 address>)
      Integrated Lights-Out 5
      iLO Advanced 3.02 at  Feb 22 2024
      Server Name: <server name>
      Server Power: On
       
      </>hpiLO->
    2. Change the password for the current username:
      </>hpiLO-> set /map1/accounts1/<username> password=<newpassword>
       
      status=0
      status_tag=COMMAND COMPLETED
      Tue Aug 20 13:27:08 2019
       
      </>hpiLO->
    3. Create a new user:
      </>hpiLO-> create /map1/accounts1 username=<newusername> password=<newpassword> group=admin,config,oemHP_rc,oemHP_power,oemHP_vm
      status=0
      status_tag=COMMAND COMPLETED
      Tue Aug 20 13:47:56 2019
       
      User added successfully.
    4. Exit from the iLOM and log in with the new username and password to verify if the new change works:
      </>hpiLO-> exit
       
      status=0
      status_tag=COMMAND COMPLETED
      Thu Jun 19 21:56:31 2025
       
       
      CLI session stopped
      Received disconnect from <iLO address> port 22:11:  Client Disconnect
      Disconnected from <iLO address> port 22
       
      [bastion host]$ ssh <newusername>@<iLO address>
      <newusername>@<iLO address>'s password: <newpassword>
      User:<newusername> logged-in to ...(<iLO address> / <ipv6 address>)
       
      iLO Advanced 2.61 at  Jul 27 2018
      Server Name: <server name>
      Server Power: On
       
      </>hpiLO->
    5. Delete the previous old username if it is not needed:
      </>hpiLO-> delete /map1/accounts1/<username>
       
      status=0
      status_tag=COMMAND COMPLETED
      Tue Aug 20 13:59:04 2019
       
      User deleted successfully.
  3. Reset credentials for the Netra iLOM user console:
    1. From the Bastion Host, log in to the iLOM:
      [bastion host]$ ssh <username>@<iLOM address>
      Password:
       
      Oracle(R) Integrated Lights Out Manager
       
      Version 4.0.4.51 r134837
       
      Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
       
      Warning: password is set to factory default.
       
      Warning: HTTPS certificate is set to factory default.
       
      Hostname: ORACLESP-2114XLB026
       
      ->
    2. Change current password:
      -> set /SP/users/<currentuser> password
      Enter new password: ********
      Enter new password again: ********
    3. Create a new user:
      -> create /SP/users/<username>
      Creating user...
      Enter new password: ****
      create: Non compliant password. Password length must be between 8 and 16 characters.
      Enter new password: ********
      Enter new password again: ********
      Created /SP/users/<username>
    4. Exit from the iLO and log in as the new user (created in step c) with the new username and password to verify if the new change works:
      -> exit
      Connection to <iLOM address> closed.
      [bastion host]$ ssh <newusername>@<iLOM address>
      Password:
       
      Oracle(R) Integrated Lights Out Manager
       
      Version 4.0.4.51 r134837
       
      Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
       
      Warning: password is set to factory default.
       
      Warning: HTTPS certificate is set to factory default.
       
      Hostname: ORACLESP-2114XLB026
       
      ->
    5. Delete the previous user if not needed:
      -> delete /SP/users/<non-needed-user>
      Are you sure you want to delete /SP/users/<non-needed-user> (y/n)? y
      Deleted /SP/users/<non-needed-user>
      ->
  4. Procedure for vCNE and BareMetal:

    Reset credentials for the root account on each server. To reset the credential for the root account, log in to each server in the cluster (ssh root@cluster_host) and run the following command:

    Note:

    Password must be at least 14 characters long, must contain 1 uppercase letter, 1 digit and 1 non-alphanumeric character.
    $ sudo passwd root
    Changing password for user root.
    New password:
    Retype new password:
  5. Regenerate or redistribute SSH key credentials for the user account:
    1. Log in to the Active Bastion Host VM and run the following command to generate a new cluster-wide key-pair in the cluster directory as user:
      > ssh <user>@<bastion-host-IP>
      $ is_active_bastion
      IS active-bastion
      $ ssh-keygen -b 4096 -t rsa -C "New SSH Key" -f /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa -q -N ""
      
    2. Run the following commands to add the public key of the newly generated key pair to the authorized_keys file of every node:
      1. For LBVM deployment, run the following command:
        $ for x in bastion-1 bastion-2 k8s-ctrl-1 k8s-ctrl-2 k8s-ctrl-3 k8s-node-1 k8s-node-2 k8s-node-3 k8s-node-4 oam-lbvm1 oam-lbvm2; do ssh-copy-id -i /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa ${OCCNE_CLUSTER}-"$x"; done
      2. For CNLB deployment, run the following command:
        $ for x in bastion-1 bastion-2 k8s-ctrl-1 k8s-ctrl-2 k8s-ctrl-3 k8s-node-1 k8s-node-2 k8s-node-3 k8s-node-4; do ssh-copy-id -i /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa ${OCCNE_CLUSTER}-"$x"; done
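      To verify that the new key is accepted, you can test a login to one of the nodes using the new key, for example:
        $ ssh -i /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa ${OCCNE_CLUSTER}-k8s-ctrl-1 hostname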
    3. Perform the following steps from the Bastion Host to copy the contents of new_occne_id_rsa.pub to id_rsa.pub and new_occne_id_rsa to id_rsa, and then rename new_occne_id_rsa and new_occne_id_rsa.pub to occne_id_rsa and occne_id_rsa.pub, respectively.
      1. Copy the new public key to the $home/.ssh directory:
        $ cp /var/occne/cluster/<cluster-name>/.ssh/new_occne_id_rsa.pub /home/<user>/.ssh/id_rsa.pub
      2. Copy the new private key to the $home/.ssh directory:
        $ cp /var/occne/cluster/<cluster-name>/.ssh/new_occne_id_rsa /home/<user>/.ssh/id_rsa
      3. Replace old key with the new key in the cluster directory:
        $ mv /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa
        $ mv /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/new_occne_id_rsa.pub /var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa.pub
  6. Modify SSH key for LBVM:
    1. Run the following command to delete the existing lb-controller-ssh-key secret:
      $ kubectl -n occne-infra delete secret lb-controller-ssh-key
    2. Recreate the deleted lb-controller-ssh-key secret using the new private key (occne_id_rsa, renamed in the previous step):
      $ kubectl -n occne-infra create secret generic lb-controller-ssh-key --from-file=/var/occne/cluster/${OCCNE_CLUSTER}/.ssh/occne_id_rsa
    3. Run the following command to restart the occne-lb-controller-server deployment:
      $ kubectl rollout restart deployment occne-lb-controller-server -n occne-infra
      Sample output:
      deployment.apps/occne-lb-controller-server restarted
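      You can then confirm that the secret exists and the restart completed, for example:
      $ kubectl -n occne-infra get secret lb-controller-ssh-key
      $ kubectl rollout status deployment occne-lb-controller-server -n occne-infra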

Activating Optional Features Post Installation

This section provides information about activating optional features, such as Velero, Local DNS, and floating IP post installation.

Dedicated CNLB nodes

Kubernetes supports taints on nodes. This step is needed when CNLB pods are to be run exclusively on specific nodes. Executing this procedure taints those nodes, and only CNLB pods (with the matching toleration) will be allowed to run on them. For more information about dedicating CNLB nodes, see Dedicating CNLB Pods to Specific Worker Nodes.

Activating Velero

Velero is used for performing on-demand backups and restoring CNE cluster data. Velero is an optional feature and has an extra set of hardware and networking requirements. You can activate Velero after installing CNE. For more information about activating Velero, see Activating Velero.

Activating Local DNS

The Local DNS feature is a reconfiguration of core DNS (CoreDNS) to support external hostname resolution. When Local DNS is enabled, CNE routes the connection to external hosts through core DNS rather than the nameservers on the Bastion Hosts. For information about activating this feature, see the "Activating Local DNS" section in Oracle Communications Cloud Native Core, Cloud Native Environment User Guide.

To stop DNS forwarding to Bastion DNS, you must define the DNS details through A records and SRV records. A records and SRV records are added to the CNE cluster using Local DNS API calls. For more information about adding and deleting DNS records, see the "Adding and Removing DNS Records" section in Oracle Communications Cloud Native Core, Cloud Native Environment User Guide.

Enabling or Disabling Floating IP in OpenStack

Floating IPs are additional public IP addresses that are associated with instances such as control nodes, worker nodes, Bastion Host, and LBVMs. Floating IPs can be quickly reassigned and switched from one instance to another using an API interface, thereby ensuring high availability and less maintenance. You can activate the Floating IP feature after installing CNE. For information about enabling or disabling the Floating IP feature, see Enabling or Disabling Floating IP in OpenStack.

Verifying LBVM HTTP Server

This section provides information about verifying the ports in LBVM HTTP server.

CNE runs an HTTP server service (lbvm_http_server.service) on port 8887 of each LBVM. Ensure that you do not deploy any LoadBalancer service using TCP port 8887 on the LBVM, as lbvm_http_server.service listens on this port.
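To check which process is listening on port 8887 on an LBVM, you can inspect the listening sockets, for example:
$ sudo ss -tlnp | grep 8887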

Upgrading Grafana Post Installation

This section provides information about upgrading Grafana to a custom version post installation.

After installing CNE, depending on your requirement, you can upgrade Grafana to a custom version (For example, 11.2.x). To do so, perform the procedure in the Upgrading Grafana section.