A Appendix
This section contains additional topics that are referred to while performing some of the procedures in the document.
A.1 Artifact Acquisition and Hosting
Introduction
The CNE deployment containers require access to several resources that are downloaded from the internet. For cases where the target system is isolated from the internet, you can use the locally available repositories. These repositories require provisioning with the proper files and versions, and some of the cluster configurations need to be updated to allow the installation containers to locate these local repositories.
- Configuring YUM Repository is needed to hold a mirror of several OL8 repositories, as well as the docker-ce version required for the CNE's Kubernetes deployment.
- Configuring HTTP Repository is required to hold Kubernetes binaries and Helm charts.
- Configuring PIP Repository is required to allow CNE packages to be retrieved by the Bastion Hosts.
- Configuring Container Image Registry is required to host the container images used by CNE.
- A copy of the Oracle Linux ISO. See Downloading Oracle Linux for OS installation.
A.1.1 Downloading Oracle Linux
Note:
The 'X' in Oracle Linux X or OL X in this procedure indicates the latest version of Oracle Linux supported by CNE.
Download Oracle Linux X VM Image for OpenStack
Run this procedure to download an OL X VM image or template (QCOW2 format). Use this image to instantiate VMs with OL X as the guest OS.
- Open the link page https://yum.oracle.com/oracle-linux-templates.html
- Under the section Downloads, click on template *.qcow for release X.X to download the image.
- Once the download is complete, verify that the sha256sum of the downloaded image matches the SHA256 checksum provided on the page.
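The following is a minimal verification sketch; the image file name and expected checksum are placeholders for the values shown on the download page:
$ sha256sum <downloaded-image>.qcow
$ echo "<expected-sha256>  <downloaded-image>.qcow" | sha256sum --check
The second form prints OK when the computed digest matches the published value.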
Download Oracle Linux X for BareMetal or VMware
Perform the following procedure to download an OL X ISO. The ISO can then be used to install OL X as the host OS for BareMetal servers.
- Open the link page https://yum.oracle.com/oracle-linux-isos.html
- Under the section Oracle Linux x86_64 ISOs, click Full ISO version of the image for release X.X to download the ISO image.
- Once the download is complete, perform the verification of the downloaded image by following the Verify Oracle Linux Downloads procedure.
A.1.2 Configuring Container Image Registry
Introduction
The container images used to run the Common services are loaded onto a central server registry to avoid exposing CNE instances to the Public Internet. The container images are downloaded to the Bastion Host in each CNE instance during installation. To allow the Bastion Host to retrieve the container images, create a Container Registry in the Central Server, provisioned with the necessary files.
Prerequisites
- Ensure that the CNE delivery package archive contains the CNE container images (delivered as the file named occne_images_${OCCNE_VERSION}.tgz).
- Ensure that a signed 'Central' container registry is running and is able to accept container pushes from the executing system.
- Ensure that Podman (or Docker) container engine is installed on the executing system and the Podman (or Docker) commands are running successfully.
- Ensure that the executing system's container engine can reach the internet docker.io registry and perform pulls without interference from rate limiting. This requires a Docker Hub account, obtained at hub.docker.com, and the container tool must be signed in via the login command before running the container retrieval script.
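For example, a minimal sketch of signing in to Docker Hub with Podman before running the retrieval script (Docker users run docker login instead); the account is whatever Docker Hub account you created:
$ podman login docker.io
Enter the Docker Hub account user name and password when prompted; the login must succeed before attempting the image pulls below.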
Procedure
- Provisioning the registry with the necessary images:
On a system that is connected to the Central Repository registry, run the following steps to populate the Central Repository registry with the required container images.
Set the environment variables to ensure all commands are working with the same registry and CNE version consistently (if targeting baremetal, do not set OCCNE_vCNE):
$ CENTRAL_REPO=<central-repo-name>
$ CENTRAL_REPO_REGISTRY_PORT=<central-repo-registry-port>
$ OCCNE_VERSION=<OCCNE version>
$ OCCNE_CLUSTER=<cluster-name>
$ OCCNE_vCNE=<openstack, oci, vmware, or do not define if Bare-Metal>
$ if [ -x "$(command -v podman)" ]; then
    OCCNE_CONTAINER_ENGINE='podman'
  else
    OCCNE_CONTAINER_ENGINE='docker'
  fi
Example:
$ CENTRAL_REPO=rainbow-reg
$ CENTRAL_REPO_REGISTRY_PORT=5000
$ OCCNE_VERSION=23.4.6
$ OCCNE_CLUSTER=rainbow
$ OCCNE_vCNE=openstack
$ if [ -x "$(command -v podman)" ]; then
    OCCNE_CONTAINER_ENGINE='podman'
  else
    OCCNE_CONTAINER_ENGINE='docker'
  fi
- Once the environment is set up, extract the CNE image .tar file from the delivered archive and load the images into the local container engine:
$ tar -zxvf occne_images_${OCCNE_VERSION}.tgz
$ ${OCCNE_CONTAINER_ENGINE} load -i images_${OCCNE_VERSION}.tar
- Run the following commands to push the CNE images to the Central Repository registry and remove them from temporary local storage:
for IMAGE in $(cat images.txt); do
  ${OCCNE_CONTAINER_ENGINE} image tag ${IMAGE} ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/${IMAGE}
  ${OCCNE_CONTAINER_ENGINE} image push ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/${IMAGE}
  ${OCCNE_CONTAINER_ENGINE} image rm ${IMAGE} ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/${IMAGE}
done
- Run the following commands to retrieve the lists of required container images, binaries, and Helm charts from each CNE container:
$ mkdir -p /var/occne/cluster/${OCCNE_CLUSTER}
$ for CONTAINER in provision k8s_install configure; do
    ${OCCNE_CONTAINER_ENGINE} run --rm --privileged -v /var/occne/cluster/${OCCNE_CLUSTER}:/host -e "${OCCNE_vCNE:+OCCNEARGS=--extra-vars=occne_vcne=${OCCNE_vCNE}}" ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/occne/${CONTAINER}:${OCCNE_VERSION} /getdeps/getdeps
  done
- Add the /var/occne/cluster/${OCCNE_CLUSTER}/artifacts directory to $PATH:
$ if [[ ":$PATH:" != *":/var/occne/cluster/${OCCNE_CLUSTER}/artifacts:"* ]]; then PATH=${PATH}:/var/occne/cluster/${OCCNE_CLUSTER}/artifacts; fi
- Navigate to the /var/occne/cluster/${OCCNE_CLUSTER}/artifacts directory, verify that it contains a retrieve_container_images.sh script and a few *_container_images.txt files, and then run the script for each image list:
$ cd /var/occne/cluster/${OCCNE_CLUSTER}/artifacts
$ for f in *_container_images.txt; do
    retrieve_container_images.sh '' ${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT} < $f
  done
- If errors are reported due to Docker Hub rate limiting, create a Docker Hub account and run 'podman login' (or 'docker login', as appropriate) on this system, then rerun the above steps using that account. Run the following commands to see the list of required container images:
$ cd /var/occne/cluster/${OCCNE_CLUSTER}/artifacts
$ for f in *_container_images.txt; do
    cat $f
  done
- Verify the list of repositories in the container image registry as follows:
Access the endpoint <registryaddress>:<port>/v2/_catalog using a browser, or from any Linux server using the following curl command:
$ curl -k https://${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/v2/_catalog
Example result:$ {"repositories":["23.4.0/kubespray","anchore/anchore-engine","anoxis/registry-cli","aquasec/kube-bench","atmoz/sftp","bats/bats","bd_api","benigno/cne_scan","busybox","cap4c/cap4c-model-executor","cap4c-model-controller-mesh","cap4c-stream-analytics","cdcs/cdcstest","cdcs-auts-test","ceph/ceph","cnc-nfdata-collector","cncc/apigw-common-config-hook","cncc/apigw-configurationinit","cncc/apigw-configurationupdate","cncc/cncc-apigateway","cncc/cncc-cmservice","cncc/cncc-core/validationhook","cncc/cncc-iam","cncc/cncc-iam/healthcheck","cncc/cncc-iam/hook","cncc/debug_tools","cncc/nf_test","cncdb/cndbtier-mysqlndb-client","cncdb/db_backup_executor_svc","cncdb/db_backup_manager_svc","cncdb/db_monitor_svc","cncdb/db_replication_svc","cncdb/docker","cncdb/gitlab/gitlab-runner","cncdb/gradle_image","cncdb/mysql-cluster","cndb2210/cndbtier-mysqlndb-client","cndb2210/db_backup_executor_svc","cndb2210/db_backup_manager_svc","cndb2210/db_monitor_svc","cndb2210/db_replication_svc","cndb2210/mysql-cluster","cndbtier/cicd/sdaas/dind","cndbtier-mysqlndb-client","cndbtier-sftp","cnsbc-ansible-precedence-testing/kubespray","cnsbc-ansible-precedence-testing2/kubespray","cnsbc-occne-8748/kubespray","cnsbc-occne-hugetlb/kubespray","coala/base","curlimages/curl","db_backup_executor_svc","db_backup_manager_svc","db_monitor_svc","db_replication_svc","devansh-kubespray/kubespray","devansh-vsphere-uplift/kubespray","diamcli","docker-remote.dockerhub-iad.oci.oraclecorp.com/jenkins/jenkins","docker-remote.dockerhub-iad.oci.oraclecorp.com/registry","docker.elastic.co/elasticsearch/elasticsearch-oss","docker.elastic.co/kibana/kibana-oss","docker.io/aquasec/kube-bench","docker.io/bats/bats","docker.io/bitnami/kubectl","docker.io/busybox","docker.io/calico/cni","docker.io/calico/kube-controllers","docker.io/calico/node","docker.io/ceph/ceph","docker.io/coredns/coredns","docker.io/curlimages/curl","docker.io/giantswarm/promxy","docker.io/governmentpaas/curl-ssl","docker.io/grafana/grafana","docker.io/istio/pilot","docker.io/istio/proxyv2","docker.io/jaegertracing/all-in-one","docker.io/jaegertracing/example-hotrod","docker.io/jaegertracing/jaeger-agent","docker.io/jaegertracing/jaeger-collector","docker.io/jaegertracing/jaeger-query","docker.io/jenkins/jenkins","docker.io/jenkinsci/blueocean","docker.io/jettech/kube-webhook-certgen","docker.io/jimmidyson/configmap-reload","docker.io/justwatch/elasticsearch_exporter","docker.io/k8s.gcr.io/ingress-nginx/kube-webhook-certgen","docker.io/k8scloudprovider/cinder-csi-plugin","docker.io/k8scloudprovider/openstack-cloud-controller-manager","docker.io/kennethreitz/httpbin","docker.io/lachlanevenson/k8s-helm","docker.io/library/busybox","docker.io/library/nginx","docker.io/library/registry"]
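To spot-check that a specific CNE image was pushed successfully, you can also query the standard registry v2 tags endpoint; a minimal sketch using the occne/provision repository pushed above:
$ curl -k https://${CENTRAL_REPO}:${CENTRAL_REPO_REGISTRY_PORT}/v2/occne/provision/tags/list
The response should include the CNE version that was pushed, for example {"name":"occne/provision","tags":["23.4.6"]}.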
A.1.3 Configuring HTTP Repository
Introduction
To avoid exposing CNE instances to the Public Internet, load the binaries used for Kubespray (Kubernetes installation) and the Helm charts used during Common Services installation onto a Central Server. After loading these binaries and Helm charts, you can download them to the Bastion Host in each CNE instance during installation. To allow the retrieval of binaries and charts by the Bastion Hosts, create an HTTP repository in the Central Server, provisioned with the necessary files.
Prerequisites
- Ensure that an HTTP server is deployed and running on the Central Repository server (a minimal setup sketch is provided after this list).
- Ensure that the steps to configure the container image registry are run to obtain the list of dependencies required for each CNE container.
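If an HTTP server is not already available, the following is a minimal sketch of one way to serve the repository directories with Apache on an Oracle Linux central repository server (assumes the default Apache document root of /var/www/html, so content under /var/www/html/occne becomes reachable as http://<central-repo-name>/occne/...):
$ sudo dnf install -y httpd
$ sudo systemctl enable --now httpd
Adjust firewall and SELinux settings as required by the local environment.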
Procedure
- Retrieve Kubernetes Binaries:
Kubespray requires access to an HTTP server from which it can download the correct versions of a set of binary files. To provision an internal HTTP repository, obtain these files from the internet and place them at a known location on the internal HTTP server.
Run the following commands to view the list of required binaries:
$ cd /var/occne/cluster/${OCCNE_CLUSTER}/artifacts
$ for f in *_binary_urls.txt; do
    cat $f | grep http
  done
- Run the following command to retrieve the required binaries and place them in the binaries directory under the command-line specified directory:
$ for f in *_binary_urls.txt; do
    retrieve_bin.sh /var/www/html/occne/binaries < $f
  done
- Retrieve Helm charts:
The provision container requires access to an HTTP server from which it can download the correct versions of a set of Helm charts for the required services. To provision the Central Repo HTTP repository, obtain these charts from the internet and place them at a known location on the Central HTTP server using the following steps:
- (Optional) Run the following commands to install Helm from the binaries. If Helm 3 is already installed, skip this step:
- Identify the URL from which Helm was downloaded:
$ HELMURL=$(cat /var/occne/cluster/${OCCNE_CLUSTER}/artifacts/PROV_binary_urls.txt | grep -o '\S*helm.sh\S*')
- Determine the archive file name from the URL:
$ HELMZIP=/var/www/html/occne/binaries/${HELMURL##*/}
- Install Helm from the archive into /usr/bin:
$ sudo tar -xvf ${HELMZIP} linux-amd64/helm -C /usr/bin --strip-components 1
- Run the following command to view the list of required Helm charts:
$ for f in *_helm_charts.txt; do
    cat $f
  done
- Run the following commands to retrieve the Helm charts from the internet:
$ for f in *_helm_charts.txt; do
    retrieve_helm.sh /var/www/html/occne/charts < $f
  done
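A minimal sketch for verifying that retrieved artifacts are reachable over HTTP from another system (assumes the HTTP server exposes /var/www/html as its document root; the file names are placeholders for files actually present in the binaries and charts directories):
$ curl -sI http://<central-repo-name>/occne/binaries/<some-binary-file> | head -1
$ curl -sI http://<central-repo-name>/occne/charts/<some-chart>.tgz | head -1
An HTTP/1.1 200 OK status line indicates that the file is being served correctly.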
A.1.4 Configuring PIP Repository
Introduction
To avoid exposing CNE instances to the public Internet, packages used during CNE installation are loaded to a central server. These packages are then downloaded to the Bastion Host in each CNE instance during the installation. In order to allow these packages to be retrieved by the Bastion Hosts, a Preferred Installer Program (PIP) repository must be created in the central server and provisioned with the necessary files.
Note:
- CNE 23.4.0 runs on top of Oracle Linux 9, therefore the default Python version is updated to Python 3.9. The central repository is not required to run OL9. However, to download the appropriate Python packages, run the following procedure using Python 3.9.
- A given central repository can contain both the packages required by OL8 (Python 3.6) and those required by OL9 (Python 3.9) at the same time:
- Python 3.6 packages are stored in /var/www/html/occne/python. This path is used by clusters running CNE 23.3.x and lower, which run on top of OL8.
- Python 3.9 packages are stored in /var/www/html/occne/ol9_python. This path is used by clusters running CNE 23.4.0 and above, which run on top of OL9.
Prerequisites
- Ensure that an HTTP server is deployed and running on the Central Repository server.
- Ensure that the steps to configure the container image registry are run to obtain the list of dependencies required for each CNE container.
- Ensure that Python 3.9 is available at the central repository server.
Procedure
- Ensure that Python 3.9 is available:
Log in to the central repository server and run the following command to validate the Python version:
$ python3 --version
Sample output: Python 3.9.X
If the central repository is running a different Python version (an older version such as 3.6 or a newer version such as 3.11), perform the alternate procedure Using Provision Container to Configure PIP to run the necessary steps in a proper Python environment using the CNE provision container.
- Retrieve PIP Binaries:
Run the following commands to retrieve the required PIP libraries and place them in a directory named ol9_python under the command-line specified directory (see the verification sketch after this procedure):
$ cd /var/occne/cluster/${OCCNE_CLUSTER}/artifacts
$ for f in *_python_libraries.txt; do
    retrieve_python.sh /var/www/html/occne/ol9_python < $f
  done
- Run the following command to view the list of required PIP libraries:
$ for f in *_python_libraries.txt; do
    cat $f
  done
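A minimal sketch for confirming that a package can be resolved from the local PIP mirror once the HTTP server exposes /var/www/html/occne; the /simple index path and the package name are assumptions that depend on how the mirror was generated and how the web server maps the directory:
$ pip3 download --no-deps -d /tmp/pip-mirror-test --index-url http://<central-repo-name>/occne/ol9_python/simple <some-package>
A successful download suggests that the Bastion Hosts will be able to retrieve packages from the same index URL.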
Using Provision Container to Configure PIP
Note:
Set the appropriate OCCNE_VERSION and https_proxy environment variables in the shell that is used to launch the command.
podman run --rm ${https_proxy:+-e https_proxy=${https_proxy}} -v /var/www/html/occne:/var/www/html/occne occne/provision:${OCCNE_VERSION} /bin/sh -c "/getdeps/getdeps && sed -i s'/sudo/#sudo/'g /host/artifacts/retrieve_python.sh && pip3 install -U pip python-pypi-mirror && /host/artifacts/retrieve_python.sh /var/www/html/occne/ol9_python '' < /host/artifacts/PROV_python_libraries.txt"
A.1.5 Configuring YUM Repository
Introduction
The packages used during the OS installation and configuration are loaded onto a central server to avoid exposing CNE instances to the public Internet. These packages are downloaded to the Bastion Host in each CNE instance during the installation. To allow Bastion Hosts to retrieve these packages, you must create a YUM repository in the central server and provision all the necessary files.
You must create a repository file to reference the local YUM repository and place it in the required systems (the systems that run the CNE installation Docker instances).
Note:
The letter 'X' in the Oracle Linux version in this section indicates the latest version of Oracle Linux supported by CNE.
Prerequisites
- Use one of the following approaches to create a local YUM mirror repository for the OL X baseos_latest, addons, developer, developer_EPEL, appstream, and UEKR7 repositories:
- Follow the instructions given in the Managing Software in Oracle Linux document to subscribe to automatic syncing and updates through the Unbreakable Linux Network (ULN).
Note:
Recently (in August 2023), the appstream repository provided by ULN was found to be incomplete and led to installation issues with CNE.
- Use an Oracle CDCS server setup with a YUM mirror.
- Mirror the necessary YUM channels explicitly using the reposync and createrepo Oracle Linux tools. The following example provides a sample bash script (for OL9) and the guidelines to create and sync such a YUM mirror (a reachability check sketch is provided after this list):
- Ensure that yum.oracle.com is reachable.
- Create an alternative 'yum.sync.conf' file to configure the settings other than the machine's defaults. This file can be an altered copy of /etc/yum.conf.
- The bash script can be run regardless of the OS version of the central repository. However, there can be differences in parameters or arguments. This specific version is tested on an OL9 machine.
Sample Bash script:
#!/bin/bash
# script to run reposync to get needed YUM packages for the central repo
set -x
set -e
DIR=/var/www/html/yum/OracleLinux/OL9
umask 027
for i in "ol9_baseos_latest 1" "ol9_addons 1" "ol9_developer 1" "ol9_developer_EPEL 1" "ol9_appstream" "ol9_UEKR7 1"; do
  set -- $i # convert tuple into params $1 $2 etc
  REPO=$1
  NEWESTONLY=$2 # per Oracle Linux Support: appstream does not properly support 'newest-only'
  mkdir -p ${DIR}
  # ignore errors as sometimes packages and index do not fully match, just re-run to ensure everything is gathered
  # use alternate yum.conf file that may point to repodir and settings not used for managing THIS machine's packages
  reposync --config=/etc/yum.sync.conf --repo=${REPO} -p ${DIR} ${NEWESTONLY:+--newest-only} --delete || true
  createrepo ${DIR}/${REPO} || true
done
- Subscribe (in case of ULN) or Download (in case of CDCS or Oracle YUM) the following channels while creating the yum mirror:
- Oracle Linux X baseOS Latest. For example: [ol9_x86_64_baseos_latest]
- Oracle Linux X addons. For example: [ol9_x86_64_addons]
- Packages for test and development. For example: OL9 [ol9_x86_64_developer]
- EPEL packages for OL X. For example: [ol9_x86_64_developer_EPEL]
- Oracle Linux X appstream. For example: [ol9_x86_64_appstream]
- Unbreakable Enterprise Kernel Rel 7 for Oracle Linux X x86_64. For example: [ol9_x86_64_UEKR7]
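A minimal sketch for confirming that a synced channel is served by the central repository's HTTP server (assumes the mirror directory used in the sample script above is exposed under /yum on the web server; the hostname is a placeholder):
$ curl -sI http://<central-repo-name>/yum/OracleLinux/OL9/ol9_baseos_latest/repodata/repomd.xml | head -1
An HTTP/1.1 200 OK response indicates that the repository metadata created by createrepo is reachable.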
Procedure
Configuring the OL X repository mirror repo file for CNE:
Once the YUM repository mirror is set up and functional, create a .repo file to allow the CNE installation logic to reach and pull files from it to create the cluster-local mirrors hosted on the Bastion nodes.
The following is a sample repository file providing the details on a mirror with the necessary repositories. This repository file is placed on the CNE Bootstrap machine, which sets up the CNE Bastion Host. The directions on its location are provided in the installation procedure.
Note:
The repository names and the sample repository file provided are explicitly for OL9.
- ol9_baseos_latest
- ol9_addons
- ol9_developer
- ol9_appstream
- ol9_developer_EPEL
- ol9_UEKR7
Note:
The host used in the .repo file must be resolvable by the target nodes. It must either be registered in the configured name server, or the baseurl fields must be specified by IP address.
[ol9_baseos_latest]
name=Oracle Linux 9 Latest (x86_64)
baseurl=http://winterfell/yum/OracleLinux/OL9/ol9_baseos_latest
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
enabled=1
module_hotfixes=1
proxy=_none_
[ol9_addons]
name=Oracle Linux 9 Addons (x86_64)
baseurl=http://winterfell/yum/OracleLinux/OL9/ol9_addons
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
enabled=1
module_hotfixes=1
proxy=_none_
[ol9_developer]
name=Packages for creating test and development environments for Oracle Linux 9 (x86_64)
baseurl=http://winterfell/yum/OracleLinux/OL9/ol9_developer
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
enabled=1
module_hotfixes=1
proxy=_none_
[ol9_developer_EPEL]
name=EPEL Packages for creating test and development environments for Oracle Linux 9 (x86_64)
baseurl=http://winterfell/yum/OracleLinux/OL9/ol9_developer_EPEL
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
enabled=1
module_hotfixes=1
proxy=_none_
[ol9_appstream]
name=Application packages released for Oracle Linux 9 (x86_64)
baseurl=http://winterfell/yum/OracleLinux/OL9/ol9_appstream
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
enabled=1
module_hotfixes=1
proxy=_none_
[ol9_UEKR7]
name=Unbreakable Enterprise Kernel Release 7 for Oracle Linux 9 (x86_64)
baseurl=http://winterfell/yum/OracleLinux/OL9/ol9_UEKR7
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
enabled=1
module_hotfixes=1
proxy=_none_
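A minimal sketch for checking the .repo file against the mirror before installation, assuming the file has been copied to /etc/yum.repos.d/ on an OL9 test host (winterfell is the sample mirror host used in the file above):
$ sudo dnf --disablerepo='*' --enablerepo=ol9_baseos_latest makecache
$ sudo dnf --disablerepo='*' --enablerepo=ol9_baseos_latest list available | head
Repeat with each repository ID from the file to confirm that every mirrored channel resolves.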
A.2 Installation Reference Procedures
A.2.1 Inventory File Preparation
Introduction
CNE installation automation uses information within a CNE Inventory file to provision servers and virtual machines, install cloud native components, and configure all the components within the cluster so that they constitute a cluster compatible with the CNE platform specifications.
To assist with the creation of the CNE Inventory, a boilerplate CNE Inventory is provided as hosts_sample.ini. If the iLO network is controlled by a lab or customer network that is beyond the ToR switches, another Inventory is provided as hosts_sample_remoteilo.ini.
- 2 Bastion nodes: These nodes provide management access to the cluster and a repository for container images and helm charts used to install Kubernetes applications into the cluster. During installation and upgrade, the installer runs on one of these nodes.
- 3 Kubernetes master or etcd nodes: These nodes manage the Kubernetes cluster and provide run-time storage of its configuration data.
- Kubernetes worker nodes: These nodes run the applications for the services that the cluster provides.
- 3 Master Host machines: Each master host machine hosts one Kubernetes master or etcd virtual machine, and two of them also host one Bastion virtual machine. All of the host machines need access to the cluster network, and the two with Bastions also need network accessibility to the iLO and Management networks.
- Worker machines: Each worker machine hosts the Kubernetes worker node logic locally (not in a virtual machine) for the best performance. All of the worker machines need access to the cluster network.
Inventory File Overview
The inventory file is an Initialization (INI) formatted file named hosts.ini. The elements of an inventory file are hosts, properties, and groups.
- A host is defined by a Fully Qualified Domain Name (FQDN). Properties are defined as key=value pairs.
- A property applies to a specific host when it appears on the same line as the host.
- Square brackets define group names. For example, host_hp_gen_10 defines the group of physical HP Gen10 machines. There is no explicit "end of group" delimiter, rather group definitions end at the next group declaration or the end of the file. Groups cannot be nested.
- A property applies to an entire group when it is defined under a group heading not on the same line as a host.
- Groups of groups are formed using the children keyword. For example, the occne:children creates an occne group comprised of several other groups.
- Inline comments are not allowed.
Table A-1 Base Groups
Group Name | Description |
---|---|
host_hp_gen_10 host_netra_x8_2 host_netra_x9_2 | Contains the list of all physical machines in the CNE cluster. Each host must be listed in the group matching its hardware type. Each entry starts with the fully qualified name of the machine as its inventory hostname. Each host in this group must have several properties defined as follows: The default configuration of a node in this group is for a Gen 10 RMS with modules providing boot interfaces at Linux interface identifiers 'eno5' and 'eno6'. For Gen 10 blades, the boot interfaces are usually 'eno1' and 'eno2' and must be specified by adding the following properties: |
host_kvm_guest | Contains the list of all virtual machines in the CNE cluster. Each host in this group must have several properties defined as follows: |
occne:children | Do not modify the children of the occne group. |
occne:vars | This is a list of variables representing configurable site-specific data. While some variables are optional, define the ones listed in the boilerplate with valid values. If a given site does not have applicable data to fill in for a variable, consult the CNE installation or engineering team. For a description of Individual variable values, see the subsequent sections. |
kube-master | The list of Master Node hosts where Kubernetes master components run. For example, k8s-master-1.rainbow.lab.us.oracle.com |
etcd | The list of hosts that compose the etcd server. It must always be an odd number. This set is the same list of nodes as the kube-master group. |
kube-node | The list of Worker Nodes. Worker Nodes are where Kubernetes pods run, and they must consist of the bladed hosts. For example, k8s-node-1.rainbow.lab.us.oracle.com |
k8s-cluster:children | Do not modify the children of k8s-cluster. |
occne_bastion | The list of Bastion Hosts names. For example, bastion-1.rainbow.lab.us.oracle.com |
Note:
Before initiating the procedure, copy the Inventory file to a system where it can be edited and saved for future use. Eventually, the hosts.ini file needs to be transferred to the CNE bootstrap server.
Procedure
CNE Cluster Name
To provide each CNE host with a unique FQDN, the first step in composing the CNE Inventory is to create a CNE Cluster domain suffix. The CNE Cluster domain suffix starts with a Top-level Domain (TLD). Various government and commercial authorities maintain the structure of a TLD. Additional domain name levels help identify the cluster and are added to help convey additional meaning. CNE suggests adding at least one "adhoc" identifier and at least one "geographic" and "organizational" identifier.
Geographic and organizational identifiers can be multiple levels deep.
- Adhoc Identifier: atlantic
- Organizational Identifier: lab1
- Organizational Identifier: research
- Geographical Identifier (State of North Carolina): nc
- Geographical Identifier (Country of United States): us
- TLD: oracle.com
For example, CNE Cluster name: atlantic.lab1.research.nc.us.oracle.com
Create host_hp_gen_10/host_netra_x8_2 and host_kvm_guest group lists
Using the CNE Cluster domain suffix created as per the above example, fill out the inventory boilerplate with the list of hosts in the host_hp_gen_10 and host_kvm_guest groups. The recommended hostname prefix for Kubernetes nodes is k8s-[host|master|node]-x, where x is a number from 1 to N. The k8s-host-x machines run the k8s-master-x and bastion-x virtual machines.
Edit occne:vars
Edit the values in the occne:vars group to reflect site-specific data. Values in the occne:vars group are defined as follows:
Table A-2 Edit occne:vars
Var Name | Description/Comment |
---|---|
occne_cluster_name | Set to the CNE Cluster Name as shown in the CNE Cluster Name section. |
subnet_ipv4 | Set to the subnet of the network used to assign IPs for CNE hosts. |
subnet_cidr | Set to the cidr notation for the subnet with leading /. For example: /24 |
netmask | Set appropriately for the network used to assign IPs for CNE hosts. |
broadcast_address | Set appropriately for the network used to assign IPs for CNE hosts. |
default_route | Set to the IP of the TOR switch. |
name_server | Set to comma separated list of external nameserver(s) (Optional) |
ntp_server | Set to a comma-separated list of NTP servers to provide time to the cluster. This can be the TOR switch if it is appropriately configured with NTP. If unspecified, then the central_repo_host will be used. |
occne_repo_host_address | Set to the Bootstrap Host internal IPv4 address. |
calico_mtu | The default value for calico_mtu is 1500 for Baremetal. If this value needs to be modified, use an integer value between 100 and 100000. |
central_repo_host | Set to the hostname of the central repository (for YUM, Docker, HTTP resources). |
central_repo_host_address | Set to the IPv4 address of the central_repo_host. |
pxe_install_lights_out_usr | Set to the user name configured for iLO admins on each host in the CNE Frame. |
pxe_install_lights_out_passwd | Set to the password configured for iLO admins on each host in the CNE Frame. |
ilo_vlan_id | Set to the VLAN ID of the iLO network. For example: 2. This variable is required only when the iLO network is local to the ToR switches and ilo_host is needed on Bastion Host servers. Skip this variable if the iLO network is beyond the ToR switches. |
ilo_subnet_ipv4 | Set to the subnet of the iLO network used to assign IPs for bastion hosts. This variable is required only when the iLO network is local to the ToR switches and ilo_host is needed on Bastion Host servers. Skip this variable if the iLO network is beyond the ToR switches. |
ilo_subnet_cidr | Set to the cidr notation for the subnet. For example: 24. This variable is required only when the iLO network is local to the ToR switches and ilo_host is needed on Bastion Host servers. Skip this variable if the iLO network is beyond the ToR switches. |
ilo_netmask | Set appropriately for the network used to assign iLO IPs for bastion hosts. This variable is required only when the iLO network is local to the ToR switches and ilo_host is needed on Bastion Host servers. Skip this variable if the iLO network is beyond the ToR switches. |
ilo_broadcast_address | Set appropriately for the network used to assign iLO IPs for bastion hosts. This variable is required only when the iLO network is local to the ToR switches and ilo_host is needed on Bastion Host servers. Skip this variable if the iLO network is beyond the ToR switches. |
ilo_default_route | Set to the ILO VIP of the TOR switch. This variable is required only when the iLO network is local to the ToR switches and ilo_host is needed on Bastion Host servers. Skip this variable if the iLO network is beyond the ToR switches. |
mgmt_vlan_id | Set to the VLAN ID of the Management network. For example: 4 |
mgmt_subnet_ipv4 | Set to the subnet of the Management network used to assign IPs for bastion hosts. |
mgmt_subnet_cidr | Set to the cidr notation for the Management subnet. For example: 29 |
mgmt_netmask | Set appropriately for the network used to assign Management IPs for bastion hosts. |
mgmt_broadcast_address | Set appropriately for the network used to assign Management IPs for bastion hosts. |
mgmt_default_route | Set to the Management VIP of the TOR switch. |
occne_snmp_notifier_destination | Set to the address of SNMP trap receiver. For example: "127.0.0.1:162" |
cncc_enabled | Set to False for LoadBalancer type service. Set to True for ClusterIP type service. The default value is False. |
local_dns_enabled | Set to False to enable Bastion Host DNS. Set to True to enable the Local DNS feature. |
CNE Inventory Sample hosts.ini File
An example hosts_sample.ini or hosts_sample_remoteilo.ini file can be obtained via MOS. It is delivered in the occne-config-<release_number>.tgz file.
A.2.2 Installation Preflight Checklist
Introduction
This procedure identifies the pre-conditions necessary to begin the installation of a CNE frame. The field-install personnel can use this procedure as a reference to ensure that the frame is correctly assembled and the inventory of required artifacts is available before attempting the installation activities.
Prerequisites
The primary function of this procedure is to identify the prerequisites necessary for the installation to begin.
Confirm if the hardware components are installed in the frame and connected as per the tables below:
Figure A-1 Rackmount ordering

The CNE frame installation must be complete before running any software installation. This section provides a reference to verify if the frame is installed as expected by software installation tools.
This section also contains the point-to-point connections for the switches. The switches in the solution must follow the naming scheme of Switch<series number>, like Switch1, Switch2, and so on, where Switch1 is the first switch in the solution and Switch2 is the second. These two switches form a redundant pair. To find the switch datasheet, see https://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/datasheet-c78-736651.html.
Table A-3 ToR Switch Connections
Switch Port Name/ID (From) | From Switch 1 to Destination | From Switch 2 to Destination | Cable Type | Module Required |
---|---|---|---|---|
1 | RMS 1, FLOM NIC 1 | RMS 1, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
2 | RMS 1, iLO | RMS 2, iLO | CAT 5e or 6A | 1GE Cu SFP |
3 | RMS 2, FLOM NIC 1 | RMS 2, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
4 | RMS 3, FLOM NIC 1 | RMS 3, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
5 | RMS 3, iLO | RMS 4, iLO | CAT 5e or 6A | 1GE Cu SFP |
6 | RMS 4, FLOM NIC 1 | RMS 4, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
7 | RMS 5, FLOM NIC 1 | RMS 5, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
8 | RMS 5, iLO | RMS 6, iLO | CAT 5e or 6A | 1GE Cu SFP |
9 | RMS 6, FLOM NIC 1 | RMS 6, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
10 | RMS 7, FLOM NIC 1 | RMS 7, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
11 | RMS 7, iLO | RMS 8, iLO | CAT 5e or 6A | 1GE Cu SFP |
12 | RMS 8, FLOM NIC 1 | RMS 8, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
13 | RMS 9, FLOM NIC 1 | RMS 9, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
14 | RMS 9, iLO | RMS 10, iLO | CAT 5e or 6A | 1GE Cu SFP |
15 | RMS 10, FLOM NIC 1 | RMS 10, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
16 | RMS 11, FLOM NIC 1 | RMS 11, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
17 | RMS 11, iLO | RMS 12, iLO | CAT 5e or 6A | 1GE Cu SFP |
18 | RMS 12, FLOM NIC 1 | RMS 12, FLOM NIC 2 | Cisco 10GE DAC | Integrated in DAC |
19 - 48 | Unused (add for more RMS when needed) | Unused (add for more RMS when needed) | NA | NA |
49 | Mate Switch, Port 49 | Mate Switch, Port 49 | Cisco 40GE DAC | Integrated in DAC |
50 | Mate Switch, Port 50 | Mate Switch, Port 50 | Cisco 40GE DAC | Integrated in DAC |
51 | OAM Uplink to Customer | OAM Uplink to Customer | 40GE (MM or SM) Fiber | 40GE QSFP |
52 | Signaling Uplink to Customer | Signaling Uplink to Customer | 40GE (MM or SM) Fiber | 40GE QSFP |
53 | Unused | Unused | N/A | N/A |
54 | Unused | Unused | N/A | N/A |
Management (Ethernet) | RMS 1, NIC 2 (1GE) | RMS 1, NIC 3 (1GE) | CAT5e or CAT 6A | None (RJ45 port) |
Management (Serial) | Unused | Unused | None | None |
You can find the Server quick specs as follows: https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00008180enw
- iLO: The integrated Lights Out management interface (iLO) contains an ethernet out of band management interface for the server. This connection is 1GE RJ45.
- 4x1GE LOM: For most servers in the solution, their 4x1GE LOM ports will be unused. The exception is the first server in the first frame. This server will serve as the management server for the ToR switches. In this case, the server will use 2 of the LOM ports to connect to ToR switches' respective out of band ethernet management ports. These connections will be 1GE RJ45 (CAT 5e or CAT 6).
- 2x10GE FLOM: Every server will be equipped with a 2x10GE Flex LOM card (or FLOM). These will be for in-band or application and solution management traffic. These connections are 10GE fiber (or DAC) and will terminate towards the ToR switches' respective SFP+ ports.
All RMS in the frame use only the 10GE FLOM connections, except for the "management server": the first server in the frame has some special connections, listed as follows:
Table A-4 Bootstrap Server Connections
Server Interface | Destination | Cable Type | Module Required | Notes |
---|---|---|---|---|
Base NIC1 (1GE) | Unused | None | None | N/A |
Base NIC2 (1GE) | Switch1A Ethernet Mngt | CAT5e or 6a | None | Switch Initialization |
Base NIC3 (1GE) | Switch1B Ethernet Mngt | CAT5e or 6a | None | Switch Initialization |
Base NIC4 (1GE) | Unused | None | None | N/A |
FLOM NIC1 | Switch1A Port 1 | Cisco 10GE DAC | Integrated in DAC | OAM, Signaling, Cluster |
FLOM NIC2 | Switch1B Port 1 | Cisco 10GE DAC | Integrated in DAC | OAM, Signaling, Cluster |
USB Port1 | USB Flash Drive | None | None | Bootstrap Host Initialization Only (temporary) |
USB Port2 | Keyboard | USB | None | Bootstrap Host Initialization Only (temporary) |
USB Port3 | Mouse | USB | None | Bootstrap Host Initialization Only (temporary) |
Monitor Port | Video Monitor | DB15 | None | Bootstrap Host Initialization Only (temporary) |
Ensure that the artifacts listed in the Artifact Acquisition and Hosting section are available in repositories accessible from the CNE Frame.
The beginning stage of installation requires a local KVM for installing the bootstrap environment.
Procedure
Table A-5 Complete Site Survey Subnet Table
Sl No. | Network Description | Subnet Allocation | Bitmask | VLAN ID | Gateway Address |
---|---|---|---|---|---|
1 | iLO/OA Network | 192.168.20.0 | 24 | 2 | N/A |
2 | Platform Network | 172.16.3.0 | 24 | 3 | 172.16.3.1 |
3 | Switch Configuration Network | 192.168.2.0 | 24 | N/A | N/A |
4 | Management Network - Bastion Hosts | | 28 | 4 | |
5 | Signaling Network - MySQL Replication | | 29 | 5 | |
6 | OAM Pool - metalLB pool for common services | | | N/A | N/A (BGP redistribution) |
7 | Signaling Pool - metalLB pool for 5G NFs | | | N/A | N/A (BGP redistribution) |
8 | Other metalLB pools (Optional) | | | N/A | N/A (BGP redistribution) |
9 | Other metalLB pools (Optional) | | | N/A | N/A (BGP redistribution) |
10 | Other metalLB pools (Optional) | | | N/A | N/A (BGP redistribution) |
11 | ToR Switch A OAM Uplink Subnet | | 30 | N/A | |
12 | ToR Switch B OAM Uplink Subnet | | 30 | N/A | |
13 | ToR Switch A Signaling Uplink Subnet | | 30 | N/A | |
14 | ToR Switch B Signaling Uplink Subnet | | 30 | N/A | |
15 | ToR Switch A/B Crosslink Subnet (OSPF link) | 172.16.100.0 | 30 | 100 | |
Note:
The "iLO VLAN IP Address (VLAN 2)" column is not required if "Device iLO IP Address" is accessed from management IP interface.Table A-6 Complete Site Survey Host IP Table
Sl No. | Component/Resource | Platform VLAN IP Address (VLAN 3) | iLO VLAN IP Address (VLAN 2) | CNE Management IP Address (VLAN 4) | Device iLO IP Address | MAC of Primary NIC |
---|---|---|---|---|---|---|
1 | RMS 1 Host IP | 172.16.3.4 | 192.168.20.11 | | 192.168.20.121 | Eno5: |
2 | RMS 2 Host IP | 172.16.3.5 | 192.168.20.12 | | 192.168.20.122 | Eno5: |
3 | RMS 3 Host IP | 172.16.3.6 | N/A | N/A | 192.168.20.123 | Eno5: |
4 | RMS 4 Host IP | 172.16.3.7 | N/A | N/A | 192.168.20.124 | Eno5: |
5 | RMS 5 Host IP | 172.16.3.8 | N/A | N/A | 192.168.20.125 | Eno5: |
Table A-7 Complete VM IP Table
Sl No. | Component/Resource | Platform VLAN IP Address (VLAN 3) | iLO VLAN IP Address (VLAN 2) | CNE Management IP Address (VLAN 4) | SQL Replication IP Address(VLAN 5) |
---|---|---|---|---|---|
1 | Bastion Host 1 | 172.16.3.100 | 192.168.20.100 | NA | NA |
2 | Bastion Host 2 | 172.16.3.101 | 192.168.20.101 | NA | NA |
Table A-8 Complete Switch IP Table
Sl No. | Procedure Reference Variable Name | Description | IP Address | VLAN ID | Notes |
---|---|---|---|---|---|
1 | ToRswitchA_Platform_IP | Host Platform Network | 172.16.3.2 | 3 | |
2 | ToRswitchB_Platform_IP | Host Platform Network | 172.16.3.3 | 3 | |
3 | ToRswitch_Platform_VIP | Host Platform Network Default Gateway | 172.16.3.1 | 3 | This address is also used as the source NTP address for all servers. |
4 | ToRswitchA_CNEManagementNet_IP | Bastion Host Network | | 4 | Address needs to be without prefix length, such as 10.25.100.2 |
5 | ToRswitchB_CNEManagementNet_IP | Bastion Host Network | | 4 | Address needs to be without prefix length, such as 10.25.100.3 |
6 | ToRswitch_CNEManagementNet_VIP | Bastion Host Network Default Gateway | | 4 | No prefix length, address only for VIP |
7 | CNEManagementNet_Prefix | Bastion Host Network Prefix Length | | 4 | Number only, such as 29 |
8 | ToRswitchA_SQLreplicationNet_IP | SQL Replication Network | | 5 | Address needs to be with prefix length, such as 10.25.200.2 |
9 | ToRswitchB_SQLreplicationNet_IP | SQL Replication Network | | 5 | Address needs to be with prefix length, such as 10.25.200.3 |
10 | ToRswitch_SQLreplicationNet_VIP | SQL Replication Network Default Gateway | | 5 | No prefix length, address only for VIP |
11 | SQLreplicationNet_Prefix | SQL Replication Network Prefix Length | | 5 | Number only, such as 28 |
12 | ToRswitchA_oam_uplink_customer_IP | ToR Switch A OAM uplink route path to customer network | | N/A | No prefix length in address, static to be /30 |
13 | ToRswitchA_oam_uplink_IP | ToR Switch A OAM uplink IP | | N/A | No prefix length in address, static to be /30 |
14 | ToRswitchB_oam_uplink_customer_IP | ToR Switch B OAM uplink route path to customer network | | N/A | No prefix length in address, static to be /30 |
15 | ToRswitchB_oam_uplink_IP | ToR Switch B OAM uplink IP | | N/A | No prefix length in address, static to be /30 |
16 | ToRswitchA_signaling_uplink_customer_IP | ToR Switch A Signaling uplink route path to customer network | | N/A | No prefix length in address, static to be /30 |
17 | ToRswitchA_signaling_uplink_IP | ToR Switch A Signaling uplink IP | | N/A | No prefix length in address, static to be /30 |
18 | ToRswitchB_signaling_uplink_customer_IP | ToR Switch B Signaling uplink route path to customer network | | N/A | No prefix length in address, static to be /30 |
19 | ToRswitchB_signaling_uplink_IP | ToR Switch B Signaling uplink IP | | N/A | No prefix length in address, static to be /30 |
20 | ToRswitchA_mngt_IP | ToR Switch A Out of Band Management IP | 192.168.2.1 | N/A | |
21 | ToRswitchB_mngt_IP | ToR Switch B Out of Band Management IP | 192.168.2.2 | N/A | |
22 | MetalLB_Signal_Subnet_With_Prefix | ToR Switch route provisioning for metalLB | | N/A | From Section 2.1 |
23 | MetalLB_Signal_Subnet_IP_Range | Used for mb_resources.yaml signaling address pool | | | Host address range from the subnet in the row above, excluding the network and broadcast addresses; for example, 1.1.1.1-1.1.1.14 for the 1.1.1.0/28 subnet |
24 | MetalLB_OAM_Subnet_With_Prefix | ToR Switch route provisioning for metalLB | | N/A | From Section 2.1 |
25 | MetalLB_OAM_Subnet_IP_Range | Used for mb_resources.yaml OAM address pool | | | Host address range from the subnet in the row above, excluding the network and broadcast addresses; for example, 1.1.1.1-1.1.1.14 for the 1.1.1.0/28 subnet |
26 | Allow_Access_Server | IP address of external management server to access ToR switches | | | The access-list Restrict_Access_ToR denies all direct external access to the ToR switch VLAN interfaces. If troubleshooting or management requires direct access from outside, allow a specific server to access. If not needed, delete this line from the switch configuration file; if more than one server is needed, add similar lines. |
27 | SNMP_Trap_Receiver_Address | IP address of the SNMP trap receiver | | | |
28 | SNMP_Community_String | SNMP v2c community string | | | For simplicity, use the same community string for snmpget and SNMP traps. |
Table A-9 ToR and Enclosure Switches Variables Table (Switch Specific)
Key/Vairable Name | ToR_SwitchA Value | ToR_SwitchB Value | Notes |
---|---|---|---|
switch_name | NA | NA | Customer defined switch name for each switch. |
admin_password | NA | NA | Password for the admin user. Strong password requirements: the length should be at least 8 characters, and it should contain characters from at least three of the following classes: lower case letters, upper case letters, digits, and special characters. Do not use '?' as a special character because it does not work on the switches. Do not use '/' as a special character due to the procedures. |
user_name | NA | NA | Customer defined user. |
user_password | NA | NA | Password for <user_name>. Strong password requirements: the length should be at least 8 characters, and it should contain characters from at least three of the following classes: lower case letters, upper case letters, digits, and special characters. Do not use '?' as a special character because it does not work on the switches. Do not use '/' as a special character due to the procedures. |
ospf_md5_key | NA | NA | The key has to be the same on all OSPF interfaces on the ToR switches and connected customer switches. |
ospf_area_id | NA | NA | The number as OSPF area id. |
nxos_version | NA | NA | The version nxos.9.2.3.bin is used by default and hard-coded in the configuration template files. If the installed ToR switches use a different version, record the version here. The installation procedures will reference this variable and value to update a configuration template file. |
NTP_server_1 | NA | NA | NA |
NTP_server_2 | NA | NA | NA |
NTP_server_3 | NA | NA | NA |
NTP_server_4 | NA | NA | NA |
NTP_server_5 | NA | NA | NA |
Table A-10 Complete Site Survey Repository Location Table
Repository | Location Override Value |
---|---|
Yum Repository | |
Docker Registry | |
Helm Repository |
Run the Inventory File Preparation Procedure to populate the inventory file.
Since the bootstrap environment is not connected to the network until the ToR switches are configured, you must provide the environment with the required software via USB flash drives to begin the install process.
Use one flash drive to install an OS on the Installer Bootstrap Host. The details on how to set up the USB for OS installation are provided in a different procedure. Ensure that this flash drive has approximately 6 GB of capacity.
Once the OS installation is complete, use another flash drive to transfer the required configuration files to the Installer Bootstrap Host. Ensure that this flash drive has approximately 6 GB of capacity.
Note:
- The instructions listed here, including the mount instructions, are for a Linux host. You can obtain equivalent instructions from the Web if needed.
- When creating these files on a USB from Windows (using Notepad or some other Windows editor), the files can contain control characters that are not recognized when used in a Linux environment. Usually, this includes a ^M at the end of each line. You can remove the control characters using the dos2unix command in Linux: dos2unix <filename>.
. - When copying the files to this USB, make sure that the USB is formatted as FAT32.
- Copy the hosts.ini file from Set up the Host Inventory File (hosts.ini) onto the Utility USB.
- Copy Oracle Linux repository (for example: OL9) file from the customer's OL YUM mirror instance onto the Utility USB. For more details, see the YUM Repository Configuration section.
- Copy the following switch configuration template files from OHC to the Utility USB:
- 93180_switchA.cfg
- 93180_switchB.cfg
- Mount the Utility USB.
Note:
For instructions on mounting a USB in Linux, see Downloading Oracle Linux section. - Change drive (cd) to the mounted USB directory.
- Download the poap.py file to the USB. You can obtain the file using the following command on any Linux server or laptop:
wget https://raw.githubusercontent.com/datacenter/nexus9000/master/nx-os/poap/poap.py
- Rename the poap.py script to poap_nexus_script.py:
mv poap.py poap_nexus_script.py
- The switches' firmware version is handled before the installation procedure, so there is no need to handle it from the poap.py script. Comment out the lines that handle the firmware at lines 1931-1944:
vi poap_nexus_script.py
# copy_system()
# if single_image is False:
#     copy_kickstart()
# signal.signal(signal.SIGTERM, sig_handler_no_exit)
# # install images
# if single_image is False:
#     install_images()
# else:
#     install_images_7_x()
# # Cleanup midway images if any
# cleanup_temp_images()
Create the dhcpd.conf file that is needed to configure the Top of Rack 93180YC-EX switches:
- Edit the file dhcpd.conf. The first subnet, 192.168.2.0, is the subnet for the mgmtBridge on the bootstrap host; the second subnet, 192.168.20.0, is the ilo_subnet_ipv4 in the hosts.ini file. Modify the subnets according to the real values for the cluster.
file. Modify the subnet according to real value for the cluster. - Copy the following contents to
that file and save it on the
USB.
# # DHCP Server Configuration file. # see /usr/share/doc/dhcp-server/dhcpd.conf.example # see dhcpd.conf(5) man page # # Set DNS name and DNS server's IP address or hostname option domain-name "example.com"; option domain-name-servers ns1.example.com; # Declare DHCP Server authoritative; # The default DHCP lease time default-lease-time 10800; # Set the maximum lease time max-lease-time 43200; # Set Network address, subnet mask and gateway subnet 192.168.2.0 netmask 255.255.255.0 { # Range of IP addresses to allocate range dynamic-bootp 192.168.2.101 192.168.2.102; # Provide broadcast address option broadcast-address 192.168.2.255; # Set default gateway option routers 192.168.2.1; } subnet 192.168.20.0 netmask 255.255.255.0 { # Range of IP addresses to allocate range dynamic-bootp 192.168.20.4 192.168.20.254; # Provide broadcast address option broadcast-address 192.168.20.255; # Set default gateway option routers 192.168.20.1; }
- Edit file: md5Poap.sh
- Copy the following contents to that file and save it on the USB:
#!/bin/bash
f=poap_nexus_script.py ; cat $f | sed '/^#md5sum/d' > $f.md5 ; sed -i "s/^#md5sum=.*/#md5sum=\"$(md5sum $f.md5 | sed 's/ .*//')\"/" $f
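A brief usage sketch, assuming md5Poap.sh and poap_nexus_script.py are both in the current directory on the USB; the script recomputes the checksum of the edited POAP script and rewrites its embedded #md5sum line, which the switch validates during POAP:
$ chmod +x md5Poap.sh
$ ./md5Poap.sh
$ grep '^#md5sum' poap_nexus_script.py
The grep output should show the refreshed checksum value.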
A.2.3 Common Installation Configuration
This section details the configurations that are common to both baremetal and virtualized versions of the CNE installation.
Common Services Configuration
Update the file /var/occne/cluster/${OCCNE_CLUSTER}/hosts.ini or occne.ini to define the required ansible variables for the deployment. The following table describes the list of possible /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini variables that can be combined with the deploy.sh/pipeline.sh command to further define the deployment. A starting point for this file is provided as /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini.template in virtual environments and as hosts_sample.ini in baremetal environments.
Set all these variables in the [occne:vars] section of the .ini file.
Prerequisites
Gather log_trace_active_storage, log_trace_inactive_storage, and total_metrics_storage values from the Preinstallation Tasks.
Table A-11 Configuration for CNCC Authenticated Environment
occne.ini Variable | Definition | Default | Required (Y/N) |
---|---|---|---|
cncc_enabled | Can be set to either 'True' or 'False'. Set to 'True' for Cloud Native Control Console (CNCC) authenticated environment. Set to 'False' or do not define for non-CNCC authenticated environment. | False | N |
Configuration for OpenSearch and Prometheus
Update the file /var/occne/cluster/${OCCNE_CLUSTER}/hosts.ini or occne.ini to define the required ansible variables for the deployment. The following table describes the list of possible variables that can be combined with the deploy.sh/pipeline.sh command to further define the deployment. A starting point for this file is provided as /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini.template in virtual environments and as hosts_sample.ini in baremetal environments.
Set all the variables in the [occne:vars] section of the .ini file.
Note:
- The default values in the table do not reflect the actual values that a component in a particular environment requires. The values must be updated as per your environment and requirement. The performance of a component is proportional to the amount of resources allocated and the change in usage metrics. Therefore, ensure that you set the variable values diligently. Providing values that do not fit the requirements can lead to poor performance and unstable environment issues.
- The variables with Y under the Required (Y/N) column are necessary, however you can use the defaults if they meet the deployment requirements.
Table A-12 CNE Variables
occne.ini Variable | Definition | Default | Required (Y/N) |
---|---|---|---|
occne_opensearch_data_size | Used to define log_trace_active_storage value | 10Gi | N |
occne_opensearch_master_size | Used to define log_trace_inactive_storage value | 30Gi | N |
retention_period | Used to define log_trace_retention_period in days | 7 | N |
opensearch_client_cpu_request | Used to define the CPU request value for OpenSearch client | 1000m | N |
opensearch_client_cpu_limit | Used to define the CPU limit value for OpenSearch client | 1000m | N |
opensearch_client_memory_request | Used to define the memory request value for OpenSearch client | 2048Mi | N |
opensearch_client_memory_limit | Used to define the memory limit value for OpenSearch client | 2048Mi | N |
opensearch_data_cpu_request | Used to define the CPU request value for OpenSearch data | 1000m | N |
opensearch_data_cpu_limit | Used to define the CPU limit value for OpenSearch data | 1000m | N |
opensearch_data_memory_request | Used to define the memory request value for OpenSearch data | 16Gi | N |
opensearch_data_memory_limit | Used to define the memory limit value for OpenSearch data | 16Gi | N |
opensearch_master_cpu_request | Used to define the CPU request value for OpenSearch master | 1000m | N |
opensearch_master_cpu_limit | Used to define the CPU limit value for OpenSearch master | 1000m | N |
opensearch_master_memory_request | Used to define the memory request value for OpenSearch master | 2048Mi | N |
opensearch_master_memory_limit | Used to define the memory limit value for OpenSearch master | 2048Mi | N |
occne_prom_kube_state_metrics_cpu_request | Used to define the CPU usage request for kube state metrics. | 20m | N |
occne_prom_kube_state_metrics_cpu_limit | Used to define the CPU usage limit for kube state metrics. | 20m | N |
occne_prom_kube_state_metrics_memory_limit | Used to define the memory usage limit for kube state metrics | 100Mi | N |
occne_prom_kube_state_metrics_memory_request | Used to define the memory usage request for kube state metrics | 32Mi | N |
occne_prom_operator_cpu_request | Used to define the CPU usage request for Prometheus operator. | 100m | N |
occne_prom_operator_cpu_limit | Used to define the CPU usage limit for Prometheus operator | 200m | N |
occne_prom_operator_memory_request | Used to define the memory usage request for Prometheus operator | 100Mi | N |
occne_prom_operator_memory_limit | Used to define the memory usage limit for Prometheus operator | 200Mi | N |
occne_prom_server_size | Used to define total_metrics_storage value | 8Gi | N |
occne_prom_cpu_request | Used to define Prometheus CPU usage request | 2000m | N |
occne_prom_cpu_limit | Used to define Prometheus CPU usage limit | 2000m | N |
occne_prom_memory_request | Used to define Prometheus memory usage request | 4Gi | N |
occne_prom_memory_limit | Used to define Prometheus memory usage limit | 4Gi | N |
occne_metallb_cpu_request | Used to define the CPU usage request for Metallb Controller | 100m | N |
occne_metallb_memory_request | Used to define the memory usage request for Metallb Controller | 100Mi | N |
occne_metallb_cpu_limit | Used to define the CPU usage limit for Metallb Controller | 100m | N |
occne_metallb_memory_limit | Used to define the memory usage limit for Metallb Controller | 100Mi | N |
occne_metallbspeaker_cpu_request | Used to define the CPU usage request for Metallb speaker | 100m | N |
occne_metallbspeaker_cpu_limit | Used to define the CPU usage limit for Metallb speaker | 100m | N |
occne_metallbspeaker_memory_request | Used to define the memory usage request for Metallb speaker | 100Mi | N |
occne_metallbspeaker_memory_limit | Used to define the memory usage limit for Metallb speaker | 100Mi | N |
occne_snmp_notifier_destination | Used to define SNMP trap receiver address | 127.0.0.1:162 | N |
occne_fluentd_opensearch_cpu_request | Used to define the CPU usage request for Fluentd | 500m | N |
occne_fluentd_opensearch_cpu_limit | Used to define the CPU usage limit for Fluentd | 500m | N |
occne_fluentd_opensearch_memory_request | Used to define the memory usage request for Fluentd | 1Gi | N |
occne_fluentd_opensearch_memory_limit | Used to define the memory usage limit for Fluentd | 1Gi | N |
occne_prom_kube_alertmanager_cpu_request | Specifies the amount of CPU guaranteed to the Prometheus AlertManager container in the Kubernetes cluster. It represents the baseline CPU allocation needed to ensure the application runs effectively without interruption. | 20m | N |
occne_prom_kube_alertmanager_cpu_limit | Defines the maximum amount of CPU that the Prometheus AlertManager container is allowed to use, preventing it from consuming excessive resources that could impact other applications in the cluster. | 20m | N |
occne_prom_kube_alertmanager_memory_request | Sets the amount of memory guaranteed to the Prometheus AlertManager container, reserving enough memory for it to operate smoothly under normal conditions. | 64Mi | N |
occne_prom_kube_alertmanager_memory_limit | Determines the maximum amount of memory that the Prometheus AlertManager container can use, which helps maintain the stability of the overall cluster. | 64Mi | N |
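As an illustration only, overriding a few of these resource defaults might look like the following key=value entries. This is a sketch that assumes the variables are defined alongside the other Ansible deployment variables (for example, in occne.ini); confirm the correct file and section for your deployment before applying any override.
# Hypothetical overrides; placement must follow your deployment's variables file conventions.
occne_prom_cpu_limit=3000m
occne_prom_memory_limit=6Gi
occne_prom_kube_alertmanager_memory_limit=128Mi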
Configuration for Service Mesh PKI Integration
Create the /var/occne/cluster/${OCCNE_CLUSTER}/ca-config.ini file to define the required Ansible variables for the deployment. Table A-13 describes the variables defined in the ca-config.ini file. A starting point for this file is provided as /var/occne/cluster/${OCCNE_CLUSTER}/ca-config.ini.template.
Table A-13 CA Config Variables
ca-config.ini Variable | Definition | Default |
---|---|---|
occne_ca_issuer_type | Used to define the CA issuer type. Allowed values are internal and intermediate. If you set occne_ca_issuer_type to internal, the Istio CA is used. If you set it to intermediate, the CA certificate and key provided through occne_ca_certificate and occne_ca_key are used. | internal |
occne_ca_certificate | Used to define the base64 encoded CA certificate value. Required only when occne_ca_issuer_type is intermediate. | "" |
occne_ca_key | Used to define the base64 encoded CA key value. Required only when occne_ca_issuer_type is intermediate. | "" |
occne_ca_client_max_duration | Maximum validity duration that can be requested for a client certificate, in XhYmZs format. If occne_ca_client_max_duration is not set, the certificate validity duration defaults to one year. | "" |
A.2.4 Sizing vCNE VMs
For a virtualized CNE, the VMs can be sized to host each node in the CNE so that the resources used by each node closely match the expected workload. This section provides recommendations on VM sizes for each node type. Note that these are sizing guidelines. Customers do not have to use these exact sizes, although creating smaller VMs than the minimum recommended sizes can result in a CNE that performs poorly.
Bootstrap VM
Table A-14 Bootstrap VM
VM name | vCPUs | RAM | DISK | Comments |
---|---|---|---|---|
Bootstrap host | 2 | 8 GB | 40 GB | Delete the Bootstrap Host VM after the CNE installation is complete. |
Load Balancer VMs
Table A-15 CNE Loadbalancer VMs
VM name | vCPUs | RAM | DISK |
---|---|---|---|
<cluster_name>-<peer_pool_name>-lbvm | 4 | 4 GB | 40 GB |
Kubernetes VMs
Master nodes
Note:
3 master nodes are required.
GCE and AWS have established consistent sizing guidelines for master node VMs. CNE follows these generally accepted guidelines.
Table A-16 Kubernetes Master Node
VM name | vCPUs | RAM | DISK | Comments |
---|---|---|---|---|
K8s Master - large | 4 | 15 GB | 40 GB | For K8s clusters with 1-100 worker nodes. |
Worker nodes
A minimum of 6 worker nodes is required. You can add more worker nodes if you expect a high 5G traffic volume or if you want to install multiple NFs in the cluster.
Both GCE and AWS offer several machine types. Follow the general-purpose VM sizing guideline of approximately 4 GB of RAM per vCPU.
Table A-17 Kubernetes Worker Node
VM Name | vCPUs | RAM | DISK |
---|---|---|---|
K8s worker - medium | 8 | 30 GB | 40 GB |
K8s worker - large | 16 | 60 GB | 40 GB |
K8s worker - extra large | 32 | 120 GB | 40 GB |
Note:
The values listed above are suggested worker node sizes. The actual size is determined only after testing the environment.
Bastion host VMs
The Bastion hosts will have light, occasional workloads with a few persistent processes.
Table A-18 Bastion Host VMs
VM name | vCPUs | RAM | DISK |
---|---|---|---|
Bastion host | 1 | 8 GB | 100 GB |
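If the VMs are created on OpenStack, flavors matching these sizes can be defined up front. The following sketch uses the standard openstack CLI; the flavor names are illustrative only, and RAM is given in MB as the CLI expects.
# Sketch only: flavor names are examples, not CNE requirements.
$ openstack flavor create --vcpus 4 --ram 15360 --disk 40 occne-k8s-master-large
$ openstack flavor create --vcpus 8 --ram 30720 --disk 40 occne-k8s-worker-medium
$ openstack flavor create --vcpus 1 --ram 8192 --disk 100 occne-bastion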
A.2.5 Environmental Variables
The following table describes the environment variables that can be combined with the deploy.sh command to further control the execution of the deployment.
Note:
The variables marked Y under the Required (Y/N) column are necessary, but you can use the defaults if they meet the deployment requirements.
Table A-19 Environmental Variables
Environment Variable | Definition | Default Value | Required (Y/N) |
---|---|---|---|
OCCNE_VERSION | Used to define the version of the container images used during deployment. | Defaults to current release | Y |
OCCNE_TFVARS_DIR | Provides the path to the clusters.tfvars file, relative to the current directory. | ${OCCNE_CLUSTER} | Y |
OCCNE_VALIDATE_TFVARS | Instructs the deployment to validate the clusters.tfvars file before deploying. | 1 | N |
CENTRAL_REPO | Central Repository Hostname | winterfell | Y |
CENTRAL_REPO_IP | Central Repository IPv4 Address | 10.75.216.10 | Y |
CENTRAL_REPO_DOCKER_PORT | Central Repository Docker Port | 5000 | Y |
OCCNE_PIPELINE_ARGS | Additional parameters to the installation process. | N | |
OCCNE_PREFIX | Development time prefix for the OCCNE image names. | N | |
OS_USERNAME | OpenStack user name account for deployment (must be set by the OpenStack RC file). | (Set by .rc file) | Y |
OS_PASSWORD | OpenStack password for the account for deployment (must be set by the OpenStack RC file). | (Set by .rc file) | Y |
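As a hedged example of how these variables are typically combined with the deployment command, the following sketch sources the OpenStack RC file (so that OS_USERNAME and OS_PASSWORD are set), exports the central repository settings, and runs deploy.sh. The RC file name is hypothetical, the values are the defaults listed above, and the exact deploy.sh invocation may differ for your release.
# Sketch only: adjust values and invocation to your environment.
$ source my-openstack-openrc.sh
$ export CENTRAL_REPO=winterfell
$ export CENTRAL_REPO_IP=10.75.216.10
$ export CENTRAL_REPO_DOCKER_PORT=5000
$ export OCCNE_TFVARS_DIR=${OCCNE_CLUSTER}
$ ./deploy.sh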
Copy the /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini.template file to /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini to define the required Ansible variables for the deployment:
$ cp /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini.template /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini
Edit the /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini file:
$ vi /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini
The following table describes the /var/occne/cluster/${OCCNE_CLUSTER}/occne.ini variables that can be combined with the deploy.sh command to further define the execution of the deployment.
Note:
The variables marked Y under the Required (Y/N) column are necessary, but you can use the defaults if they meet the deployment requirements.
Table A-20 occne.ini Variables
occne.ini Variables | Definition | Default setting | Required (Y/N) |
---|---|---|---|
central_repo_host | Central Repository Hostname | NA | Y |
central_repo_host_address | Central Repository IPv4 Address | NA | Y |
central_repo_registry_port | Central Repository Container Registry Port | 5000 | N |
name_server | External DNS nameserver used to resolve out-of-cluster DNS queries (must be a comma separated list of IPv4 addresses). Always set name_server to the OpenStack environment nameserver(s); optionally, you can add external nameservers to this list. The OpenStack environment nameserver(s) can be obtained by running the following command on the Bootstrap host: cat /etc/resolv.conf | grep nameserver | awk '{ print $2 }' | paste -d, -s | NA | Y |
local_dns_enabled | Set to False to enable Bastion Host DNS. Set to True to enable the Local DNS feature. | False | N |
kube_network_node_prefix | If you want to modify the default value, change the variable name to kube_network_node_prefix_value and use a numeric value, not a string. | 25 | N |
occne_cluster_name | Name of cluster to deploy. | NA | Y |
external_openstack_auth_url | OpenStack authorization URL (must be set by the OpenStack RC file in the environment as OS_AUTH_URL). | NA | Y |
external_openstack_region | OpenStack region name (must be set by the OpenStack RC file in the environment as OS_REGION_NAME). | NA | Y |
external_openstack_tenant_id | OpenStack project ID (must be set by the OpenStack RC file in the environment as OS_PROJECT_ID). | NA | Y |
external_openstack_domain_name | OpenStack domain name (must be set by the OpenStack RC file in the environment as OS_USER_DOMAIN_NAME). | NA | Y |
external_openstack_tenant_domain_id | OpenStack tenant domain id (must be set by the OpenStack RC file in the environment as OS_PROJECT_DOMAIN_ID). | NA | Y |
cinder_auth_url | Cinder authorization URL (must be set by the OpenStack RC file in the environment as OS_AUTH_URL). | NA | Y |
cinder_region | Cinder region name (must be set by the OpenStack RC file in the environment as OS_REGION_NAME). | NA | Y |
cinder_tenant_id | Cinder project ID (must be set by the OpenStack RC file in the environment as OS_PROJECT_ID). | NA | Y |
cinder_tenant_domain_id | Cinder tenant domain ID (must be set by the OpenStack RC file in the environment as OS_PROJECT_DOMAIN_ID) | NA | Y |
cinder_domain_name | Cinder domain name (must be set by the OpenStack RC file in the environment as OS_USER_DOMAIN_NAME) | NA | Y |
openstack_cinder_availability_zone | OpenStack Cinder storage volume availability zone (must be set on bootstrap host if OpenStack Cinder availability zone needs to be different from default zone 'nova') | nova | N |
external_openstack_username | OpenStack username (must be set by the OpenStack RC file in the environment as OS_USERNAME). | NA | Y |
external_openstack_password | OpenStack password (must be set by the OpenStack RC file in the environment as OS_PASSWORD). | NA | Y |
cinder_username | Cinder username (must be set by the OpenStack RC file in the environment as OS_USERNAME). | NA | Y |
cinder_password | Cinder password (must be set by the OpenStack RC file in the environment as OS_PASSWORD). | NA | Y |
openstack_cacert | Path to the OpenStack certificate in the installation container. Do not define if no certificate is needed. Define as /host/openstack-cacert.pem if you need the certificate. | NA | N |
flannel_interface | Interface to use for flannel networking if using Fixed IP deployment. Do not define if not using Fixed IP. Define as eth0 if using Fixed IP. | NA | N |
calico_mtu | The default value for calico_mtu is 8980 from Kubernetes. If this value needs to be modified, set it to an integer between 100 and 100000. | 8980 | Y |
openstack_parallel_max_limit | Specifies the maximum number of parallel requests that can be handled by the OpenStack controller. | 0 | N |
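For illustration, a fragment of a populated occne.ini might look like the following sketch. The values are placeholders, and the full set of required variables from Table A-20 must still be provided as described in the table.
# Sketch only: placeholder values, not a complete occne.ini.
occne_cluster_name=my-cluster-name
central_repo_host=winterfell
central_repo_host_address=10.75.216.10
central_repo_registry_port=5000
name_server=10.75.216.1,10.75.216.2
calico_mtu=8980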
A.3 Performing Manual Switchover of LBVMs During Upgrade
Note:
If there is more than one LBVM pair (for example, oam) on the system, it is highly recommended to perform the switchover on one Peer Address Pool (PAP) at a time. After each switchover, verify that all services associated with that pool are functioning correctly before moving to the next pair.
- For VMware or OpenStack: Initiate a switchover manually from the OpenStack or VMware desktop by selecting the ACTIVE LBVMs (the LBVMs that have the service IPs attached) and performing a hard reboot on each LBVM.
- For OpenStack: Use the lbvmSwitchOver.py script included in the /var/occne/cluster/${OCCNE_CLUSTER}/artifacts/upgrade directory.
Note:
This option is applicable to OpenStack deployments only. VMware deployments must use the option described in Step 1.
The lbvmSwitchOver.py script allows the user to perform the switchover on a single PAP (such as "oam") or on all PAPs, in which case it reboots each LBVM pair. The script issues a warning when the command is run, which can be overridden using the --force option. Invalid PAPs return an error response.
Use the current Bastion Host to run the switchover script. Run the following command for help and information about running the script:
$ ./artifacts/upgrade/lbvmSwitchOver.py --help
Sample output:
Command called to perform a LBVM switchover for Openstack only.
  --all  : Required parameter: run on all peer address pools.
  --pap  : Required parameter: run on a single peer address pool.
  --force: Optional parameter: run without prompt.
  --help : Print usage
    lbvmSwitchOver --all [optional: --force]
    lbvmSwitchOver --pap <peer address pool name> [optional: --force]
  Examples:
    ./artifacts/upgrade/lbvmSwitchOver.py --pap oam
    ./artifacts/upgrade/lbvmSwitchOver.py --pap oam --force
    ./artifacts/upgrade/lbvmSwitchOver.py --all
    ./artifacts/upgrade/lbvmSwitchOver.py --all --force
The following sample command uses the --force option to ignore the warnings that are encountered while running the script:
$ /var/occne/cluster/${OCCNE_CLUSTER}/artifacts/upgrade/lbvmSwitchOver.py --pap oam --force
Sample output:
Performing LBVM switchover on LBVM pairs: [{'id': 0, 'poolName': 'oam', 'name': 'my-cluster-name-oam-lbvm-1', 'ipaddr': '192.168.0.1', 'role': 'ACTIVE', 'status': 'UP'}, {'id': 1, 'poolName': 'oam', 'name': 'my-cluster-name-oam-lbvm-2', 'ipaddr': '192.168.0.2', 'role': 'STANDBY', 'status': 'UP'}].
 - Validating LBVM states and communication...
 - Calling monitor for LBVM: my-cluster-name-oam-lbvm-1
 - Calling monitor for LBVM: my-cluster-name-oam-lbvm-2
 - Requesting ACTIVE LBVM reboot to force switchover...
 - Sending reboot request to ACTIVE LBVM: my-cluster-name-oam-lbvm-1
 - Waiting for LBVM communication to be re-established to LBVM(s)...
 - Calling monitor for LBVM: my-cluster-name-oam-lbvm-1
 - Calling monitor for LBVM: my-cluster-name-oam-lbvm-2
LBVM switchover successful on LBVMs. Service ports require additional time to switchover. Please wait for all ports to switchover and verify service operation.
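After a switchover, the LBVM VMs can also be spot-checked from the OpenStack side. The following sketch, which assumes the openstack CLI and RC file are configured on the host where it runs, lists the oam LBVMs so you can confirm that both VMs are running before moving to the next PAP.
# Sketch only: confirms VM status, not the ACTIVE/STANDBY role of each LBVM.
$ openstack server list --name "${OCCNE_CLUSTER}-oam-lbvm" -c Name -c Status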
A.4 rook_toolbox
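The following Deployment manifest creates the rook-ceph-tools (Ceph toolbox) pod in the rook-ceph namespace. Save it to a file and apply it with kubectl, as shown in the usage sketch after the manifest.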
apiVersion: apps/v1
kind: Deployment
metadata:
name: rook-ceph-tools
namespace: rook-ceph # namespace:cluster
labels:
app: rook-ceph-tools
spec:
replicas: 1
selector:
matchLabels:
app: rook-ceph-tools
template:
metadata:
labels:
app: rook-ceph-tools
spec:
dnsPolicy: ClusterFirstWithHostNet
containers:
- name: rook-ceph-tools
image: occne-repo-host:5000/docker.io/rook/ceph:v1.10.2
command: ["/bin/bash"]
args: ["-m", "-c", "/usr/local/bin/toolbox.sh"]
imagePullPolicy: IfNotPresent
tty: true
securityContext:
runAsNonRoot: true
runAsUser: 2016
runAsGroup: 2016
env:
- name: ROOK_CEPH_USERNAME
valueFrom:
secretKeyRef:
name: rook-ceph-mon
key: ceph-username
- name: ROOK_CEPH_SECRET
valueFrom:
secretKeyRef:
name: rook-ceph-mon
key: ceph-secret
volumeMounts:
- mountPath: /etc/ceph
name: ceph-config
- name: mon-endpoint-volume
mountPath: /etc/rook
volumes:
- name: mon-endpoint-volume
configMap:
name: rook-ceph-mon-endpoints
items:
- key: data
path: mon-endpoints
- name: ceph-config
emptyDir: {}
tolerations:
- key: "node.kubernetes.io/unreachable"
operator: "Exists"
effect: "NoExecute"
tolerationSeconds: 5
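A minimal usage sketch, assuming the manifest above is saved to a file named rook-toolbox.yaml (the file name is illustrative):
# Sketch only: the file name rook-toolbox.yaml is an example.
$ kubectl apply -f rook-toolbox.yaml
$ kubectl -n rook-ceph rollout status deploy/rook-ceph-tools
# Open a one-off session in the toolbox pod and check cluster health.
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status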