Oracle® Communications OC-CNE Installation Guide
Release 1.0
F16979-01

Oracle Linux OS Installer

This procedure details the steps required to configure the bare metal host servers using the OCCNE os_install image, which runs within an os-install docker container and provides a PXE-based installer pre-configured to aid in provisioning the required hosts. This procedure requires an inventory file (hosts.ini) that provides the installer with all the necessary information about the cluster.

These procedures provide the steps required to install the OL7 image onto all hosts via the Bastion Host using an occne/os_install container. Once completed, all hosts include the rpm updates and tools necessary to run the k8-install procedure.

Prerequisites:
  1. All procedures in OCCNE Installation of the Bastion Host are complete.
  2. The Utility USB is available containing the necessary files as per: OCCNE 1.0 Installation PreFlight checklist : Miscellaneous Files.

Limitations and Expectations

All steps are executed from a laptop connected via an SSH application (such as PuTTY) to the Management Interface.

References

https://docs.ansible.com/ansible/latest/user_guide/intro_patterns.html

Table 3-11 Procedure to run the auto OS-installer container

Step # Procedure Description
1.

Initial Configuration on the Bastion Host to Support the OS Install

This procedure provides the steps for creating directories and copying all supporting files to the appropriate directories on the Bastion Host so that the OS Install Container successfully installs OL7 onto each host.

Note: The cluster_name field is derived from the hosts.ini file field: occne_cluster_name.

  1. Log into the Bastion Host using the IP supplied from: OCCNE 1.0 Installation PreFlight Checklist : Complete VM IP Table
  2. Create the directories needed on the Bastion Host.
    $ mkdir /var/occne/<cluster_name>/yum.repos.d
  3. Update the repository fields in the hosts.ini file to reflect the changes from procedure: OCCNE Configuration of the Bastion Host. The fields listed below must reflect the new Bastion Host IP (172.16.3.100) and the names of the repositories.

    $ vim /var/occne/<cluster_name>/hosts.ini
     
    Update the following fields with the new values from the configuration of the Bastion Host.
    ntp_server
    occne_private_registry
    occne_private_registry_address
    occne_private_registry_port
    occne_k8s_binary_repo
    occne_helm_stable_repo_url
    occne_helm_images_repo
    docker_rh_repo_base_url
    docker_rh_repo_gpgkey
    
    Comment out the following lines:
    #http_proxy=<proxy_url>
    #https_proxy=<proxy_url>
     
    Example:
    ntp_server='172.16.3.1'
    occne_private_registry=registry
    occne_private_registry_address='10.75.207.133'
    occne_private_registry_port=5000
    occne_k8s_binary_repo='http://10.75.207.133/binaries/'
    occne_helm_stable_repo_url='http://10.75.207.133/helm/'
    occne_helm_images_repo='10.75.207.133:5000/'
    docker_rh_repo_base_url=http://10.75.207.133/yum/centos/7/updates/x86_64/
    docker_rh_repo_gpgkey=http://10.75.207.133/yum/centos/RPM-GPG-CENTOS
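    After editing, a quick sanity check of the updated fields can be done with grep. This is a minimal sketch only; the field names are those listed above and the path assumes the hosts.ini location used in this step:
     
    $ grep -E '^(ntp_server|occne_private_registry|occne_k8s_binary_repo|occne_helm|docker_rh_repo|http_proxy|https_proxy)' /var/occne/<cluster_name>/hosts.ini
     
    Each listed field should show the new Bastion Host values. The proxy lines should not appear in the output because they are commented out.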
2.

Retrieve the updated version of the Kickstart Configuration file

The initial install of the OS requires an updated version of the kickstart configuration file. This file is maintained on the Oracle OHC software download site and must be copied to the appropriate folder prior to executing the initial OS install steps below.

  1. Mount the Utility USB on RMS2.

    Note: Instructions for mounting a USB in Linux are at: OCCNE Installation of Oracle Linux 7.5 on Bootstrap Host : Install Additional Packages. Only follow steps 1-4 to mount the USB. A minimal mount sketch is also provided after this list.

  2. Copy the kickstart file from the Utility USB to the /tmp on RMS2.
    $ cp /media/usb/occne-ks.cfg.j2.new /tmp/occne-ks.cfg.j2.new
  3. On the Bastion Host, copy the kickstart configuration file from RMS2 to the Bastion Host.
    $ scp root@172.16.3.5:/tmp/occne-ks.cfg.j2.new /var/occne/<cluster_name>/occne-ks.cfg.j2.new
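    If the referenced mount instructions are not at hand, a minimal mount sketch for sub-step 1 follows. The device name /dev/sdb1 is an assumption; confirm the actual USB partition with lsblk before mounting:
     
    $ lsblk                          # identify the USB partition (assumed to be /dev/sdb1 here)
    $ mkdir -p /media/usb
    $ mount /dev/sdb1 /media/usb     # mount point matches the /media/usb path used in sub-step 2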
3.

Copy the OL7 ISO to the Bastion Host

The ISO file is normally accessible from a customer site-specific repository; it is reachable because the ToR switch configurations were completed in procedure: OCCNE Configure Top of Rack 93180YC-EX Switches. For this procedure, the file has already been copied to the /var/occne directory on RMS2 and can be copied to the same directory on the Bastion Host.

Copy the OL7 ISO file from RMS2 to the /var/occne directory. The example below uses OracleLinux-7.5-x86_64-disc1.iso.

Note: If the user copies this ISO from their laptop then they must use an application like WinSCP pointing to the Management Interface IP.

$ scp root@172.16.3.5:/var/occne/OracleLinux-7.5-x86_64-disc1.iso /var/occne/OracleLinux-7.5-x86_64-disc1.iso
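Optionally, the transfer can be verified by comparing checksums on RMS2 and the Bastion Host. This is a suggested check only, not part of the formal procedure; the two digests must match:

$ ssh root@172.16.3.5 sha256sum /var/occne/OracleLinux-7.5-x86_64-disc1.iso
$ sha256sum /var/occne/OracleLinux-7.5-x86_64-disc1.iso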
4.

Set up the Boot Loader on the Bastion Host

Execute the following commands:

Note: The ISO can be unmounted after the files have been copied, if desired, using the command: umount /mnt.

$ mkdir -p /var/occne/pxelinux
$ mount -t iso9660 -o loop /var/occne/OracleLinux-7.5-x86_64-disc1.iso /mnt
$ cp /mnt/isolinux/initrd.img /var/occne/pxelinux
$ cp /mnt/isolinux/vmlinuz /var/occne/pxelinux
5.

Verify and Set the PXE Configuration File Permissions on the Bastion Host

Each file copied in the step above must have read and write permissions.
$ chmod 777 /var/occne/pxelinux
$ chmod 777 /var/occne/pxelinux/vmlinuz
$ chmod 777 /var/occne/pxelinux/initrd.img
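A quick listing confirms the permissions took effect; this check is a suggestion only:

$ ls -l /var/occne/pxelinux

Both vmlinuz and initrd.img should show rwxrwxrwx permissions.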
6.

Copy and Update .repo files
  1. The customer-specific Oracle Linux .repo file on the bastion_host must be copied to the /var/occne/<cluster_name>/yum.repos.d directory and updated to reflect the URL of the bastion host. This file is transferred to the /etc/yum.repos.d directory on each host by ansible after the host has been installed but before the actual yum update is performed.
    $ cp /etc/yum.repos.d/<customer_OL7_specifc.repo> /var/occne/<cluster_name>/yum.repos.d/. 
  2. Edit each .repo file in the /var/occne/<cluster_name>/yum.repos.d directory and update the baseurl IP of the repo to reflect the IP of the bastion_host.
    $ vim /var/occne/<cluster_name>/yum.repos.d/<repo_name>.repo
     
    Example:
     
    [local_ol7_x86_64_UEKR5]
    name=Unbreakable Enterprise Kernel Release 5 for Oracle Linux 7 (x86_64)
    baseurl=http://10.75.155.195/yum/OracleLinux/OL7/UEKR5/
    gpgcheck=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY
    enabled=1
    proxy=_none_
     
    Change the baseurl IP address (10.75.155.195 in the example) to the Bastion Host IP: 172.16.3.100.
     
    The rest of the URL may also need to change based on the configuration of the customer repositories; this is site-specific and cannot be prescribed in this procedure.
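    Where several .repo files need the same change, the baseurl IP can be rewritten in one pass with sed. This is a minimal sketch assuming the original IP is 10.75.155.195 as in the example above; verify each file afterwards because customer repository URLs vary by site:
     
    $ cd /var/occne/<cluster_name>/yum.repos.d
    $ sed -i 's|http://10.75.155.195/|http://172.16.3.100/|g' *.repo
    $ grep baseurl *.repo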
7.

Execute the OS Install on the Hosts from the Bastion Host

This procedure requires executing docker run for four different Ansible tags.

Note: <image_name>:<image_tag> represents the image in the docker image registry accessible from the Bastion Host.

Note: The initial OS install is performed from the OS install container running bash because the new kickstart configuration file must be copied over the existing configuration prior to executing the Ansible playbook.

Note: The <limit_filter> value used in the commands below is based on the settings in the hosts.ini file (see reference 1 above). The example used in this procedure is host_hp_gen_10[0:7]. The example hosts.ini file includes a group named host_hp_gen_10, which can be thought of as an array of hosts; the 0:7 range selects the hosts (indexes into that array) to include when executing the ansible commands listed below. In this example, the selected hosts are all the k8s nodes (RMS3-5 and blades 1-4) and db-1 (RMS1), excluding db-2 (RMS2). The settings at a customer site are specific to that site's hosts.ini file; those used in this procedure are presented as an example only, as illustrated below:

Example section from a hosts.ini file:

[host_hp_gen_10]
k8s-1.rainbow.lab.us.oracle.com ansible_host=172.16.3.6 ilo=192.168.20.123 mac=48-df-37-7a-41-60
k8s-2.rainbow.lab.us.oracle.com ansible_host=172.16.3.7 ilo=192.168.20.124 mac=48-df-37-7a-2f-60
k8s-3.rainbow.lab.us.oracle.com ansible_host=172.16.3.8 ilo=192.168.20.125 mac=48-df-37-7a-2f-70
k8s-4.rainbow.lab.us.oracle.com ansible_host=172.16.3.11 ilo=192.168.20.141 mac=d0-67-26-b1-8c-50
k8s-5.rainbow.lab.us.oracle.com ansible_host=172.16.3.12 ilo=192.168.20.142 mac=d0-67-26-ac-4a-30
k8s-6.rainbow.lab.us.oracle.com ansible_host=172.16.3.13 ilo=192.168.20.143 mac=d0-67-26-c8-88-30
k8s-7.rainbow.lab.us.oracle.com ansible_host=172.16.3.14 ilo=192.168.20.144 mac=20-67-7c-08-94-40
db-1.rainbow.lab.us.oracle.com ansible_host=172.16.3.4 ilo=192.168.20.121 mac=48-df-37-7a-41-50
db-2.rainbow.lab.us.oracle.com ansible_host=172.16.3.5 ilo=192.168.20.122 mac=48-df-37-7a-40-40

In the above example, host_hp_gen_10[0:7] selects all of the above hosts except db-2 (RMS2), which is index 8 (indexes start at 0). So using a limit filter of [0:7] installs all hosts except RMS2.
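Before running the playbook, the hosts matched by a given limit pattern can be previewed with ansible's --list-hosts option. This is a minimal sketch, run from inside the os_install container started in sub-step 1 below, where the inventory is mounted at /host/hosts.ini; it only prints the matched host names and executes nothing on the hosts:

$ ansible -i /host/hosts.ini 'host_hp_gen_10[0:7]' --list-hosts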
  1. Run the docker command below to create a container running bash. This command must include the -it option and the bash executable at the end of the command. After execution of this command the user prompt will be running within the container.
    $ docker run -it --rm --network host --cap-add=NET_ADMIN -v /var/occne/<cluster_name>/:/host -v /var/occne/:/var/occne:rw <image_name>:<image_tag> bash
     
    Example:
     
    $ docker run -it --rm --network host --cap-add=NET_ADMIN -v /var/occne/rainbow.lab.us.oracle.com/:/host -v /var/occne/:/var/occne:rw 10.75.200.217:5000/os_install:1.0.1 bash
  2. From the container, copy the occne-ks.cfg.j2.new file (the /var/occne/<cluster_name> directory is mounted as /host in the container) over the existing /install/roles/pxe_config/templates/ks/occne-ks.cfg.j2 file.
    $ cp /host/occne-ks.cfg.j2.new /install/roles/pxe_config/templates/ks/occne-ks.cfg.j2
  3. Install the OS onto each host using the ansible command indicated below. This command installs the OS while the subsequent commands (steps) set up the environment for yum repository support, datastore, and security. This command can take up to 30 minutes to complete.
    $ ansible-playbook -i /host/hosts.ini --become --become-user=root --private-key /host/.ssh/occne_id_rsa /install/os-install.yaml --limit <limit_filter>,localhost --skip-tags "ol7_hardening,datastore,yum_update"
     
    Example:
     
    $ ansible-playbook -i /host/hosts.ini --become --become-user=root --private-key /host/.ssh/occne_id_rsa /install/os-install.yaml --limit host_hp_gen_10[0:7],localhost --skip-tags "ol7_hardening,datastore,yum_update"
    
    Note: This ansible task times out in 35 minutes. If a timeout condition occurs on any of the given hosts (usually just the blades, indicated as k8s-4 through k8s-7), it can be caused by the Linux boot process taking too long to complete the reboot at the end of the installation. This does not mean the install failed. If the following conditions appear, the blades can be rebooted to bring up the login prompt, and the installation process can continue.

    The install task output may look something like the following, showing one or more blades failing to complete the task (in this example, k8s-4 and k8s-5 timed out):

    .
    .
    .
    TASK [pxe_install : PXE boot a blade. Will reboot even if currently powered on.] ******************************************************
    changed: [k8s-3.rainbow.lab.us.oracle.com -> localhost]
    changed: [k8s-1.rainbow.lab.us.oracle.com -> localhost]
    changed: [k8s-2.rainbow.lab.us.oracle.com -> localhost]
    changed: [k8s-5.rainbow.lab.us.oracle.com -> localhost]
    changed: [k8s-4.rainbow.lab.us.oracle.com -> localhost]
    changed: [k8s-6.rainbow.lab.us.oracle.com -> localhost]
    changed: [k8s-7.rainbow.lab.us.oracle.com -> localhost]
    changed: [db-1.rainbow.lab.us.oracle.com -> localhost]
     
    TASK [pxe_install : Wait for hosts to come online] ************************************************************************************
    ok: [k8s-1.rainbow.lab.us.oracle.com]
    ok: [k8s-6.rainbow.lab.us.oracle.com]
    ok: [k8s-3.rainbow.lab.us.oracle.com]
    ok: [k8s-2.rainbow.lab.us.oracle.com]
    ok: [k8s-7.rainbow.lab.us.oracle.com]
    ok: [db-1.rainbow.lab.us.oracle.com]
    fatal: [k8s-5.rainbow.lab.us.oracle.com]: FAILED! => {"changed": false, "elapsed": 2100, "msg": "Timeout when waiting for 172.16.3.12:22"}
    fatal: [k8s-4.rainbow.lab.us.oracle.com]: FAILED! => {"changed": false, "elapsed": 2101, "msg": "Timeout when waiting for 172.16.3.11:22"}
    Accessing the install process via a KVM or via VSP should display the following last few lines.
    .
    .
    .
    [  OK  ] Stopped Remount Root and Kernel File Systems.
             Stopping Remount Root and Kernel File Systems...
    [  OK  ] Stopped Create Static Device Nodes in /dev.
             Stopping Create Static Device Nodes in /dev...
    [  OK  ] Started Restore /run/initramfs.
    [  OK  ] Reached target Shutdown.
    Reboot the "stuck" blades using the KVM, the ssh session to the HP ILO, or via the power button on the blade. This example shows how to use the HP ILO to reboot the blade (using blade 1 or k8s-4)
    Login to the blade ILO:
    $ ssh root@192.168.20.121 using the root credentials
    </>hpiLO->
     
    Once the prompt is displayed issue the following command:
    </>hpiLO-> reset /system1 hard
     
    Go to the VSP console to watch the reboot process and the login prompt to appear:
    </>hpiLO-> vsp
  4. Execute the OS Install yum-update on the Hosts from the Bastion Host. This step disables any .repo files that currently exist in the /etc/yum.repos.d directory on each host after the OS install. It then copies any .repo files from the /var/occne/<cluster_name>/yum.repos.d directory into /etc/yum.repos.d and sets up access to the customer repositories.
    $ docker run --rm --network host --cap-add=NET_ADMIN -v /var/occne/<cluster_name>/:/host -v /var/occne/:/var/occne:rw -e "OCCNEARGS=--limit <limit_filter>,localhost --tags yum_update" <image_name>:<image_tag>
     
    Example:
     
    $ docker run --rm --network host --cap-add=NET_ADMIN -v /var/occne/rainbow.lab.us.oracle.com/:/host -v /var/occne/:/var/occne:rw -e "OCCNEARGS=--limit host_hp_gen_10[0:7],localhost --tags yum_update" 10.75.200.217:5000/os_install:1.0.1
  5. Check the /etc/yum.repos.d directory on each host for non-disabled repo files. These files should be disabled. The only file that should be enabled is the customer-specific .repo file that was placed in the /var/occne/<cluster_name>/yum.repos.d directory on the Bastion Host. If any other files are not disabled, each file must be renamed as <filename>.repo.disabled.
    $ cd /etc/yum.repos.d
    $ ls
     
    Check for any files other than the customer-specific .repo file that are not listed as disabled. If any exist, disable them using the following command:
     
    $ mv <filename>.repo <filename>.repo.disabled
     
    Example:
    $ mv oracle-linux-ol7.repo oracle-linux-ol7.repo.disabled
    $ mv uek-ol7.repo uek-ol7.repo.disabled
    $ mv virt-ol7.repo virt-ol7.repo.disabled
  6. Execute the OS Install datastore on the Hosts from the Bastion Host
    $ docker run --rm --network host --cap-add=NET_ADMIN -v /var/occne/<cluster_name>/:/host -v /var/occne/:/var/occne:rw -e "OCCNEARGS=--limit <limit_filter>,localhost --tags datastore" <image_name>:<image_tag>
     
    Example:
     
    $ docker run --rm --network host --cap-add=NET_ADMIN -v /var/occne/rainbow.lab.us.oracle.com/:/host -v /var/occne/:/var/occne:rw -e "OCCNEARGS=--limit host_hp_gen_10[0:7],localhost --tags datastore" 10.75.200.217:5000/os_install:1.0.1
  7. Execute the OS Install OL7 Security Hardening on the Hosts from the Bastion Host. This step performs a set of security hardening steps on the OS after it has been installed.

    Note: The two extra-vars included in the command are not used in the context of this command but need to be there to set the values to something other than an empty string.

    $ docker run --rm --network host --cap-add=NET_ADMIN -v /var/occne/<cluster_name>/:/host -v /var/occne/:/var/occne:rw -e "OCCNEARGS=--limit <limit_filter>,localhost --tags ol7_hardening --extra-vars ansible_env=172.16.3.4 --extra-vars http_proxy=172.16.3.4" <image_name>:<image_tag>
     
    Example:
     
    $ docker run -it --rm --network host --cap-add=NET_ADMIN -v /var/occne/rainbow.lab.us.oracle.com/:/host -v /var/occne/:/var/occne:rw -e "OCCNEARGS=--limit host_hp_gen_10[0:7],localhost --tags ol7_hardening --extra-vars ansible_env=172.16.3.4 --extra-vars http_proxy=172.16.3.4" 10.75.200.217:5000/os_install:1.0.1
    
8.

Re-instantiate the management link bridge on RMS1
  1. Run the following commands on RMS1 host OS:
    $ sudo su
    $ nmcli con add con-name mgmtBridge type bridge ifname mgmtBridge
    $ nmcli con add type bridge-slave ifname eno2 master mgmtBridge
    $ nmcli con add type bridge-slave ifname eno3 master mgmtBridge
    $ nmcli con mod mgmtBridge ipv4.method manual ipv4.addresses 192.168.2.11/24
    $ nmcli con up mgmtBridge
  2. Verify access to the ToR switches' management ports.
    $ ping 192.168.2.1
    $ ping 192.168.2.2
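    A short check confirms the bridge came up with the expected address and slave interfaces; this verification is a suggestion only:
     
    $ nmcli con show --active
    $ ip addr show mgmtBridge
     
    The mgmtBridge connection and its two bridge-slave connections should be listed as active, and the bridge should carry 192.168.2.11/24.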