Migrate Oracle Linux Automation Manager to a Clustered Deployment

Introduction

Whether you are upgrading from a previous release or starting with a single host installation, you can migrate either environment to a clustered deployment. Administrators need to plan their topology before migrating, as the cluster may consist of a combination of Control Plane, Execution, and Hop nodes and a remote database.

After following this tutorial, you’ll know how to migrate a single host installation to a clustered deployment with a remote database.

Objectives

In this tutorial, you’ll learn how to:

  • Convert a single host installation into a control plane node
  • Install and configure a remote database
  • Add an execution plane node to the cluster

Prerequisites

Deploy Oracle Linux Automation Manager

Note: If running in your own tenancy, read the linux-virt-labs GitHub project README.md and complete the prerequisites before deploying the lab environment.

  1. Open a terminal on the Luna Desktop.

  2. Clone the linux-virt-labs GitHub project.

    git clone https://github.com/oracle-devrel/linux-virt-labs.git
    
  3. Change into the working directory.

    cd linux-virt-labs/olam
    
  4. Install the required collections.

    ansible-galaxy collection install -r requirements.yml
    
  5. Update the Oracle Linux instance configuration.

    cat << EOF | tee instances.yml > /dev/null
    compute_instances:
      1:
        instance_name: "olam-node"
        type: "control"
      2:
        instance_name: "exe-node"
        type: "execution"
      3:
        instance_name: "db-node"
        type: "db"
    passwordless_ssh: true
    add_cluster_ports: true
    EOF
    
  6. Deploy the lab environment.

    ansible-playbook create_instance.yml -e ansible_python_interpreter="/usr/bin/python3.6" -e "@instances.yml" -e olam_single_host=true
    

    The free lab environment requires the extra variable ansible_python_interpreter because it installs the RPM package for the Oracle Cloud Infrastructure SDK for Python, which places its modules under python3.6.

    The default deployment shape uses the AMD CPU and Oracle Linux 8. To use an Intel CPU or Oracle Linux 9, add -e instance_shape="VM.Standard3.Flex" or -e os_version="9" to the deployment command.
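
    For example, to deploy with an Intel CPU shape and Oracle Linux 9, the full command might look like this (a variation of the command above, not an additional required step):

    ansible-playbook create_instance.yml -e ansible_python_interpreter="/usr/bin/python3.6" -e "@instances.yml" -e olam_single_host=true -e instance_shape="VM.Standard3.Flex" -e os_version="9"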

    Important: Wait for the playbook to run successfully and reach the pause task. The Oracle Linux Automation Manager installation is complete at this stage of the playbook, and the instances are ready. Take note of the previous play, which prints the public and private IP addresses of the nodes it deploys.

Log into the WebUI

  1. Open a terminal and configure an SSH tunnel to Oracle Linux Automation Manager.

    ssh -L 8444:localhost:443 oracle@<hostname_or_ip_address>
    

    In the free lab environment, use the external IP address of the olam-node instance.
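
    For example, if the public IP address of olam-node were 203.0.113.10 (a placeholder address for illustration), the tunnel command would be:

    # 203.0.113.10 is a placeholder; substitute the actual public IP of olam-node
    ssh -L 8444:localhost:443 oracle@203.0.113.10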

  2. Open a web browser and enter the URL.

    https://localhost:8444
    

    Note: Approve the security warning based on the browser used. For Chrome, click the Advanced button and then the Proceed to localhost (unsafe) link.

  3. Log in to Oracle Linux Automation Manager with the Username of admin and the Password of admin created during the automated deployment.

    olam-login

  4. After logging in, the WebUI displays.

    olam-webui

Migrate to a Cluster Deployment

While Oracle Linux Automation Manager runs as a single host deployment, it also supports running as a cluster with a remote database and separate control plane and execution nodes. The installation configures the single-host instance as a hybrid node. The first step in migrating to a cluster deployment is converting this instance to a control plane node.

For more information on different installation topologies, see the Planning the Installation chapter of the Oracle Linux Automation Manager Installation Guide documentation.

Prepare the Control Plane Node

  1. Switch to the terminal connected to the olam-node instance running Oracle Linux Automation Manager.

    Note: From now on, we’ll refer to this instance as the control plane node.

  2. Stop the Oracle Linux Automation Manager service.

    sudo systemctl stop ol-automation-manager
    
  3. Create a backup of the database.

    sudo su - postgres -c 'pg_dumpall > /tmp/olamv2_db_dump'
    

Install the Remote Database

  1. Copy the database backup from the control plane node to the new remote database host.

    scp /tmp/olamv2_db_dump oracle@db-node:/tmp/
    

    The scp command communicates over an SSH connection between the nodes. This connection works because the free lab environment configures passwordless SSH logins between the instances.
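
    If the copy fails, an optional first check is to confirm the dump file exists on the control plane node:

    ls -lh /tmp/olamv2_db_dump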

  2. Connect via ssh to the db-node instance.

    ssh oracle@db-node
    
  3. Enable the database module stream.

    Oracle Linux Automation Manager supports PostgreSQL database version 12 or 13. This tutorial uses and enables the version 13 module stream.

    sudo dnf -y module reset postgresql
    sudo dnf -y module enable postgresql:13
    
  4. Install the database server.

    sudo dnf -y install postgresql-server
    
  5. Add the database firewall rule.

    sudo firewall-cmd --add-port=5432/tcp --permanent
    sudo firewall-cmd --reload
    
  6. Initialize the database.

    sudo postgresql-setup --initdb
    
  7. Set the database default storage algorithm.

    sudo sed -i "s/#password_encryption.*/password_encryption = scram-sha-256/"  /var/lib/pgsql/data/postgresql.conf
    

    For more details regarding this database functionality, see Password Authentication in the upstream documentation.

  8. Update the database host-based authentication file.

    echo "host  all  all 0.0.0.0/0 scram-sha-256" | sudo tee -a /var/lib/pgsql/data/pg_hba.conf > /dev/null
    

    This additional line performs SCRAM-SHA-256 authentication to verify a user’s password when connecting from any IP address.
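
    If you want to limit access rather than allow all addresses, you can restrict the rule to your cluster's subnet instead of 0.0.0.0/0. For example, assuming the nodes sit on a 10.0.0.0/24 network (an example subnet for illustration):

    # 10.0.0.0/24 is an example subnet; adjust it to match your network
    echo "host  all  all 10.0.0.0/24 scram-sha-256" | sudo tee -a /var/lib/pgsql/data/pg_hba.conf > /dev/null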

  9. Update the listen_address value on which the database listens for connections.

    sudo sed -i "/^#port = 5432/i listen_addresses = '"$(hostname -s)"'" /var/lib/pgsql/data/postgresql.conf
    

    You can choose either the IP address or hostname for this value. The tutorial uses hostname -s to select the hostname.
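
    To confirm the change, you can optionally print the resulting line from the configuration file:

    grep ^listen_addresses /var/lib/pgsql/data/postgresql.conf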

  10. Start and enable the database service.

    sudo systemctl enable --now postgresql
    
  11. Import the database dump file.

    sudo su - postgres -c 'psql -d postgres -f /tmp/olamv2_db_dump'
    
  12. Set the Oracle Linux Automation Manager database user account password.

    sudo su - postgres -c "psql -U postgres -d postgres -c \"alter user awx with password 'password';\""
    

    This command sets the awx password to password. Choose a more secure password if running this command outside the free lab environment.
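
    If you need a stronger password outside the free lab environment, one option is to generate a random value, for example:

    openssl rand -base64 16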

  13. Close the SSH session connected to the db-node instance, as that completes the necessary steps to set up the remote database.

    exit
    

Add the Remote Database Settings

  1. Confirm your connection to the olam-node instance by checking the terminal prompt.

  2. Verify the host can communicate with the remote database.

    pg_isready -d awx -h db-node -p 5432 -U awx
    

    The postgresql package, installed as part of the original single-host installation, provides the pg_isready command. If this command fails, you likely skipped a step above or are missing ingress access to port 5432 on the network.
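
    When the connection succeeds, pg_isready prints output similar to:

    db-node:5432 - accepting connections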

  3. Add the remote database settings to a new custom configuration file.

    cat << EOF | sudo tee /etc/tower/conf.d/db.py > /dev/null
    DATABASES = {
        'default': {
            'ATOMIC_REQUESTS': True,
            'ENGINE': 'awx.main.db.profiled_pg',
            'NAME': 'awx',
            'USER': 'awx',
            'PASSWORD': 'password',
            'HOST': 'db-node',
            'PORT': '5432',
        }
    }
    EOF
    

    Use the same password set previously for the awx database user account.

  4. Stop and disable the local database on the control plane node.

    sudo systemctl stop postgresql
    sudo systemctl disable postgresql
    
  5. Mask the local database service.

    sudo systemctl mask postgresql
    

    This step prevents the local database service from starting when the Oracle Linux Automation Manager service starts.
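
    You can optionally confirm the mask took effect by checking the service status; the Loaded line should report the unit as masked:

    sudo systemctl status postgresql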

  6. Start Oracle Linux Automation Manager.

    sudo systemctl start ol-automation-manager
    
  7. Verify Oracle Linux Automation Manager connects to the remote database.

    sudo su -l awx -s /bin/bash -c "awx-manage check_db"
    

    The output returns the remote database version details if a connection is successful.

Remove the Local Database Instance

Removing the original local database is safe after confirming the connection to the remote database is working.

  1. Remove the database packages.

    sudo dnf -y remove postgresql
    
  2. Remove the pgsql directory containing the old database data files.

    sudo rm -rf /var/lib/pgsql
    

Change the Node Type of the Control Plane Node

When converting to a clustered deployment, switch the single-host instance node_type from hybrid to control.

  1. Confirm the current node type of the control plane node.

    sudo su -l awx -s /bin/bash -c "awx-manage list_instances"
    

    The output shows the node_type set to a value of hybrid.

  2. Remove the default instance group.

    sudo su -l awx -s /bin/bash -c "awx-manage remove_from_queue --queuename default --hostname $(hostname -i)"
    
  3. Define the new instance and queue.

    sudo su -l awx -s /bin/bash -c "awx-manage provision_instance --hostname=$(hostname -i) --node_type=control"
    sudo su -l awx -s /bin/bash -c "awx-manage register_queue --queuename=controlplane --hostnames=$(hostname -i)"
    
    
  4. Add the default queue name values in the custom settings file.

    cat << EOF | sudo tee -a /etc/tower/conf.d/olam.py > /dev/null
    DEFAULT_EXECUTION_QUEUE_NAME = 'execution'
    DEFAULT_CONTROL_PLANE_QUEUE_NAME = 'controlplane'
    EOF
    
  5. Update Receptor settings.

    cat << EOF | sudo tee /etc/receptor/receptor.conf > /dev/null
    ---
    - node:
        id: $(hostname -i)
    
    - log-level: info
    
    - tcp-listener:
        port: 27199
    
    - control-service:
        service: control
        filename: /var/run/receptor/receptor.sock
    
    - work-command:
        worktype: local
        command: /var/lib/ol-automation-manager/venv/awx/bin/ansible-runner
        params: worker
        allowruntimeparams: true
        verifysignature: false
    EOF
    
  6. Restart Oracle Linux Automation Manager.

    sudo systemctl restart ol-automation-manager
    

The conversion of the single-host hybrid node to a control plane node with a remote database is complete. Now, we’ll add an execution plane node to make this cluster fully functional.
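
If you want to double-check the conversion before continuing, you can rerun the list_instances command from earlier; the node should now report node_type=control and appear in the controlplane group.

    sudo su -l awx -s /bin/bash -c "awx-manage list_instances"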

Add an Execution Plane Node to the Cluster

Before the cluster is fully functional, add one or more execution nodes. Execution nodes run standard jobs using ansible-runner, which runs playbooks within an OLAM EE Podman container-based execution environment.

Prepare the Execution Plane Node

  1. Connect via ssh to the exe-node instance.

    ssh exe-node
    
  2. Install the Oracle Linux Automation Manager repository package.

    sudo dnf -y install oraclelinux-automation-manager-release-el8
    
  3. Disable the repositories for older releases.

    sudo dnf config-manager --disable ol8_automation ol8_automation2
    
  4. Enable the current release’s repository.

    sudo dnf config-manager --enable ol8_automation2.2
    
  5. Install the Oracle Linux Automation Manager package.

    sudo dnf -y install ol-automation-manager
    
  6. Add the Receptor firewall rule.

    sudo firewall-cmd --add-port=27199/tcp --permanent
    sudo firewall-cmd --reload
    
  7. Edit the Redis socket configuration.

    sudo sed -i '/^# unixsocketperm/a unixsocket /var/run/redis/redis.sock\nunixsocketperm 775' /etc/redis.conf
    
  8. Copy the secret key from the control plane node.

    ssh oracle@olam-node "sudo cat /etc/tower/SECRET_KEY" | sudo tee /etc/tower/SECRET_KEY > /dev/null
    

    Important: Every cluster node requires the same secret key.
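
    An optional way to verify both nodes hold the same key is to compare checksums on the execution node and the control plane node:

    sudo sha256sum /etc/tower/SECRET_KEY
    ssh oracle@olam-node "sudo sha256sum /etc/tower/SECRET_KEY"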

  9. Create a custom settings file containing the required settings.

    cat << EOF | sudo tee /etc/tower/conf.d/olamv2.py > /dev/null
    CLUSTER_HOST_ID = '$(hostname -i)'
    DEFAULT_EXECUTION_QUEUE_NAME = 'execution'
    DEFAULT_CONTROL_PLANE_QUEUE_NAME = 'controlplane'
    EOF
    

    The CLUSTER_HOST_ID is a unique identifier of the host within the cluster.

  10. Create a custom settings file containing the remote database configuration.

    cat << EOF | sudo tee /etc/tower/conf.d/db.py > /dev/null
    DATABASES = {
        'default': {
            'ATOMIC_REQUESTS': True,
            'ENGINE': 'awx.main.db.profiled_pg',
            'NAME': 'awx',
            'USER': 'awx',
            'PASSWORD': 'password',
            'HOST': 'db-node',
            'PORT': '5432',
        }
    }
    EOF
    
  11. Deploy the ansible-runner execution environment.

    1. Open a shell as the awx user.

      sudo su -l awx -s /bin/bash
      
    2. Migrate any existing containers to the latest podman version while keeping the unprivileged namespaces alive.

      podman system migrate
      
    3. Pull the Oracle Linux Automation Engine execution environment for Oracle Linux Automation Manager.

      podman pull container-registry.oracle.com/oracle_linux_automation_manager/olam-ee:2.2
      
    4. Exit out of the awx user shell.

      exit
      
  12. Generate the SSL certificates for NGINX.

    sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/tower/tower.key -out /etc/tower/tower.crt
    

    Enter the requested information at each prompt, or just press the ENTER key.
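
    If you prefer a non-interactive run, you can supply the certificate subject on the command line; the common name below is only an example value:

    # the -subj common name is an example; use a value appropriate for your host
    sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/tower/tower.key -out /etc/tower/tower.crt -subj "/CN=exe-node"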

  13. Replace the default NGINX configuration with the configuration below.

    cat << 'EOF' | sudo tee /etc/nginx/nginx.conf > /dev/null
    user nginx;
    worker_processes auto;
    error_log /var/log/nginx/error.log;
    pid /run/nginx.pid;
    
    # Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
    include /usr/share/nginx/modules/*.conf;
    
    events {
        worker_connections 1024;
    }
    
    http {
        log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                          '$status $body_bytes_sent "$http_referer" '
                          '"$http_user_agent" "$http_x_forwarded_for"';
    
        access_log  /var/log/nginx/access.log  main;
    
        sendfile            on;
        tcp_nopush          on;
        tcp_nodelay         on;
        keepalive_timeout   65;
        types_hash_max_size 2048;
    
        include             /etc/nginx/mime.types;
        default_type        application/octet-stream;
    
        # Load modular configuration files from the /etc/nginx/conf.d directory.
        # See http://nginx.org/en/docs/ngx_core_module.html#include
        # for more information.
        include /etc/nginx/conf.d/*.conf;
    }
    EOF
    
  14. Update the Receptor configuration file.

    cat << EOF | sudo tee /etc/receptor/receptor.conf > /dev/null
    ---
    - node:
        id: $(hostname -i)
    
    - log-level: debug
    
    - tcp-listener:
        port: 27199
    
    - tcp-peer:
        address: $(ssh olam-node hostname -i):27199
        redial: true
    
    - control-service:
        service: control
        filename: /var/run/receptor/receptor.sock
    
    - work-command:
        worktype: ansible-runner
        command: /var/lib/ol-automation-manager/venv/awx/bin/ansible-runner
        params: worker
        allowruntimeparams: true
        verifysignature: false
    EOF
    
    • node:id is the hostname or IP address of the current node.
    • tcp-peer:address is the Receptor mesh’s hostname or IP address and port on the control plane node.
  15. Start and enable the Oracle Linux Automation Manager service.

    sudo systemctl enable --now ol-automation-manager.service
    
  16. Close the SSH session connected to the exe-node instance, as that completes the necessary steps to set up the execution node.

    exit
    

Provision the Execution Plane Node

  1. Confirm your connection to the olam-node instance by checking the terminal prompt.

    Run the provisioning step on one of the cluster’s control plane nodes; it applies to all clustered instances of Oracle Linux Automation Manager.

  2. Define the execution instance and queue.

    sudo su -l awx -s /bin/bash -c "awx-manage provision_instance --hostname=$(ssh exe-node hostname -i) --node_type=execution"
    sudo su -l awx -s /bin/bash -c "awx-manage register_default_execution_environments"
    sudo su -l awx -s /bin/bash -c "awx-manage register_queue --queuename=execution --hostnames=$(ssh exe-node hostname -i)"
    
    
    • register_queue takes a queuename to create/update and a list of comma-delimited hostnames where jobs run.
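
    For example, if you later add a second execution node (a hypothetical exe-node2), you could register both hosts against the same queue:

    # exe-node2 is a hypothetical additional execution node
    sudo su -l awx -s /bin/bash -c "awx-manage register_queue --queuename=execution --hostnames=$(ssh exe-node hostname -i),$(ssh exe-node2 hostname -i)"
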
  3. Register the service mesh peer relationship.

    sudo su -l awx -s /bin/bash -c "awx-manage register_peers $(ssh exe-node hostname -i) --peers $(hostname -i)"
    

Verify the Execution Plane Node Registration

  1. Connect via ssh to the exe-node instance.

    ssh exe-node
    
  2. Verify the Oracle Linux Automation Manager mesh service is running.

    sudo systemctl status receptor-awx
    
  3. Check the status of the service mesh.

    sudo receptorctl --socket /var/run/receptor/receptor.sock status
    

    Example Output:

    [oracle@execution-node ~]$ sudo receptorctl  --socket /var/run/receptor/receptor.sock status
    Node ID: 10.0.0.62
    Version: +g
    System CPU Count: 2
    System Memory MiB: 15713
    
    Connection   Cost
    10.0.0.55   1
    
    Known Node   Known Connections
    10.0.0.55    10.0.0.62: 1
    10.0.0.62    10.0.0.55: 1
    
    Route        Via
    10.0.0.55   10.0.0.55
    
    Node         Service   Type       Last Seen             Tags
    10.0.0.62   control   Stream     2022-11-06 19:46:53   {'type': 'Control Service'}
    10.0.0.55   control   Stream     2022-11-06 19:46:06   {'type': 'Control Service'}
    
    Node         Work Types
    10.0.0.62   ansible-runner
    10.0.0.55   local
    

    For more details about Receptor, see the upstream documentation.
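
    You can also verify end-to-end connectivity across the mesh with receptorctl ping, using the control plane node’s ID shown in the status output (10.0.0.55 in the example above):

    sudo receptorctl --socket /var/run/receptor/receptor.sock ping 10.0.0.55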

  4. Verify the running cluster instances and show the available capacity.

    sudo su -l awx -s /bin/bash -c "awx-manage list_instances"
    

    The output appears green once the cluster establishes communication across all instances. If the results appear red, wait 20-30 seconds and try rerunning the command.

    Example Output:

    [oracle@control-node ~]$ sudo su -l awx -s /bin/bash -c "awx-manage list_instances"
    [controlplane capacity=136]
    	10.0.0.55 capacity=136 node_type=control version=19.5.1 heartbeat="2022-11-08 16:24:03"
    
    [default capacity=0]
    
    [execution capacity=136]
    	10.0.0.62 capacity=136 node_type=execution version=19.5.1 heartbeat="2022-11-08 17:16:45"
    
    

That completes the migration of Oracle Linux Automation Manager to a clustered deployment.

(Optional) Verify the Cluster is Working

  1. Refresh the web browser window used to display the previous WebUI, or open a new web browser window and enter the URL.

    https://localhost:8444
    

    The port used in the URL needs to match the local port of the SSH tunnel.

    Note: Approve the security warning based on the browser used. For Chrome, click the Advanced button and then the Proceed to localhost (unsafe) link.

  2. Log in to Oracle Linux Automation Manager again with the Username of admin and the Password of admin.

    olam2-login

  3. After logging in, the WebUI displays.

    olam2-webui

  4. Using the navigation menu on the left, click Instance Groups under the Administration section.

    olam2-menu-ig

  5. In the main window, click the Add button and then select Add instance group.

    olam2-ig

  6. Enter the required information on the Create new instance group page.

    olam2-new-ig

  7. Click the Save button.

  8. From the Details summary page, click the Instances tab.

  9. From the Instances page, click the Associate button.

  10. On the Select Instances page, click the checkbox next to the execution node.

    olam2-ig-associate

  11. Click the Save button.

  12. Using the navigation menu on the left, click Inventories under the Resources section.

    olam2-menu-inv

  13. In the main window, click the Add button and then select Add inventory.

    olam2-inventories

  14. Enter the required information on the Create new inventory page.

    olam2-new-inv

    For Instance Groups, select the search icon to display the Select Instance Groups pop-up dialog. Click the checkbox next to the execution group and then click the Select button.

  15. Click the Save button.

  16. From the Details summary page, click the Hosts tab.

    olam2-inv-detail

  17. From the Hosts page, click the Add button.

    olam2-hosts

  18. On the Create new host page, enter the IP address or hostname of an available instance.

    olam2-new-host

    In the free lab environment, we’ll use db-node, which is the hostname of the remote database instance.

  19. Click the Save button.

  20. Navigate to the menu on the left, and click Credentials.

    olam2-menu-creds

  21. On the Credentials page, click the Add button.

    olam2-credentials

  22. Enter the required information on the Create New Credential page.

    olam2-new-creds

    For the Credential Type, click the drop-down menu and select Machine. That displays the credential’s Type Details.

    Enter a Username of oracle and browse for the SSH Private Key. Clicking the Browse… button displays an Open File dialog window.

    olam2-open-file

    Right-click on the main window of that dialog and then select Show Hidden Files.

    olam2-show-hide

    Then select the .ssh folder and the id_rsa file. Clicking the Open button copies the contents of the private key file into the SSH Private Key dialog box. Scroll down and click the Save button.

  23. Navigate to the menu on the left and click on Inventories.

    olam2-menu-inv

  24. From the Inventories page, click on the Test inventory.

    olam2-inv-test

  25. From the Details summary page, click the Hosts tab.

    olam2-inv-test-detail

  26. On the Hosts page, click the checkbox next to the db-node host.

    olam2-webui

    Then click the Run Command button.

  27. From the Run command dialog, select the ping module from the Modules list-of-values and click the Next button.

    olam2-webui

  28. Select the OLAM EE (2.2) execution environment and click the Next button.

    olam2-webui

  29. Select the db-node machine credential and click the Next button.

    olam2-webui

  30. The panel refreshes and displays a preview of the command.

    olam2-webui

    After reviewing the details, click the Launch button.

  31. The job will launch and display the job Output page.

    olam2-webui

    If everything ran successfully, the output shows a SUCCESS message that the execution node contacted the db-node instance using the Ansible ping module. If you do not see the output, refresh the page by clicking the Details tab and returning to the Output tab.
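
    A successful run of the ping module produces output similar to the following (the exact fields may vary):

    db-node | SUCCESS => {
        "changed": false,
        "ping": "pong"
    }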

Next Steps

The output within the WebUI confirms you have a working cluster environment for Oracle Linux Automation Manager. Continue to build your skills, and check out our other Oracle Linux Automation Manager training on Oracle Linux Training Station.

Oracle Linux Automation Manager Documentation
Oracle Linux Automation Manager Training
Oracle Linux Training Station

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.