3 Post Installation Activities

This chapter describes the verification and security hardening procedures to be performed after installing OCCNE.

Post Install Verification

Introduction

This procedure verifies the installation of CNE common services on all nodes hosting the cluster. Several UI endpoints are installed with the common services, such as Kibana, Grafana, Prometheus Server, and Alertmanager; the steps below launch each UI endpoint and verify that the services are installed and working properly.

Prerequisites

  1. Common services have been installed on all nodes hosting the cluster.
  2. Gather the list of cluster names and version tags for the Docker images that were used during the install.
  3. All cluster nodes and service pods must be up and running (see the check below this list).
  4. All commands must be run on the Management server.
  5. Any modern (HTML5-compliant) browser with network connectivity to the CNE.
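
Prerequisite 3 can be checked from the Management server before proceeding. A minimal sketch using standard kubectl commands:

    # All cluster nodes should report STATUS "Ready"
    $ kubectl get nodes

    # All common-services pods should report STATUS "Running" (or "Completed" for one-shot jobs)
    $ kubectl get pods --namespace occne-infra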

Verify Kibana is Running and Accessible

  1. Run the following commands to get the load-balancer IP address and port number for the Kibana web interface:
    # Retrieve the LoadBalancer IP address of the kibana service
    $ export KIBANA_LOADBALANCER_IP=$(kubectl get services occne-kibana --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
      
    # Retrieve the LoadBalancer port number of the kibana service
    $ export KIBANA_LOADBALANCER_PORT=$(kubectl get services occne-kibana --namespace occne-infra -o jsonpath="{.spec.ports[*].port}")
      
    # Complete URL for accessing Kibana in an external browser
    $ echo http://$KIBANA_LOADBALANCER_IP:$KIBANA_LOADBALANCER_PORT
    http://10.75.182.51:80
  2. Launch the browser and navigate to the Kibana web interface at http://$KIBANA_LOADBALANCER_IP:$KIBANA_LOADBALANCER_PORT (for example, http://10.75.182.51:80 from the output above).
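
The same check can be scripted without a browser. A minimal sketch, assuming curl is available on the Management server, queries the Kibana status API:

    # An HTTP 200 response with an overall "state":"green" in the body indicates Kibana is up
    $ curl -s http://$KIBANA_LOADBALANCER_IP:$KIBANA_LOADBALANCER_PORT/api/status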

Using Kibana, Verify Log and Tracer Data is Stored in Elasticsearch

  1. Navigate to "Management" Tab in Kibana.
  2. Click on "Index Patterns". You should be able to see the two patterns as below which confirms Log and Tracer data been stored in Elastic-Search successfully.
    1. jaeger-*
    2. logstash-*
  3. Type logstash* in the index pattern field and wait a few seconds.
  4. Verify the "Success" message and that the index pattern "logstash-YYYY.MM.DD" appears. Click on "Next step".
  5. Select "I don't want to use the Time Filter" and click on "Create index pattern".
  6. Ensure the indices appear in the main viewer frame of the web page.
  7. Click on the "Discover" tab; you should be able to view raw log records.
  8. Repeat steps 3-6 using "jaeger*" instead of "logstash*" to ensure the tracer data is stored in Elasticsearch.
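
The presence of these indices can also be confirmed directly against Elasticsearch. The sketch below assumes the Elasticsearch client service is named occne-elastic-elasticsearch-client and listens on port 9200; adjust both to your install:

    # Retrieve the LoadBalancer IP address of the Elasticsearch service (service name is an assumption)
    $ export ES_LOADBALANCER_IP=$(kubectl get services occne-elastic-elasticsearch-client --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}")

    # List the logstash and jaeger indices; each row shows index health, name, and document count
    $ curl -s "http://$ES_LOADBALANCER_IP:9200/_cat/indices/logstash-*,jaeger-*?v"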

Verify Elasticsearch cluster health

  1. Navigate to "Dev Tools" in Kibana
  2. Enter the command "GET _cluster/health" and press on the green arrow mark. You should see the status as "green"on the right side of the screen.
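
The same health check can be run from the command line; this sketch reuses the ES_LOADBALANCER_IP variable from the previous sketch (the service name there is an assumption):

    # "status" : "green" indicates all primary and replica shards are allocated
    $ curl -s "http://$ES_LOADBALANCER_IP:9200/_cluster/health?pretty"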

Verify Prometheus Alertmanager is accessible

  1. Run the following commands to get the load-balancer IP address and port number for the Prometheus Alertmanager web interface:
    # Retrieve the LoadBalancer IP address of the alertmanager service
    $ export ALERTMANAGER_LOADBALANCER_IP=$(kubectl get services occne-prometheus-alertmanager --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
      
    # Retrieve the LoadBalancer port number of the alertmanager service
    $ export ALERTMANAGER_LOADBALANCER_PORT=$(kubectl get services occne-prometheus-alertmanager --namespace occne-infra -o jsonpath="{.spec.ports[*].port}")
      
    # Complete URL for accessing alertmanager in an external browser
    $ echo http://$ALERTMANAGER_LOADBALANCER_IP:$ALERTMANAGER_LOADBALANCER_PORT
    http://10.75.182.53:80
  2. Launch the browser and navigate to http://$ALERTMANAGER_LOADBALANCER_IP:$ALERTMANAGER_LOADBALANCER_PORT (for example, http://10.75.182.53:80 from the output above). Ensure the Alertmanager GUI is accessible.
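
Alertmanager reachability can also be confirmed from the command line. A minimal sketch, assuming curl is available and an Alertmanager release recent enough to serve the v2 API:

    # Returns JSON containing Alertmanager's version and cluster status when the service is up
    $ curl -s http://$ALERTMANAGER_LOADBALANCER_IP:$ALERTMANAGER_LOADBALANCER_PORT/api/v2/status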

Verify metrics are scraped and stored in the Prometheus server

  1. Run the following commands to get the load-balancer IP address and port number for the Prometheus server web interface:
    # Retrieve the LoadBalancer IP address of the prometheus service
    $ export PROMETHEUS_LOADBALANCER_IP=$(kubectl get services occne-prometheus-server --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
      
    # Retrieve the LoadBalancer port number of the prometheus service
    $ export PROMETHEUS_LOADBALANCER_PORT=$(kubectl get services occne-prometheus-server --namespace occne-infra -o jsonpath="{.spec.ports[*].port}")
      
    # Complete URL for accessing prometheus in an external browser
    $ echo http://$PROMETHEUS_LOADBALANCER_IP:$PROMETHEUS_LOADBALANCER_PORT
    http://10.75.182.54:80
  2. Launch the browser and navigate to http://$PROMETHEUS_LOADBALANCER_IP:$PROMETHEUS_LOADBALANCER_PORT (for example, http://10.75.182.54:80 from the output above). Ensure the Prometheus server GUI is accessible.
  3. Select the "up" option from the "insert metric at cursor" drop-down and click the "Execute" button.
  4. The entries under the Element section are scrape endpoints, and the Value section shows each endpoint's status (1 for up, 0 for down). Ensure all scrape endpoints have a value of 1 (up and running).
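
The same "up" query can be issued through the Prometheus HTTP API instead of the GUI; a minimal sketch, assuming curl is available:

    # Each result's "value" field should end in "1" for a healthy scrape endpoint
    $ curl -s "http://$PROMETHEUS_LOADBALANCER_IP:$PROMETHEUS_LOADBALANCER_PORT/api/v1/query?query=up"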

Verify Alerts are configured

  1. Navigate to the Alerts tab of the Prometheus server GUI, or navigate directly to the URL: http://$PROMETHEUS_LOADBALANCER_IP:$PROMETHEUS_LOADBALANCER_PORT/alerts. For <PROMETHEUS_LOADBALANCER_IP> and <PROMETHEUS_LOADBALANCER_PORT>, refer to the previous section.
  2. If the configured alerts appear in the "Alerts" tab of the Prometheus GUI, then alerts are configured properly.
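
The configured alerting rules can also be listed through the Prometheus HTTP API; a minimal sketch, assuming curl is available:

    # Returns all rule groups; alerting rules appear with "type":"alerting"
    $ curl -s "http://$PROMETHEUS_LOADBALANCER_IP:$PROMETHEUS_LOADBALANCER_PORT/api/v1/rules"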

Verify Grafana is accessible and change the default password for the admin user

  1. Run the following commands to get the load-balancer IP address and port number for the Grafana web interface:
    # Retrieve the LoadBalancer IP address of the grafana service
    $ export GRAFANA_LOADBALANCER_IP=$(kubectl get services occne-grafana --namespace occne-infra -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
      
    # Retrieve the LoadBalancer port number of the grafana service
    $ export GRAFANA_LOADBALANCER_PORT=$(kubectl get services occne-grafana --namespace occne-infra -o jsonpath="{.spec.ports[*].port}")
      
    # Complete URL for accessing grafana in an external browser
    $ echo http://$GRAFANA_LOADBALANCER_IP:$GRAFANA_LOADBALANCER_PORT
    http://10.75.182.55:80
  2. Launch the browser and navigate to http://$GRAFANA_LOADBALANCER_IP:$GRAFANA_LOADBALANCER_PORT (for example, http://10.75.182.55:80 from the output above). Ensure the Grafana GUI is accessible. The default username and password for first-time access is admin/admin.
  3. At the first connection to the Grafana dashboard, a 'Change Password' screen appears. Change the password to the customer-provided credentials.

    Note: Grafana data is not persisted, so if the Grafana service restarts for any reason, the change-password screen appears again.

  4. Grafana dashboards can be accessed after changing the default password in the above step.
  5. Click on "New dashboard".
  6. Click on "Add Query".
  7. From the "Queries to" drop-down, select "Prometheus" as the data source. The presence of the "Prometheus" entry in the "Queries to" drop-down confirms Grafana is connected to the Prometheus time-series database.
  8. In the query section, enter sum by(__name__)({kubernetes_namespace="occne-infra"}), then click anywhere outside the textbox and wait a few seconds. Ensure the dashboard appears in the top section of the page. As an example of using the metrics list, write a PromQL query such as sum($metricnamefromlist) or sum by(kubernetes_pod_name) ($metricnamefromlist{kubernetes_namespace="occne-infra"}). For more details about PromQL, see the Prometheus Query Examples.
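
Grafana's health and its connection to Prometheus can also be confirmed through the Grafana HTTP API. A minimal sketch, assuming the admin password has already been changed as described above:

    # Health endpoint; "database": "ok" in the response indicates Grafana is healthy
    $ curl -s http://$GRAFANA_LOADBALANCER_IP:$GRAFANA_LOADBALANCER_PORT/api/health

    # List configured data sources; a Prometheus entry should be present (requires admin credentials)
    $ curl -s -u admin:<newpassword> http://$GRAFANA_LOADBALANCER_IP:$GRAFANA_LOADBALANCER_PORT/api/datasources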

Post-Installation Security Hardening

Introduction

After installation, the OC-CNE system security posture should be audited before placing the system into service. This primarily consists of changing credentials and sequestering SSH keys on trusted servers. The following table lists all the credentials that need to be checked, changed, or retained:

Note:

Refer to this section if you are performing a bare metal installation.

Table 3-1 Credentials

Credential Name            | Type                | Associated Resource               | Initial Setting                                              | Credential Rotation
TOR Switch                 | username / password | Cisco Top of Rack Switch          | username/password from PreFlight Checklist                   | Reset post-install
Enclosure Switch           | username / password | HP Enclosure Switch               | username/password from PreFlight Checklist                   | Reset post-install
OA Admin                   | username / password | HP On-board Administrator Console | username/password from PreFlight Checklist                   | Reset post-install
ILO Admin                  | username / password | HP Integrated Lights Out Manager  | username/password from PreFlight Checklist                   | Reset post-install
Server Super User (root)   | username / password | Server Super User                 | Set to well-known Oracle default during server installation  | Reset post-install
Server Admin User (admusr) | username / password | Server Admin User                 | Set to well-known Oracle default during server installation  | Reset post-install
Server Admin User SSH      | SSH Key Pair        | Server Admin User                 | Key pair generated at install time                           | Can rotate keys at any time; key distribution is a manual procedure
MySQL Admin                | username / password | MySQL Database                    | Set by customer during initial install                       | Reset post-install

If factory or Oracle defaults were used for any of these credentials, they should be changed before placing the system into operation. The customer should then store these credentials in a safe and secure way off site. It is recommended that the customer plan a regular schedule for updating (rotating) these credentials.

Prerequisites

This procedure is performed after the site has been deployed and prior to placing the site into service.

Limitations and Expectations

The focus of this procedure is to secure the various credentials used or created during the install procedure. There are additional security audits that the CNE operator should perform, such as scanning repositories for vulnerabilities, monitoring the system for anomalies, and regularly checking security logs; these are outside the scope of this post-installation procedure.

References

  1. Nexus commands to configure Top of Rack switch username and password: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/6-x/security/configuration/guide/b_Cisco_Nexus_9000_Series_NX-OS_Security_Configuration_Guide/b_Cisco_Nexus_9000_Series_NX-OS_Security_Configuration_Guide_chapter_01001.html
  2. HP commands to configure Enclosure switch username and password: https://support.hpe.com/hpsc/doc/public/display?docId=c04763521
  3. HP OA commands to configure OA username and password: https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-a00040582en_us&docLocale=en_US#N101C8
  4. HP iLO commands to configure iLO username and password: https://www.golinuxhub.com/2018/02/hp-ilo4--cli-guide-cheatsheet-example.html
  5. See ToR switch procedure for initial username/password configuration: Configure Top of Rack 93180YC-EX Switches
  6. See procedure to configure initial iLO/OA username/password: Configure Addresses for RMS iLOs, OA, EBIPA
  7. See Enclosure switch procedure for initial username/password: Configure Enclosure Switches

Procedure

  1. Reset Credentials on the TOR Switch:
    1. From the bastion host, log in to the switch with the username and password from the procedure:
      [bastion host]# ssh <username>@<switch IP address>
      User Access Verification
      Password: <password>
       
      Cisco Nexus Operating System (NX-OS) Software
      TAC support: http://www.cisco.com/tac
      ...
      ...
      <switch name>#
    2. Change the password for current username:
      #
      # configure
      Enter configuration commands, one per line. End with CNTL/Z.
      (config)# username <username> password <newpassword>
      (config)#exit
      #
    3. Create new username:
      #
      # configure
      Enter configuration commands, one per line. End with CNTL/Z.
      (config)# username <newusername> password <newpassword> role [network-operator|network-admin|vdc-admin|vdc-operator]
      (config)#exit
      #
    4. Exit from the switch and login with the new username and password to verify the new change works:
      # exit
      Connection to <switch IP address> closed.
      [bastion host]#
       
      [some server]# ssh <newusername>@<switch IP address>
      User Access Verification
      Password: <newpassword>
       
      Cisco Nexus Operating System (NX-OS) Software
      TAC support: http://www.cisco.com/tac
      ...
      ...
      <switch name>#
    5. Delete the previous old username if it is not needed:
      #
      # configure
      Enter configuration commands, one per line. End with CNTL/Z.
      (config)# no username <username>
      (config)#exit
      #
    6. Change the enable secret when needed:
      #
      (config)# enable secret <newenablepassword>
      (config)# exit
      #
    7. Save the above configuration:
      # copy running-config startup-config
      [########################################] 100%
      Copy complete, now saving to disk (please wait)...
      Copy complete.
      #
      
  2. Reset Credentials on the Enclosure Switch:
    1. From the bastion host, log in to the switch with the username and password from the procedure:
      [bastion host]# ssh <username>@<switch IP address>
      <username>@<switch IP address>'s password: <password>
       
      ******************************************************************************
      * Copyright (c) 2010-2017 Hewlett Packard Enterprise Development LP          *
      * Without the owner's prior written consent,                                 *
      * no decompiling or reverse-engineering shall be allowed.                    *
      ******************************************************************************
       
      <switchname>
      <switchname>sys
      System View: return to User View with Ctrl+Z.
      [switchname]
    2. Change the password for current username:
      [switchname]local-user <username> class <current class>
      [switchname-luser-manage-<username>]password simple <newpassword>
      [switchname-luser-manage-<username>]quit
      [switchname]
    3. Create new username:
      [switchname]local-user <newusername> class [manage|network]
      New local user added.
      [switchname-luser-manage-<newusername>]password simple <newpassword>
      [switchname-luser-manage-<newusername>]quit
      [switchname]
    4. Exit from the switch and login with the new username and password to verify the new change works:
      <switchname>quit
      Connection to <switch IP address> closed.
      [bastion host]#
        
      [bastion host]# ssh <newusername>@<switch IP address>
      <newusername>@<switch IP address>'s password: <newpassword>
       
      ******************************************************************************
      * Copyright (c) 2010-2017 Hewlett Packard Enterprise Development LP          *
      * Without the owner's prior written consent,                                 *
      * no decompiling or reverse-engineering shall be allowed.                    *
      ******************************************************************************
       
      <switchname>
      <switchname>sys
      System View: return to User View with Ctrl+Z.
      [switchname]
    5. Delete the previous old username if it is not needed:
      [switchname]undo local-user <username> class <current class>
    6. Save the above configuration:
      [switchname]save
      The current configuration will be written to the device. Are you sure? [Y/N]:y
      Please input the file name(*.cfg)[flash:/<filename>]
      (To leave the existing filename unchanged, press the enter key):
      flash:/<filename> exists, overwrite? [Y/N]:y
      Validating file. Please wait...
      Saved the current configuration to mainboard device successfully.
      Slot 1:
      Save next configuration file successfully.
      [switchname]
  3. Reset Credentials for the OA Admin Console:
    1. From the bastion host, log in to the OA with the username and password from the procedure (Note: if this is the Standby OA, exit and log in using the other OA address):
      [bastion host]# ssh <username>@<OA address>
       
      -----------------------------------------------------------------------------
      WARNING: This is a private system.  Do not attempt to login unless you are an
      authorized user.  Any authorized or unauthorized access and use may be moni-
      tored and can result in criminal or civil prosecution under applicable law.
      -----------------------------------------------------------------------------
       
      Firmware Version: 4.85
      Built: 04/06/2018 @ 06:14
      OA Bay Number:  1
      OA Role:        Active
      <username>@<OA address>'s password:<password>
       
      HPE BladeSystem Onboard Administrator
      (C) Copyright 2006-2018 Hewlett Packard Enterprise Development LP
       
       
      Type 'HELP' to display a list of valid commands.
      Type 'HELP <command>' to display detailed information about a specific command.
      Type 'HELP HELP' to display more detailed information about the help system.
       
      OA-A45D36FD5FB1>
    2. Change the password for current username:
      OA-A45D36FD5FB1> set password <newpassword>
       
      Changed password for the "<username>" user account.
       
      OA-A45D36FD5FB1>
    3. Add new user:
      OA-A45D36FD5FB1> add user <newusername>
       
      New Password: <newpassword>
      Confirm     : <newpassword>
      User "<newusername>" created.
      You may set user privileges with the 'SET USER ACCESS' and 'ASSIGN' commands.
       
      OA-A45D36FD5FB1> set user access <newusername> [ADMINISTRATOR|OPERATOR|USER]
       
      "<newusername>" has been given [administrator|operator|user] level privileges.
    4. Assign full access to the enclosure for the user:
      OA-A45D36FD5FB1> assign server all <newusername>
       
      <newusername> has been granted access to the valid requested bay(s)
       
      OA-A45D36FD5FB1> assign interconnect all <newusername>
       
      <newusername> has been granted access to the valid requested bay(s)
       
      OA-A45D36FD5FB1> assign oa <newusername>
       
      <newusername> has been granted access to the OA.
    5. Exit from the OA and login with the new username and password to verify the new change works:
      OA-A45D36FD5FB1> exit
       
      Connection to <OA address> closed.
      [bastion host]# ssh <newusername>@<OA address>
       
      -----------------------------------------------------------------------------
      WARNING: This is a private system.  Do not attempt to login unless you are an
      authorized user.  Any authorized or unauthorized access and use may be moni-
      tored and can result in criminal or civil prosecution under applicable law.
      -----------------------------------------------------------------------------
       
      Firmware Version: 4.85
      Built: 04/06/2018 @ 06:14
      OA Bay Number:  1
      OA Role:        Active
      <newusername>@<OA address>'s password:<newpassword>
       
       
       
       
       
       
      HPE BladeSystem Onboard Administrator
      (C) Copyright 2006-2018 Hewlett Packard Enterprise Development LP
       
       
      Type 'HELP' to display a list of valid commands.
      Type 'HELP <command>' to display detailed information about a specific command.
      Type 'HELP HELP' to display more detailed information about the help system.
       
       
      OA-A45D36FD5FB1>
    6. Delete previous user if not needed:
      OA-A45D36FD5FB1> remove user <username>
       
      Entering anything other than 'YES' will result in the command not executing.
       
      Are you sure you want to remove <username>? YES
       
      User "<username>" removed.
  4. Reset Credentials for the ILO Admin Console:
    1. From the bastion host, log in to the iLO with the username and password from the procedure:
      [bastion host]# ssh <username>@<iLO address>
      <username>@<iLO address>'s password: <password>
      User:<username> logged-in to ...(<iLO address> / <ipv6 address>)
       
      iLO Advanced 2.61 at  Jul 27 2018
      Server Name: <server name>
      Server Power: On
       
      </>hpiLO->
    2. Change current password:
      </>hpiLO-> set /map1/accounts1/<username> password=<newpassword>
       
      status=0
      status_tag=COMMAND COMPLETED
      Tue Aug 20 13:27:08 2019
       
      </>hpiLO->
    3. Create new user:
      </>hpiLO-> create /map1/accounts1 username=<newusername> password=<newpassword> group=admin,config,oemHP_rc,oemHP_power,oemHP_vm
      status=0
      status_tag=COMMAND COMPLETED
      Tue Aug 20 13:47:56 2019
       
      User added successfully.
    4. Exit from the iLO and login with the new username and password to verify the new change works:
      </>hpiLO-> exit
       
      status=0
      status_tag=COMMAND COMPLETED
      Tue Aug 20 13:30:52 2019
       
       
       
       
      CLI session stopped
      Received disconnect from <iLO address> port 22:11:  Client Disconnect
      Disconnected from <iLO address> port 22
       
      [bastion host]# ssh <newusername>@<iLO address>
      <newusername>@<iLO address>'s password: <newpassword>
      User:<newusername> logged-in to ...(<iLO address> / <ipv6 address>)
       
      iLO Advanced 2.61 at  Jul 27 2018
      Server Name: <server name>
      Server Power: On
       
      </>hpiLO->
    5. Delete the previous username if not needed:
      </>hpiLO-> delete /map1/accounts1/<username>
       
      status=0
      status_tag=COMMAND COMPLETED
      Tue Aug 20 13:59:04 2019
       
      User deleted successfully.
      
  5. Reset Credentials for the root account on each server:

    Log in to each server in the cluster (ssh admusr@cluster_host) and run the following command:

    sudo passwd root
  6. Reset (or Lock) Credentials for the admusr account on each server:

    Log in to each server in the cluster (ssh admusr@cluster_host) and run the following command:

    sudo passwd -l admusr
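
    Steps 5 and 6 can also be scripted from the Bastion Host. A minimal sketch, assuming admusr has sudo rights on every server and using a hypothetical CLUSTER_HOSTS variable that lists every server in the cluster:

    # CLUSTER_HOSTS is a hypothetical space-separated list of all cluster servers
    CLUSTER_HOSTS="<host1> <host2> <host3>"
    for cluster_host in ${CLUSTER_HOSTS}; do
        # -t allocates a TTY so the interactive passwd prompts work over ssh
        ssh -t admusr@${cluster_host} "sudo passwd root && sudo passwd -l admusr"
    done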
  7. Regenerate / Redistribute SSH Keys Credentials for the admusr account:

    Log in to the Bastion Host VM and generate a new cluster-wide keypair by performing the following:

    ssh-keygen -b 4096 -t rsa -C "New SSH Key" -f .ssh/new_occne_id_rsa -q -N ""
    
    Now install the new public key on each server in the cluster. A runnable form of the loop (CLUSTER_HOSTS is a hypothetical space-separated list of every cluster host, as in the sketch above):
    CLUSTER_HOSTS="<host1> <host2> <host3>"
    for cluster_host in ${CLUSTER_HOSTS}; do
        # copy the public key to the node
        scp .ssh/new_occne_id_rsa.pub admusr@${cluster_host}:.ssh/
     
        # install the key
        ssh admusr@${cluster_host} "cat .ssh/new_occne_id_rsa.pub >> .ssh/authorized_keys"
    done
    At this point, the new key should be usable. Switch from using the old key to the new key, and confirm that each cluster host is still reachable. On the Bastion Host VM, perform these actions:
    # remove the old keys from the agent (assuming you are using an agent)
    ssh-add -D
    # add the new key to the agent
    ssh-add .ssh/new_occne_id_rsa
     
    for cluster_host in ${CLUSTER_HOSTS}; do
        # confirm access to the cluster host and remove the old key
        ssh admusr@${cluster_host} "sed -i '/ occne installer key$/d' .ssh/authorized_keys"
    done

    The new private key (new_occne_id_rsa) should also be copied to any secondary Bastion Host VM, and possibly copied off site and securely saved.
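
    A minimal sketch for sequestering the new private key on a secondary Bastion Host (the host name below is a placeholder):

    # copy the new private key to the secondary Bastion Host and restrict its permissions
    scp .ssh/new_occne_id_rsa admusr@<secondary bastion host>:.ssh/
    ssh admusr@<secondary bastion host> "chmod 600 .ssh/new_occne_id_rsa"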