6 Monitoring and Maintaining Offline Mediation Controller Cloud Native

Learn how to maintain your Oracle Communications Offline Mediation Controller cloud native deployment.

Using Prometheus Operator to Monitor Offline Mediation Controller Cloud Native

Offline Mediation Controller cloud native tracks and exposes the following metric data in Prometheus format:

  • Node Manager-level statistics, which include:

    • The total network accounting records (NARs) processed

    • The current NARs processed

    • The current processing rate

    • The average processing rate

    Node Manager-level statistics are exposed through the endpoint http://hostname:8082/metrics, where hostname is the host name of the machine on which Offline Mediation Controller cloud native is running. You can change the port number where the metric data is exposed using the ocomc.configEnv.metricsPort key in your override-values.yaml file for oc-cn-ocomc-helm-chart.

  • JVM metrics for all Offline Mediation Controller components, which include:

    • Performance at the Node Manager level

    • JVM parameters

    JVM metrics are exposed through the endpoint http://hostname:portJVM/metrics, where portJVM is the port number where the JVM metrics are exposed. You can set the port number by using the ocomc.configEnv.metricsPortCN key in your override-values.yaml file for oc-cn-ocomc-helm-chart.
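
As a quick check that an endpoint responds, you could request it with curl. The following is a minimal sketch, not part of the product: the fetch_metrics helper name is hypothetical, and it assumes that curl is installed and that the host and port are reachable from where you run it (for example, from inside the cluster).

```shell
# Hypothetical helper, not shipped with Offline Mediation Controller.
# $1 = host name, $2 = port number where the metric data is exposed
fetch_metrics() {
  # -s hides the progress meter; -f makes curl return an error on HTTP failures
  curl -sf "http://$1:$2/metrics"
}
```

For example, fetch_metrics hostname 8082 would request the Node Manager-level statistics. Passing the port that you set in ocomc.configEnv.metricsPortCN would request the JVM metrics instead.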

To monitor Offline Mediation Controller more easily, you can configure an external centralized metrics service, such as Prometheus Operator, to scrape metrics from each endpoint and store them for analysis and monitoring. You can then set up a visualization tool, such as Grafana, to display your metric data in a graphical format.

For the list of compatible Prometheus Operator and Grafana software versions, see "Offline Mediation Controller Cloud Native Deployment Software Compatibility" in Offline Mediation Controller Compatibility Matrix.

Enabling the Automatic Scraping of Metrics

You can configure the Prometheus Operator ServiceMonitor to automatically scrape Offline Mediation Controller metrics. For more information about Prometheus Operator and ServiceMonitors, see the prometheus-operator documentation on the GitHub website (https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md).

To enable the automatic scraping of Offline Mediation Controller metrics:

  1. Install Prometheus Operator on your cloud native environment.

  2. In your override-values.yaml file for oc-cn-ocomc-helm-chart, set the ocomc.serviceMonitor.enabled key to true:

    ocomc:
       serviceMonitor:
          enabled: true
  3. Run the helm upgrade command to update the Offline Mediation Controller Helm release:

    helm upgrade ReleaseName oc-cn-ocomc-helm-chart --values OverrideValuesFile -n NameSpace

    where:

    • ReleaseName is the release name, which is used to track the installation instance.

    • OverrideValuesFile is the path to a YAML file that overrides the default configurations in the chart's values.yaml file.

    • NameSpace is the namespace in which to create Offline Mediation Controller Kubernetes objects.
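
After the upgrade completes, you might confirm that a ServiceMonitor object now exists. The following sketch assumes that kubectl is configured for your cluster, that the Prometheus Operator custom resource definitions are installed, and that the chart creates the ServiceMonitor in the release namespace. The list_service_monitors helper name is hypothetical.

```shell
# Hypothetical helper, not shipped with Offline Mediation Controller.
# $1 = namespace that holds the Offline Mediation Controller release
list_service_monitors() {
  # Lists the ServiceMonitor custom resources in the given namespace
  kubectl get servicemonitors --namespace "$1"
}
```

For example: list_service_monitors NameSpace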

Using the Sample Grafana Dashboards

The Offline Mediation Controller package includes sample Grafana dashboard templates that you can use to visualize metrics. To use the sample dashboards, import the following JSON files from the OMC_home/sampleData/dashboards directory into Grafana:

  • OCOMC_JVM_Dashboard.json: This dashboard lets you view JVM-related metrics for Offline Mediation Controller.

  • OCOMC_Node_Manager_Summary.json: This dashboard lets you view NAR processing metrics for the Node Manager.

  • OCOMC_Node_Summary.json: This dashboard lets you view NAR processing metrics for all nodes.

  • OCOMC_Summary_Dashboard.json: This dashboard lets you view NAR-related metrics for all Offline Mediation Controller components.

For information about importing dashboards, see "Manage Dashboards" in the Grafana Dashboards documentation.

Using NMShell to Automate Deployment of Node Chains (Patch Set 5 and Later)

Note:

Running NMShell outside of a pod is supported in Offline Mediation Controller 12.0 Patch Set 5 and later releases.

You can use the Offline Mediation Controller Shell (NMShell) tool to:

  • Access Offline Mediation Controller cloud native system information

  • Make node configuration changes

  • Discover the status of one or more nodes

  • Perform start and stop operations, basic alarm monitoring, and node configuration changes

For more information, see "Using the Offline Mediation Controller Shell Tool" and "Managing Nodes Using NMShell Command-Line Components" in Offline Mediation Controller System Administrator's Guide.

You can run the NMShell tool outside of an Offline Mediation Controller pod by creating an NMShell job. The job runs as a post-upgrade or post-install hook and executes the NMShell tool using the commands that are passed in input files. You specify the location of your input files and how to handle errors that occur while processing an input file by using keys in your override-values.yaml file for oc-cn-ocomc-helm-chart.

To create an NMShell job:

  1. Create one or more input files that specify the list of NMShell commands to run as part of the NMShell job. Ensure that you add the appropriate extension to the input file name.

    For example input file content, see "Example: Input File for Adding a Mediation Host" and "Example: Input File with Command Blocks".

  2. Do one of the following:

    • If the job will run as part of a post-install hook, create a Dockerfile that copies your input files to an accessible location. For example:

      FROM oc-cn-ocomc:12.0.0.x.0
      RUN mkdir -p /home/ocomcuser/nmshell
      COPY test.nmshell /home/ocomcuser/nmshell
      COPY export_20220311_024702.nmx /home/ocomcuser/nmshell
      COPY export_20220311_024702.xml /home/ocomcuser/nmshell
    • If you will run the job as part of a post-upgrade hook, copy your input files to the PV that can be accessed by the Administration Server pod (vol-external PVC).

  3. In your override-values.yaml file for oc-cn-ocomc-helm-chart, set the following keys under ocomc.job.runNMShell:

    • flag: Set this to true.

    • fileExtension: Set this to the file name extension of all input files that you want processed, such as .input.

    • inputDir: Set this to the absolute path where input files will be placed, such as /home/ocomcuser/nmshell.

    • strictMode: Specify how to handle errors that occur while processing an input file by setting this key to one of the following:

      • cmd: The tool checks the return code after every NMShell command. If a command fails, the NMShell job stops processing that input file. It then starts processing the next input file in the directory.

      • block: The tool checks the return code after NMShell processes a command block, which is indicated by EOB commands. If a block fails, the NMShell job stops processing that input file. It then starts processing the next input file in the directory.

      • no: The tool processes all input files irrespective of their return codes.

    • hook: Set this key to one of the following:

      • post-upgrade: Specifies to run this job as a post-upgrade hook.

      • post-install: Specifies to run this job as a post-install hook.

    • hookWeight: This key applies only when the scaling down job and the NMShell job are both enabled. The job with the lower hookWeight value runs first. If you want the NMShell job to run first, set hookWeight to a negative number, such as -1.

  4. Run the NMShell job.

    • To run the job as a post-install hook, enter this command:

      helm install ReleaseName oc-cn-ocomc-helm-chart --namespace NameSpace --timeout 15m --values OverrideValuesFile
    • To run the NMShell job as a post-upgrade hook, enter this command:

      helm upgrade ReleaseName oc-cn-ocomc-helm-chart --namespace NameSpace --timeout 15m --values OverrideValuesFile

    where:

    • ReleaseName is the release name, which is used to track this installation instance.

    • NameSpace is the namespace in which to create Offline Mediation Controller Kubernetes objects. To integrate the Offline Mediation Controller cloud native deployment with the ECE and BRM cloud native deployments, they must be set up in the same namespace.

    • OverrideValuesFile is the path to a YAML file that overrides the default configurations in the chart's values.yaml file.
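
The keys described in step 3 could be collected in a small override file, as sketched here. The file name nmshell-job-values.yaml is hypothetical, and the values and their quoting are assumptions chosen for illustration; check the chart's values.yaml file for the exact format that each key expects.

```shell
# Write a hypothetical override fragment for the NMShell job.
# Key names come from step 3; the values are illustrative assumptions.
cat > nmshell-job-values.yaml <<'EOF'
ocomc:
  job:
    runNMShell:
      flag: true
      fileExtension: ".input"
      inputDir: "/home/ocomcuser/nmshell"
      strictMode: "block"
      hook: "post-upgrade"
      hookWeight: -1
EOF
```

Because hook is set to post-upgrade in this sketch, the file would be passed to the helm upgrade command through a --values option.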

After processing each input file, NMShell adds one of these extensions to the input file's name:

  • .err: An error was encountered while running a command in the input file.

  • .done: The input file was processed with strictMode set to no.

  • .success: All of the commands in the input file were processed successfully.
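
To see at a glance how a job run went, you could count the files that carry each extension. This is a minimal sketch that assumes you run it somewhere the input directory is accessible, such as inside a pod or on a host that mounts the same volume. The summarize_nmshell_results helper name is hypothetical.

```shell
# Hypothetical helper, not shipped with Offline Mediation Controller.
# $1 = directory that holds the processed input files (the inputDir value)
summarize_nmshell_results() {
  dir="$1"
  for ext in err done success; do
    count=0
    # Count the files whose names end with the current extension
    for f in "$dir"/*."$ext"; do
      if [ -e "$f" ]; then
        count=$((count + 1))
      fi
    done
    printf '%s: %d\n' "$ext" "$count"
  done
}
```

The function prints one line per extension, in the form extension: count. Any file counted under err points to an input file that needs attention.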

Example: Input File for Adding a Mediation Host

This shows sample input file content that would do the following:

  1. Add a mediation host named Test to node-mgr-app.

  2. Change the context to the node-mgr-app Node Manager.

  3. List the cartridges in node-mgr-app.

addhost -n Test -ip node-mgr-app -p 55109
cd node-mgr-app 55109
ls

Example: Input File with Command Blocks

This shows sample input file content with blocks of code separated by EOB commands. Based on the content, NMShell would do the following:

  1. Add a mediation host named test to node-mgr-app.

  2. Import the node customization data from exportFile.nmx into the test mediation host.

  3. Import the node configuration data from exportFile.xml into the test mediation host.

  4. Start all of the nodes for the currently running mediation host.

  5. Add a mediation host named test1 to node-mgr-app-xyz.

  6. List the cartridges in node-mgr-app-xyz.

  7. Stop all nodes for the currently running mediation host.

If the strictMode key was set to block, the NMShell job would check for return codes after processing the last line in each block. That is, it would check for the return code after performing steps 1, 2, 3, 5, and 7. If an error code was returned by one of these steps, the job would stop processing the input file and add the .err extension to the input file's name.

addhost -n test -ip node-mgr-app -p 55109
EOB
import -n test@node-mgr-app:55109 -f exportFile.nmx -c Y
EOB
import -n test@node-mgr-app:55109 -f exportFile.xml -c N
EOB
startNodes
addhost -n test1 -ip node-mgr-app-xyz -p 55109
EOB
ls
stopNodes
EOB
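
To preview where return codes would be checked when strictMode is set to block, you could list the last command of each block in an input file. The sketch below only reads the file; it does not reflect how NMShell is implemented. It assumes that awk is available, and the last_command_per_block helper name is hypothetical.

```shell
# Hypothetical helper, not shipped with Offline Mediation Controller.
# $1 = path to an NMShell input file that separates blocks with EOB lines
last_command_per_block() {
  awk '
    # An EOB line closes a block: print the last command seen in that block
    NF == 1 && $1 == "EOB" { if (prev != "") print prev; prev = ""; next }
    # Remember the most recent non-empty command line
    NF { prev = $0 }
  ' "$1"
}
```

Run against the sample content above, the function would print the commands that correspond to steps 1, 2, 3, 5, and 7.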

Managing a Helm Release

After you install a Helm chart, Kubernetes manages all of its objects and deployments. All pods created through oc-cn-ocomc-helm-chart are wrapped in a Kubernetes controller, which creates and manages the pods and performs health checks. For example, if a node fails, a controller can automatically replace a pod by scheduling an identical replacement on a different node.

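Administrators can perform the following maintenance tasks on a Helm release: track its status, update it, check its revision history, and roll it back to a previous revision.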

Tracking a Release's Status

When you install a Helm chart, it creates a release. A release contains Kubernetes objects, such as ConfigMap, Secret, Deployment, Pod, PersistentVolume, and so on. Not every object is up and running immediately. Some objects have a start delay, but the Helm install command completes immediately.

To track the status of a release and its Kubernetes objects, enter this command:

helm status ReleaseName -n Namespace

where:

  • ReleaseName is the name you assigned to this installation instance.

  • Namespace is the namespace in which the Offline Mediation Controller Kubernetes objects reside.
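
Because some objects start after the install command returns, it can help to view the pod states alongside the release status. The following sketch assumes that helm and kubectl are both configured for your cluster. The show_release_objects helper name is hypothetical.

```shell
# Hypothetical helper, not shipped with Offline Mediation Controller.
# $1 = release name, $2 = namespace in which the release's objects reside
show_release_objects() {
  # Print the release status, then the current state of the pods in the namespace
  helm status "$1" --namespace "$2"
  kubectl get pods --namespace "$2"
}
```

For example: show_release_objects ReleaseName Namespace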

Updating a Release

To update any key value after a release has been created, enter the following command. It updates or re-creates only the impacted Kubernetes objects, without impacting other objects in the release, and creates a new revision of the release.

helm upgrade ReleaseName oc-cn-ocomc-helm-chart --values OverridingValueFile --values NewOverridingValueFile -n Namespace 

where:

  • ReleaseName is the name you assigned to this installation instance.

  • OverridingValueFile is the path to the YAML file that overrides the default configurations in the oc-cn-ocomc/values.yaml file.

  • NewOverridingValueFile is the path to the YAML file that has updated values. The values in this file take precedence over those defined in values.yaml and OverridingValueFile, because Helm gives priority to the last --values file specified.

  • Namespace is the namespace in which the Offline Mediation Controller Kubernetes objects reside.

Checking a Release's Revision

Helm keeps track of the revisions you make to a release. To check the revision for a particular release, enter this command:

helm history ReleaseName -n Namespace

where:

  • ReleaseName is the name you assigned to this installation instance.

  • Namespace is the namespace in which the Offline Mediation Controller Kubernetes objects reside.

Rolling Back a Release to a Previous Revision

To roll back a release to any previous revision, enter this command:

helm rollback ReleaseName RevisionNumber -n Namespace

where:

  • ReleaseName is the name you assigned to this installation instance.

  • RevisionNumber is the revision to roll back to, as shown in the REVISION column of the helm history output.

  • Namespace is the namespace in which the Offline Mediation Controller Kubernetes objects reside.

Rolling Back an Offline Mediation Controller Cloud Native Upgrade

If you encounter errors after upgrading, you can roll back to a previous patch set version of Offline Mediation Controller.

The following procedure assumes that you have upgraded Offline Mediation Controller from Patch Set 5 (Revision 1) to Patch Set 6 (Revision 2), and then to Patch Set 7 (Revision 3). To roll back your upgrade from Patch Set 7 to Patch Set 6:

  1. Check the revision history of the Offline Mediation Controller release:

    helm history ReleaseName -n Namespace

    You should see something similar to this:

    REVISION   UPDATED                    STATUS        CHART                     APP VERSION    DESCRIPTION
    1          Thu May 30 07:12:46 2030   superseded    oc-cn-ocomc-helm-chart    12.0.0.5.0     Initial install
    2          Thu May 30 08:32:09 2030   superseded    oc-cn-ocomc-helm-chart    12.0.0.6.0     Upgraded successfully
    3          Thu May 30 09:50:00 2030   deployed      oc-cn-ocomc-helm-chart    12.0.0.7.0     Upgraded successfully
  2. Roll back the release to Offline Mediation Controller 12.0 Patch Set 6:

    helm rollback ReleaseName 2 -n Namespace

    If successful, you will see this:

    Rollback was a success! Happy Helming!
  3. Check the revision history of the Offline Mediation Controller release:

    helm history ReleaseName -n Namespace

    If successful, you should see something similar to this:

    REVISION   UPDATED                    STATUS        CHART                     APP VERSION    DESCRIPTION
    1          Thu May 30 07:12:46 2030   superseded    oc-cn-ocomc-helm-chart    12.0.0.5.0     Initial install
    2          Thu May 30 08:32:09 2030   superseded    oc-cn-ocomc-helm-chart    12.0.0.6.0     Upgraded successfully
    3          Thu May 30 09:50:00 2030   superseded    oc-cn-ocomc-helm-chart    12.0.0.7.0     Upgraded successfully
    4          Thu May 30 11:25:00 2030   deployed      oc-cn-ocomc-helm-chart    12.0.0.6.0     Roll back to 2