Metrics Examples

Oracle Communications Unified Assurance Performance Monitoring provides a complete set of tools for gathering any metric, from any device, using any technology, at the granularity required for near real-time data collection. The collected data is stored using the "Big Data" model and enhanced with proactive analysis, monitoring, and reporting, which provides forewarning of anomalies before they become outages. Integration with ticketing systems allows for rapid turnaround, high visibility, and better tracking of troubleshooting issues. The included Knowledgebase system keeps detailed historical and current information and troubleshooting documents in one place, so repeat issues can be quickly found and resolved. Together with ad hoc reports, scheduled reports, and dashboards, this provides a powerful tool for reducing overhead costs and minimizing downtime.

The following example shows how to configure and use performance monitoring in Unified Assurance. Ping and SNMP polling are set up for devices, thresholds are added and configured, and a polling policy is created.

Metric Collection

This example shows how to configure ping polling and SNMP polling for devices, as described in the following sections.

Configuring Ping Polling

The following section covers Ping Polling of devices for Latency and Packet Loss metrics, using Unified Assurance's Default Ping Poller Template.

  1. Navigate to the Metric Types UI:

    From the main navigation menu, select Configuration, then Metrics, and then Metric Types.

    From this UI, you can add, edit and remove metric types, which are used in Unified Assurance to define how a metric is visualized and displayed.

  2. In this example, the Latency and Packet Loss metric types are enabled for TopN viewing (TopN Scope set to Value), so that these metric types become available in the TopN Overview.

    Note:

    This is an optional step. Having the TopN Scope set to "Disabled" will not affect the gathering of metrics. It only affects metric availability and visibility in the TopN Overview.

    • To enable these metrics in the TopN Overview, click each metric type to open it for editing, change the "TopN Scope" value to "Value", and click "Submit" to save the changes.
  3. Navigate to the Poller Templates UI:

    From the main navigation menu, select Configuration, then Metrics, and then Poller Templates.

    • From this UI, you can create, edit and delete Poller Templates.

    • These templates consist of groups of Metric Types, and specify what metrics will be created and stored for the devices and instances that are polled using the template. The Default Ping template includes the Latency, Packet Loss, Ping Jitter and Ping Jitter Utilization metric types.

  4. Navigate to the Polling Assignments UI:

    From the main navigation menu, select Configuration, then Metrics, and then Polling Assignments.

    From this UI, you configure polling on a device instance for metric data.

  5. In the example, ping polling is set up for a selected number of devices.

    • For Method, select "NA".

    • For Poller Template, select "Default Ping".

    • For Threshold Group, select "Default Ping".

      • Thresholds will be covered in detail in a later section.
    • For Poll Time, enter 300.

      • Poll Time is how often the poller polls for these metrics. Poll Time is measured in seconds, so entering 300 means that these metrics are polled every 5 minutes.
  6. In the Devices section, select the devices you want to poll metrics from and use the "Add" or "Add All" buttons to select them for polling.

    • If the list of devices is very large, you can use the filter button to filter specific devices by Device Name, Device ID, Device Group and/or Device Zone.
  7. Using the filter button, filter for the "Device" instances and use the "Add All" button to select the instances.

  8. Click "Submit" to create the metrics for polling.

  9. Navigate to the Services UI:

    From the main navigation menu, select Configuration, then Broker Control, and then Services.

  10. Click on the "Metric Ping Latency Poller" service to open the "Service (Edit)" form to the right of the UI.

  11. In the edit form, change the Status value to "Enabled". Click the "Submit" button to save the changes. The Ping Latency Poller service is now set to enabled.

  12. Select the service again, then click on the Start button to start the ping poller immediately. The Ping Latency Poller will then poll devices for Latency, Packet Loss, Ping Jitter and Ping Jitter Utilization metrics every 5 minutes (assuming the default Unified Assurance application configuration is used). These metrics will be viewable from a number of UI pages, such as the "Device Overview" dashboard and the "All Metrics Overview".

    Note:

    If not manually started, the application will be automatically started within the next minute by the broker.

  13. Navigate to Devices in the navigation bar, and then click the "Metrics" icon for a device to view the data. This interface displays a list of all metrics that are being polled from that device. To filter the list, click the filter icon in the top right of the UI to open the filter bar. Clicking a metric in the list opens a performance graph for that metric.
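
As a rough illustration of what the Latency, Packet Loss, and Ping Jitter metric types represent, the following Perl sketch summarizes one polling cycle of ping replies. It is not the Ping Latency Poller's implementation: the `summarize_ping` function, the sample round-trip times, and the jitter definition (mean absolute difference between consecutive replies) are all assumptions chosen for the example.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: summarize one polling cycle of ping replies.
# $rtts is an array ref of round-trip times in milliseconds;
# undef marks a request that received no reply.
sub summarize_ping {
    my ($rtts) = @_;
    my $sent = scalar @$rtts;
    my @ok   = grep { defined $_ } @$rtts;

    # Packet loss: percentage of requests without a reply.
    my $loss = $sent ? 100 * ($sent - scalar @ok) / $sent : 0;

    # Latency: mean round-trip time of the replies that arrived.
    my $latency = @ok ? sum(@ok) / scalar(@ok) : undef;

    # Jitter (assumed definition): mean absolute difference
    # between consecutive replies.
    my $jitter;
    if (@ok > 1) {
        my @diffs = map { abs($ok[$_] - $ok[$_ - 1]) } 1 .. $#ok;
        $jitter = sum(@diffs) / scalar(@diffs);
    }

    return { latency => $latency, loss => $loss, jitter => $jitter };
}

# Made-up sample: five requests, one of them lost.
my $stats = summarize_ping([10, 12, undef, 11, 15]);
printf "latency=%.2f ms loss=%.1f%% jitter=%.2f ms\n",
    $stats->{latency}, $stats->{loss}, $stats->{jitter};
```

With a Poll Time of 300, one such summary would be produced per device every 5 minutes.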

Configuring SNMP Polling

This section covers the configuration of SNMP-based polling of devices, first with an example using Unified Assurance's "Default CDM" (CPU, Disk, Memory) poller template. This is followed by an example demonstrating how to configure custom rules-based SNMP polling of devices.

Unified Assurance Default CDM (CPU, Disk, Memory)

Note:

The "Polling Assignments" UI is not needed for SNMP Polling. The Generic SNMP Poller is 100% rules based, and will not take any of the configured items from the "Polling Assignments" UI into consideration while running. Configuring the SNMP Poller through "Polling Assignments" may generate incorrect metrics. The only time SNMP-based metrics should be used in the "Polling Assignments" UI is specifically for applying non-rules based thresholds to already defined metrics (thresholds will be covered in a subsequent section).


  1. Navigate to the Metric Types UI:

    From the main navigation menu, select Configuration, then Metrics, and then Metric Types.

  2. As described in the "Configuring Ping Polling" section, in this example, "Memory Used", "CPU Utilization" and "Disk Used" can be enabled for TopN Overview (TopN Scope: Utilization).

    Note:

    This is an optional step and is purely for the purpose of showing these metrics in the TopN Overview. This will have no effect on the polling of these metrics.

  3. Navigate to the Services UI:

    From the main navigation menu, select Configuration, then Broker Control, and then Services.

  4. Click on the "Metric Generic SNMP Poller" service to open the "Service (Edit)" form to the right of the UI.

  5. In the edit form, change the Status value to "Enabled". Click the "Submit" button to save the changes. The Generic SNMP Poller service is now set to enabled.

  6. Select the service again, then click on the Start button to start the SNMP poller immediately. The Generic SNMP Poller will then poll devices for various metrics every 5 minutes (assuming the default Unified Assurance application configuration is used), depending on the rules that are available for that device. These metrics will be viewable from a number of UI pages, such as the "Device Overview" dashboard and the "All Metrics Overview".

    Note:

    If not manually started, the application will be automatically started within the next minute by the broker.

  7. Navigate to Devices in the navigation bar, and then click the "Metrics" icon for a device to view the data. This interface displays a list of all metrics that are being polled from that device. To filter the list, click the filter icon in the top right of the UI to open the filter bar. Clicking a metric in the list opens a performance graph for that metric.

Custom Metrics - UPS Metrics

This second example demonstrates how to write your own rules files for polling custom SNMP metrics. In this example, rules files are written to poll UPS data from a device. The metrics polled include:

  1. Battery temperature

  2. Battery runtime

  3. Battery capacity

  4. Input and output voltage

  5. Output load (%)

Note:

The custom UPS rules in this example are provided only as an example, demonstrating how to write your own custom metrics rules files. A default installation includes a set of rules files for polling a variety of devices from numerous different vendors. The following documentation has information regarding supported devices and other useful information:

Please contact Oracle Communications if there are devices that are not polled by the out-of-the-box Foundation rules.

  1. Navigate to the Metric Types UI:

    From the main navigation menu, select Configuration, then Metrics, and then Metric Types.

  2. Add new metric types to Unified Assurance, giving them the values shown in the table below.

    Note:

    In this example, the TopN Scope is set to "Disabled". If you want these metrics to be available in the TopN Overview, set TopN Scope to "Utilization" or "Value", as appropriate for the metric.

    | Name                    | Metric Group | Format  | Unit Name  | Abbreviation Name | Value Type  | Unit Division | Direction           | TopN Type | TopN Scope |
    |-------------------------|--------------|---------|------------|-------------------|-------------|---------------|---------------------|-----------|------------|
    | UPS Battery Capacity    | None         | Float   | Capacity   | %                 | Utilization | SI (1000)     | Descending (Normal) | Both      | Disabled   |
    | UPS Battery Runtime     | None         | Integer | Seconds    | s                 | Raw         | Time          | Descending (Normal) | Both      | Disabled   |
    | UPS Battery Temperature | None         | Float   | Celsius    | C                 | Raw         | SI (1000)     | Descending (Normal) | Both      | Disabled   |
    | UPS Input Voltage       | None         | Float   | Volts      | V                 | Raw         | None          | Descending (Normal) | Both      | Disabled   |
    | UPS Output Voltage      | None         | Float   | Volts      | V                 | Raw         | None          | Descending (Normal) | Both      | Disabled   |
    | UPS Output Load %       | None         | Float   | Percentage | %                 | Raw         | SI (1000)     | Descending (Normal) | Both      | Disabled   |
  3. Navigate to the Rules UI:

    From the main navigation menu, select Configuration, and then Rules.

    • The UI contains a list of rules directories and subdirectories.

    • Click on the "right arrow" symbol to the immediate left of a folder icon to expand that directory. Clicking on the "down arrow" symbol will collapse the directory.

  4. Click to expand Core Rules (core) -> Default read-write branch (default) -> collection -> metric -> snmp.

  5. Click to select the "snmp" folder, then click the "Add" button, and click "Add File" to add a new rules file (the form will appear to the right of the UI).

  6. Enter an appropriate name for the rules in the File Name field (e.g. ups-snmp.rules).

  7. The rules logic (Perl syntax) is entered in the text area underneath the File Name field. The following is example code for the UPS rules file.

    my $DeviceID     = $DeviceHash->{DeviceID};
    my $DeviceInfo   = $DeviceHash->{DeviceID} . ':' . $DeviceHash->{DNS} . ':' . $DeviceHash->{IP};
    my $PollInterval = $PollerConfig->{'PollTime'};
    my $PolledTime   = $DeviceHash->{PollTime};
    
    $Log->Message("INFO","ups-snmp.rules -> [$DeviceInfo] -> Entering ups-snmp.rules");
    

    Here you specify the OIDs you want to poll for data. The OIDs used in this example were taken from the "PowerNet-MIB" MIB file.

    # OIDs to be polled
    my %OIDs = (
        'upsAdvBatteryCapacity'         => '1.3.6.1.4.1.318.1.1.1.2.2.1.0',  # Battery capacity (%)
        'upsAdvBatteryRunTimeRemaining' => '1.3.6.1.4.1.318.1.1.1.2.2.3.0',  # Battery run time remaining
        'upsAdvBatteryTemperature'      => '1.3.6.1.4.1.318.1.1.1.2.2.2.0',  # Temperature in Celsius
        'upsAdvInputLineVoltage'        => '1.3.6.1.4.1.318.1.1.1.3.2.1.0',  # Input voltage
        'upsAdvOutputVoltage'           => '1.3.6.1.4.1.318.1.1.1.4.2.1.0',  # Output voltage
        'upsAdvOutputLoad'              => '1.3.6.1.4.1.318.1.1.1.4.2.3.0'   # Output load (%)
    );
    
    my %metricNames = reverse %OIDs;
    

    Next, match the OIDs to be polled with their corresponding MetricTypeIDs in Unified Assurance (created in steps 1 and 2). NOTE: The MetricTypeIDs shown in this rules file example will probably differ from the IDs of the metric types that you create.

    # Match the MetricTypeIDs in Unified Assurance with the OIDs to poll
    my %MetricTypeIDs = (
        'upsAdvBatteryCapacity'         => '1015',
        'upsAdvBatteryRunTimeRemaining' => '1016',
        'upsAdvBatteryTemperature'      => '1017',
        'upsAdvInputLineVoltage'        => '1018',
        'upsAdvOutputVoltage'           => '1019',
        'upsAdvOutputLoad'              => '1020',
    );
    
    $Session->translate([ -timeticks => 0 ]);  # Tell the SNMP client to return TimeTicks as raw values rather than a formatted time string
    # Dividing $result by 100 then gives the time in seconds
    

    Next, grab the available metrics from the device for polling. This is done via $Session->get_request:

    # Grab available metrics from device for polling
    my $DeviceData= $Session->get_request (
        -varbindlist => [
            $OIDs{'upsAdvBatteryCapacity'},
            $OIDs{'upsAdvBatteryRunTimeRemaining'},
            $OIDs{'upsAdvBatteryTemperature'},      
            $OIDs{'upsAdvInputLineVoltage'},
            $OIDs{'upsAdvOutputVoltage'},
            $OIDs{'upsAdvOutputLoad'},      
        ]
    );
    

    Finally, iterate through the polled metrics and update their values in Unified Assurance:

    # Iterate through polled metrics and update each one in Unified Assurance
    foreach my $thisOID (keys(%{$DeviceData})) {
        my $result = $DeviceData->{$thisOID};
        my $metricName = $metricNames{$thisOID}; 
        my $MetricTypeID = $MetricTypeIDs{$metricName};
        $Log->Message("DEBUG", "UPS rules ->  [$metricName], oid: [$thisOID] value: [$result] type: [$MetricTypeID]");
    
        $InstanceID = 0;
        #$Log->Message('DEBUG', "UPS rules -> Searching for InstanceID for [$InstanceName] on DeviceID[$DeviceID]");
        #($InstanceID, $Error) = FindInstanceID($RulesDBH, $MetricHash, $Log, $DeviceID, $InstanceName, 1); # Not necessary, as InstanceID is already specified (0)    
        #$Log->Message('DEBUG', "UPS rules -> Found InstanceID: $InstanceID for [$InstanceName]");
    
        my ($MetricID, $Error) = FindMetricID($RulesDBH, $MetricHash, $Log, $DeviceID, $InstanceID, $MetricTypeID, $Factor, $max, $PollInterval);
        $Log->Message('DEBUG', "UPS rules -> created/updated metric [$MetricID] for [$InstanceID]");
    
        # Convert the Battery Runtime metric from TimeTicks (hundredths of a second) to seconds
        # (with translation enabled, the runtime value would look like this example: [1 hour, 20:00.00])
        if($thisOID eq '1.3.6.1.4.1.318.1.1.1.2.2.3.0') {
            my $convertedTime = $result/100;
            $Log->Message("DEBUG", "UPS rules -> [$DeviceID]DataQueue params: metricid[$MetricID], value[$convertedTime], status [$Status], polltime[$PolledTime]");
            $DataQueue->enqueue($MetricID. ':' . $convertedTime . ':' . $Status . ':' . $PolledTime);
            $Log->Message("DEBUG", "UPS rules -> Finished with oid [$metricName]");
        }
        else {
            $Log->Message("DEBUG", "UPS rules -> [$DeviceID]DataQueue params: metricid[$MetricID], value[$result], status [$Status], polltime[$PolledTime]");
            $DataQueue->enqueue($MetricID. ':' . $result . ':' . $Status . ':' . $PolledTime);
            $Log->Message("DEBUG", "UPS rules -> Finished with oid [$metricName]");
        }
    }
    $Log->Message("INFO", "Exiting ups-snmp.rules");
    

    It is good practice to include log messages in your rules file, to aid in the debugging process, should anything not work as intended.

    You will also need to update the 'base.rules' file and 'base.includes' file, to include the new rules file you just created:

    base.rules

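    elsif($SysObjectID =~ /^\.?1\.3\.6\.1\.4\.1\.318(?!\d)/) { # UPS; dots are escaped so that they match literally and 318 is not matched as part of a longer number
        $Log->Message("WARN","Base Rules -> [$DeviceInfo] -> Polling using ups-snmp rules");
        UPSsnmpRules();
    }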
    

    base.includes

    UPSsnmpRules,metricStdPoller/snmp/ups-snmp.rules
    
  8. Navigate to the Services UI:

    From the main navigation menu, select Configuration, then Broker Control, and then Services.

  9. Click to select the "Metric Generic SNMP Poller" service and ensure that the application configuration has "LogLevel" set to DEBUG.

  10. Click the restart button to restart the service.

    • When the service is restarted, the new UPS rules file and new log level will be taken into account. The poller will now poll for UPS metrics using the UPS rules file.
  11. Navigate to Logs.

  12. Use the filter bar to enter the following query, replacing the number in parentheses (35 in this example) with the value for the poller from the "Services" UI. The query filters the log using the keyword "ups":

    event.dataset:"GenericSNMPPollerd(35)" and message : "ups"
    
  13. Navigate to Devices in the navigation bar, and then click the "Metrics" icon for a UPS device to view the data. This interface displays a list of all metrics that are being polled from that device. To filter the list, click the filter icon in the top right of the UI to open the filter bar. Clicking a metric in the list opens a performance graph for that metric.
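
Two techniques in the UPS rules file can be tried outside Unified Assurance: the reverse lookup from a polled OID back to its metric name, and the conversion of raw TimeTicks to seconds. The following Perl sketch runs them against a mocked SNMP reply. The helper names `timeticks_to_seconds` and `resolve_metric`, the `%mock_reply` values, and the reduced set of two OIDs are assumptions made for the example. The sketch does not call the rules environment (`$Session`, `FindMetricID`, `$DataQueue`), and the MetricTypeIDs will differ on your system.

```perl
use strict;
use warnings;

# OID names and values copied from the UPS rules file (two of the six).
my %OIDs = (
    'upsAdvBatteryCapacity'         => '1.3.6.1.4.1.318.1.1.1.2.2.1.0',
    'upsAdvBatteryRunTimeRemaining' => '1.3.6.1.4.1.318.1.1.1.2.2.3.0',
);

# Reverse the hash so that a polled OID maps back to its metric name.
my %metricNames = reverse %OIDs;

# Example MetricTypeIDs; the real IDs depend on your installation.
my %MetricTypeIDs = (
    'upsAdvBatteryCapacity'         => '1015',
    'upsAdvBatteryRunTimeRemaining' => '1016',
);

# Hypothetical helper: TimeTicks are hundredths of a second.
sub timeticks_to_seconds {
    my ($ticks) = @_;
    return $ticks / 100;
}

# Hypothetical helper: resolve a polled OID to its metric name,
# MetricTypeID, and the value that would be stored.
sub resolve_metric {
    my ($oid, $raw) = @_;
    my $name = $metricNames{$oid};
    return unless defined $name;    # OID is not one we asked for
    my $value = $name eq 'upsAdvBatteryRunTimeRemaining'
        ? timeticks_to_seconds($raw)
        : $raw;
    return ($name, $MetricTypeIDs{$name}, $value);
}

# Mocked reply in the shape of a get_request result (OID => value).
my %mock_reply = (
    '1.3.6.1.4.1.318.1.1.1.2.2.1.0' => 100,       # 100 % capacity
    '1.3.6.1.4.1.318.1.1.1.2.2.3.0' => 480000,    # raw TimeTicks
);

foreach my $oid (sort keys %mock_reply) {
    my ($name, $type_id, $value) = resolve_metric($oid, $mock_reply{$oid});
    print "$name (type $type_id) = $value\n";
}
```

The mocked runtime of 480000 TimeTicks converts to 4800 seconds, which is the 1 hour 20 minutes shown in the translated form `[1 hour, 20:00.00]` mentioned in the rules file comment.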

Configuring Thresholds

This section covers configuring thresholds for the UPS Metrics from the previous section.

Thresholds can detect and give early warning for problems that may exist for metric data being collected. The Standard Thresholding Engine analyzes the threshold definitions (defined in the Thresholds UI), looks at the metric database for the status, and creates an event if the defined limit is breached. You can also optionally set up watcher policies to send email or syslog notifications for thresholds using the Event Watcher Custom Correlation Engine. See EventWatcherd and Watcher Policies for more information.

To define thresholds for the UPS metrics set up in the previous section:

  1. Create thresholds:

    1. Navigate to the Thresholds UI:

      From the main navigation menu, select Configuration, then Metrics, and then Thresholds.

    2. Click Add.

      The Threshold (new) form appears.

    3. Create a threshold that is violated when the battery temperature reaches 50 degrees Celsius or higher by setting the following fields:

      • Name: UPS High Battery Temp

      • Type: Standard

      • Measurement: UPS Battery Temperature

      • Metric Field: value

      • Time Range: 15m

      • Warning:

        • Warning Operator: >=

        • Warning Value: 50

        • Warning Severity: Major

      • Critical:

        • Critical Operator: >=

        • Critical Value: 70

        • Critical Severity: Critical

      • Message: Performance threshold violation: UPS High Battery Temp

      • Check Location: Threshold Engine

      • Status: Enabled

    4. Click Submit.

    5. Add similar thresholds for the rest of the UPS metrics using the Threshold UI. The following are some examples that could be configured:

      • UPS Battery Runtime

      • UPS Output Load %

      • UPS Output Voltage Surge

      • UPS Input Voltage Surge

  2. Create a threshold group:

    1. Navigate to the Threshold Groups UI:

      From the main navigation menu, select Configuration, then Metrics, and then Threshold Groups.

      You can use this UI to group individual thresholds together to form a threshold group for polling assignments.

    2. Click Add.

      The Threshold Group (new) form appears.

    3. In Name, enter Default UPS.

    4. Add the UPS thresholds to the group by selecting them from the list of available thresholds, and clicking Add.

    5. Click Submit.

  3. Add the thresholds to the metrics:

    1. Navigate to the Polling Assignments UI:

      From the main navigation menu, select Configuration, then Metrics, and then Polling Assignments.

    2. Add the UPS devices for polling using their group by setting the following fields:

      • Method: SNMP

      • Poller Template: Default UPS

      • Threshold Group: Default UPS

      • Poll Time: 300

      • Devices: Select any UPS devices on your system, limited to the Device instance for each device.

    3. Click Submit.

  4. Ensure that the Metric Standard Thresholding Engine service is enabled:

    1. Navigate to the Services UI:

      From the main navigation menu, select Configuration, then Broker Control, and then Services.

    2. Select the filter button, then enter Metric Standard Thresholding Engine in the Name column to locate the service quickly.

    3. Confirm that the Status is Enabled and the State is Running.

      If not, select it, change Status to Enabled, click Submit, then click Start.
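
The warning and critical levels of the "UPS High Battery Temp" threshold can be read as a small decision rule. The following Perl sketch evaluates a single temperature value against those two levels. It is only an illustration of the rule, not the Standard Thresholding Engine's implementation: the `evaluate_threshold` function and the hash layout are assumptions, and the sketch ignores the Time Range setting by looking at one value at a time.

```perl
use strict;
use warnings;

# Threshold definition mirroring the "UPS High Battery Temp" fields.
my %threshold = (
    name     => 'UPS High Battery Temp',
    warning  => { op => '>=', value => 50, severity => 'Major' },
    critical => { op => '>=', value => 70, severity => 'Critical' },
);

# Comparison operators supported by this sketch.
my %compare = (
    '>=' => sub { $_[0] >= $_[1] },
    '>'  => sub { $_[0] >  $_[1] },
    '<=' => sub { $_[0] <= $_[1] },
    '<'  => sub { $_[0] <  $_[1] },
);

# Hypothetical helper: return the severity raised by a value, or undef
# if no level is breached. The critical level is checked first so that
# it takes precedence over the warning level.
sub evaluate_threshold {
    my ($def, $value) = @_;
    foreach my $level (qw(critical warning)) {
        my $rule = $def->{$level};
        my $test = $compare{ $rule->{op} } or next;
        return $rule->{severity} if $test->($value, $rule->{value});
    }
    return;
}

foreach my $temp (45, 50, 72) {
    my $severity = evaluate_threshold(\%threshold, $temp) // 'no violation';
    print "$temp C -> $severity\n";
}
```

Checking the critical level before the warning level matters because a value of 72 satisfies both conditions, and only the more severe result should be reported.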

Configuring Polling Policies

The Poller Discovery scheduled job uses polling policy settings to search for devices to process, creates the types of metrics to poll the devices for based on the selected poller template, and then assigns thresholds based on the selected threshold group. Essentially, this is a simple, automated, and dynamic way to create and maintain metrics and threshold settings for certain devices and instances, rather than manually creating them using the Polling Assignments UI.

To set up an example Network Interface polling policy for routers:

  1. Navigate to the Polling Policies UI:

    From the main navigation menu, select Configuration, then Metrics, and then Polling Policies.

  2. Click Add.

    The Polling Policy (new) form appears.

  3. Set the following fields (leave the default setting for other fields):

    • Name: Router Network Interface

    • Description: Network Interface Metric Polling Policy

    • Policy Status: Enabled

    • Match:

      • IP Range: 192.0.2.*

        Optionally, specify a specific range of IP addresses to search. If left blank, the scheduled job will search every device.

      • Device Category: Router

        The scheduled job will only use this polling policy on Router devices.

      • Instance:

        • Match: LIKE

        • Name: eth

        The scheduled job will only process instances that match the provided name.

    • Assign:

      • Method: SNMP

      • Poller Template: Default Network Interface

      • Threshold Group: Default Network Interface

  4. Click Submit.

  5. Navigate to the Jobs UI:

    From the main navigation menu, select Configuration, then Broker Control, and then Jobs.

  6. Ensure that the Metric Poller Discovery scheduled job is set to Enabled.
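
To make the Match settings concrete, the following Perl sketch checks a few made-up device instances against the example policy. It is an approximation for illustration only: the `ip_pattern_to_regex` and `policy_matches` functions and the candidate devices are hypothetical, and treating LIKE as a case-insensitive substring match is an assumption about how the Poller Discovery job interprets it.

```perl
use strict;
use warnings;

# Match settings taken from the example polling policy.
my %policy = (
    ip_range        => '192.0.2.*',
    device_category => 'Router',
    instance_name   => 'eth',    # used with Match: LIKE
);

# Hypothetical helper: convert an IP pattern with "*" wildcards
# into an anchored regular expression.
sub ip_pattern_to_regex {
    my ($pattern) = @_;
    my $re = join '\.',
        map { $_ eq '*' ? '\d{1,3}' : quotemeta($_) } split /\./, $pattern;
    return qr/^$re$/;
}

# Hypothetical helper: decide whether a device instance is selected.
# A blank IP range means that every device is considered.
sub policy_matches {
    my ($pol, $device, $instance) = @_;
    if (defined $pol->{ip_range} && length $pol->{ip_range}) {
        my $ip_re = ip_pattern_to_regex($pol->{ip_range});
        return 0 unless $device->{ip} =~ $ip_re;
    }
    return 0 unless $device->{category} eq $pol->{device_category};
    # Assumption: LIKE behaves as a case-insensitive substring match.
    return index(lc($instance), lc($pol->{instance_name})) >= 0 ? 1 : 0;
}

# Made-up candidates: [ device, instance name ].
my @candidates = (
    [ { ip => '192.0.2.10',   category => 'Router' }, 'eth0' ],
    [ { ip => '192.0.2.10',   category => 'Router' }, 'lo'   ],
    [ { ip => '198.51.100.7', category => 'Router' }, 'eth0' ],
    [ { ip => '192.0.2.20',   category => 'Switch' }, 'eth1' ],
);

foreach my $c (@candidates) {
    my ($device, $instance) = @$c;
    printf "%-13s %-7s %-5s %s\n",
        $device->{ip}, $device->{category}, $instance,
        policy_matches(\%policy, $device, $instance) ? 'match' : 'skip';
}
```

Only the first candidate satisfies all three conditions. The others fail on the instance name, the IP range, and the device category, in that order.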

Calculation Policies

You can use the Metric Post-Collection Calculation Engine (PCCE) to combine individual metrics to create a meta-metric. For example, you could define a meta-metric for total inbound bandwidth, where the metric data for all inbound interfaces is summed up and saved as a separate metric. You can use meta-metrics for thresholding, SLM monitoring, and so on.

To use PCCE:

  1. Define meta-metrics as collections, by using the Collections UI.

  2. Define how the meta-metrics are handled by using the Calculations UI.

    The calculation policies use Perl-syntax code to do special processing on the metric data.

  3. Set up a job to run PCCE by using the Jobs UI.

See MetricPostCalculator for more information about PCCE.
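
The total inbound bandwidth example can be sketched in Perl, the language that calculation policies use. The following is a standalone illustration of the summing step only, not a working calculation policy: the `total_inbound` function, the interface names, and the sample values are made up, and the sketch does not use any PCCE variables or functions.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical latest samples: inbound bandwidth per interface,
# in bits per second. undef marks an interface without a sample.
my %inbound_bps = (
    eth0 => 120_000_000,
    eth1 =>  80_000_000,
    eth2 =>   5_500_000,
    eth3 => undef,
);

# Hypothetical helper: sum the member metrics of a collection into
# one meta-metric value. Members without a sample are ignored, and
# an empty collection yields 0.
sub total_inbound {
    my ($samples) = @_;
    return sum(0, grep { defined $_ } values %$samples);
}

printf "Total inbound bandwidth: %d bps\n", total_inbound(\%inbound_bps);
```

The resulting total would be saved as its own metric, which is what makes it available for thresholding like any collected metric.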