Operations Insights Metrics

You can use metrics, alarms, and notifications to monitor for conditions where incoming data for any Operations Insights-enabled target has been delayed for the last one or two days.

This topic covers the metrics emitted by the Operations Insights service.

Overview of Operations Insights Metrics

Operations Insights relies on a constant flow of data from a variety of sources, such as Autonomous Databases and Enterprise Manager targets (hosts and databases). The DataFlowDelayInHrs metric lets you monitor data flow interruptions for all enabled targets and quickly identify which sources are having problems.

Required Policies

To monitor resources, you must be given the required type of access in a policy. The policy must give you access to the monitoring services as well as the resources being monitored. If you try to perform an action and get a message that you don't have permission or are unauthorized, confirm with your administrator the type of access you've been granted and which compartment you should work in. For more information on user authorizations for monitoring, see the Authentication and Authorization section for the related service: Monitoring or Notifications.

For information on required Operations Insights policies, see Set Up Groups and Policies.

Metrics

Metric Name: DataFlowDelayInHrs

Dimensions:

resourceId - OPSI ID for the target.

resourceDisplayName - Display name of the target.

telemetrySource - The resource type (Autonomous Database, Enterprise Manager, or Agent Service).

sourceIdentifier - The Enterprise Manager Bridge ID for Enterprise Manager targets, the Management Agent ID for Management Agent-based targets, or the ADW OCID for Autonomous Database targets.

sourceEntityIdentifier - The target GUID for Enterprise Manager targets, or the database ID for Management Agent-based targets.

associatedResourceId - The ADW OCID. Populated only for Autonomous Database targets.

sourceMetricName - Name of the source metric for which the delay is being reported.

dataProcessingFrequencyInHrs - Frequency of data processing, in hours.

Description: Number of hours since data was last processed for the given target and metric.
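To make the metric's meaning concrete, the following minimal sketch computes an "hours since last processed" value from two timestamps. The service emits DataFlowDelayInHrs itself; the function name and the sample timestamps here are hypothetical and serve only to illustrate the arithmetic.

```python
from datetime import datetime, timezone


def data_flow_delay_hours(last_processed: datetime, now: datetime) -> float:
    """Return the hours elapsed between the last processed data and `now`.

    Hypothetical helper: it mirrors the metric's description, not the
    service's internal implementation.
    """
    return (now - last_processed).total_seconds() / 3600.0


# Sample (made-up) timestamps: data last processed 2 days and 6 hours ago.
last_processed = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)

delay = data_flow_delay_hours(last_processed, now)
print(f"Delay: {delay} hours")
```

A value like this one, above 48, is the kind of delay that the one-or-two-day conditions described in this topic are meant to catch.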

Setting Alarms

You can use the Monitoring service's alarm system to alert interested parties when conditions defined for the DataFlowDelayInHrs metric are met. The following table shows some recommended alarms, each with a corresponding Monitoring Query Language (MQL) example that you can use as a template to define your own alarms. For more information about setting up alarms, see Managing Alarms.

Alarm Name: DataFlowSourceAlarmFor1HrData
MQL: DataFlowDelayInHrs[1h]{dataProcessingFrequencyInHrs="1.00"}.grouping(telemetrySource, sourceIdentifier).mean() > 48
Pending duration: 1h
Description: For a telemetrySource and sourceIdentifier with a 1-hour data processing frequency, the mean value of DataFlowDelayInHrs (across targets) is greater than 48 hours for 6 continuous hours. This indicates that the problem is at the whole-source level.

Alarm Name: DataFlowResourceAlarmFor1HrData
MQL: DataFlowDelayInHrs[1h]{dataProcessingFrequencyInHrs="1.00"}.grouping(telemetrySource, resourceId, resourceDisplayName, sourceIdentifier).max() > 24
Pending duration: 1h
Description: For a telemetrySource, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 24 hours for 1 continuous day, for data that is processed every 1 hour.

Alarm Name: DataFlowResourceAlarmFor3HrData
MQL: DataFlowDelayInHrs[3h]{dataProcessingFrequencyInHrs="3.00"}.grouping(telemetrySource, resourceId, sourceIdentifier).max() > 48
Pending duration: 1h
Description: For a telemetrySource, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 48 hours for 1 continuous day, for data that is processed every 3 hours.

Alarm Name: DataFlowResourceAlarmForDailyData
MQL: DataFlowDelayInHrs[3h]{dataProcessingFrequencyInHrs="24.00"}.grouping(telemetrySource, resourceId, sourceIdentifier).mean() > 72
Pending duration: 1h
Description: For a telemetrySource, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 72 hours for 1 continuous day, for data that is processed every 24 hours.
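Because the recommended alarms share one query shape, it can help to generate the MQL from parameters when you adapt the templates. The sketch below is one possible way to do that in Python; the helper name and its parameters are hypothetical and are not part of any Oracle SDK.

```python
def build_delay_alarm_query(interval, frequency_hrs, group_by, statistic, threshold):
    """Build an MQL string in the shape of the alarm templates above.

    Hypothetical helper for illustration only.
    interval      -- query window, for example "1h" or "3h"
    frequency_hrs -- value for the dataProcessingFrequencyInHrs dimension
    group_by      -- list of dimension names to group by
    statistic     -- aggregation to apply, for example "mean" or "max"
    threshold     -- delay in hours above which the alarm should fire
    """
    dims = ", ".join(group_by)
    return (
        f"DataFlowDelayInHrs[{interval}]"
        f'{{dataProcessingFrequencyInHrs="{frequency_hrs:.2f}"}}'
        f".grouping({dims}).{statistic}() > {threshold}"
    )


# Source-level alarm for data that is processed every hour.
source_query = build_delay_alarm_query(
    "1h", 1, ["telemetrySource", "sourceIdentifier"], "mean", 48
)

# Resource-level alarm for data that is processed every 3 hours.
resource_query = build_delay_alarm_query(
    "3h", 3, ["telemetrySource", "resourceId", "sourceIdentifier"], "max", 48
)

print(source_query)
print(resource_query)
```

Grouping only by telemetrySource and sourceIdentifier aggregates across every target behind a source, which suits source-level alarms. Adding resourceId to the grouping evaluates each target separately.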

Using the Console

To view metric charts by dimension

  1. Open the navigation menu and click Observability & Management. Under Monitoring, click Service Metrics.
  2. For Metric Namespace, select oci_operations_insights.

  3. For Dimensions, click Add.

  4. For Dimension Name, select a dimension and then select a Dimension Value.

    Add more dimensions as needed.

  5. Click Done.

    The Service Metrics page displays a default set of charts for the selected metric namespace and dimension. You can also use the Monitoring service to create custom queries.

For more information about monitoring metrics and using alarms, see Monitoring. For information about notifications for alarms, see Notifications Overview.

To view metric charts using Metrics Explorer

  1. Open the navigation menu and click Observability & Management. Under Monitoring, click Metrics Explorer.

    The Metrics Explorer page displays an empty chart with fields to build a query.

  2. Select a compartment.
  3. From Metric Namespace, select oci_operations_insights.
  4. From Metric Name, select a metric.
  5. (Optional) Refine your query.

    For instructions, see To create a query.

  6. Click Update Chart.

    The chart shows the results of your new query. You can optionally add more queries by clicking Add Query below the chart.


Using the API

For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.
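As a rough illustration of querying the metric programmatically, the sketch below assembles a JSON request body of the kind used with the Monitoring service's SummarizeMetricsData operation. The field names (namespace, query, startTime, endTime) are assumptions based on that operation and are not stated in this topic, so verify them against the Monitoring API reference. The helper name is hypothetical, and request signing and the HTTP call itself are not shown.

```python
import json
from datetime import datetime, timedelta, timezone


def summarize_request_body(query, end, lookback_hours=24):
    """Assemble a request body for a metrics query (hypothetical helper).

    Field names are assumed from the Monitoring API's SummarizeMetricsData
    operation; confirm them in the API reference before relying on them.
    """
    start = end - timedelta(hours=lookback_hours)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # UTC timestamps in RFC 3339 form
    return {
        "namespace": "oci_operations_insights",
        "query": query,
        "startTime": start.strftime(fmt),
        "endTime": end.strftime(fmt),
    }


# Example: the maximum delay per hour over the 24 hours before a fixed end time.
body = summarize_request_body(
    "DataFlowDelayInHrs[1h].max()",
    datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```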

Use the following APIs for monitoring: