Resource Monitoring

You can monitor the health, capacity, and performance of your Oracle Cloud Infrastructure resources when needed using queries or on a passive basis using alarms. Queries and alarms rely on metrics emitted by your resource to the Monitoring service.

Prerequisites

  • IAM policies: To monitor resources, you must have the required type of access in a policy written by an administrator, whether you're using the Console or the REST API with an SDK, CLI, or other tool. The policy must give you access to the monitoring services as well as the resources being monitored. If you try to perform an action and get a message that you don’t have permission or are unauthorized, confirm with your administrator the type of access you have and which compartment you should work in. For more information about user authorizations for monitoring, see IAM Policies (Monitoring).
  • Metrics exist in Monitoring: The resources that you want to monitor must emit metrics to the Monitoring service.
  • Compute instances: To emit metrics, the Compute Instance Monitoring plugin must be enabled on the instance, and plugins must be running. The instance must also have either a service gateway or a public IP address to send metrics to the Monitoring service. For more information, see Enabling Monitoring for Compute Instances.

Working with Resource Monitoring

Not all resources support monitoring. See Supported Services for the list of resources that support the Monitoring service, which is required for queries and alarms used in monitoring.

The Monitoring service works with the Notifications service to notify you when metric data satisfies the trigger rule of an alarm. For more information about these services, see Monitoring and Notifications.

To view default metric charts for a resource

On the page for the resource of interest, under Resources, click Metrics.

For example, to view metric data for a Compute instance: 

  1. Open the navigation menu and click Compute. Under Compute, click Instances.
  2. Click the name of the instance that you want to see metrics for.

  3. On the instance details page, under Resources, click Metrics.

    The page displays a chart for each metric. For a list of metrics related to Compute instances, see Compute Instance Metrics.

The Console displays the last hour of metric data for the selected resource. The page shows a chart (graph) for each metric emitted by the selected resource.

For example, default charts for a Compute instance include CPU Utilization and Memory Utilization.

For a list of metrics emitted by the resource, see Supported Services.

To view default metric charts for a set of resources
  1. Open the navigation menu and click Observability & Management. Under Monitoring, click Service Metrics.
    The Service Metrics page opens.
  2. From the Console header, select the region that contains the metric data that you want.
    For more information about regions, see Understand Regions and Working Across Regions.
  3. Choose a compartment that you have permission to work in (on the left side of the page).
    The page lists metric namespaces for the selected region and compartment. For example, if the current compartment contains load balancers, then the page includes oci_lbaas in its list of metric namespaces.
  4. Choose the Metric namespace for the resource types of interest.
    For example, choose oci_lbaas to see metrics for load balancers.

For more information about default metric charts, see Viewing Default Metric Charts.

To create a query
Note: To start with a predefined service query, see Exploring a Default Metric Chart.
  1. Open the navigation menu and click Observability & Management. Under Monitoring, click Metrics Explorer.
  2. From the Console header, select the region that contains the metric data that you want.
    For more information about regions, see Understand Regions and Working Across Regions.
  3. Select a Metric namespace and a Metric name.
    The minimum required fields to view a metric chart are Metric namespace and Metric name.
  4. Click Update Chart.
    The chart shows metric data for the selected metric namespace and metric name, using default values for interval and statistic.
  5. To change the metric chart, update fields and then click Update Chart again.
    For reference, see Creating a Query.

For more information about custom metric charts, see Viewing a Custom Metric Chart. For more information about queries, see Querying Metric Data.
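
As a rough illustration of what the steps above select, the following Python sketch assembles a query expression and a request body by hand. The expression form (metric name, interval in brackets, optional dimension filter, then the statistic) follows the Monitoring Query Language (MQL). The helper names `build_query` and `build_request_body` are hypothetical, the namespace `oci_computeagent` is an assumed example value, and the body field names are assumed to match the SummarizeMetricsData API operation, so verify them against the API reference before use.

```python
from datetime import datetime, timedelta, timezone


def build_query(metric_name, interval="1m", statistic="mean", dimensions=None):
    """Return an MQL-style expression such as CpuUtilization[1m].mean()."""
    dims = ""
    if dimensions:
        # Each dimension becomes a name = "value" filter inside braces.
        pairs = ", ".join(f'{name} = "{value}"' for name, value in dimensions.items())
        dims = "{" + pairs + "}"
    return f"{metric_name}[{interval}]{dims}.{statistic}()"


def build_request_body(namespace, query, end=None, hours=1):
    """Assemble a body covering the last `hours` hours (field names are assumptions)."""
    end = end or datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return {
        "namespace": namespace,
        "query": query,
        "startTime": start.isoformat(),
        "endTime": end.isoformat(),
    }


query = build_query("CpuUtilization", interval="5m", statistic="max")
body = build_request_body("oci_computeagent", query)  # assumed namespace
print(query)  # CpuUtilization[5m].max()
```

Only the metric namespace and metric name are required here, which mirrors the minimum fields needed to view a chart; interval and statistic fall back to defaults chosen for this sketch.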

To create an alarm
  1. Open the navigation menu and click Observability & Management. Under Monitoring, click Alarm Definitions.
  2. Click Create alarm.
    The Create Alarm page opens.
  3. To create an alarm using Basic mode (default view), fill in the fields.
    Field: Description
    Alarm name (under Define alarm)
    User-friendly name for the new alarm. This name is sent as the title for notifications related to this alarm. Avoid entering confidential information.
    Rendering of the title by protocol:
    • Email: Subject line of the email message.
    • HTTPS (Custom URL): Not rendered.
    • PagerDuty: Title field of the published message.
    • Slack: Not rendered.
    • SMS: Not rendered.
    Alarm severity: The perceived type of response required when the alarm is in the firing state.
    Alarm body: The human-readable content of the notification delivered. We recommend providing guidance to operators for resolving the alarm condition. Consider adding links to standard runbook practices. Example: "High CPU usage alert. Follow runbook instructions for resolution."
    Tags (optional): If you have permissions to create a resource, then you also have permissions to apply free-form tags to that resource. To apply a defined tag, you must have permissions to use the tag namespace. For more information about tagging, see Resource Tags. If you're not sure whether to apply tags, skip this option (you can apply tags later) or ask an administrator. See Tagging an Alarm at Creation.
    Compartment (under Metric description): The compartment containing the resources that emit the metrics evaluated by the alarm. The selected compartment is also the storage location of the alarm. By default, the first accessible compartment is selected.
    Metric namespace: The service or application emitting metrics for the resources that you want to monitor. The page lists metric namespaces for the selected compartment. For example, if the current compartment contains load balancers, then the page includes oci_lbaas in its list of metric namespaces.
    Resource group: The group that the metric belongs to. A resource group is a custom string provided with a custom metric. Not applicable to service metrics.
    Metric name

    The name of the metric. Only one metric can be specified. Example: CpuUtilization

    Note: Any OCI metric or custom metric can be selected, as long as data for the metric exists in the selected compartment and metric namespace.

    Interval

    The aggregation window, or the frequency at which data points are aggregated.

    • 1 minute
    • 5 minutes
    • 15 minutes
    • 30 minutes
    • 1 hour
    • 2 hours
    • 6 hours
    • 12 hours
    • 1 day
    • Custom - specify a Custom value and select a Unit (Minutes or Hours)
    Statistic

    The aggregation function.

    • Mean - The value of Sum divided by Count during the specified time period.
    • Rate - The per-interval average rate of change.
    • Sum - All values added together.
    • Max - The highest value observed during the specified time period.
    • Min - The lowest value observed during the specified time period.
    • Count - The number of observations received in the specified time period.
    • P50 - The value of the 50th percentile.
    • P90 - The value of the 90th percentile.
    • P95 - The value of the 95th percentile.
    • P99 - The value of the 99th percentile.
    Metric dimensions: Optional filters to narrow the metric data evaluated. Fill in the fields.
    • Dimension name: A qualifier specified in the metric definition. For example, the dimension resourceId is specified in the metric definition for CpuUtilization.
    • Dimension value: The value that you want to use for the specified dimension, for example, the resource identifier for an instance.
    • Additional dimension: Adds another name-value pair for a dimension.
    • X: Removes the indicated name-value pair for a dimension.
    Aggregate metric streams: Returns the combined value of all metric streams for the selected statistic.
    Trigger rule

    The condition that must be satisfied for the alarm to be in the firing state. The condition can specify a threshold, such as 90% for CPU Utilization, or an absence. Fill in the fields:

    • Operator: The operator used in the condition threshold.
      • greater than
      • greater than or equal to
      • equal to
      • less than
      • less than or equal to
      • between (inclusive of specified values)
      • outside (inclusive of specified values)
      • absent
    • Value: The value to use for the condition threshold.
    • Trigger delay minutes: The number of minutes that the condition must be maintained before the alarm is in the firing state.
    Destination (under Define alarm notifications)

    The provider of the destination to use for alarm notifications.

    Note: If you expect more than 60 messages per minute, select a stream as the notification destination (instead of a topic). For more information, see Alarm Message Limits.

    • Destination service: Select the service to use for alarm notifications, either Notifications or Streaming.
    • Compartment: The compartment storing the resource (such as a topic or stream) to be used for notifications. This compartment can be a different compartment than the one specified for the alarm and metric. By default, the first accessible compartment is selected.
    • Topic (for Notifications only): The topic to use for notifications. Each topic supports one or more subscription protocols, such as PagerDuty.

      Create a topic

      To create a new topic (and a new subscription) in the selected compartment, click Create a topic and then fill in the fields.

      • Topic name: A user-friendly name for the topic. For example, enter: "Operations Team" for a topic used to notify operations staff of firing alarms. Avoid entering confidential information.
      • Topic description: Description of the new topic.
      • Subscription protocol: Medium of communication to use for the new topic. Select the type of subscription that you want to create, then fill in the associated fields.
        • Email:
          • Subscription email: Type an email address.
        • Function:
          • Function Compartment: Select the compartment containing the function that you want.
          • Function Application: Select the application containing the function that you want.
          • Function: Select the function that you want.
        • HTTPS (Custom URL):
          • Subscription URL: Type (or copy and paste) the URL that you want to use as the endpoint.
        • PagerDuty:
          • Subscription URL: Type (or copy and paste) the integration key portion of the URL for the PagerDuty subscription. (The other portions of the URL are hard-coded.)
        • Slack:
          • Subscription URL: Type (or copy and paste) the Slack endpoint, including the webhook token.
        • SMS:
          • Country: Select the country for the phone number.
          • Phone Number: Enter the phone number, using E.164 format. Example: +14255550100
    • Stream (for Streaming only): The stream to use for alarm notifications.
    Message grouping

    Select an option.

    • Group notifications across metric streams: Collectively track metric status across all metric streams. Send a message when metric status across all metric streams changes.
    • Split notifications per metric stream: Individually track metric status by metric stream. Send a message when metric status for each metric stream changes. For an example, see Scenario: Split Messages by Metric Stream.
    Message Format

    Select an option for appearance of messages you receive from this alarm (Notifications destination only).

    • Send formatted messages: Simplified, user-friendly layout. To view supported subscription protocols and message types for formatted messages (options other than Raw), see Friendly formatting.
    • Send Pretty JSON messages (raw text with line breaks): JSON with new lines and indents.
    • Send raw messages: Raw JSON blob.
    Repeat notification?

    While the alarm is in the firing state, resends notifications at the specified interval.

    Notification frequency: The period of time to wait before resending the notification. See Best Practices for Your Alarms.

    Suppress notifications: Sets up a suppression time window during which to suspend evaluations and notifications. Useful for avoiding alarm notifications during system maintenance periods. Specify Start time, End time, and optionally Suppression description. See Best Practices for Your Alarms and Suppressing an Alarm.
    Enable this alarm? When selected, the new alarm is enabled. Metric data is evaluated on creation of the alarm.

    The chart under the Define alarm section dynamically displays the last six hours of emitted metrics according to selected fields for the query. Very small or large values are indicated by International System of Units (SI units), such as M for mega (10 to the sixth power). To switch views of data in the chart, see Switching Table and Graph Views for an Alarm Metric Chart.

  4. Click Save alarm.

    The Alarm Definitions page lists the new alarm. Monitoring begins evaluating the configured metric, sending alarm messages when the metric data satisfies the trigger rule.
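
To make the Trigger rule fields more concrete, the following Python sketch models how an operator, a value, and a trigger delay could combine to decide whether an alarm is in the firing state. It is a simplified illustration, not the Monitoring service's actual evaluation logic: it assumes one aggregated data point per minute, represents a missing data point as `None`, and reads "outside (inclusive)" as "at or below the lower value, or at or above the upper value". The function names `condition_met` and `is_firing` are hypothetical.

```python
def condition_met(value, operator, low=None, high=None):
    """Check one aggregated data point against the trigger rule condition.

    `value` is None when no data arrived for the interval (the "absent" case).
    `low` is the threshold; `high` is only used by "between" and "outside".
    """
    if operator == "absent":
        return value is None
    if value is None:
        return False
    checks = {
        "greater than": lambda: value > low,
        "greater than or equal to": lambda: value >= low,
        "equal to": lambda: value == low,
        "less than": lambda: value < low,
        "less than or equal to": lambda: value <= low,
        "between": lambda: low <= value <= high,            # inclusive
        "outside": lambda: value <= low or value >= high,   # inclusive (assumed reading)
    }
    if operator not in checks:
        raise ValueError(f"unknown operator: {operator}")
    return checks[operator]()


def is_firing(points, operator, low=None, high=None, delay_minutes=1):
    """Return True if the condition held for each of the last `delay_minutes` points."""
    window = max(1, delay_minutes)
    if len(points) < window:
        return False
    return all(condition_met(v, operator, low, high) for v in points[-window:])


# CPU utilization above 90% for three consecutive minutes
cpu = [50, 95, 96, 97]
print(is_firing(cpu, "greater than", 90, delay_minutes=3))
```

A longer trigger delay makes the alarm less sensitive to short spikes, at the cost of a later first notification.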

For more information about creating alarms, see Creating an Alarm.
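
The Statistic options described for alarms can also be illustrated with a small Python sketch that aggregates the raw values received during one interval. This is an assumption-laden example rather than the service's implementation: the percentile uses the nearest-rank method, which may differ from the method Monitoring applies, and Rate is omitted because its exact formula is not given here. The function names `percentile` and `aggregate` are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (assumed method) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def aggregate(values, statistic):
    """Aggregate the observations of one interval using the named statistic."""
    name = statistic.lower()
    if name == "count":
        return len(values)
    if not values:
        return None  # nothing to aggregate in this interval
    if name == "sum":
        return sum(values)
    if name == "mean":
        return sum(values) / len(values)  # Sum divided by Count
    if name == "max":
        return max(values)
    if name == "min":
        return min(values)
    if name in ("p50", "p90", "p95", "p99"):
        return percentile(values, int(name[1:]))
    raise ValueError(f"unsupported statistic: {statistic}")


window = [10, 20, 30, 40]  # observations received in one interval
for stat in ("Mean", "Sum", "Max", "Min", "Count", "P50", "P90"):
    print(stat, aggregate(window, stat))
```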