Monitoring Oracle NoSQL Database Cloud Service

The Oracle Cloud Infrastructure Monitoring service enables you to actively and passively monitor your cloud resources using the Metrics and Alarms features. The Monitoring service uses metrics to monitor resources and alarms to notify you when these metrics meet alarm-specified triggers.

A metric is a measurement related to the health, capacity, or performance of a given resource. An alarm is a trigger rule and query. Alarms passively monitor your cloud resources by using metrics. You can configure notification settings when creating an alarm.

Metrics are emitted to the Monitoring service as raw data points (a timestamp-value pair for a specified metric) along with dimensions (a resource identifier provided in the metric definition) and metadata. The Monitoring service publishes alarm messages to configured destinations managed by the Notifications service.

When you query a metric, the Monitoring service returns aggregated data according to the specified parameters. You can specify a range (such as the last 24 hours), a statistic, and an interval. A statistic is the aggregation function applied to the raw data points, for example SUM. An interval is the time window over which a given set of raw data points is aggregated, for example 5 minutes.
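To make the relationship between raw data points, interval, and statistic concrete, here is a minimal Python sketch that applies a SUM statistic over 5-minute windows. The function name `aggregate_sum`, the sample values, and the window alignment (to multiples of the interval since the Unix epoch) are assumptions for illustration only; this is not how the Monitoring service is implemented.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def aggregate_sum(points, interval_minutes):
    """Group (timestamp, value) pairs into fixed windows and sum each window.

    Window alignment to the Unix epoch is an assumption of this sketch.
    """
    window_seconds = interval_minutes * 60
    buckets = defaultdict(float)
    for timestamp, value in points:
        epoch = int(timestamp.timestamp())
        window_start = epoch - (epoch % window_seconds)
        buckets[datetime.fromtimestamp(window_start, tz=timezone.utc)] += value
    return dict(sorted(buckets.items()))


# Ten illustrative raw data points, one per minute, with values 0.0 to 9.0.
base = datetime(2022, 2, 17, 11, 0, tzinfo=timezone.utc)
raw_points = [(base + timedelta(minutes=m), float(m)) for m in range(10)]

result = aggregate_sum(raw_points, interval_minutes=5)
for window_start, total in result.items():
    print(window_start.isoformat(), total)
```

With a 5-minute interval, the ten one-minute points collapse into two aggregated data points, one per window.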

The Console displays one monitoring chart per metric for selected resources. The aggregated data in each chart reflects your selected statistic and interval. API requests can optionally filter by dimension and specify a resolution. API responses include the metric name along with its source compartment and metric namespace (which indicates the resource, service, or application that emits the metric). The namespace is provided in the metric definition. For example, the CpuUtilization metric definition emitted by Oracle Cloud lists the oci_computeagent metric namespace as the source of the metric.

Metric and alarm data is accessible via the Console, CLI, and API. For more information about OCI Monitoring service concepts, see Monitoring Concepts.

Viewing or Listing Oracle NoSQL Database Cloud Service Metrics

You can view the metrics available for Oracle NoSQL Database Cloud Service from the Console. You can also list them using OCI CLI commands. To view the metrics from the Console:

  1. Open the navigation menu and click Observability & Management. Under Monitoring, click Service Metrics.
  2. Select the Compartment and Metric namespace (oci_nosql).

To list the metrics using the OCI CLI, run the following command from the Cloud Shell. It returns the metric definitions that match the criteria specified in the request. The compartment OCID is required. For more information about the OPTIONS available with the list command, see List Metrics.

oci monitoring metric list --compartment-id <Compartment_OCID> --namespace oci_nosql

For example:
oci monitoring metric list --compartment-id ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya --namespace oci_nosql
Example response:
{
  "data": [
    {
      "compartment-id": "ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya",
      "dimensions": {
        "resourceId": "ocid1_nosqltable_oc1_phx_amaaaaaau7x7rfyasvdkoclhgryulgzox3nvlxb2bqtlxxsrvrc4zxr6lo4a",
        "tableName": "demo"
      },
      "name": "ReadThrottleCount",
      "namespace": "oci_nosql",
      "resource-group": null
    },
    {
      "compartment-id": "ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya",
      "dimensions": {
        "resourceId": "ocid1_nosqltable_oc1_phx_amaaaaaau7x7rfyasvdkoclhgryulgzox3nvlxb2bqtlxxsrvrc4zxr6lo4a",
        "tableName": "demo"
      },
      "name": "ReadUnits",
      "namespace": "oci_nosql",
      "resource-group": null
    },
    {
      "compartment-id": "ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya",
      "dimensions": {
        "resourceId": "ocid1_nosqltable_oc1_phx_amaaaaaau7x7rfyasvdkoclhgryulgzox3nvlxb2bqtlxxsrvrc4zxr6lo4a",
        "tableName": "demo"
      },
      "name": "StorageGB",
      "namespace": "oci_nosql",
      "resource-group": null
    },
    {
      "compartment-id": "ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya",
      "dimensions": {
        "resourceId": "ocid1_nosqltable_oc1_phx_amaaaaaau7x7rfyasvdkoclhgryulgzox3nvlxb2bqtlxxsrvrc4zxr6lo4a",
        "tableName": "demo"
      },
      "name": "StorageThrottleCount",
      "namespace": "oci_nosql",
      "resource-group": null
    },
    {
      "compartment-id": "ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya",
      "dimensions": {
        "resourceId": "ocid1_nosqltable_oc1_phx_amaaaaaau7x7rfyasvdkoclhgryulgzox3nvlxb2bqtlxxsrvrc4zxr6lo4a",
        "tableName": "demo"
      },
      "name": "WriteThrottleCount",
      "namespace": "oci_nosql",
      "resource-group": null
    },
    {
      "compartment-id": "ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya",
      "dimensions": {
        "resourceId": "ocid1_nosqltable_oc1_phx_amaaaaaau7x7rfyasvdkoclhgryulgzox3nvlxb2bqtlxxsrvrc4zxr6lo4a",
        "tableName": "demo"
      },
      "name": "WriteUnits",
      "namespace": "oci_nosql",
      "resource-group": null
    }
  ]
}
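A response like the one above can be post-processed with a few lines of Python. The sketch below groups metric names by the tableName dimension. The helper name `metrics_by_table` is hypothetical, and the embedded sample is an abbreviated copy of the response shape shown above, not additional service output.

```python
import json

# Abbreviated sample in the same shape as the CLI response above.
sample_response = """
{
  "data": [
    {"name": "ReadThrottleCount", "namespace": "oci_nosql",
     "dimensions": {"tableName": "demo"}},
    {"name": "ReadUnits", "namespace": "oci_nosql",
     "dimensions": {"tableName": "demo"}},
    {"name": "StorageGB", "namespace": "oci_nosql",
     "dimensions": {"tableName": "demo"}}
  ]
}
"""


def metrics_by_table(response_text):
    """Return {table name: sorted list of metric names} from a list response."""
    tables = {}
    for definition in json.loads(response_text)["data"]:
        table = definition["dimensions"].get("tableName", "<unknown>")
        tables.setdefault(table, []).append(definition["name"])
    return {table: sorted(names) for table, names in tables.items()}


summary = metrics_by_table(sample_response)
print(summary)
```

In practice you would feed the function the text captured from the CLI command rather than an embedded sample.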

How to Collect Oracle NoSQL Database Cloud Service Metrics?

You can build metric queries for collecting specific sets of metrics (aggregated data). A metric query contains the Monitoring Query Language (MQL) expression to evaluate for returning aggregated data. The query must specify a metric, statistic, and interval.

You can use metric queries to actively and passively monitor your cloud resources. Actively monitor with metric queries that you generate spontaneously, on-demand. In the Console, update a chart to show data from multiple queries. Store queries you want to reuse. Passively monitor with alarms that add a condition, or trigger rule, to a metric query.

Metric query syntax (the metric, interval, and statistic are required):
metric[interval]{dimensionname=dimensionvalue}.groupingfunction.statistic
Threshold alarm query syntax (the metric, interval, statistic, alarm operator, and alarm value are required):
metric[interval]{dimensionname=dimensionvalue}.groupingfunction.statistic alarmoperator alarmvalue

For supported parameter values, see Monitoring Query Language (MQL) Reference.
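The query syntax can be assembled programmatically. The following Python sketch composes an MQL expression string from its parts. The function `build_mql` and its parameter names are hypothetical; it only concatenates text and does not validate the expression against the MQL reference.

```python
def build_mql(metric, interval, statistic, dimensions=None,
              grouping=None, operator=None, value=None):
    """Compose metric[interval]{dims}.grouping.statistic [operator value]."""
    if (operator is None) != (value is None):
        raise ValueError("operator and value must be given together")

    selector = ""
    if dimensions:
        pairs = ", ".join(f'{name} = "{val}"' for name, val in dimensions.items())
        selector = "{" + pairs + "}"

    query = f"{metric}[{interval}]{selector}"
    if grouping:
        query += f".{grouping}"
    query += f".{statistic}"
    if operator is not None:
        # Appending an operator and value turns the query into a threshold alarm query.
        query += f" {operator} {value}"
    return query


simple = build_mql("StorageThrottleCount", "1m", "sum()")
filtered = build_mql("StorageThrottleCount", "1m", "sum()",
                     dimensions={"tableName": "demoKeyVal"})
alarm = build_mql("StorageGB", "1m", "percentile(0.9)",
                  dimensions={"tableName": "demo"}, operator=">", value=85)
print(simple)
print(filtered)
print(alarm)
```

Keeping the required parts (metric, interval, statistic) as positional arguments and the optional parts as keywords mirrors the syntax description above.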

Example Queries

Simple metric query

Sum of Storage Throttle counts for all the tables in a compartment at a one-minute interval.

The number of lines displayed in the metric chart (Console): 1 per table.

StorageThrottleCount[1m].sum()

Filtered metric query

Sum of Storage Throttle counts in a compartment at a one-minute interval, filtered to a single table.

The number of lines displayed in the metric chart (Console): 1.

StorageThrottleCount[1m]{tableName = "demoKeyVal"}.sum()

Aggregated metric query

Average of Read Units at a sixty-minute interval, filtered to a compartment and aggregated across all metric streams.

The number of lines displayed in the metric chart (Console): 1.

ReadUnits[60m]{compartmentId="ocid1.compartment.oc1.phx..exampleuniqueID"}.grouping().mean()

Group-aggregated metric query

Aggregated average of Read Throttle Count by read unit at a sixty-minute interval, filtered to a single table in a compartment.

The number of lines displayed in the metric chart (Console): 1 per read unit.

ReadThrottleCount[60m]{tableName = "demoKeyVal"}.groupBy(ReadUnits).mean()

Alarm query (threshold)

Triggered when the 90th percentile of CPU Utilization, aggregated by pool ID, and filtered to the specified availability domain, exceeds 85.

Number of lines displayed in the metric chart (Console): 1 per pool.

CpuUtilization[1m]{availabilityDomain="VeBZ:PHX-AD-1"}.groupBy(poolId).percentile(0.9) > 85

Creating a Metric Query

You can create a metric query in two ways: using the Console or using an OCI CLI command. To create a query from the Console:

  1. Open the navigation menu and click Observability & Management. Under Monitoring, click Metrics Explorer.

    The Metrics Explorer page displays an empty chart with fields to build a query.

  2. Fill in the fields for a new query.
    • Compartment: The compartment containing the Oracle NoSQL Database Cloud Service tables that you want to monitor. By default, the first accessible compartment is selected.
    • Metric namespace: The Oracle NoSQL Database Cloud Service emitting metrics for the tables that you want to monitor. Example: oci_nosql.
    • Resource group (optional): The group that the metric belongs to. A resource group is a custom string provided with a custom metric. Not applicable to service metrics.
    • Metric name: The name of the metric. Only one metric can be specified. Metric selections depend on the selected compartment and metric namespace. Example: ReadUnits
    • Interval: The aggregation window.
    • Statistic: The aggregation function.
    • Metric dimensions: Optional filters to narrow the metric data evaluated.
      • Dimension fields: For Oracle NoSQL Database Cloud Service metrics, you can select either resourceId or tableName as Dimension name and Dimension value pair.
    • Aggregate metric streams: Plots a single line on the metric chart to represent the combined value of all metric streams for the selected statistic.
  3. Click Update Chart.

    The chart shows the results of your new query. Very small or large values are indicated by International System of Units (SI) prefixes, such as M for mega (10 to the sixth power). Units correspond to the selected metric and do not change with the statistic.

  4. To view the query as a Monitoring Query Language (MQL) expression, select Advanced mode.

To create a query using the OCI CLI, run the following command from the Cloud Shell. It returns aggregated data that matches the criteria specified in the request. The compartment OCID is required.

oci monitoring metric-data summarize-metrics-data --compartment-id <Compartment_OCID> --namespace oci_nosql --query-text [text]

--query-text is the Monitoring Query Language (MQL) expression to use when searching for metric data points to aggregate. The query must specify a metric, statistic, and interval. Supported values for interval: 1m-60m (also 1h). You can optionally specify dimensions and grouping functions. Supported grouping functions: grouping(), groupBy(). For more information about the OPTIONS available with the summarize-metrics-data command, see Summarize Metrics Data.

The following example creates a filtered metric query that returns the sum of Read Units in a compartment at a one-minute interval, filtered to a single table.

For example:
oci monitoring metric-data summarize-metrics-data --compartment-id ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya \
  --namespace oci_nosql --query-text 'ReadUnits[1m]{tableName="articles"}.sum()'
Example response:
{
  "data": [
    {
      "aggregated-datapoints": [
        {
          "timestamp": "2022-02-17T11:03:00+00:00",
          "value": 0.0
        },
        {
          "timestamp": "2022-02-17T11:04:00+00:00",
          "value": 0.0
        },
        {
          "timestamp": "2022-02-17T11:05:00+00:00",
          "value": 0.0
        },

        ...
        ...
        ...

        {
          "timestamp": "2022-02-17T13:59:00+00:00",
          "value": 0.0
        },
        {
          "timestamp": "2022-02-17T14:00:00+00:00",
          "value": 0.0
        },
        {
          "timestamp": "2022-02-17T14:01:00+00:00",
          "value": 0.0
        }
      ],
      "compartment-id": "ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya",
      "dimensions": {
        "resourceId": "ocid1_nosqltable_oc1_phx_amaaaaaau7x7rfyav7f67yuj3t2q6rk7lp2a2obfdxa6hg2ho2ea7qabin4q",
        "tableName": "demo"
      },
      "metadata": {},
      "name": "ReadUnits",
      "namespace": "oci_nosql",
      "resolution": null,
      "resource-group": null
    }
  ]
}
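The aggregated data points in such a response can be reduced further on the client side. The Python sketch below reports the number of points, their total, and the time of the peak value for each metric stream. The helper `summarize_streams` is hypothetical, and the non-zero values in the sample are illustrative only; they are not taken from the service.

```python
import json
from datetime import datetime

# Illustrative sample in the same shape as the response above (values invented).
sample_response = """
{
  "data": [
    {
      "name": "ReadUnits",
      "namespace": "oci_nosql",
      "dimensions": {"tableName": "demo"},
      "aggregated-datapoints": [
        {"timestamp": "2022-02-17T11:03:00+00:00", "value": 2.0},
        {"timestamp": "2022-02-17T11:04:00+00:00", "value": 5.0},
        {"timestamp": "2022-02-17T11:05:00+00:00", "value": 3.0}
      ]
    }
  ]
}
"""


def summarize_streams(response_text):
    """Return one summary dict per metric stream in the response."""
    summaries = []
    for stream in json.loads(response_text)["data"]:
        points = stream["aggregated-datapoints"]
        if not points:
            continue  # nothing to summarize for an empty stream
        peak = max(points, key=lambda point: point["value"])
        summaries.append({
            "metric": stream["name"],
            "table": stream["dimensions"].get("tableName"),
            "points": len(points),
            "total": sum(point["value"] for point in points),
            "peak_time": datetime.fromisoformat(peak["timestamp"]),
        })
    return summaries


summaries = summarize_streams(sample_response)
for summary in summaries:
    print(summary)
```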

Creating Alarms

You can create an alarm that evaluates the alarm query and sends a notification when the alarm is in the firing state, along with other alarm properties. When triggered, an alarm sends an alarm message to the configured topic (in Notifications), which then sends the message on to all of the topic's subscriptions. Slack, email, SMS, and PagerDuty are examples of subscription protocols that a topic in Notifications can deliver to.

When configured, repeat notifications remind you of a continued firing state at the configured repeat interval. You are also notified when an alarm transitions back to the OK state, or when an alarm is reset.

An alarm query contains the Monitoring Query Language (MQL) expression to evaluate for returning aggregated data. The query must specify a metric, statistic, and interval.

You can create an alarm in two ways: using the Console or using the OCI CLI. To create an alarm from the Console:

  1. Open the navigation menu and click Observability & Management. Under Monitoring, click Alarm Definitions.
  2. Click Create Alarm.

    Note:

    You can also create an alarm from a predefined query on the Service Metrics page. Expand Options and click Create an Alarm on this Query. For more information about service metrics, see Viewing or Listing Oracle NoSQL Database Cloud Service Metrics.
  3. On the Create Alarm page, under Define alarm, fill in or update the alarm settings:

    Note:

    To toggle between Basic Mode and Advanced Mode, click Switch to Advanced Mode or Switch to Basic Mode (to the right of Define Alarm).
    • Alarm name: The user-friendly name for the new alarm. This name is sent as the title for notifications related to this alarm. Avoid entering confidential information.
    • Alarm severity: The perceived type of response required when the alarm is in the firing state.
    • Alarm body: The human-readable content of the notification delivered. Oracle recommends providing guidance to operators for resolving the alarm condition. Example: "High Read Throttle Count".
    • Tags (optional): If you have permission to create a resource, then you also have permission to apply free-form tags to that resource. To apply a defined tag, you must have permission to use the tag namespace. For more information about tagging, see Resource Tags. If you are not sure whether to apply tags, skip this option (you can apply tags later) or ask your administrator.
    • Metric description: The metric to evaluate for the alarm condition.
      • Compartment: The compartment containing the Oracle NoSQL Database Cloud Service tables that you want to monitor. By default, the first accessible compartment is selected.
      • Metric namespace: The Oracle NoSQL Database Cloud Service emitting metrics for the tables that you want to monitor. Example: oci_nosql.
      • Resource group (optional): The group that the metric belongs to. A resource group is a custom string provided with a custom metric. Not applicable to service metrics.
      • Metric name: The name of the metric. Only one metric can be specified. Metric selections depend on the selected compartment and metric namespace. Example: ReadUnits
      • Interval: The aggregation window.
      • Statistic: The aggregation function.
    • Metric dimensions: Optional filters to narrow the metric data evaluated.
      • Dimension fields: For Oracle NoSQL Database Cloud Service metrics, you can select either resourceId or tableName as Dimension name and Dimension value pair.
    • Aggregate metric streams: Plots a single line on the metric chart to represent the combined value of all metric streams for the selected statistic.
    • Trigger rule: The condition that must be satisfied for the alarm to be in the firing state. The condition can specify a threshold, such as 90% for StorageGB.
      • Operator: The operator used in the condition threshold.
      • Value: The value to use for the condition threshold.
      • Trigger delay minutes: The number of minutes that the condition must be maintained before the alarm is in a firing state.
  4. To change the view of the query results, click the appropriate option above the results, on the right:
    • Show Data Table: Lists data points, indicating the time stamp and value for each.
    • Show Graph (default): Plots data points on a graph.
  5. Set up notifications: Under Notifications, fill in the fields.
    • Destinations: The topic to be used for notifications.
    • Repeat notification?: While the alarm is in the firing state, resends notifications at the specified interval.
    • Notification frequency: The period of time to wait before resending the notification.
    • Suppress notifications: Set up a suppression time window during which to suspend evaluations and notifications. Useful for avoiding alarm notifications during system maintenance periods.
  6. If you want to disable the new alarm, clear Enable this alarm?
  7. Click Save alarm.
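The interaction between the trigger rule (operator and value) and the trigger delay can be illustrated with a small Python sketch. It assumes one aggregated data point per minute and a "greater than" operator, and it treats the alarm as firing once the condition has held for the configured number of consecutive minutes. The function `alarm_states` is hypothetical and simplifies the service's actual evaluation.

```python
def alarm_states(values, threshold, trigger_delay_minutes):
    """Return "FIRING" or "OK" for each one-minute data point.

    Assumption: the alarm fires once `value > threshold` has held for
    `trigger_delay_minutes` consecutive points (at least one point).
    """
    required = max(trigger_delay_minutes, 1)
    consecutive_breaches = 0
    states = []
    for value in values:
        consecutive_breaches = consecutive_breaches + 1 if value > threshold else 0
        states.append("FIRING" if consecutive_breaches >= required else "OK")
    return states


# Illustrative values; a threshold of 85 with a 3-minute trigger delay.
states = alarm_states([80, 90, 91, 92, 70, 95], threshold=85,
                      trigger_delay_minutes=3)
print(states)
```

A longer trigger delay filters out short spikes at the cost of a later notification.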

To create an alarm using the OCI CLI, run the following command from the Cloud Shell. It creates a new alarm in the specified compartment. The compartment OCID is required.

oci monitoring alarm create --compartment-id <Compartment_OCID> --namespace oci_nosql --query-text [text] --destinations [complex type] --display-name [text] --is-enabled [boolean] --metric-compartment-id [text] --severity [text]

--query-text is the Monitoring Query Language (MQL) expression to use when searching for metric data points to aggregate. The query must specify a metric, statistic, and interval. Supported values for interval: 1m-60m (also 1h). You can optionally specify dimensions and grouping functions. Supported grouping functions: grouping(), groupBy(). For more information about the OPTIONS available with the create alarm command, see create - alarm.

The following example creates an alarm whose query triggers when the 90th percentile of StorageGB is greater than 85 in a compartment at a one-minute interval, filtered to a single table.

Example of threshold alarm:
oci monitoring alarm create --compartment-id ocid1.compartment.oc1..aaaaaaaawrmvqjzoegxbsixp5k3b5554vlv2kxukobw3drjho3f7nf5ca3ya \
  --namespace oci_nosql --query-text 'StorageGB[1m]{tableName="demo"}.groupBy(WriteUnits).percentile(0.9) > 85' \
  --destinations '["<Topic_OCID>"]' \
  --display-name HighStorageConsumption --metric-compartment-id demonosql --severity Critical --is-enabled true
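If you script alarm creation, building the argument list in code avoids shell quoting mistakes around the query text. The Python sketch below only assembles and prints the command; it does not run it. It uses the options listed in the command syntax above, assumes that --destinations takes a JSON array of topic OCIDs, and uses `<Compartment_OCID>` and `<Topic_OCID>` as placeholders. The helper name `alarm_create_command` is hypothetical.

```python
import json
import shlex


def alarm_create_command(compartment_id, query_text, display_name, severity,
                         topic_ids, metric_compartment_id=None, enabled=True):
    """Build the argument list for `oci monitoring alarm create`."""
    return [
        "oci", "monitoring", "alarm", "create",
        "--compartment-id", compartment_id,
        "--namespace", "oci_nosql",
        "--query-text", query_text,
        # Assumption: the complex type is passed as a JSON array of topic OCIDs.
        "--destinations", json.dumps(topic_ids),
        "--display-name", display_name,
        "--is-enabled", "true" if enabled else "false",
        "--metric-compartment-id", metric_compartment_id or compartment_id,
        "--severity", severity,
    ]


argv = alarm_create_command(
    compartment_id="<Compartment_OCID>",
    query_text='StorageGB[1m]{tableName="demo"}.percentile(0.9) > 85',
    display_name="HighStorageConsumption",
    severity="Critical",
    topic_ids=["<Topic_OCID>"],
)
# shlex.join quotes each argument so the printed line can be pasted into a shell.
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv)` would execute the command without a shell, which sidesteps quoting of the braces and quotes in the MQL expression.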

Managing Alarms

You can follow these guidelines on how to manage your alarms.

  • Create a Set of Alarms for Each Metric.
    For each metric emitted by Oracle NoSQL Database Cloud Service table, create alarms that define the following resource behaviors:
    • At risk - The Oracle NoSQL Database Cloud Service is at risk of becoming inoperable, as indicated by metric values. For example, Storage size for a table is at risk of high utilization.
    • Non-optimal - The Oracle NoSQL Database Cloud Service is performing at non-optimal levels, as indicated by metric values. For example, ReadUnits or WriteUnits have high latency.
    • Resource is up or down - The Oracle NoSQL Database Cloud Service is either not reachable or not operating. For example, a high number of ReadThrottleCount or WriteThrottleCount.
  • Set up a process for responding to alarms.
    Based on the severity of the alarm, you can choose to respond to the alarms in the following different ways:
    • For Critical or At-Risk alarms, you can decide to notify the operations team immediately because repair is required to bring the instances back to optimal operational levels. You configure alarm notifications to the responsible team by both PagerDuty and email, requesting an investigation and appropriate fixes before the instances go into an inoperable state. You set repeat notifications every minute. When someone responds to the alarm notifications, you temporarily stop notifications by suppressing the alarm. Once metrics return to optimal values, you remove the suppression.
    • For Warning or Non-Optimal alarms, you can decide to notify the appropriate individual or team that an Oracle NoSQL Database Cloud Service table is consuming more Storage Size than usual. Because no immediate action is required, you configure a threshold alarm that asks the appropriate contacts to investigate and reduce the Storage Size. You set notification to email only, directed to the appropriate developer or team, with repeat notifications every 24 hours to reduce email notification noise.
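The response process described above can be captured as a small lookup table so that every alarm of a given severity gets the same notification settings. The Python sketch below encodes the two cases from the guidelines. The names `NotificationPolicy`, `POLICIES`, and `policy_for` are hypothetical, and the channel labels are plain strings, not Notifications API values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationPolicy:
    channels: tuple          # where alarm messages are delivered
    repeat_minutes: int      # interval between repeat notifications


POLICIES = {
    # Critical or At-Risk: page and email the operations team, repeat every minute.
    "CRITICAL": NotificationPolicy(("pagerduty", "email"), 1),
    # Warning or Non-Optimal: email only, repeat every 24 hours.
    "WARNING": NotificationPolicy(("email",), 24 * 60),
}


def policy_for(severity):
    """Look up the notification policy for a severity, ignoring case."""
    return POLICIES[severity.upper()]


critical = policy_for("Critical")
warning = policy_for("Warning")
print(critical)
print(warning)
```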