
Monitor Oracle Cloud Infrastructure Database with PostgreSQL using Datadog

Introduction

Oracle Cloud Infrastructure (OCI) is a robust and highly scalable cloud platform designed to meet the needs of modern enterprises. It delivers a comprehensive suite of services for computing, storage, networking, database, and application development, optimized for performance, security, and cost efficiency. OCI is ideal for running both cloud-native and traditional workloads, offering enterprises a flexible and reliable infrastructure.

Datadog is a comprehensive cloud-based monitoring and analytics platform designed to help organizations gain end-to-end visibility into their IT infrastructure, applications, and services. It enables real-time monitoring, troubleshooting, and performance optimization across dynamic, hybrid cloud environments. Datadog integrates seamlessly with a wide range of tools, platforms, and services, making it a versatile solution for modern DevOps and IT operations teams.

This tutorial demonstrates how OCI Database with PostgreSQL and Datadog users can build an efficient, scalable pipeline that transmits metrics from OCI to Datadog using OCI Connector Hub and OCI Functions.

Objectives
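
  • Forward metrics from OCI Database with PostgreSQL to Datadog using OCI Connector Hub and OCI Functions.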

Prerequisites
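
  • Access to an OCI tenancy with permissions to manage identity resources, policies, stacks, functions, and connector hubs.
  • A provisioned OCI Database with PostgreSQL instance emitting metrics to the oci_postgresql namespace.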

Task 1: Create a Datadog Account

  1. Set up a Datadog account using the Datadog Website. Provide the necessary account details and complete the agent setup by configuring the appropriate environment settings.

  2. Install the Datadog Agent to collect metrics and events from the OCI Database with PostgreSQL. For more information about setting up and configuring the Datadog Agent, see Setup Datadog Agent. For additional details on troubleshooting and debugging on Datadog Agent, see Basic Datadog Agent Usage.

  3. Select OCI as the integration and proceed with its installation. The following image shows the OCI integration for Datadog after installation.

    image

  4. Click Add Tenancy and enter your Tenancy OCID and Home Region information.

    image

Task 2: Create Datadog Authentication Resources

Create a Datadog auth user, group, and policy in Oracle Cloud Infrastructure (OCI).

  1. Navigate to Identity and create a domain named DataDog.

  2. Create a group called DatadogAuthGroup.

  3. Create a user named DatadogAuthUser using your email address (the same email used to log in to the Datadog Monitoring Tool) and assign DatadogAuthGroup as the group.

  4. Copy the User OCID and paste it into the User OCID field on the Datadog OCI integration tile.

  5. Set up an API key.

    1. Navigate to your profile and select your username.

    2. Navigate to Resources in the bottom-left corner and select API Keys.

    3. Click Add API Key, download the private key, and click Add.

    4. Close the Configuration File Preview window; no action is required. (A representative preview is shown after this task's steps.)

    5. Copy the Fingerprint value and paste it into the Fingerprint field on the Datadog OCI integration tile.

  6. Configure private key.

    1. Open the downloaded private key file (.pem) in a text editor or use a terminal command (for example, cat) to view its contents.

    2. Copy the entire key, including the lines -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY-----.

    3. Paste the private key into the Private Key field on the Datadog OCI integration tile.

  7. Create a policy named DataDogPolicy in the postgresqlinteg (root) compartment.

  8. Use the policy builder in manual editor mode to enter the following policy statement.

    Allow group DatadogAuthGroup to read all-resources in tenancy
    
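The Configuration File Preview from the API key setup displays a snippet in the standard OCI CLI configuration format. For reference, it looks like the following; the values shown here are placeholders, and your preview is populated with your own OCIDs, fingerprint, and key path.

[DEFAULT]
user=ocid1.user.oc1..<unique_ID>
fingerprint=<your_fingerprint>
tenancy=ocid1.tenancy.oc1..<unique_ID>
region=<your_home_region>
key_file=<path_to_private_key_file>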

The following is a sample Datadog OCI integration tile after adding tenancy and user details.

image

Task 3: Create an OCI Stack

image

Navigate to your Identity section and create a policy stack under the root compartment. This allows connector hubs to read metrics and invoke functions through the following statements.

Allow dynamic-group DatadogAuthGroup to read metrics in tenancy
Allow dynamic-group DatadogAuthGroup to use fn-function in tenancy
Allow dynamic-group DatadogAuthGroup to use fn-invocation in tenancy
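
These statements reference a dynamic group, so the dynamic group must have a matching rule that covers your connector hub resources. A typical rule, assuming your connector hubs run in a compartment whose OCID you substitute, looks like the following.

All {resource.type = 'serviceconnector', resource.compartment.id = '<compartment_OCID>'}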

To configure identity policies and deploy the metric forwarding stacks in OCI for the Datadog integration, follow these tasks:

Task 3.1: Create Policy Stack (ORM_policy_stack)

  1. Click Create Policy Stack on the Datadog OCI integration tile. Ensure you use the link provided, which includes the necessary Terraform script, and accept the Oracle Terms of Use.

  2. From the Working Directory drop-down menu, select datadog-oci-orm/policy-setup.

  3. Deselect Use custom Terraform providers.

  4. Enter a descriptive name (for example, datadog-metrics-policy-setup) and select the compartment for deployment.

  5. Click Next, name the dynamic group and policy (or use default names), ensure the home region of the tenancy is selected, and click Create.

Task 3.2: Create Metric Forwarding Stack

Resources are deployed to the specified compartment. Ensure the user running the stack has appropriate access rights.

  1. Click Create Metric Stack on the Datadog OCI integration tile and accept the Oracle Terms of Use.

  2. From the Working Directory drop-down menu, select datadog-oci-orm/metrics-setup and deselect Use custom Terraform providers.

  3. Name the stack and select the deployment compartment, then click Next.

  4. Leave the tenancy values unmodified, enter your Datadog API key, and select the endpoint for your Datadog site, for example, US5 (ocimetrics-intake.us5.datadoghq.com).

  5. For network configuration, ensure Create VCN is checked and select the appropriate compartment for VCN creation.

  6. In the Function Settings section, retain the default application shape, GENERIC_ARM. Enter the OCI Docker registry username and password (your auth token).

  7. Set the Service Connector Hub batch size to 5000 and click Next.

  8. Click Create.

Task 3.3: Finalize the Configuration

  1. Return to the Datadog OCI integration tile and click Create configuration to complete the setup. This ensures that Datadog metrics and functions are properly configured for integration with OCI.

    image

Task 4: Create OCI Functions

To create an application in the OCI Console, follow these steps:

  1. Navigate to Applications and select Create Application.

  2. Enter the application name, select the appropriate Virtual Cloud Network (VCN) and subnet details, and click Create.

  3. To access the newly created application, under Resources, select Getting started.

    image

  4. Click Launch Cloud Shell and run the following commands, using the context for your region.

    fn list contexts
    fn use context <region name>
    
  5. Update the context to include the function’s compartment ID.

    fn update context oracle.compartment-id <compartment-id>
    
  6. Update the context to include the location of the registry you want to use.

    fn update context registry phx.ocir.io/<tenancy_name>/[YOUR-OCIR-REPO]
    

    Note: Replace phx in the context with your region's three-letter region key.

  7. Log in to the registry using the auth token as your password.

    docker login -u 'TENANCY_NAME/OCI_USERNAME' phx.ocir.io
    
  8. When prompted for a password, enter your auth token.

    Note:

    • Replace phx with your region's three-letter region key.
    • If you are using Oracle Identity Cloud Service, your username is <tenancyname>/oracleidentitycloudservice/<username>.
  9. Generate a hello-world boilerplate function.

    fn list apps
    fn init --runtime python datadog
    

    The fn init command generates a folder named datadog containing three files: func.py, func.yaml, and requirements.txt.

  10. Run the cd datadog command.

  11. Open func.py and replace the content of the file with the following code snippet. (A local sanity check for this function is sketched after this task's steps.)

    # oci-monitoring-metrics-to-datadog version 1.0.
    #
    # Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
    # Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl.
    
    import io
    import json
    import logging
    import os
    import re
    import requests
    from fdk import response
    from datetime import datetime
    
    """
    This sample OCI Function maps OCI Monitoring Service Metrics to the DataDog
    REST API 'submit-metrics' contract found here:
    
    https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
    
    """
    
    # Use OCI Application or Function configurations to override these environment variable defaults.
    
    api_endpoint = os.getenv('DATADOG_METRICS_API_ENDPOINT', 'not-configured')
    api_key = os.getenv('DATADOG_API_KEY', 'not-configured')
    is_forwarding = eval(os.getenv('FORWARD_TO_DATADOG', "True"))
    metric_tag_keys = os.getenv('METRICS_TAG_KEYS', 'name, namespace, displayName, resourceDisplayName, unit')
    metric_tag_set = set()
    
    # Set all registered loggers to the configured log_level
    
    logging_level = os.getenv('LOGGING_LEVEL', 'INFO')
    loggers = [logging.getLogger()] + [logging.getLogger(name) for name in logging.root.manager.loggerDict]
    [logger.setLevel(logging.getLevelName(logging_level)) for logger in loggers]
    
    # Exception stack trace logging
    
    is_tracing = eval(os.getenv('ENABLE_TRACING', "False"))
    
    # Constants
    
    TEN_MINUTES_SEC = 10 * 60
    ONE_HOUR_SEC = 60 * 60
    
    # Functions
    
    def handler(ctx, data: io.BytesIO = None):
        """
        OCI Function Entry Point
        :param ctx: InvokeContext
        :param data: data payload
        :return: plain text response indicating success or error
        """
    
        preamble = " {} / event count = {} / logging level = {} / forwarding to DataDog = {}"
    
        try:
            metrics_list = json.loads(data.getvalue())
            logging.getLogger().info(preamble.format(ctx.FnName(), len(metrics_list), logging_level, is_forwarding))
            logging.getLogger().debug(metrics_list)
            converted_event_list = handle_metric_events(event_list=metrics_list)
            send_to_datadog(event_list=converted_event_list)
    
        except (Exception, ValueError) as ex:
            logging.getLogger().error('error handling logging payload: {}'.format(str(ex)))
            if is_tracing:
                logging.getLogger().error(ex)
    
    
    def handle_metric_events(event_list):
        """
        :param event_list: the list of metric formatted log records.
        :return: the list of DataDog formatted log records
        """
    
        result_list = []
        for event in event_list:
            single_result = transform_metric_to_datadog_format(log_record=event)
            result_list.append(single_result)
            logging.getLogger().debug(single_result)
    
        return result_list
    
    
    def transform_metric_to_datadog_format(log_record: dict):
        """
        Transform metrics to DataDog format.
        See: https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
        :param log_record: metric log record
        :return: DataDog formatted log record
        """
    
        series = [{
            'metric': get_metric_name(log_record),
            'type' : get_metric_type(log_record),
            'points' : get_metric_points(log_record),
            'tags' : get_metric_tags(log_record),
        }]
    
        result = {
            'series' : series
        }
        return result
    
    
    def get_metric_name(log_record: dict):
        """
        Assembles a metric name that appears to follow DataDog conventions.
        :param log_record:
        :return:
        """
    
        elements = get_dictionary_value(log_record, 'namespace').split('_')
        elements += camel_case_split(get_dictionary_value(log_record, 'name'))
        elements = [element.lower() for element in elements]
        return '.'.join(elements)
    
    
    def camel_case_split(value):
        """
        :param value: a camel case string
        :return: the list of words in the string
        """

        return re.findall(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))', value)
    
    
    def get_metric_type(log_record: dict):
        """
        :param log_record:
        :return: The type of metric. The available types are 0 (unspecified), 1 (count), 2 (rate), and 3 (gauge).
        Allowed enum values: 0,1,2,3
        """
    
        return 0
    
    
    def get_now_timestamp():
        return datetime.now().timestamp()
    
    
    def adjust_metric_timestamp(timestamp_ms):
        """
        DataDog Timestamps should be in POSIX time in seconds, and cannot be more than ten
        minutes in the future or more than one hour in the past.  OCI Timestamps are POSIX
        in milliseconds, therefore a conversion is required.
    
        See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
        :param timestamp_ms: OCI metric timestamp in milliseconds
        :return:
        """
    
        # positive skew is expected
        timestamp_sec = int(timestamp_ms / 1000)
        delta_sec = get_now_timestamp() - timestamp_sec
    
        if (delta_sec > 0 and delta_sec > ONE_HOUR_SEC):
            logging.getLogger().warning('timestamp {} too far in the past per DataDog'.format(timestamp_ms))
    
        if (delta_sec < 0 and abs(delta_sec) > TEN_MINUTES_SEC):
            logging.getLogger().warning('timestamp {} too far in the future per DataDog'.format(timestamp_ms))
    
        return timestamp_sec
    
    
    def get_metric_points(log_record: dict):
        """
        :param log_record:
        :return: an array of arrays where each array is a datapoint scalar pair
        """
    
        result = []
    
        datapoints = get_dictionary_value(dictionary=log_record, target_key='datapoints')
        for point in datapoints:
            dd_point = {'timestamp': adjust_metric_timestamp(point.get('timestamp')),
                        'value': point.get('value')}
    
            result.append(dd_point)
    
        return result
    
    
    def get_metric_tags(log_record: dict):
        """
        Assembles tags from selected metric attributes.
        See https://docs.datadoghq.com/getting_started/tagging/
        :param log_record: the log record to scan
        :return: string of comma-separated, key:value pairs matching DataDog tag format
        """
    
        result = []
    
        for tag in get_metric_tag_set():
            value = get_dictionary_value(dictionary=log_record, target_key=tag)
            if value is None:
                continue
    
            if isinstance(value, str) and ':' in value:
                logging.getLogger().warning('tag contains a \':\' / ignoring {} ({})'.format(tag, value))
                continue
    
            tag = '{}:{}'.format(tag, value)
            result.append(tag)
    
        return result
    
    
    def get_metric_tag_set():
        """
        :return: the set metric payload keys that we would like to have converted to tags.
        """
    
        global metric_tag_set
    
        if len(metric_tag_set) == 0 and metric_tag_keys:
            split_and_stripped_tags = [x.strip() for x in metric_tag_keys.split(',')]
            metric_tag_set.update(split_and_stripped_tags)
            logging.getLogger().debug("tag key set / {} ".format (metric_tag_set))
    
        return metric_tag_set
    
    
    def send_to_datadog (event_list):
        """
        Sends each transformed event to DataDog Endpoint.
        :param event_list: list of events in DataDog format
        :return: None
        """
    
        if is_forwarding is False:
            logging.getLogger().debug("DataDog forwarding is disabled - nothing sent")
            return
    
        if 'v2' not in api_endpoint:
            raise RuntimeError('Requires API endpoint version "v2": "{}"'.format(api_endpoint))
    
        # creating a session and adapter to avoid recreating
        # a new connection pool between each POST call
    
        try:
            session = requests.Session()
            adapter = requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10)
            session.mount('https://', adapter)
    
            for event in event_list:
                api_headers = {'Content-type': 'application/json', 'DD-API-KEY': api_key}
                logging.getLogger().debug("json to datadog: {}".format (json.dumps(event)))
                response = session.post(api_endpoint, data=json.dumps(event), headers=api_headers)
    
                if response.status_code != 202:
                    raise Exception ('error {} sending to DataDog: {}'.format(response.status_code, response.reason))
    
        finally:
            session.close()
    
    
    def get_dictionary_value(dictionary: dict, target_key: str):
        """
        Recursive method to find value within a dictionary which may also have nested lists / dictionaries.
        :param dictionary: the dictionary to scan
        :param target_key: the key we are looking for
        :return: If a target_key exists multiple times in the dictionary, the first one found will be returned.
        """
    
        if dictionary is None:
            raise Exception('dictionary None for key {}'.format(target_key))
    
        target_value = dictionary.get(target_key)
        if target_value:
            return target_value
    
        for key, value in dictionary.items():
            if isinstance(value, dict):
                target_value = get_dictionary_value(dictionary=value, target_key=target_key)
                if target_value:
                    return target_value
    
            elif isinstance(value, list):
                for entry in value:
                    if isinstance(entry, dict):
                        target_value = get_dictionary_value(dictionary=entry, target_key=target_key)
                        if target_value:
                            return target_value
    
    
    def local_test_mode(filename):
        """
        This routine reads a local json metrics file, converting the contents to DataDog format.
        :param filename: cloud events json file exported from OCI Logging UI or CLI.
        :return: None
        """
    
        logging.getLogger().info("local testing started")
    
        with open(filename, 'r') as f:
            transformed_results = list()
    
            for line in f:
                event = json.loads(line)
                logging.getLogger().debug(json.dumps(event, indent=4))
                transformed_result = transform_metric_to_datadog_format(event)
                transformed_results.append(transformed_result)
    
            logging.getLogger().debug(json.dumps(transformed_results, indent=4))
            send_to_datadog(event_list=transformed_results)
    
        logging.getLogger().info("local testing completed")
    
    
    """
    Local Debugging
    """
    
    if __name__ == "__main__":
        local_test_mode('oci-metrics-test-file.json')
    
  12. Update func.yaml with the following code. The config keys must match the environment variables that func.py reads: replace the DATADOG_API_KEY value with your Datadog API key and set DATADOG_METRICS_API_ENDPOINT to the v2 metrics endpoint for your Datadog site, for example, https://api.datadoghq.com/api/v2/series. For more information, see Submit metrics in the Datadog API documentation.

    schema_version: 20180708
    name: datadogapp
    version: 0.0.1
    runtime: python
    entrypoint: /python/bin/fdk /function/func.py handler
    memory: 1024
    timeout: 120
    config:
      DATADOG_METRICS_API_ENDPOINT: https://api.datadoghq.com/api/v2/series
      DATADOG_API_KEY: ZZZZZzzzzzzzzzz
    
  13. Update requirements.txt with the following code.

    fdk
    requests
    oci
    
  14. Run the following command to create the application.

    fn create app datadog --annotation oracle.com/oci/subnetIds='["Provide your subnet OCID"]'
    
  15. Run the following command to deploy the function and complete the setup.

    fn -v deploy --app datadog
    
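Before relying on the Connector Hub trigger, you can sanity-check the transformation logic locally. The following snippet is a minimal sketch, not part of the official setup: the record shape is an assumption based on the fields func.py reads (namespace, name, displayName, unit, and datapoints in POSIX milliseconds), and it assumes func.py and its dependencies (fdk, requests) are available in the current directory.

    # sanity_check.py - hypothetical helper; run it from inside the datadog folder.
    import json

    from func import transform_metric_to_datadog_format

    # A hand-made OCI Monitoring record; field names mirror what func.py reads.
    sample_record = {
        "namespace": "oci_postgresql",
        "name": "CpuUtilization",
        "displayName": "CPU Utilization",
        "unit": "Percent",
        "datapoints": [
            {"timestamp": 1700000000000, "value": 42.5}  # POSIX milliseconds
        ]
    }

    # Print the Datadog 'series' payload that the function would submit.
    print(json.dumps(transform_metric_to_datadog_format(sample_record), indent=2))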

Task 5: Set up an OCI Connector Hub

  1. Go to the OCI Console, navigate to Logging, Connectors, and click Create Connector.

  2. Set the Source to Monitoring and the Target to Functions.

  3. Under Configure Source Connection, select the appropriate Metric Compartment and Namespace. For example, oci_postgresql for database monitoring.

  4. Under Configure Target, select the Compartment, Function Application, and Function created in Task 4.

  5. If prompted, click Create to create the necessary policy.

  6. Click Create to finalize the OCI Connector Hub setup. (A manual invocation test is sketched after this list.)

image
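
You can also invoke the function manually to verify the delivery path before metrics arrive through the connector. The following command is a sketch that assumes the application and function names from Task 4 (datadog and datadogapp) and a hand-made payload in the shape the handler expects.

    echo '[{"namespace": "oci_postgresql", "name": "CpuUtilization", "datapoints": [{"timestamp": 1700000000000, "value": 42.5}]}]' | fn invoke datadog datadogapp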

Task 6: View Metrics in Datadog

The OCI Connector Hub is now configured to trigger the function, enabling the ingestion of metrics into Datadog whenever new metrics are detected. In the Datadog Integration tile, navigate to Metrics and review the summary to view OCI-related metrics.

image
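
Metric names in Datadog follow the convention assembled by get_metric_name in func.py: the OCI namespace is split on underscores, the metric name is split on camel case, and all elements are lowercased and joined with dots. The following sketch replicates that mapping so you know what to search for in the Metrics Explorer.

    import re

    def datadog_metric_name(namespace: str, name: str) -> str:
        # Mirrors get_metric_name() and camel_case_split() in func.py.
        parts = namespace.split('_')
        parts += re.findall(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))', name)
        return '.'.join(part.lower() for part in parts)

    # CpuUtilization in the oci_postgresql namespace becomes
    # oci.postgresql.cpu.utilization in Datadog.
    print(datadog_metric_name("oci_postgresql", "CpuUtilization"))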

In the Datadog Integration tile, click Explorer to analyze and select the required OCI metrics as needed.

image

Troubleshooting

If no data appears on the Metrics Summary page, select Enable log to enable logging for your function, then review the logs to debug the issue.

image

Acknowledgments

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.