Manage Data Flow

Learn how to manage Data Flow, including setting up the correct policies and storage to manage your data, and the metrics available.

Set Up to Manage Data

Ensure you have the correct policies in place. You must also have set up storage.

Data Flow Events

Events are JSON files that are emitted with some service operations and carry information about that operation.

For information about managing rules for Oracle Cloud Infrastructure Events, see Managing Rules for Events.

Data Flow emits an event when:

a Data Flow Application is created
a Data Flow Application is deleted
a Data Flow Application is updated
a Data Flow Run begins
a Data Flow Run ends

You can view these Events in the Events service. Set up rules to take actions when these events are emitted, for example, emailing you a JSON file or triggering a Function. When you create rules based on Event Type, select Data Flow as the Service Name. The actions available are described in the Events documentation in Events Overview.

Event Types for Applications

Data Flow emits events, in the form of a JSON file, when an Application is created, deleted, or updated.

An Application is an infinitely reusable Spark application template consisting of a Spark application, its dependencies, default parameters, and a default runtime resource specification. After a developer creates a Data Flow Application, anyone can use it without worrying about the complexities of deploying it, setting it up, or running it.

Application Events
Friendly Name	Description	Event Type
Application - Create	Emitted when a Data Flow application is created.	`com.oraclecloud.dataflow.createapplication`
Application - Delete	Emitted when a Data Flow application is deleted.	`com.oraclecloud.dataflow.deleteapplication`
Application - Update	Emitted when a Data Flow application is updated.	`com.oraclecloud.dataflow.updateapplication`
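A consumer of these events, for example a Function triggered by a rule, might route on the `eventType` field. The following is a minimal Python sketch, assuming the event arrives as a JSON string shaped like the `exampleEvent` object in the reference files that follow. The function name `describe_application_event` and the summary wording are hypothetical and not part of any Oracle SDK.

```python
import json

# Event types copied from the Application Events table.
APPLICATION_EVENT_TYPES = {
    "com.oraclecloud.dataflow.createapplication": "created",
    "com.oraclecloud.dataflow.deleteapplication": "deleted",
    "com.oraclecloud.dataflow.updateapplication": "updated",
}


def describe_application_event(raw_event: str) -> str:
    """Return a one-line summary of an Application event (hypothetical helper)."""
    event = json.loads(raw_event)
    event_type = event.get("eventType")
    action = APPLICATION_EVENT_TYPES.get(event_type)
    if action is None:
        raise ValueError(f"not a Data Flow Application event: {event_type!r}")
    data = event.get("data", {})
    # resourceName can be empty (as in the delete reference event),
    # so fall back to the resource OCID.
    name = data.get("resourceName") or data.get("resourceId", "<unknown>")
    return f"Application {name} was {action}"


sample = json.dumps({
    "eventType": "com.oraclecloud.dataflow.updateapplication",
    "data": {
        "resourceName": "application_name",
        "resourceId": "ocid1.dataflowapplication.oc1.phx.unique_id",
    },
})
print(describe_application_event(sample))
```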

Create Application Event Example

Here is a reference event file for a Data Flow Application create event that completes successfully.

{
  "id": "ocid1.eventschema.oc1.phx.abyhqljr7e6dxrsvyp2rowvkgqynfzjuo5gjiqo5gnkfcq7fzmaf7nzskk2q",
  "exampleEvent": {
    "eventType": "com.oraclecloud.dataflow.createapplication",
    "cloudEventsVersion": "0.1",
    "eventTypeVersion": "2.0",
    "source": "dataflow",
    "eventTime": "2022-07-17T02:17:41Z",
    "contentType": "application/json",
    "data": {
      "eventGroupingId": "unique_id",
      "eventName": "CreateApplication",
      "compartmentId": "ocid1.compartment.oc1..unique_id",
      "compartmentName": "example_compartment",
      "resourceName": "application_name",
      "resourceId": "ocid1.dataflowapplication.oc1.phx.unique_id",
      "availabilityDomain": "AD",
      "definedTags": {
        "Oracle-Tags": {
          "CreatedBy": "user_name",
          "CreatedOn": "2022-07-17T02:17:40.799Z"
        }
      },
      "request": {
        "id": "unique_id",
        "path": "/latest/applications",
        "action": "POST",
        "parameters": {},
        "headers": {}
      },
      "response": {
        "status": "200",
        "responseTime": "2022-07-17T02:17:41Z",
        "headers": {},
        "payload": {},
        "message": "application_name CreateApplication succeeded"
      }
    },
    "eventID": "unique_id",
    "extensions": {
      "compartmentId": "ocid1.compartment.oc1..example_compartment"
    }
  },
  "serviceName": "Data Flow",
  "displayName": "Application - Create",
  "additionalDetails": [],
  "timeCreated": "2022-07-18T04:01:56Z"
}

Delete Application Event Example

Here is a reference event file for a Data Flow Application delete event that completes successfully.

{
  "id": "ocid1.eventschema.oc1.phx.abyhqljrhnwwfto2ed3ytl7xaumc4qrjzsuumfagptovb5rhjjp266cryfpa",
  "exampleEvent": {
    "eventType": "com.oraclecloud.dataflow.deleteapplication",
    "cloudEventsVersion": "0.1",
    "eventTypeVersion": "2.0",
    "source": "dataflow",
    "eventTime": "2022-07-18T00:10:14Z",
    "contentType": "application/json",
    "data": {
      "eventGroupingId": "unique_id",
      "eventName": "DeleteApplication",
      "compartmentId": "ocid1.compartment.oc1..unique_id",
      "compartmentName": "example-compartment",
      "resourceName": "",
      "resourceId": "ocid1.dataflowapplication.oc1.phx.unique_id",
      "availabilityDomain": "AD",
      "definedTags": {
        "Oracle-Tags": {
          "CreatedBy": "user_name",
          "CreatedOn": "2022-07-17T02:17:40.799Z"
        }
      },
      "request": {
        "id": "unique_id",
        "path": "/latest/applications/ocid1.dataflowapplication.oc1.phx.unique_id",
        "action": "DELETE",
        "parameters": {},
        "headers": {}
      },
      "response": {
        "status": "204",
        "responseTime": "2022-07-18T00:10:14Z",
        "headers": {},
        "payload": {},
        "message": "DeleteApplication succeeded"
      }
    },
    "eventID": "unique_id",
    "extensions": {
      "compartmentId": "ocid1.compartment.oc1..unique_id"
    }
  },
  "serviceName": "Data Flow",
  "displayName": "Application - Delete",
  "additionalDetails": [],
  "timeCreated": "2022-07-18T04:01:56Z"
}

Update Application Event Example

Here is a reference event file for a Data Flow Application update event that completes successfully.

{
  "id": "ocid1.eventschema.oc1.phx.abyhqljrf42fatkajcznyzhdilv4c3sivrffbfgi45wm656tyqzwuf6ndwpa",
  "exampleEvent": {
    "eventType": "com.oraclecloud.dataflow.updateapplication",
    "cloudEventsVersion": "0.1",
    "eventTypeVersion": "2.0",
    "source": "dataflow",
    "eventTime": "2022-07-18T00:07:08Z",
    "contentType": "application/json",
    "data": {
      "eventGroupingId": "/unique_id",
      "eventName": "UpdateApplication",
      "compartmentId": "ocid1.compartment.oc1..unique_id",
      "compartmentName": "example-compartment",
      "resourceName": "application_name",
      "resourceId": "ocid1.dataflowapplication.oc1.phx.unique_id",
      "availabilityDomain": "AD",
      "freeformTags": {},
      "definedTags": {
        "Oracle-Tags": {
          "CreatedBy": "user_name",
          "CreatedOn": "2022-07-18T00:07:06.095Z"
        }
      },
      "request": {
        "id": "unique_id",
        "path": "/latest/applications/ocid1.dataflowapplication.oc1.phx.unique_id",
        "action": "PUT",
        "parameters": {},
        "headers": {}
      },
      "response": {
        "status": "200",
        "responseTime": "2022-07-18T00:07:08Z",
        "headers": {},
        "payload": {},
        "message": "application_name UpdateApplication succeeded"
      }
    },
    "eventID": "unique_id",
    "extensions": {
      "compartmentId": "ocid1.compartment.oc1..unique_id"
    }
  },
  "serviceName": "Data Flow",
  "displayName": "Application - Update",
  "additionalDetails": [],
  "timeCreated": "2022-07-18T04:01:56Z"
}

Event Types for Create Run Jobs

Data Flow emits events, in the form of a JSON file, when a Run begins or ends.

Every time a Data Flow Application is run, a Run is created. The Data Flow Run captures the Application's output, logs, and statistics that are automatically securely stored. Output is saved so it can be viewed by anyone with the correct permissions using the UI or REST API. Runs give you secure access to the Spark UI for debugging and diagnostics.

Run Events
Friendly Name	Description	Event Type
Run - Begin	Emitted when a request to trigger a Data Flow run is submitted successfully.	`com.oraclecloud.dataflow.createrun.begin`
Run - End	Emitted when the submitted run request processing is completed and the run has transitioned to a terminal state `SUCCEEDED`, `CANCELED`, `FAILED`, or `STOPPED`.	`com.oraclecloud.dataflow.createrun.end`
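Because a Run emits one begin event and one end event, a consumer can pair them by `resourceId` and compare their `eventTime` values. The following Python sketch shows one way to do that, using the timestamps that appear in the reference events that follow. The helper names are hypothetical, and the result is only the wall-clock gap between the two events, not one of the service metrics.

```python
from datetime import datetime, timezone


def parse_event_time(value: str) -> datetime:
    """Parse the UTC timestamp format used by eventTime, e.g. 2022-07-18T04:01:56Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def seconds_between(begin_event: dict, end_event: dict) -> float:
    """Seconds between a Run - Begin and a Run - End event for the same Run."""
    if begin_event["data"]["resourceId"] != end_event["data"]["resourceId"]:
        raise ValueError("events belong to different runs")
    delta = parse_event_time(end_event["eventTime"]) - parse_event_time(begin_event["eventTime"])
    return delta.total_seconds()


begin = {
    "eventType": "com.oraclecloud.dataflow.createrun.begin",
    "eventTime": "2022-07-18T04:01:56Z",
    "data": {"resourceId": "ocid1.dataflowrun.oc1.phx.unique_id"},
}
end = {
    "eventType": "com.oraclecloud.dataflow.createrun.end",
    "eventTime": "2022-07-18T04:06:11Z",
    "data": {"resourceId": "ocid1.dataflowrun.oc1.phx.unique_id"},
}
print(seconds_between(begin, end))  # 255.0 (4 minutes 15 seconds)
```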

Create Run Begin Event Example

Here is a reference event file for a Data Flow Run begin event that completes successfully.

{
  "id": "ocid1.eventschema.oc1.phx.abyhqljrbhvyktxafsvf7p7thtdu5eqgwqnfxflwzlu52rkxpu3feb2p7zfa",
  "exampleEvent": {
    "eventType": "com.oraclecloud.dataflow.createrun.begin",
    "cloudEventsVersion": "0.1",
    "eventTypeVersion": "2.0",
    "source": "dataflow",
    "eventTime": "2022-07-18T04:01:56Z",
    "contentType": "application/json",
    "data": {
      "eventGroupingId": "unique_id",
      "eventName": "CreateRun",
      "compartmentId": "ocid1.compartment.oc1..unique_id",
      "compartmentName": "example_compartment",
      "resourceName": "example_run",
      "resourceId": "ocid1.dataflowrun.oc1.phx.unique_id",
      "availabilityDomain": "availability_domain",
      "definedTags": {
        "Oracle-Tags": {
          "CreatedBy": "unique_id",
          "CreatedOn": "2022-07-18T04:01:55.278Z"
        }
      },
      "request": {
        "id": "unique_id",
        "path": "/latest/runs",
        "action": "POST",
        "parameters": {},
        "headers": {}
      },
      "response": {
        "status": "200",
        "responseTime": "2022-07-18T04:01:56Z",
        "headers": {},
        "payload": {},
        "message": "example_run CreateRun succeeded"
      }
    },
    "eventID": "unique_id",
    "extensions": {
      "compartmentId": "ocid1.compartment.oc1..unique_id"
    }
  },
  "serviceName": "Data Flow",
  "displayName": "Run - Begin",
  "additionalDetails": [],
  "timeCreated": "2022-07-18T04:01:56Z"
}

Create Run-End Event Example

Here is a reference event file for a Data Flow Run end event that completes successfully.

{
  "id": "ocid1.eventschema.oc1.phx.abyhqljriljgnkdbqfuagrwc5h57kc2cpwphgcxpxkgqp6mnarjjo3zvhy7q",
  "exampleEvent": {
    "eventType": "com.oraclecloud.dataflow.createrun.end",
    "cloudEventsVersion": "0.1",
    "eventTypeVersion": "2.0",
    "source": "dataflow",
    "eventTime": "2022-07-18T04:06:11Z",
    "contentType": "application/json",
    "data": {
      "eventGroupingId": "unique_id",
      "eventName": "CreateRun",
      "compartmentId": "ocid1.compartment.oc1..unique_id",
      "compartmentName": "example_compartment",
      "resourceName": "example_run",
      "resourceId": "ocid1.dataflowrun.oc1.phx.unique_id",
      "availabilityDomain": "availability_domain",
      "request": {},
      "response": {
        "status": "204",
        "responseTime": "2022-07-18T04:06:11Z",
        "message": "example_run CreateRun succeeded"
      },
      "additionalDetails": {
        "lifecycleState": "SUCCEEDED | CANCELED | FAILED | STOPPED",
        "type": "BATCH | STREAMING | SESSION",
        "language": "JAVA | SCALA | PYTHON | SQL",
        "sparkVersion": "3.2.1 | 3.0.2 | 2.4.4",
        "applicationId": "ocid1.dataflowapplication.oc1.phx.unique_id",
        "tenantId": "ocid1.tenancy.oc1..unique_id"
      }
    },
    "eventID": "unique_id",
    "extensions": {
      "compartmentId": "ocid1.compartment.oc1..unique_id"
    }
  },
  "serviceName": "Data Flow",
  "displayName": "Run - End",
  "additionalDetails": [
     { "name": "lifecycleState", "type": "string"},
     { "name": "type", "type": "string"},
     { "name": "language", "type": "string"},
     { "name": "sparkVersion", "type": "string"},
     { "name": "applicationId", "type": "string"},
     { "name": "tenantId", "type": "string"}
  ],
  "timeCreated": "2022-07-18T04:06:11Z"
}

The Data Flow Run-End event is created when the Data Flow Run reaches a terminal state of SUCCEEDED, CANCELED, FAILED, or STOPPED. The Run-End event has the following extra fields on which the Events service can create rule filters:

lifecycleState is the Data Flow run lifecycle state.
type is the Data Flow run type.
language is the corresponding Spark code language.
sparkVersion is the Data Flow run Spark version used.
applicationId is the OCID of the corresponding Data Flow application for the Data Flow run.
tenantId is the OCID of the tenant that submitted the run.

The possible values for these fields are as follows:

"additionalDetails": {
      "lifecycleState": "SUCCEEDED | CANCELED | FAILED | STOPPED",
      "type": "BATCH | STREAMING | SESSION",
      "language": "JAVA | SCALA | PYTHON | SQL",
      "sparkVersion": "3.2.1 | 3.0.2 | 2.4.4",
      "applicationId": "ocid1.dataflowapplication.oc1.phx.unique_id",
      "tenantId": "ocid1.tenancy.oc1..unique_id"
}
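To illustrate how these extra fields could drive filtering, here is a small Python sketch that checks a Run-End event against a set of allowed values per field. It is a local approximation for illustration only and does not reproduce the Events service rule syntax. It assumes `additionalDetails` sits inside `data`, as in the Run-End reference event, and the function name `run_end_matches` is hypothetical.

```python
RUN_END_EVENT_TYPE = "com.oraclecloud.dataflow.createrun.end"


def run_end_matches(event: dict, conditions: dict) -> bool:
    """Return True if the event is a Run-End event and every listed field
    in data.additionalDetails has one of the allowed values."""
    if event.get("eventType") != RUN_END_EVENT_TYPE:
        return False
    details = event.get("data", {}).get("additionalDetails", {})
    return all(details.get(field) in allowed for field, allowed in conditions.items())


failed_python_run = {
    "eventType": RUN_END_EVENT_TYPE,
    "data": {
        "additionalDetails": {
            "lifecycleState": "FAILED",
            "type": "BATCH",
            "language": "PYTHON",
            "sparkVersion": "3.2.1",
        }
    },
}

# Match unsuccessful terminal states for Python runs only.
conditions = {"lifecycleState": {"FAILED", "CANCELED"}, "language": {"PYTHON"}}
print(run_end_matches(failed_python_run, conditions))  # True
```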

Data Flow Metrics

Learn about the Spark-related metrics available from the oci_dataflow metric namespace.

Metrics Overview

The Data Flow metrics help you monitor the number of tasks that completed or failed and the amount of data involved. They are free service metrics and are available from Service Metrics or Metrics Explorer. See Viewing the Metrics for more information.

Terminology

These terms help you understand what is available with Data Flow metrics.

Namespace: A namespace is a container for Data Flow metrics. The namespace identifies the service sending the metrics. The namespace for Data Flow is oci_dataflow.

Metrics:

Metrics are the fundamental concept in telemetry and monitoring. Metrics define a time-series set of data points. Each metric is uniquely defined by:

namespace
metric name
compartment identifier
a set of one or more dimensions
a unit of measure

Each data point has a timestamp, a value, and a count associated with it.
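The identity of a metric and the shape of a data point can be modeled as two small immutable records. The following Python sketch is only an illustration of the definitions above. The class names, the `make_key` helper, and the `percent` unit string are assumptions, not types or values from an Oracle SDK.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MetricKey:
    """The five parts that uniquely define a metric."""
    namespace: str
    name: str
    compartment_id: str
    dimensions: frozenset  # set of (key, value) pairs
    unit: str


@dataclass(frozen=True)
class DataPoint:
    timestamp: datetime
    value: float
    count: int


def make_key(name: str, compartment_id: str, unit: str, **dimensions: str) -> MetricKey:
    return MetricKey("oci_dataflow", name, compartment_id, frozenset(dimensions.items()), unit)


run_id = "ocid1.dataflowrun.oc1.phx.unique_id"
compartment = "ocid1.compartment.oc1..unique_id"
driver = make_key("CpuUtilization", compartment, "percent", resourceId=run_id, executorId="driver")
executor = make_key("CpuUtilization", compartment, "percent", resourceId=run_id, executorId="1")

# Same metric name, different dimensions: two separate time series.
series = {
    driver: [DataPoint(datetime(2022, 7, 18, 4, 2, tzinfo=timezone.utc), 41.5, 1)],
    executor: [],
}
print(len(series))  # 2
```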

Dimensions:

A dimension is a key-value pair that defines the characteristics associated with the metric. Data Flow has five dimensions:

resourceId: The OCID of a Data Flow Run instance.
resourceName: The name you've given the Run resource. It's not guaranteed to be unique.
applicationId: The OCID of a Data Flow Application instance.
applicationName: The name you've given the Application resource. It's not guaranteed to be unique or final.
executorId: A Spark cluster consists of a driver and one or more executors. The driver has executorId = driver; the executors have executorId = 1, 2, 3, ... n.

Statistics: Statistics are metric data aggregations over specified periods of time. Aggregations are done using the namespace, metric name, dimensions, and the data point unit of measure within a specified time period.
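As a rough illustration of aggregating over fixed periods, the following Python sketch groups the data points of a single metric stream into windows and reduces each window with a chosen statistic. It is not how the Monitoring service is implemented. The function name `aggregate`, the epoch-second timestamps, and the sample values are all assumptions.

```python
from collections import defaultdict
from statistics import mean

REDUCERS = {"mean": mean, "sum": sum, "count": len, "max": max, "min": min}


def aggregate(points, period_seconds: int, statistic: str) -> dict:
    """Group (epoch_seconds, value) pairs into fixed windows and reduce each window.

    Returns a dict mapping each window start time to the aggregated value.
    """
    reducer = REDUCERS[statistic]
    buckets = defaultdict(list)
    for timestamp, value in points:
        window_start = timestamp - (timestamp % period_seconds)
        buckets[window_start].append(value)
    return {start: reducer(values) for start, values in sorted(buckets.items())}


# Hypothetical CPU utilization samples for one executor, 30 seconds apart.
points = [(0, 10.0), (30, 30.0), (60, 50.0), (90, 70.0), (120, 20.0)]
print(aggregate(points, 60, "mean"))  # {0: 20.0, 60: 60.0, 120: 20.0}
print(aggregate(points, 60, "sum"))   # {0: 40.0, 60: 120.0, 120: 20.0}
```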

Alarms: Alarms are used to automate operations monitoring and performance. An alarm tracks changes that occur over a specific period of time and performs one or more defined actions, based on the rules defined for the metric.

Prerequisites

To monitor resources in Data Flow, you must be given the required type of access in a policy written by an administrator.

The policy must give you access to the monitoring services and the resources being monitored. This applies whether you're using the Console or the REST API with an SDK, CLI, or another tool. If you try to perform an action, and get a message that you don't have permission or are unauthorized, confirm with your administrator the type of access you have been granted and which compartment to work in. For more information on user authorizations for monitoring, see the Authentication and Authorization section for the related service: Monitoring or Notifications.

Available Metrics

Here are the metrics available for Data Flow. The control plane metrics are listed first, then the data plane metrics.

Control Plane Metrics
Metric Name	Display Name	Dimensions	Statistic	Description
`RunTotalStartUpTime`	Run Startup Time	`resourceId` `resourceName` `applicationId` `applicationName`	Mean	The overall startup time for a run contains timings for resource assignment and Spark job startup, and the time it waits in various queues internal to the service.
`RunExecutionTime`	Run Execution Time	`resourceId` `resourceName` `applicationId` `applicationName`	Mean	The amount of time it takes to complete a run, from the time it's started until the time it completes.
`RunTotalTime`	Total Run Time	`resourceId` `resourceName` `applicationId` `applicationName`	Mean	The sum of the Run startup time and Run Execution Time.
`RunSucceeded`	Run Succeeded	`resourceId` `resourceName` `applicationId` `applicationName`	Count	Whether the run finished successfully.
`RunFailed`	Run Failed	`resourceId` `resourceName` `applicationId` `applicationName`	Count	Whether the run failed to start.

Data Plane Metrics
Metric Name	Display Name	Dimensions	Statistic	Description
`CpuUtilization`	CPU Utilization	`resourceId` `resourceName` `applicationId` `applicationName` `executorId`	Percent	The CPU utilization by the container allocated to the driver or executor as a percentage.
`DiskReadBytes`	Disk Read Bytes	`resourceId` `resourceName` `applicationId` `applicationName` `executorId`	Sum	The number of bytes read from all block devices by the container allocated to the driver or executor in a given time interval.
`DiskWriteBytes`	Disk Write Bytes	`resourceId` `resourceName` `applicationId` `applicationName` `executorId`	Sum	The number of bytes written to all block devices by the container allocated to the driver or executor in a given time interval.
`FileSystemUtilization`	File System Utilization	`resourceId` `resourceName` `applicationId` `applicationName` `executorId`	Percent	The file system usage by the container allocated to the driver or executor as a percentage.
`GcCpuUtilization`	GC CPU Utilization	`resourceId` `resourceName` `applicationId` `applicationName` `executorId`	Percent	The CPU usage by the Java Garbage Collector of the driver or executor as a percentage.
`MemoryUtilization`	Memory Utilization	`resourceId` `resourceName` `applicationId` `applicationName` `executorId`	Percent	The memory usage by the container allocated to the driver or executor as a percentage.
`NetworkReceiveBytes`	Network Receive Bytes	`resourceId` `resourceName` `applicationId` `applicationName` `executorId`	Sum	The number of bytes received from the network interface by the container allocated to the driver or executor in a given time interval.
`NetworkTransmitBytes`	Network Transmit Bytes	`resourceId` `resourceName` `applicationId` `applicationName` `executorId`	Sum	The number of bytes transmitted from the network interface by the container allocated to the driver or executor in a given time interval.

Viewing the Metrics

You can view Data Flow metrics in various ways.

From the Console, select the navigation menu, select Observability & Management, and under Monitoring, select Service Metrics. See Overview of Monitoring for how to use these metrics.
From the Console, select the navigation menu, select Observability & Management, and under Monitoring, select Metrics Explorer. See Overview of Monitoring for how to use these metrics.
From the Console, select the navigation menu, select Data Flow, and select Runs. Under Resources, select Metrics, and you see the metrics specific to this Run. Set the Start time and End time as appropriate, or a time period from Quick Selects. For each chart, you can specify an Interval and the Options as to how to display each metric.
From the Console, select the navigation menu, select Data Flow, and select Applications. You see the metrics specific to the Runs of this Application. Set the Start time and End time as appropriate, or a time period from Quick Selects. For each chart, you can specify an Interval and a Statistic, and the Options as to how to display each metric.
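In Metrics Explorer, metrics can also be queried with a Monitoring Query Language (MQL) expression. The following Python sketch assembles such an expression as a string, assuming the general MQL form `metric[interval]{dimension = "value"}.statistic()`. The `build_query` helper is hypothetical, and the query should be checked against the Monitoring documentation before use.

```python
def build_query(metric: str, interval: str, statistic: str, **dimensions: str) -> str:
    """Assemble an MQL-style query string from a metric name, an interval,
    a statistic, and optional dimension filters."""
    selector = ""
    if dimensions:
        pairs = ", ".join(f'{key} = "{value}"' for key, value in dimensions.items())
        selector = "{" + pairs + "}"
    return f"{metric}[{interval}]{selector}.{statistic}()"


# Mean CPU utilization of the driver for one Run, at one-minute resolution.
query = build_query(
    "CpuUtilization",
    "1m",
    "mean",
    resourceId="ocid1.dataflowrun.oc1.phx.unique_id",
    executorId="driver",
)
print(query)
```

The query would be run against the oci_dataflow namespace in the compartment that holds the Run.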
