Publishing Design Tasks

In Oracle Cloud Infrastructure Data Integration, a task is a design-time resource that specifies a set of actions to perform on data. You create tasks from a project details or folder details page. You then publish the tasks into an Application to test or roll out into production.

The Tasks section of a project or folder details page lists the tasks created in that project or folder. To filter the list by task name, enter the complete name of the task in the search field to run a full-text search.

The Actions menu for a task has these publish options:

  • Publish to application: Lets you publish the task to an Application in OCI Data Integration.
  • Publish to OCI Data Flow: Lets you publish the task to an application in OCI Data Flow (for integration and data loader tasks only).
  • View OCI Data Flow publish history: Lets you view the current status as well as the history of publishing the task from the Data Integration service to the Data Flow service (for integration and data loader tasks only).

To publish tasks, see the following sections:

  • Publishing a Task to a Data Integration Application
  • Publishing a Task to OCI Data Flow

When you publish a single task or a group of tasks at the same time, one patch entry is created in the Application. Published tasks are listed in the Tasks section of the Application details page.

Publishing a Task to a Data Integration Application

Before you can run a task in Oracle Cloud Infrastructure Data Integration, you must first publish the task to an Application. You use an Application to run tasks for testing or to roll out tasks into production.

Data Integration includes the Default Application that you can publish to. If you want to create your own Application to publish to, see Creating Applications.

You can publish a single task, or you can select a group of tasks to publish at the same time.

To publish one or more tasks to an application in the Data Integration service:

  1. On the project or folder details page where the task is saved, click Tasks.
  2. In the tasks list, do one of the following:
    • To publish one task, select Publish to application from the Actions menu for the task.
    • To publish one or more tasks, use the check boxes to select the tasks, and then click Publish to application.
  3. In the Publish to application dialog, use the menu to select the Application to publish to.

    You cannot publish to a target Application that was created as a copy of existing resources in another (source) Application. You can only sync the target Application.

  4. Click Publish.
A notification message displays, with a link to the Application to view the published task. When you publish a single task or a group of tasks at the same time, one patch entry is created in the Application. Learn more about Applications and patches.
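
For readers who script this step instead of using the console, the following minimal sketch shows how a single publish request might group several tasks into one patch, mirroring the one-patch-per-publish behavior described above. The field names (name, identifier, patchType, objectKeys) are assumptions modeled on the service's create-patch request body and should be verified against the API reference. The helper name, the identifier convention, and the task keys are hypothetical.

```python
import re


def build_publish_patch(name, task_keys):
    """Return a dict describing one PUBLISH patch that covers every given task."""
    if not task_keys:
        raise ValueError("select at least one task to publish")
    # Derive an upper-case identifier from the name (assumed convention).
    identifier = re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_").upper()
    return {
        "name": name,
        "identifier": identifier,
        "patchType": "PUBLISH",
        # dict.fromkeys drops duplicate keys while keeping selection order.
        "objectKeys": list(dict.fromkeys(task_keys)),
    }


# Hypothetical task keys; one patch covers the whole selection.
patch = build_publish_patch("Nightly load", ["task-key-1", "task-key-2", "task-key-1"])
print(patch)
```

Whether one task or many are selected, the sketch produces exactly one patch body, which matches how the console reports a group publish.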

Publishing a Task to OCI Data Flow

When you publish a task to Oracle Cloud Infrastructure Data Flow, a JAR file is created in OCI Object Storage, and an application that points to the JAR file is created in the Data Flow service.

Only integration and data loader tasks can be published to OCI Data Flow.

After publishing, you can run the application in OCI Data Flow, where you can choose compute shapes, and monitor and diagnose data flow runs. If the task has assigned parameters, the Data Flow application is created using the default parameter values. You won't be able to enter parameter values when you run the application in OCI Data Flow.

Before you publish a task to OCI Data Flow, ensure that you have the following:

  • An Object Storage data asset to publish the executables to

  • A bucket in Object Storage for the JAR

  • The relevant permissions and IAM policies to create applications in OCI Data Flow:

    allow any-user to manage dataflow-application in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
    allow any-user to manage dataflow-run in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
    allow group <group-name> to read dataflow-application in compartment <compartment-name>
    allow group <group-name> to manage dataflow-run in compartment <compartment-name>
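
The four statements above differ only in the compartment name, workspace OCID, and group name. As an illustration, the sketch below fills in those placeholders so the statements can be pasted into a policy. The function name and the sample values are hypothetical; the statement text is copied from the list above.

```python
def data_flow_publish_policies(compartment, workspace_ocid, group):
    """Render the policy statements needed to publish tasks to OCI Data Flow."""
    # Condition that restricts the any-user statements to one workspace.
    workspace_condition = (
        "where ALL {request.principal.type = 'disworkspace', "
        f"request.principal.id = '{workspace_ocid}'}}"
    )
    return [
        f"allow any-user to manage dataflow-application in compartment {compartment} {workspace_condition}",
        f"allow any-user to manage dataflow-run in compartment {compartment} {workspace_condition}",
        f"allow group {group} to read dataflow-application in compartment {compartment}",
        f"allow group {group} to manage dataflow-run in compartment {compartment}",
    ]


# Sample values are placeholders, not real resources.
statements = data_flow_publish_policies(
    "data-eng", "ocid1.disworkspace.oc1..example", "di-publishers"
)
for statement in statements:
    print(statement)
```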

By default, the OCI Data Flow application is created with public internet access. You can choose to publish using a private endpoint in OCI Data Flow. For example, if your tasks use data sources that are hosted in private networks, you can publish to OCI Data Flow using a private endpoint.

To publish to OCI Data Flow using a private endpoint, ensure that you also have the following:

  • An existing private endpoint in OCI Data Flow for the application to use. See Creating a Private Endpoint.

    For required policies to create private endpoints in OCI Data Flow, see Private Endpoint Policies.

  • The policy that enables a group of users who are not administrators to list existing private endpoints at the compartment level (when publishing to OCI Data Flow from OCI Data Integration):

    allow group <group-name> to inspect dataflow-private-endpoint in compartment <compartment-name>

  • Data assets, used in the task you're publishing, that are:
    • Configured to use OCI Vault secrets that contain the passwords to connect to the data sources. This is required for passing credentials securely across OCI services. See OCI Vault Secrets and Oracle Wallets.
    • Specified using the Fully Qualified Domain Name (FQDN) for the database hosts. OCI Data Flow does not allow connections through direct IP addresses.
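
Because OCI Data Flow rejects direct IP addresses, it can help to scan the database hosts of the data assets before publishing. The following sketch flags hosts written as IPv4 or IPv6 literals. It is only a pre-check under stated assumptions: the function names and host values are hypothetical, and the check does not confirm that a name is fully qualified or resolvable.

```python
import ipaddress


def is_ip_literal(host):
    """Return True when host is an IPv4 or IPv6 address rather than a name."""
    try:
        # Strip the brackets that sometimes wrap IPv6 literals.
        ipaddress.ip_address(host.strip("[]"))
        return True
    except ValueError:
        return False


def hosts_needing_fqdn(hosts):
    """Return the hosts that must be replaced with an FQDN before publishing."""
    return [host for host in hosts if is_ip_literal(host)]


# Hypothetical host values taken from data asset definitions.
flagged = hosts_needing_fqdn(["db1.example.internal", "10.0.0.12", "[2001:db8::1]"])
print(flagged)
```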

Publishing a Task to OCI Data Flow

To publish a task to an application in OCI Data Flow:

  1. On the project or folder details page where the task is saved, click Tasks.
  2. In the tasks list, select Publish to OCI Data Flow from the Actions menu for the task you want to publish.
    Only integration and data loader tasks can be published to OCI Data Flow.
  3. On the Publish to OCI Data Flow page, complete the Application information section:
    1. Select the compartment in which to create the Data Flow application.
    2. Enter a name and description (optional) for the application.
  4. Complete the Resource configuration section:
    1. Select a VM shape for the Spark driver host.
    2. Select a shape for the Spark executor hosts.
    3. Enter the number of executors to launch when the Data Flow application is run.
  5. Complete the Object Storage file configuration section:
    1. Select the Object Storage data asset to use.
    2. Select a connection to the data asset you selected.
    3. Select the compartment that has the Object Storage bucket you want to use.
    4. Select the bucket to upload the JAR to.
  6. (Optional) Click Show advanced options and select Secure access to private subnet. Then select a private endpoint in OCI Data Flow to use for this application.
  7. Click Validate task to check the configuration of the task for any errors that may cause the publish to fail.
  8. Click Publish.
    A notification message displays, with a View Publish Status link for the task.
  9. Click the status link to monitor the publish status on the Data Flow Publish History page.
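
The values collected in steps 3 through 6 can be thought of as one settings record. The sketch below models that record and reports missing values before a publish is attempted, similar in spirit to Validate task but much narrower. The class, its field names, and the rule that at least one executor is required are assumptions made for illustration; they are not the service's API.

```python
from dataclasses import dataclass

# Text fields that the console steps above ask for.
REQUIRED_TEXT_FIELDS = (
    "compartment", "application_name", "driver_shape", "executor_shape",
    "data_asset", "connection", "bucket_compartment", "bucket",
)


@dataclass
class DataFlowPublishSettings:
    compartment: str          # Application information
    application_name: str
    driver_shape: str         # Resource configuration
    executor_shape: str
    executor_count: int
    data_asset: str           # Object Storage file configuration
    connection: str
    bucket_compartment: str
    bucket: str
    description: str = ""     # optional
    private_endpoint: str = ""  # optional, advanced options

    def problems(self):
        """Return a list of human-readable problems; empty means ready."""
        issues = [
            f"{name} is required"
            for name in REQUIRED_TEXT_FIELDS
            if not getattr(self, name).strip()
        ]
        if self.executor_count < 1:  # assumed minimum
            issues.append("executor_count must be at least 1")
        return issues


# Placeholder values; the bucket and executor count are deliberately invalid.
settings = DataFlowPublishSettings(
    compartment="example-compartment",
    application_name="example-app",
    driver_shape="example-driver-shape",
    executor_shape="example-executor-shape",
    executor_count=0,
    data_asset="example-object-storage-asset",
    connection="example-connection",
    bucket_compartment="example-compartment",
    bucket="",
)
issues = settings.problems()
print(issues)
```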

Viewing the OCI Data Flow Publish History

The Oracle Cloud Infrastructure Data Flow Publish History page has a table that shows the current status as well as the history of publishing a task from the Data Integration service to the Data Flow service. The table is empty if the task has never been published to OCI Data Flow.

The time when the publish history table was last refreshed is shown next to Refresh.

If the table displays many publish entries, you can do a full text search by entering the application name in the filter by name field.

For each publish run, you can view:

  • The application name provided in the initial publish to the Data Flow service. If the application name is later changed in Data Flow, the new name is not reflected in this table.
  • The status:
    • Publishing: Indicates publishing is in progress. Click Refresh periodically to get the latest status.
    • Successful: Indicates the publish completed successfully with no errors.
    • Failed: Indicates the publish could not be completed. Open the tooltip next to the status for more information.
  • Who initiated the publish
  • When the publish was last updated
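
Clicking Refresh until the status leaves Publishing is a polling pattern. A minimal sketch of the same idea in a script follows. The fetch_status callable is a hypothetical stand-in for whatever call returns the current status, and the status strings are the console labels listed above; the values returned by the API may be spelled differently.

```python
import time


def wait_for_publish(fetch_status, interval_seconds=10.0, max_checks=30, sleep=time.sleep):
    """Poll until the publish is no longer in progress or checks run out."""
    for attempt in range(max_checks):
        status = fetch_status()
        if status != "Publishing":
            return status  # "Successful" or "Failed"
        if attempt < max_checks - 1:
            sleep(interval_seconds)
    # Still in progress after the last check.
    return "Publishing"


# Simulated status sequence; sleeping is disabled so the example runs instantly.
responses = iter(["Publishing", "Publishing", "Successful"])
final_status = wait_for_publish(lambda: next(responses), sleep=lambda seconds: None)
print(final_status)
```

Passing the sleep function as a parameter keeps the helper easy to test without real delays.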

The Actions menu for a publish has these options:

  • View in OCI Data Flow: Go to the associated application in OCI Data Flow.
  • Republish: Republish the task to the same application in OCI Data Flow.

If the options in the Actions menu for a publish are disabled, this means the application that's associated with that publish has been deleted in OCI Data Flow.

To view the OCI Data Flow publish history for a task:

  1. On the project or folder details page where the task is saved, select Tasks.
  2. In the tasks list, select View OCI Data Flow publish history from the Actions menu for the task you want to view.
    The Oracle Cloud Infrastructure Data Flow Publish History page displays, with a table showing the history of publishes.
  3. If necessary, click Refresh to get the latest status.

Viewing the OCI Data Flow Application

After publishing a task to an application in OCI Data Flow, you can navigate to the Data Flow application from OCI Data Integration.

To view the Data Flow application for a published task:

  1. On the project or folder details page where the task is saved, select Tasks.
  2. In the tasks list, select View OCI Data Flow publish history from the Actions menu for the task you want to view.
  3. On the Oracle Cloud Infrastructure Data Flow Publish History page, locate the successful publish associated with the task and select View in OCI Data Flow in the Actions menu.
A browser window displays showing the Application Details page of the Data Flow application that was created for the published task. You can run and delete this application just like any other OCI Data Flow application.

Republishing to OCI Data Flow

If you make changes to a task after publishing it to an application in OCI Data Flow, you can republish the task to the Data Flow service.

You can also republish a task if the initial publish to OCI Data Flow was not successful.

To republish a task to OCI Data Flow:

  1. On the project or folder details page where the task is saved, select Tasks.
  2. In the tasks list, select View OCI Data Flow publish history in the Actions menu for the task you want to republish.
    Only integration and data loader tasks can be published to OCI Data Flow.
  3. On the Oracle Cloud Infrastructure Data Flow Publish History page, locate the successful or failed publish associated with the task and select Republish in the Actions menu.
  4. In the Confirm Republish dialog, click Confirm.
    A notification message displays, with a Refresh Publish Status link for the task.
  5. Click Refresh to refresh the publish history table.
    The Updated column shows the date and time of the republish.

Using the API

For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.

Use the following operations in Oracle Cloud Infrastructure Data Integration to manage external publishes to OCI Data Flow: