Working with Data Loader Tasks

A data loader task lets you take data from a source and load it into a target. Data loader tasks are essential for data preparation, data migration, or loading diverse data into data lakes or data warehouses.

In Data Integration, you can use a data loader task to perform 1-to-1 or n-to-n loading of data from one system type into another, applying any number of data transformations, or none, before loading. When you create a data loader task, Data Integration guides you through selecting the source and target entity or entities, applying transformations, and validating the task. For the target, you can create the entity or entities before loading, or select existing entities to load the data into. For both source and target, you can use parameters to specify the resources and reuse parameters, as described in Parameters for Source and Target.

Parts of a Data Loader Task

Configuring a task to load data from a source to a target involves several steps.

  • Basic information and Load type: Choose the type of the source data entity and target data entity, and the load type.

    For the types of Database, File storage, and SaaS applications data assets that you can use as the source and target data entities, see Supported Source and Target Types.

    For load type, the source data to be loaded can come from Multiple data entities in a schema, or from a Single data entity. For example, data in two or more entities from an Oracle Database source can be loaded to an Object Storage target.

  • Source: Select the data asset, connection, and schema that has the source data for loading. Then, depending on the load type you specified, select one or more data entities to add to the source for loading. See Selecting the Source.

    To parameterize a resource in the source, see Parameters for Source and Target.

  • Target: Select the data asset, connection, schema, and data entity to use as the target. By default, the source and target entities are mapped by name. If you don't have an existing entity to load to, you can create a new data entity. See Selecting the Target.

    To parameterize a resource in the target, see Parameters for Source and Target.

  • Transformation: Use the interactive tabs to apply transformations on the source attributes. A data loader task supports transformations at the metadata and data levels. See Applying Transformations.
  • Attribute mapping: When loading data to an existing target data entity or multiple entities, by default the source attributes are mapped to the target attributes by attribute name. You can apply more mapping rules to all attributes across all mapped entities. See Mapping Attributes.
  • Review and validate: Review and amend any of the configuration steps, and ensure that the data loader task is valid before you publish. See Reviewing and Validating the Task.

Parameters for Source and Target

By using parameters for the source or target, you have the flexibility to use the same data loader task for different data sources or data targets at design time or runtime.
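As a rough illustration of that flexibility, the sketch below models a task whose resources are bound to parameters with design-time default values, which a particular run can override. It is a conceptual model only, not a Data Integration API: the function `resolve_bindings` and all resource values (such as `oracle_db_dev` and `landing-bucket`) are hypothetical.

```python
# Conceptual model only. The parameter names follow the names that this page
# documents; every value and the helper function are hypothetical.
DESIGN_TIME_DEFAULTS = {
    "SOURCE_DATA_ASSET": "oracle_db_dev",
    "SOURCE_CONNECTION": "dev_conn",
    "SOURCE_SCHEMA": "SALES",
    "TARGET_DATA_ASSET": "object_storage_dev",
    "TARGET_CONNECTION": "os_conn",
    "TARGET_SCHEMA": "landing-bucket",
}


def resolve_bindings(defaults, overrides=None):
    """Return the values one run would use: defaults, then runtime overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Only parameters that exist on the task can be overridden.
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**defaults, **overrides}


# Same task definition, two different runs.
dev_run = resolve_bindings(DESIGN_TIME_DEFAULTS)
prod_run = resolve_bindings(
    DESIGN_TIME_DEFAULTS,
    {"SOURCE_SCHEMA": "SALES_PROD", "TARGET_SCHEMA": "prod-bucket"},
)
```

In this model, `dev_run` uses the design-time defaults unchanged, while `prod_run` swaps only the two schema values and keeps every other binding.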

You can use a parameter for each of the following resources, in both single data entity load type and multiple data entities load type:

  • Source data asset, connection, and schema or bucket
  • Target data asset, connection, and schema or bucket

Note that when the data asset type is Object Storage, you can parameterize the bucket (schema) but not the compartment that contains the bucket.

For the data entity resource, you can parameterize the source entity or target entity under the following conditions only:

  • When using the single data entity load type
  • When creating a new data entity on the target by entering an entity name
  • When using an existing data entity on the target

When you parameterize a resource on the source and target, Data Integration automatically adds and uses the following parameter names:

Resource            Source parameter name    Target parameter name
Data asset          SOURCE_DATA_ASSET        TARGET_DATA_ASSET
Connection          SOURCE_CONNECTION        TARGET_CONNECTION
Schema or bucket    SOURCE_SCHEMA            TARGET_SCHEMA
Data entity         SOURCE_DATA_ENTITY       TARGET_DATA_ENTITY
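The naming pattern in the table above can be summarized as a side prefix (SOURCE or TARGET) joined to a resource suffix. The following is a minimal sketch of that pattern for use in your own scripts or checks. The function `default_parameter_name` and the `RESOURCE_SUFFIX` mapping are hypothetical helpers, not part of Data Integration.

```python
# Hypothetical helper that reproduces the documented default names.
RESOURCE_SUFFIX = {
    "data asset": "DATA_ASSET",
    "connection": "CONNECTION",
    "schema or bucket": "SCHEMA",  # buckets also use the SCHEMA suffix
    "data entity": "DATA_ENTITY",
}


def default_parameter_name(side, resource):
    """Build the default parameter name for a resource on the given side."""
    side = side.upper()
    if side not in ("SOURCE", "TARGET"):
        raise ValueError("side must be 'source' or 'target'")
    return f"{side}_{RESOURCE_SUFFIX[resource.lower()]}"
```

For example, `default_parameter_name("target", "Schema or bucket")` returns `TARGET_SCHEMA`, which matches the table.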

To parameterize a resource on the target, see also Reusing Parameters for Source or Target Resources.

Adding, Editing, and Removing Parameters

You manage parameters on the Source step and the Target step when you create the data loader task.

After you select a source data asset, connection, schema, or data entity, you can assign a parameter to a resource by clicking Parameterize that's next to the resource.

Similarly for target resources, after you select a target data asset, connection, schema, or data entity, you can assign a parameter to a resource by clicking Parameterize that's next to the resource.

To parameterize a resource on the source or target, see also Reusing Parameters for Source or Target Resources.

After parameters are added, you can edit a parameter name and add a description. See Editing a Resource Parameter.

To remove a parameter that's assigned to a resource, see Removing a Resource Parameter.

Reusing Parameters for Source or Target Resources

Suppose you have parameterized the source data asset, connection, schema, or data entity in your data loader task. To parameterize the target resources, instead of clicking Parameterize, you can click Reuse source <resource type> parameter to use the same parameters as those that have been added to the resources of the same type on the source.

Similarly, if you have parameterized the target resources, and you want to use the target parameters for the resources of the same type on the source, you can click Reuse target <resource type> parameter to parameterize the source data asset, connection, schema, or data entity.

When you parameterize a source or target resource by reusing the resource parameter of the same type that's on the target or source, Data Integration does not create a new parameter. Instead, only one parameter is used for a resource type that's on the source and the target.

For example, if an Object Storage source data asset is parameterized, then the Object Storage target data asset parameter name is SOURCE_DATA_ASSET when you reuse the source parameter for the target data asset, as shown in the following table:

Target resource     Target parameter name when reusing source parameter of same type
Data asset          SOURCE_DATA_ASSET
Connection          SOURCE_CONNECTION
Schema or bucket    SOURCE_SCHEMA
Data entity         SOURCE_DATA_ENTITY

Similarly, if an Object Storage target resource such as the connection is parameterized, then the Object Storage source connection parameter is TARGET_CONNECTION when you reuse the target parameter for the source connection.

Source resource     Source parameter name when reusing target parameter of same type
Data asset          TARGET_DATA_ASSET
Connection          TARGET_CONNECTION
Schema or bucket    TARGET_SCHEMA
Data entity         TARGET_DATA_ENTITY
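The reuse behavior described in this section, where no new parameter is created and one parameter serves both sides, can be modeled with a small sketch. This is an illustrative model only, assuming a task simply tracks which parameter each resource points to. The class `TaskParameters` and its methods are hypothetical and do not correspond to a Data Integration API.

```python
class TaskParameters:
    """Hypothetical model of the parameters on one data loader task."""

    def __init__(self):
        self.parameters = set()  # parameter names defined on the task
        self.bindings = {}       # (side, resource) -> parameter name

    def parameterize(self, side, resource):
        """Model clicking Parameterize: add a parameter named after the side."""
        name = f"{side}_{resource}"
        self.parameters.add(name)
        self.bindings[(side, resource)] = name
        return name

    def reuse(self, side, resource):
        """Model clicking Reuse: point at the other side's parameter."""
        other = "TARGET" if side == "SOURCE" else "SOURCE"
        name = self.bindings.get((other, resource))
        if name is None:
            raise LookupError(f"{other} {resource} is not parameterized")
        self.bindings[(side, resource)] = name  # no new parameter is added
        return name


params = TaskParameters()
params.parameterize("SOURCE", "DATA_ASSET")
reused_name = params.reuse("TARGET", "DATA_ASSET")
```

Here `reused_name` is `SOURCE_DATA_ASSET`, and `params.parameters` still holds a single entry, mirroring the statement that only one parameter is used for a resource type on both the source and the target.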