Load Data into Autonomous Data Warehouse Using a Data Loader Task
A data loader task helps you transform and load data into data lakes or other targets such as Oracle Autonomous Data Warehouse.
The source data for loading can come from one data entity or multiple data entities in a schema.
In this tutorial, you:
- Create a data loader task to transform and load data from a single Object Storage data entity into Autonomous Data Warehouse.
- Publish the data loader task to an Application.
- Run the data loader task.
Before You Begin
To complete this tutorial, you must have the following:
- Access to a Data Integration workspace, and relevant policies created. See Connect to Data Integration and Connect to Data Sources.
- Sample data uploaded. Data assets created for Object Storage source and Autonomous Data Warehouse target. See Connect to Data Sources.
- A project created.
1. Creating a Data Loader Task
- In the tab bar, click Open tab (plus icon), and then select Projects.
- On the Projects page, click DI_Lab.
- On the DI_Lab project details page, click Tasks in the left submenu.
On the Tasks page, click Create task and select Data loader.
The Create data loader task page opens in a new tab. Numbered and named steps at the top guide you through the configuration. A check mark displays on a step icon after you configure the step. To move between steps, click Next or Previous. You can also navigate directly to a configured step by selecting the icon.
On the Create data loader task page, in the Basic information step, select the following:

| For this item | Select |
| --- | --- |
| Source type | File storage |
| Target type | Database |
| Load type | Single data entity |
For the task Name, enter Load Revenue Data into Data Warehouse. Then click Next to navigate to the next step.
A check mark displays on the Basic information step icon after you have configured the step.
On the Source step, select the following:

| For this item | Select |
| --- | --- |
| Data asset | Data_Lake |
| Connection | Default Connection |
| Compartment | The compartment that has the bucket in which you uploaded the sample data file |
| Bucket | The Object Storage bucket that contains the sample CSV file |
Under File settings, select the following:

| For this item | Select |
| --- | --- |
| File type | CSV |
| Compression type | Auto (Default) |
| Data has header | Select the check box |
| Delimiter | COMMA |
You can leave the default settings as-is in the remaining fields.
Under Available data entities, select the check box for REVENUE.CSV, and then click Set as source.
REVENUE.CSV displays next to Selected data entity.
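Conceptually, the file settings above (CSV file type, "Data has header" selected, COMMA delimiter) describe how the source file is parsed. A minimal Python sketch of the same parsing, using an illustrative snippet of data (the column names and values here are assumptions, not the actual file contents):

```python
import csv
import io

# Illustrative content standing in for REVENUE.CSV; the real file's columns may differ.
sample = "SRC_ORDER_NUMBER,REVENUE\nORD-1001,250.00\n,99.50\n"

# DictReader uses the first row as the header ("Data has header") and splits on commas.
reader = csv.DictReader(io.StringIO(sample), delimiter=",")
rows = list(reader)
print(rows[0]["SRC_ORDER_NUMBER"])  # ORD-1001
print(rows[1]["SRC_ORDER_NUMBER"])  # empty string: a missing value in the source
```

Note that a missing field arrives as an empty value, which is why the later Null Fill Up transformation is useful.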
- Click Create to save the task and continue editing.
Click Next to advance to the Target step, and then select the following:

| For this item | Select |
| --- | --- |
| Data asset | Data_Warehouse |
| Connection | Default Connection |
| Schema | BETA |
Under Staging location, you can use the default staging location that was set up when you created the target data asset.
Alternatively, you can deselect the check box to select another Object Storage bucket.
Under Target data entities load settings, click Create new data entities.
Insert is the default Integration strategy.
Under Target data entity name options, select Specify entity name. Then in the Entity name field, enter a name for the target data entity.
- Click Save to save the task and continue editing.
Click Next to navigate to the Transformation step.
The interactive Data Xplorer is displayed. On the Attributes tab, the attributes from the data entity are listed.
Under Attributes, locate SRC_ORDER_NUMBER. Then from the transformations icon (three dots) for SRC_ORDER_NUMBER, select Null Fill Up.
In the Null Fill Up dialog, do the following:
- Enter Not Available in the Replace by field.
- Do not select Keep source attributes.
- Leave the Name and remaining fields as-is.
- Click Apply.
Click the Transformations icon next to the data entity name.
The Transformations panel opens, where you can review the list of transformations that are applied to the source dataset. To remove a transformation, click the delete icon next to the transformed attribute name.
- To close the Transformations panel, click the X icon.
Next to Attributes, click Data to navigate to the Data tab and locate SRC_ORDER_NUMBER.
The null values in SRC_ORDER_NUMBER have been replaced with the string Not Available.
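The Null Fill Up transformation amounts to replacing missing values in a column with a fixed string. A minimal Python sketch of the same idea, using the column name and replacement value from the steps above (the sample values are illustrative):

```python
def null_fill_up(values, replace_by="Not Available"):
    """Replace missing (None) entries with a fixed string, like Null Fill Up."""
    return [replace_by if v is None else v for v in values]

# Illustrative column data; the middle entry simulates a null in SRC_ORDER_NUMBER.
src_order_numbers = ["ORD-1001", None, "ORD-1003"]
print(null_fill_up(src_order_numbers))  # ['ORD-1001', 'Not Available', 'ORD-1003']
```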
Click Next to navigate to the Review and validate step.
Validation of the task begins automatically.
A summary of the configuration details for each step is presented in a block. If you make changes to a step's configuration, navigate to the Review and validate step to validate the task again.
The result of the task validation is shown in the last block, Validation.
- If validation is successful, click Save and close.
2. Publishing a Data Loader Task
- On the DI_Lab project details page, click Tasks in the submenu.
In the Tasks list, locate Load Revenue Data into Data Warehouse.
- Click the actions icon (three dots), and then select Publish to application.
- In the Publish to application dialog, select Lab Application.
A notification message displays, with a link to the Application to view the published task.
Select View application in the notification. Then select X to close the notification.
The Patches list on the Application details page is displayed. When you publish a task, one patch entry is created in the Application.
On the Patches list, you can monitor the patch status. Click Refresh to get the latest status updates.
A patch is an update made to the Application. When you publish a task, a Publish patch is created. When you unpublish a task, an Unpublish patch is created. Learn more about Patches.
When the status of the patch changes to Success, a published task entry is created on the Tasks list of the Application details page.
3. Running a Data Loader Task
- In the tab bar, click Open tab (plus icon), and then select Applications.
- Click Lab Application to open the details page.
- On the Application details page, click Tasks.
On the Tasks list, select Run from the actions icon (three dots) for Load Revenue Data into Data Warehouse.
A notification message displays, and the Runs list appears. A data loader task run is created on the Runs list. Use the Runs list to view all task runs and their statuses.
On the Runs list, click Refresh to get the latest task run status updates.
Learn more about task runs.
When the run status changes to Success, you can see the values for Data read and Rows written.
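Clicking Refresh to watch the run status is a manual polling loop. The same pattern can be sketched generically in Python; the `get_status` function here is a stub standing in for a real status lookup (in practice that would come from the Data Integration API, which is not shown):

```python
import time

def poll_until_terminal(get_status, interval_s=1.0, timeout_s=30.0):
    """Poll a status-returning function until it reports a terminal state."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("Success", "Error", "Terminated"):
            return status
        time.sleep(interval_s)
    raise TimeoutError("task run did not reach a terminal state in time")

# Stubbed sequence of statuses a task run might report over successive refreshes.
statuses = iter(["Not Started", "Running", "Success"])
print(poll_until_terminal(lambda: next(statuses), interval_s=0.0))  # Success
```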
Here are some resources if you want to learn more: