31 Oracle Storage Cloud Service

This chapter describes how to work with Oracle Storage Cloud Service in Oracle Data Integrator.

Attention:

This chapter applies only to Data Integration Platform Cloud.

This chapter includes the following sections:

Introduction

Oracle Storage Cloud Service provides a reliable, secure, and scalable object storage solution for storing unstructured data that can be accessed anytime, anywhere. It serves as a gateway for data consumption for many Oracle Public Cloud (OPC) services. These cloud services pick up files directly from Oracle Storage Cloud Service, so integration with Oracle Storage Cloud Service is essential for managing end-to-end integration flows using Oracle Data Integrator.

Oracle Data Integrator (ODI) integrates with Oracle Storage Cloud Service. With this integration, you can connect to Oracle Storage Cloud Service from ODI to upload files or objects from a local directory or HDFS to Oracle Storage Cloud Service, or to download them from Oracle Storage Cloud Service to a local directory or HDFS.

Concepts

Oracle Storage Cloud Service is an Infrastructure as a Service (IaaS) product, which provides an enterprise-grade, large-scale, object storage solution for files and unstructured data.

Oracle Storage Cloud Service comprises the following concepts:

Oracle Storage Cloud Service Hierarchy

Oracle Storage Cloud Service stores data as objects within a flat hierarchy of containers. You can create an object within a container most commonly by uploading a file or from ephemeral unstructured data. A single object can hold up to 5 GB of data, but multiple objects can be linked together to hold more than 5 GB of contiguous data.

A container is a user-created resource, which can hold an unlimited number of objects, unless you specify a quota for the container. Note that containers cannot be nested. You can define custom metadata for both objects and containers.
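
As a rough illustration of the 5 GB per-object limit described above, the following sketch computes how many linked objects a large file would need. It is only arithmetic, not an Oracle API. The function name `objects_needed` is hypothetical, and interpreting "5 GB" as 5 * 1024**3 bytes is an assumption, because the text does not say whether decimal or binary gigabytes are meant.

```python
# Assumption: "5 GB" is read as 5 * 1024**3 bytes; the documentation does not
# state whether decimal or binary gigabytes are meant.
MAX_OBJECT_BYTES = 5 * 1024**3


def objects_needed(total_bytes: int, max_object_bytes: int = MAX_OBJECT_BYTES) -> int:
    """Return how many objects are needed to hold total_bytes of data.

    Hypothetical helper: an empty file still occupies one object.
    """
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    if max_object_bytes <= 0:
        raise ValueError("max_object_bytes must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return max(1, (total_bytes + max_object_bytes - 1) // max_object_bytes)


if __name__ == "__main__":
    twelve_gb = 12 * 1024**3
    print(objects_needed(twelve_gb))
```

Under these assumptions, a file of exactly 5 GB fits in one object, and one additional byte requires a second linked object.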

Storage Types

This integration provides support for both Oracle Standard Storage and Archive Storage.

  • Standard Storage - Useful for storing one or more files that are accessed frequently.

  • Archive Storage - Ideal for storing data that is not frequently accessed, such as email archives, data backups, and digital video.

Installation and Configuration

Make sure you have read the information in this section before you start working with the Oracle Storage Cloud Service technology:

System Requirements and Certifications

Before performing any installation, you should read the system requirements and certification documentation to ensure that your environment meets the minimum installation requirements for the products you are installing.

The list of supported platforms and versions is available on Oracle Technical Network (OTN):

http://www.oracle.com/technetwork/middleware/data-integrator/documentation/index.html

Technology Specific Requirements

The technology-specific requirements for using Oracle Storage Cloud Service in ODI are:

  • You should have a dedicated pre-built technology named “Oracle Storage Cloud Service” defined, similar to Oracle Object Storage.

  • As an ODI user, you should be able to create a Data Server from this technology, along with the corresponding Physical and Logical schemas for the created Data Server. These physical and logical schemas are used by the ODI tools that support Oracle Storage Cloud Service integration to upload and download files or objects.

Supported datatypes for this technology are:

  • Array

  • Boolean

  • Bytes

  • Complex

  • Date

  • Double

  • Enum

  • Fixed

  • Float

  • Integer

  • Long

  • Map

  • Number

  • String

  • Struct

  • Union

Setting up the Topology

Setting up the topology consists of:

Creating an Oracle Storage Cloud Service Data Server

Create a data server for the Oracle Storage Cloud Service technology using the standard procedure, as described in Creating a Data Server in Administering Oracle Data Integrator. This section details only the fields that are required or specific for defining an Oracle Storage Cloud Service data server:

  • In the Definition tab:

    1. Data Server

      • Name – Name of the data server that will appear in Oracle Data Integrator.

    2. Connection

      • Service URL - Oracle Storage Cloud Service URL. For example: https://<identity-domain>.storage.oraclecloud.com

      • Service Name - Name of the service for the created service URL. For example: Storage

      • User Name - Name of the user logging in to Oracle Storage Cloud Service

        Note:

        User Names should start with upper case and should not be real server names.

      • Password - Password of the logged-in user

      • Identity Domain - Domain specific to the created storage instance. For example, the <identity-domain> portion of https://<identity-domain>.storage.oraclecloud.com
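
The relationship between the Service URL and the Identity Domain fields can be sketched as follows. This is a minimal illustration that assumes the URL shape shown in the examples above (https://<identity-domain>.storage.oraclecloud.com). The helper names `service_url` and `identity_domain_from_url` are hypothetical and are not part of ODI.

```python
from urllib.parse import urlsplit

# Host suffix taken from the example URL in this section.
STORAGE_SUFFIX = ".storage.oraclecloud.com"


def service_url(identity_domain: str) -> str:
    """Build a Service URL from an identity domain (hypothetical helper)."""
    domain = identity_domain.strip()
    if not domain or any(ch in domain for ch in "/: \t"):
        raise ValueError(f"not a usable identity domain: {identity_domain!r}")
    return f"https://{domain}{STORAGE_SUFFIX}"


def identity_domain_from_url(url: str) -> str:
    """Extract the identity domain from a Service URL (hypothetical helper).

    urlsplit() lowercases the host name, so the result is lowercase.
    """
    host = urlsplit(url).hostname or ""
    domain = host[: -len(STORAGE_SUFFIX)] if host.endswith(STORAGE_SUFFIX) else ""
    if not domain:
        raise ValueError(f"URL does not match the expected pattern: {url!r}")
    return domain


if __name__ == "__main__":
    url = service_url("mydomain")  # "mydomain" is a placeholder value
    print(url)
    print(identity_domain_from_url(url))
```

A check like this can help catch a mismatch between the two fields before the connection is tested.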

Creating an Oracle Storage Cloud Service Physical Schema

Create an Oracle Storage Cloud Service physical schema using the standard procedure, as described in Creating a Physical Schema in Administering Oracle Data Integrator guide.

Oracle Storage Cloud Service specific parameters are:

  • Name - Name of the physical schema created.

  • Container Name - Container with which you want to associate the created physical schema. Select the required container from the Container Name drop-down list.

  • Directory (Work Schema) - Temporary folder on the local system used for getting files from Oracle Storage Cloud Service. Specify the required location on the local system. If the directory does not exist, it is created.

Create a logical schema for this physical schema using the standard procedure, as described in Creating a Logical Schema in Administering Oracle Data Integrator, and associate it with a relevant context.

ODI uses the created logical schema to obtain the Oracle Storage Cloud Service instance details. These details are used for connecting to Oracle Storage Cloud Service.

Creating and Reverse-Engineering an Oracle Storage Cloud Service Model

Creating an Oracle Storage Cloud Service Model

An Oracle Storage Cloud Service model is a set of data stores, corresponding to files stored in an Oracle Storage Cloud Service directory.

In a given context, the logical schema corresponds to one physical schema. You can create a model from the logical schema for the Oracle Storage Cloud Service technology. The physical schema is the Oracle Storage Cloud Service directory containing all the files. You can create a new ODI data store that represents a file in Oracle Storage Cloud Service, so that it can be used in mappings.

Create an Oracle Storage Cloud Service model using the standard procedure, as described in Creating a Model in Developing Integration Projects with Oracle Data Integrator.

Reverse Engineering an Oracle Storage Cloud Service Model

Oracle Data Integrator provides specific methods for reverse-engineering Oracle Storage Cloud Service files.

Reverse-Engineering Delimited Files from Oracle Storage Cloud Service

To reverse-engineer a delimited file:

  1. In the Models accordion, right-click your Oracle Storage Cloud Service Model and select New Data store. The Data Store Editor opens.

  2. In the Definition tab, enter the following fields:

    • Name: Name of this data store

    • Resource Name: Sub-directory (if needed) and name of the file. It lists all the files present in Oracle Storage Cloud Service for the configured container.

  3. Go to the Storage tab, to describe the type of file. Set the fields as follows:

    • File Format: Delimited

    • Heading (Number of Lines): Enter the number of lines of the header. Note that if there is a header, Oracle Data Integrator uses the first line of the header to name the columns in the file.

    • Select a Record Separator.

    • Select or enter the character used as a Field Separator.

    • Enter a Text Delimiter if your file uses one.

    • Enter a Decimal Separator, if your file contains decimals.

  4. From the File main menu, select Save.

  5. In the Data Store Editor, go to the Attributes tab.

  6. In the editor toolbar, click Reverse Engineer.

  7. Verify the data type and length for the reverse engineered attributes. Oracle Data Integrator infers the field data types and lengths from the file content, but may set default values (for example 50 for the strings field length) or incorrect data types in this process.

  8. From the File main menu, select Save.
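
The following sketch illustrates why step 7 asks you to verify the inferred data types and lengths. It is not ODI's inference algorithm, only a simplified stand-in built on Python's csv module. The function names, the type names chosen from the supported datatypes list, the C1, C2 fallback names, and the rule that strings get a default length of 50 are all assumptions made for the illustration.

```python
import csv
import io


def _guess_type(values):
    """Guess a type name from sample string values (simplified heuristic)."""
    if not values:
        return "String"
    for type_name, parser in (("Integer", int), ("Double", float)):
        try:
            for value in values:
                parser(value)
            return type_name
        except ValueError:
            continue
    return "String"


def infer_attributes(text, field_sep=",", text_delim='"', heading_lines=1,
                     default_len=50):
    """Return (name, type, length) tuples inferred from delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=field_sep,
                        quotechar=text_delim)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        return []
    if heading_lines > 0:
        names = rows[0]  # the first header line names the columns
    else:
        names = [f"C{i + 1}" for i in range(len(rows[0]))]
    data = rows[heading_lines:]
    attributes = []
    for i, name in enumerate(names):
        values = [r[i] for r in data if i < len(r) and r[i] != ""]
        type_name = _guess_type(values)
        if type_name == "String":
            length = default_len  # assumed default, mirroring the 50 above
        else:
            length = max(len(v) for v in values)
        attributes.append((name, type_name, length))
    return attributes


if __name__ == "__main__":
    sample = "id,name,zip,price\n1,apple,00123,0.5\n2,pear,94105,1.25\n"
    for attribute in infer_attributes(sample):
        print(attribute)
```

In this sample, the zip column is guessed as Integer even though its leading zeros suggest it should be a String, which is the kind of result a manual review is meant to catch.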

Reverse-Engineering Fixed Files from Oracle Storage Cloud Service

Oracle Data Integrator provides a graphic wizard to define the columns of a fixed file.

To reverse-engineer a fixed file from Oracle Storage Cloud Service using the wizard:

  1. In the Models accordion, right-click your Oracle Storage Cloud Service Model and select New Data store. The Data Store Editor opens.

  2. In the Definition Tab, enter the following fields:

    • Name: Name of this data store

    • Resource Name: Sub-directory (if needed) and name of the file. It lists all the files present in Oracle Storage Cloud Service for the configured container.

  3. Go to the Storage tab to describe the type of file. Set the fields as follows:

    • File Format: Fixed

    • Header (Number of Lines): Enter the number of lines of the header.

    • Select a Record Separator.

  4. From the File main menu, select Save.

  5. In the Data store Editor, go to the Attributes tab.

  6. In the editor toolbar, click Reverse Engineer. The Attributes Setup Wizard appears. The Attributes Setup Wizard displays the first records of your file.

  7. Click on the ruler (above the file contents) to create markers delimiting the attributes. You can right-click within the ruler to delete a marker.

  8. Attributes are created with pre-generated names (C1, C2, and so on). You can edit the attribute name by clicking in the attribute header line (below the ruler).

  9. In the properties panel (on the right), you can edit all the parameters of the selected attribute. You should set at least the Attribute Name, Data type, and Length for each attribute.

  10. Click OK, when the attributes definition is complete.

  11. From the File main menu, select Save.
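
The effect of placing markers on the ruler can be sketched as slicing each record at the marker positions. This is only an illustration of the idea, not the wizard's implementation. The function name `split_fixed` and the sample record are hypothetical, and markers are assumed to be zero-based character offsets.

```python
def split_fixed(record, markers):
    """Split one fixed-width record at the given marker offsets.

    Attributes are named C1, C2, and so on, like the pre-generated names
    described above. Markers are assumed to be zero-based offsets.
    """
    starts = [0, *sorted(markers)]
    ends = [*sorted(markers), None]  # None means "to the end of the record"
    return {
        f"C{i + 1}": record[start:end]
        for i, (start, end) in enumerate(zip(starts, ends))
    }


if __name__ == "__main__":
    # Hypothetical record: 4-character id, 10-character name, 4-character qty.
    record = "0001" + "Alice".ljust(10) + "0042"
    print(split_fixed(record, [4, 14]))
```

Two markers produce three attributes. The padding spaces stay in the values, so the Length set for each attribute should match the width between markers.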

Reverse-Engineering JSON, Avro and Parquet Storage Formats

Oracle Storage Cloud Service technology supports reverse-engineering JSON, Avro, and Parquet storage formats, attributes, data types, and data type properties. If one of these storage formats is selected, reverse engineering is based on the specified Schema File, and not on a sample data file. The schema file must be accessible in the local file system.

To reverse-engineer JSON, Avro and Parquet storage formats, perform the following steps:

  1. In the Models accordion, right click your Oracle Storage Cloud Service Model and select New Data store. The Data Store Editor opens.

  2. In the Definition tab, enter the following fields:

    • Name: Name of this data store

    • Resource Name: Click the Search icon to select the required file from the list of files present in Oracle Storage Cloud Service for the configured container.

  3. From the Storage Tab, select the Storage Format from the Storage Format drop-down list and specify the complete path of the schema file in the Schema File field.

    The schema file should be located in the local file system.

  4. From the File main menu, select Save.

Note:

There is no need to import an RKM into the project.
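
To show the kind of information a schema file carries, the following sketch lists the field names and types declared in an Avro record schema, which is a JSON document. It does not reproduce how ODI reads the schema file. The function name `avro_fields` and the sample schema are hypothetical.

```python
import json


def avro_fields(schema_text):
    """Return (field name, type kind) pairs from an Avro record schema."""
    schema = json.loads(schema_text)
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("expected an Avro schema of type 'record'")
    result = []
    for field in schema["fields"]:
        declared = field["type"]
        if isinstance(declared, list):
            kind = "union"            # for example ["null", "string"]
        elif isinstance(declared, dict):
            kind = declared["type"]   # for example array, map, record
        else:
            kind = declared           # primitive type name
        result.append((field["name"], kind))
    return result


if __name__ == "__main__":
    # Hypothetical schema used only for this illustration.
    sample_schema = """
    {
      "type": "record",
      "name": "Customer",
      "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": ["null", "string"]},
        {"name": "tags", "type": {"type": "array", "items": "string"}}
      ]
    }
    """
    for name, kind in avro_fields(sample_schema):
        print(name, kind)
```

Because the attribute names and types come from the schema rather than from sample data, the schema file must describe the actual files in the container accurately.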

Working with Oracle Storage Cloud Service Tools

You can upload and download files to or from Oracle Storage Cloud Service through the following tools:

Note:

Apart from ODI Studio, you can also work with the ODI Storage Cloud Service tools from the command line.

Uploading Files/Objects to Oracle Storage Cloud Service

The ODI Storage Cloud Service Upload tool is used to upload a single file, multiple files, or an entire directory from HDFS or a local file system to Oracle Storage Cloud Service.

To upload files or directories to Oracle Storage Cloud Service:

  1. Create a new Project.

    For more details on how to create a project, see Creating an Integration Project of Developing Integration Projects with Oracle Data Integrator.

  2. Below the created Project folder, create a Package.

    For more details on how to create a package, see Creating and Using Packages of Developing Integration Projects with Oracle Data Integrator.

  3. Select the OdiStorageCSUpload tool available in the Toolbox and add it to the created package.

    Note:

    All the parameters of the tool are displayed under the General tab.

    Configure the required parameters:

    Table 31-1 ODI Storage Cloud Service Upload Tool Parameters

    Parameter Description

    Target Logical schema

    Name of the target logical schema configured for the Oracle Storage Cloud Service instance. Container information is obtained from the logical schema through the configured physical schema.

    Source Logical schema

    Name of the source logical schema configured for the File or HDFS data server from which local or HDFS files are uploaded to Oracle Storage Cloud Service. The directory structure is obtained from the logical schema through the configured physical architecture.

    File Names filter

    Field to specify one or more files or directories to be uploaded to Oracle Storage Cloud Service recursively. It also supports a list of files or patterns separated by the | delimiter. Examples of supported patterns:

    • *.txt - Uploads all the files ending with .txt

    • test* - Uploads all the files and directories whose names start with the prefix “test”

    • *test* - Uploads all the files and directories whose names contain the substring “test”

    • test.xml | test1.xml | test2.xml - Uploads all the files specified

    • test* | test1* - Uploads all the files matching the patterns test* and test1*

    • test.xml - Uploads only this one file

    Overwrite

    Indicates whether the upload operation should overwrite an existing file. The default value for this parameter is No.

    Retry on error

    Number of retry attempts to make when a failure or error occurs during upload.

    Retry interval seconds

    Number of seconds to wait before a retry attempt is made.

    Encrypt Key

    User-provided key used for encrypting objects while uploading files or directories to Oracle Storage Cloud Service.

    Note:

    This parameter cannot be null if you want to encrypt objects during upload.

    For more details on the above parameters, refer to OdiStorageCSUpload Tool in Oracle Data Integrator Tools Reference.

  4. Save and execute the package.

    The required files from the source directory are uploaded to the target container of the Oracle Storage Cloud Service.

  5. Upon successful upload, you can find a complete log of this upload operation on the Details tab. To get to the Details tab, from the Operator tab, expand the associated session for the upload tool and open the Session Task window, which contains the Details tab with the required log information.

    The details include:

    • Source directory is : <source directory path>

    • Target container is : <Storage container name>

    • Filter used is : <input filter>

    • Number of file uploaded:<Total number of files that were uploaded>

    • Uploaded files are: <File1, File2>

    • Number of files failed:<Total number of files that were not uploaded>

    • Failed files are:<File1, File2>
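
The File Names filter patterns listed in Table 31-1 can be approximated with shell-style wildcard matching. The sketch below uses Python's fnmatch module as a stand-in and is not the tool's own matching code. The function names are hypothetical. Trimming the spaces around | and matching case-sensitively against the bare file name are assumptions, and fnmatch also accepts ? and [...] wildcards, which the table does not mention.

```python
from fnmatch import fnmatchcase


def matches_filter(name, file_filter):
    """Return True if name matches any |-separated pattern in file_filter."""
    # Assumption: spaces around the | delimiter are not significant.
    patterns = [p.strip() for p in file_filter.split("|")]
    return any(fnmatchcase(name, p) for p in patterns if p)


def select_names(names, file_filter):
    """Return the names that the filter would select, in input order."""
    return [n for n in names if matches_filter(n, file_filter)]


if __name__ == "__main__":
    # Hypothetical directory listing.
    names = ["test.xml", "test1.xml", "notes.txt", "mytest.log", "data.csv"]
    for file_filter in ("*.txt", "test*", "*test*", "test.xml | test1.xml"):
        print(file_filter, "->", select_names(names, file_filter))
```

Trying a filter against a known list of names in this way is a quick check before running the package, especially when several patterns are combined with |.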

Downloading Files/Objects from Oracle Storage Cloud Service

The ODI Storage Cloud Service Download tool is used to download a single file, multiple files, or an entire directory from Oracle Storage Cloud Service to HDFS or a local file system. For HDFS targets, the files from Oracle Storage Cloud Service are first copied to the local directory (specified in Directory (Work Schema) for the Oracle Storage Cloud Service physical schema), and then copied from the local directory to HDFS.

To download files or directories from Oracle Storage Cloud Service:

  • Create a new Project

    For more details on how to create a project, see Creating an Integration Project of Developing Integration Projects with Oracle Data Integrator.

  • Below the created Project folder, create a Package

    For more details on how to create a package, see Creating and Using Packages of Developing Integration Projects with Oracle Data Integrator.

  • Select the OdiStorageCSDownload tool available in the Toolbox and add it to the created package.

    Note:

    All the parameters of the tool are displayed under the General tab.

  • Configure the required parameters:

    Table 31-2 ODI Storage Cloud Service Download Tool Parameters

    Parameter Description

    Source Logical schema

    Name of the source logical schema configured for the Oracle Storage Cloud Service instance. Container information is obtained from the logical schema through the configured physical schema.

    Target Logical schema

    Name of the target logical schema configured for the File or HDFS data server to which files are downloaded from Oracle Storage Cloud Service. The directory structure is obtained from the logical schema through the configured physical architecture.

    File Names filter

    Field to specify one or more files or directories to be downloaded from Oracle Storage Cloud Service recursively. It also supports a list of files or patterns separated by the | delimiter. Examples of supported patterns:

    • *.txt - Downloads all the files ending with .txt

    • test* - Downloads all the files and directories whose names start with the prefix “test”

    • *test* - Downloads all the files and directories whose names contain the substring “test”

    • test.xml | test1.xml | test2.xml - Downloads all the files specified

    • test* | test1* - Downloads all the files matching the patterns test* and test1*

    • test.xml - Downloads only this one file

    Overwrite

    Indicates whether the download operation should overwrite an existing file. The default value for this parameter is No.

    Retry on error

    Number of retry attempts to make when a failure or error occurs during download.

    Retry interval seconds

    Number of seconds to wait before a retry attempt is made.

    Decrypt Key

    User-provided key used for decrypting objects while downloading from Oracle Storage Cloud Service. This key must be the same as the encrypt key provided when the file was uploaded to Oracle Storage Cloud Service. If you provide the wrong key, the download operation fails.

    Note:

    This parameter cannot be null if you want to decrypt objects during download.

    For more details on the above parameters, refer to OdiStorageCSDownload Tool in Oracle Data Integrator Tools Reference.

  • Save and execute the package.

    The required files from Oracle Storage Cloud Service are downloaded to the directory specified in the Target logical schema.

  • Upon successful download, you can find a detailed log of this download operation on the Details tab. To get to the Details tab, from the Operator tab, expand the associated session for the download tool and open the Session Task window, which contains the Details tab with the required log information.

    The details include:

    • Source container is : <Storage container name>

    • Target directory is : <target directory path>

    • Filter used is : <input filter>

    • Number of file downloaded:<Total number of files that were downloaded>

    • Downloaded files are:<File1, File2>

    • Number of files failed:<Total number of files that were not downloaded>

    • Failed files are:< File1, File2>
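
The Retry on error and Retry interval seconds parameters of the upload and download tools describe a retry-with-fixed-delay policy. The following sketch shows that general pattern. It is not the tools' implementation. The function name `run_with_retries` is hypothetical, and treating Retry on error as the number of additional attempts after the first failure is an assumption.

```python
import time


def run_with_retries(operation, retries, interval_seconds, sleep=time.sleep):
    """Call operation(), retrying up to `retries` times after a failure.

    Waits interval_seconds between attempts. The sleep function is
    injectable so the behavior can be checked without real delays.
    """
    attempts_used = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempts_used >= retries:
                raise  # retries exhausted: surface the last error
            attempts_used += 1
            sleep(interval_seconds)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_transfer():
        # Hypothetical operation that fails twice before succeeding.
        state["calls"] += 1
        if state["calls"] < 3:
            raise IOError("transient failure")
        return "transferred"

    print(run_with_retries(flaky_transfer, retries=3, interval_seconds=0))
```

With this policy, the longest time spent waiting is roughly the retry count multiplied by the interval, which is useful to keep in mind when choosing values for large transfers.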