Connecting to Data Sources
In Oracle Cloud Infrastructure Data Integration, data assets represent data sources that you can use as sources and targets for your integration tasks. Data assets can be databases, flat files, and so on.
You create and access data assets from the Data Assets page. From the workspace home page, you can access the Data Assets page by clicking Data Assets or by clicking the Open tab (plus icon) in the tab bar and then selecting Data Assets.
The Data Assets page lists all the data assets you have created in the workspace. You can do a full-text search by entering the data asset name in the Filter by name search field. You can also use the Favorite Filter menu to display only the data assets that are marked as favorites, or only those that are not. Clicking Clear removes the filter.
On the Data Assets list, the Actions menu for a data asset has these options:
- View Details: Displays the data asset details, including connections and schemas.
- Edit: Lets you edit the data asset, including adding and editing connections.
- Test Connection: Lets you test the default connection for the data asset.
- Copy Data Asset Key: Copies to the clipboard the key of the data asset, which you can use with APIs.
- Delete: Lets you delete the data asset.
For information about the types of data sources you can use, see Supported Data Sources for Data Assets.
When you create a data asset, you specify one default connection to that data source. A connection is a user name and password combination that is used to connect to the data source. For most data source types, you can use secrets in Oracle Cloud Infrastructure Vault to store the password that is needed to connect to the source using the default connection. With OCI Vault, you provide the OCID of the secret when specifying the connection details, so you don't have to enter the actual password when you create the data asset. For sources with Oracle wallets, you can also use OCI Vault secrets for the wallet and passwords. See OCI Vault Secrets and Oracle Wallets.
After you create your data asset, you're brought to the details page where you can review or update the data asset name, description, and tags. You can see the connections, schemas, and data entities. For more information, see Viewing Data Asset Details.
You can create additional connections for a data asset as needed. Each connection should give you access to additional schemas and data entities to integrate. For more information on connections, see Creating a Connection.
OCI Vault Secrets and Oracle Wallets
Oracle Cloud Infrastructure Data Integration is integrated with Oracle Cloud Infrastructure Vault to enable you to manage sensitive information using vaults, keys, and secrets.
A vault is a container for keys and secrets. Secrets store credentials, such as the passwords required for connecting to data sources. You use an encryption key in a vault to encrypt and import secret contents to the vault. Secret contents are base64-encoded. Oracle Cloud Infrastructure Data Integration uses the same key to retrieve and decrypt secrets when creating a data asset and connecting to the data source.
For most data source types, you can use a secret in OCI Vault to store the password for the default connection in a data asset. To create a vault, key, and secret, see Overview of Vault.
When you create a data asset, you provide the OCID of the secret in the connection details, so you don't have to enter the actual password. To copy the secret OCID, see View Secret Details.
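Although Data Integration resolves the secret on the service side, you can verify from your own environment that a secret OCID resolves to the credential you expect. A minimal sketch, assuming the OCI Python SDK (oci) is installed and configured; the secret OCID is a placeholder:

```python
# Minimal sketch (OCI Python SDK): verify that a Vault secret OCID
# resolves to the password you expect before using it in a connection.
import base64

import oci

config = oci.config.from_file()  # reads ~/.oci/config
secrets_client = oci.secrets.SecretsClient(config)

secret_id = "ocid1.vaultsecret.oc1..example"  # placeholder OCID
bundle = secrets_client.get_secret_bundle(secret_id=secret_id)

# Secret contents are stored base64-encoded (see above).
content = bundle.data.secret_bundle_content.content
password = base64.b64decode(content).decode("utf-8")
print("secret resolves to", len(password), "characters")
```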
For Oracle Database, Oracle Autonomous Data Warehouse, and Oracle Autonomous Transaction Processing sources, you have the option to use secrets for the Oracle wallet and passwords instead of uploading the wallet and entering the wallet password when you create your data asset.
To use an Oracle wallet with secrets in OCI Vault, you must:
- Provide a wallet password when you download the wallet.
- Remove the .p12 file from the downloaded wallet zip.
- Use any base64 encoder to encode the modified wallet zip to base64 (see the sketch after this list).
- Copy the base64-encoded data to a secret in a vault.
- Create a secret for the wallet password.
- Create a secret for the database password.
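The encoding steps can be scripted. A minimal Python sketch using only the standard library; file names are placeholders:

```python
# Minimal sketch: prepare a downloaded wallet zip for an OCI Vault secret.
# Removes the .p12 file, then base64-encodes the modified zip.
import base64
import zipfile

src = "wallet.zip"           # downloaded wallet (placeholder name)
dst = "wallet_no_p12.zip"    # modified wallet (placeholder name)

with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
    for item in zin.infolist():
        if not item.filename.endswith(".p12"):
            zout.writestr(item, zin.read(item.filename))

with open(dst, "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

# Paste the encoded string as the secret contents in your vault.
print(encoded[:60], "...")
```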
To use secrets in OCI Vault, ensure that you have the following policy:
allow any-user to read secret-bundles in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
The following policy enables a group of users who are not administrators to use secrets with Oracle Autonomous Data Warehouse and Oracle Autonomous Transaction Processing:
allow group <group-name> to read secret-bundles in compartment <compartment-name>
Supported Data Sources for Data Assets
The following table lists the data sources that you can use with Data Integration.
Data Source Type | Version | Source | Target |
---|---|---|---|
Amazon RDS for SQL Server | 2019 | Yes | No |
2017 | Yes | No | |
2016 Service Pack 2 | Yes | No | |
2014 Service Pack 3 | Yes | No | |
2012 Service Pack 4 | Yes | No | |
Amazon Redshift | Amazon Redshift | Yes | No |
Apache Hive | CDH 5.4 and higher | Yes | No |
Apache 1.0, 2.0, 3.0, and higher | Yes | Yes | |
Hadoop Distributed File System (HDFS) | 3.1.2 | Yes | Yes |
Azure SQL Database | 11.0 and higher | Yes | No |
12.0 and higher | Yes | No | |
Microsoft Azure Synapse Analytics | 12.0 and higher | Yes | No |
Microsoft SQL Server | 2019 | Yes | No |
2017 | Yes | No | |
2016 Service Pack 2 | Yes | No | |
2014 Service Pack 3 | Yes | No | |
2012 Service Pack 4 | Yes | No | |
MySQL | 5.7.x and 8.0.x | Yes | Yes |
MySQL on Amazon RDS | 5.7.x and 8.0.x | Yes | No |
Amazon S3 | Amazon S3 | Yes | No |
Autonomous Database for Analytics and Data Warehousing | 18c/19c | Yes | Yes |
Autonomous Database for Transaction Processing and Mixed Workloads | 18c/19c | Yes | Yes |
Oracle Database | 11g | Yes (except SQL task stored procedure) | Yes |
12.1 | Yes | Yes | |
12.2 | Yes | Yes | |
18 | Yes | Yes | |
19 | Yes | Yes | |
21 | Yes | Yes | |
Oracle Database on Oracle Cloud Infrastructure | 11g | Yes (except SQL task stored procedure) | Yes |
12.1 | Yes | Yes | |
12.2 | Yes | Yes | |
18 | Yes | Yes | |
19 | Yes | Yes | |
Exadata DB Systems | 11g | Yes (except SQL task stored procedure) | Yes |
12.1 | Yes | Yes | |
12.2 | Yes | Yes | |
18 | Yes | Yes | |
19 | Yes | Yes | |
Oracle Cloud Infrastructure Object Storage | Latest | Yes | Yes |
Oracle on Amazon RDS | 12.1 | Yes | No |
12.2 | Yes | No | |
18 | Yes | No | |
19 | Yes | No | |
21 | Yes | No | |
Oracle Fusion Applications using Oracle Business Intelligence Cloud Connector (BICC) | BICC API version 13.20.10 and higher; Fusion Applications version 13.20.10 (20 Oct) and higher | Yes | No |
Oracle Fusion Applications using Oracle Business Intelligence Publisher (BIP) | 11.1.1.9 and higher | Yes | No |
PostgreSQL | 12.0 and higher | Yes | No |
11.0 and higher | Yes | No | |
10.1 | Yes | No | |
9.6, 9.5, 9.4, 9.3, 9.2, 9.1, and 9.0 | Yes | No | |
8.4, 8.3, and 8.2 | Yes | No | |
IBM DB2 | DB2 V11.1 and higher for Linux, UNIX, and Windows | Yes | No |
DB2 V10.1 and higher for Linux, UNIX, and Windows | Yes | No | |
DB2 V9.1 and higher for Linux, UNIX, and Windows | Yes | No | |
DB2 V8.x and higher for Linux, UNIX, and Windows | Yes | No | |
DB2 V12 and higher for z/OS | Yes | No | |
DB2 V11 and higher for z/OS | Yes | No | |
DB2 V10 and higher for z/OS | Yes | No | |
DB2 V9.1 and higher for z/OS | Yes | No | |
DB2 UDB V8.1 for z/OS | Yes | No | |
DB2 i 7.1 and higher | Yes | No | |
DB2 i 6.1 and higher | Yes | No | |
DB2 V5R4 and higher for i 5/OS | Yes | No | |
Amazon Web Services (AWS) Aurora PostgreSQL | 1.0 and higher | Yes | No |
InfluxDB | 1.8 and 2.x | Yes | No |
Supported Object Types
For Oracle Cloud Infrastructure Object Storage and Amazon S3 data assets, Data Integration supports the following object types:
- CSV
- JSON
- Parquet
- Avro
Note that only read and write of primitive data types are supported.
Supported Compression Types
For Oracle Cloud Infrastructure Object Storage data assets, Data Integration supports the following compression types or methods for using the CSV or JSON object file type with a source or target operator:
- Auto (Default)
- Gzip
- Bzip2
- Deflate
- Lz4
- Snappy
For Parquet and Avro file types, only Auto (Default) is supported.
For a compressed source file, the compression type is the compression algorithm that was used to compress the file. If you don't know the compression algorithm, use the Auto (Default) compression type.
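If you want to identify the algorithm yourself, most of the formats above can be recognized from their leading magic bytes. A minimal Python sketch (the file name is a placeholder; Snappy is omitted because it has no single standard signature, and raw Deflate streams carry no header at all):

```python
# Minimal sketch: identify a compression algorithm from a file's magic bytes.
SIGNATURES = [
    (b"\x1f\x8b", "Gzip"),
    (b"BZh", "Bzip2"),
    (b"\x78\x01", "Deflate (zlib)"),
    (b"\x78\x9c", "Deflate (zlib)"),
    (b"\x78\xda", "Deflate (zlib)"),
    (b"\x04\x22\x4d\x18", "Lz4 (frame format)"),
]

def sniff(path: str) -> str:
    with open(path, "rb") as f:
        head = f.read(8)
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return "Unknown: use Auto (Default)"

print(sniff("data.csv.gz"))  # placeholder file name
```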
Data Types Not Supported
The following Oracle Database data types are not supported:
- ROWID
- UROWID
- BFILE
- TIMESTAMP WITH LOCAL TIME ZONE
- INTERVAL DAY TO SECOND
- INTERVAL YEAR TO MONTH
- XMLTYPE
- SDO_GEOMETRY
- NCHAR
- NVARCHAR
Other Support Information
Oracle Cloud Infrastructure Data Integration supports hierarchical data types, Unicode characters, and a core set of generic data types.
Hierarchical Data Types
Data Integration supports hierarchical data types in source and target data entities.
Currently, you can use only Oracle Object Storage data entities in the following file formats to read and write data with hierarchical data types:
- JSON and multi-line JSON
- Avro
- Parquet
The supported hierarchical data types are the Array, Struct, and Map complex types. You can perform any type of file to file transformation, such as JSON to Avro or Avro to JSON.
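For example, a single record can combine all three complex types. A minimal Python sketch that builds such a record and serializes it as JSON; the field names are hypothetical:

```python
# Minimal sketch: one record combining the three supported complex types.
import json

record = {
    "order_id": 1001,                   # primitive
    "items": ["widget", "gadget"],      # Array
    "customer": {                       # Struct: fixed set of named fields
        "name": "Ana",
        "city": "Lisbon",
    },
    "attributes": {                     # Map: arbitrary key-value pairs
        "campaign": "spring",
        "channel": "web",
    },
}

print(json.dumps(record, indent=2))
```

In plain JSON, the Struct and Map cases look identical; the distinction comes from the schema that is declared or inferred for the data entity.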
Before using hierarchical data entities and complex types, ensure that you understand the supported capabilities in Data Integration when working with components and performing tasks.
Task/Component | Support | Limitation |
---|---|---|
Prepare data sources | … | … |
Add and configure a source operator | … | … |
Add and configure a target operator | Select Create New Data Entity check box: … Select existing data entity: … | … |
Use shape operators | … | … |
Map attributes | … | For example, … |
Use data transformations (Data Xplorer and Data tab) | … | … |
Unicode Support
Data Integration supports the Unicode standard, which is a universal character encoding standard for written characters and text in any language. The Unicode standard provides a unique number for every character, regardless of the platform, device, or application. For example, 0041 is the Unicode code point for the English letter "A".
Data Integration supports Unicode characters (including multibyte characters) in data and metadata.
Unicode support in data means that the attributes and attribute values in your source and target data assets can include Unicode and multibyte characters. You can also enter Unicode and multibyte characters in expressions. For JSON and CSV Object Storage data assets, the default encoding is UTF-8, and you cannot change it.
Unicode support in metadata means that the data entity and schema names of your data assets can include Unicode and multibyte characters. You can also enter Unicode and multibyte characters for names and descriptions when working with objects in Data Integration.
In the Unicode standard, a unique number assigned to a Unicode character is a code point. Currently, Data Integration supports the following Unicode code points and range of code points:
Code Point or Range | Character | Number of Characters Supported |
---|---|---|
Basic Latin characters | ||
0024 | $ (dollar sign) | 1 |
0041 - 005A | A to Z | 26 |
005F | _ (underscore) | 1 |
0061 - 007A | a to z | 26 |
Latin-1 Supplement characters | ||
00C0 - 00D6 | Latin-1 Supplement characters with accents | 23 |
00D8 - 00F6 | | 31 |
00F8 - 00FF | | 8 |
Characters in 46 ranges from Latin Extended-A to Greek Extended | ||
0100 - 1FFF | Characters in the following named ranges: Latin Extended-A, Latin Extended-B, IPA Extensions, Spacing Modifier Letters, Combining Diacritical Marks, Greek and Coptic, Cyrillic, Cyrillic Supplementary, Armenian, Hebrew, Arabic, Syriac, Thaana, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Sinhala, Thai, Lao, Tibetan, Myanmar, Georgian, Hangul Jamo, Ethiopic, Cherokee, Unified Canadian Aboriginal Syllabics, Ogham, Runic, Tagalog, Hanunoo, Buhid, Tagbanwa, Khmer, Mongolian, Limbu, Tai Le, Khmer Symbols, Phonetic Extensions, Latin Extended Additional, Greek Extended | 7936 |
Characters in 4 ranges from Hiragana to Hangul Compatibility Jamo | ||
3040 - 318F | Characters in the following named ranges: Hiragana, Katakana, Bopomofo, Hangul Compatibility Jamo | 336 |
Characters in 4 CJK (Chinese, Japanese, and Korean) ranges | ||
3300 - 337F | CJK Compatibility characters | 128 |
3400 - 3D2D | CJK Unified Ideographs Extension A characters | 2350 |
4E00 - 9FFF | CJK Unified Ideographs characters | 20992 |
F900 - FAFF | CJK Compatibility Ideographs characters | 512 |
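As a quick way to check a string against the ranges in this table, here is a minimal Python sketch; the sample string is arbitrary:

```python
# Minimal sketch: test whether each character's code point falls in one of
# the supported ranges listed in the table above.
SUPPORTED_RANGES = [
    (0x0024, 0x0024),  # $
    (0x0041, 0x005A),  # A to Z
    (0x005F, 0x005F),  # _
    (0x0061, 0x007A),  # a to z
    (0x00C0, 0x00D6), (0x00D8, 0x00F6), (0x00F8, 0x00FF),  # Latin-1 Supplement
    (0x0100, 0x1FFF),  # Latin Extended-A to Greek Extended
    (0x3040, 0x318F),  # Hiragana to Hangul Compatibility Jamo
    (0x3300, 0x337F),  # CJK Compatibility
    (0x3400, 0x3D2D),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
]

def is_supported(ch: str) -> bool:
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in SUPPORTED_RANGES)

for ch in "Ana_€東京":
    print(ch, hex(ord(ch)), is_supported(ch))  # the euro sign is unsupported
```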
Understanding Data Type Mappings
Data types from the source and target systems you use are mapped to and mapped from a core set of generic data types in Oracle Cloud Infrastructure Data Integration.
In the set of generic data types, some of the types have additional length or scale properties that you can use to further constrain the data type.
The Expression operator in Data Integration does not yet support all the generic data types. You can create a new attribute based on a generic data type only if the generic type is supported.
Data type | Expression support | Data type string reference | Length | Scale |
---|---|---|---|---|
ANYDATA | | Seeded:/typeSystems/PLATFORM/dataTypes/ANYDATA | | |
BIGINT | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/BIGINT | | |
BINARY | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/BINARY | | |
BINARY_DOUBLE | | Seeded:/typeSystems/PLATFORM/dataTypes/BINARY_DOUBLE | | |
BINARY_FLOAT | | Seeded:/typeSystems/PLATFORM/dataTypes/BINARY_FLOAT | | |
BLOB | | Seeded:/typeSystems/PLATFORM/dataTypes/BLOB | | |
BOOLEAN | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/BOOLEAN | | |
CHAR | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/CHAR | Yes | |
CLOB | | Seeded:/typeSystems/PLATFORM/dataTypes/CLOB | | |
DATE | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/DATETIME | | |
DATETIME | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/DATETIME | | |
DECIMAL | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/DECIMAL | Yes | Yes |
DOUBLE | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/DOUBLE | | |
FIXED | | Seeded:/typeSystems/PLATFORM/dataTypes/FIXED | | |
FLOAT | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/FLOAT | Yes | |
INTEGER | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/INTEGER | | |
INTERVAL DAY TO SECOND | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/INTERVAL DAY TO SECOND | Yes | Yes |
INTERVAL YEAR TO MONTH | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/INTERVAL YEAR TO MONTH | Yes | |
LONG | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/LONG | | |
MONEY | | Seeded:/typeSystems/PLATFORM/dataTypes/MONEY | | |
NCHAR | | Seeded:/typeSystems/PLATFORM/dataTypes/NCHAR | Yes | |
NCLOB | | Seeded:/typeSystems/PLATFORM/dataTypes/NCLOB | | |
NUMBER | | Seeded:/typeSystems/PLATFORM/dataTypes/NUMBER | | |
NUMERIC | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/NUMERIC | Yes | Yes |
NVARCHAR | | Seeded:/typeSystems/PLATFORM/dataTypes/NVARCHAR | Yes | |
REAL | | Seeded:/typeSystems/PLATFORM/dataTypes/REAL | | |
SMALLINT | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/SMALLINT | | |
STRING | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/STRING | Yes | |
TIME | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/TIME | | |
TIMESTAMP WITH TIME ZONE | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/TIMESTAMP WITH TIME ZONE | Yes | |
TIMESTAMP | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/TIMESTAMP | Yes | |
TINYINT | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/TINYINT | | |
VARBINARY | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/VARBINARY | | |
VARCHAR | Yes | Seeded:/typeSystems/PLATFORM/dataTypes/VARCHAR | Yes | |
XMLFORMAT | | Seeded:/typeSystems/PLATFORM/dataTypes/XMLFORMAT | | |
XMLTYPE | | Seeded:/typeSystems/PLATFORM/dataTypes/XMLTYPE | | |
The table shows Oracle to Generic type mapping, and Generic to Oracle type mapping.
Oracle to Generic | Generic | Generic to Oracle |
---|---|---|
ANYDATA | ANYDATA | ANYDATA |
 | BIGINT | NUMBER |
 | BINARY | BLOB |
BINARY_DOUBLE | BINARY_DOUBLE | BINARY_DOUBLE |
BINARY_FLOAT | BINARY_FLOAT | BINARY_FLOAT |
BLOB, RAW, LONG RAW | BLOB | BLOB |
 | BOOLEAN | VARCHAR2 |
CHAR | CHAR | CHAR |
CLOB | CLOB | CLOB |
DATE | DATE | DATE |
 | DATETIME | TIMESTAMP |
 | DECIMAL | NUMBER |
 | DOUBLE | NUMBER |
 | FIXED | BLOB |
FLOAT | FLOAT | FLOAT |
 | INTEGER | NUMBER |
INTERVAL DAY TO SECOND | INTERVAL DAY TO SECOND | INTERVAL DAY TO SECOND |
INTERVAL YEAR TO MONTH | INTERVAL YEAR TO MONTH | INTERVAL YEAR TO MONTH |
LONG | LONG | NUMBER |
 | MONEY | NUMBER |
NCHAR | NCHAR | NCHAR |
NCLOB | NCLOB | NCLOB |
 | NUMBER | NUMBER |
NUMBER | NUMERIC | NUMBER |
NVARCHAR2 | NVARCHAR | NVARCHAR2 |
REAL | REAL | REAL |
 | SMALLINT | NUMBER |
 | STRING | VARCHAR2 |
 | TIME | DATE |
TIMESTAMP WITH LOCAL TIME ZONE | TIMESTAMP WITH TIME ZONE | TIMESTAMP WITH TIME ZONE |
TIMESTAMP WITH TIME ZONE | TIMESTAMP WITH TIME ZONE | |
TIMESTAMP(0) WITH LOCAL TIME ZONE | TIMESTAMP WITH TIME ZONE | |
TIMESTAMP(0) WITH TIME ZONE | TIMESTAMP WITH TIME ZONE | |
TIMESTAMP, TIMESTAMP(0) | TIMESTAMP | TIMESTAMP |
 | TINYINT | NUMBER |
 | VARBINARY | BLOB |
ROWID, UROWID, VARCHAR2 | VARCHAR | VARCHAR2 |
XMLFORMAT | XMLFORMAT | XMLFORMAT |
XMLTYPE | XMLTYPE | XMLTYPE |
The table shows Hive to Generic type mapping.
Hive | Generic |
---|---|
ARRAY | VARCHAR |
BIGINT | BIGINT |
BOOLEAN | BOOLEAN |
CHAR | CHAR |
DATE | DATE |
DECIMAL | DECIMAL |
DOUBLE | DOUBLE |
FLOAT | REAL |
INT | INTEGER |
MAP | VARCHAR |
SMALLINT | SMALLINT |
STRING | VARCHAR |
STRUCT | VARCHAR |
TIMESTAMP | TIMESTAMP |
TINYINT | TINYINT |
UNION | VARCHAR |
VARCHAR | VARCHAR |
The table shows MySQL to Generic type mapping, and Generic to MySQL type mapping.
MySQL to Generic | Generic | Generic to MySQL |
---|---|---|
BIGINT | BIGINT | BIGINT |
BINARY | BINARY | BINARY |
 | BINARY_DOUBLE | DOUBLE PRECISION |
 | BINARY_FLOAT | FLOAT |
BLOB, LONGBLOB, MEDIUMBLOB, TINYBLOB | BLOB | BLOB |
CHAR | CHAR | CHAR |
LONGTEXT, MEDIUMTEXT, TEXT | CLOB | LONGTEXT |
DATE | DATE | DATE |
DATETIME | TIMESTAMP | |
DECIMAL | DECIMAL | DECIMAL |
DOUBLE, DOUBLE PRECISION | DOUBLE | DOUBLE |
FLOAT | FLOAT | FLOAT |
INT, MEDIUMINT | INTEGER | INT |
 | INTERVAL DAY TO SECOND | VARCHAR |
 | INTERVAL YEAR TO MONTH | VARCHAR |
 | MONEY | NUMERIC |
 | NCHAR | VARCHAR |
 | NCLOB | VARCHAR |
NUMERIC | NUMERIC | NUMERIC |
 | NVARCHAR | VARCHAR |
REAL | REAL | REAL |
SMALLINT | SMALLINT | SMALLINT |
TIME | TIME | TIME |
 | DATETIME, TIMESTAMP WITH TIME ZONE | TIMESTAMP |
TIMESTAMP | TIMESTAMP | |
TINYINT | TINYINT | TINYINT |
TINYTEXT | STRING | |
ENUM, VARCHAR | VARCHAR | VARCHAR |
VARBINARY | VARBINARY | VARBINARY |
 | XMLTYPE | VARCHAR |
The table shows PostgreSQL to Generic type mapping.
PostgreSQL | Generic |
---|---|
BIGINT, BIGSERIAL | BIGINT |
BIT | BINARY |
BIT VARYING, BYTEA | VARBINARY |
BOOLEAN | BOOLEAN |
CHARACTER | CHAR |
CHARACTER VARYING | VARCHAR |
DATE | DATE |
DOUBLE PRECISION | DOUBLE |
INTEGER | INTEGER |
JSON | VARCHAR(10485760) |
NUMERIC | NUMERIC |
REAL | REAL |
SERIAL | INTEGER |
SMALLINT | SMALLINT |
TEXT | VARCHAR(1073741823) |
TIME, TIME WITH TIMEZONE, TIMESTAMP, TIMESTAMP WITH TIMEZONE | TIMESTAMP |
UUID, ARRAY | VARCHAR |
XML | XMLTYPE |
The table shows SQL Server to Generic type mapping, and Generic to SQL Server type mapping.
SQL Server to Generic | Generic | Generic to SQL Server |
---|---|---|
bigint, bigint identity | BIGINT | bigint |
binary | BINARY | binary |
 | BINARY_DOUBLE, BINARY_FLOAT | float |
bit | BOOLEAN | varchar |
char | CHAR | char |
date | DATE | datetime |
datetime | DATETIME | datetime |
 | DOUBLE | float |
 | FIXED | binary |
datetime2, timestamp | TIMESTAMP | |
decimal, decimal() identity | DECIMAL | decimal |
float | FLOAT | float |
int identity, int | INTEGER | int |
 | INTERVAL DAY TO SECOND, INTERVAL YEAR TO MONTH | varchar |
 | LONG | bigint |
money, smallmoney | MONEY | money |
nchar | NCHAR | nchar |
nvarchar | NVARCHAR | nvarchar |
ntext, nvarchar(max) | NCLOB | nvarchar(max) |
 | NUMBER | numeric |
numeric() identity, numeric | NUMERIC | numeric |
real | REAL | real |
smalldatetime | DATETIME | |
smallint identity, smallint | SMALLINT | smallint |
 | STRING | varchar |
time | TIME | time |
 | TIMESTAMP WITH TIME ZONE, TIMESTAMP | datetime |
tinyint identity, tinyint | TINYINT | tinyint |
varbinary | VARBINARY | varbinary |
image, varbinary(max) | BLOB | varbinary(max) |
text, varchar(max) | CLOB | varchar(max) |
sql_variant, sysname, uniqueidentifier, varchar | VARCHAR | varchar |
 | XMLTYPE | text |
The table shows File to Generic type mapping, and Generic to File type mapping.
File to Generic | Generic | Generic to File |
---|---|---|
Ascii signed zoned decimal, Ascii unsigned zoned decimal | NUMERIC | |
Binary signed big endian, Binary signed little endian, Binary unsigned big endian, Binary unsigned little endian | NUMERIC | |
Date | DATE | Date |
Ebcdic signed zoned decimal, Ebcdic unsigned zoned decimal | NUMERIC | |
Ebcdic Fixed, Ebcdic Fixed String | VARCHAR | |
Numeric | NUMERIC | |
Signed packed decimal | NUMERIC | |
String | VARCHAR | |
Unsigned packed decimal | NUMERIC | |
 | BIGINT | Numeric |
 | BINARY | String |
 | BINARY_DOUBLE, BINARY_FLOAT | Numeric |
 | BLOB, BOOLEAN | String |
 | CHAR, CLOB | String |
 | DATETIME | Date |
 | DECIMAL, DOUBLE | Numeric |
 | FIXED | String |
 | FLOAT | Numeric |
 | INTEGER | Numeric |
 | INTERVAL DAY TO SECOND, INTERVAL YEAR TO MONTH | String |
 | LONG | Numeric |
 | MONEY | Numeric |
 | NCHAR, NCLOB | String |
 | NUMBER, NUMERIC | Numeric |
 | NVARCHAR | String |
 | REAL | Numeric |
 | SMALLINT, TINYINT | Numeric |
 | STRING | String |
 | TIME | String |
 | TIMESTAMP WITH TIME ZONE, TIMESTAMP | Date |
 | VARBINARY | String |
 | VARCHAR | String |
 | XMLTYPE | String |
For CSV files, all type mappings are String.
The table shows JSON to Generic type mapping, and Generic to JSON type mapping.
JSON to Generic | Generic | Generic to JSON |
---|---|---|
Null, String | VARCHAR | String |
 | BIGINT | Number |
 | BINARY | String |
 | BINARY_DOUBLE, BINARY_FLOAT | Number |
 | BLOB | String |
Boolean | BOOLEAN | Boolean |
 | CHAR, CLOB | String |
 | DATE, DATETIME | String |
 | DECIMAL, DOUBLE | Number |
 | FIXED | String |
 | FLOAT, INTEGER | Number |
 | INTERVAL DAY TO SECOND, INTERVAL YEAR TO MONTH | String |
 | LONG, MONEY | Number |
 | NCHAR, NCLOB, NVARCHAR | String |
 | NUMBER | Number |
Number | NUMERIC | Number |
 | REAL | Number |
 | STRING | String |
 | SMALLINT, TINYINT | Number |
 | TIME, TIMESTAMP WITH TIME ZONE, TIMESTAMP | String |
 | VARBINARY | String |
 | XMLTYPE | String |
The table shows Parquet to Generic type mapping, and Generic to Parquet type mapping.
Parquet to Generic | Generic | Generic to Parquet |
---|---|---|
 | BIGINT, LONG, NUMBER | INT64 |
BINARY, BSON, FIXED_LEN_BYTE_ARRAY, INT96 | BINARY | BINARY |
 | BINARY_DOUBLE, BINARY_FLOAT, BLOB, CHAR, CLOB, FIXED, INTERVAL DAY TO SECOND, INTERVAL YEAR TO MONTH, NCHAR, NCLOB, NVARCHAR, VARBINARY, XMLTYPE | BINARY |
BOOLEAN | BOOLEAN | BOOLEAN |
DATE | DATE | DATE |
 | DATETIME | INT32 |
DECIMAL | DECIMAL | DECIMAL |
DOUBLE | DOUBLE | DOUBLE |
ENUM, INTERVAL, JSON, UTF8, key | VARCHAR | UTF8 |
FLOAT | FLOAT | FLOAT |
 | INTEGER, MONEY, SMALLINT, TINYINT | INT32 |
 | REAL | DOUBLE |
 | STRING | UTF8 |
 | TIME | TimeType |
 | TIMESTAMP WITH TIME ZONE | TimestampType |
 | TIMESTAMP | TimestampType |
INT32, INT64, TIMESTAMP_MICROS, TIMESTAMP_MILLIS, TIME_MICROS, TIME_MILLIS | NUMERIC | INT32 |
The table shows Avro to Generic type mapping, and Generic to Avro type mapping.
Avro to Generic | Generic | Generic to Avro |
---|---|---|
bytes | STRING | string |
ANYDATA | string | |
BIGINT | long | |
BINARY | string | |
BINARY_DOUBLE | string | |
BINARY_FLOAT | string | |
BLOB | string | |
boolean | BOOLEAN | boolean |
CHAR | string | |
CLOB | string | |
DATE | string | |
DATETIME | string | |
DECIMAL | double | |
double | DOUBLE | double |
FIXED | string | |
float | FLOAT | float |
int | INTEGER | int |
INTERVAL DAY TO SECOND | string | |
INTERVAL YEAR TO MONTH | string | |
long | LONG | long |
MONEY | double | |
NCHAR | string | |
NCLOB | string | |
NUMBER | int | |
NUMERIC | int | |
NVARCHAR | string | |
REAL | float | |
SMALLINT | int | |
string | STRING | string |
TIME | string | |
TIMESTAMP WITH TIME ZONE | string | |
TIMESTAMP | string | |
TINYINT | int | |
VARBINARY | string | |
null | VARCHAR | string |
XMLFORMAT | string | |
XMLTYPE | string |
The table shows Oracle Fusion Applications using Oracle Business Intelligence Cloud Connector (BICC) to Generic type mapping, and Generic to Oracle Fusion Applications using BICC type mapping.
Oracle Fusion Applications using BICC to Generic | Generic | Generic to Oracle Fusion Applications using BICC |
---|---|---|
NUMERIC | NUMERIC | NUMERIC |
VARCHAR | VARCHAR | VARCHAR |
DATE | DATE | DATE |
TIMESTAMP | TIMESTAMP | TIMESTAMP |
Network Configurations
Your network configurations depend on the source and target data assets you're using with the Data Integration service, and where the assets are located.
See this blog to identify the options for your needs.
A workspace can have an attached Virtual Cloud Network (VCN). In general, for data sources that are accessible from the internet:
- If a workspace has an attached VCN: Data Integration can connect directly through a Network Address Translation (NAT) gateway on the VCN of the workspace.
- If a workspace does not have an attached VCN: Data Integration can connect directly using public IP addresses.
Resources with private IP addresses (such as workspaces) defined in any subnet can access private resources in other virtual cloud networks and regions through service gateways or NAT gateways, using local or remote peering gateways.
You can also combine gateways when you need to access both Object Storage and Autonomous Database for Analytics and Data Warehousing. For example, for public connectivity, you would need both a Service gateway for Object Storage and a NAT gateway for Autonomous Database for Analytics and Data Warehousing.
For data sources that are not accessible from the internet, other options include:
- Create a workspace with a private endpoint enabled, with the private endpoint in the same subnet as the data source.
- Use Oracle Cloud Infrastructure FastConnect.
- Use Oracle Cloud Infrastructure VPN Connect (also referred to as an IPSec VPN).
Note, however, that Oracle Cloud Infrastructure FastConnect and Oracle Cloud Infrastructure VPN Connect must be used when these sources are in private domains:
- Oracle on Amazon RDS
- MySQL on Amazon RDS
- Amazon RDS for SQL Server
- Microsoft Azure SQL Database
For more information, see the Oracle Cloud Infrastructure Networking documentation.
Creating a Data Asset
When you create a data asset, you select the data source type and then provide the data source details and the corresponding default connection details to that source. You can use the Console or the APIs to create data assets.
For most data assets, instead of providing the actual password for a connection, you can choose to use a secret in Oracle Cloud Infrastructure Vault that has the password for the connection user. Take note of the following before using OCI Vault secrets:
- Ensure that you have set up the required policies for using OCI Vault and secrets.
- Create a secret for the database password.
- If you use Oracle wallets, you must modify the wallet zip before you create a secret for the wallet contents. You also need a secret for the wallet password, and a secret for the database password.
See OCI Vault Secrets and Oracle Wallets.
If you intend to publish a task to an application in OCI Data Flow, note also the following when creating your data assets:
- Use OCI Vault secrets that contain the passwords to connect to the data sources. This is required for passing credentials securely across OCI services.
- Specify the Fully Qualified Domain Name (FQDN) for the database hosts. OCI Data Flow does not allow connections through host IP addresses.
Each type of data asset has a different set of source and connection fields. Use the appropriate instructions for the data asset type you want to create.
OCI Data Sources
Before you can create a Service-to-Service (S2S) Principal connection for your Oracle Cloud Infrastructure Object Storage data asset, you must create a policy to allow Data Integration to access the Object Storage resource. See creating a policy and Policy Examples to Enable Access to Oracle Object Storage under Policy Examples.
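For reference, such a policy typically follows the same pattern as the secrets policy shown earlier. This is a sketch only; the exact resource types and verbs depend on your setup:

allow any-user to use object-family in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}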
Using Autonomous Database for Analytics and Data Warehousing or Autonomous Database for Transaction Processing and Mixed Workloads as a source or target data asset in Data Integration requires database privileges and additional IAM policies.
Before you create an Autonomous Database for Analytics and Data Warehousing or Autonomous Database for Transaction Processing and Mixed Workloads data asset, ensure that you understand the permissions you might need to set up, and the use of a default staging location.
When you use Autonomous Database for Analytics and Data Warehousing or Autonomous Database for Transaction Processing and Mixed Workloads as a target, Data Integration uses an Object Storage bucket to stage the data before loading it into the target.
Policies and permissions to use Object Storage, as well as database privileges, must be in place. Use the following checklist to ensure that you have the required setup.
Item | Requirement or Consideration |
---|---|
Database privileges | Your database administrator must grant the required privileges to the Autonomous Database for Analytics and Data Warehousing or Autonomous Database for Transaction Processing and Mixed Workloads user that Data Integration uses. |
Policies to allow Data Integration to use Object Storage for staging data | When you use Autonomous Database for Analytics and Data Warehousing or Autonomous Database for Transaction Processing and Mixed Workloads as a target data asset, Data Integration uses an Object Storage bucket to stage the data before loading it into the target. You must create all the required Object Storage policies. |
Object Storage default staging location | A default staging location is an existing Object Storage bucket that Data Integration uses to stage data before loading the data into the Autonomous Database for Analytics and Data Warehousing or Autonomous Database for Transaction Processing and Mixed Workloads target. You can set up a default staging location for an Autonomous Database for Analytics and Data Warehousing or Autonomous Database for Transaction Processing and Mixed Workloads data asset when you create or edit the data asset. As you add such a target data asset in a data flow or data loader task, Data Integration automatically uses the default staging location that has been set. |
Using Oracle Fusion Applications data assets in Data Integration requires additional setup and access to other services in Oracle Cloud Infrastructure.
Before you create an Oracle Fusion Applications data asset, ensure that you understand the dependencies, and have completed the prerequisite tasks.
Data Integration uses the Oracle Business Intelligence Cloud Connector (BICC) or the Oracle Business Intelligence Publisher (BIP) to enable connections to Oracle Fusion Applications as data sources.
You use an Oracle Fusion Applications data asset as a source to extract data from Oracle Fusion Applications such as an ERP or HCM application. Data Integration loads the extracted data into a predefined external storage location that is configured in BICC. Policies to use BICC and network requirements for accessing Oracle Fusion Applications must be in place.
Use the following task checklist to ensure that you have the required setup and information you need for creating and using Oracle Fusion Applications data sources.
Task | Requirement |
---|---|
Obtain access to BICC | This topic assumes you already have the proper user credentials to access and use BICC. See Provision Access in Creating a Business Intelligence Cloud Extract. You use BICC to set up an external storage location for staging the extracted data. |
Obtain your Fusion Applications connection details | When you create a Fusion Applications data asset, you need to provide the required connection details. Instead of providing the actual password, you can use a secret in Oracle Cloud Infrastructure Vault to store the password, and provide the OCID of that secret. See Overview of Vault. To copy the secret OCID, see View Secret Details. |
Verify network configuration to access Oracle Fusion Applications. | If the Fusion Applications host is accessible from the Internet, and the workspace has an attached virtual cloud network (VCN), the VCN must meet certain requirements. See Virtual Networking Quickstart. If the Fusion Applications host is not accessible from the Internet, you can use Oracle Cloud Infrastructure FastConnect to set up a fast and secure connection between Oracle Cloud Infrastructure and the Fusion Applications host. |
Create policies to allow Data Integration to access buckets and objects in Oracle Cloud Infrastructure Object Storage. | The policies are required for staging the extracted data, which requires pre-authenticated requests to complete the operations. |
Create a bucket in Oracle Cloud Infrastructure Object Storage. | Object Storage is the only storage type in BICC that you can use as the external storage location for staging extracted data. After creating a bucket, gather the bucket details that you need to provide when you set up the External Storage Location in BICC. |
Set up the external storage location in BICC to reference the Object Storage bucket you created. | See Storage Type - OCI Object Storage Connection under Configure Where to Load Data in Creating a Business Intelligence Cloud Extract. |
The policies required also depend on whether the Object Storage instance and the Data Integration instance are in the same tenancy or in different tenancies, and whether you create the policies at the compartment level or the tenancy level. Review more examples and this blog to identify the right policies for your needs.
Third-party Sources on Cloud or On-premises
Data Sources on Third-party Platforms
Amazon S3 (Simple Storage Service) is a storage service that lets you store semi-structured and unstructured data as objects within resources called buckets. You only need to create one data asset in Data Integration for all the buckets that your AWS credentials have access to.
Creating a Connection
When you create a data asset, you must provide the default connection details. For existing data assets, you can add additional connections as needed.
To create a connection on an existing data asset:
Viewing Data Asset Details
The Data Assets page shows you a list of all the data assets in your Oracle Cloud Infrastructure Data Integration workspace.
You can view summary information of a data asset, access connections, and view data entities in a schema or bucket. Select a data asset to navigate to the details page of the data asset. You can also edit or delete the data asset from the details page.
Summary
In the summary section, you can view data asset details such as type and description. Depending on the type of data source, you can also view the host connection information.
Connections
On the data asset details page, select Connections to view the list of connections available to you. On the list, the Actions menu for a connection has these options:
- Edit: Edits the connection details.
- Make Default: Designates this connection as the default. The action is only applicable to a connection that is not designated as the default.
- Test Connection: Tests whether a connection can be established. If the test fails because of missing policies, you get a list of the policies that must be added. Note the following:
  - For an Oracle Object Storage data asset, when you use a cross-tenancy OCID, you need policies to connect to the Object Storage data asset. For information about the required cross-tenancy policies and how to create them, click the View Documentation button in the information box that appears when you enter a cross-tenant OCID.
  - For an autonomous database data asset, you get the list of policies only if you selected Use Vault Secret OCID as the option for providing the login credentials for the autonomous database.
- Copy Connection Key: Copies to the clipboard the key of the connection, which you can use with APIs.
- Delete: Deletes this connection. Note: You cannot delete the connection that is designated as the default connection.
Schemas
On the details page of a data asset that is not Oracle Object Storage, select Schemas to view the list of schemas available to you through a connection in the data asset. If the data asset has more than one connection, select a connection under Filters, then click a schema to view the data entities in that schema.
Buckets
On the details page of an Oracle Object Storage data asset, select Buckets to view the list of buckets available to you. Select the compartment under List of Scope, then click a bucket to view the data entities in that bucket.
Data Entities (Schema)
Available data entities are listed on the schema details page. Select a data entity to view the entity details. On the data entity details page, click Attributes to list the attribute names and data types. Click Data to list the data rows in that data entity. In the Data table, click an attribute header to view the data and attribute profile.
For BIP Oracle Fusion Applications data assets, the available reports are listed on the schema details page. Select a report to view the report attributes and data.
Data Entities (Bucket in Oracle Object Storage)
Available data entities are listed on the schema (bucket) details page. Select a data entity to view the entity details. On the data entity details page, click Attributes to list the attribute names and data types. Click Data to list the data rows in that data entity.
For a CSV, JSON, Parquet, or Avro data entity, the Attributes and Data lists are empty until you specify the Formatting Parameters and retrieve the attributes:
- Select the File Type of the data entity.
- For CSV and JSON files, select the Compression Type of the file. If you do not know the compression algorithm that is used to compress the file, use Auto (Default). For Parquet and Avro files, only Auto (Default) is available as the Compression Type.
- Specify how to parse the attributes retrieved, if applicable. For CSV files, you can specify to include a header, the encoding, escape character, delimiter, and quote character. For JSON, specify the encoding.
- Click Get Attributes to fetch the data entity's attributes.
- After the data entity's attributes have been successfully retrieved, click Data to list the data rows.
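These formatting parameters correspond to standard CSV dialect options. A minimal Python sketch of the same parsing choices; the file name and parameter values are placeholders:

```python
# Minimal sketch: parse a CSV entity with explicit dialect options that
# mirror the formatting parameters above.
import csv

with open("entity.csv", newline="", encoding="utf-8") as f:  # encoding
    reader = csv.reader(
        f,
        delimiter=",",    # delimiter
        quotechar='"',    # quote character
        escapechar="\\",  # escape character
    )
    header = next(reader)  # first row as header
    for row in reader:
        print(dict(zip(header, row)))
```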
In the Data table, click an attribute header to view the data and attribute profile.
A data and attribute profile is not supported on these attribute data types:
- BLOB
- RAW
- BINARY
- BINARY_DOUBLE
- BINARY_FLOAT
- CLOB
- NCLOB
- SDO_GEOMETRY
- XMLTYPE
- XMLFORMAT
- COMPLEX
- VARBINARY
Editing a Data Asset
In the Console, you can only edit a data asset from the Data Assets page. You can also update a data asset using the API. When you edit a data asset, you can modify the data asset's details, select or add new connections, and change the default connection.
To edit a data asset:
Editing a Connection
You edit connections from a data asset's details page.
To edit a connection:
Deleting a Data Asset
When you delete a data asset, you also permanently delete its connections. Once deleted, the data asset and its connections cannot be restored. You can delete a data asset using the Console or the APIs.
To delete a data asset:
Using the API
For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.
Use the following operations to manage data assets:
Use the following operations to manage connections:
Use the following operations to manage schemas:
Use the following operations to manage data entities:
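As an illustration of how these operations chain together, here is a minimal sketch, assuming the OCI Python SDK (oci). The workspace OCID is a placeholder, and the method names follow the REST operations referenced above:

```python
# Minimal sketch (OCI Python SDK): list data assets in a workspace and the
# connections on each one.
import oci

config = oci.config.from_file()
client = oci.data_integration.DataIntegrationClient(config)

workspace_id = "ocid1.disworkspace.oc1..example"  # placeholder OCID

for asset in client.list_data_assets(workspace_id).data.items:
    print("data asset:", asset.name)
    for conn in client.list_connections(
        workspace_id, data_asset_key=asset.key
    ).data.items:
        print("  connection:", conn.name)
```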