Skip Navigation Links | |
Exit Print View | |
Oracle Identity Analytics Business Administrator's Guide 11g Release 1 |
1. Oracle Identity Analytics Identity Warehouse
2. Oracle Identity Analytics Importing
3. Oracle Identity Analytics ETL Process
Oracle Identity Analytics CloverETL Extensions
Oracle Identity Analytics ETL Reference
DelimitedDataReader and DelimitedDataWriter
Load and Unload Data From the Database
How CloverETL Works With Databases
Mapping JDBC Data Types to Clover Types
CloverETL DataRecord Reference
How Data is Represented Within CloverETL
Specification of Record Format
Field Formats and Other Features
Specifying Default Values for Fields
4. Oracle Identity Analytics Data Correlation
5. Oracle Identity Analytics Role Engineering and Management
6. Oracle Identity Analytics Workflows
7. Oracle Identity Analytics Identity Certifications
8. Oracle Identity Analytics Identity Audit
9. Oracle Identity Analytics Reports
10. Oracle Identity Analytics Scheduling
11. Oracle Identity Analytics Configuration
This section discusses how to move data to and from the database using CloverETL.
CloverETL uses the JDBC API to communicate with databases. If your database has a driver supporting the JDBC API, CloverETL can be used to unload data stored within database table, or it can populate a database table with internal data.
Before you can connect to a database, you must define the DBConnection. This property is defined within a graph.
<DBConnection id="InterbaseDB" dbConfig="Interbase.cfg"/>
This specifies that CloverETL should set up a database connection called InterbaseDB. All required parameters (JDBC driver name, DB connect string, user name, and password) can be found in the configuration file Interbase.cfg.
The dbConfig file is a standard Java properties file. It contains names of parameters along with their values. The following table lists the possible parameters.
|
The following example lists the possible contents of a Postgres.cfg file that defines the connection to a PostgreSQL database:
dbDriver=org.postgresql.Driver dbURL=jdbc:postgresql://192.168.1.100/mydb user=david password=unknown
All parameters can also be directly specified when defining the connection:
<DBConnection id="InterbaseDB" dbDriver="org.postgresql.Driver" dbURL="jdbc:postgresql://192.168.1.100/mydb" user="david" password="unknown"/>
The values specified with the dbConfig parameter takes precedence over parameters specified in a properties file.
When working with the database through JDBC drivers, CloverETL needs to map its internal data types onto JDBC data types. The variety of DB (JDBC) field types is large, but most of them (with the exception of BLOBs) can be mapped to Clover internal types without losing any information.
The following table lists JDBC data types and corresponding CloverETL data types. The conversion is done automatically by CloverETL when analyzing DB tables using the org.jetel.database.AnalyzeDB utility. This conversion can also be made manually.
|
The following example illustrates the conversion. First, the DDL (Oracle DB) definition of the database table is presented, and then Clover's version of the same thing using its internal datatypes.
create table MYEMPLOYEE ( EMP_NO NUMBER not null, FIRST_NAME VARCHAR2(15) not null, LAST_NAME VARCHAR2(20) not null, PHONE_EXT VARCHAR2(4), HIRE_DATE DATE not null, DEPT_NO CHAR(3) not null, JOB_CODE VARCHAR2(5) not null, JOB_GRADE NUMBER(4,2) not null, JOB_COUNTRY VARCHAR2(15) not null, SALARY NUMBER(15,2) not null, FULL_NAME VARCHAR2(35) );
<?xml version="1.0" encoding="UTF-8"?> <!-- Automatically generated from database null --> <Record name="EMPLOYEE" type="delimited"> <Field name="EMP_NO" type="numeric" delimiter="," format="#"/> <Field name="FIRST_NAME" type="string" delimiter="," /> <Field name="LAST_NAME" type="string" delimiter="," /> <Field name="PHONE_EXT" type="string" nullable="yes" delimiter="," /> <Field name="HIRE_DATE" type="date" delimiter="," format="dd/MM/yyyy" /> <Field name="DEPT_NO" type="string" delimiter="," /> <Field name="JOB_CODE" type="string" delimiter="," /> <Field name="JOB_GRADE" type="numeric" delimiter="," /> <Field name="JOB_COUNTRY" type="string" delimiter="," /> <Field name="SALARY" type="numeric" delimiter="," /> <Field name="FULL_NAME" type="string" nullable="yes" delimiter="\n" /> </Record>
The reverse conversion from a CloverETL to JDBC data type (usually done when populating a target DB table) is also driven by JDBC data types. There are some exceptions that are caused by the non-existence of certain field types on the CloverETL side. These exceptions are handled automatically by CloverETL. Internally it is done by calling different than standard JDBC methods for populating database fields with values. Refer to the source code (org.jetel.database.CopySQLData) to get detailed information.
|
The CloverETL package contains a simple utility that can analyze a source or target database table and produce Clover's metadata description file. This metadata can be used by any DB-related component.
The following table lists the parameters that can be specified with the AnalyzeDB command. The command must specify which database to connect to and which database table to analyze. You can use the same DBConnection file described previously in the DBConnection section.
To specify which table to analyze, supply an SQL query to execute against the database. The returned result set is examined for field types. As a result, you can extract and analyze a portion of table.
The following table lists the options and parameters:
|
The following example examines all data fields of the employees DB table:
java -cp cloverETL.rel-1-x.zip org.jetel.database.AnalyzeDB -config postgres.sql -q "select * from employees where 1=0"
The following example extracts specific fields, as stated in the SQL query:
java -cp cloverETL.rel-1-x.zip org.jetel.database.AnalyzeDB -config postgres.sql -q "select emp_no,full_name from employees where 1=0"
To unload data from the database table, use the DBInputTable component. It requires that the dbConnection parameter be specified and an SQL command (sqlQuery parameter), which will be executed against the database specified by dbConnection.
Individual fields fetched from the database are mapped to Clover data records/fields. (See the JDBC to CloverETL). The structure of the Clover record is determined by specified Clover metadata. (Metadata is assigned to an Edge, which connects DBInputTable with other components connected to DBInputTable.)
The following example transformation graph uses the DBInputTable component:
<?xml version="1.0" encoding="UTF-8"?> <Graph name="TestingDB"> <Global> <Metadata id="InMetadata" fileURL="metadata/employee.fmt"/> <DBConnection id="PosgressDB" dbConfig="Posgress.cfg"/> </Global> <Phase number="0"> <Node id="INPUT" type="DB_INPUT_TABLE" dbConnection="PosgressDB" sqlQuery="select * from employee"/> <Node id="OUTPUT" type="DELIMITED_DATA_WRITER_NIO" append="false" fileURL="employees2.list.out"/> <Edge id="INEDGE" fromNode="INPUT:0" toNode="OUTPUT:0" metadata="InMetadata"/> </Phase> </Graph>
The SQL command (sqlQuery) can be more complicated than the previous example suggests. You can use any valid SQL construct, but make sure that the metadata corresponds to the number and types of returned data fields.
When there is a need to populate a database table with data coming from a CloverETL transformation graph, the DBOutputTable component can be used to fulfill it. It is complementary to DBInputTable. It maps CloverETL data records and individual fields to target database table fields. It can perform simple data conversions to successfully map CloverETL basic data types on to target database variants. See the previous CloverETL to JDBC.
The following example illustrates the usage of DBOutputTable:
<?xml version="1.0" encoding="UTF-8"?> <Graph name="TestingDB2"> <Global> <Metadata id="InMetadata" fileURL="metadata/myemployee.fmt"/> <DBConnection id="PosgressDB" dbConfig="posgress.cfg"/> </Global> <Phase number="0"> <Node id="INPUT" type="DELIMITED_DATA_READER_NIO" fileURL="employees.list.dat" /> <Node id="OUTPUT" type="DB_OUTPUT_TABLE" dbConnection="PosgressDB" dbTable="myemployee" /> <Edge id="INEDGE" fromNode="INPUT:0" toNode="OUTPUT:0" metadata="InMetadata"/> </Phase> </Graph>
If you need to populate only certain fields of the target DB table (when, for instance, one field is automatically populated from a DB sequence), the dbFields parameter of DBOutputTable can be used:
<Node id="OUTPUT2" type="DB_OUTPUT_TABLE" dbConnection="PosgressDB" dbTable="myemployee" dbFields="FIRST_NAME;LAST_NAME" />
The DBOutputTable cloverFields parameter can be used to precisely specify mapping from CloverETL data records to database table records. It allows you to specify which source field (from Clover) is mapped to which target database table field.
Coupled with dbFields, it specifies a 1:1 mapping. Individual fields are mapped according to the order in which they appear in dbFields and cloverFields, respectively. The parameter that determines how many fields will be populated is always dbFields. When there is no dbFields parameter present, CloverETL assumes that all target fields should be populated in the order in which they appear in the target database table.
The following examples illustrate how to pick certain fields from the source data record (a CloverETL record), regardless of their order, and map them to target database table fields (again, regardless of their order).
<?xml version="1.0" encoding="UTF-8"?> <Graph name="TestingDB3"> <Global> <Metadata id="InMetadata" fileURL="metadata/myemployee.fmt"/> <DBConnection id="PosgressDB" dbConfig="posgress.cfg"/> </Global> <Phase number="1"> <Node id="INPUT" type="DELIMITED_DATA_READER_NIO" fileURL="employees2.list.tmp" /> <Node id="OUTPUT" type="DB_OUTPUT_TABLE" dbConnection="InterbaseDB" dbTable="myemployee" dbFields="FIRST_NAME;LAST_NAME" cloverFields="LAST_NAME;FIRST_NAME" /> <Edge id="INEDGE" fromNode="INPUT:0" toNode="OUTPUT:0" metadata="InMetadata"/> </Phase> </Graph>
The resulting mapping between fields specified in the previous example is:
|
Sometimes you need to execute one or more database commands that do not require any input. Examples include creating a new table, adding a data partition, and dropping an index. For this purpose, CloverETL offers the DBExecute component.
The DBExecute component takes specified commands and executes them one by one against the database. You can define whether all commands form one transaction, or whether they should be committed to the database after each command.
The following is a simple example of DBExecute:
<?xml version="1.0" encoding="UTF-8"?> <Graph name="TestingExecute"> <Global> <DBConnection id="InterbaseDB" dbConfig="interbase.cfg"/> </Global> <Phase number="0"> <Node id="DBEXEC" type="DB_EXECUTE" dbConnection="InterbaseDB" inTransaction="N"> <SQLCode> create table EMPLOYEE ( EMP_NO NUMBER not null, FIRST_NAME VARCHAR2(15) not null, LAST_NAME VARCHAR2(20) not null, PHONE_EXT VARCHAR2(4), HIRE_DATE DATE not null, DEPT_NO CHAR(3) not null, JOB_CODE VARCHAR2(5) not null, JOB_GRADE NUMBER(4,2) not null, JOB_COUNTRY VARCHAR2(15) not null, SALARY NUMBER(15,2) not null, FULL_NAME VARCHAR2(35) ); insert into employee values(2,'Robert','Nelson','250',28/12/1988,'600','VP',2.0, 'USA',105900.0,'Nelson, Robert'); insert into employee values(4,'Bruce','Young','233',28/12/1988,'621','Eng',2.0, 'USA',97500.0,'Young, Bruce'); insert into employee values(5,'Kim','Lambert','22',06/02/1989,'130','Eng',2.0, 'USA', 102750.0,'Lambert, Kim'); insert into employee values(8,'Leslie','Johnson','410',05/04/1989,'180','Mktg', 3.0,'USA', 64635.0,'Johnson, Leslie'); insert into employee values(9,'Phil','Forest','229',17/04/1989,'622','Mngr',3.0,'USA',75060.0,'Fores / t, Phil'); </SQLCode> </Node> </Phase> </Graph>