Load and Unload Data From the Database

This section discusses how to move data to and from the database using CloverETL.

How CloverETL Works With Databases

CloverETL uses the JDBC API to communicate with databases. If your database has a driver supporting the JDBC API, CloverETL can be used to unload data stored within database table, or it can populate a database table with internal data.

DBConnection

Before you can connect to a database, you must define the DBConnection. This property is defined within a graph.

<DBConnection id="InterbaseDB" dbConfig="Interbase.cfg"/>

This specifies that CloverETL should set up a database connection called InterbaseDB. All required parameters (JDBC driver name, DB connect string, user name, and password) can be found in the configuration file Interbase.cfg.

The dbConfig file is a standard Java properties file. It contains names of parameters along with their values. The following table lists the possible parameters.

Parameter Name	Description of Parameter	Example of Parameter's Value
`dbDriver`	Specifies the name of the class containing the JDBC driver for your database. This class must be visible to Java (be part of `CLASSPATH`).	`org.postgresql.Driver`
`dbURL`	URL for connecting to the database, including the name of JDBC driver to use, the IP address where the server listens, the name of the database instance, and the port.	`jdbc:postgresql://192.168.1.100/mydb`
`user`	The user name under which to connect to the database.	`Admin`
`password`	The password to be used.	`free`
`driverLibrary`	Optional. The location of the JDBC driver class.	`c:\Oracle\product\10.1.0\Client_1\jdbc\lib\ojdbc14.jar\`
JDBC driver-specific parameters	Optional. Specify as needed.	Oracle example: `defaultRowPrefetch=10`

The following example lists the possible contents of a Postgres.cfg file that defines the connection to a PostgreSQL database:

dbDriver=org.postgresql.Driver

dbURL=jdbc:postgresql://192.168.1.100/mydb

user=david

password=unknown

All parameters can also be directly specified when defining the connection:

<DBConnection id="InterbaseDB" dbDriver="org.postgresql.Driver" 
dbURL="jdbc:postgresql://192.168.1.100/mydb" user="david" password="unknown"/>

The values specified with the dbConfig parameter takes precedence over parameters specified in a properties file.

Mapping JDBC Data Types to Clover Types

When working with the database through JDBC drivers, CloverETL needs to map its internal data types onto JDBC data types. The variety of DB (JDBC) field types is large, but most of them (with the exception of BLOBs) can be mapped to Clover internal types without losing any information.

JDBC to CloverETL

The following table lists JDBC data types and corresponding CloverETL data types. The conversion is done automatically by CloverETL when analyzing DB tables using the org.jetel.database.AnalyzeDB utility. This conversion can also be made manually.

JDBC (DB) Data Type	CloverETL Data Type
`INTEGER` `SMALLINT` `TINYINT`	`INTEGER`
`BIGINT`	`LONG`
`DECIMAL` `DOUBLE` `FLOAT` `NUMERIC` `REAL`	`NUMERIC`
`CHAR` `LONGVARCHAR` `VARCHAR` `OTHER`	`STRING`
`DATE` `TIME` `TIMESTAMP`	`DATE`
`BOOLEAN` `BIT`	`STRING` (true value coded as `T`; false value coded as `F`)

The following example illustrates the conversion. First, the DDL (Oracle DB) definition of the database table is presented, and then Clover's version of the same thing using its internal datatypes.

create table MYEMPLOYEE

(

  EMP_NO      NUMBER not null,

  FIRST_NAME  VARCHAR2(15) not null,

  LAST_NAME   VARCHAR2(20) not null,

  PHONE_EXT   VARCHAR2(4),

  HIRE_DATE   DATE not null,

  DEPT_NO     CHAR(3) not null,

  JOB_CODE    VARCHAR2(5) not null,

  JOB_GRADE   NUMBER(4,2) not null,

  JOB_COUNTRY VARCHAR2(15) not null,

  SALARY      NUMBER(15,2) not null,

  FULL_NAME   VARCHAR2(35)

);

<?xml version="1.0" encoding="UTF-8"?>

<!-- Automatically generated from database null -->

<Record name="EMPLOYEE" type="delimited">

    <Field name="EMP_NO" type="numeric" delimiter="," format="#"/>

    <Field name="FIRST_NAME" type="string" delimiter="," />

    <Field name="LAST_NAME" type="string" delimiter="," />

    <Field name="PHONE_EXT" type="string" nullable="yes" delimiter="," />

    <Field name="HIRE_DATE" type="date" delimiter="," format="dd/MM/yyyy" />

    <Field name="DEPT_NO" type="string" delimiter="," />

    <Field name="JOB_CODE" type="string" delimiter="," />

    <Field name="JOB_GRADE" type="numeric" delimiter="," />

    <Field name="JOB_COUNTRY" type="string" delimiter="," />

    <Field name="SALARY" type="numeric" delimiter="," />

    <Field name="FULL_NAME" type="string" nullable="yes" delimiter="\n" />

</Record>

CloverETL to JDBC

The reverse conversion from a CloverETL to JDBC data type (usually done when populating a target DB table) is also driven by JDBC data types. There are some exceptions that are caused by the non-existence of certain field types on the CloverETL side. These exceptions are handled automatically by CloverETL. Internally it is done by calling different than standard JDBC methods for populating database fields with values. Refer to the source code (org.jetel.database.CopySQLData) to get detailed information.

JDCB Type	CloverETL Type	Conversion Performed
Timestamp	Date	Date is converted to Timestamp, and the target is set using the `setTimestamp()` method.
Boolean Bit	String	If the string contains `T` or `t`, the target is set to be True; otherwise False using `setBoolean()`.
Decimal Double Numeric Real	Integer	Conversion from Integer to Decimal is made. The target is set using the `setDouble()` method.
Other (includes `NVARCHAR` and `NCHAR`	String	The target is set using the `setString()` method.

Using the AnalyzeDB Utility

The CloverETL package contains a simple utility that can analyze a source or target database table and produce Clover's metadata description file. This metadata can be used by any DB-related component.

The following table lists the parameters that can be specified with the AnalyzeDB command. The command must specify which database to connect to and which database table to analyze. You can use the same DBConnection file described previously in the DBConnection section.

To specify which table to analyze, supply an SQL query to execute against the database. The returned result set is examined for field types. As a result, you can extract and analyze a portion of table.

The following table lists the options and parameters:

Parameter	Meaning
`-dbDriver`	JDBC driver to use
`-dbURL`	Database name (URL)
`-config`	Config or Property file containing parameters
`-user`	User name
`-password`	User's password
`-d`	Delimiter to use (a comma `,` is standard)
`-o`	Output file to use (`stdout` is standard)
`-f`	Read SQL query from file name
`-q`	SQL query on command line
`-info`	Displays list of driver's properties

The following example examines all data fields of the employees DB table:

java -cp cloverETL.rel-1-x.zip org.jetel.database.AnalyzeDB -config postgres.sql -q "select * from employees where 1=0"

The following example extracts specific fields, as stated in the SQL query:

java -cp cloverETL.rel-1-x.zip org.jetel.database.AnalyzeDB -config postgres.sql -q "select emp_no,full_name from employees where 1=0"

DBInputTable Component

To unload data from the database table, use the DBInputTable component. It requires that the dbConnection parameter be specified and an SQL command (sqlQuery parameter), which will be executed against the database specified by dbConnection.

Individual fields fetched from the database are mapped to Clover data records/fields. (See the JDBC to CloverETL). The structure of the Clover record is determined by specified Clover metadata. (Metadata is assigned to an Edge, which connects DBInputTable with other components connected to DBInputTable.)

The following example transformation graph uses the DBInputTable component:

<?xml version="1.0" encoding="UTF-8"?>

<Graph name="TestingDB">

<Global>

<Metadata id="InMetadata" fileURL="metadata/employee.fmt"/>

<DBConnection id="PosgressDB" dbConfig="Posgress.cfg"/>

</Global>

<Phase number="0">

<Node id="INPUT" type="DB_INPUT_TABLE" dbConnection="PosgressDB"

sqlQuery="select * from employee"/>

<Node id="OUTPUT" type="DELIMITED_DATA_WRITER_NIO" append="false"

fileURL="employees2.list.out"/>

<Edge id="INEDGE" fromNode="INPUT:0" toNode="OUTPUT:0"

metadata="InMetadata"/>

</Phase>

</Graph>

The SQL command (sqlQuery) can be more complicated than the previous example suggests. You can use any valid SQL construct, but make sure that the metadata corresponds to the number and types of returned data fields.

DBOutputTable Component

When there is a need to populate a database table with data coming from a CloverETL transformation graph, the DBOutputTable component can be used to fulfill it. It is complementary to DBInputTable. It maps CloverETL data records and individual fields to target database table fields. It can perform simple data conversions to successfully map CloverETL basic data types on to target database variants. See the previous CloverETL to JDBC.

The following example illustrates the usage of DBOutputTable:

<?xml version="1.0" encoding="UTF-8"?>

<Graph name="TestingDB2">

<Global>

<Metadata id="InMetadata" fileURL="metadata/myemployee.fmt"/>

<DBConnection id="PosgressDB" dbConfig="posgress.cfg"/>

</Global>

<Phase number="0">

<Node id="INPUT" type="DELIMITED_DATA_READER_NIO"

fileURL="employees.list.dat" />

<Node id="OUTPUT" type="DB_OUTPUT_TABLE" dbConnection="PosgressDB"

dbTable="myemployee" />

<Edge id="INEDGE" fromNode="INPUT:0" toNode="OUTPUT:0"

metadata="InMetadata"/>

</Phase>

</Graph>

If you need to populate only certain fields of the target DB table (when, for instance, one field is automatically populated from a DB sequence), the dbFields parameter of DBOutputTable can be used:

<Node id="OUTPUT2" type="DB_OUTPUT_TABLE" dbConnection="PosgressDB" 
dbTable="myemployee" dbFields="FIRST_NAME;LAST_NAME" />

The DBOutputTable cloverFields parameter can be used to precisely specify mapping from CloverETL data records to database table records. It allows you to specify which source field (from Clover) is mapped to which target database table field.

Coupled with dbFields, it specifies a 1:1 mapping. Individual fields are mapped according to the order in which they appear in dbFields and cloverFields, respectively. The parameter that determines how many fields will be populated is always dbFields. When there is no dbFields parameter present, CloverETL assumes that all target fields should be populated in the order in which they appear in the target database table.

The following examples illustrate how to pick certain fields from the source data record (a CloverETL record), regardless of their order, and map them to target database table fields (again, regardless of their order).

<?xml version="1.0" encoding="UTF-8"?>

<Graph name="TestingDB3">

<Global>

<Metadata id="InMetadata" fileURL="metadata/myemployee.fmt"/>

<DBConnection id="PosgressDB" dbConfig="posgress.cfg"/>

</Global>

<Phase number="1">

<Node id="INPUT" type="DELIMITED_DATA_READER_NIO"

fileURL="employees2.list.tmp" />

<Node id="OUTPUT" type="DB_OUTPUT_TABLE" dbConnection="InterbaseDB"

dbTable="myemployee"

    dbFields="FIRST_NAME;LAST_NAME"

    cloverFields="LAST_NAME;FIRST_NAME" />

<Edge id="INEDGE" fromNode="INPUT:0" toNode="OUTPUT:0"

metadata="InMetadata"/>

</Phase>

</Graph>

The resulting mapping between fields specified in the previous example is:

Source Field (CloverETL)	Target Field (DB Table)
`LAST_NAME`	`FIRST_NAME`
`FIRST_NAME`	`LAST_NAME`

Executing SQL/DML/DDL Statements against DB

Sometimes you need to execute one or more database commands that do not require any input. Examples include creating a new table, adding a data partition, and dropping an index. For this purpose, CloverETL offers the DBExecute component.

DBExecute Component

The DBExecute component takes specified commands and executes them one by one against the database. You can define whether all commands form one transaction, or whether they should be committed to the database after each command.

The following is a simple example of DBExecute:

<?xml version="1.0" encoding="UTF-8"?>

<Graph name="TestingExecute">

<Global>

<DBConnection id="InterbaseDB" dbConfig="interbase.cfg"/>

</Global>

<Phase number="0">

<Node id="DBEXEC" type="DB_EXECUTE" dbConnection="InterbaseDB"

inTransaction="N">

<SQLCode>

create table EMPLOYEE

(

  EMP_NO      NUMBER not null,

  FIRST_NAME  VARCHAR2(15) not null,

  LAST_NAME   VARCHAR2(20) not null,

  PHONE_EXT   VARCHAR2(4),

  HIRE_DATE   DATE not null,

  DEPT_NO     CHAR(3) not null,

  JOB_CODE    VARCHAR2(5) not null,

  JOB_GRADE   NUMBER(4,2) not null,

  JOB_COUNTRY VARCHAR2(15) not null,

  SALARY      NUMBER(15,2) not null,

  FULL_NAME   VARCHAR2(35)

);

insert into employee values(2,'Robert','Nelson','250',28/12/1988,'600','VP',2.0,
'USA',105900.0,'Nelson, Robert');

insert into employee values(4,'Bruce','Young','233',28/12/1988,'621','Eng',2.0,
'USA',97500.0,'Young, Bruce');

insert into employee values(5,'Kim','Lambert','22',06/02/1989,'130','Eng',2.0,
'USA', 102750.0,'Lambert, Kim');

insert into employee values(8,'Leslie','Johnson','410',05/04/1989,'180','Mktg',
3.0,'USA', 64635.0,'Johnson, Leslie');

insert into employee values(9,'Phil','Forest','229',17/04/1989,'622','Mngr',3.0,'USA',75060.0,'Fores /
t,

Phil');

</SQLCode>

</Node>

</Phase>

</Graph>

Skip Navigation Links
Exit Print View
	Oracle Identity Analytics Business Administrator's Guide 11g Release 1