This chapter describes how to work with XML files in Oracle Data Integrator.
This chapter includes the following sections:
Oracle Data Integrator supports XML files integration through the Oracle Data Integrator Driver for XML.
The XML concepts map the Oracle Data Integrator concepts as follows: An XML file corresponds to a data server in Oracle Data Integrator. Within this data server, a single schema maps the content of the XML file.The Oracle Data Integrator Driver for XML (XML driver) loads the hierarchical structure of the XML file into a relational schema. This relational schema is a set of tables located in the schema that can be queried or modified using SQL. The XML driver is also able to unload the relational schema back in the XML file.The relational schema is reverse-engineered as a data model in ODI, with tables, columns, and constraints. This model is used like a normal relational data model in ODI. If the modified data within the relational schema needs to be written back to the XML file, the XML driver provides the capability to synchronize the relational schema into the file.
See Appendix B, "Oracle Data Integrator Driver for XML Reference" for more information on this driver.
Oracle Data Integrator provides the IKM XML Control Append for handling XML data. This Knowledge Module is a specific XML Knowledge Module. It has a specific option to synchronize the data from the relational schema to the file.
In addition to this KM, you can also use an XML data server as any SQL data server. XML data servers support both the technology-specific KMs sourcing or targeting SQL data servers, as well as the generic KMs. See Chapter 4, "Generic SQL" or the technology chapters for more information on these KMs.
Make sure you have read the information in this section before you start using the XML Knowledge Module:
Before performing any installation you should read the system requirements and certification documentation to ensure that your environment meets the minimum installation requirements for the products you are installing.
The list of supported platforms and versions is available on Oracle Technical Network (OTN):
There are no technology-specific requirements for using XML Files in Oracle Data Integrator.
This section lists the requirements for connecting to XML database.
Oracle Data Integrator Driver for XML
XML files are accessed through the Oracle Data Integrator Driver for XML. This JDBC driver is installed with Oracle Data Integrator and requires no other installed component or configuration.
You must ask the system administrator for the following connection information:
The location of the DTD or XSD file associated with your XML file
The location of the XML file
Setting up the topology consists in:
An XML data server corresponds to one XML file that is accessible to Oracle Data Integrator.
Create a data server for the XML technology using the standard procedure, as described in "Creating a Data Server" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator. This section details only the fields required or specific for defining a File data server:
In the Definition tab:
Name: Name of the data server that will appear in Oracle Data Integrator.
User/Password: These fields are not used for XML data servers.
In the JDBC tab, enter the values according to the driver used:
Table 5-1 lists the key properties of the Oracle Data Integrator Driver for XML. These properties can be specified in the JDBC URL.
See Appendix B, "Oracle Data Integrator Driver for XML Reference" for a detailed description of these properties and for a comprehensive list of all properties.
Table 5-1 JDBC Driver Properties
<XML File location>
XML file name. Use slash "/" in the path name instead of back slash "\". It is possible to use an HTTP, FTP or File URL to locate the file. Files located by URL are read-only.
<DTD/XSD File location>
Description file: This file may be a DTD or XSD file. It is possible to use an HTTP, FTP or File URL to locate the file. Files located by URL are read-only.
Note that when no DTD or XSD file is present, the relational schema is built using only the XML file content. It is not recommended to reverse-engineer the data model from such a structure as one XML file instance may not contain all the possible elements described in the DTD or XSD, and data model may be incomplete.
Name of the element to take as the root table of the schema. This value is case sensitive. This property can be used for reverse-engineering for example a specific message definition from a WSDL file, or when several possible root elements exist in a XSD file.
true | false
If true, the XML file is opened in read only mode.
Name of the relational schema where the XML file will be loaded. If this property is missing, a schema named after the five first letters of the XML file name will automatically be created.
This schema will be selected when creating the physical schema under the XML data server.
true | false
Load the XML file in case sensitive or insensitive mode. For case insensitive mode, all element names in the DTD file should be distinct (For example: Abc and abc in the same file will result in name collisions).
The following examples illustrate these properties:
Connects to the
PROD20100125_001.xml file described by
products.xsd in the PRODUCTS schema.
Connects in read-only mode to the
staff_internal.xml file described by
staff_internal.dtd in read-only mode. The schema name will be
Create an XML physical schema using the standard procedure, as described in "Creating a Physical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.
The schema name that you have set on the URL will be preset. Select this schema for both the Data Schema and Work Schema.
Create for this physical schema a logical schema using the standard procedure, as described in "Creating a Logical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator and associate it in a given context.
Setting up a Project using the XML database follows the standard procedure. See "Creating an Integration Project" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.
The recommended knowledge modules to import into your project for getting started with XML are the following:
LKM SQL to SQL
LKM File to SQL
IKM XML Control Append
This section contains the following topics:
An XML file model groups a set of datastores. Each datastore typically represents an element in the XML file.
Create an XML Model using the standard procedure, as described in "Creating a Model" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator. Select the XML technology and the XML logical schema created when configuring the topology.
XML supports standard reverse-engineering, which uses only the abilities of the XML driver.
It is recommended to reference a DTD or XSD file in the dtd or d parameters of the URL to reverse-engineer the structure from a generic description of the XML file structure. Reverse-engineering can use an XML instance file if no XSD or DTD is available. In this case, the relational schema structure will be inferred from the data contained in the XML file.
To perform a Standard Reverse- Engineering on XML use the usual procedure, as described in "Reverse-engineering a Model" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.
The standard reverse-engineering process will automatically reverse-engineer the table from the relational schema generated by the XML driver. Note that these tables automatically include:
Primary keys (PK columns) to preserve parent-child elements relationships
Foreign keys (FK columns) to preserve parent-child elements relationships
Order identifier (ORDER columns) to preserve the order of elements in the XML file
These extra columns enable the mapping of the hierarchical XML structure into the relational schema. See XML to SQL Mapping in the Appendix B, "Oracle Data Integrator Driver for XML Reference" for more information.
You can use XML as a source or a target of an integration interface.
The KM choice for an interface or a check determines the abilities and performances of this interface or check. The recommendations in this section help in the selection of the KM for different situations concerning an XML data server.
Read carefully these notes before working with XML in integration interfaces.
When using a datastore of an XML model as a target of an interface, you must make sure to load the driver-generated columns that are used for preserving the parent-child relationships and the order in the XML hierarchy. For example, if filling records for the
region element into an XML structure as shown in Example 5-1, that correspond to a REGION table in the relational schema, you should load the columns REGION_ID and REGION_NAME of the REGION table. These two columns correspond to XML attributes.
<country COUNTRY_ID="6" COUNTRY_NAME="Australia"> <region REGION_ID="72" REGION_NAME="Queensland"> </country>
In Example 5-1 you must also load the following additional columns that are automatically created by the XML Driver in the REGION table:
REGIONPK: This column enables you to identify each
REGIONORDER: This column enables you to order the
<region> elements in the XML file (records are not ordered in a relational schema, whereas XML elements are ordered).
COUNTRYFK: This columns enables you to put the
<region> element in relation with the
<country> parent element. This value is equal to the COUNTRY.COUNTRYPK value for the Australia record in the COUNTRY table.
To ensure a perfect synchronization of the data in an XML file and the data in the XML schema, the following commands have to be called:
Before using the tables of an XML model, either to read or update data, it is recommended that you use the
SYNCHRONIZE FROM FILE command on the XML logical schema. This operation reloads the XML hierarchical data in the relational XML schema. The schema is loaded in the built-in or external database storage when first accessed. Subsequent changes made to the file are not automatically synchronized into the schema unless you issue this command.
After performing changes in the relational schema, you must unload this schema into the XML hierarchical data by calling the
SYNCHRONIZE ALL or
SYNCHRONIZE FROM DATABASE commands on the XML Logical Schema. The IKM XML Control Append implements this synchronize command.
These commands must be executed in procedures in the packages before (and after) the interfaces and procedures manipulating the XML schema.
See Appendix B, "Oracle Data Integrator Driver for XML Reference" for more information on these commands.
Large XML files can be handled with high performance with Oracle Data Integrator.
The default driver configuration stores the relational schema in a built-in engine in memory. It is recommended to consider the use of external database storage for handling large XML files.
See Section B.2.3.1, "Schema Storage" for more information on these commands.
An XML file can be used as an interface's source or target. The LKM choice in the Interface Flow tab that is used to load data between XML files and other types of data servers is essential for the performance of the interface.
Use the Generic SQL KMs or the KMs specific to the other technology involved to load data from an XML database to a target or staging area database.
Table 5-2 lists some examples of KMs that you can use to load from an XML source to a staging area:
Table 5-2 KMs to Load from XML to a Staging Area
Microsoft SQL Server
LKM SQL to MSSQL (BULK)
Uses SQL Server's bulk loader.
LKM SQL to Oracle
Faster than the Generic LKM (Uses Statistics)
LKM SQL to Sybase ASE (BCP)
Uses Sybase's bulk loader.
LKM SQL to SQL
Generic KM to load data between an ANSI SQL-92 source and an ANSI SQL-92 staging area.
It is not advised to use an XML schema as a staging area, except if XML is the target of the interface and you wish to use the target as a staging area. In this case, it might be required to load data to an XML schema.
Use the Generic SQL KMs or the KMs specific to the other technology involved to load data from a source or staging area into an XML schema.
Table 5-3 lists some examples of KMs that you can use to load from a source to an XML staging area.
XML can be used as a target of an interface. The data integration strategies in XML concern loading from the staging area to XML. The IKM choice in the Interface Flow tab determines the performances and possibilities for integrating.
The IKM XML Control Append integrates data into the XML schema and has an option to synchronize the data to the file. In addition to this KM, you can also use the Generic SQL KMs or the KMs specific to the other technology involved. Note that if using generic or technology-specific KMs, you must manually perform the synchronize operation to write the changes made in the schema to the XML file.
Table 5-4 lists some examples of KMs that you can use to integrate data:
From a staging area to an XML target
From an XML staging area to an XML target. Note that in this case the staging area is on the XML target.
This section provides information on how to troubleshoot problems that you might encounter when using XML in Oracle Data Integrator. It contains the following topics:
Errors appear often in Oracle Data Integrator in the following way:
java.sql.SQLException: No suitable driver at ... at ... ...
java.sql.SQLExceptioncode simply indicates that a query was made through the JDBC driver, which has returned an error. This error is frequently a database or driver error, and must be interpreted in this direction.
Only the part of text in bold must first be taken in account. It must be searched in the XML driver documentation. If it contains a specific error code, like here, the error can be immediately identified.
If such an error is identified in the execution log, it is necessary to analyze the SQL code send to the database to find the source of the error. The code is displayed in the description tab of the task in error.
This section describes the most common errors with XML along with the principal causes. It contains the following topics:
No suitable driver
The JDBC URL is incorrect. Check that the URL syntax is valid.
File <XML file> is already locked by another instance of the XML driver.
The XML file is locked by another user/application. Close all application that might be using the XML file. If such an application has crashed, then remove the .lck file remaining in the XML file's directory.
The DTD file "xxxxxxx.dtd" doesn't exist
This exception may occur when trying to load an XML file by the command LOAD FILE. The error message can have two causes:
The path of the DTD file is incorrect.
The corresponding XML file was already opened by another schema (during connection for instance).
Table not found: S0002 Table not found: <table name> in statement [<SQL statement>]
The table you are trying to access does not exist in the schema.
Column not found: S0022 Column not found: <column name> in statement [<SQL statement>]
The column you are trying to access does not exist in the tables specified in the statement.