Skip Headers
Oracle® Fusion Middleware Connectivity and Knowledge Modules Guide for Oracle Data Integrator
11g Release 1 (11.1.1)

Part Number E12644-07
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
PDF · Mobi · ePub

5 XML Files

This chapter describes how to work with XML files in Oracle Data Integrator.

This chapter includes the following sections:

5.1 Introduction

Oracle Data Integrator supports XML files integration through the Oracle Data Integrator Driver for XML.

5.1.1 Concepts

The XML concepts map the Oracle Data Integrator concepts as follows: An XML file corresponds to a data server in Oracle Data Integrator. Within this data server, a single schema maps the content of the XML file.The Oracle Data Integrator Driver for XML (XML driver) loads the hierarchical structure of the XML file into a relational schema. This relational schema is a set of tables located in the schema that can be queried or modified using SQL. The XML driver is also able to unload the relational schema back in the XML file.The relational schema is reverse-engineered as a data model in ODI, with tables, columns, and constraints. This model is used like a normal relational data model in ODI. If the modified data within the relational schema needs to be written back to the XML file, the XML driver provides the capability to synchronize the relational schema into the file.

See Appendix B, "Oracle Data Integrator Driver for XML Reference" for more information on this driver.

5.1.2 Knowledge Modules

Oracle Data Integrator provides the IKM XML Control Append for handling XML data. This Knowledge Module is a specific XML Knowledge Module. It has a specific option to synchronize the data from the relational schema to the file.

In addition to this KM, you can also use an XML data server as any SQL data server. XML data servers support both the technology-specific KMs sourcing or targeting SQL data servers, as well as the generic KMs. See Chapter 4, "Generic SQL" or the technology chapters for more information on these KMs.

5.2 Installation and Configuration

Make sure you have read the information in this section before you start using the XML Knowledge Module:

5.2.1 System Requirements

Before performing any installation you should read the system requirements and certification documentation to ensure that your environment meets the minimum installation requirements for the products you are installing.

The list of supported platforms and versions is available on Oracle Technical Network (OTN):

http://www.oracle.com/technology/products/oracle-data-integrator/index.html.

5.2.2 Technologic Specific Requirements

There are no technology-specific requirements for using XML Files in Oracle Data Integrator.

5.2.3 Connectivity Requirements

This section lists the requirements for connecting to XML database.

Oracle Data Integrator Driver for XML

XML files are accessed through the Oracle Data Integrator Driver for XML. This JDBC driver is installed with Oracle Data Integrator and requires no other installed component or configuration.

You must ask the system administrator for the following connection information:

  • The location of the DTD or XSD file associated with your XML file

  • The location of the XML file

5.3 Setting up the Topology

Setting up the topology consists in:

  1. Creating an XML Data Server

  2. Creating a Physical Schema for XML

5.3.1 Creating an XML Data Server

An XML data server corresponds to one XML file that is accessible to Oracle Data Integrator.

5.3.1.1 Creation of the Data Server

Create a data server for the XML technology using the standard procedure, as described in "Creating a Data Server" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator. This section details only the fields required or specific for defining a File data server:

  1. In the Definition tab:

    • Name: Name of the data server that will appear in Oracle Data Integrator.

    • User/Password: These fields are not used for XML data servers.

  2. In the JDBC tab, enter the values according to the driver used:

    • JDBC Driver: com.sunopsis.jdbc.driver.xml.SnpsXmlDriver

    • JDBC URL: jdbc:snps:xml?[property=value&property=value...]

    Table 5-1 lists the key properties of the Oracle Data Integrator Driver for XML. These properties can be specified in the JDBC URL.

    See Appendix B, "Oracle Data Integrator Driver for XML Reference" for a detailed description of these properties and for a comprehensive list of all properties.

    Table 5-1 JDBC Driver Properties

    Property Value Notes

    f

    <XML File location>

    XML file name. Use slash "/" in the path name instead of back slash "\". It is possible to use an HTTP, FTP or File URL to locate the file. Files located by URL are read-only.

    d

    <DTD/XSD File location>

    Description file: This file may be a DTD or XSD file. It is possible to use an HTTP, FTP or File URL to locate the file. Files located by URL are read-only.

    Note that when no DTD or XSD file is present, the relational schema is built using only the XML file content. It is not recommended to reverse-engineer the data model from such a structure as one XML file instance may not contain all the possible elements described in the DTD or XSD, and data model may be incomplete.

    re

    <Root element>

    Name of the element to take as the root table of the schema. This value is case sensitive. This property can be used for reverse-engineering for example a specific message definition from a WSDL file, or when several possible root elements exist in a XSD file.

    ro

    true | false

    If true, the XML file is opened in read only mode.

    s

    <schema name>

    Name of the relational schema where the XML file will be loaded. If this property is missing, a schema named after the five first letters of the XML file name will automatically be created.

    This schema will be selected when creating the physical schema under the XML data server.

    cs

    true | false

    Load the XML file in case sensitive or insensitive mode. For case insensitive mode, all element names in the DTD file should be distinct (For example: Abc and abc in the same file will result in name collisions).


    The following examples illustrate these properties:

    Connects to the PROD20100125_001.xml file described by products.xsd in the PRODUCTS schema.

    jdbc:snps:xml?f=/xml/PROD20100125_001.xml&d=/xml/products.xsd&s=PRODUCTS
    
    

    Connects in read-only mode to the staff_internal.xml file described by staff_internal.dtd in read-only mode. The schema name will be staff.

    jdbc:snps:xml?f=/demo/xml/staff_internal.xml&d=/demo/xml/staff_internal.dtd&ro=true&s=staff
    

5.3.2 Creating a Physical Schema for XML

Create an XML physical schema using the standard procedure, as described in "Creating a Physical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.

The schema name that you have set on the URL will be preset. Select this schema for both the Data Schema and Work Schema.

Create for this physical schema a logical schema using the standard procedure, as described in "Creating a Logical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator and associate it in a given context.

5.4 Setting Up an Integration Project

Setting up a Project using the XML database follows the standard procedure. See "Creating an Integration Project" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.

The recommended knowledge modules to import into your project for getting started with XML are the following:

5.5 Creating and Reverse-Engineering a XML File

This section contains the following topics:

5.5.1 Create an XML Model

An XML file model groups a set of datastores. Each datastore typically represents an element in the XML file.

Create an XML Model using the standard procedure, as described in "Creating a Model" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator. Select the XML technology and the XML logical schema created when configuring the topology.

5.5.2 Reverse-Engineering an XML Model

XML supports standard reverse-engineering, which uses only the abilities of the XML driver.

It is recommended to reference a DTD or XSD file in the dtd or d parameters of the URL to reverse-engineer the structure from a generic description of the XML file structure. Reverse-engineering can use an XML instance file if no XSD or DTD is available. In this case, the relational schema structure will be inferred from the data contained in the XML file.

Standard Reverse-Engineering

To perform a Standard Reverse- Engineering on XML use the usual procedure, as described in "Reverse-engineering a Model" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.

The standard reverse-engineering process will automatically reverse-engineer the table from the relational schema generated by the XML driver. Note that these tables automatically include:

  • Primary keys (PK columns) to preserve parent-child elements relationships

  • Foreign keys (FK columns) to preserve parent-child elements relationships

  • Order identifier (ORDER columns) to preserve the order of elements in the XML file

These extra columns enable the mapping of the hierarchical XML structure into the relational schema. See XML to SQL Mapping in the Appendix B, "Oracle Data Integrator Driver for XML Reference" for more information.

5.6 Designing an Interface

You can use XML as a source or a target of an integration interface.

The KM choice for an interface or a check determines the abilities and performances of this interface or check. The recommendations in this section help in the selection of the KM for different situations concerning an XML data server.

5.6.1 Notes about XML Interfaces

Read carefully these notes before working with XML in integration interfaces.

5.6.1.1 Targeting an XML Structure

When using a datastore of an XML model as a target of an interface, you must make sure to load the driver-generated columns that are used for preserving the parent-child relationships and the order in the XML hierarchy. For example, if filling records for the region element into an XML structure as shown in Example 5-1, that correspond to a REGION table in the relational schema, you should load the columns REGION_ID and REGION_NAME of the REGION table. These two columns correspond to XML attributes.

Example 5-1 XML Structure

<country COUNTRY_ID="6" COUNTRY_NAME="Australia">
    <region REGION_ID="72" REGION_NAME="Queensland">
</country>

In Example 5-1 you must also load the following additional columns that are automatically created by the XML Driver in the REGION table:

  • REGIONPK: This column enables you to identify each <region> element.

  • REGIONORDER: This column enables you to order the <region> elements in the XML file (records are not ordered in a relational schema, whereas XML elements are ordered).

  • COUNTRYFK: This columns enables you to put the <region> element in relation with the <country> parent element. This value is equal to the COUNTRY.COUNTRYPK value for the Australia record in the COUNTRY table.

5.6.1.2 Synchronizing XML File and Schema

To ensure a perfect synchronization of the data in an XML file and the data in the XML schema, the following commands have to be called:

  • Before using the tables of an XML model, either to read or update data, it is recommended that you use the SYNCHRONIZE FROM FILE command on the XML logical schema. This operation reloads the XML hierarchical data in the relational XML schema. The schema is loaded in the built-in or external database storage when first accessed. Subsequent changes made to the file are not automatically synchronized into the schema unless you issue this command.

  • After performing changes in the relational schema, you must unload this schema into the XML hierarchical data by calling the SYNCHRONIZE ALL or SYNCHRONIZE FROM DATABASE commands on the XML Logical Schema. The IKM XML Control Append implements this synchronize command.

These commands must be executed in procedures in the packages before (and after) the interfaces and procedures manipulating the XML schema.

See Appendix B, "Oracle Data Integrator Driver for XML Reference" for more information on these commands.

5.6.1.3 Handling Large XML Files

Large XML files can be handled with high performance with Oracle Data Integrator.

The default driver configuration stores the relational schema in a built-in engine in memory. It is recommended to consider the use of external database storage for handling large XML files.

See Section B.2.3.1, "Schema Storage" for more information on these commands.

5.6.2 Loading Data from and to XML

An XML file can be used as an interface's source or target. The LKM choice in the Interface Flow tab that is used to load data between XML files and other types of data servers is essential for the performance of the interface.

5.6.2.1 Loading Data from an XML Schema

Use the Generic SQL KMs or the KMs specific to the other technology involved to load data from an XML database to a target or staging area database.

Table 5-2 lists some examples of KMs that you can use to load from an XML source to a staging area:

Table 5-2 KMs to Load from XML to a Staging Area

Staging Area KM Notes

Microsoft SQL Server

LKM SQL to MSSQL (BULK)

Uses SQL Server's bulk loader.

Oracle

LKM SQL to Oracle

Faster than the Generic LKM (Uses Statistics)

Sybase

LKM SQL to Sybase ASE (BCP)

Uses Sybase's bulk loader.

All

LKM SQL to SQL

Generic KM to load data between an ANSI SQL-92 source and an ANSI SQL-92 staging area.


5.6.2.2 Loading Data to an XML Schema

It is not advised to use an XML schema as a staging area, except if XML is the target of the interface and you wish to use the target as a staging area. In this case, it might be required to load data to an XML schema.

Use the Generic SQL KMs or the KMs specific to the other technology involved to load data from a source or staging area into an XML schema.

Table 5-3 lists some examples of KMs that you can use to load from a source to an XML staging area.

Table 5-3 KMs to Load to an XML Schema

Source KM Notes

File

LKM File to SQL

Generic KM to load a file in a ANSI SQL-92 staging area.

All

LKM SQL to SQL

Generic KM to load data between an ANSI SQL-92 source and an ANSI SQL-92 staging area.


5.6.3 Integrating Data in XML

XML can be used as a target of an interface. The data integration strategies in XML concern loading from the staging area to XML. The IKM choice in the Interface Flow tab determines the performances and possibilities for integrating.

The IKM XML Control Append integrates data into the XML schema and has an option to synchronize the data to the file. In addition to this KM, you can also use the Generic SQL KMs or the KMs specific to the other technology involved. Note that if using generic or technology-specific KMs, you must manually perform the synchronize operation to write the changes made in the schema to the XML file.

Table 5-4 lists some examples of KMs that you can use to integrate data:

  • From a staging area to an XML target

  • From an XML staging area to an XML target. Note that in this case the staging area is on the XML target.

Table 5-4 KMs to Integrate Data in an XML File

Mode Staging Area KM Notes

Update

XML

IKM SQL Incremental Update

Generic KM

Append

XML

IKM SQL Control Append

Generic KM

Append

All RDBMS

IKM SQL to SQL Append

Generic KM


5.7 Troubleshooting

This section provides information on how to troubleshoot problems that you might encounter when using XML in Oracle Data Integrator. It contains the following topics:

5.7.1 Detect the Errors Coming from XML

Errors appear often in Oracle Data Integrator in the following way:

java.sql.SQLException: No suitable driver
at ... 
at ... 
...

the java.sql.SQLExceptioncode simply indicates that a query was made through the JDBC driver, which has returned an error. This error is frequently a database or driver error, and must be interpreted in this direction.

Only the part of text in bold must first be taken in account. It must be searched in the XML driver documentation. If it contains a specific error code, like here, the error can be immediately identified.

If such an error is identified in the execution log, it is necessary to analyze the SQL code send to the database to find the source of the error. The code is displayed in the description tab of the task in error.

5.7.2 Common Errors

This section describes the most common errors with XML along with the principal causes. It contains the following topics:

  • No suitable driver

    The JDBC URL is incorrect. Check that the URL syntax is valid.

  • File <XML file> is already locked by another instance of the XML driver.

    The XML file is locked by another user/application. Close all application that might be using the XML file. If such an application has crashed, then remove the .lck file remaining in the XML file's directory.

  • The DTD file "xxxxxxx.dtd" doesn't exist

    This exception may occur when trying to load an XML file by the command LOAD FILE. The error message can have two causes:

    • The path of the DTD file is incorrect.

    • The corresponding XML file was already opened by another schema (during connection for instance).

  • Table not found: S0002 Table not found: <table name> in statement [<SQL statement>]

    The table you are trying to access does not exist in the schema.

  • Column not found: S0022 Column not found: <column name> in statement [<SQL statement>]

    The column you are trying to access does not exist in the tables specified in the statement.