Skip Headers

Oracle9i Warehouse Builder User's Guide
Release 9.0.2

Part Number A95931-01
Go To Table Of Contents
Contents
Go To Index
Index

Go to previous page Go to next page

F
Using the XML Toolkit

This appendix describes how to retrieve and store data from XML documents in a target schema using the XML Toolkit. This toolkit can extract data from XML documents, which can reside in a variety of sources and formats, and store that data in Oracle8i/9i tables, CLOB database columns, Advanced Queues, and other Oracle8i/9i database objects.

An XML document conforms to the Extensible Markup Language (XML) specification. XML allows developers to design their own customized markup languages which can be optimized for delivery of documents on the World Wide Web and to support implementation of e-commerce and numerous other applications.

The XML Toolkit is a set of PL/SQL procedures, functions, and packages that you can use to retrieve and load data from XML documents into Oracle8i/9i database objects. For example, you can retrieve data from an XML document that resides in a file and load it into several tables that reside in a data warehouse.

Retrieving Data From Sources

You can retrieve data from XML documents that reside in the following:

You identify the source of the data to the XML Toolkit using an element name defined by the Toolkit. For example, you identify a file with the element name file. Subsequent sections and examples describe individual element names, their respective attributes (if any), and the required syntax.

Storing Data in Targets

The XML Toolkit can store data retrieved from XML documents in a variety of Oracle8i/9i database objects. The Toolkit can store data in the following objects:

You identify the target for the data using an element name defined by the Toolkit. For example, you identify a target table with the element name target.

The element name target has several attributes that provide control over runtime operations. For example, if the data extracted from the XML document does not match the target table, you can specify an XSL style sheet as an attribute that reformats the data accordingly. Subsequent sections and examples describe the individual element names, their respective attributes, and the required syntax.

Using Runtime Controls

During the extraction and load process, you can control when to commit the load as well as control the size of individual batches. These controls are specified using the element name runtimeConfig.

How to Call the XML Toolkit

The XML Toolkit is implemented as a set of PL/SQL procedures and functions. To use the Toolkit, you must first create a Warehouse Builder transformation that invokes one of its procedures or functions. After the transformation has been created, a mapping can call the transformation using a pre-map or post-map trigger.

A typical scenario would be to create a transformation that extracts and loads data from an XML document into a staging table. The transformation could be generalized by referencing the XML document as a runtime parameter. After you create the transformation, you could then create a mapping that uses a pre-map trigger to call the transformation to load the staging table. The mapping could then transform the data in the staging table and load it into a target table. A post-map trigger could in turn truncate the staging table before the mapping terminates.

Two Entrances

A transformation can invoke the XML Toolkit using one of the following entry points into the API:

Both of these calls extract and load data from XML documents into database targets. The function, however, returns the number of documents read during the operation. The control file, itself an XML document, specifies the source of the XML documents, the targets, and any runtime controls.

After the transformation has been defined, a mapping typically calls the transformation as a pre-map or post-map trigger.

The following example illustrates a script that can be used to implement an Warehouse Builder transformation which extracts data from an XML document stored in the file products.xml and loads it into the target table books.

   begin

    wb_xml_load(

'<OWBXMLRuntime>'

||

    '<XMLSource>'

    '     <file>\ora817\GCCAPPS\products.xml</file>'

    '</XMLSource>'                          

    '<targets> '

    '     <target XSLFile="\ora817\XMLstyle\GCC.xsl">books</target>'

    '</targets>'

||

||

||

||

||

||

'</OWBXMLRuntime>');

   end;

The control file itself is an XML document, and the element name OWBXMLRuntime defines the top-level, or root element, of this document. The remaining element names which define the document sources and targets are self-explanatory.

After you create and name this transformation, you can reference it by name in a mapping. The most common reference for the transformation would be in a pre-map or post-map trigger.


Notes:

  • The file names for the document source and the XSL style sheet are relative to the node that supports the Oracle8i/9i instance.

  • The white space in the example control file that separates the text from the concatenate characters is unnecessary and is included only to improve readability.

  • The control file is a well-formed XML document. See the control file's Document Type Definition (DTD).


Typical Control Files

The nine example control files in this section are well-formed and valid control files that extract and load data from the following sources into a database target:

  1. A single file

  2. A single file whose element names must be renamed

  3. A single file to multiple targets

  4. Multiple files

  5. A large file split into several parts

  6. CLOB column

  7. URL

  8. Raw Advanced Queue

  9. Object Type Based Advanced Queue

The first five examples extract data from XML documents that reside in files; the remaining cases extract data from database objects or a document addressed by a URL. The examples cover all the sources that are managed by the XML Toolkit.

XML Documents Stored in Files

The XML Toolkit can extract data from documents stored in a single file or in multiple files when they reside in the same directory. The Toolkit can also extract data from these sources and store it into multiple tables.

Often, the data in an XML document fails to match the target object in which case you can reformat the data to match the target by including an XSL style sheet. The second example shows you how to reference an XSL style sheet by specifying a value for the XSLFile attribute for the target. The use of style sheets to reformat the source data is probably the most common case as the data from XML documents rarely match the column names of target tables in a data warehouse.

Finally, you can improve load efficiency for large XML documents by splitting the documents into parts and performing a separate load operation for each part. This is accomplished by specifying a value for the splitElement attribute for the source file.

When the element names of an XML document exactly match the column names in a target table and the document resides in a file, you can easily create a control file to extract and load data from the document into the target table.

The XML Document

The XML document below (ORDERS) resides in a file named \ora817\examples\ex1.xml. This file name is relative to the Oracle8i/9i database instance.

The Target Table

The column names in the target table (PURCHASE_ORDERS) described by the DDL below match the element names of the XML document described above.


Go to previous page Go to next page
Oracle
Copyright © 1996, 2001 Oracle Corporation.

All Rights Reserved.
Go To Table Of Contents
Contents
Go To Index
Index