XDK for PL/SQL: Specifications and Cheat Sheets, 2 of 7

XML Parser for PL/SQL

XML documents are made up of storage units called entities, which contain either parsed or unparsed data. Parsed data is made up of characters, some of which form character data, and some of which form markup. Markup encodes a description of the document's storage layout and logical structure. XML provides a mechanism to impose constraints on the storage layout and logical structure.

A software module called an XML processor is used to read XML documents and provide access to their content and structure. It is assumed that an XML processor is doing its work on behalf of another module, called the application.

Oracle XML Parser Features

The XML Parser for PL/SQL parses an XML document (or a standalone DTD) so that it can be processed by an application. Library and command-line versions are provided supporting the following standards and features:

DOM (Document Object Model) support compliant with the W3C DOM 1.0 Recommendation. These APIs permit applications to access and manipulate an XML document as a tree structure in memory. This interface is used by applications such as editors.
SAX (Simple API for XML) support compliant with the SAX 1.0 specification. These APIs permit an application to process XML documents using an event-driven model.
Support for XML Namespaces 1.0, which avoids name collisions, increases reusability, and eases application integration.
Runs on Oracle8i and Internet Application Server (iAS).
C and C++ versions initially available for Windows, Solaris, and Linux.

Additional features include:

Validating and non-validating operation modes
Built-in error recovery until fatal error
DOM extension APIs for document creation

Oracle XSL-Transform Processors

Version 2 of the Oracle XML Parsers includes an integrated XSL-Transformation (XSL-T) Processor for transforming XML data using XSL stylesheets. Using the XSL-T processor, you can transform XML documents into XML, HTML, or virtually any other text-based format. These processors support the following standards and features:

Compliant with the W3C XSL Transform Proposed Recommendation 1.0
Compliant with the W3C XPath Proposed Recommendation 1.0
Integrated into the XML Parser for improved performance and scalability
Available with library and command-line interfaces for Java, C, C++, and PL/SQL

Namespace Support

The Java, C, and C++ parsers also support XML Namespaces. Namespaces are a mechanism to resolve or avoid name collisions between element types (tags) or attributes in XML documents. This mechanism provides "universal" namespace element types and attribute names whose scope extends beyond the containing document. Such tags are qualified by uniform resource identifiers (URIs), such as <oracle:EMP xmlns:oracle="http://www.oracle.com/xml"/>. For example, namespaces can be used to identify an Oracle <EMP> data element as distinct from another company's definition of an <EMP> data element. This enables an application to more easily identify elements and attributes it is designed to process. The Java, C, and C++ parsers support namespaces by being able to recognize and parse universal element types and attribute names, as well as unqualified "local" element types and attribute names.
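
The namespace mechanism described above can be illustrated with a minimal sketch. It uses the Python standard library's xml.dom.minidom rather than the Oracle parsers, so none of the calls below are Oracle APIs. The second namespace URI (http://example.com/hr), the ROOT wrapper element, and the function name emp_values are assumptions made for the example.

```python
from xml.dom import minidom

ORACLE_NS = "http://www.oracle.com/xml"   # URI taken from the text above
OTHER_NS = "http://example.com/hr"        # hypothetical second company's namespace

# Two EMP elements with the same local name but different namespace URIs.
XML_TEXT = (
    '<ROOT xmlns:oracle="http://www.oracle.com/xml" '
    'xmlns:other="http://example.com/hr">'
    "<oracle:EMP>SCOTT</oracle:EMP>"
    "<other:EMP>MARTIN</other:EMP>"
    "</ROOT>"
)


def emp_values(xml_text, namespace_uri):
    """Return the text of every EMP element that belongs to namespace_uri."""
    doc = minidom.parseString(xml_text)
    nodes = doc.getElementsByTagNameNS(namespace_uri, "EMP")
    return [node.firstChild.data for node in nodes]


# Each lookup sees only the EMP element qualified by the requested URI.
print(emp_values(XML_TEXT, ORACLE_NS))
print(emp_values(XML_TEXT, OTHER_NS))
```

The lookup is keyed on the namespace URI, not on the prefix, so an application would still find the Oracle EMP element if a document bound the same URI to a different prefix.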

Validating and Non-Validating Mode Support

The Java, C, and C++ parsers can parse XML in validating or non-validating modes. In non-validating mode, the parser verifies that the XML is well-formed and parses the data into a tree of objects that can be manipulated by the DOM API. In validating mode, the parser verifies that the XML is well-formed and validates the XML data against the DTD (if any). Validation involves checking whether or not the attribute names and element tags are legal, whether nested elements belong where they are, and so on.
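
The difference between the two modes can be sketched from the non-validating side. The example below assumes the Python standard library's xml.dom.minidom, which checks well-formedness only and does not validate against a DTD. It is not the Oracle parser, and the sample documents and the function name is_well_formed are made up for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Well-formed, but invalid against its own DTD (EMP must contain ENAME).
INVALID_BUT_WELL_FORMED = (
    "<!DOCTYPE EMP ["
    "<!ELEMENT EMP (ENAME)>"
    "<!ELEMENT ENAME (#PCDATA)>"
    "]>"
    "<EMP><BOGUS/></EMP>"
)

# Not well-formed: the ENAME start tag is never closed.
NOT_WELL_FORMED = "<EMP><ENAME>SCOTT</EMP>"


def is_well_formed(xml_text):
    """Return True if a non-validating parse of xml_text succeeds."""
    try:
        minidom.parseString(xml_text)
    except ExpatError:
        return False
    return True


print(is_well_formed(INVALID_BUT_WELL_FORMED))
print(is_well_formed(NOT_WELL_FORMED))
```

A non-validating parse accepts the first document because it never checks the element content against the DTD. A validating parser would be expected to report the unexpected BOGUS element.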

Example Code

See Chapter 24, "Using XML Parser for PL/SQL", for example code and suggestions on how to use the XML Parsers.

XML Parser for PL/SQL Directory Structure

The following lists the XML Parser for PL/SQL directory structure in $ORACLE_HOME/xdk/plsql/parser:

Windows NT
- license.html - copy of license agreement
- readme.html - release and installation notes
- doc\ - directory for parser APIs
- lib\ - directory for parser SQL and class files
- sample\ - sample code files
UNIX
- license.html - copy of license agreement
- readme.html - release and installation notes
- doc/ - directory for parser APIs
- lib/ - directory for parser SQL and class files
- sample/ - sample code files

DOM and SAX APIs

XML APIs generally fall into two categories: event-based and tree-based. An event-based API (such as SAX) uses callbacks to report parsing events to the application. The application deals with these events through customized event handlers. Events include the start and end of each element and the character data between tags. Unlike tree-based APIs, event-based APIs usually do not build in-memory tree representations of the XML documents. Therefore, in general, SAX is useful for applications that do not need to manipulate the XML tree, such as search operations. For example, the following XML document:

<?xml version="1.0"?>
  <EMPLIST>
    <EMP>
     <ENAME>MARTIN</ENAME>
    </EMP>
    <EMP>
     <ENAME>SCOTT</ENAME>
    </EMP>
  </EMPLIST>

Becomes a series of linear events:

start document
start element: EMPLIST
start element: EMP
start element: ENAME
characters: MARTIN
end element: ENAME
end element: EMP
start element: EMP
start element: ENAME
characters: SCOTT
end element: ENAME
end element: EMP
end element: EMPLIST
end document
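
An event-based handler of this kind can be sketched as follows. The sketch assumes the Python standard library's xml.sax module, not the Oracle SAX interface, and the names EventRecorder and record_events are hypothetical. The handler buffers character data and discards whitespace-only runs so that indentation between tags does not appear as events. It records an end-element event for every element, including ENAME.

```python
import xml.sax

XML_TEXT = """<?xml version="1.0"?>
<EMPLIST>
  <EMP>
    <ENAME>MARTIN</ENAME>
  </EMP>
  <EMP>
    <ENAME>SCOTT</ENAME>
  </EMP>
</EMPLIST>"""


class EventRecorder(xml.sax.ContentHandler):
    """Collects parsing events as human-readable strings."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []  # character data may arrive in several chunks

    def _flush_text(self):
        # Join buffered chunks and drop whitespace-only text (indentation).
        text = "".join(self._text).strip()
        self._text = []
        if text:
            self.events.append("characters: " + text)

    def startDocument(self):
        self.events.append("start document")

    def endDocument(self):
        self.events.append("end document")

    def startElement(self, name, attrs):
        self._flush_text()
        self.events.append("start element: " + name)

    def endElement(self, name):
        self._flush_text()
        self.events.append("end element: " + name)

    def characters(self, content):
        self._text.append(content)


def record_events(xml_text):
    """Parse xml_text and return the list of recorded events."""
    handler = EventRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


for event in record_events(XML_TEXT):
    print(event)
```

No tree is built at any point. The handler sees each event once, in document order, which is why this style suits searches and other single-pass processing.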

A tree-based API (such as DOM) builds an in-memory tree representation of the XML document. It provides classes and methods for an application to navigate and process the tree. In general, the DOM interface is most useful for structural manipulations of the XML tree, such as reordering elements, adding or deleting elements and attributes, renaming elements, and so on.
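
The structural manipulations mentioned above can be sketched with a tree-based API. The example assumes the Python standard library's xml.dom.minidom rather than the Oracle DOM interface. The function names, the RANK attribute, and the added employee KING are made up for illustration.

```python
from xml.dom import minidom

XML_TEXT = (
    "<EMPLIST>"
    "<EMP><ENAME>MARTIN</ENAME></EMP>"
    "<EMP><ENAME>SCOTT</ENAME></EMP>"
    "</EMPLIST>"
)


def reorder_and_extend(xml_text):
    """Build the tree, reorder two elements, add an attribute and a new element."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    first, second = root.getElementsByTagName("EMP")

    # Reorder: inserting a node that is already in the tree moves it.
    root.insertBefore(second, first)

    # Add an attribute (RANK is a hypothetical name).
    second.setAttribute("RANK", "1")

    # Add a new EMP element with an ENAME child.
    new_emp = doc.createElement("EMP")
    new_name = doc.createElement("ENAME")
    new_name.appendChild(doc.createTextNode("KING"))
    new_emp.appendChild(new_name)
    root.appendChild(new_emp)
    return doc


def employee_names(doc):
    """Return ENAME text values in document order."""
    return [node.firstChild.data for node in doc.getElementsByTagName("ENAME")]


doc = reorder_and_extend(XML_TEXT)
print(employee_names(doc))
print(doc.documentElement.toxml())
```

Because the whole document is held in memory, each change is a local operation on the tree. An event-based API would require the application to keep this state itself.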