|Oracle9i Application Developer's Guide - XML
Release 1 (9.0.1)
Part Number A88894-01
This chapter contains the following sections:
XML data can be stored in Oracle9i in the following ways:
XML can be generated from object-relational tables and views. The benefits of using object-relational tables and views as opposed to pure relational structures are discussed below.
Generated XML is used when the XML is an interchange format and existing business data is wrapped in XML structures (tags). This is the most common way of using XML in the database. Here, XML is used only for the interchange process itself and is transient.
Examples of this kind of document include sales orders and invoices, airline flight schedules, and so on.
Oracle, with its object-relational extensions has the ability to capture the structure of the data in the database using object types, object references, and collections. There are two options for storing and preserving the structure of the XML data in an object-relational form:
Once stored generated, in the object-relational form, the data can be easily updated, queried, rearranged, and reformatted as needed using SQL.
Complex XML documents can be stored as object-relational instances and indexed efficiently. Such instances fully capture and express the nesting and list semantics of XML. With Oracle's extensibility infrastructure, new types of indices, such as path indices, can be created for faster searching through XML documents.
XML SQL Utility (XSU) provides the means to store an XML document by mapping it to the underlying object-relational storage, and conversely, provides the ability retrieve the object-relational data as an XML document.
XSU converts the result of an SQL query into XML by mapping the query alias or column names into the element tag names and preserving the nesting of object types. The result can be in text or a DOM (Document Object Model) tree. The generation of the latter avoids the overhead of parsing the text and directly realizes the DOM tree.
Oracle8i and higher support the storage of large objects or LOBs as character LOBs (CLOB), binary LOBs (BLOB), or externally stored binary files (BFILE). LOBs are used to store composed (Authored/Native) XML documents.
If the incoming XML documents do not conform to one particular structure, then it might be better to store such documents in CLOBs. For instance, in an XML messaging environment, each XML message in a queue might be of a different structure.
CLOBs store large character data and are useful for storing composed XML documents.
BFILEs are external file references and can also be used, although they are more useful for multimedia data that is not accessed often. In this case the XML is stored and managed outside Oracle, but can be used in queries on the server. The metadata for the document can be stored in object-relational tables in the server for fast indexing and access.
Storing an intact XML document in a CLOB or BLOB is a good strategy if the XML document contains static content that will only be updated by replacing the entire document.
Oracle allows the creation of Oracle Text (interMedia Text) indexes on LOB columns, in addition to URLs that point to external documents. This indexing mechanism works for XML data as well.
Oracle8i and Oracle9i recognize XML tags, and section and sub-section text searching within XML elements' content. The result is that queries can be posed on unstructured data and restricted to certain sections or elements within a document.
This Oracle Text (interMedia Text) example presume you have already created the appropriate index.
Chapter 8, "Searching XML Data with Oracle Text" for more information on Oracle Text.
CLOB storage is ideal if the structure of the XML document is unknown or dynamic.
Much of the SQL functionality on object-relational columns cannot be exploited. Concurrency of certain operations such as updates may be reduced. However, the exact copy of the document is retained.
The previous section described the following:
However, in many cases, you need better control of the mapping granularity.
For example, when mapping a text document, such as a book, in XML, you may not want every single element to be expanded and stored as object-relational. Storing the font and paragraph information for such documents in an object-relational format may not be useful with respect to querying.
On the other hand, storing the whole text document in a CLOB reduces the effective SQL queriability on the entire document.
The alternative is to have user-defined granularity for such storage. In the book example, you may want the following:
You can specify the granularity of mapping at table definition time. The server can automatically construct the XML from the various sources and generate queries appropriately.
Figure 2-1 illustrates this hybrid approach to XML storage.
The advantages of the hybrid storage approach for storing XML documents are the following:
XML generated from the database is in a canonical format that maps columns to elements and object types to nested elements. However, applications might require different representations of the XML document in different circumstances.
If an XML document is structured, but the structure of the XML document is not compatible with the structure of the underlying database schema, you must transform the data into the correct format before writing it to the database. You can achieve this in one of the following ways:
Finally, if you have a combination of structured and unstructured XML data, but still want to view and operate on it as a whole, you can use Oracle views.
Views enable you to construct an object on the fly by combining XML data stored in a variety of ways. You can do the following:
When you need to retrieve the data as a whole, simply construct the structure from the various pieces of data with the use of type constructors in the view's select statement. XML SQL Utility then enables retrieving the constructed data from the view as a single XML document.
You may need to create indexes and query on transformed views of an XML document. For example, in an XML messaging environment, there could be purchase order messages in different formats. You may want to query them canonically, so that a particular query can work across all purchase order messages.
In this case, the query is posed against the transformed view of the documents. You can create functional indexes or use regular views to achieve this.
Native implementation for the
existsNode() member functions is to parse the XML document, perform path traversal, and extract the fragment. However, this is not a performance-enhancing or scalable solution.
A second approach is to use Oracle Text (interMedia Text) indexing.
You can also build your own indexing mechanism on an
XMLType column using the extensibility indexing infrastructure.
W3C has chartered a schema working group to provide a new, XML based notation for structural schema and datatypes as an evolution of the current Document Type Definition (DTD) based mechanism. XML schemas can be used for the following:
Datatypes themselves can either be primitive (such as bytes, dates, integers, sequences, intervals) or user-defined (including ones that are derived from existing datatypes and which may constrain certain properties -- range, precision, length, mask -- of the basetype.) Application-specific constraints and descriptions are allowed.
XML Schema provides inheritance for element, attribute, and datatype definitions. Mechanisms are provided for URI references to facilitate a standard, unambiguous semantic understanding of constructs. The schema language also provides for embedded documentation or comments.
For example, you can define a simple data type as shown in the following example.
This is an example of defining a simple data type in XMLSchema:
It is clear even from the simple example above that XMLSchema provides a number of important new constructs over DTDs, such as a basetype, and a minimum value constraint.
When dynamic data is generated from a database, it is typically expressed in terms of a database type system. In Oracle, this is the object-relational type system described above, which provides for much richness in data types, such as NULL-ness, variable precision, NUMBER(7,2), check constraints, user-defined types, inheritance, references between types, collections of types and so on. XML Schema can capture a wide spectrum of schema constraints that go towards better matching generated documents to the underlying type-system of the data.
Consider the simple
Purchase Order type expressed in XML Schema:
<type name="Address" > <element name="street" type="string" /> <element name="city" type="string" /> <element name="state" type="string" /> <element name="zip" type="string" /> </type> <type name="Customer"> <element name="custNo" type="positiveInteger"/> <element name="custName" type="string" /> <element name="custAddr" type="Address" /> </type> <type name="Items"> <element name="lineItem" minOccurs="0" maxOccurs="*"> <type> <element name="lineItemNo" type="positiveInteger" /> <element name="lineItemName" type="string" /> <element name="lineItemPrice" type="number" /> <element name="LineItemQuan"> <datatype basetype="integer"> <minExclusive>0</minExclusive> </datatype> </element> </type> </element> </type> <type name="PurchaseOrderType"> <element name="purchaseNo" type="positiveInteger" /> <element name="purchaseDate" type="date" /> <element name="customer" type="Customer" /> <element name="lineItemList" type="Items" /> </type>
These XML Schemas have been deliberately constructed to match closely the Object-Relational purchase order example described above in""XMLSchema Example 2: Using XMLSchema to Map Generated XML Documents to Underlying Schema". The point is to underscore the closeness of match between the proposed constructs of XML Schema with SQL:1999-based type systems. Given such a close match, it is relatively easy to map an XML Schema to a database Object-Relational schema, and map documents that arevalid according to the above schema to row objects in the database schema. In fact, the greater expressiveness of XML Schema over DTDs greatly facilitates the mapping.
The applicability of the schema constraints provided by XML Schema is not limited to data-driven applications. There are more and more document-driven applications that exhibit dynamic behavior.
This section describes the following XML design issues for applications that exchange data.
To generate a Web form's infrastructure, you can do the following:
One way to ensure that data obtained via a Web form will map to an underlying database schema is to design the Web form and its underlying structure so that it generates XML data based on a schema-compatible DTD. This section describes how to use the XML SQL Utility and the XML Parser for Java to achieve this. This scenario has the following flow:
You can use the DTD-generation capability of the XML SQL Utility to determine what XML format is expected by a target object view or table. To do this, you can perform a
SELECT * FROM an object view or table to generate an XML result.
This result contains the DTD information as a separate file or embedded within the
DOCTYPE tag at the top of the XML file.
Use this DTD as input to the XML Class Generator to generate a set of classes based on the DTD elements. You can then write Java code that use these classes to generate the infrastructure behind a Web-based form. The result is that data submitted via the Web form will be converted to an XML document that can be written to the database.
There are numerous ways to transmit XML documents among applications. This section presents some of the more common approaches.
Here you can assume the following:
File Transfer. The receiving application requests the XML document from the sending application via FTP, NFS, SMB, or other file transfer protocol. The document is copied to the receiving application's file system. The application reads the file and processes it.
HTTP. The receiving application makes an HTTP request to a servlet. The servlet returns the XML document to the receiving application, which reads and processes it.
Advanced Queuing. An Oracle database sends an XML document via Net Services, HTTP or SMTP, and JDBC to the one or more receiving applications as a message through Oracle Advanced Queueing (AQ). The receiving applications dequeue the XML message and process it.
You can use the following options to load XML data or DTD files into Oracle9i:
You can also use Oracle9i Internet File System (9iFS) to put an XML document into the database. However, it does not support DTDs. It does however support XML Schema, the standard that will replace DTDs.
You can use SQL*Loader to bulk load LOBs.
Because LOBs can be quite large, SQL*Loader can load LOB data from either the main datafile (inline with the rest of the data) or from LOBFILEs. Figure 2-2 shows the LOBFILE syntax.
LOB data can be lengthy enough that it makes sense to load it from a LOBFILE. In LOBFILEs, LOB data instances are still considered to be in fields (predetermined size, delimited, length-value), but these fields are not organized into records (the concept of a record does not exist within LOBFILEs). Therefore, the processing overhead of dealing with records is avoided. This type of organization of data is ideal for LOB loading.
There is no requirement that a LOB from a LOBFILE fit in memory. SQL*Loader reads LOBFILEs in 64K chunks. To load physical records larger than 64K, you can use the
READSIZE parameter to specify a larger size.
It is best to load XMLType columns or columns containing XML data in CLOBs, using LOBFILEs.
A conventional path load executes SQL
INSERT statements to populate tables in an Oracle database. A direct path load eliminates much of the Oracle database overhead by formatting Oracle data blocks and writing the data blocks directly to the database files.
A direct-path load does not compete with other users for database resources, so it can usually load data at near disk speed. Considerations inherent to direct path loads, such as restrictions, security, and backup implications, are discussed in Chapter 9 of Oracle9i Database Utilities.
Figure 2-3 illustrates SQL*Loader's direct-path load and conventional path loads.
Tables to be loaded must already exist in the database. SQL*Loader never creates tables. It loads existing tables that either already contain data or are empty.
The following privileges are required for a load:
INSERTprivileges on the table to be loaded.
DELETEprivilege on the table to be loaded, when using the
TRUNCATEoption to empty out the table's old data before loading the new data in its place.
There are many potential uses for XML in Internet applications. Two database-centric application areas where Oracle's XML components are well-suited are:
or any combinations of these. This manual focuses on these two application areas, in Part III, "Data Exchange Using XML" and Part IV, "Tools and Frameworks for Building Oracle-Based XML Applications", respectively.
Typical scenarios in each of these two application areas are described in this chapter.
XML is increasingly used to enable customized presentation of data for different browsers, devices, and users. By using XML documents along with XSL stylesheets on either the client, middle-tier, or server, you can transform, organize, and present XML data tailored to individual users for a variety of client devices, including the following:
In doing so, you can focus your business applications on business operations, knowing you can accommodate differing output devices easily.
Using XML and XSL also makes it easier to create and manage dynamic Web sites. You can change the look and feel simply by changing the XSL stylesheet, without having to modify the underlying business logic or database code. As you target new users and devices, you can simply design new XSL stylesheets as needed. This is illustrated in Figure 2-4
Consider the following content management scenarios that use Oracle's XML components:
Each scenario includes a brief description of the business problem, solution, main tasks, and Oracle XML components used.
These scenarios are further illustrated in Oracle9i Case Studies - XML Applications under the section, "Managing Content and Documents with XML".
Company X has numerous document repositories of SGML and XML marked up text fragments. Composite documents must be published dynamically.
The bottom line is that the database application design must begin with a good database design. In other words, Company X must first use good data modeling and design guidelines. Then object views can more readily be created against the data.
Use XMLType to store the documents in XML format, where the relational data is updatable. Use Oracle9i's Internet File System (9iFS) as the data repository interface. 9iFS helps implement XML data repository management and administration tasks.
Company X can use XSL stylesheets to assemble the document sections or fragments and deliver the composite documents electronically to users. One suggested solution is to use Arbortext and EPIC for single sourcing and authoring or multichannel publishing. Multichannel publishing facilitates producing the same document in many different formats, such as HTML, PDF, WORD, ASCII text, SGML, and Framemaker.
http://www.arbortext.com for more information about the Arbortext and EPIC. products.
See Figure 2-5
These are the main tasks involved in Scenario 1's solution:
A large news distributor receives data from various news sources. This data must be stored in a database and sent to all the distributors and users on demand so that they can view specific and customized news at any time, according to their contract with the news distributor. The distributor uses XSL to normalize and store the data in a database. The stored data is used to back several Websites and portals. These Websites and portals receive HTTP requests from various wired and unwired clients.
Use XSL stylesheets with the XSQL Servlet to dynamically deliver appropriate rendering to the requesting service. See Figure 2-6
These are the main tasks involved in Scenario 2:
Company X needs data interactively delivered to a thin client.
Queries are sent from the client to databases whose output is rendered dynamically through one or more XSL stylesheets, for sending to the client application. The data is stored in a relational database in LOBs and materialized in XML.
See Oracle9i Case Studies - XML Applications, the chapter, "Customizing Discoverer4i(9i) Viewer with XSL"
A challenge for business application developers is to tie together data generated by applications from different vendors and different application domains. Oracle XML-enabled technology makes this kind of data exchange among applications easier to do by focusing on the data and its context without tying it to specific network or communication protocols.
Using XML and XSL transformations, applications can exchange data without having to manage and interpret proprietary or incompatible data formats.
Consider the following business-to-business and business-to-consumer (B2B/B2C) messaging scenarios that use Oracle XML components:
Each scenario briefly describes the problem, solution, main tasks used to resolve the problem and Oracle XML components used.
These scenarios are illustrated with case studies in Part III, "Data Exchange Using XML"
Company X needs to build an online shopping cart, for products coming from various vendors. Company X wants to receive orders online and then based upon which product is ordered, transfer the order to the correct vendor.
Use XML to deliver an integrated online purchasing application. While a user is completing a new purchase requisition for new hardware, they can go directly to the computer manufacturer's Web site to browse the latest models, configuration options, and prices. The user's site sends a purchase requisition reference number and authentication information to the vendor's Web site.
At the vendor site, the user adds items to their shopping cart, then clicks on a button to indicate that they are done shopping. The vendor sends back the contents of the shopping cart to the Company X's application as an XML file containing the part numbers, quantities, and prices that the user has chosen.
Items from the shopping cart are automatically added to the new purchase requisition as line items.
Customer orders (in XML) are delivered to the appropriate vendor databases for processing. XSL is used to transform and divide the shopping cart for compliant transfers. Data is stored in a relational database and materialized using XML. See Figure 2-7.
A client/server and server/server application stores a data resource and inventory in a database repository. This repository is shared across enterprises. Company X needs to know every time the data resource is accessed, and all the users and customers on the system need to know when and where data is accessed.
When a resource is accessed or released this triggers an availability XML message. This in turn transforms the resource, using XSL, into multiple client formats according to need. Conversely, a resource acquisition by one client sends an XML message to other clients, signalling its removal. Messages are stored in LOBs. Data is stored in a relational database and materialized in XML. See Figure 2-8.
Company X needs several applications to communicate and share data to integrate the business work flow and processes.
XML is used as the message payload. It is transformed via the XSLT Processor, enveloped and routed accordingly. The XML messages are stored in an AQ Broker Database in LOBs. Oracle Workflow is used to facilitate management of message and data routing and transformation. This solution also utilizes content management, here presentation customization using XSL stylesheets. See Figure 2-9.