8 XPath Rewrite for Structured Storage

This chapter explains the fundamentals of XPath rewrite for structured (object-relational) storage in Oracle XML DB. It details the rewriting of XPath-expression arguments to various SQL functions.

This chapter contains these topics:

Overview of XPath Rewrite for Structured Storage

Oracle XML DB can often optimize queries that use XPath expressions — for example, queries involving SQL functions such as XMLQuery, XMLTable, XMLExists, and updateXML, which take XPath (XQuery) expressions as arguments. The XPath expression is, in effect, evaluated against the XML document without ever constructing the XML document in memory.

This optimization is called XPath rewrite. It is a proper subset of XML query optimization, which also involves optimization of XQuery expressions, such as FLWOR expressions, that are not XPath expressions. XPath rewrite also enables indexes, if present on the column, to be used in query evaluation by the Optimizer.

The XPath expressions that can be rewritten by Oracle XML DB are a proper subset of those that are supported by Oracle XML DB. Whenever you can do so without losing functionality, use XPath expressions that can be rewritten.

XPath rewrite can occur in these contexts (or combinations thereof):

This chapter covers the first case: rewriting queries that use structured XML data or XMLType views. The XMLType views can be XML schema-based or not. Structured storage of XMLType data is always XML schema-based. Examples in this chapter are related to XML schema-based tables.

Example 8-1 illustrates XPath rewrite for a simple query that uses an XPath expression.

Example 8-1 XPath Rewrite

SELECT po.OBJECT_VALUE FROM purchaseorder po
  WHERE XMLCast(XMLQuery('$p/PurchaseOrder/Requestor'
                         PASSING po.OBJECT_VALUE AS "p" RETURNING CONTENT)
                AS VARCHAR2(128))
        = 'Sarah J. Bell';

The XMLCast(XMLQuery...)) expression here is rewritten to the underlying relational column that stores the requestor information for the purchase order. The query is rewritten to something like the following:Foot 1 


Sample of XPath Expressions that Are Rewritten

Table 8-1 describes some XPath expressions that are rewritten during XPath rewrite.

Table 8-1 Sample of XPath Expressions that Are Rewritten to Underlying SQL Constructs

XPath Expression for Translation Description

Simple XPath expressions (expressions with child and attribute axes only):



Involves traversals over object type attributes only, where the attributes are simple scalar or object types themselves.

Collection traversal expressions:


Involves traversal of collection expressions. The only axes supported are child and attribute axes. Collection traversal is not supported if the SQL function is used during a CREATE INDEX operation.


[Requestor = "Sarah J. Bell"]

Predicates in the XPath are rewritten into SQL predicates.

List index (positional predicate):


Indexes are rewritten to access the nth item in a collection.

Wildcard traversals:


If the wildcard can be translated to one or more simple XPath expressions, then it is rewritten.

Descendant axis (XML schema-based data only), without recursion:


Similar to a wildcard expression. The descendant axis is rewritten if it can be mapped to one or more simple XPath expressions.

Descendant axis (XML schema-based data only), with recursion:


The descendant axis is rewritten if both of these conditions holds:

  • All simple XPath expressions to which this XPath expression expands map to the same out-of-line table.

  • Any simple XPath expression to which this XPath expression does not expand does not map to that out-of-line table.

XPath functions

Some XPath functions are rewritten. These functions include not, floor, ceiling, substring, and string-length.

See Also:

"Performance Tuning for XQuery" for information about rewrite of XQuery expressions

Analyzing and Optimizing XPath Queries using Execution Plans

This section presents some guidelines for using execution plans to do the following, for queries that use XPath expressions:

  • Analyze query execution, to determine whether XPath rewrite occurs.

  • Optimize query execution, by using secondary indexes.

Use these guidelines together, taking all that apply into consideration.

As is true also for the rest of this chapter, this section is applicable only to XMLType data that is stored object-relationally (structured storage).

XPath rewrite for object-relational storage means that a query that selects XML fragments defined by an XPath expression is rewritten to a SQL SELECT statement on the underlying object-relational tables and columns. These underlying tables can include out-of-line tables.

Guideline: Look for underlying tables versus XML functions in execution plans

The execution plan of a query that has been rewritten refers to the object-relational tables and columns that underlie the queried XMLType data.

The names of the underlying tables can be meaningful to you, if they are derived from XML element or attribute names or if the governing XML schema explicitly names them by using annotation xdb:defaultTable. Otherwise, these names are system-generated and have no obvious meaning. In particular, they do not reflect the corresponding XML element or attribute names. Also, some system-generated columns are hidden. You do not see them if you use the SQL describe command. They nevertheless show up in execution plans.

The plan of a query that has not been rewritten shows only the base table names, and it typically refers to user-level XML functions, such as XMLExists. Look for this difference to determine whether a query has been optimized. The XML function name shown in an execution plan is actually the internal name (for example, XMLEXISTS2), which is sometimes slightly different from the user-level name.

Example 8-2 shows the kind of execution plan output that is generated when Oracle XML DB cannot perform XPath rewrite. The plan here is for a query that uses SQL/XML function XMLExists. The corresponding internal function XMLExists2 appears in the plan output, indicating that the query is not rewritten.

Example 8-2 Execution Plan Generated When XPath Rewrite Does Not Occur

Predicate Information (identified by operation id):
   1 - filter(XMLEXISTS2('$p/PurchaseOrder[User="SBELL"]' PASSING BY VALUE
              MLDATA") AS "p")=1)

In this situation, Oracle XML DB constructs a pre-filtered result set based on any other conditions specified in the query WHERE clause. It then filters the rows in this potential result set to determine which rows belong in the result set. The filtering is performed by constructing a DOM on each document and performing a functional evaluation (using the methods defined by the DOM API) to determine whether or not each document is a member of the result set.

Guideline: Name the default tables, so you recognize them in execution plans

When designing an XML schema, use annotation xdb:defaultTable to name the underlying tables that correspond to elements that you select in queries where performance is important. This lets you easily recognize them in an execution plan, indicating by their presence or absence whether the query has been rewritten.

Guideline: Create an index on a column targeted by a predicate

A query resulting from XPath rewrite sometimes includes a SQL predicate (WHERE clause). This can happen even if the original query does not use an XPath predicate, and it can happen even if the original query does not have a SQL WHERE clause.

When this happens, you can sometimes improve performance by creating an index on the column that is targeted by the SQL predicate, or by creating an index on a function application to that column. Example 8-1 illustrates XPath rewrite for a query that includes a WHERE clause. Example 8-3 shows the predicate information from an execution plan for this query.

Example 8-3 Analyzing an Execution Plan to Determine a Column to Index

Predicate Information (identified by operation id):
   1 - filter(CAST("PURCHASEORDER"."SYS_NC00021$" AS VARCHAR2(128))='Sarah
              J. Bell' AND SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
              http://xmlns.oracle.com/xdb/acl.xsd DAV:http://xmlns.oracle.com/xdb/dav.xsd

The predicate information indicates that the expression XMLCast(XMLQuery...)) is rewritten to an application of SQL function cast to the underlying relational column that stores the requestor information for the purchase order, SYS_NC0021$. This column name is system-generated. The execution plan refers to this system-generated name, in spite of the fact that the governing XML schema uses annotation SQLName to name this column REQUESTOR.

Because these two names (user-defined and system-generated) refer to the same column, you can create a B-tree index on this column using either name. Alternatively, you can use the extractValue shortcut to create the index, by specifying an XPath expression that targets the purchase-order requestor data. Example 8-4 shows these three equivalent ways to create the B-tree index on the predicate-targeted column.

Example 8-4 Creating an Index on a Column Targeted by a Predicate

CREATE INDEX requestor_index ON purchaseorder ("SYS_NC00021$");

CREATE INDEX requestor_index ON purchaseorder ("XMLDATA"."REQUESTOR");

CREATE INDEX requestor_index ON purchaseorder
  (extractvalue(OBJECT_VALUE, '/PurchaseOrder/Requestor'));

However, for this particular query it makes sense to create a function-based index, using a functional expression that matches the one in the rewritten query. Example 8-5 illustrates this.

Example 8-5 Creating a Function-Based Index for a Column Targeted by a Predicate

CREATE INDEX requestor_index ON purchaseorder

Example 8-6 shows an execution plan that indicates that the index is picked up.

Example 8-6 Execution Plan Showing that Index Is Picked Up

| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |                 |     1 |   524 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| PURCHASEORDER   |     1 |   524 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | REQUESTOR_INDEX |     1 |       |     1   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   1 - filter(SYS_CHECKACL("ACLOID","OWNERID",xmltype('<privilege
   2 - access(CAST("SYS_NC00021$" AS VARCHAR2(128))='Sarah J. Bell')

In the particular case of this query, the original functional expression applies XMLCast to XMLQuery to target a singleton element, Requestor. This is a special case, where you can as a shortcut use such a functional expression directly in the CREATE INDEX statement. That statement is rewritten to create an index on the underlying scalar data. Example 8-7, which targets an XPath expression, thus has the same effect as Example 8-5, which targets the corresponding object-relational column.

Example 8-7 Creating a Function-Based Index for a Column Targeted by a Predicate

CREATE INDEX requestor_index 
  ON purchaseorder po
     (XMLCast(XMLQuery('$p/PurchaseOrder/Requestor' PASSING po.OBJECT_VALUE AS "p"
                                                    RETURNING CONTENT)
              AS VARCHAR2(128)));

See Also:

"Indexing Non-Repeating text() Nodes or Attribute Values" for information about using the shortcut of XMLCast applied to XMLQuery and the extractValue shortcut to index singleton data

Guideline: Create indexes on ordered collection tables

If a collection is stored as an ordered collection table or an XMLType instance, then you can directly access members of the collection. Each member of the collection becomes a row in a table, so you can access it directly with SQL.

You can often improve performance by indexing such collection members. You do this by creating a composite index on (a) the object attribute that corresponds to the collection XML element or its attribute and (b) pseudocolumn NESTED_TABLE_ID.

Example 8-8 shows the execution plan for a query to find the Reference elements in documents that contain an order for part number 717951002372 (Part element with an Id attribute of value 717951002372). The collection of LineItem elements is stored as rows in the ordered collection table lineitem_table.


Example 8-8 does not use the purchaseorder table from sample database schema OE. It uses the purchaseorder table defined in Example 3-13. This table uses an ordered collection table (OCT) named lineitem_table for the collection element LineItem.

Example 8-8 Execution Plan for a Selection of Collection Elements

SELECT XMLCast(XMLQuery('$p/PurchaseOrder/Reference'
               AS VARCHAR2(4000)) "Reference"
  FROM purchaseorder
  WHERE XMLExists('$p/PurchaseOrder/LineItems/LineItem/Part[@Id="717951002372"]'
                  PASSING OBJECT_VALUE AS "p");

| Id  | Operation                    | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT             |                        |     1 |   122 |    16  (13)| 00:00:01 |
|   1 |  NESTED LOOPS                |                        |       |       |            |          |
|   2 |   NESTED LOOPS               |                        |     1 |   122 |    16  (13)| 00:00:01 |
|   3 |    SORT UNIQUE               |                        |     1 |    50 |    14   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL        | LINEITEM_TABLE         |     1 |    50 |    14   (8)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN         | LINEITEM_TABLE_MEMBERS |     1 |       |     0   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID| PURCHASEORDER          |     1 |    72 |     1   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   4 - filter("SYS_NC00009$" IS NOT NULL AND "SYS_NC00011$"='717951002372')
   5 - access("NESTED_TABLE_ID"="PURCHASEORDER"."SYS_NC0003400035$")

The execution plan shows a full scan of ordered collection table lineitem_table. This could be acceptable if there were only a few hundred documents in the purchaseorder table, but it would be unacceptable if there were thousands or millions of documents in the table.

To improve the performance of such a query, you can create an index that provides direct access to pseudocolumn NESTED_TABLE_ID, given the value of attribute Id. Unfortunately, Oracle XML DB does not allow indexes on collections to be created using XPath expressions directly. To create the index, you must understand the structure of the SQL object that is used to manage the LineItem elements. Given this information, you can create the required index using conventional object-relational SQL.

In this case, element LineItem is stored as an instance of object type lineitem_t. Element Part is stored as an instance of SQL data type part_t. XML attribute Id is mapped to object attribute part_number. Given this information, you can create a composite index on attribute part_number and pseudocolumn NESTED_TABLE_ID, as shown in Example 8-9. This index provides direct access to those purchase-order documents that have LineItem elements that reference the required part.

Example 8-9 Creating an Index for Direct Access to an Ordered Collection Table

CREATE INDEX lineitem_part_index ON lineitem_table l (l.part.part_number, l.NESTED_TABLE_ID);

Guideline: Use XMLOptimizationCheck to determine why a query is not rewritten

If a query has not been optimized, you can use system variable XMLOptimizationCheck to try to determine why.

Footnote Legend

Footnote 1: This example uses sample database schema OE and its table purchaseorder. The XML schema for this table is annotated with attribute SQLName to specify SQL object attribute names such as REQUESTOR — see Example 3-10. Without such annotations, this example would use p."XMLDATA"."Requestor", not p."XMLDATA".".REQUESTOR".