13 Expressions with XPath Predicates

The expressions stored in a column of a table may contain XPath predicates defined on XMLType attributes. This section describes an application for XPath predicates using the Car4Sale example introduced in Chapter 11. For this purpose, the information published for each car going on sale includes a Details attribute in addition to the Model, Price, Mileage, and Year attributes. The Details attribute contains additional information about the car in XML format as shown in the following example:

<details>
  <color>White</color> 
  <accessory>
     <stereo make="Koss">CD</stereo>
     <GPS>
       <resolution>1FT</resolution>
       <memory>64MB</memory>
     </GPS>
  </accessory>
</details>

A sample predicate on the Details attribute is extract(Details, '//stereo[@make="Koss"]') IS NOT NULL. This predicate can be combined with one or more predicates on other XML or non-XML attributes.

13.1 Using XPath Predicates in Expressions

Using the XMLType data type supplied by Oracle, users can apply XPath predicates on XML documents within a standard SQL WHERE clause of a query. These predicates use SQL operators such as EXTRACT and EXISTSNODE on an instance of the XMLType data type to process an XPath expression for the XML instance. For more information, see Oracle Database SQL Language Reference and Oracle XML DB Developer's Guide.

To allow XPath predicates in an expression set, the corresponding attribute set should be created with an attribute of sys.XMLType data type, as shown in the following example:

CREATE OR REPLACE TYPE Car4Sale AS OBJECT 
                                   (Model   VARCHAR2(20), 
                                    Year    NUMBER, 
                                    Price   NUMBER, 
                                    Mileage NUMBER,
                                    Details sys.XMLType);/
 
BEGIN 
  DBMS_EXPFIL.CREATE_ATTRIBUTE_SET(attr_set  => 'Car4Sale',
                                   from_type => 'YES');
END;
/

The expression sets using this attribute set can include predicates on the XMLType attribute, as shown in the following example:

Model='Taurus' and Price < 15000 and Mileage < 25000 AND 
             extract(Details, '//stereo[@make="Koss"]') IS NOT NULL
 
                      -- or --
 
Model='Taurus' and Price < 15000 and Mileage < 25000 AND 
             existsNode(Details, '//stereo[@make="Koss"]') = 1 

Now, a set of expressions stored in the Interest column of the Consumer table can be processed for a data item by passing an instance of XMLType for the Details attribute along with other attribute values to the EVALUATE operator:

SELECT * FROM Consumer WHERE 
    EVALUATE (Consumer.Interest, 
              'Model=>''Mustang'', 
               Year=>2000,
               Price=>18000,
               Mileage=>22000,
               Details=>sys.XMLType(''<details>
                                       <color>White</color> 
                                       <accessory>
                                        <stereo make="Koss">CD</stereo>
                                        <GPS>
                                         <resolution>1FT</resolution>
                                         <memory>64MB</memory>
                                        </GPS>
                                       </accessory>
                                      </details>'')'
         ) = 1;

The previous query identifies all the rows with expressions that are true based on their XPath and non-XPath predicates.

13.2 Indexing XPath Predicates

To process a large set of XPath predicates in an expression set efficiently, the Expression Filter index defined for the expression set can be configured for the XPath predicates (in addition to some simple predicates). The Expression Filter indexes use the commonalities in the XPath expressions to efficiently compare them to a data item. These commonalities are based on the positions and the values for the XML elements and attributes appearing in the XPath expressions.

The indexable constructs in an XPath expression are the levels (or positions) of XML elements, the values for text nodes in XML elements, the positions of XML attributes, and the values for XML attributes. For this purpose, an XPath predicate is treated as a combination of positional and value filters on XML elements and attributes appearing in an XML document. For example, the following XPath expression can be deciphered as a set of checks on the XML document. The list following the example explains those checks.

extract(Details, '//stereo[@make="Koss" and /*/*/GPS/memory[text()="64MB"]]')
                                                              IS NOT NULL

  1. Level (position) of stereo element is 1 or higher.

  2. The stereo element appearing at level 1 or higher has a make attribute.

  3. The value for stereo element's make attribute is Koss.

  4. The GPS element appears at level 3.

  5. The memory element appears at level 4.

  6. The memory element has a text node with a value of 64MB.

13.2.1 Indexable XPath Predicates

The Expression Filter index does not support some constructs in an XPath predicate. Therefore, the XPath predicate is always included in the sparse predicates and evaluated during the last phase of expression filtering. For more information about sparse predicates, see Section 12.4.

A positional filter for an Expression Filter index can be configured from any XML element or attribute. A value filter can only be configured from equality predicates on XML attributes and text nodes in XML elements. XPath predicates that are indexed in an expression set must use either the EXTRACT or the EXISTSNODE operator with a positive test on the return value. For example extract(Details, '//stereo[@make="Koss"]') IS NOT NULL can be indexed, but a similar predicate with an IS NULL check on the return value cannot be indexed.

Some of the XPath constructs that cannot be indexed by the Expression Filter include:

  • Inequality or range predicates in the node test. For example, the predicate on the stereo element's make attribute cannot be indexed in the following XPath predicate:

    extract(Details, '//stereo[@make!="Koss"]') IS NOT NULL
    
    
  • Disjunctions in the node test. For example, the predicates on the stereo element's make attribute cannot be indexed in the following XPath predicate:

    extract(Details, '//stereo[@make="Koss" or @make="Bose"]') IS NOT NULL
    
    
  • Node tests using XML functions other than text(). For example, the predicate using the XML function, position, cannot be indexed in the following XPath predicate:

    extract(Details, '//accessory/stereo[position()=3]') IS NOT NULL
    
    

    However, the text() function in the following example can be a value filter on the stereo element:

    extract(Details, '//accessory/stereo[text()="CD"]') IS NOT NULL
    
    
  • Duplicate references to an XML element or an attribute within a single XPath expression. For example, if the stereo element appears in an XPath expression at two different locations, only the last occurrence is indexed, and all other references are processed during sparse predicate evaluation.

13.2.2 Index Representation

The Expression Filter index can be configured to process the XPath predicates efficiently by using the most discriminating XML elements and attributes as positional and value filters. Each one forms a predicate group for the expression set.

For the purpose of indexing XPath predicates, the predicate table structure described in Section 2.3 is extended to include two columns for each XML tag. For an XML tag configured as positional filter, these columns capture the relative and absolute positions of the tag in various XPath predicates. For an XML tag configured as a value filter, these columns capture the constants appearing with the tag in the node tests and their relational operators.

Note:

Only equality operators are indexed in this release.

Figure 13-1 shows the predicate table structure for the index configured with the following XML tags:

  • XML attribute stereo@make as value filter. (Predicate Group 4 - G4)

  • XML element stereo as positional filter. (Predicate Group 5 - G5)

  • Text node of the XML element memory as value filter. (Predicate Group 6 - G6)

This image can be viewed as an extension of the predicate table shown in Figure 12-1. The partial row shown in the predicate table captures the following XPath predicate:

extract(Details, '//stereo[@make="Koss" and /*/*/GPS/memory[text()="64MB"]]')
                                                                  IS NOT NULL

Figure 13-1 Conceptual Predicate Table with XPath Predicates

Description of Figure 13-1 follows
Description of "Figure 13-1 Conceptual Predicate Table with XPath Predicates"

13.2.3 Index Processing

The XPath predicates captured in the predicate table are compared to an XML document that is included in the data item passed to the EVALUATE operator. The positions and values of the XML tags used in the index are computed for the XML document, and these are compared with the values stored in the corresponding columns of the predicate table. Assuming that the relational operators and the right-hand-side constants for the value filter on stereo@make attribute are stored in G4_OP and G4_RHS columns of the predicate table (see Figure 13-1), the following query on the predicate table identifies the rows that satisfy this check for an XML document:

SELECT Rid FROM predicate_table
  WHERE G4_OP = '=' AND 
        G4_RHS in (SELECT column_value FROM TABLE (:G4ValuesArray));

For the previous query, the values for all the occurrences of the stereo@make attribute in the given XML document are represented as a VARRAY and are bound to the :G4ValuesArray variable.

Similarly, assuming that the position constraints and the absolute levels (positions) of the stereo element are stored in the G5_OP and G5_POS columns of the predicate table, the following query identifies all the rows that satisfy these positional checks for an XML document:

SELECT Rid FROM predicate_table 
    WHERE (G5_OP = '=' AND                     --- absolute position check --
           G5_POS in (SELECT column_value FROM table (:G5PosArray))) OR
           (G5_OP = '>=' AND                   --- relative position check --
            G5_POS <= SELECT max(column_value) FROM table (:G5PosArray)));
 

For the previous query, the :G5PosArray contains the levels for all the occurrences of the stereo element in the XML document. These checks on each predicate group can be combined with the checks on other (XPath and non-XPath) predicate groups to form a complete predicate table query. A subset of the XML tags can be identified as the most selective predicate groups, and they can be configured as the indexed predicate groups (See Section 12.4). Bitmap indexes are created for the selective predicate groups, and these indexes are used along with indexes defined for other indexed predicate groups to efficiently process the predicate table query.

13.2.4 Index Tuning for XPath Predicates

The most discriminating XML tags in a set of XPath predicates are classified as positional filters and value filters. A value filter is considered discriminating if node tests using the XML tag are selective enough to match only a subset of XML documents. Similarly, a positional filter is considered discriminating if the tag appears at different levels or does not appear in all XML documents, and thus match only a subset of them.

The XPath positional and value filters can be further mapped to indexed predicate groups or stored predicate groups. PL/SQL procedures are provided to configure an Expression Filter index with these parameters. For an attribute set consisting of two or more XMLType attributes, the XML tags can be associated with each of these attributes

The XPath index parameters for a set of expressions are considered part of the index parameter, and they can be assigned to an attribute set or an expression set (the column storing the expressions). The index parameters assigned to the attribute set act as defaults and are shared across all the expression sets associated with the attribute set.

A few XPath index parameters can be assigned to an XMLType attribute of an attribute set using the DBMS_EXPFIL.DEFAULT_XPINDEX_PARAMETERS procedure, as shown in the following example:

BEGIN
  DBMS_EXPFIL.DEFAULT_XPINDEX_PARAMETERS(
       attr_set   => 'Car4Sale',
       xmlt_attr  => 'Details',                         --- XMLType attribute
       xptag_list =>                                    --- Tag list 
         exf$xpath_tags(
           exf$xpath_tag(tag_name    => 'stereo@make',  --- XML attribute
                         tag_indexed => 'TRUE',
                         tag_type    => 'VARCHAR(15)'), --- value filter
           exf$xpath_tag(tag_name    => 'stereo',       --- XML element
                         tag_indexed => 'FALSE',
                         tag_type    => null),   --- null => positional filter
           exf$xpath_tag(tag_name    => 'memory',       --- XML element
                         tag_indexed => 'TRUE',
                         tag_type    => 'VARCHAR(10)')  --- value filter
          )
        );
END;
/

Note that a missing or null value for the tag_type argument configures the XML tag as a positional filter.

For more information about assigning XPath index parameters, see DEFAULT_XPINDEX_PARAMETERS Procedure.

By default, the previous XPath index parameters are used for any index created on an expression set that is associated with the Car4Sale attribute set.

CREATE INDEX InterestIndex ON Consumer (Interest) 
        INDEXTYPE IS EXFSYS.EXPFILTER;

Unlike simple index parameters, the XPath index parameters cannot be fine-tuned for an expression set when the index is created. However, you can achieve this by associating index parameters directly with the expression set using the DBMS_EXPFIL.INDEX_PARAMETERS and DBMS_EXPFIL.XPINDEX_PARAMETERS procedures and then creating the index, as shown in the following example:

BEGIN
  -- Derive the index parameters including XPath index params from defaults --
  DBMS_EXPFIL.INDEX_PARAMETERS(expr_tab  => 'Consumer',
                               expr_col  => 'Interest',
                               attr_list => null, 
                               operation => 'DEFAULT');
 
  -- fine-tune the XPath index parameters by adding another Tag --
  DBMS_EXPFIL.XPINDEX_PARAMETERS(expr_tab  => 'Consumer',
                                 expr_col  => 'Interest',
                                 xmlt_attr => 'Details',
                                 xptag_list =>
                                   exf$xpath_tags(
                                     exf$xpath_tag(tag_name    => 'GPS',
                                                   tag_indexed => 'TRUE',
                                                   tag_type    => null)),
                                 operation => 'ADD'); 
END;
/
 
 
CREATE INDEX InterestIndex ON Consumer (Interest) 
            INDEXTYPE IS EXFSYS.EXPFILTER;

For more information, see INDEX_PARAMETERS Procedure and XPINDEX_PARAMETERS Procedure.

Once the index is created on a column storing the expressions, a query with the EVALUATE operator can process a large set of XPath and non-XPath predicates for a data item efficiently:

SELECT * FROM Consumer WHERE 
    EVALUATE (Consumer.Interest, 
              'Model=>''Mustang'', 
               Year=>2000,
               Price=>18000,
               Mileage=>22000,
               Details=>sys.XMLType(''<details>
                                       <color>White</color> 
                                       <accessory>
                                        <stereo make="Koss">CD</stereo>
                                        <GPS>
                                         <resolution>1FT</resolution>
                                         <memory>64MB</memory>
                                        </GPS>
                                       </accessory>
                                      </details>'')'
              ) = 1;

Note:

Expression Filter index tuning based on XPath statistics is not supported in the current release.