14 Expressions with Spatial and Text Predicates

This chapter describes using expressions with spatial predicates and text predicates. The expression stored in the columns of Expression data type can specify predicates involving:

14.1 Expressions with Spatial Predicates


The Oracle Spatial or the Locator components must be installed to use spatial predicates in stored expressions.

The expressions stored in a column of a table may contain spatial predicates defined on SDO_GEOMERTY attributes. This section describes an application for spatial predicates using the Car4Sale example introduced in Chapter 11. For this purpose, the information published for each car going on sale includes a Location attribute in addition to the Model, Price, Mileage, and Year attributes. The Location attribute contains geographical coordinates for the vehicle's location, as an instance of the SDO_GEOMETRY data type.

Using the Location attribute, the consumer interested in a vehicle can restrict the search only to the vehicles that are within a specified distance, say half a mile, of his own location. You can specify this using the following spatial predicate involving the SDO_WITHIN_DISTANCE operator:

    2001, 8307, 
    SDO_POINT_TYPE(-77.03644, 37.89868, NULL), NULL, NULL
  ) , 
  'distance=0.5 units=mile'
) = 'TRUE' 

Note that spatial predicates are efficiently evaluated with the help of spatial indexes. Section 14.1.1 and Section 14.1.2 describes how to specify spatial predicates in arbitrary expressions and how to ensure the predicates are evaluated using appropriate spatial indexes.

14.1.1 Using Spatial Predicates in Expressions

Using the Oracle Database supplied SDO_GEOMETRY data type, you can specify spatial predicates on instances of spatial geometries within a standard SQL WHERE clause of a query. These predicates use operators such as SDO_WITHIN_DISTANCE and SDO_RELATE on an instance of SDO_GEOMETRY data type to relate two spatial geometries in a specific way. For more information, see Oracle Spatial Developer's Guide.

To allow spatial predicates in an expression set, the corresponding attribute set should be created with an attribute of MDSYS.SDO_GEOMETRY data type as shown in the following example:

                   (Model    VARCHAR2(20),
                    Year     NUMBER,
                    Price    NUMBER,
                    Mileage  NUMBER,
                    Location MDSYS.SDO_GEOMETRY);
  dbms_expfil.create_attribute_set (attr_set => 'Car4Sale',
                                    from_type => 'YES');

In order to specify predicates on the spatial attribute and index them for efficiency, you should associate the geometry metadata describing the dimension, lower and upper bounds, and tolerance in each dimension with each spatial geometry attribute in the attribute set. This metadata information can be inserted into the USER_SDO_GEOM_METADATA view using the attribute set name in the place of the table name. For more information on the USER_SDO_GEOM_METADATA view and its semantics, see Oracle Spatial Developer's Guide.

INSERT INTO user_sdo_geom_metadata VALUES ('CAR4SALE','LOCATION',
          mdsys.sdo_dim_element('X',  -180, 180, 0.5),
          mdsys.sdo_dim_element('Y',  -90, 90, 0.5)), 8307);

The expression set using the attribute set with one or more SDO_GEOMETRY attributes can include predicates on such attributes using SDO_WITHIN_DISTANCE or SDO_RELATE operators, as shown in the following examples:

Model = 'Taurus' and Price < 15000 and Mileage < 25000 and 
      SDO_GEOMETRY(2001, 8307, 
        SDO_POINT_TYPE(-77.03644, 37.89868, NULL), NULL, NULL), 
      'distance=0.5 units=mile') = 'TRUE'
Model = 'Taurus' and Price < 15000 and Mileage < 25000 and 
  SDO_RELATE (Location,
      SDO_GEOMETRY(2001, 8307, NULL, SDO_ELEM_INFO_ARRAY(1, 1003, 3), 
         SDO_ORDINATE_ARRAY(-77.03644, 37.89868, -75, 39), 
      'mask=anyinteract') = 'TRUE'

Note that unlike in the case of expressions with purely non-spatial predicates, expressions with spatial predicates cannot be evaluated when an Expression Filter index is not defined for the expression set. Once an Expression Filter index is created on the column storing the expressions, expressions with spatial predicates can be processed for a data item by passing an instance of SDO_GEOMETRY data type for the Location attribute, along with other attribute values, to the EVALUATE operator.

  EVALUATE (Interest,
       Car4Sale('Mustang', 2002, 20000, 250000,
            SDO_GEOMETRY(2001, 8307,  
            sdo_point_type(-77.03644, 38.9059284, null), null, null)))
  ) = 1;

The previous query identifies all the rows with expressions that are true based on their spatial and not-spatial predicates.

14.1.2 Indexing Spatial Predicates

The spatial predicates in the stored expressions are processed using some custom spatial indexes created on the geometries specified in the spatial predicates. These spatial indexes are automatically created when the Expression Filter index is created on the column storing expressions. The expressions with spatial predicates cannot be processed in the absence of these spatial indexes and hence an Expression Filter index is always required to evaluate such expressions.

When an Expression Filter index is defined on an expression column, the spatial attributes in the corresponding attribute set are all considered as indexed predicate groups. The predicate table has columns of SDO_GEOMETRY type for each of these attributes and spatial indexes are created on these columns. The values stored in an SDO_GEOMETRY column of the predicate table are computed based on the values specified in the spatial predicates involving corresponding attribute.

When the expressions are evaluated for a data item, the spatial indexes created on the geometry column in combination with bitmap indexes created for the other indexed predicate groups filter out the expressions that are false based on all indexed predicate groups. The expressions remaining in the working set are further evaluated based on the stored predicate groups and sparse predicates to identify all the expressions that are true for the given data item.

14.2 Expressions with Text Predicates

The Text predicates in the stored expressions are specified using the CONTAINS operator within the SQL WHERE clause syntax. The text predicates include the document to be searched (through an attribute name) and a text query expression. The text query expression specifies the text pattern to be searched within the document and it is represented using the CONTEXT grammar (see Oracle Text Application Developer's Guide).

Model='Taurus' and Price < 15000 and Mileage < 25000 AND
    CONTAINS (InsReport, 'near(water, damage), 4)') = 1

The Text classification engine currently allows storing (text) query expressions in a column of a table and processing them for incoming documents. The incoming documents are matched with the stored expressions using a MATCHES operator within SQL queries. The metadata required to match the document with the stored expressions is obtained from the CTXRULE index defined on the column storing the query expressions. This metadata includes the information about the text preferences such as the type of LEXER to use and the list of STOP words. This metadata is assigned to the CTXRULE index at the time of index creation and hence the MATCHES operator is only operational when the CTXRULE index exists on the column.

The support for text predicates in stored expressions is provided through the integration of Text classification engine and the Expression Filter feature. In these stored expressions, the text predicates can be combined with other text predicates, scalar or XML predicates using the SQL WHERE clause syntax. You specify the text predicates using the CONTAINS operator, which identifies the text attribute on which the text query expression should be evaluated. At the time of expression evaluation, Expression Filter matches the text query expression with the document assigned to the text attribute. The text preferences required for this matching operation are assigned to the text attributes as a form of metadata.

In an attribute set you create for an expression column, a text attribute is identified as an attribute of CLOB or VARCHAR data type with associated text preferences. You use such attributes as the first argument to the CONTAINS operator to form text predicates within the stored expressions. You specify the text preferences for an attribute using an instance of EXF$TEXT type, which accepts the text preferences in string format, for example: (LEXER insrpt_lexer WORDLIST insrpt_wordlist). The preferences you specify through this argument are used for parsing the document bound to the attribute and for indexing the text query expressions (using ctxsys.CTXRULE Indextype). Alternately, you can assign an EXF$TEXT instance with an empty preferences string to use default preferences for a CTXRULE index (See Oracle Text Reference).

  dbms_expfil.create_attribute_set(attr_set => 'Car4Sale');
  // create scalar attributes 
                                   attr_set  => 'Car4Sale',
                                   attr_name => 'Model',
                                   attr_type => 'VARCHAR2(30)');

                                   attr_set  => 'Car4Sale',
                                   attr_name => 'Price',
                                   attr_type => 'NUMBER');

  . . .
  // create text attribute
                                   attr_set  => 'Car4Sale',
                                   attr_name => 'InsReport',
                                   attr_type => 'CLOB'
                                   attr_type => 
                                     'LEXER insrpt_lexer
                                      WORDLIST insrpt_wordlist'));


The attribute sets consisting of one or more text attributes have to be assembled using the ADD_ELEMENTARY_ATTRIBUTE procedure and they cannot be created from an existing object type.

For the attribute set previously described, you should create the LEXER preference insrpt_lexer and the WORDLIST preference insrpt_wordlist using the CTX_DDL package (see Oracle Text Reference).

Figure 14-1 Text Predicate in the Stored Expression Using the CONTAINS Operator

Description of Figure 14-1 follows
Description of "Figure 14-1 Text Predicate in the Stored Expression Using the CONTAINS Operator"

When you use the attribute set with a text attribute as the metadata for an expression column, the expressions stored in the column can include text predicates. The text query expression you use within the text predicates use a subset of the CONTEXT grammar to support proximity searches, theme searches, and so forth, with the use of CONTEXT operators: ABOUT, AND, NEAR, NOT, OR, STEM, WITHIN, and THESAURUS. (See the section about document classification in Oracle Text Application Developer's Guide for complete syntax). Expression Filter validates the scalar predicates in the stored expressions for syntactic and semantic correctness when the expressions are inserted into a column of expression data type. However, the text query expressions you specify as the second argument to the CONTAINS operator are not validated until the time of index creation or index maintenance. Any text query expression errors identified during index creation are reported through CTX_USER_INDEX_ERRORS view as described next.

Unlike in the case of expressions containing purely scalar predicates, the expressions containing predicates on text attributes cannot be evaluated when the Expression Filter index is not defined on the column storing expressions. At the time of index creation, all the text attributes are added to the indexed predicate groups and CTXRULE indexes are created on the predicate table for each of these predicate groups. The CTXRULE index creation for the text predicate groups also identifies any errors in the corresponding text query expressions and these errors are reported in the CTX_USER_INDEX_ERRORS view (see Oracle Text Reference). However, the errors reported in this view refer to the rows in the predicate table structure of the Expression Filter index. These errors can be mapped back to the rows in the user table using the USER_EXPFIL_TEXT_INDEX_ERRORS view.

Once the index is defined, you can evaluate the stored expressions using the EVALUATE operator in a SQL query. When you define the attribute set with one or more text attributes, the data item passed to the EVALUATE operator consists of corresponding values (VARCHAR or CLOB) assigned to these attributes.

SELECT * FROM Consumer
                    '...4 star crash test rating ...') = 1

The previous query with the EVALUATE operator identifies all the expressions that are true based on the text and the non-text predicates.

When you define an Expression Filter index on the column storing expressions, some of the most common and selective scalar and XPath predicates are identified by the end-user and the predicate table is created to accommodate such predicates. While selecting the predicate groups for an Expression Filter index, the predicates involving a text attribute are always included in the indexed predicate groups. The predicate table is created with a VARCHAR column to store the text query expressions specified for a text attribute and a CTXRULE index is defined on this column. These indexes along with the bitmap indexes you define on the other indexed predicates process the expressions for a data item efficiently.

When you issue a query with the EVALUATE operator, the values from the data item passed in are bound into a query on the predicate table. The predicate table query uses a MATCHES operator on the columns storing text query expressions to match the incoming documents with the stored text query expressions. The MATCHES operator internally makes use of the CTXRULE index to process the text query expressions efficiently and return a subset of expressions that match the document. The results from matching the documents with text query expressions are combined with results from matching other scalar and XPath predicates to narrow down the working set to a set of candidate expressions that are true based on all indexed predicate groups. The processing of stored predicate groups and the sparse predicates remain unchanged with the inclusion of text predicates in the expression set.

Delayed DML Maintenance for Text Predicates

The modifications made to the data stored in Expression data type column are all transactional in nature. That is, any modifications made to the expression column are automatically reflected in the corresponding Expression Filter index. However, the modifications made to the text query expression within the stored expressions are not immediately reflected in the corresponding CTXRULE index. The index maintenance is forced by manually synchronizing the index, and until the CTXRULE index is synchronized, the Expression Filter index may return incorrect results (based on the old text predicates and new scalar predicates). One or more CTXRULE indexes defined for expression sets stored in a table can all be synchronized using a DBMS_EXPFIL.SYNC_TEXT_INDEXES procedure call. The name of the user table with one or more expression columns is passed as the only argument to this procedure call.

  dbms_expfil.sync_text_indexes (expr_tab => 'Consumer');

You must have the EXECUTE privilege on the CTX_DDL package for successful completion of the previous command. The call to the DBMS_EXPFIL.SYNC_TEXT_INDEXES procedure processes all the newly added or modified text predicates in the expression set and synchronizes the CTXRULE indexes accordingly. Any text predicate errors identified in this process are reported through the USER_EXPFIL_TEXT_INDEX_ERRORS view.