Oracle Spatial User's Guide and Reference
Release 9.0.1

Part Number A88805-01

Library

Product

Contents

Index

Go to previous page Go to next page

4
Indexing and Querying Spatial Data

After you have loaded spatial data (discussed in Chapter 3), you should create a spatial index on it to enable efficient query performance using the data. This chapter describes how to:

4.1 Creating a Spatial Index

Once data has been loaded into the spatial tables through either bulk or transactional loading, a spatial index must be created on the tables for efficient access to the data. Each spatial index can be an R-tree index or a quadtree index. To decide which type of index to use for a spatial application, you must understand the concepts and guidelines discussed in Section 1.7.

If the index creation does not complete for any reason, the index is invalid and must be deleted with the DROP INDEX <index_name> [FORCE] statement.

4.1.1 Creating R-Tree Indexes

If you create a spatial index without specifying any quadtree-specific parameters, an R-tree index is created. For example, the following statement creates a spatial R-tree index named territory_idx using default values for parameters that apply to R-tree indexes:

CREATE INDEX territory_idx ON territories (territory_geom)
   INDEXTYPE IS MDSYS.SPATIAL_INDEX;

For detailed information about options when creating a spatial index, see the documentation for the CREATE INDEX statement in Chapter 9.

If the rollback segment is not large enough, an attempt to create an R-tree index will fail. The rollback segment should be 100*n bytes, where n is the number of rows of data to be indexed. For example, if the table contains 1 million (1,000,000) rows, the rollback segment size should be 100,000,000 (100 million bytes).

To ensure an adequate rollback segment, or if you have tried to create an R-tree index and received an error that a rollback segment cannot be extended, review (or have a DBA review) the size and structure of the rollback segments that are available to the schema that owns the table with the geometries. Create a public rollback segment of the appropriate size, and place that rollback segment online. In addition, ensure that any small inappropriate rollback segments are placed offline during large spatial index operations. For information about performing these operations on a rollback segment, see the Oracle9i Database Administrator's Guide.

The system parameter SORT_AREA_SIZE affects the amount of time required to create the index. The SORT_AREA_SIZE value is the maximum amount, in bytes, of memory to use for a sort operation. The optimal value depends on the database size, but a good guideline is to make it at least 1 million bytes when you create an R-tree index. To change the SORT_AREA_SIZE value, use the ALTER SESSION statement. For example, to change the value to 20 million bytes:

ALTER SESSION SET SORT_AREA_SIZE = 20000000;

The tablespace specified with the tablespace keyword in the CREATE INDEX statement (or the default tablespace if the tablespace keyword is not specified) is used to hold both the index data table and some transient tables that are created for internal computations.

For large databases (over 1 million rows), a temporary tablespace may be needed to perform internal sorting operations. The recommended size for this temporary tablespace is 100*n bytes, where n is the number of rows in the table.

4.1.2 Determining Index Creation Behavior (Quadtree Indexes)

With a quadtree index, the tessellation algorithm used by the CREATE INDEX statement and by index maintenance routines on insert or update operations is determined by the SDO_LEVEL and SDO_NUMTILES values, which are supplied in the PARAMETERS clause of the CREATE INDEX statement. They are interpreted as follows:

SDO_LEVEL  SDO_NUMTILES  Action 

Not specified or 0 

Not specified or 0 

R-tree index. 

>= 1 

Not specified or 0 

Fixed indexing (indexing with fixed-size tiles). 

>= 1 

>= 1 

Hybrid indexing with fixed-size and variable-sized tiles. The SDO_LEVEL column defines the fixed tile size. The SDO_NUMTILES column defines the number of variable tiles to generate per geometry. 

Not specified or 0 

>= 1 

Not supported (error). 

An explicit commit operation is executed after the tessellation of all the geometries in a geometry column.

By default, spatial index creation requires a sizable amount of rollback space. To reduce the amount of rollback space required, you can supply the SDO_COMMIT_INTERVAL parameter in the CREATE INDEX statement. This will perform a database commit after every n geometries are indexed, where n is a user-defined value.

4.1.3 Spatial Indexing with Fixed-Size Tiles (Quadtree Indexes)

If you choose quadtree indexing for a spatial index, you should use fixed indexing for most applications, except for the rare circumstances where hybrid indexing should be considered. (See Appendix B for information about hybrid indexing. However, you should also consider using R-tree indexing before deciding on hybrid indexing.)

The fixed-size tile algorithm is expressed as a level referring to the number of tessellations performed. To use fixed-size tile indexing, omit the SDO_NUMTILES parameter and set the SDO_LEVEL value to the desired tiling level. The relationship between the tiling level and the resulting size of the tiles depends on the domain of the layer.

The domain used for indexing is defined by the upper and lower boundaries of each dimension stored in the DIMINFO column of the USER_SDO_GEOM_METADATA view, which contains an entry for the table and geometry column to spatially index. A typical domain could be -180 to 180 degrees for longitude, and -90 to 90 degrees for latitude, as represented in Figure 4-1. (The transference of the domain onto a sphere or other projection is left up to an application, unless a coordinate system is specified, as explained in Chapter 5.)

Figure 4-1 Sample Domain


Illustration of a sample domain (-180 to 180 degrees for longitude and -90 to 90 degrees for latitude).

If the SDO_LEVEL column is set to 1, then the tiles created by the indexing mechanism are the same size as tiles at the first level of tessellation. Each tile would be 180 degrees by 90 degrees as shown in Figure 4-2.

Figure 4-2 Fixed-Size Tiling at Level 1


Illustration of fixed-size tiling at level 1.

The formula for the number of fixed-size tiles in a domain is 4n where n is the number of tessellations, stored in the SDO_LEVEL column. In reality, tiles are only generated where geometries exist, and not for the whole domain. Figure 4-3 shows fixed-size tiling at level 2. In this figure, each tile is 90 degrees by 45 degrees.

Figure 4-3 Fixed-Size Tiling at Level 2


Illustration of fixed-size tiling at level 2.

The size of a tile can be determined by applying the following formula to each dimension:

length = (upper_bound - lower_bound) / 2 ^ sdo_level

The length refers to the length of the tile along the specified dimension. Applying this formula to the tiling shown in Figure 4-3 yields the following sizes:

length for dimension X = (180 - (-180) ) / 2^2
                       = (360)  / 4
                       = 90
length for dimension Y = (90 - (-90) ) / 2^2
                       = (180) / 4
                       = 45

At level 2, the tiles are 90 degrees by 45 degrees in size. As the number of levels increases, the tiles become smaller and smaller. Smaller tiles provide a more precise fit of the tiles over the geometry being indexed. However, because the number of tiles generated is unbounded, you must take into account the performance implications of using higher levels.


Note:

The Spatial Index Advisor component of Oracle Enterprise Manager can be used to determine an appropriate level for indexing with fixed-size tiles. The SDO_TUNE.ESTIMATE_TILING_LEVEL function, described in Chapter 17, can also be used for this purpose; however, this function performs less analysis than the Spatial Index Advisor. 


Besides the performance aspects related to selecting a fixed-size tile, tessellating the geometry into fixed-size tiles might have benefits related to the type of data being stored, such as using tiles sized to represent 1-acre farm plots, city blocks, or individual pixels on a display. Data modeling, an important part of any database design, is essential in a spatial database where the data often represents actual physical locations.

In Example 4-1, assume that data has been loaded into a table called ROADS, and the USER_SDO_GEOM_METADATA view has an entry for ROADS.SHAPE. You can use the following SQL statement to create a fixed index named ROADS_FIXED.

Example 4-1 Creating a Fixed Index

CREATE INDEX ROADS_FIXED ON ROADS(SHAPE) INDEXTYPE IS MDSYS.SPATIAL_INDEX
  PARAMETERS('SDO_LEVEL=8');


The SDO_LEVEL value is used while tessellating objects. Increasing the level results in smaller tiles and better geometry approximations.

4.1.4 Constraining Data to a Geometry Type

When you create or rebuild a spatial index, you can ensure that all geometries that are in the table or that are inserted later are of a specified geometry type. To constrain the data to a geometry type in this way, use the layer_gtype keyword in the PARAMETERS clause of the CREATE INDEX or ALTER INDEX REBUILD statement, and specify a value from the Geometry Type column of Table 2-1 in Section 2.2.1. For example, to constrain spatial data in a layer to polygons:

CREATE INDEX cola_spatial_idx
ON cola_markets(shape)
INDEXTYPE IS MDSYS.SPATIAL_INDEX
PARAMETERS ('layer_gtype=POLYGON');

The geometry types in Table 2-1 are considered as a hierarchy when data is checked:

4.1.5 Creating a Cross-Schema Index

You can create a spatial index on a table that is not in your schema. Assume that user B wants to create a spatial index on column GEOMETRY in table T1 under user A's schema. User B must perform the following steps:

  1. Connect as user A (or have user A connect) and execute the following statement:

    GRANT select on T1 to B;
    
    
    
  2. Connect as user B and execute a statement such as the following:

    GRANT create table to A;
    CREATE INDEX t1_spatial_idx on A.T1(geometry)
      INDEXTYPE IS mdsys.spatial_index;
    

4.1.6 Using Partitioned Spatial Indexes

You can create a partitioned spatial index on a partitioned table. This section describes usage considerations specific to Oracle Spatial. For a detailed explanation of partitioned tables and partitioned indexes, see the Oracle9i Database Administrator's Guide.

A partitioned spatial index can provide the following benefits:

The following restrictions apply to spatial index partitioning:

To create a partitioned spatial index, you must specify the LOCAL keyword. For example:

CREATE INDEX counties_idx ON counties(geometry)
   INDEXTYPE IS MDSYS.SPATIAL_INDEX LOCAL;

In this example, the default values are used for the number and placement of index partitions, namely:

If you do specify parameters for individual partitions, the following considerations apply:

To override the default partitioning values, use a CREATE INDEX statement with the following general format:

CREATE INDEX <indexname> ON <table>(<column>) 
  INDEXTYPE IS MDSYS.SPATIAL_INDEX 
     [PARAMETERS ('<spatial-params>, <storage-params>')] LOCAL 
     [( PARTITION <index_partition> 
        PARAMETERS ('<spatial-params>, <storage-params>') 
     [, PARTITION <index_partition> 
        PARAMETERS ('<spatial-params>, <storage-params>')] 
     )] 

For example, if the COUNTIES table has two partitions, P1 and P2, you can create a quadtree index as follows:

CREATE INDEX counties_idx ON counties(geometry) 
 INDEXTYPE IS MDSYS.SPATIAL_INDEX
    PARAMETERS ('sdo_level=6 tablespace=def_tbs') 
LOCAL 
    (PARTITION ip1 PARAMETERS ('sdo_level=6 tablespace=local_tbs1'), 
     PARTITION ip2 PARAMETERS ('sdo_level=6 tablespace=local_tbs2'),
     PARTITION ip3); 

In the preceding example:

Queries can operate on partitioned tables to perform the query on only one partition. For example:

SELECT * FROM counties PARTITION(p1)
    WHERE ...<some-spatial-predicate>;

Querying on a selected partition may speed up the query and also improve overall throughput when multiple queries operate on different partitions concurrently.

When queries use a partitioned spatial index, the semantics (meaning or behavior) of spatial operators and functions is the same with partitioned and nonpartitioned indexes, except in the case of SDO_NN (nearest neighbor). With SDO_NN, the requested number of geometries is returned for each partition that is affected by the query. For example, if you request the 5 closest restaurants to a point and the spatial index has 4 partitions, SDO_NN returns up to 20 (5*4) geometries. In this case, you must use the ROWNUM pseudocolumn (here, WHERE ROWNUM <=5) to return the 5 closest restaurants. See the description of the SDO_NN in Chapter 11 for more information.

4.2 Querying Spatial Data

This section describes how the structures of a Spatial layer are used to resolve spatial queries and spatial joins.

4.2.1 Query Model

Spatial uses a two-tier query model to resolve spatial queries and spatial joins. The term two-tier is used to indicate that two distinct operations are performed in order to resolve queries. If both operations are performed, the exact result set is returned.

The two operations are referred to as primary filter and secondary filter operations.

4.2.2 Spatial Query

An important concept in the spatial data model is that each geometry is represented by a set of exclusive and exhaustive tiles. This means that no tiles overlap each other (exclusive), and the tiles fully cover the object (exhaustive).

Consider the following layer containing several objects in Figure 4-4. Each object is labeled with its SDO_GID. The relevant tiles are labeled with Tn.

Figure 4-4 Tessellated Layer with Multiple Objects


Illustration of a tessellated layer with multiple objects.

A typical spatial query is to request all objects that lie within a defined fence or window. A query window is shown in Figure 4-5 by the dotted-line box. A dynamic query window refers to a fence that is not defined in the database, but that must be defined before it is used.

Figure 4-5 Tessellated Layer with a Query Window


Illustration of a tessellated layer with a query window.

4.2.2.1 Primary Filter

Spatial provides an operator named SDO_FILTER. This implements the primary filter portion of the two-step process involved in the product's query processing model. The primary filter uses the index data only to determine a set of candidate object pairs that may interact. The syntax is as follows:

SDO_FILTER(geometry1 MDSYS.SDO_GEOMETRY, geometry2 MDSYS.SDO_GEOMETRY, 
params VARCHAR2)

Where:

The following examples perform a primary filter operation only. They will return all the geometries shown in Figure 4-5 that have an index tile in common with one of the index tiles that approximates the query window: tiles T1, T2, T3, and T4. The result of the following examples are geometries with IDs 1013, 1243, 12, and 501.

Example 4-2 performs a primary filter operation without inserting the query window into a table. The window will be indexed in memory and performance will be very good.

Example 4-2 Primary Filter with a Temporary Query Window

SELECT A.Feature_ID FROM TARGET A
WHERE sdo_filter(A.shape, mdsys.sdo_geometry(2003,NULL,NULL, mdsys.sdo_elem_info_array(1,1003,3), mdsys.sdo_ordinate_array(x1,y1, x2,y2)), 'querytype=window') = 'TRUE';

Note that (x1,y1) and (x2,y2) are the lower-left and upper-right corners of the query window.

In Example 4-3, a transient instance of type SDO_GEOMETRY was constructed for the query window instead of specifying the window parameters in the query itself.

Example 4-3 Primary Filter with a Transient Instance of the Query Window

SELECT A.Feature_ID FROM TARGET A
WHERE sdo_filter(A.shape, :theWindow,'querytype=window') = 'TRUE';

Example 4-4 assumes the query window was inserted into a table called WINDOWS, with an ID of WINS_1.

Example 4-4 Primary Filter with a Stored Query Window

SELECT A.Feature_ID FROM TARGET A, WINDOWS B
WHERE B.ID = 'WINS_1' AND
sdo_filter(A.shape, B.shape,'querytype=window') = 'TRUE';

If the B.SHAPE column is not spatially indexed, the SDO_FILTER operator indexes the query window in memory and performance is very good.

If the B.SHAPE column is spatially indexed with the same SDO_LEVEL value as the A.SHAPE column, the SDO_FILTER operator reuses the existing index, and performance is very good or better.

If the B.SHAPE column is spatially indexed with a different SDO_LEVEL value than the A.SHAPE column, the SDO_FILTER operator reindexes B.SHAPE in the same way as if there were no index on the column originally, and then performance is very good.

4.2.2.2 Primary and Secondary Filters

The SDO_RELATE operator performs both the primary and secondary filter stages when processing a query. The syntax of the operator is as follows:

 SDO_RELATE(geometry1  MDSYS.SDO_GEOMETRY,
            geometry2  MDSYS.SDO_GEOMETRY,
            params     VARCHAR2)

Where:

The following examples perform both primary and secondary filter operations. They return all the geometries in Figure 4-5 that lie within or overlap the query window. The result of these examples is objects 1243 and 1013.

Example 4-5 performs both primary and secondary filter operations without inserting the query window into a table. The window will be indexed in memory and performance will be very good.

Example 4-5 Secondary Filter Using a Temporary Query Window

SELECT A.Feature_ID FROM TARGET A
   WHERE sdo_relate(A.shape, mdsys.sdo_geometry(2003,NULL,NULL,
                                       mdsys.sdo_elem_info_array(1,1003,3),
                                      mdsys.sdo_ordinate_array(x1,y1, x2,y2)),
'mask=anyinteract querytype=window') = 'TRUE';

Note that (x1,y1) and (x2,y2) are the lower-left and upper-right corners of the query window.

Example 4-6 assumes the query window was inserted into a table called WINDOWS, with an ID of WINS_1.

Example 4-6 Secondary Filter Using a Stored Query Window

SELECT A.Feature_ID FROM TARGET A, WINDOWS B
WHERE B.ID= 'WINS_1' AND
sdo_relate(A.shape, B.shape, 'mask=anyinteract querytype=window') = 'TRUE';

If the B.SHAPE column is not spatially indexed, the SDO_RELATE operator indexes the query window in memory and performance is very good.

If the B.SHAPE column is spatially indexed with the same SDO_LEVEL value as the A.SHAPE column, the SDO_RELATE operator reuses the existing index, and performance is very good or better.

If the B.SHAPE column is spatially indexed with a different SDO_LEVEL value than the A.SHAPE column, the SDO_FILTER operator reindexes B.SHAPE in the same way as if there were no index on the column originally, and then performance is very good.

4.2.2.3 Within-Distance Operator

The SDO_WITHIN_DISTANCE operator is used to determine the set of objects in a table that are within n distance units from a reference object. The reference object may be a transient or persistent instance of MDSYS.SDO_GEOMETRY (such as a temporary query window or a permanent geometry stored in the database). The syntax is as follows:

SDO_WITHIN_DISTANCE(geometry1  MDSYS.SDO_GEOMETRY, 
                    aGeom      MDSYS.SDO_GEOMETRY,
                    params     VARCHAR2);

Where:

The following example selects any objects within 1.35 distance units from the query window:

SELECT A.Feature_ID
  FROM TARGET A
  WHERE SDO_WITHIN_DISTANCE( A.shape, :theWindow, 'distance=1.35') = 'TRUE';

The distance units are based on the geometry coordinate system in use. The distance units are those specified in the UNIT field of the well-known text (WKTEXT) associated with the coordinate system. (The WKTEXT is explained in Section 5.4.1.1.) If you are using a geodetic coordinate system, the units are meters. If no coordinate system is used, the units are the same as for the stored data.

The SDO_WITHIN_DISTANCE operator is not suitable for performing spatial joins. That is, a query such as Find all parks that are within 10 distance units from coastlines will not be processed as an index-based spatial join of the COASTLINES and PARKS tables. Instead, it will be processed as a nested loop query in which each COASTLINES instance is in turn a reference object that is buffered, indexed, and evaluated against the PARKS table. Thus, the SDO_WITHIN_DISTANCE operation is performed n times if there are n rows in the COASTLINES table.

For non-geodetic data, there is an efficient way to accomplish a spatial join that involves buffering all the geometries of a layer. This method does not use the SDO_WITHIN_DISTANCE operator. First, create a new table COSINE_BUFS as follows:

CREATE TABLE cosine_bufs UNRECOVERABLE AS
   SELECT SDO_BUFFER (A.SHAPE, B.DIMINFO, 1.35)
     FROM COSINE A, USER_SDO_GEOM_METADATA B
     WHERE TABLE_NAME='COSINES' AND COLUMN_NAME='SHAPE';

Next, create a spatial index on the SHAPE column of COSINE_BUFS. Then you can perform the following query:

SELECT a.gif, b.gid FROM parks A cosine_bufs B
  WHERE SDO_Relate(A.shape, B.shape, 'mask=ANYINTERACT querytype=JOIN') ='TRUE';

4.2.2.4 Nearest Neighbor Operator

The SDO_NN operator is used to identify the nearest neighbors for a geometry. The syntax is as follows:

SDO_NN(geometry1  MDSYS.SDO_GEOMETRY, 
       geometry2  MDSYS.SDO_GEOMETRY,
       param      VARCHAR2
       [, number  NUMBER]);

Where:

The following example finds the two objects from the SHAPE column in the COLA_MARKETS table that are closest to a specified point (10,7). (Note the use of the optimizer hint in the SELECT statement, as explained in the Usage Notes for the SDO_NN operator in Chapter 11.)

SELECT /*+ INDEX(cola_markets cola_spatial_idx) */
 c.mkt_id, c.name  FROM cola_markets c  WHERE SDO_NN(c.shape,
   mdsys.sdo_geometry(2001, NULL, mdsys.sdo_point_type(10,7,NULL), NULL,
   NULL),  'sdo_num_res=2') = 'TRUE'; 

4.2.3 Spatial Join

A spatial join is the same as a regular join except that the predicate involves a spatial operator. In Spatial, a spatial join takes place when you compare all the geometries of one layer to all the geometries of another layer. This is unlike a query window that only compares a single geometry to all geometries of a layer.

In a spatial join, all tables must have the same type of spatial index (that is, R-tree or quadtree) defined on the geometry column; and if they have quadtree indexes, the SDO_LEVEL value must be the same for all the indexes.

Spatial joins can be used to answer questions such as, Which highways cross national parks?

The following table structures illustrate how the join would be accomplished for this example:

PARKS(    GID VARCHAR2(32), SHAPE MDSYS.SDO_GEOMETRY)
HIGHWAYS( GID VARCHAR2(32), SHAPE MDSYS.SDO_GEOMETRY) 

The primary filter would identify pairs of GID values from the PARKS and HIGHWAYS tables that interact in their index entries. The query that performs the primary filter join is:

SELECT A.GID, B.GID 
FROM PARKS A, HIGHWAYS B
WHERE sdo_filter(A.shape, B.shape, 'querytype=join') = 'TRUE';

The original question, asking about highways that cross national parks, requires the secondary filter operator to find the exact relationship between highways and parks.

The query that performs this join using both primary and secondary filters is:

SELECT A.GID, B.GID 
  FROM parks A, highways B
  WHERE sdo_relate(A.shape, B.shape, 
'mask=ANYINTERACT querytype=join');

4.2.4 Cross-Schema Operator Invocation

You can invoke spatial operators on an indexed table that is not in your schema. Assume that user A has a spatial table T1 (with index table IDX_TAB1) with a spatial index defined, that user B has a spatial table T2 (with index table IDX_TAB2) with a spatial index defined, and that user C wants to invoke operators on tables in one or both of the other schemas.

If user C wants to invoke an operator only on T1, user C must perform the following steps:

  1. Connect as user A and execute the following statements:

    GRANT select on T1 to C;
    GRANT select on idx_tab1 to C;
    
    
  2. Connect as user C and execute a statement such as the following:

    SELECT a.gid 
      FROM T1 a
      WHERE sdo_filter(a.geometry, :theGeometry, 'querytype=WINDOW') = 'TRUE';
    
    

If user C wants to invoke an operator on both T1 and T2, user C must perform the following steps:

  1. Connect as user A and execute the following statements:

    GRANT select on T1 to C;
    GRANT select on idx_tab1 to C;
    
    
  2. Connect as user B and execute the following statements:

    GRANT select on T2 to C;
    GRANT select on idx_tab2 to C;
    
    
  3. Connect as user C and execute a statement such as the following:

    SELECT a.gid
      FROM T1 a, T2 b
      WHERE b.gid = 5 AND 
            sdo_filter(a.geometry, b.geometry, 'querytype=WINDOW') = 'TRUE';
    


Go to previous page Go to next page
Oracle
Copyright © 2001 Oracle Corporation.

All Rights Reserved.

Library

Product

Contents

Index