Oracle Spatial User's Guide and Reference Release 8.1.7 Part Number A8533701 

Oracle Spatial is an integrated set of functions and procedures that enables spatial data to be stored, accessed, and analyzed quickly and efficiently in an Oracle8i database.
Spatial data represents the essential location characteristics of real or conceptual objects as those objects relate to the real or conceptual space in which they exist.
Oracle Spatial, often referred to as Spatial, provides a SQL schema and functions that facilitate the storage, retrieval, update, and query of collections of spatial features in an Oracle8i database. Spatial consists of the following components:
The spatial attribute of a spatial feature is the geometric representation of its shape in some coordinate space. This is referred to as its geometry.
Spatial supports two mechanisms, or models, for representing geometries:
These models roughly correspond to two alternatives described in the OpenGIS ODBC/SQL specification for geospatial features. The objectrelational model corresponds to a "SQL with Geometry Types" implementation of spatial feature tables, and the relational model corresponds to an implementation of spatial feature tables using numeric SQL types for geometry storage. Implementationspecific details are described in Part I "ObjectRelational Model" and Part II "Relational Model" of this guide.
You should select the objectrelational model in all cases except where the relational model is necessary for current needs. Basically, the objectrelational model is preferable in cases where replication and distributed databases are not required.
The following are some of the benefits of using the objectrelational model, as described in Part I of this guide:
The following are some of the benefits of using the relational model, as described in Part II of this guide:
When Oracle introduces replication and distributed support for objects in a future release, there will be no benefits to using the relational model.
Oracle Spatial is designed to make spatial data management easier and more natural to users of applications such as a Geographic Information System (GIS). Once this data is stored in an Oracle database, it can be easily manipulated, retrieved, and related to all the other data stored in the database.
A common example of spatial data can be seen in a road map. A road map is a twodimensional object that contains points, lines, and polygons that can represent cities, roads, and political boundaries such as states or provinces. A road map is a visualization of geographic information. The location of cities, roads, and political boundaries that exist on the surface of the Earth are projected onto a twodimensional display or piece of paper, preserving the relative positions and relative distances of the rendered objects.
The data that indicates the Earth location (latitude and longitude, or height and depth) of these rendered objects is the spatial data. When the map is rendered, this spatial data is used to project the locations of the objects on a twodimensional piece of paper. A GIS is often used to store, retrieve, and render this Earthrelative spatial data.
Types of spatial data that can be stored using Spatial other than GIS data include data from computeraided design (CAD) and computeraided manufacturing (CAM) systems. Instead of operating on objects on a geographic scale, CAD/CAM systems work on a smaller scale, such as for an automobile engine or printed circuit boards.
The differences among these three systems are only in the scale of the data, not its complexity. They might all actually involve the same number of data points. On a geographic scale, the location of a bridge can vary by a few tenths of an inch without causing any noticeable problems to the road builders. Whereas, if the diameter of an engine's pistons are off by a few tenths of an inch, the engine will not run. A printed circuit board is likely to have many thousands of objects etched on its surface that are no bigger than the smallest detail shown on a road builder's blueprints.
These applications all store, retrieve, update, or query some collection of features that have both nonspatial and spatial attributes. Examples of nonspatial attributes are name, soil_type, landuse_classification, and part_number. The spatial attribute is a coordinate geometry, or vectorbased representation of the shape of the feature. The spatial attribute, referred to as the geometry, is an ordered sequence of vertices that are connected by straight line segments or circular arcs. The semantics of the geometry are determined by its type, which may be one of point, line string, or polygon.
The relational model of Spatial supports three geometric primitive types and geometries composed of collections of these types. The primitive types are as follows:
2D points are elements composed of two ordinates, X and Y, often corresponding to longitude and latitude. Line strings are composed of one or more pairs of points that define line segments. Polygons are composed of connected line strings that form a closed ring and the interior of the polygon is implied. Figure 11 illustrates the supported geometric primitive types.
Selfcrossing polygons are not supported, although selfcrossing line strings are supported. If a line string crosses itself, it does not become a polygon. A selfcrossing line string does not have any implied interior.
Thus, the objectrelational implementation supports the types listed in Figure 11, as well as the types shown in Figure 12.
The objectrelational model adds the following types to those previously listed:
Thus, the objectrelational implementation supports the types listed in Figure 11, as well as the types shown in Figure 12.
The Spatial data model is a hierarchical structure consisting of elements, geometries, and layers, which correspond to representations of spatial data. Layers are composed of geometries, which in turn are made up of elements.
For example, a point might represent a building location, a line string might represent a road or flight path, and a polygon might represent a state, city, zoning district, or city block.
An element is the basic building block of a geometry. The supported spatial element types are points, line strings, and polygons. For example, elements might model star constellations (point clusters), roads (line strings), and county boundaries (polygons). Each coordinate in an element is stored as an X,Y pair. The exterior ring and the interior ring of a polygon with holes are considered as two distinct elements that together make up a complex polygon.
Point data consists of one coordinate. Line data consists of two coordinates representing a line segment of the element. Polygon data consists of coordinate pair values, one vertex pair for each line segment of the polygon. Coordinates are defined in order around the polygon (counterclockwise for an exterior polygon ring, clockwise for an interior polygon ring).
A geometry (or geometry object) is the representation of a spatial feature, modeled as an ordered set of primitive elements. In the relational model, each geometry is required to be uniquely identified by a geometry identifier (GID) associating it with the other attributes of the feature. This is not required in the objectrelational model.
A geometry can consist of a single element, which is an instance of one of the supported primitive types, or a homogeneous or heterogeneous collection of elements. A multipolygon, such as one used to represent a set of islands, is a homogeneous collection. A heterogeneous collection is one in which the elements are of different types.
In the relational model, a complex geometry such as a polygon with holes would be stored as a sequence of polygon elements. All subelements of a multielement polygon are wholly contained within the outermost element. This is not required using the objectrelational model.
An example of a geometry might describe the buildable land in a town. This could be represented as a polygon with holes where water or zoning prevents construction.
A layer is a heterogeneous collection of geometries having the same attribute set. For example, one layer in a GIS might include topographical features, while another describes population density, and a third describes the network of roads and bridges in the area (lines and points). Each layer's geometries and associated spatial index are stored in the database in standard tables.
Many Spatial functions accept a tolerance parameter. If the distance between two points is less than or equal to the tolerance, Spatial considers the two points to be a single point. Thus, tolerance is usually a reflection of how accurate or precise users perceive their spatial data to be.
For example, assume that you want to know which restaurants are within 5 kilometers of your house. Assume also that Maria's Pizzeria is 5.1 kilometers from your house. If you ask, Find all restaurants within 5 kilometers and use a tolerance of 0.1 (or greater, such as 0.5), Maria's Pizzeria will be included; however, if you specify a tolerance less than 0.1 (such as 0.05), Maria's Pizzeria will not be included.
Tolerance values for Spatial functions are typically very small, for example, 0.0005 (5E4). With a tolerance of 5E4 and the query in the preceding paragraph, a restaurant 5.0005 kilometers away is returned but a restaurant 5.00051 kilometers away is not returned.
Spatial uses a twotier query model to resolve spatial queries and spatial joins. The term is used to indicate that two distinct operations are performed to resolve queries. The output of both operations yields the exact result set.
The two operations are referred to as primary and secondary filter operations.
Figure 13 illustrates the relationship between the primary and secondary filters.
As shown in Figure 13, the primary filter operation on a large input data set produces a smaller candidate set, which contains at least the exact result set and may contain more records. The secondary filter operation on the smaller candidate set produces the exact result set.
Spatial uses a linear quadtreebased spatial index to implement the primary filter. This is described in detail in following sections.
The function SDO_GEOM.RELATE is used as a secondary filter. It evaluates the topological relationship, such as whether two given geometries are touching, covering each other, or have any interaction.
Spatial does not require the use of both the primary and secondary filters. In some cases, just using the primary filter is sufficient. For example, a zoom feature in a mapping application queries for data that overlaps a rectangle representing visible boundaries. The primary filter very quickly returns a superset of the query. The mapping application can then apply clipping routines to display the target area.
The purpose of the primary filter is to quickly create a subset of the data and reduce the processing burden on the secondary filter. The primary filter therefore should be as efficient (that is, selective yet fast) as possible. This is determined by the characteristics of the spatial index on the data.
The introduction of spatial indexing capabilities into the Oracle database engine is a key feature of the Spatial product. A spatial index, like any other index, provides a mechanism to limit searches, but in this case based on spatial criteria such as intersection and containment. A spatial index is needed to:
A spatial index is considered a logical index. The entries in the spatial index are dependent on the location of the geometries in a coordinate space, but the index values are in a different domain. Index entries take on values from a linearly ordered integer domain, while the coordinates for a geometry may be pairs of integer, floatingpoint, or doubleprecision numbers.
Oracle Spatial lets you use Rtree indexing (the default) or quadtree indexing, or both. Each index type is appropriate in different situations. You can maintain both an Rtree and quadtree index on the same geometry column, by using the add_index parameter with the ALTER INDEX statement (described in Chapter 5), and you can choose which index to use for a query by specifying the idxtab1 and/or idxtab2 parameters with certain Spatial operators, such as SDO_RELATE, described in Chapter 6.
In choosing whether to use an Rtree or quadtree index for a spatial application, consider the following.
With Rtree indexes:
With quadtree indexes:
Testing of Rtree and quadtree indexes with many workloads and operators is ongoing, and results and recommendations will be documented as they become available. However, before choosing an index type for an application, you should understand the concepts and options associated with both Rtree indexing (described in Section 1.7.1) and quadtree indexing (described in Section 1.7.2).
A spatial Rtree index can index spatial data of up to 4 dimensions. An Rtree index approximates each geometry by a single rectangle that minimally encloses the geometry (called the minimum bounding rectangle, or MBR). For a layer of geometries, an Rtree index consists of a hierarchical index on the MBRs of the geometries in the layer. This Rtree index is stored in the spatial index table (SDO_INDEX_TABLE in the USER_SDO_INDEX_METADATA view, described in Section 2.5). The Rtree index also maintains a sequence number generator (SDO_RTREE_SEQ_NAME in the USER_SDO_INDEX_METADATA view) to ensure that simultaneous updates by concurrent users can be made to the index.
If you create a spatial index without specifying any indexing parameters, an Rtree index is created. For example, the following statement creates a spatial Rtree index named territory_idx using default values for parameters that apply to Rtree indexes:
CREATE INDEX territory_idx ON territories (territory_geom) INDEXTYPE IS MDSYS.SPATIAL_INDEX;
If you create a spatial index without specifying the SDO_LEVEL or SDO_NUMTILES keyword in the PARAMETERS clause, an Rtree index is created. For detailed information about options when creating a spatial index, see the documentation for the CREATE INDEX statement in Chapter 5.
If the rollback segment is not large enough, an attempt to create an Rtree index will fail. The rollback segment should be 100*n bytes, where n is the number of rows of data to be indexed. For example, if the table contains 1 million (1,000,000) rows, the rollback segment size should be 100,000,000 (100 million bytes).
To ensure an adequate rollback segment, or if you have tried to create an Rtree index and received an error that the system rollback segment cannot be extended, place that rollback segment offline, create a public rollback segment of the appropriate size, and place that rollback segment online. For information about performing these operations on a rollback segment, see the Oracle8i Administrator's Guide.
The system parameter SORT_AREA_SIZE affects the amount of time required to create the index. The SORT_AREA_SIZE value is the maximum amount, in bytes, of memory to use for a sort operation. The optimal value depends on the database size, but a good guideline is to make it at least 1 million bytes when you create an Rtree index. To change the SORT_AREA_SIZE value, use the ALTER SESSION statement. For example, to change the value to 20 million bytes:
ALTER SESSION SET SORT_AREA_SIZE = 20000000;
For large databases (over 1 million rows), a temporary tablespace may be needed to perform internal computations. The recommended size for this temporary tablespace is 100*n bytes, where n is the number of rows in the table.
In the linear quadtree indexing scheme, the coordinate space (for the layer where all geometric objects are located) is subjected to a process called tessellation, which defines exclusive and exhaustive cover tiles for every stored geometry. Tessellation is done by decomposing the coordinate space in a regular hierarchical manner. The range of coordinates, the coordinate space, is viewed as a rectangle. At the first level of decomposition, the rectangle is divided into halves along each coordinate dimension generating four tiles. Each tile that interacts with the geometry being tessellated is further decomposed into four tiles. This process continues until some termination criteria, such as size of the tiles or the maximum number of tiles to cover the geometry, is met.
Spatial can use either fixedsize or variablesized tiles to cover a geometry:
Fixedsize tile resolution and the number of variablesized tiles used to cover a geometry are userselectable parameters called SDO_LEVEL and SDO_NUMTILES, respectively. Smaller fixedsize tiles or more variablesized tiles provides better geometry approximations. The smaller the number of tiles, or the larger the tiles, the coarser are the approximations.
Spatial supports two quadtree indexing types, reflecting two valid combinations of SDO_LEVEL and SDO_NUMTILES values:
The process of determining which tiles cover a given geometry is called tessellation. The tessellation process is a quadtree decomposition, where the twodimensional coordinate space is broken down into four^{ }covering tiles of equal size. Successive tessellations divide those tiles that interact with the geometry down into smaller tiles, and this process continues until the desired level or number of tiles has been achieved. The results of the tessellation process on a geometry are stored in a table, referred to as the SDOINDEX table.
The tiles at a particular level can be linearly sorted by systematically visiting tiles in an order determined by a spacefilling curve as shown in Figure 14. The tiles can also be assigned unique numeric identifiers, known as Morton codes or zvalues. The terms tile and tile code will be used interchangeably in this and other sections related to spatial indexing.
Fixedsize tile spatial indexing is the preferred indexing method for the relational model. This method uses tiles of equal size to cover a geometry. Because all the tiles are the same size, they all have codes of the same length, and the standard SQL equality operator (=) can be used to compare tiles during a join operation. This results in excellent performance characteristics.
Two geometries are likely to interact, and hence pass the primary filter stage, if they share one or more tiles. The SQL statement for the primary filter stage is:
SELECT DISTINCT <select_list for geometry identifiers>
FROM table1_sdoindex A, table2_sdoindex B
WHERE A.sdo_code = B.sdo_code
The effectiveness and efficiency of this indexing method depends on the tiling level and the variation in size of the geometries in the layer. If you select a small fixedsize tile to cover small geometries and then try to use the same size tile to cover a very large geometry, a large number of tiles would be required. However, if the chosen tile size is large, so that fewer tiles are generated in the case of a large geometry, then the index selectivity suffers because the large tiles do not approximate the small geometries very well. Figure 15 and Figure 16 illustrate the relationships between tile size, selectivity, and the number of cover tiles.
With a small fixedsize tile as shown in Figure 15, selectivity is good, but a large number of tiles is needed to cover large geometries. A window query would easily identify geometries A and B, but would reject C.
With a large fixedsize tile as shown in Figure 16, fewer tiles are needed to cover the geometries, but the selectivity is not as good. The same window query as in Figure 15 would probably pick up all three geometries. Any object that shares tile T1 or T2 would identify object C as a candidate, even though the objects may be far apart, such as objects B and C are in Figure 16.
The SDO_TUNE.ESTIMATE_TILING_LEVEL function helps determine an appropriate tiling level for your data set.
Figure 17 illustrates geometry 1013 tessellated to three fixedsized tiles at level 1. The codes for these cover tiles are then stored in an SDOINDEX table.
Only three of the four tiles generated by the first tessellation interact with the geometry. Only those tiles that interact with the geometry are stored in the SDOINDEX table, as shown in Table 11. In this example, three fixedsize tiles are used. The table structure is shown for illustrative purposes only. The column names of this table differ depending on which implementation method, relational or objectrelational, is in use. In the relational model, you must directly access the index tables. In the objectrelational model, this is both unnecessary and not recommended.
SDO_GID <number>  SDO_CODE <raw> 

1013 
T0 
1013 
T2 
1013 
T3 
All elements in a geometry are tessellated. In a multielement geometry like 1013, Element 1 is already covered by tile T2 from the tessellation of Element 0. If, however, the specified tiling resolution were such that tile T2 were further subdivided and one of these smaller tiles were completely contained in Element 1, then that tile would be excluded because it would not interact with the geometry.
Hybrid indexing uses a combination of fixedsize and variablesized tiles for spatially indexing a layer. Variablesized tile spatial indexing uses tiles of different sizes to approximate a geometry. For each geometry, you will have a set of fixedsize tiles that fully cover the geometry, and also a set of variablesized tiles that fully cover the geometry.
For most applications, you should not use hybrid indexes, but should instead use fixed indexes or Rtree indexes. The rare circumstances where hybrid indexes should be considered are as follows:
The best starting value for SDO_NUMTILES in the new hybrid layer can be calculated by getting a count of the rows in the spatial index table and dividing this number by the number of rows with geometries in the layer, then rounding up. A spatial join (`QUERYTYPE=JOIN') is not a common requirement for applications, and it is comparable to a spatial cross product where each of the geometries in one layer will be compared with each of the geometries in the other layer.
If both of these conditions are true, it may be better to use the SDO_NUMTILES parameter to get coverage for the smaller geometries, while keeping the fixed tile size relatively large for the large geometries by using a smaller SDO_LEVEL value.
In Figure 18, the variablesized cover tiles closely approximate each geometry. This results in good selectivity. The number of variable tiles needed to cover a geometry is controlled using the SDO_NUMTILES parameter.
A variable tile is subdivided if it interacts with the geometry, and subdivision will not result in tiles that are smaller than a predetermined size. This size, or tiling resolution, is determined by a default SDO_MAXLEVEL value.
Figure 19 illustrates how geometry OBJ_1, represented using the objectrelational implementation, is approximated with hybrid indexing (SDO_LEVEL = 1 and SDO_NUMTILES = 4). These are not recommended values for SDO_LEVEL and SDO_NUMTILES; they were chosen to simplify this example. The cover tiles are stored in the SDOINDEX table as shown in Table 12.
In Figure 19, note that for simplicity the tiles have been numbered, and LL and UR indicate lower left and upper right, respectively. For example, T2_LL indicates the lower left corner of tile T2. (This designation scheme does not reflect the actual format use in Spatial.)
In Figure 19, note which fixedsize tiles are associated with geometry OBJ_1. Only three (T0, T2, T3) of the four large tiles (T0, T1, T2, T3) generated by the tessellation actually interact with the geometry. Only those three are stored in the SDOINDEX table. In examining which variablesized tiles are used, tile T0 shows a further tessellation to four smaller tiles, two of which (T02, T03) are used to cover a portion of the geometry. The variablesized tiles are stored in the SDO_CODE column in the Spatial index table. The fixedsize tiles are stored in the SDO_GROUPCODE column. The spatial index structure is discussed in Section 2.5.
Table 12 shows the tiles from Figure 19 that are stored in the SDOINDEX table.
Table 12 Section of the SDOINDEX Table
As with the fixedsize tile model, all elements in a geometry are tessellated in one step. In a multielement geometry like OBJ_1, Element 1 (the hole shown in Figure 19) is covered by a redundant tile (T2) from the tessellation of Element 0, but this tile is stored only once.
The SDO_TUNE package has some functions that help determine appropriate SDO_LEVEL and SDO_NUMTILES values. Appendix A contains suggestions on when hybrid indexing may be beneficial, and how to select values for the two required parameters.
Spatial uses filter methods to determine the spatial relationship between entities in the database. The spatial relation is based on geometry locations. The most common spatial relations are based on topology and distance. For example, the boundary of an area consists of a set of curves that separates the area from the rest of the coordinate space. The interior of an area consists of all points in the area that are not on its boundary. Given this, two areas are said to be adjacent if they share part of a boundary but do not share any points in their interior.
The distance between two spatial objects is the minimum distance between any points in them. Two objects are said to be within a given distance of one another if their distance is less than the given distance.
To determine spatial relations, Spatial has several secondary filter methods:
The syntax of these operators is given in Chapter 6.
The SDO_RELATE operator implements a 9intersection model for categorizing binary topological relations between points, lines, and polygons. Each spatial object has an interior, a boundary, and an exterior. The boundary consists of points or lines that separate the interior from the exterior. The boundary of a line consists of its end points. The boundary of a polygon is the line that describes its perimeter. The interior consists of points that are in the object but not on its boundary, and the exterior consists of those points that are not in the object.
Given that an object A has 3 components (a boundary Ab, an interior Ai, and an exterior Ae), any pair of objects has 9 possible interactions between their components. Pairs of components have an empty (0) or a nonempty (1) set intersection. The set of interactions between 2 geometries is represented by a 9intersection matrix that specifies which pairs of components intersect and which do not. Figure 110 shows the 9intersection matrix for 2 polygons that are adjacent to one another. This matrix yields the following bit mask, generated in rowmajor form: "101001111".
Some of the topological relationships identified in the seminal work by Professor Max Egenhofer (University of Maine, Orono) and colleagues have names associated with them. Spatial uses the following names:
The SDO_WITHIN_DISTANCE operator determines if two spatial objects, A and B, are within a Euclidean distance of one another. This operator first constructs a distance buffer, Db, around the reference object B. It then checks that A and Db are nondisjoint. The distance buffer of an object consists of all points within the given distance from that object. Figure 111 shows the distance buffers for point, line, and area objects. Notice how the buffer is rounded near the corners of the objects.
The SDO_NN operator returns a specified number of objects from a geometry column that are closest to a specified geometry (for example, the five closest restaurants to a city park). In determining how close two geometry objects are, the shortest possible distance between any two points on the surface of each object is used.
Point data, unlike line and polygon data, has the unique characteristic of always using only one tile per point. For applications handling point data sets that are several tens of gigabytes or larger, performance gains can be achieved by using Oracle8i table partitioning features.
Table partitioning is available only with the Partitioning Option of Oracle8i Enterprise Edition. If the Partitioning Option is available to you, the preferred method is to use Oracle8i table partitioning in conjunction with spatial indexing (using the relational model). See Oracle8i Concepts for a description of Oracle8i partitioning. See Section A.1.7.3 for a description of a sample script that uses table partitioning with point data.
Oracle Spatial provides examples that you can use to reinforce your learning and to create models for coding certain operations. Several examples are provided in the following directory:
$ORACLE_HOME/md/demos/examples
The following files in that directory are helpful for applications that use the Oracle Call Interface (OCI):
This guide also includes many examples in SQL and PL/SQL. One or more examples are usually provided with the reference information for each function or procedure, and several simplified examples are provided that illustrate table and index creation, as well as several functions and procedures:

Copyright © 19962000, Oracle Corporation. All Rights Reserved. 
