Oracle Spatial User's Guide and Reference
Release 8.1.6






Prev Next

Spatial Concepts

Oracle Spatial is an integrated set of functions and procedures that enables spatial data to be stored, accessed, and analyzed quickly and efficiently in an Oracle8i database.

Spatial data represents the essential location characteristics of real or conceptual objects as those objects relate to the real or conceptual space in which they exist.

1.1 What Is the Spatial Product?

Oracle Spatial, often referred to as Spatial, provides a SQL schema and functions that facilitate the storage, retrieval, update, and query of collections of spatial features in an Oracle8i database. Spatial consists of the following components:

The spatial attribute of a spatial feature is the geometric representation of its shape in some coordinate space. This is referred to as its geometry.

1.2 Object-Relational and Relational Models

Spatial supports two mechanisms, or models, for representing geometry:

These models roughly correspond to two alternatives described in the OpenGIS ODBC/SQL specification for geospatial features. The object-relational model corresponds to a "SQL with Geometry Types" implementation of spatial feature tables, and the relational model corresponds to an implementation of spatial feature tables using numeric SQL types for geometry storage.

Implementation-specific details are described in Part I "Object-Relational Model" and Part II "Relational Model" of this guide. Other sections in this chapter describe Spatial concepts and features, without reference to their implementation wherever possible.

You should select the best model to fit your needs. Basically, the object-relational model is preferable in cases where replication and distributed databases are not required.


If read-only replication is acceptable: Oracle8i does not provide database replication support for tables containing one or more columns of an object data type. In many application environments, however, it may be acceptable to have read-only replicas that do not need to be perfectly up to date with the production database. In these environments, users can take advantage of the Oracle8i managed standby capability to get, in effect, read-only replication of database instances or tablespaces, and thus the object-relational model can be used. 

1.2.1 Benefits of the Object-Relational Model

The following are some of the benefits of using the object-relational model, as described in Part I of this guide:

1.2.2 Benefits of the Relational Model

The following are some of the benefits of using the relational model, as described in Part II of this guide:

When Oracle introduces replication and distributed support for objects in a future release, there will be no benefits to using the relational model.

1.3 Introduction to Spatial Data

Oracle Spatial is designed to make spatial data management easier and more natural to users or applications such as a Geographic Information System (GIS). Once this data is stored in an Oracle database, it can be easily manipulated, retrieved, and related to all the other data stored in the database.

A common example of spatial data can be seen in a road map. A road map is a two-dimensional object that contains points, lines, and polygons that can represent cities, roads, and political boundaries such as states or provinces. A road map is a visualization of geographic information. The location of cities, roads, and political boundaries that exist on the surface of the Earth are projected onto a two-dimensional display or piece of paper, preserving the relative positions and relative distances of the rendered objects.

The data that indicates the Earth location (latitude and longitude, or height and depth) of these rendered objects is the spatial data. When the map is rendered, this spatial data is used to project the locations of the objects on a two-dimensional piece of paper. A GIS is often used to store, retrieve, and render this Earth-relative spatial data.

Other types of spatial data that can be stored using Spatial besides GIS data include data from computer-aided design (CAD) and computer-aided manufacturing (CAM) systems. Instead of operating on objects on a geographic scale, CAD/CAM systems work on a smaller scale, such as for an automobile engine or printed circuit boards.

The differences among these three systems are only in the scale of the data, not its complexity. They might all actually involve the same number of data points. On a geographic scale, the location of a bridge can vary by a few tenths of an inch without causing any noticeable problems to the road builders. Whereas, if the diameter of an engine's pistons are off by a few tenths of an inch, the engine will not run. A printed circuit board is likely to have many thousands of objects etched on its surface that are no bigger than the smallest detail shown on a road builder's blueprints.

These applications all store, retrieve, update, or query some collection of features that have both nonspatial and spatial attributes. Examples of nonspatial attributes are name, soil_type, landuse_classification, and part_number. The spatial attribute is a coordinate geometry, or vector-based representation of the shape of the feature. The spatial attribute, referred to as the geometry, is an ordered sequence of vertices that are connected by straight line segments or circular arcs. The semantics of the geometry are determined by its type, which may be one of point, line string, or polygon.

1.4 Geometric Types for Relational and Object-Relational Models

The relational model of Spatial supports three geometric primitive types and geometries composed of collections of these types. The primitive types are as follows:

2-D points are elements composed of two ordinates, X and Y, often corresponding to longitude and latitude. Line strings are composed of one or more pairs of points that define line segments. Polygons are composed of connected line strings that form a closed ring and the interior of the polygon is implied. Figure 1-1 illustrates the supported geometric primitive types.

Figure 1-1 Geometric Primitive Types

Self-crossing polygons are not supported although self-crossing line strings are. If a line string crosses itself, it does not become a polygon. A self-crossing line string does not have any implied interior.

Thus, the object-relational implementation supports the types listed in Figure 1-1, as well as the types shown in Figure 1-2.

The object-relational model adds the following types to those previously listed:

Thus, the object-relational implementation supports the types listed in Figure 1-1, as well as the types shown in Figure 1-2.

Figure 1-2 Additional Geometry Types Using the Object-Relational Model

1.5 Data Model

The Spatial data model is a hierarchical structure consisting of elements, geometries, and layers, which correspond to representations of spatial data. Layers are composed of geometries, which in turn are made up of elements.

For example, a point might represent a building location, a line string might represent a road or flight path, and a polygon might represent a state, city, zoning district, or city block.

1.5.1 Element

An element is the basic building block of a geometry. The supported spatial element types are points, line strings, and polygons. For example, elements might model star constellations (point clusters), roads (line strings), and county boundaries (polygons). Each coordinate in an element is stored as an X,Y pair. The exterior ring and the interior ring of a polygon with holes are considered as two distinct elements that together make up a complex polygon.

Point data consists of one coordinate. Line data consists of two coordinates representing a line segment of the element. Polygon data consists of coordinate pair values, one vertex pair for each line segment of the polygon. Coordinates are defined in order around the polygon (clockwise for an exterior polygon ring, counter-clockwise for an interior polygon ring).

1.5.2 Geometry

A geometry (or geometry object) is the representation of a spatial feature, modeled as an ordered set of primitive elements. In the relational model, each geometry is required to be uniquely identified by a geometry identifier (GID) associating it with the other attributes of the feature. This is not required in the object-relational model.

A geometry can consist of a single element, which is an instance of one of the supported primitive types, or a homogeneous or heterogeneous collection of elements. A multipolygon, such as one used to represent a set of islands, is a homogeneous collection. A heterogeneous collection is one in which the elements are of different types.

In the relational model, a complex geometry such as a polygon with holes would be stored as a sequence of polygon elements. All subelements of a multielement polygon are wholly contained within the outermost element. This is not required using the object-relational model.

An example of a geometry might describe the buildable land in a town. This could be represented as a polygon with holes where water or zoning prevents construction.

1.5.3 Layer

A layer is a heterogeneous collection of geometries having the same attribute set. For example, one layer in a GIS might include topographical features, while another describes population density, and a third describes the network of roads and bridges in the area (lines and points). Each layer's geometries and associated spatial index are stored in the database in standard tables.

1.5.4 Tolerance

Many Spatial functions accept a tolerance parameter. If the distance between two points is less than or equal to the tolerance, Spatial considers the two points to be a single point. Thus, tolerance is usually a reflection of how accurate or precise users perceive their spatial data to be.

For example, assume that you want to know which restaurants are within 5 kilometers of your house. Assume also that Maria's Pizzeria is 5.1 kilometers from your house. If you ask "Find all restaurants within 5 kilometers and use a tolerance of 0.1" (or greater, such as 0.5), Maria's Pizzeria will be included; however, if you specify a tolerance less than 0.1 (such as 0.05), Maria's Pizzeria will not be included.

Tolerance values for Spatial functions are typically very small, for example, 0.0005 (5E-4). With a tolerance of 5E-4 and the query in the preceding paragraph, a restaurant 5.0005 kilometers away is returned but a restaurant 5.00051 kilometers away is not returned.

1.6 Query Model

Spatial uses a two-tier query model to resolve spatial queries and spatial joins. The term is used to indicate that two distinct operations are performed in order to resolve queries. The output of both operations yields the exact result set.

The two operations are referred to as primary and secondary filter operations.

Figure 1-3 illustrates the relationship between the primary and secondary filters.

Figure 1-3 Query Model

Spatial uses a linear quadtree-based spatial index to implement the primary filter. This is described in detail in following sections.

The function SDO_GEOM.RELATE is used as a secondary filter. It evaluates the topological relationship, such as whether two given geometries are touching, covering each other, or have any interaction.

Spatial does not require the use of both the primary and secondary filters. In some cases, just using the primary filter is sufficient. For example, a zoom feature in a mapping application queries for data that overlaps a rectangle representing visible boundaries. The primary filter very quickly returns a superset of the query. The mapping application can then apply clipping routines to display the target area.

The purpose of the primary filter is to quickly create a subset of the data and reduce the processing burden on the secondary filter. The primary filter therefore should be as efficient (that is, selective yet fast) as possible. This is determined by the characteristics of the spatial index on the data.

1.7 Indexing Methods

The introduction of spatial indexing capabilities into the Oracle database engine is a key feature of the Spatial product. A spatial index, like any other index, provides a mechanism to limit searches, but in this case based on spatial criteria such as intersection and containment. A spatial index is required to:

A spatial index is considered a logical index. The entries in the spatial index are dependent on the location of the geometries in a coordinate space, but the index values are in a different domain. Index entries take on values from a linearly ordered integer domain, while the coordinates for a geometry may be pairs of integer, floating-point, or double-precision numbers. Spatial uses a linear quadtree-based indexing scheme, also known as z-ordering, which works as described in the following paragraphs.

The coordinate space (for the layer where all geometric objects are located) is subjected to a process called tessellation, which defines exclusive and exhaustive cover tiles for every stored geometry. Tessellation is done by decomposing the coordinate space in a regular hierarchical manner. The range of coordinates, the coordinate space, is viewed as a rectangle. At the first level of decomposition, the rectangle is divided into halves along each coordinate dimension generating four tiles. Each tile that interacts with the geometry being tessellated is further decomposed into four tiles. This process continues until some termination criteria, such as size of the tiles or the maximum number of tiles to cover the geometry, is met.

Spatial can use either fixed-size or variable-sized tiles to cover a geometry:

Fixed-size tile resolution and the number of variable-sized tiles used to cover a geometry are user-selectable parameters called SDO_LEVEL and SDO_NUMTILES, respectively. Smaller fixed-size tiles or more variable-sized tiles provides better geometry approximations. The smaller the number of tiles or the larger the tiles, the coarser are the approximations.

Spatial supports two indexing types, reflecting two valid combinations of SDO_LEVEL and SDO_NUMTILES values:

1.7.1 Tessellation of a Layer During Indexing

The process of determining which tiles cover a given geometry is called tessellation. The tessellation process is a quadtree decomposition, where the two-dimensional coordinate space is broken down into four equal-sized covering tiles. Successive tessellations divide those tiles that interact with the geometry down into smaller tiles, and this process continues until the desired level or number of tiles has been achieved. The results of the tessellation process on a geometry are stored in a table, referred to as the SDOINDEX table.

The tiles at a particular level can be linearly sorted by systematically visiting tiles in an order determined by a space-filling curve as shown in Figure 1-4. The tiles can also be assigned unique numeric identifiers, known as Morton codes or z-values. The terms tile and tile code will be used interchangeably in this and other sections related to spatial indexing.

Figure 1-4 Quadtree Decomposition and Morton Codes

1.7.2 Fixed Indexing

Fixed-size tile spatial indexing is the preferred indexing method for the relational model. This method uses tiles of equal size to cover a geometry. Because all the tiles are the same size, they all have codes of the same length, and the standard SQL equality operator (=) can be used to compare tiles during a join operation. This results in excellent performance characteristics.

Two geometries are likely to interact, and hence pass the primary filter stage, if they share one or more tiles. The SQL statement for the primary filter stage is:

SELECT DISTINCT <select_list for geometry identifiers> 
FROM table1_sdoindex A, table2_sdoindex B
WHERE A.sdo_code = B.sdo_code

The effectiveness and efficiency of this indexing method depends on the tiling level and the variation in size of the geometries in the layer. If you select a small fixed-size tile to cover small geometries and then try to use the same size tile to cover a very large geometry, a large number of tiles would be required. However, if the chosen tile size is large, so that fewer tiles are generated in the case of a large geometry, then the index selectivity suffers because the large tiles do not approximate the small geometries very well. Figure 1-5 and Figure 1-6 illustrate the relationships between tile size, selectivity, and the number of cover tiles.

With a small fixed-size tile as shown in Figure 1-5, selectivity is good, but a large number of tiles is needed to cover large geometries. A window query would easily identify geometries A and B, but would reject C.

Figure 1-5 Fixed-Size Tiling with Many Small Tiles

With a large fixed-size tile as shown in Figure 1-6, fewer tiles are needed to cover the geometries, but the selectivity is not as good. The same window query as in Figure 1-5 would probably pick up all three geometries. Any object that shares tile T1 or T2 would identify object C as a candidate, even though the objects may be far apart, such as objects B and C are in Figure 1-6.

Figure 1-6 Fixed-Size Tiling with Fewer Large Tiles

The SDO_TUNE.ESTIMATE_TILING_LEVEL function helps determine an appropriate tiling level for your data set.

Figure 1-7 illustrates geometry 1013 tessellated to three fixed-sized tiles at level 1. The codes for these cover tiles are then stored in an SDOINDEX table.

Figure 1-7 Tessellated Figure

Only three of the four tiles generated by the first tessellation interact with the geometry. Only those tiles that interact with the geometry are stored in the SDOINDEX table, as shown in Table 1-1. In this example, three fixed-size tiles are used. The table structure is shown for illustrative purposes only. The column names of this table differ depending on which implementation method, relational or object-relational, is in use. In the relational model, you must directly access the index tables. In the object-relational model, this is both unnecessary and not recommended.

Table 1-1 SDOINDEX Table Using Fixed-Size Tiles 
SDO_GID <number>  SDO_CODE <raw> 







All elements in a geometry are tessellated. In a multielement geometry like 1013, Element 1 is already covered by tile T2 from the tessellation of Element 0. If, however, the specified tiling resolution were such that tile T2 were further subdivided and one of these smaller tiles were completely contained in Element 1, then that tile would be excluded because it would not interact with the geometry.

1.7.3 Hybrid Indexing

Hybrid indexing is the preferred method for indexing the object-relational model. Hybrid indexing uses a combination of fixed-size and variable-sized tiles for spatially indexing a layer. Variable-sized tile spatial indexing uses tiles of different sizes to approximate a geometry. For each geometry, you will have a set of fixed-size tiles that fully cover the geometry, and also a set of variable-sized tiles that fully cover the geometry.

In Figure 1-8, the variable-sized cover tiles closely approximate each geometry. This results in good selectivity. The number of variable tiles needed to cover a geometry is controlled using the SDO_NUMTILES parameter.

Figure 1-8 Variable-Sized Tile Spatial Indexing

A variable tile is subdivided if it interacts with the geometry, and subdivision will not result in tiles that are smaller than a predetermined size. This size, or tiling resolution, is determined by a default SDO_MAXLEVEL parameter. A user may modify this parameter, but it is not recommended.

Figure 1-9 illustrates how geometry OBJ_1, represented using the object-relational implementation, is approximated with hybrid indexing (SDO_LEVEL = 1 and SDO_NUMTILES = 4). These are not recommended values for SDO_LEVEL and SDO_NUMTILES; they were chosen to simplify this example. The cover tiles are stored in the SDOINDEX table as shown in Table 1-2.

Figure 1-9 Decomposition of the Geometry

In Figure 1-9, note that for simplicity the tiles have been numbered, and LL and UR indicate lower left and upper right, respectively. For example, T2_LL indicates the lower left corner of tile T2. (This designation scheme does not reflect the actual format use in Spatial.)

In Figure 1-9, note which fixed-size tiles are associated with geometry OBJ_1. Only three (T0, T2, T3) of the four large tiles (T0, T1, T2, T3) generated by the tessellation actually interact with the geometry. Only those three are stored in the SDOINDEX table. In examining which variable-sized tiles are used, tile T0 shows a further tessellation to four smaller tiles, two of which (T02, T03) are used to cover a portion of the geometry. The variable-sized tiles are stored in the SDO_CODE column in the Spatial index table. The fixed-size tiles are stored in the SDO_GROUPCODE column. The spatial index structure is discussed in Section 2.5.

Table 1-2 shows the tiles from Figure 1-9 that are stored in the SDOINDEX table.

Table 1-2 Section of the SDOINDEX Table




<binary data> 


<binary data> 



<binary data> 


<binary data> 



<binary data> 


<binary data> 



<binary data> 


<binary data> 

As with the fixed-size tile model, all elements in a geometry are tessellated in one step. In a multielement geometry like OBJ_1, Element 1 (the "hole" shown in Figure 1-9) is covered by a redundant tile (T2) from the tessellation of Element 0, but this tile is stored only once.

The SDO_TUNE package has some functions that help determine appropriate SDO_LEVEL and SDO_NUMTILES values. Appendix A contains suggestions on when hybrid indexing may be beneficial, and how to select values for the two required parameters.

1.8 Spatial Relations and Filtering

Spatial uses filter methods to determine the spatial relationship between entities in the database. The spatial relation is based on geometry locations. The most common spatial relations are based on topology and distance. For example, the boundary of an area consists of a set of curves that separates the area from the rest of the coordinate space. The interior of an area consists of all points in the area that are not on its boundary. Given this, two areas are said to be adjacent if they share part of a boundary but no points in their interior.

The distance between two spatial objects is the minimum distance between any points in them. Two objects are said to be within a given distance of one another if their distance is less than the given distance.

To determine spatial relations, Spatial has two secondary filter methods. One method (SDO_RELATE operator) evaluates topological criteria, and a second method (SDO_WITHIN_DISTANCE operator) determines if two spatial objects are within a Euclidean distance of each other. The syntax of both operators is given in Chapter 9.

The SDO_RELATE operator implements a 9-intersection model for categorizing binary topological relations between points, lines, and polygons. Each spatial object has an interior, a boundary, and an exterior. The boundary consists of points or lines that separate the interior from the exterior. The boundary of a line consists of its end-points. The boundary of a polygon is the line that describes its perimeter. The interior consists of points that are in the object but not on its boundary and the exterior consists of those points that are not in the object.

Given that an object A has 3 components (a boundary Ab, an interior Ai, and an exterior Ae), any pair of objects has 9 possible interactions between their components. Pairs of components have an empty (0) or a non-empty (1) set intersection. The set of interactions between 2 geometries is represented by a 9-intersection matrix that specifies which pairs of components intersect and which do not. Figure 1-10 shows the 9-intersection matrix for 2 polygons that are adjacent to one another. This matrix yields the following bit mask, generated in row-major form: "101001111".

Figure 1-10 The 9-Intersection Model

Some of the topological relationships identified in the seminal work by Dr. Egenhofer1 and colleagues have names associated with them. Spatial uses the following names:

The other secondary filter method (SDO_WITHIN_DISTANCE operator) determines if two spatial objects, A and B, are within a Euclidean distance of one another. It first constructs a distance buffer, Db, around the reference object B. It then checks that A and Db are non-disjoint. The distance buffer of an object consists of all points within the given distance from that object. Figure 1-11 shows the distance buffers for point, line, and area objects. Notice how the buffer is rounded near the corners of the objects.

Figure 1-11 Distance Buffers for Points, Lines, and Polygons

1.9 Partitioned Point Data

Point data, unlike line and polygon data, has the unique characteristic of always using only one tile per point. For applications handling point data sets that are several tens of gigabytes or larger, performance gains can be achieved by using Oracle8i table partitioning features.

Table partitioning is available only with the Partitioning Option of Oracle8i Enterprise Edition. If the Partitioning Option is available to you, the preferred method is to use Oracle8i table partitioning in conjunction with spatial indexing (using the relational model). See Oracle8i Concepts for a description of Oracle8i Partitioning. See Section A.1.7.3 for a description of a sample script that uses table partitioning with point data.

1 Dr. Max Egenhofer, University of Maine, Orono.

Prev Next
Copyright © 1999 Oracle Corporation.

All Rights Reserved.