Oracle7 Spatial Data Option User's Guide and Reference |
Library |
Product |
Contents |
Index |

Oracle Spatial Data Option is an integrated set of functions and procedures that enables spatial data to be stored, accessed, and analyzed quickly and efficiently in an Oracle7 database.

Spatial data represents the essential location characteristics of real or conceptual objects as those objects relate to the real or conceptual space in which they exist.

Spatial Data Option is designed to make the storage, retrieval, and manipulation of spatial data easier and more natural to users such as a Geographic Information System (GIS). Once this data is stored in an Oracle7 relational database, it can be easily and meaningfully manipulated and retrieved as it relates to all the other data stored in the database.

A common example of spatial data can be seen in a road map. A road map is a two-dimensional object that contains points, lines, and polygons that can represent cities, roads, and political boundaries such as states or provinces. A road map is a visualization of geographic information. The location of cities, roads, and political boundaries that exist on the surface of the Earth are projected onto a two-dimensional display or piece of paper, preserving the relative positions and relative distances of the rendered objects.

The data that indicates the Earth location (latitude and longitude, or height and depth) of these rendered objects is the spatial data. When the map is rendered, this spatial data is used to project the locations of the objects on a two-dimensional piece of paper. A GIS is often used to store, retrieve, and render this Earth-relative spatial data.

Other types of spatial data that can be stored using Spatial Data Option besides GIS data include data from computer-aided design (CAD) and computer-aided manufacturing (CAM) systems. Instead of operating on objects on a geographic scale, CAD/CAM systems work on a smaller scale such as for an automobile engine or much smaller scale as for printed circuit boards.

The differences among these three systems are only in the scale of the data, not its complexity. They might all actually involve the same number of data points. On a geographic scale, the location of a bridge can vary by a few tenths of an inch without causing any noticeable problems to the road builders. Whereas, if the diameter of an engine's pistons are off by a few tenths of an inch, the engine will not run. A printed circuit board is likely to have many thousands of objects etched on its surface that are no bigger than the smallest detail shown on a roadbuilder's blueprints.

Spatial Data Option supports three geometric primitive types and geometries composed of collections of these types. The three primitive types are:

**2-D points** are elements composed of two ordinates, X and Y, often corresponding to longitude and latitude. **Line strings** are composed of one or more pairs of points that define line segments. **Polygons** are composed of connected line strings that form a closed ring and the interior of the polygon is implied. Figure 1-1 illustrates the supported geometric primitives:

Self-crossing polygons are not supported although self-crossing line strings are. If a line string crosses itself it does not become a polygon. A self-crossing line string does not have any implied interior.

The Spatial Data Option data model is a hierarchical structure consisting of elements, geometries, and layers, which correspond to representations of spatial data. Layers are composed of geometries, which in turn are made up of elements.

For example, a point might represent a building location, a line string might be a road or flight path, and a polygon could be a state, city, zoning district, or city block.

An** element** is the basic building block of a geometric feature for Spatial Data Option. The supported spatial element types are points, line strings, and polygons. For example, elements might model star constellations (point clusters), roads (line strings), and county boundaries (polygons). Each coordinate in an element is stored as an X,Y pair.

**Point data**^{1} consists of one coordinate. **Line data** consists of two coordinates representing a line segment of the element. **Polygon data** consists of coordinate pair values, one vertex pair for each line segment of the polygon. Coordinates are defined in either a clockwise or counter-clockwise order around the polygon. Each layer's geometric objects and their associated spatial index are stored in the database in tables.

If an element spans more than one row, an incremental sequence number (starting at zero) orders the rows.

A **geometry**, or geometric object, is the representation of a user's spatial feature, modeled as an ordered set of primitive elements. Each geometric object is required to be uniquely identified by a numeric geometry identifier (GID), associating the object with its corresponding attribute set.

A complex geometric feature such as a polygon with holes would be stored as a sequence of polygon elements. In a multi-element polygonal geometry, all subelements are wholly contained within the outermost element, thus building a more complex geometry from simpler pieces.

For example, a geometry might describe the buildable land in a town. This could be represented as a polygon with holes where water or zoning prevents construction.

A** layer** is a heterogeneous collection of geometries having the same attribute set. For example, one layer in a GIS might include topographical features, while another describes population density, and a third describes the network of roads and bridges in the area (lines and points).

Spatial Data Option uses four database tables to store and index spatial data. These four tables are collectively referred to as a "layer". A template SQL script is provided to facilitate the creation of these tables. See Section A.1.3, "crlayer.sql Script" for details.

The following tables describe the schema of a Spatial Data Option layer.

SDO_ORDCNT |
SDO_LEVEL |
SDO_NUMTILES |
SDO_COORDSYS |
---|---|---|---|

<number> | <number> | <number> | <varchar> |

SDO_DIMNUM |
SDO_LB |
SDO_UB |
SDO_TOLERANCE |
SDO_DIMNAME |
---|---|---|---|---|

<number> | <number> | <number> | <number> | <varchar> |

SDO_GID |
SDO_ESEQ |
SDO_ETYPE |
SDO_SEQ |
SDO_X1 |
SDO_Y1 |
... |
SDO_Xn |
SDO_Yn |
---|---|---|---|---|---|---|---|---|

<number> | <number> | <number> | <number> | <number> | <number> | ... | <number> | <number> |

SDO_GID |
SDO_CODE |
SDO_MAXCODE ** | SDO_GROUPCODE ** | SDO_META |
---|---|---|---|---|

<number> | <raw> | <raw> | <raw> | <raw> |

The SDO_MAXCODE and SDO_GROUPCODE columns are not required for the recommended indexing algorithm using fixed-size tiles.

The columns of each table are defined as follows:

**SDO_ORDCNT -**The SDO_ORDCNT column is the total number of ordinates per row in the <layername>_SDOGEOM table. That is, the total number of data value columns, and not the number of points or coordinates. SDO_ORDCNT should not be multiplied by the total number of dimensions per coordinate as it is already a total.**SDO_LEVEL**- The SDO_LEVEL column stores the number of times the layer should be tessellated during the indexing stage. Use the SDO_TUNE.ESTIMATE_TILING_LEVEL() procedure to determine an appropriate tiling level for your data.**SDO_NUMTILES -**The SDO_NUMTILES is the number of variable-sized tiles used to tessellate each object in the <layername>_SDOGEOM table. This column must be set to NULL when using fixed-size tiles.*This column is new for release 7.3.4.***SDO_COORDSYS -**The SDO_COORDSYS is an optional column where you can indicate the name of the coordinate system, using a standard such as "POSC" or "OGIS".*This column is new for release 7.3.4.*

**SDO_DIMNUM**- The SDO_DIMNUM column is the dimension to which this row refers, starting with 1 and increasing.**SDO_LB**- The SDO_LB column is the lower bound of the ordinate in this dimension. For example, if the dimension is latitude, the lower bound would be -90.**SDO_UB**- The SDO_UB column is the upper bound of the ordinate in this dimension. For example, if the dimension is longitude, the upper bound would be 180.**SDO_TOLERANCE**- The SDO_TOLERANCE column is the distance two points can be apart and still be considered the same due to round-off errors. Tolerance must be greater than zero. If you want zero tolerance, enter a number such as 0.00005, where the number of zeroes to the right of the decimal point matches the precision of your data. The extra "5" will round up to your last decimal digit.**SDO_DIMNAME**- The SDO_DIMNAME column is used for the usual name applied to this dimension, such as "longitude," "latitude," "X" or "Y."

**SDO_GID**- A**G**eometry**ID**entifier is a unique numeric identifier for each geometry in a layer.**SDO_ESEQ**- A geometry is composed of one or more primitive types called elements. The column SDO_ESEQ enumerates each element in a geometry, that is, the**E**lement**SEQ**uence number.**SDO_ETYPE**- The type of each element is recorded by the**E**lement**TYPE**column. For this release of Spatial Data Option, the valid values are SDO_GEOM.POINT_TYPE, SDO_GEOM.LINESTRING_TYPE, or SDO_GEOM.POLYGON_TYPE (ETYPE values 1, 2, and 3, respectively). Setting the ETYPE to zero (0) indicates that this element should be ignored. See Section A.2.7 for information on ETYPE 0.**SDO_SEQ**- The SDO_SEQ column records the order (the**SEQ**uence number) of each row of data making up the element.**SDO_X1**- X value of the first coordinate.**SDO_Y1**- Y value of the first coordinate.**SDO_Xn**- X value of the Nth coordinate.**SDO_Yn**- Y

**SDO_GID**- A**G**eometry**ID**entifier is a unique numeric identifier for each geometry in a layer. This can be thought of as a foreign key back to the <layername>_SDOGEOM table.**SDO_CODE**- The SDO_CODE column is the bit interleaved ID of a tile that covers SDO_GID. The number of bytes needed for the SDO_CODE and SDO_MAXCODE columns depends on the level used for tiling. Use the SDO_ADMIN.SDO_CODE_SIZE() function to determine the size required for a given layer. The maximum number of bytes possible is 255.**SDO_MAXCODE**- The SDO_MAXCODE column describes a logical tile which is the smallest tile (with the longest tile ID) in the current quadrant. The SDO_MAXCODE column is SDO_CODE padded out one place farther than the longest allowable code name for this index. This column is not used for fixed-size tiles.**SDO_GROUPCODE****-**The SDO_GROUPCODE is a prefix of SDO_CODE. It represents a tile at level <layername>_SDOLAYER.SDO_LEVEL that contains or is equal to the tile represented by SDO_CODE. This column is not used for fixed-size tiles.*This column is new for release 7.3.4.***SDO_META****-**The SDO_META column is not required for spatial queries. It provides information necessary to find the bounds of a tile. See Section A.2.4 for one possible usage of this column.*This column is new for release 7.3.4.*

Spatial Data Option provides stored procedures that assume the existence of the layer schema as described in this section. While layer objects may contain additional columns, they are required to contain at least the columns described here with the same column names and data types. The SDO_GID column always needs to be specified when loading or inserting geometries.

Figure 1-2 illustrates how a geometry is stored in the database using Spatial Data Option. The geometry to be stored is a complex polygon with a hole in it.

<layername>_SDOLAYER

SDO_ORDCNT (number) |
---|

4 |

<layername>_SDODIM

SDO_DIMNUM (number) | SDO_LB (number) | SDO_UB (number) | SDO_TOLERANCE (number) | SDO_DIMNAME (varchar) |
---|---|---|---|---|

1 |
0 |
100 |
.05 |
X axis |

2 |
0 |
100 |
.05 |
Y axis |

<layername>_SDOGEOM

In this example, the <layername>_SDOGEOM table is shown as a 4-wide table. In actual usage, Spatial Data Option supports N-wide^{2} tables. The coordinates for each ESEQ in this example table could have been loaded into a single, 18-wide row containing values for SDO_X1 and SDO_Y1 through SDO_X9 and SDO_Y9.

The data in the <layername>_SDOINDEX table is described in Section 1.5, "Indexing Methods".

Spatial Data Option release 7.3.3 introduced two distinct algorithms for building a spatial index: fixed-size tiling and variable-sized tiling. Based on testing and customer feedback, for release 7.3.4, Oracle recommends using only fixed-size tiling on production systems. Variable-sized tiling, while it has theoretical advantages in some situations, is included for experimentation purposes only.

In spatial indexing, the object space (the layer where all geometric objects are located,) is subjected to a quad-tree decomposition called **tessellation**, which defines exclusive and exhaustive cover tiles of every stored element. Spatial Data Option can use either fixed or variable-sized tiles to cover a geometry.

The number of tiles used to cover an element is a user-tunable parameter. Using either smaller fixed-size tiles or more variable-sized tiles provides a better fit of the tiles to the element. The fewer the number of tiles or the larger the tiles, the coarser the fit.

The process of determining which tiles cover a given element is called tessellation. The results of a tessellation process on an element are stored in the <layername>_SDOINDEX table. See Section 2.3, "Index Creation" for more information on tessellation.

Figure 1-3 illustrates geometry 1013 decomposed to a maximum of four cover tiles. The cover tiles are then shown stored in the <layername>_SDOINDEX table.

Only three of the four tiles generated by the first tessellation interact with the geometry. Only those tiles that interact with the geometry are stored in the <layername>_SDOINDEX table, as shown in Table 1-5. In this example, three fixed-size tiles are used.

SDO_GID <NUMBER> | SDO_CODE <RAW> |
---|---|

1013 |
T0 |

1013 |
T2 |

1013 |
T3 |

All elements in a geometry are tessellated. In a multi-element polygon like 1013, Element 1 is covered by the redundant tile T2 from the tessellation of Element 0.

Fixed-tile spatial indexing is the recommended indexing method. This method uses cover tiles of equal size to cover a geometry. Because all of the tiles are the same size, the standard SQL equality operator (=) can be used to compare tiles during a join operation. This results in excellent performance characteristics.

If you select a small fixed-tile size to cover small geometries and then try to use the same sized tiles to cover a very large geometry, a large number of tiles would be required, thereby increasing the size of the index table. However, if the fixed-tile size chosen is large, so that fewer tiles are generated in the case of a large geometry, then the index selectivity suffers because the large tiles do not fit the small geometries very well. Figure 1-4 and Figure 1-5 illustrate the relationships between tile size, selectivity, and the number of cover tiles.

Using a small fixed-size tile as shown in Figure 1-4, selectivity is good, but a large number of tiles are needed to cover large geometries. A window query would easily identify geometries A and B, but would reject C.

Using a large fixed-size tile as shown in Figure 1-5, fewer tiles are needed to cover the geometries, but the selectivity is poor. A window query would likely pick up all three geometries. Any object that shares tile T1 or T2 would identify object C as a candidate, even though the objects may be far apart, such as objects B and C are in this figure.

Use the SDO_TUNE.ESTIMATE_TILING_LEVEL() function to determine an appropriate tiling level for your dataset.

Note:
Variable-sized tile spatial indexing is not recommended for production environments. It is included primarily for experimentation purposes. |

Variable-sized tile spatial indexing uses cover tiles of different sizes to approximate a geometry. The user specifies the number of tiles per object that should be used to approximate it and this governs the tiling process. As in the case of a linear quad tree, the cover tiles depend on the size and shape of each geometry being indexed and therefore good primary filter selectivity can be achieved. Figure 1-6 illustrates the approximation that variable-sized tiles can achieve.

In Figure 1-6, the variable-sized cover tiles conform closely to each geometry, resulting in good selectivity. The number of tiles needed to cover a geometry is controlled using the sdo_numtiles column in the SDO_LAYER table. See Section 2.3.3 for information on selecting appropriate values for variable-sized tiling.

Two geometries may interact if a tile of one object is equal to, inside of, or contains a tile of the other. Thus, the predicate to compare tiles involves a test for either equality or containment. This is unlike fixed-size tiling, which only requires an equality check. Example 1-1 demonstrates this feature ("5" is an arbitrary window identifier):

SELECT r.sdo_gid FROM roads_sdoindex r, window_sdoindex w WHERE w.sdo_gid = 5 AND (r.sdo_code BETWEEN w.sdo_code AND w.sdo_maxcode OR w.sdo_code BETWEEN r.sdo_code AND r.sdo_maxcode);

To reduce the number of times a complex predicate needs to be applied, variable-sized tile indexing uses a mechanism similar to spatial partitioning. To use this mechanism, select a tiling level, called the **groupcode level**, that results in tiles larger than any variable-sized tile generated for all the geometries in the layer or dataset of interest. Each tile at the specified groupcode level can be considered as a spatial partition. This reduces the size of the dataset on which the complex predicate is evaluated. Example 1-2 illustrates this feature:

SELECT r.sdo_gid FROM layer_sdoindex r, window_sdoindex w WHERE w.sdo_gid = 5 AND r.sdo_group_code = w.sdo_groupcode AND (r.sdo_code BETWEEN w.sdo_code AND w.sdo_maxcode OR w.sdo_code BETWEEN r.sdo_code AND r.sdo_maxcode);

In Figure 1-7, consider the domain partitioned into 16 subregions. If a join compares tiles from the two objects, under normal circumstances the join operation would process tiles from the entire domain, searching for tiles that interact. However, if you constrain the processing to common partitions, then only partitions 5 and 6 would need to be processed. This may result in substantial performance improvements.

Spatial Data Option has undergone an architectural change, beginning with the 7.3.3 release. A reliance on partitioned tables has changed to utilize improved spatial indexing capable of handling complex geometries. However, for handling very large amounts (tens of gigabytes) of purely point data, keeping that data in partitioned tables may be more efficient than using the new spatial indexing scheme.

Table partitioning and spatial indexing are two very different techniques. While both are important to their respective users, this manual emphasizes and recommends the spatial indexing capabilities of Spatial Data Option.

See Chapter 4, "Partitioned Point Data" for a brief overview of partitioned point data.

^{1}
Point data can also be stored in a partitioned table. See Chapter 4, "Partitioned Point Data" for details.

^{2}
A <layername>_SDOGEOM table can have up to 255 columns. The maximum number of data columns is 255, minus 4 for the other required spatial columns, and minus any other user-defined columns. For polygon and linestrings, storing 16 to 20 ordinates per row is suggested, but not required.

Prev Next |
Copyright © 1997 Oracle Corporation. All Rights Reserved. |
Library |
Product |
Contents |
Index |