12.5 Data Structures for Geocoding
Oracle uses the following tables for geocoding.
-
GC_PARSER_PROFILES
-
GC_PARSER_PROFILEAFS
-
GC_COUNTRY_PROFILE
-
GC_AREA_<suffix>
-
GC_POSTAL_CODE_<suffix>
-
GC_ROAD_SEGMENT_<suffix>
-
GC_ROAD_<suffix>
-
GC_POI_<suffix>
-
GC_INTERSECTION_<suffix>
The GC_PARSER_PROFILES and GC_PARSER_PROFILEAFS tables store address format definitions of all supported counties. These tables are used by the internal address parser in parsing postal addresses into addressing fields. The data for these two tables is provided by your data provider or by Oracle. (If these tables are not supplied by your data provider, you will need to install and populate them as explained in Installing the Profile Tables.) The remaining tables store geocoding data provided by data vendors.
Each user that owns the tables containing geocoding data (that is, each user that can be specified with the username
parameter in a call to an SDO_GCDR subprogram) must have one GC_PARSER_PROFILES table, one GC_PARSER_PROFILEAFS table, and one GC_COUNTRY_PROFILE table. Each such user can have multiple sets of the other tables (GC_xxx_<suffix>). Each set of tables whose names end with the same suffix stores geocoding data of a country. For example, the following set of tables can be used to store geocoding data of the United States:
-
GC_AREA_US
-
GC_POSTAL_CODE_US
-
GC_ROAD_SEGMENT_US
-
GC_ROAD_US
-
GC_POI_US
-
GC_INTERSECTION_US
Geocoding data of one country cannot be stored in more than one set of those tables. The table suffix is defined by data venders and is specified in the GC_TABLE_SUFFIX column in the GC_COUNTRY_PROFILE table (described in GC_COUNTRY_PROFILE Table).
The following sections describe the vendor-supplied tables that store geocoding data, in alphabetical order by table name.
Indexes on Tables for Geocoding describes the indexes that you must create in order to use these tables for geocoding.
- GC_ADDRESS_POINT_<suffix> Table and Index
- GC_AREA_<suffix> Table
- GC_COUNTRY_PROFILE Table
- GC_INTERSECTION_<suffix> Table
- GC_PARSER_PROFILES Table
- GC_PARSER_PROFILEAFS Table
- GC_POI_<suffix> Table
- GC_POSTAL_CODE_<suffix> Table
- GC_ROAD_<suffix> Table
- GC_ROAD_SEGMENT_<suffix> Table
- Indexes on Tables for Geocoding
Parent topic: Geocoding Address Data
12.5.1 GC_ADDRESS_POINT_<suffix> Table and Index
The GC_ADDRESS_POINT_<suffix> table (for example, GC_ADDRESS_POINT_US) stores the geographic (latitude, longitude) coordinates for addresses in the country or group of countries associated with the table-name suffix. This table is not required for geocoding (although it is required for point-based geocoding); however, it enables the geocoder to provide more accurate location results. It is automatically used when present in the schema. This table contains one row for each address stored in the table, and it contains the columns shown in Table 12-7.
Table 12-7 GC_ADDRESS_POINT_<suffix> Table
Column Name | Data Type | Description |
---|---|---|
ADDRESS_POINT_ID |
NUMBER(10) |
ID number of the address point. (Required) |
ROAD_ID |
NUMBER |
ID number of the road on which the address point is located. (Required) |
ROAD_SEGMENT_ID |
NUMBER(10) |
ID number of the road segment on the road on which the address point is located. (Required) |
SIDE |
VARCHAR2(1) |
Side of the road on which the address point is located. Possible values: |
LANG_CODE |
VARCHAR2(3) |
3-letter ISO national language code for the language associated with the address point. (Required) point |
HOUSE_NUMBER |
VARCHAR2(600 CHAR) |
House number of the address point; may contain non-numeric characters. (Required) |
PERCENT |
NUMBER |
Decimal fraction of the length of the road segment on which the address point is located. It is computed by dividing the distance from the segment start point to the address point by the length of the road segment. (Required). |
ADDR_LONG |
NUMBER(10) |
Longitude coordinate value of the address point. (Required) |
ADDR_LAT |
NUMBER(10) |
Latitude coordinate value of the address point. (Required) |
COUNTRY_CODE_2 |
VARCHAR2(2) |
2- letter ISO country code of the country to which the address point belongs. (Required) |
PARTITION_ID |
NUMBER |
Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required) |
If you use the GC_ADDRESS_POINT_<suffix> table, you must create an index on the table using a statement in the following form:
CREATE INDEX idx_<suffix>_addrpt_addr ON gc_address_point_<suffix> (road_segment_id, road_id, house_number, side);
Parent topic: Data Structures for Geocoding
12.5.2 GC_AREA_<suffix> Table
The GC_AREA_<suffix> table (for example, CG_AREA_US) stores administration area information for the country associated with the table name suffix. This table contains one row for each administration area, and it contains the columns shown in Table 12-8.
Table 12-8 GC_AREA_<suffix> Table
Column Name | Data Type | Description |
---|---|---|
AREA_ID |
NUMBER(10) |
Area ID number. (Required) |
AREA_NAME |
VARCHAR2(64) |
Area name. (Required) |
LANG_CODE |
VARCHAR2(3) |
3-letter ISO national language code for the language associated with the area. (Required) |
ADMIN_LEVEL |
NUMBER(1) |
Administration hierarchy level for the area. (Required) |
LEVEL1_AREA_ID |
NUMBER(10) |
ID of the level-1 area to which the area belongs. In the administration hierarchy, the level-1 area is the country. (Required) |
LEVEL2_AREA_ID |
NUMBER(10) |
ID of the level-2 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional) |
LEVEL3_AREA_ID |
NUMBER(10) |
ID of the level-3 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional) |
LEVEL4_AREA_ID |
NUMBER(10) |
ID of the level-4 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional) |
LEVEL5_AREA_ID |
NUMBER(10) |
ID of the level-5 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional) |
LEVEL6_AREA_ID |
NUMBER(10) |
ID of the level-6 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional) |
LEVEL7_AREA_ID |
NUMBER(10) |
ID of the level-7 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional) |
CENTER_LONG |
NUMBER |
Longitude value of the center of the area. The center is set to the closest road segment to the center longitude and latitude values. Oracle recommends that these two attributes be set properly. If these values are not set, the longitude and latitude coordinates of the geocoded result of an area will be (0,0). (Optional) |
CENTER_LAT |
NUMBER |
Latitude value of the center of the area. (See the explanation for the CENTER_LONG column.) (Optional) |
ROAD_SEGMENT_ID |
NUMBER(10) |
ID of the road segment to which the area center is set. This value must be set correctly if the geocoder is intended to work with the Oracle Spatial routing engine (described in Routing Engine); otherwise, it can be set to any nonzero value, but it cannot be null. (Required) |
POSTAL_CODE |
VARCHAR2(16) |
Postal code for the center of the area. Oracle recommends that this attribute be set correctly. If this value is null, the postal code attribute of the geocoded result of an area will be null. (Optional) |
COUNTRY_CODE_2 |
VARCHAR2(2) |
2- letter ISO country code of the country to which the area belongs. (Required) |
PARTITION_ID |
NUMBER |
Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required) |
REAL_NAME |
VARCHAR2(64) |
The real name of the area, as spelled using the local language. This column is useful for area names that are not in English. For example, the German name of city |
IS_ALIAS |
VARCHAR2(1) |
Contains |
NUM_STREETS |
NUMBER |
The number of streets inside this area. (Optional) |
Parent topic: Data Structures for Geocoding
12.5.3 GC_COUNTRY_PROFILE Table
The GC_COUNTRY_PROFILE table stores country profile information used by the geocoder. This information includes administrative-area hierarchy definitions, the national languages, and the table-name suffix used by the data tables and their indexes. This table contains one row for each supported country, and it contains the columns shown in Table 12-9.
Table 12-9 GC_COUNTRY_PROFILE Table
Column Name | Data Type | Description |
---|---|---|
COUNTRY_NAME |
VARCHAR2(60) |
Country name. (Required) |
COUNTRY_CODE_3 |
VARCHAR2(3) |
3- letter ISO country code. (Required) |
COUNTRY_CODE_2 |
VARCHAR2(2) |
2- letter ISO country code. (Required) |
LANG_CODE_1 |
VARCHAR2(3) |
3-letter ISO national language code. Some countries might have multiple national languages, in which case LANG_CODE_2 and perhaps other LANG_CODE_n columns should contain values. (Required) |
LANG_CODE_2 |
VARCHAR2(3) |
3-letter ISO national language code. (Optional) |
LANG_CODE_3 |
VARCHAR2(3) |
3-letter ISO national language code. (Optional) |
LANG_CODE_4 |
VARCHAR2(3) |
3-letter ISO national language code. (Optional) |
NUMBER_ADMIN_LEVELS |
NUMBER(1) |
Number of administration hierarchy levels. A country can have up to 7 administration area levels, numbered from 1 to 7 (largest to smallest). The top level area (country) is level 1. For the United States, the administration hierarchy is as follows: level 1 = country, level 2 = state, level 3 = county, level 4 = city. (Required) |
SETTLEMENT_LEVEL |
NUMBER(1) |
Administration hierarchy level for a settlement, which is the lowest area level used in addressing. In the United States, this is the city level; in Europe, this is generally a subdivision of a city (level 5). (Required) |
MUNICIPALITY_LEVEL |
NUMBER(1) |
Administration hierarchy level for a municipality, which is the second-lowest area level used in addressing. In the United States, this is the county (level 3); in Europe, this is generally a city (level 4). (Optional) |
REGION_LEVEL |
NUMBER(1) |
Administrative level for the region, which is above the municipality level. In the United States, this is the state or third-lowest area level used in addressing (level 2); in Europe, this is a recognized subdivision of the country (level 2 or level 3). (Optional) |
SETTLEMENT_IS_OPTIONAL |
VARCHAR2(1) |
Contains |
MUNICIPALITY_IS_OPTIONAL |
VARCHAR2(1) |
Contains |
REGION_IS_OPTIONAL |
VARCHAR2(1) |
Contains |
POSTCODE_IN_SETTLEMENT |
VARCHAR(1) |
Contains |
SETTLEMENT_AS_CITY |
VARCHAR(1) |
Contains |
CACHED_ADMIN_AREA_LEVEL |
NUMBER |
(Reserved for future use.) |
GC_TABLE_SUFFIX |
VARCHAR2(5) |
Table name suffix identifying the country for the GC_* data tables. For example, if the value of GC_TABLE_SUFFIX is |
CENTER_LONG |
NUMBER |
Longitude value of the center of the area. (Optional) |
CENTER_LAT |
NUMBER |
Latitude value of the center of the area. (Optional) |
SEPARATE_PREFIX |
VARCHAR2(1) |
Contains |
SEPARATE_SUFFIX |
VARCHAR2(1) |
Contains |
SEPARATE_STYPE |
VARCHAR2(1) |
Contains |
AREA_ID |
NUMBER |
Not currently used by Oracle. (Optional) |
VERSION |
VARCHAR2(10) |
Version of the data. The first version should be |
Parent topic: Data Structures for Geocoding
12.5.4 GC_INTERSECTION_<suffix> Table
The GC_INTERSECTION_<suffix> table (for example, GC_INTERSECTION_US) stores information on road intersections for the country or group of countries associated with the table-name suffix. An intersection occurs when roads meet or cross each other. This table contains the columns shown in Table 12-10.
Table 12-10 GC_INTERSECTION_<suffix> Table
Column Name | Data Type | Description |
---|---|---|
ROAD_ID_1 |
NUMBER |
ID number of the first road on which the intersection is located. (Required) |
ROAD_SEGMENT_ID_1 |
NUMBER |
ID number of the road segment on the first road on which the intersection is located. (Required) |
ROAD_ID_2 |
NUMBER |
ID number of the second road on which the intersection is located. (Required) |
ROAD_SEGMENT_ID_2 |
NUMBER |
ID number of the road segment on the second road on which the intersection is located. (Required) |
INTS_LONG |
NUMBER |
Longitude coordinate value of the intersection. (Required) |
INTS_LAT |
NUMBER |
Latitude coordinate value of the intersection. (Required) |
HOUSE_NUMBER |
NUMBER |
The leading numerical part of the house number at the intersection. (See the explanation of house numbers after Table 12-16 in GC_ROAD_SEGMENT_<suffix> Table.) (Required) |
HOUSE_NUMBER_2 |
VARCHAR2(10) |
The second part of the house number at the intersection. (See the explanation of house numbers after Table 12-16 in GC_ROAD_SEGMENT_<suffix> Table.) (Required) |
SIDE |
VARCHAR2(1) |
Side of the street on which the house at the intersection is located. Possible values: |
COUNTRY_CODE_2 |
VARCHAR2(2) |
2- letter ISO country code of the country to which the house at the intersection belongs. (Required) |
PARTITION_ID |
NUMBER |
Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required) |
Parent topic: Data Structures for Geocoding
12.5.5 GC_PARSER_PROFILES Table
The GC_PARSER_PROFILES table stores information about keywords typically found in postal addresses. The geocoder uses keywords to identify address fields, such as house number, road name, city name, state name, and postal code. A keyword can be the type of street (such as road, street, drive, or avenue) or the prefix or suffix of a street (such as north, south, east, or west). This table contains the columns shown in Table 12-11.
Table 12-11 GC_PARSER_PROFILES Table
Column Name | Data Type | Description |
---|---|---|
COUNTRY_CODE |
VARCHAR2(2) |
2- letter ISO country code of the country for the keyword. (Required) |
KEYWORDS |
SDO_KEYWORDARRAY |
A single array of keywords for a specific address field. The array may contain a single word, or a group of words and abbreviations that can be used with the same meaning; for example, United States of America, USA, and United States all refer to the US. The first word of this array should be the official full name of the keyword, if there is any. The US uses over 400 keywords in parsing addresses. The following are some examples of keyword arrays and keywords from the US data set; however, only a single SDO_KEYWORDARRAY object is stored in each row: SDO_KEYWORDARRAY( 'UNITED STATES OF AMERICA','US', 'USA', 'UNITED STATES', 'U.S.A.', 'U.S.') SDO_KEYWORDARRAY('AVENUE','AV', 'AVE', 'AVEN', 'AVENU', 'AVN', 'AVNUE', 'AV.','AVE.') SDO_KEYWORDARRAY('40TH', 'FORTIETH') SDO_KEYWORDARRAY('NEW YORK','NY') SDO_KEYWORDARRAY('LIBRARY') |
OUTPUT_KEYWORD |
VARCHAR2(2000) |
A keyword used in the geocoder data to represent an address field. It must be the same as one of the keywords used in the keyword array. The output keyword is used to match the addresses stored in the geocoding data tables to the user's input, for example, if the output keyword
|
SECTION_LABEL |
VARCHAR2(30) |
A label used to identify the type of keyword represented in the KEYWORDS and OUTPUT_KEYWORD columns. There are the multiple different section labels; however, only a single section label for each row is used in identifying the type of keywords: COUNTRY_NAME: Identifies keywords that are used to represent country names. LOCALITY_KEYWORD_DICTIONARY: Identifies keywords that are used to replace words in a locality (city, state, province, and so on) with a standardized form of the word. For example, Saint is replaced by St; and by doing so, the city names Saint Thomas and St. Thomas will be standardized to St Thomas, which is stored in the database. PLACE_NAME_KEYWORD: Identifies a point of interest (POI) name keyword, such as for a restaurant or a hotel. REGION_LIST: Identifies keywords that are known names of regions, such as NY, New York, NH, and New Hampshire. The regions identified must be administrative areas that belong to the third-lowest area level or third-smallest area used in addressing. In the US this is the state level (the lowest area level or smallest area is the city level). SECOND_UNIT_KEYWORD: Identifies keywords used in second-unit descriptions, such as Floor, #, Suite, and Apartment. STREET_KEYWORD_DICTIONARY: Identifies keywords used to replace non-street-type keywords in street names (such as 40TH and Fortieth) with a standardized form. STREET_PREFIX_KEYWORD: Identifies street name prefix keywords, such as South, North, West, and East. STREET_TYPE_KEYWORD: Identifies street type keywords, such as Road, Street, and Drive. IN_LINE_STREET_TYPE_KEYWORD: Identifies street type keywords that are attached to street names, such as strasse in the German street name Steinstrasse. |
POSITION |
VARCHAR2(1) |
The position of the keyword relative to a street name. It indicates whether the keyword can precede ( |
SEPARATENESS |
VARCHAR2(1) |
Indicates whether the keyword is separate from a street name. Keywords are either separable ( |
Parent topic: Data Structures for Geocoding
12.5.6 GC_PARSER_PROFILEAFS Table
The GC_PARSER_PROFILEAFS table stores the XML definition of postal-address formats. An XML string describes each address format for a specific country. In the Oracle Geocoder 10g and earlier, the J2EE geocoder uses a country_name.ppr file instead of this table. The content of the country_name.ppr file is equivalent to the content of the ADDRESS_FORMAT_STRING attribute. This table contains the columns shown in Table 12-12.
Table 12-12 GC_PARSER_PROFILEAFS Table
Column Name | Data Type | Description |
---|---|---|
COUNTRY_CODE |
VARCHAR2(2) |
2- letter ISO country code of the country. (Required) |
ADDRESS_FORMAT_STRING |
CLOB |
XML string describing the address format for the country specified in the COUNTRY_CODE column. (Example 12-4 shows the XML definition for the US address format, and ADDRESS_FORMAT_STRING Description explains the elements used in the US address format definition.). |
Example 12-4 shows the ADDRESS_FORMAT_STRING definition for the US address format.
Example 12-4 XML Definition for the US Address Format
<address_format unit_separator="," replace_hyphen="true"> <address_line> <place_name /> </address_line> <address_line> <street_address> <house_number> <format form="0*" effective="0-1" output="$" /> <format form="0*1*" effective="0-1" output="$"> <exception form="0*TH" /> <exception form="0*ST" /> <exception form="0*2ND" /> <exception form="0*3RD" /> </format> <format form="0*10*" effective="0-1" output="$" /> <format form="0*-0*" effective="0-1" output="$" /> <format form="0*.0*" effective="0-1" output="$" /> <format form="0* 0*/0*" effective="0-1" output="$" /> </house_number> <street_name> <prefix /> <base_name /> <suffix /> <street_type /> <special_format> <format form="1* HWY 0*" effective="7-8" addon_effective="0-1" addon_output="$ HWY"/> <format form="1* HIGHWAY 0*" effective="11-12" addon_effective="0-1" addon_output="$ HWY"/> <format form="1* HWY-0*" effective="7-8" addon_effective="0-1" addon_output="$ HWY"/> <format form="1* HIGHWAY-0*" effective="11-12" addon_effective="0-1" addon_output="$ HWY"/> <format form="HWY 0*" effective="4-5" addon_output="HWY" /> <format form="HIGHWAY 0*" effective="8-9" addon_output="HWY" /> <format form="ROUTE 0*" effective="6-7" addon_output="RT" /> <format form="I 0*" effective="2-3" addon_output="I" /> <format form="11 0*" effective="3-4" addon_effective="0-1" /> <format form="I0*" effective="1-2" addon_output="I" /> <format form="I-0*" effective="2-3" addon_output="I" /> <format form="11-0*" effective="3-4" addon_effective="0-1" /> <format form="ROUTE-0*" effective="6-7" addon_output="RT" /> <format form="US0*" effective="2-3" addon_output="US" /> <format form="HWY-0*" effective="2-3" addon_output="US" /> <format form="HIGHWAY-0*" effective="8-9" addon_output="HWY" /> </special_format> </street_name> <second_unit> <special_format> <format form="# 0*" effective="2-3" output="APT $" /> <format form="#0*" effective="1-2" output="APT $" /> </special_format> </second_unit> </street_address> </address_line> <address_line> <po_box> <format form="PO BOX 0*" effective="7-8" /> <format form="P.O. BOX 0*" effective="9-10" /> <format form="PO 0*" effective="3-4" /> <format form="P.O. 0*" effective="5-6" /> <format form="POBOX 0*" effective="6-7" /> </po_box> </address_line> <address_line> <city optional="no" /> <region optional="no" order="1" /> <postal_code> <format form="00000" effective="0-4" /> <format form="00000-0000" effective="0-4" addon_effective="6-9" /> <format form="00000 0000" effective="0-4" addon_effective="6-9" /> </postal_code> </address_line> </address_format>
12.5.6.1 ADDRESS_FORMAT_STRING Description
The ADDRESS_FORMAT_STRING column of the GC_PARSER_PROFILEAFS table describes the format of address fields and their positioning in valid postal addresses. The address format string is organized by address lines, because postal addresses are typically written in multiple address lines.
The address parser uses the format description defined in the XML address format, combined with the keyword definition for each address field defined in the GC_PARSER_PROFILES table, to parse the input address and identify individual address fields.
<address_format> Element
The <address_format>
element includes the unit_separator
and replace_hyphen
attributes. The unit_separator
attribute is used to separate fields in the stored data. By default it is a comma (unit_separator=","
). The replace_hyphen
attribute specifies whether to replace all hyphens in the user's input with a space. By default it is set to true (replace_hyphen="true"
), that is, it is expected that all names in the data tables will contain a space instead of a hyphen.
If replace_hyphen="true"
, administrative-area names in the data tables containing hyphens will not be matched during geocoding if replace_hyphen="true"
; however, these area names with hyphens can be placed in the REAL_NAME column of the GC_AREA table to be returned as the administrative-area name in the geocoded result. Road names in the NAME column of the GC_ROAD table containing hyphens will, however, be matched during geocoding, but the matching performance will be degraded
<address_line> Elements
Each <address_line>
element in the XML address format string describes the format of an address line. Each <address_line>
element can have one or more child elements describing the individual address fields, such as street address, city, state (region or province), and postal code. These address field elements are listed in the order that the address fields appear in valid postal addresses. The optional
attribute of the address field element is set to "no"
if the address field is mandatory. By default, address field elements are optional.
<format> Elements
The format descriptions for house number, special street name, post box, and postal code elements are specified with a single or multiple <format>
elements. Each <format>
element specifies a valid layout and range of values for a particular address field. The following example illustrates the format used to define a special street name:
<format form="1* HWY 0*" effective="7-8" output="$" addon_effective="0-1" addon_output="$ HIGHWAY" />
The form
attribute uses a regular expression-like string to describe the format: 1
stands for any alphabetic letter; 0
stands for any numerical digit; 2
stands for any alphabetic letter or any numerical digit; 1*
specifies a sting consisting of all alphabetic letters; 0*
specifies a string consisting of all numerical digits; 2*
specifies a string consisting of any combination of numerical digits and alphabetic letters. All other symbols represent themselves.
Any string matching the pattern specified by the form
attribute is considered to be a valid string for its (parent) address field. A valid string can then be broken down into segments specified by the attributes effective
and addon_effective
. The effective attribute specifies the more important, primary piece of the address string; the addon_effective attribute specifies the secondary piece of the address string.
-
The
effective
attribute specifies a substring of the full pattern using the start and end positions for the end descriptor of theform
attribute. In the preceding example,effective="7-8"
retrieves the substring (counting from position 0) starting at position 7 and ending at position 8, which is the substring defined by0*
, at the end of theform
attribute. -
The
addon_effective
attribute specifies a substring of the full pattern using the start and end positions for the start descriptor of theform
attribute. In the preceding example,addon_effective="0-1"
retrieves the substring, (counting from position 0) starting at position 0 and ending at position 1, which is the substring defined by1*
, at the beginning of theform
attribute.
The output
and addon_output
attributes specify the output form of the address string for segments specified by the effective
and addon_effective
attributes, respectively. These output forms are used during address matching. The symbol $
stands for the matched string, and other symbols represent themselves. In the preceding example:
-
In
output="$"
, the$
stands for the substring that was matched in theeffective
attribute. -
In
addon_output="$ HIGHWAY"
, the$ HIGHWAY
stands for the substring that was matched in theaddon_effective
attribute, followed by a space, followed by the word HIGHWAY.
Using the <format>
element in the preceding example, with form="1* HWY 0*"
, the input string 'STATE HWY 580' will have effective=580
, output=580
, addon_effective=STATE
, and addon_output=STATE HIGHWAY
.
The <format>
element may also contain an <exception>
element. The <exception>
element specifies a string that has a valid form, but must be excluded from the address field. For example, in a <house_number>
element with valid numbers 0*1*
(that is, any numeric digits followed by any alphabetic letters), specifying <exception form="0*TH" />
means that any house number with (or without) numeric digits and ending with "TH" must be excluded.
Parent topic: GC_PARSER_PROFILEAFS Table
12.5.7 GC_POI_<suffix> Table
The GC_POI_<suffix> table (for example, GC_POI_US) stores point of interest (POI) information for the country or group of countries associated with the table name suffix. POIs include features such as airports, monuments, and parks. This table contains one or more rows for each point of interest. (For example, it can contain multiple rows for a POI if the POI is associated with multiple settlements.) The GC_POI_<suffix> table contains the columns shown in Table 12-13.
Table 12-13 GC_POI_<suffix> Table
Column Name | Data Type | Description |
---|---|---|
POI_ID |
NUMBER |
ID number of the POI. (Required) |
NAME |
VARCHAR2(64) |
Name of the POI. (Required) |
LANG_CODE |
VARCHAR2(3) |
3-letter ISO national language code for the language for the POI name. (Required) |
FEATURE_CODE |
NUMBER |
Feature code for the POI, if the data vendor classifies POIs by category. (Optional) |
HOUSE_NUMBER |
VARCHAR2(10) |
House number of the POI; may contain non-numeric characters. (Required) |
STREET_NAME |
VARCHAR2(80) |
Road name of the POI. (Required) |
SETTLEMENT_ID |
NUMBER(10) |
ID number of the settlement to which the POI belongs. (Required if the POI is associated with a settlement) |
MUNICIPALITY_ID |
NUMBER(10) |
ID number of the municipality to which the POI belongs. (Required if the POI is associated with a municipality) |
REGION_ID |
NUMBER(10) |
ID number of the region to which the POI belongs. (Required if the POI is associated with a region) |
SETTLEMENT_NAME |
VARCHAR2(64) |
Name of the settlement to which the POI belongs. (Required if the POI is associated with a settlement) |
MUNICIPALITY_NAME |
VARCHAR2(64) |
Name of the municipality to which the POI belongs. (Required if the POI is associated with a municipality) |
REGION_NAME |
VARCHAR2(64) |
Name of the region to which the POI belongs. (Required if the POI is associated with a region) |
POSTAL_CODE |
VARCHAR2(16) |
Postal code of the POI. (Required) |
VANITY_CITY |
VARCHAR2(35) |
Name of the city popularly associated with the POI, if it is different from the actual city containing the POI. For example, the London Heathrow Airport is actually located in a town named Hayes, which is part of greater London, but people tend to associate the airport only with London. In this case, the VANITY_CITY value is |
ROAD_SEGMENT_ID |
NUMBER |
ID of the road segment on which the POI is located. (Required) |
SIDE |
VARCHAR2(1) |
Side of the street on which the POI is located. Possible values: |
PERCENT |
NUMBER |
Percentage value at which the POI is located on the road. It is computed by dividing the distance from the street segment start point to the POI by the length of the street segment. (Required) |
TELEPHONE_NUMBER |
VARCHAR2(20) |
Telephone number of the POI. (Optional) |
LOC_LONG |
NUMBER |
Longitude coordinate value of the POI. (Required) |
LOC_LAT |
NUMBER |
Latitude coordinate value of the POI. (Required) |
COUNTRY_CODE_2 |
VARCHAR2(2) |
2- letter ISO country code of the country to which the POI belongs. (Required) |
PARTITION_ID |
NUMBER |
Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required) |
Parent topic: Data Structures for Geocoding
12.5.8 GC_POSTAL_CODE_<suffix> Table
The GC_POSTAL_CODE_<suffix> table (for example, GC_POSTAL_CODE_US) stores postal code information for the country or group of countries associated with the table-name suffix, if postal codes are used in the address format. This table contains one or more rows for each postal code; it may contain multiple rows for a postal code when the postal code is associated with multiple settlements. The GC_POSTAL_CODE_<suffix> table contains the columns shown in Table 12-14.
Table 12-14 GC_POSTAL_CODE_<suffix> Table
Column Name | Data Type | Description |
---|---|---|
POSTAL_CODE |
VARCHAR2(16) |
Postal code for the postal code area. (Required) |
SETTLEMENT_NAME |
VARCHAR2(64) |
Name of the settlement to which the postal code belongs. (Required if the postal code is associated with a settlement) |
MUNICIPALITY_NAME |
VARCHAR2(64) |
Name of the municipality to which the postal code belongs. (Required if the postal code is associated with a municipality) |
REGION_NAME |
VARCHAR2(64) |
Name of the region to which the postal code belongs. (Required if the postal code is associated with a region) |
LANG_CODE |
VARCHAR2(3) |
3-letter ISO national language code for the language associated with the area. (Required) |
SETTLEMENT_ID |
NUMBER(10) |
ID number of the settlement to which the postal code belongs. (Required if the postal code is associated with a settlement) |
MUNICIPALITY_ID |
NUMBER(10) |
ID number of the municipality to which the postal code belongs. (Required if the postal code is associated with a municipality) |
REGION_ID |
NUMBER(10) |
ID number of the region to which the postal code belongs. (Required if the postal code is associated with a region) |
CENTER_LONG |
NUMBER |
Longitude value of the center of the postal-code area. The center (longitude, latitude) value is set to the start- or end-point of the closest road segment to the center, depending on which point is closer. Oracle recommends that the CENTER_LONG and CENTER_LAT values be correctly set. If these values are not set, the longitude, latitude values of the geocoded result for an area will be (0,0). (Optional) |
CENTER_LAT |
NUMBER |
Latitude value of the center of the area. (See the explanation for the CENTER_LONG column.) (Optional) |
ROAD_SEGMENT_ID |
NUMBER(10) |
ID of the road segment to which the area center is set. This value must be set correctly if the geocoder is intended to work with the Oracle Spatial routing engine (described in Routing Engine); otherwise, it can be set to any nonzero value, but it cannot be null. (Required) |
COUNTRY_CODE_2 |
VARCHAR2(2) |
2- letter ISO country code of the country to which the area belongs. (Required) |
PARTITION_ID |
NUMBER |
Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required) |
NUM_STREETS |
NUMBER |
The number of streets inside this postal code area. (Optional) |
Parent topic: Data Structures for Geocoding
12.5.9 GC_ROAD_<suffix> Table
The GC_ROAD_<suffix> table (for example, GC_ROAD_US) stores road information for the country associated with the table name suffix. A road is a collection of road segments with the same name in the same settlement area; a road segment is defined in GC_ROAD_SEGMENT_<suffix> Table. The GC_ROAD_<suffix> table contains one or more rows for each road. (For example, it can contain multiple rows for a road if the road is associated with multiple settlements.) The GC_ROAD_<suffix> table contains the columns shown in Table 12-15.
Table 12-15 GC_ROAD_<suffix> Table
Column Name | Data Type | Description |
---|---|---|
ROAD_ID |
NUMBER |
ID number of the road. (Required) |
SETTLEMENT_ID |
NUMBER(10) |
ID number of the settlement to which the road belongs. (Required if the road is associated with a settlement) |
MUNICIPALITY_ID |
NUMBER(10) |
ID number of the municipality to which the road belongs. (Required if the road is associated with a municipality) |
PARENT_AREA_ID |
NUMBER(10) |
ID number of the parent area of the municipality to which the road belongs. (Required if the road is associated with a parent area) |
LANG_CODE |
VARCHAR2(3) |
3-letter ISO national language code for the language for the road name. (Required) |
NAME |
VARCHAR2(64) |
Name of the road, including the type (if any), the prefix (if any), and the suffix (if any). For example, |
BASE_NAME |
VARCHAR2(64) |
Name of the road, excluding the type (if any), the prefix (if any), and the suffix (if any). For example, |
PREFIX |
VARCHAR2(32) |
Prefix of the road name. For example, |
SUFFIX |
VARCHAR2(32) |
Suffix of the road name. For example, |
STYPE_BEFORE |
VARCHAR2(32) |
Street type that precedes the base name. For example, |
STYPE_AFTER |
VARCHAR2(32) |
Street type that follows the base name. For example, |
STYPE_ATTACHED |
VARCHAR2(1) |
Contains |
START_HN |
NUMBER(5) |
The lowest house number on the road. It is returned when a specified house number is lower than this value. |
CENTER_HN |
NUMBER(5) |
Leading numerical part of the center house number. The center house number is the left side house number at the start point of the center road segment, which is located in the center of the whole road. (See the explanation of house numbers after Table 12-16 in GC_ROAD_SEGMENT_<suffix> Table.) It is returned when no house number is specified in an input address. (Required) |
END_HN |
NUMBER(5) |
The highest house number on the road. It is returned when a specified house number is higher than this value. |
START_HN_SIDE |
VARCHAR2(1) |
Side of the road of the lowest house number: |
CENTER_HN_SIDE |
VARCHAR2(1) |
Side of the road of the center house number: |
END_HN_SIDE |
VARCHAR2(1) |
Side of the road of the highest house number: |
START_LONG |
NUMBER |
Longitude value of the lowest house number. |
START_LAT |
NUMBER |
Latitude value of the lowest house number. |
CENTER_LONG |
NUMBER |
Longitude value of the center house number. The center house number is the left side house number at the start point of the center road segment, which is located in the center of the whole road. (See the explanation of house numbers after Table 12-16 in GC_ROAD_SEGMENT_<suffix> Table.) (Required) |
CENTER_LAT |
NUMBER |
Latitude value of the center house number. (See also the explanation of the CENTER_LONG column.) (Required) |
END_LONG |
NUMBER |
Longitude value of the highest house number. |
END_LAT |
NUMBER |
Latitude value of the highest house number. |
START_ROAD_SEG_ID |
NUMBER(5) |
ID number of the road segment at the start of the road. |
CENTER_ROAD_SEG_ID |
NUMBER(5) |
ID number of the road segment at the center point of the road. (Required) |
END_ROAD_SEG_ID |
NUMBER(5) |
ID number of the road segment at the end of the road. |
POSTAL_CODE |
VARCHAR2(16) |
Postal code for the road. (Required) |
COUNTRY_CODE_2 |
VARCHAR2(2) |
2- letter ISO country code of the country to which the road belongs. (Required) |
PARTITION_ID |
NUMBER |
Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required) |
CENTER_HN2 |
VARCHAR2(10) |
The second part of the center house number. (See the explanation of house numbers after Table 12-16 in GC_ROAD_SEGMENT_<suffix> Table.) (Required) |
Parent topic: Data Structures for Geocoding
12.5.10 GC_ROAD_SEGMENT_<suffix> Table
The GC_ROAD_SEGMENT_<suffix> table (for example, GC_ROAD_SEGMENT_US) stores road segment information for the country associated with the table name suffix. A road segment is the portion of a road between two continuous intersections along the road; an intersection occurs when roads meet or cross each other. A road segment can also be the portion of a road between the start (or end) of the road and its closest intersection along the road, or it can be the entire length of a road if there are no intersections along the road. The GC_ROAD_SEGMENT_<suffix> table contains one row for each road segment, and it contains the columns shown in Table 12-16.
Table 12-16 GC_ROAD_SEGMENT_<suffix> Table
Column Name | Data Type | Description |
---|---|---|
ROAD_SEGMENT_ID |
NUMBER |
ID number of the road segment. A positive value, as explained in Relationship between Routing Engine and Geocoder. (Required) |
ROAD_ID |
NUMBER |
ID number of the road containing this road segment. (Required) |
L_ADDR_FORMAT |
VARCHAR2(1) |
Left side address format. Specify |
R_ADDR_FORMAT |
VARCHAR2(1) |
Right side address format. Specify |
L_ADDR_SCHEME |
VARCHAR2(1) |
Numbering scheme for house numbers on the left side of the road segment: |
R_ADDR_SCHEME |
VARCHAR2(1) |
Numbering scheme for house numbers on the right side of the road segment: |
START_HN |
NUMBER(5) |
The lowest house number on this road segment. (Required) |
END_HN |
NUMBER(5) |
The highest house number on this road segment. (Required) |
L_START_HN |
NUMBER(5) |
The leading numerical part of the left side starting house number. (See the explanation of house numbers after this table.) (Required) |
L_END_HN |
NUMBER(5) |
The leading numerical part of the left side ending house number. (See the explanation of house numbers after this table.) (Required) |
R_START_HN |
NUMBER(5) |
The leading numerical part of the right side starting house number. (See the explanation of house numbers after this table.) (Required) |
R_END_HN |
NUMBER(5) |
The leading numerical part of the right side ending house number. (See the explanation of house numbers after this table.) (Required) |
POSTAL_CODE |
VARCHAR2(16) |
Postal code for the road segment. If the left side and right side of the road segment belong to two different postal codes, create two rows for the road segment with identical values in all columns except for POSTAL_CODE. (Required) |
GEOMETRY |
SDO_GEOMETRY |
Spatial geometry object representing the road segment. (Required) |
COUNTRY_CODE_2 |
VARCHAR2(2) |
2- letter ISO country code of the country to which the road segment belongs. (Required) |
PARTITION_ID |
NUMBER |
Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required) |
L_START_HN2 |
VARCHAR2(10) |
The second part of the left side starting house number. (See the explanation of house numbers after this table.) (Required if the left side starting house number has a second part) |
L_END_HN2 |
VARCHAR2(10) |
The second part of the left side ending house number. (See the explanation of house numbers after this table.) (Required if the left side ending house number has a second part) |
R_START_HN2 |
VARCHAR2(10) |
The second part of the right side starting house number. (See the explanation of house numbers after this table.) (Required if the right side starting house number has a second part) |
R_END_HN2 |
VARCHAR2(10) |
The second part of the right side ending house number. (See the explanation of house numbers after this table.) (Required if the right side ending house number has a second part) |
A house number is a descriptive part of an address that helps identify the location of a establishment along a road segment. A house number is divided into two parts: the leading numerical part and the second part, which is the rest of the house number. The leading numerical part is the numerical part of the house number that starts from the beginning of the complete house number string and ends just before the first non-numeric character (if present). If the house number contains non-numeric characters, the second part of the house number is the portion from the first non-numeric character through the last character of the string. For example, if the house number is 123
, the leading numerical part is 123
and the second part is null; however, if the house number is 123A23
, the leading numerical part is 123
and the second part is A23
.
The starting house number is the house number at the start point of a road segment; the start point of the road segment is the first shape point of the road segment geometry. The ending house number is the house number at the end point of a road segment; the end point of the road segment is the last shape point of the road segment geometry. The left and right side starting house numbers do not need to be lower than the left and right side ending house numbers. The house number attributes in the data tables follow these conventions in locating establishments along road segments.
Parent topic: Data Structures for Geocoding
12.5.11 Indexes on Tables for Geocoding
To use the vendor-supplied tables for geocoding, indexes must be created on many of the tables, and the names of these indexes must follow certain requirements.
Example 12-5 lists the format of CREATE INDEX statements that create the required indexes. In each statement, you must use the index name, table name, column names, and (if multiple columns are indexed) sequence of column names as shown in Example 12-5, except that you must replace all occurrences of <suffix> with the appropriate string (for example, US
for the United States). Note that the first index in the example is a spatial index. Optionally, you can also include other valid keywords and clauses in the CREATE INDEX statements.
Example 12-5 Required Indexes on Tables for Geocoding
CREATE INDEX idx_<suffix>_road_geom ON gc_road_segment_<suffix> (geometry) INDEXTYPE IS mdsys.spatial_index_v2; CREATE INDEX idx_<suffix>_road_seg_rid ON gc_road_segment_<suffix> (road_id, start_hn, end_hn); CREATE INDEX idx_<suffix>_road_id ON gc_road_<suffix> (road_id); CREATE INDEX idx_<suffix>_road_setbn ON gc_road_<suffix> (settlement_id, base_name); CREATE INDEX idx_<suffix>_road_munbn ON gc_road_<suffix> (municipality_id, base_name); CREATE INDEX idx_<suffix>_road_parbn ON gc_road_<suffix> (parent_area_id, country_code_2, base_name); CREATE INDEX idx_<suffix>_road_setbnsd ON gc_road_<suffix> (settlement_id, soundex(base_name)); CREATE INDEX idx_<suffix>_road_munbnsd ON gc_road_<suffix> (municipality_id, soundex(base_name)); CREATE INDEX idx_<suffix>_road_parbnsd ON gc_road_<suffix> (parent_area_id, country_code_2, soundex(base_name)); CREATE INDEX idx_<suffix>_inters ON gc_intersection_<suffix> (country_code_2, road_id_1, road_id_2); CREATE INDEX idx_<suffix>_area_name_id ON gc_area_<suffix> (country_code_2, area_name, admin_level); CREATE INDEX idx_<suffix>_area_id_name ON gc_area_<suffix> (area_id, area_name, country_code_2); CREATE INDEX idx_<suffix>_poi_name ON gc_poi_<suffix> (country_code_2, name); CREATE INDEX idx_<suffix>_poi_setnm ON gc_poi_<suffix> (country_code_2, settlement_id, name); CREATE INDEX idx_<suffix>_poi_ munnm ON gc_poi_<suffix> (country_code_2, municipality_id, name); CREATE INDEX idx_<suffix>_poi_ regnm ON gc_poi_<suffix> (country_code_2, region_id, name); CREATE INDEX idx_<suffix>_ postcode ON gc_postal_code_<suffix> (country_code_2, postal_code); CREATE INDEX idx_<suffix>_addrpt_addr ON gc_address_point_<suffix> (road_segment_id, road_id, house_number, side);
Parent topic: Data Structures for Geocoding