12.1 Concepts for Geocoding

This topic describes concepts that you must understand before you use the Spatial geocoding capabilities.

12.1.1 Address Representation

Addresses to be geocoded can be represented either as formatted addresses or unformatted addresses.

A formatted address is described by a set of attributes for various parts of the address, which can include some or all of those shown in Table 12-1.

Table 12-1 Attributes for Formatted Address Representation

| Address Attribute | Description |
| --- | --- |
| Name | Place name (optional). |
| Intersecting street | Intersecting street name (optional). |
| Street | Street address, including the house or building number, street name, street type (Street, Road, Blvd, and so on), and possibly other information. In the current release, the first four characters of the street name must match a street name in the geocoding data for there to be a potential street name match. |
| Settlement | The lowest-level administrative area to which the address belongs. In most cases it is the city. In some European countries, the settlement can be an area within a large city, in which case the large city is the municipality. |
| Municipality | The administrative area above settlement. Municipality is not used for United States addresses. In European countries where cities contain settlements, the municipality is the city. |
| Region | The administrative area above municipality (if applicable), or above settlement if municipality does not apply. In the United States, the region is the state; in some other countries, the region is the province. |
| Postal code | Postal code (optional if administrative area information is provided). In the United States, the postal code is the 5-digit ZIP code. |
| Postal add-on code | String appended to the postal code. In the United States, the postal add-on code is typically the last four digits of a 9-digit ZIP code specified in "5-4" format. |
| Country | The country name or ISO country code. |

Formatted addresses are specified using the SDO_GEO_ADDR data type, which is described in SDO_GEO_ADDR Type.

An unformatted address is described using lines with information in the postal address format for the relevant country. The address lines must contain information essential for geocoding, and they might also contain information that is not needed for geocoding (something that is common in unprocessed postal addresses). An unformatted address is stored as an array of strings. For example, an address might consist of the following strings: '22 Monument Square' and 'Concord, MA 01742'.

Unformatted addresses are specified using the SDO_KEYWORDARRAY data type, which is described in SDO_KEYWORDARRAY Type.
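
The two representations can be pictured with a minimal Python sketch. The FormattedAddress class and the split_us_zip helper are hypothetical names used only for illustration; they are not the SDO_GEO_ADDR or SDO_KEYWORDARRAY types, and the country value "US" is an assumption about the sample address.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FormattedAddress:
    """Hypothetical stand-in for the attributes in Table 12-1 (not SDO_GEO_ADDR)."""
    name: Optional[str] = None
    intersecting_street: Optional[str] = None
    street: Optional[str] = None
    settlement: Optional[str] = None
    municipality: Optional[str] = None
    region: Optional[str] = None
    postal_code: Optional[str] = None
    postal_addon_code: Optional[str] = None
    country: Optional[str] = None


def split_us_zip(zip_text: str) -> Tuple[str, Optional[str]]:
    """Split a US ZIP code in "5-4" format into (postal code, add-on code).

    A plain 5-digit ZIP code yields None for the add-on code.
    """
    postal, _, addon = zip_text.strip().partition("-")
    return postal, (addon or None)


# Unformatted form: the postal address lines, kept as an array of strings.
unformatted: List[str] = ["22 Monument Square", "Concord, MA 01742"]

# Formatted form of the same address: one attribute per address part.
postal, addon = split_us_zip("01742")
formatted = FormattedAddress(
    street="22 Monument Square",
    settlement="Concord",
    region="MA",
    postal_code=postal,
    postal_addon_code=addon,
    country="US",  # assumed; the sample address does not state a country
)
```

Municipality is left unset here because it is not used for United States addresses.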

12.1.2 Match Modes

The match mode for a geocoding operation determines how closely the attributes of an input address must match the data being used for the geocoding. Input addresses can include different ways of representing the same thing (such as Street and the abbreviation St), and they can include minor errors (such as the wrong postal code, even though the street address and city are correct and the street address is unique within the city).

You can require an exact match between the input address and the data used for geocoding, or you can relax the requirements for some attributes so that geocoding can be performed despite certain discrepancies or errors in the input addresses. Table 12-2 lists the match modes and their meanings. Use a value from this table with the MatchMode attribute of the SDO_GEO_ADDR data type (described in SDO_GEO_ADDR Type) and for the match_mode parameter of a geocoding function or procedure.

Table 12-2 Match Modes for Geocoding Operations

| Match Mode | Description |
| --- | --- |
| EXACT | All attributes of the input address must match the data used for geocoding. However, if the house or building number, base name (street name), street type, street prefix, and street suffix do not all match the geocoding data, the location returned is in the first of the following that matches: postal code, city or town (settlement) within the state, or state. For example, if the street name is incorrect but a valid postal code is specified, a location in the postal code is returned. |
| RELAX_STREET_TYPE | The street type can be different from the data used for geocoding. For example, if Main St is in the data used for geocoding, Main Street would also match that, as would Main Blvd if there was no Main Blvd and no other street type named Main in the relevant area. |
| RELAX_POI_NAME | The name of the point of interest does not have to match the data used for geocoding. For example, if Jones State Park is in the data used for geocoding, Jones State Pk and Jones Park would also match as long as there were no ambiguities or other matches in the data. |
| RELAX_HOUSE_NUMBER | The house or building number and street type can be different from the data used for geocoding. For example, if 123 Main St is in the data used for geocoding, 123 Main Lane and 124 Main St would also match as long as there were no ambiguities or other matches in the data. |
| RELAX_BASE_NAME | The base name of the street, the house or building number, and the street type can be different from the data used for geocoding. For example, if Pleasant Valley is the base name of a street in the data used for geocoding, Pleasant Vale would also match as long as there were no ambiguities or other matches in the data. |
| RELAX_POSTAL_CODE | The postal code (if provided), base name, house or building number, and street type can be different from the data used for geocoding. |
| RELAX_BUILTUP_AREA | The address can be outside the city specified as long as it is within the same county. Also includes the characteristics of RELAX_POSTAL_CODE. |
| RELAX_ALL | Equivalent to RELAX_BUILTUP_AREA. |
| DEFAULT | Equivalent to RELAX_POSTAL_CODE. |
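
As a reading aid, the relaxations in Table 12-2 can be transcribed into a small Python lookup. This is only a sketch of the table, not geocoder code: the attribute labels, the RELAXED and ALIASES names, and the relaxed_attributes helper are hypothetical, and accepting lowercase mode names is an assumption of this sketch.

```python
from typing import Dict, FrozenSet

# Attributes that each match mode allows to differ from the geocoding data,
# transcribed from Table 12-2. The attribute labels are illustrative only.
_POSTAL_CODE_SET: FrozenSet[str] = frozenset(
    {"postal_code", "base_name", "house_number", "street_type"}
)

RELAXED: Dict[str, FrozenSet[str]] = {
    "EXACT": frozenset(),
    "RELAX_STREET_TYPE": frozenset({"street_type"}),
    "RELAX_POI_NAME": frozenset({"poi_name"}),
    "RELAX_HOUSE_NUMBER": frozenset({"house_number", "street_type"}),
    "RELAX_BASE_NAME": frozenset({"base_name", "house_number", "street_type"}),
    "RELAX_POSTAL_CODE": _POSTAL_CODE_SET,
    # Includes the characteristics of RELAX_POSTAL_CODE, plus the city.
    "RELAX_BUILTUP_AREA": _POSTAL_CODE_SET | {"builtup_area"},
}

# Modes that Table 12-2 defines as equivalent to another mode.
ALIASES: Dict[str, str] = {
    "RELAX_ALL": "RELAX_BUILTUP_AREA",
    "DEFAULT": "RELAX_POSTAL_CODE",
}


def relaxed_attributes(match_mode: str) -> FrozenSet[str]:
    """Return the attributes that may differ under the given match mode."""
    mode = match_mode.strip().upper()  # case folding is an assumption here
    mode = ALIASES.get(mode, mode)
    if mode not in RELAXED:
        raise ValueError(f"unknown match mode: {match_mode!r}")
    return RELAXED[mode]
```

For example, relaxed_attributes("DEFAULT") returns the same set as relaxed_attributes("RELAX_POSTAL_CODE"), which mirrors the last row of the table.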

12.1.3 Match Codes

The match code is a number indicating which input address attributes matched the data used for geocoding. The match code is stored in the MatchCode attribute of the output SDO_GEO_ADDR object (described in SDO_GEO_ADDR Type).

Table 12-3 lists the possible match code values.

Table 12-3 Match Codes for Geocoding Operations

| Match Code | Description |
| --- | --- |
| 1 | Exact match: the city name, postal code, street base name, street type (and suffix or prefix or both, if applicable), and house or building number match the data used for geocoding. |
| 2 | The city name, postal code, street base name, and house or building number match the data used for geocoding, but the street type, suffix, or prefix does not match. |
| 3 | The city name, postal code, and street base name match the data used for geocoding, but the house or building number does not match. |
| 4 | The city name and postal code match the data used for geocoding, but the street address does not match. |
| 10 | The city name matches the data used for geocoding, but the postal code does not match. |
| 11 | The postal code matches the data used for geocoding, but the city name does not match. |
| 12 | The region matches the data in the geocoder schema, but the city name and postal code do not match. |
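
An application that reads the MatchCode attribute might translate it into the set of matched attributes. The following Python sketch simply transcribes Table 12-3; the MATCHED_BY_CODE mapping, the attribute labels, and the matched_attributes helper are hypothetical names, and returning None for a code that the table does not list is a choice made for this sketch.

```python
from typing import Dict, Optional, Tuple

# Attributes that matched for each match code, transcribed from Table 12-3.
MATCHED_BY_CODE: Dict[int, Tuple[str, ...]] = {
    1: ("city", "postal_code", "street_base_name", "street_type", "house_number"),
    2: ("city", "postal_code", "street_base_name", "house_number"),
    3: ("city", "postal_code", "street_base_name"),
    4: ("city", "postal_code"),
    10: ("city",),
    11: ("postal_code",),
    12: ("region",),
}


def matched_attributes(match_code: int) -> Optional[Tuple[str, ...]]:
    """Return the matched attributes, or None if the code is not in Table 12-3."""
    return MATCHED_BY_CODE.get(match_code)


def house_number_matched(match_code: int) -> bool:
    """True only for codes whose description says the house number matched."""
    attrs = matched_attributes(match_code)
    return attrs is not None and "house_number" in attrs
```

With this mapping, codes 1 and 2 report a matched house or building number, while codes 3 and higher do not.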

12.1.4 Error Messages for Output Geocoded Addresses

Note:

You are encouraged to use the MatchVector attribute (see Match Vector for Output Geocoded Addresses) instead of the ErrorMessage attribute, which is described in this section.

For an output geocoded address, the ErrorMessage attribute of the SDO_GEO_ADDR object (described in SDO_GEO_ADDR Type) contains a string that indicates which address attributes have been matched against the data used for geocoding. Before the geocoding operation begins, the string is set to the value ???????????281C??, and the value is then modified to reflect which attributes have been matched.

Table 12-4 lists the character positions in the string and the address attribute corresponding to each position. It also lists the character value that the position is set to if the attribute is matched.

Table 12-4 Geocoded Address Error Message Interpretation

| Position | Attribute | Value If Matched |
| --- | --- | --- |
| 1-2 | (Reserved for future use) | ?? |
| 3 | Address point | X |
| 4 | POI name | O |
| 5 | House or building number | # |
| 6 | Street prefix | E |
| 7 | Street base name | N |
| 8 | Street suffix | U |
| 9 | Street type | T |
| 10 | Secondary unit | S |
| 11 | Built-up area or city | B |
| 12-13 | (Reserved) | (Ignore any values in these positions.) |
| 14 | Region | 1 |
| 15 | Country | C |
| 16 | Postal code | P |
| 17 | Postal add-on code | A |
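
The position-by-position reading of Table 12-4 can be sketched in Python as follows. The ERROR_MESSAGE_LAYOUT list and the matched_from_error_message helper are hypothetical names, and the sample string in the usage note is illustrative rather than output from a real geocoding run.

```python
from typing import List, Tuple

# (1-based position, attribute, character set when matched), from Table 12-4.
# Positions 1-2 and 12-13 are reserved and are skipped.
ERROR_MESSAGE_LAYOUT: List[Tuple[int, str, str]] = [
    (3, "Address point", "X"),
    (4, "POI name", "O"),
    (5, "House or building number", "#"),
    (6, "Street prefix", "E"),
    (7, "Street base name", "N"),
    (8, "Street suffix", "U"),
    (9, "Street type", "T"),
    (10, "Secondary unit", "S"),
    (11, "Built-up area or city", "B"),
    (14, "Region", "1"),
    (15, "Country", "C"),
    (16, "Postal code", "P"),
    (17, "Postal add-on code", "A"),
]


def matched_from_error_message(error_message: str) -> List[str]:
    """List the attributes whose position holds the 'matched' character."""
    if len(error_message) != 17:
        raise ValueError("expected a 17-character ErrorMessage string")
    return [
        attribute
        for position, attribute, matched_char in ERROR_MESSAGE_LAYOUT
        if error_message[position - 1] == matched_char
    ]
```

Applied to the illustrative string ????#ENUT?B281CP?, the helper lists every street component, the city, the region, the country, and the postal code. Applied to the initial value ???????????281C??, it lists only Region and Country, because that value already carries 1 and C in positions 14 and 15.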

12.1.5 Match Vector for Output Geocoded Addresses

For an output geocoded address, the MatchVector attribute of the SDO_GEO_ADDR object (described in SDO_GEO_ADDR Type) contains a string that indicates how each address attribute has been matched against the data used for geocoding. It gives more accurate and detailed information about the match status of each address attribute than the ErrorMessage attribute (described in Error Messages for Output Geocoded Addresses). Before the geocoding operation begins, the string is set to the value ?????????????????. Each character of this string indicates the match status of an address attribute.

Table 12-5 lists the character positions in the string and the address attribute corresponding to each position. Following the table is an explanation of what the value in each character position represents.

Table 12-5 Geocoded Address Match Vector Interpretation

| Position | Attribute |
| --- | --- |
| 1-2 | (Reserved for future use) |
| 3 | Address point location (not interpolated) |
| 4 | POI name |
| 5 | House or building number |
| 6 | Street prefix |
| 7 | Street base name |
| 8 | Street suffix |
| 9 | Street type |
| 10 | Secondary unit |
| 11 | Built-up area or city |
| 14 | Region |
| 15 | Country |
| 16 | Postal code |
| 17 | Postal add-on code |

Each character position in Table 12-5 can have one of the following numeric values:

  • 0: The input attribute is not null and is matched with a non-null value.

  • 1: The input attribute is null and is matched with a null value.

  • 2: The input attribute is not null and is replaced by a different non-null value.

  • 3: The input attribute is not null and is replaced by a null value.

  • 4: The input attribute is null and is replaced by a non-null value.
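
Combining Table 12-5 with the values above, a match vector can be decoded with a short Python sketch. The MATCH_VECTOR_LAYOUT and MATCH_STATUS mappings and the decode_match_vector helper are hypothetical names. Reporting any other character (such as the initial ?) as "not set" is an assumption of this sketch, and the sample vector in the usage note is illustrative.

```python
from typing import Dict

# 1-based position -> attribute, from Table 12-5 (reserved positions omitted).
MATCH_VECTOR_LAYOUT: Dict[int, str] = {
    3: "Address point location",
    4: "POI name",
    5: "House or building number",
    6: "Street prefix",
    7: "Street base name",
    8: "Street suffix",
    9: "Street type",
    10: "Secondary unit",
    11: "Built-up area or city",
    14: "Region",
    15: "Country",
    16: "Postal code",
    17: "Postal add-on code",
}

# Meaning of each numeric value, from the list that follows Table 12-5.
MATCH_STATUS: Dict[str, str] = {
    "0": "non-null input matched with a non-null value",
    "1": "null input matched with a null value",
    "2": "non-null input replaced by a different non-null value",
    "3": "non-null input replaced by a null value",
    "4": "null input replaced by a non-null value",
}


def decode_match_vector(match_vector: str) -> Dict[str, str]:
    """Map each attribute to the meaning of its match vector character."""
    if len(match_vector) != 17:
        raise ValueError("expected a 17-character MatchVector string")
    return {
        attribute: MATCH_STATUS.get(match_vector[position - 1], "not set")
        for position, attribute in MATCH_VECTOR_LAYOUT.items()
    }
```

For the illustrative vector ????0101010??004?, the decoder reports that the house number, street base name, street type, city, region, and country were non-null and matched, that the street prefix, street suffix, and secondary unit were null on both sides, and that a null input postal code was replaced by a non-null value.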