Oracle Java CAPS Master Index Standardization Engine Reference

Address Data Standardization Components

Standardization engines use tokens to determine how each field is parsed into its individual components and how each field value is normalized. Tokens also identify the field components to external applications, such as a master index application. The following table lists each token that the Master Index Standardization Engine generates for address data, along with the standardization component it represents. For addresses, you can specify only the predefined field tokens listed in this table unless you create a new data type or variant.

Table 3 Address Data Tokens

Token
Description
BoxDescript
Represents the PO box type from a standardized address field. By default, this is stored in the field_name_StName field in a master index database.
BoxIdentif
Represents the parsed PO box number from a standardized address field. By default, this is stored in the field_name_HouseNo field in a master index database.
CenterDescript
Represents the parsed structure description from a standardized address field. This address component is not included in the default master index standardization structure, but you can add it if needed.
CenterIdentif
Represents the parsed structure identifier from a standardized address field. This address component is not included in the default master index standardization structure, but you can add it if needed.
ExtraInfo
Represents any extra information that was not included in any of the other parsed components. This address component is not included in the default standardization structure, but you can add it if needed.
HouseClass
Represents the parsed house classification from a standardized address field. This address component is not included in the default master index standardization structure, but you can add it if needed.
HouseNumber
Represents the parsed house number from a standardized address field. By default, this is stored in the field_name_HouseNo field in a master index database.
HouseNumPrefix
Represents the parsed house number prefix from a standardized address field (such as the “A” in “A 1587 4th Street”). This address component is not included in the default master index standardization structure, but you can add it if needed.
HouseNumSuffix
Represents the parsed house number suffix from a standardized address field (such as the “B” in “5900 B Arnett Avenue”). This address component is not included in the default master index standardization structure, but you can add it if needed.
MatchPropertyName
Represents the parsed match property name from a standardized address field and is used internally by the standardization engine for blocking and phonetic encoding. This address component is not included in the default master index standardization structure, but you can add it if needed.
MatchStreetName
Represents the parsed and standardized street name from a standardized address field and is used internally by the standardization engine. If you want to store the standardized street name in the database (recommended), map this field to the street name field in the database. By default, this is stored in the field_name_StName field in a master index database.
OrigPropertyName
Represents the parsed original property name (such as the name of a complex or business park) from a standardized address field. This address component is not included in the default master index standardization structure, but you can add it if needed.
PropDesPrefDirection
Represents the parsed property direction from a standardized address field. This field ID handles cases where the direction is a prefix to the property description. By default, this is stored in the field_name_StDir field in a master index database.
PropDesPrefType
Represents the parsed property type from a standardized address field. This field ID handles cases where the street type is a prefix to the property description. By default, this is stored in the field_name_StType field in a master index database.
PropertySufDirection
Represents the parsed property direction from a standardized address field. This field ID handles cases where the direction is a suffix to the property description. By default, this is stored in the field_name_StDir field in a master index database.
PropertySufType
Represents the parsed property type from a standardized address field. This field ID handles cases where the street type is a suffix to the property description. By default, this is stored in the field_name_StType field in a master index database.
RuralRouteDescript
Represents the parsed rural route description from a standardized address field. By default, this is stored in the field_name_StName field in a master index database.
RuralRouteIdentif
Represents the parsed rural route identifier from a standardized address field. By default, this is stored in the field_name_HouseNo field in a master index database.
SecondHouseNumber
Represents the parsed second house number from a standardized address field. This address component is not included in the default master index standardization structure, but you can add it if needed.
SecondHouseNumberPrefix
Represents the parsed second house number prefix from a standardized address field (such as “25” in “25 319 10th Ave.”). This address component is not included in the default master index standardization structure, but you can add it if needed.
SecondStreetNameSufDirection
Represents the parsed second street direction from a standardized address field. This address component is not included in the default standardization structure, but you can add it if needed.
SecondStreetNameSufType
Represents the parsed second street type from a standardized address field. This address component is not included in the default standardization structure, but you can add it if needed.
OrigSecondStreetName
Represents the parsed second street name from a standardized address field (for example, an address might include a cross-street or a thoroughfare and dependent thoroughfare). This address component is not included in the default master index standardization structure, but you can add it if needed.
OrigStreetName
Represents the parsed street name from an address field. If you want to store the original street name in the database, map this field to the street name field in the database. This address component is not included in the default standardization structure, but you can add it if needed.
StreetNamePrefDirection
Represents the parsed street direction from a standardized address field. This field ID handles cases where the direction is a prefix to the street name. By default, this is stored in the field_name_StDir field in a master index database.
StreetNamePrefType
Represents the parsed street type from a standardized address field. This field ID handles cases where the street type is a prefix to the street name. By default, this is stored in the field_name_StType field in a master index database.
StreetNameSufDirection
Represents the parsed street direction from a standardized address field. This field ID handles cases where the direction is a suffix to the street name. By default, this is stored in the field_name_StDir field in a master index database.
StreetNameSufType
Represents the parsed street type from a standardized address field. This field ID handles cases where the street type is a suffix to the street name. By default, this is stored in the field_name_StType field in a master index database.
StreetNameExtensionIndex
Represents the parsed street name extension from a standardized address field. This address component is not included in the default standardization structure, but you can add it if needed.
WithinStructDescript
Represents the parsed internal descriptor (such as “Floor”) from a standardized address field. This address component is not included in the default standardization structure, but you can add it if needed.
WithinStructIdentif
Represents the parsed internal identifier (such as a floor number) from a standardized address field. This address component is not included in the default standardization structure, but you can add it if needed.
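
The default storage locations described in the table can be summarized as a lookup from token to field suffix. The following Python sketch is only an illustration and is not part of the standardization engine: the names DEFAULT_FIELD_SUFFIX, default_field, and tokens_sharing_field are hypothetical, and "AddressLine1" is an assumed stand-in for field_name. The suffixes themselves are transcribed from the table above.

```python
# Illustrative only: default token-to-field-suffix mapping transcribed from
# the address token table. Tokens not listed here are not part of the default
# standardization structure according to that table.
DEFAULT_FIELD_SUFFIX = {
    "BoxDescript": "StName",
    "BoxIdentif": "HouseNo",
    "HouseNumber": "HouseNo",
    "MatchStreetName": "StName",
    "PropDesPrefDirection": "StDir",
    "PropDesPrefType": "StType",
    "PropertySufDirection": "StDir",
    "PropertySufType": "StType",
    "RuralRouteDescript": "StName",
    "RuralRouteIdentif": "HouseNo",
    "StreetNamePrefDirection": "StDir",
    "StreetNamePrefType": "StType",
    "StreetNameSufDirection": "StDir",
    "StreetNameSufType": "StType",
}


def default_field(token, field_name):
    """Return the default field for a token, or None if the token is not
    included in the default structure (hypothetical helper)."""
    suffix = DEFAULT_FIELD_SUFFIX.get(token)
    if suffix is None:
        return None
    return f"{field_name}_{suffix}"


def tokens_sharing_field(suffix):
    """List, in alphabetical order, the tokens stored under one suffix."""
    return sorted(t for t, s in DEFAULT_FIELD_SUFFIX.items() if s == suffix)


if __name__ == "__main__":
    # "AddressLine1" is an assumed field name used only for this example.
    print(default_field("HouseNumber", "AddressLine1"))
    print(default_field("ExtraInfo", "AddressLine1"))
    print(tokens_sharing_field("HouseNo"))
```

The lookup makes one property of the table easy to see: several tokens share a default field. For example, the house number, PO box number, and rural route identifier all map to the field_name_HouseNo field.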