5.2 Analyzers
The Analyzer assists compliance teams and technical users in understanding how specific input data—such as names or addresses—is tokenized, normalized, and matched against watchlist entries. It enables detailed examination of synonym expansion, token classification, and match scoring mechanisms that are essential for fine-tuning the screening engine and ensuring accurate detection of true positives while minimizing false positives.
Configure Analyzer on Index Management
- From the Navigation List, click Watchlist Management > Index Management. The Index Management page displays a list of available watchlists.
- Click Index JSON to view the JSON file details associated with this watchlist.
Figure 5-1 Edit icon

The Edit Index JSON pop-up appears.
- You can edit the JSON in this window.
For example, to use Country Synonyms for Country Name under the Prohibited Country Watchlist, edit the Country List's index JSON and change the analyzerType of the relevant attribute ("name" : "v_country_name") to address. In the sample index JSON below, each block in the attributes array defines the pre-processing for one field, and analyzerType accepts any analyzer listed under Analyzer Types.

```json
{
  "schemaName" : "",
  "runSkey" : 68,
  "batchRunId" : "WLDJWLoad_2024-03-28_1711619630679_1",
  "tableName" : "FCC_WL_DJW_V",
  "deletedProfilesTableName" : null,
  "filterCondition" : "1=1",
  "indexName" : "fcc_idx_djw",
  "indexAlias" : "idx_djw",
  "disasterRecovery" : false,
  "indexLogicalName" : "Watchlist",
  "indexBusinessName" : "Dow Jones",
  "indexKeyAttribute" : "n_uid",
  "loadType" : "FullLoad",
  "shards" : 3,
  "replicas" : 4,
  "attributes" : [
    { "name" : "v_given_name", "type" : "text", "similarity" : "boolean", "analyzerType" : "name", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_family_name", "type" : "text", "similarity" : "boolean", "analyzerType" : "namestop", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_full_name", "type" : "text", "similarity" : "boolean", "analyzerType" : "namestop", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_aliases_given_name", "type" : "text", "similarity" : "boolean", "analyzerType" : "namestop", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_aliases_family_name", "type" : "text", "similarity" : "boolean", "analyzerType" : "name", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_entity_name", "type" : "text", "similarity" : "boolean", "analyzerType" : "organization", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_entity_name_bus_strip", "type" : "text", "similarity" : "boolean", "analyzerType" : "organization", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_original_script_name", "type" : "text", "similarity" : "boolean", "analyzerType" : "name", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_date_of_births", "type" : "text", "similarity" : "boolean", "analyzerType" : "date", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_passports", "type" : "text", "similarity" : "boolean", "analyzerType" : "name", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_ssn", "type" : "text", "similarity" : "boolean", "analyzerType" : "name", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_identification_numbers", "type" : "text", "similarity" : "boolean", "analyzerType" : "name", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_city", "type" : "text", "similarity" : "boolean", "analyzerType" : "address", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_country", "type" : "text", "similarity" : "boolean", "analyzerType" : "address", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_nationality", "type" : "text", "similarity" : "boolean", "analyzerType" : "address", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_residence", "type" : "text", "similarity" : "boolean", "analyzerType" : "address", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_yob", "type" : "text", "similarity" : "boolean", "analyzerType" : "name", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_min_yob", "type" : "integer", "similarity" : "boolean", "analyzerType" : null, "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_max_yob", "type" : "integer", "similarity" : "boolean", "analyzerType" : null, "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_address", "type" : "text", "similarity" : "boolean", "analyzerType" : "address", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_aliases", "type" : "text", "similarity" : "boolean", "analyzerType" : "namestop", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_gender", "type" : "text", "similarity" : "boolean", "analyzerType" : "gender", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_place_of_birth", "type" : "text", "similarity" : "boolean", "analyzerType" : "address", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null },
    { "name" : "v_title", "type" : "text", "similarity" : "boolean", "analyzerType" : "gender", "searchAnalyzerType" : null, "fields" : [ ], "termVector" : null }
  ],
  "customAnalyzer" : [ ],
  "customFilter" : [ ],
  "customCharFilter" : [ ],
  "customTokenizer" : [ ],
  "others" : [ "n_wl_skey", "n_run_skey", "v_wl_sub_type", "v_wl_type", "v_entity_type", "n_uid" ],
  "replaceEmptyFields" : [ ],
  "replaceCharFields" : [
    { "name" : "v_full_name", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] },
    { "name" : "v_family_name", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] },
    { "name" : "v_given_name", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] },
    { "name" : "v_entity_name", "charArray" : [ "-.", "''", "&", "()" ], "replaceWith" : [ " ", "", "and", " " ] },
    { "name" : "v_entity_name_bus_strip", "charArray" : [ "-.", "''", "&", "()" ], "replaceWith" : [ "", "", "and", " " ] },
    { "name" : "v_original_script_name", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] },
    { "name" : "v_country", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] },
    { "name" : "v_address", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] },
    { "name" : "v_aliases_given_name", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] },
    { "name" : "v_aliases_family_name", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] },
    { "name" : "v_aliases", "charArray" : [ "-.", "''" ], "replaceWith" : [ " ", "" ] }
  ],
  "translateFields" : [ "v_family_name", "v_given_name", "v_full_name", "v_aliases_family_name", "v_aliases_given_name", "v_aliases" ]
}
```
- Click Validate to verify that your edits to the JSON are valid.
- Click Save to update the JSON or click Cancel to close without saving your changes.
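
The analyzerType change described in the steps above can also be sketched as a small script, which may help when reviewing an edit before pasting it into the Edit Index JSON window. This is a minimal sketch, not part of the product: the function name `set_analyzer_type`, the sample index name `idx_country`, and the set of accepted values (taken only from the analyzer names visible in the sample index JSON) are assumptions made for illustration.

```python
import json

# Analyzer type values that appear in the sample index JSON in this section.
# Assumption: this set is illustrative, not an exhaustive list of accepted values.
KNOWN_ANALYZER_TYPES = {"name", "namestop", "organization", "date", "address", "gender"}


def set_analyzer_type(index_json: str, attribute_name: str, analyzer_type: str) -> str:
    """Return the index JSON with one attribute's analyzerType changed."""
    if analyzer_type not in KNOWN_ANALYZER_TYPES:
        raise ValueError(f"unrecognized analyzer type: {analyzer_type!r}")
    index = json.loads(index_json)  # raises json.JSONDecodeError on malformed JSON
    for attribute in index.get("attributes", []):
        if attribute.get("name") == attribute_name:
            attribute["analyzerType"] = analyzer_type
            return json.dumps(index, indent=2)
    raise KeyError(f"attribute not found: {attribute_name}")


# Hypothetical, heavily trimmed index JSON for a country list.
sample = (
    '{"indexName": "idx_country", "attributes": '
    '[{"name": "v_country_name", "type": "text", "analyzerType": "name"}]}'
)
updated = json.loads(set_analyzer_type(sample, "v_country_name", "address"))
print(updated["attributes"][0]["analyzerType"])  # address
```

The script only checks that the text parses as JSON and that the attribute exists. The Validate button in the pop-up remains the authoritative check.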
Analyzer Types
Table 5-3 Analyzer Types
| Analyzer Type | Supported Filters | Type | Description |
|---|---|---|---|
| Name | Individual Name Synonyms; Individual Title | Synonym; Stop Word | The Name analyzer processes person names by applying standardization rules such as name synonyms (e.g., Will - William) and removal of non-essential titles (e.g., Mr., Dr.). This ensures better match results during screening. |
| Address | Country Synonyms | Synonym | The Address analyzer processes address components by resolving country-specific synonyms (e.g., USA - United States, UK - United Kingdom). This enhances consistency and accuracy in location-based matching. |
| Phone | No Token Filters | - | The Phone Analyzer tokenizes and indexes phone numbers without applying any additional filters. It enables straightforward and direct matching of phone number values during screening. |
| Email | No Token Filters | - | The Email Analyzer processes email addresses as exact tokens without applying any transformations or filters. It is designed for direct string matching, ensuring that the full email address is preserved for accurate comparison. |
| Organization | Organization Numbers; Organization Suffix; Organization Strip Words | Synonym; Stop Word; Stop Word | The Organization analyzer standardizes organization names by removing common suffixes (e.g., Inc, Ltd), normalizing common terms, and ignoring non-distinct or generic words. This improves matching for corporate entity names. |
| Gender | Individual Gender | Synonym | The Gender analyzer handles gender-related fields by resolving known synonyms (e.g., F - Female, M - Male) for consistent identity matching. |
| Date | No Token Filters | - | Dates are indexed as-is with no transformation or filtering. |
| Name Stop | Individual Name Synonyms; Individual Title; Individual Name Strip Words | Synonym; Stop Word; Stop Word | The Name Stop Analyzer cleans and normalizes names by removing non-essential tokens and standardizing known variations. This helps improve match accuracy during screening. |
| TF Analyzer | No Token Filters | - | The TF Analyzer is used in Transaction Filtering to tokenize and normalize input data by applying filters like lower casing, stop word removal, and synonym resolution for improved match accuracy. |
| Document ID | No Token Filters | - | A custom analyzer that tokenizes text using delimiters (comma, semicolon, tilde, parentheses, and space), converts tokens to lowercase, and removes duplicates. This analyzer is intended for indexing document identifiers such as national ID numbers or other personal identification numbers. |
| Organization Strip | Organization Numbers; Organization Stop Words | Synonym; Stop Word | The Organization Strip Analyzer strips common suffixes or terms from organization names that don't add unique value to the name, improving match consistency across similar entries. This helps normalize similar entries like ABC Technologies Limited and ABC Tech Ltd. to a comparable form for accurate screening. |
| Alphanumeric Keyword | No Token Filters | - | A custom analyzer that splits text on any sequence of non-letter and non-digit characters, converts all characters to lowercase, and applies ASCII folding to normalize accented or special characters to their ASCII equivalents. |
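
To make the tokenization rules in the Document ID and Alphanumeric Keyword rows more concrete, the following Python sketch imitates the behavior described in the table. It is an illustration under stated assumptions, not the screening engine's implementation: the function names are hypothetical, and ASCII folding is approximated with Unicode NFKD decomposition, which drops characters that have no ASCII decomposition instead of mapping them.

```python
import re
import unicodedata


def document_id_tokens(text: str) -> list[str]:
    """Split on comma, semicolon, tilde, parentheses, and whitespace,
    lowercase each token, then drop duplicates (first occurrence wins)."""
    tokens = [t.lower() for t in re.split(r"[,;~()\s]+", text) if t]
    return list(dict.fromkeys(tokens))  # dict preserves insertion order


def alphanumeric_keyword_tokens(text: str) -> list[str]:
    """Split on runs of non-letter, non-digit characters, lowercase,
    then fold accented characters to ASCII (approximation via NFKD)."""
    tokens = []
    for raw in re.split(r"[\W_]+", text):
        folded = (
            unicodedata.normalize("NFKD", raw.lower())
            .encode("ascii", "ignore")
            .decode("ascii")
        )
        if folded:  # skip empty pieces and tokens that fold to nothing
            tokens.append(folded)
    return tokens


print(document_id_tokens("AB123456;ab123456 (X-99)~P77"))
# ['ab123456', 'x-99', 'p77']
print(alphanumeric_keyword_tokens("Jos\u00e9-Mar\u00eda O'Neil #42"))
# ['jose', 'maria', 'o', 'neil', '42']
```

Note that the Document ID sketch keeps the hyphen in `x-99` because a hyphen is not among the listed delimiters, whereas the Alphanumeric Keyword sketch splits on it.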