Oracle Commerce Guided Search - Specifying non-default language analysis

Specifying non-default language analysis

With the exception of Chinese, Japanese, and Korean (the CJK languages), you can set the language analysis for each language to either OLT or Latin-1. The CJK languages default to OLT analysis and cannot be configured to use Latin-1.

To change the default language analyzer for other languages:

For Dutch, English, English (UK), French, German, Italian, Portuguese, and Spanish, which default to Latin-1 analysis:
For Arabic, Czech, Danish, Greek, Hungarian, Polish, and Russian, which default to OLT analysis:
Note
The stemming.xml file accepts only a limited set of languages. To set the language analyzer for one of these languages, you must enable the language explicitly in the file. Additional languages, including those listed below in Step 3, are configured automatically based on the presence or absence of a custom stemming dictionary.
For Catalan, Croatian, Finnish, Hebrew, Persian (Farsi), Portuguese (Brazil), Norwegian (Bokmal and Nynorsk), Romanian, Serbian, Serbian (Latin), Slovak, Slovenian, Swedish, Thai, and Turkish, which default to OLT analysis:
1. Navigate to the MDEX\<version>\conf\stemming\custom directory.
2. Create a static stemming dictionary named <lang id>_word_forms_collection.xml.
  The presence of the static stemming dictionary is sufficient to change the language analyzer for that language to Latin-1.
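
The file-naming rule in the steps above can be sketched in a few lines of Python. This is only an illustration, not an Oracle tool: the helper names are hypothetical, the language ID `sv` is an assumed example value, and `<version>` is left as a placeholder for your installed MDEX version. The sketch builds the expected dictionary path and checks whether the file is present, which is the condition the steps describe.

```python
from pathlib import Path


def static_dictionary_path(custom_dir, lang_id):
    """Return the expected path of the static stemming dictionary.

    The file name follows the pattern from the steps above:
    <lang id>_word_forms_collection.xml
    """
    return Path(custom_dir) / f"{lang_id}_word_forms_collection.xml"


def has_static_dictionary(custom_dir, lang_id):
    """Return True if a static stemming dictionary file exists for lang_id."""
    return static_dictionary_path(custom_dir, lang_id).is_file()


if __name__ == "__main__":
    # "<version>" is a placeholder; "sv" is an assumed example language ID.
    custom_dir = Path("MDEX") / "<version>" / "conf" / "stemming" / "custom"
    path = static_dictionary_path(custom_dir, "sv")
    print(f"Expected dictionary: {path}")
    print(f"Present: {has_static_dictionary(custom_dir, 'sv')}")
```

The check only tests for the file's presence. It does not validate the dictionary's contents.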

Note

The Dgidx and Dgraph load custom dictionaries for all languages configured in the stemming.xml file.
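
As a recap, the defaults listed in this topic can be written as a small lookup table. The sketch below is a reading aid only: the set and function names are hypothetical, and the language names are copied from the lists above rather than taken from any product API.

```python
# Default language analysis per language, as listed in this topic.
CJK = {"Chinese", "Japanese", "Korean"}

LATIN1_DEFAULT = {
    "Dutch", "English", "English (UK)", "French",
    "German", "Italian", "Portuguese", "Spanish",
}

OLT_DEFAULT = {
    "Arabic", "Czech", "Danish", "Greek", "Hungarian", "Polish", "Russian",
    "Catalan", "Croatian", "Finnish", "Hebrew", "Persian (Farsi)",
    "Portuguese (Brazil)", "Norwegian (Bokmal and Nynorsk)", "Romanian",
    "Serbian", "Serbian (Latin)", "Slovak", "Slovenian", "Swedish",
    "Thai", "Turkish",
}


def default_analysis(language):
    """Return the default analysis ("OLT" or "Latin-1") for a listed language."""
    if language in CJK or language in OLT_DEFAULT:
        return "OLT"
    if language in LATIN1_DEFAULT:
        return "Latin-1"
    raise KeyError(f"Language not listed in this topic: {language}")


def can_change_analysis(language):
    """CJK languages cannot be switched to Latin-1; other listed languages can."""
    default_analysis(language)  # raises KeyError for unlisted languages
    return language not in CJK
```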
