Creating a custom dictionary

This topic provides the steps to create your custom dictionary file.

To create a custom dictionary:

  1. Start a text editor that supports UTF-8 characters and lets you edit text in the language you want to supplement.
  2. Create a new UTF-8 encoded file.
  3. Add words to the dictionary. Start each word on a separate line that begins with the command STEM or COMPOUND, followed by the word or character, any optional attributes, and then a carriage return.
  4. Optionally, add comments to the file.

    Comments must begin with a pound sign (#). Blank lines are also allowed in the file.

  5. Save the dictionary file with the filename dictionary.<lang_code>.dict, where <lang_code> is one of the supported language codes (such as dictionary.de.dict for a German custom dictionary).

    The dictionary file name does not need to include a region code unless you are using a language such as Simplified Chinese (zh_CN) or Brazilian Portuguese (pt_BR). For example: dictionary.zh_CN.dict.

  6. Place the file in the $ENDECA_HOME/endeca-server/dgraph/olt directory.
  7. Re-index your data by re-ingesting it with your ETL tool.
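
If you prefer to generate the file with a script rather than a text editor, steps 2 through 5 can be sketched in Python as shown below. This is a minimal illustration, not part of the product. The helper names (dictionary_filename, format_entry, write_dictionary) are hypothetical, the sample words are placeholders, and the single space between the command and the word is an assumption, because this topic does not specify the separator. Optional attributes are left out for the same reason.

```python
import tempfile
from pathlib import Path

# Commands named in this topic; every entry line must begin with one of them.
VALID_COMMANDS = ("STEM", "COMPOUND")


def dictionary_filename(lang_code: str) -> str:
    """Build the file name dictionary.<lang_code>.dict (hypothetical helper)."""
    return f"dictionary.{lang_code}.dict"


def format_entry(command: str, word: str) -> str:
    """Format one entry line.

    Assumption: the command and the word are separated by a single space.
    Optional attributes are not handled in this sketch.
    """
    if command not in VALID_COMMANDS:
        raise ValueError(f"unsupported command: {command!r}")
    return f"{command} {word}"


def write_dictionary(directory, lang_code, entries, comment=None) -> Path:
    """Write a UTF-8 encoded dictionary file with one entry per line."""
    lines = []
    if comment:
        lines.append(f"# {comment}")  # comments begin with a pound sign
        lines.append("")              # blank lines are allowed
    lines.extend(format_entry(command, word) for command, word in entries)
    path = Path(directory) / dictionary_filename(lang_code)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Write to a temporary directory so that the sketch has no side effects.
    with tempfile.TemporaryDirectory() as tmp:
        sample_entries = [("STEM", "Haus"), ("COMPOUND", "Hausboot")]
        path = write_dictionary(tmp, "de", sample_entries,
                                comment="Sample German custom dictionary")
        print(path.name)
        print(path.read_text(encoding="utf-8"))
```

To use the result with the server, you would pass the directory from step 6 instead of a temporary directory, and then re-index as described in step 7.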