Rather than supplement a default stemming dictionary, you may choose to replace it entirely with a custom stemming dictionary.

To replace a default stemming dictionary with a custom stemming dictionary:

  1. Create a custom dictionary file with stemming entries. For the expected XML structure, see the XML schema of any default stemming dictionary stored in <install path>\MDEX\<version>\conf\stemming.

    For example, this simplified English stemming dictionary contains one term and one stemmed variant:

    <?xml version="1.0"?>
    <WORD_FORMS_COLLECTION>
      <WORD_FORMS>
        <WORD_FORM>car</WORD_FORM>
        <WORD_FORM>cars</WORD_FORM>
      </WORD_FORMS>
    </WORD_FORMS_COLLECTION>

  2. When you have created the custom stemming dictionary, save the XML file under a name that starts with the language code of the dictionary.

    For example, the XML above would be saved as en_word_forms_collection.xml, where en is the ISO 639-1 code for English.

  3. Place the XML file in <install path>\MDEX\<version>\conf\stemming\custom.

  4. Open your project in Developer Studio.

  5. In the Project Explorer, expand Search Configuration.

  6. Double-click Stemming to display the Stemming editor.

  7. Uncheck the language whose default dictionary you want to replace.

  8. Click OK.

  9. When you run Dgidx, specify the --lang flag with a <lang id> argument that matches the language code of the custom stemming dictionary file.

    For the English (en) dictionary in the example above, you would specify:

    dgidx --lang en
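
Steps 1 through 3 can also be scripted. The following is a minimal Python sketch, not part of the product: the function names (build_dictionary, dictionary_filename, write_dictionary) are hypothetical, and the file name pattern <lang id>_word_forms_collection.xml is an assumption inferred from the English example above. The sketch writes only the elements shown in the simplified example, so compare the output against a default dictionary before relying on it.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def build_dictionary(groups):
    """Build a WORD_FORMS_COLLECTION element.

    `groups` is a list of lists; each inner list holds the word forms
    that should stem to one another, for example ["car", "cars"].
    """
    root = ET.Element("WORD_FORMS_COLLECTION")
    for forms in groups:
        word_forms = ET.SubElement(root, "WORD_FORMS")
        for form in forms:
            ET.SubElement(word_forms, "WORD_FORM").text = form
    return root


def dictionary_filename(lang_id):
    # Assumption: the name pattern is inferred from the single example
    # en_word_forms_collection.xml given in this procedure.
    return f"{lang_id}_word_forms_collection.xml"


def write_dictionary(groups, lang_id, target_dir):
    """Write the dictionary into target_dir and return the file path.

    target_dir is supplied by the caller; for a real deployment it would
    be the conf\\stemming\\custom directory named in step 3.
    """
    target = Path(target_dir) / dictionary_filename(lang_id)
    tree = ET.ElementTree(build_dictionary(groups))
    if hasattr(ET, "indent"):  # ET.indent is available in Python 3.9+
        ET.indent(tree)
    tree.write(target, encoding="utf-8", xml_declaration=True)
    return target
```

For the English example, calling write_dictionary([["car", "cars"]], "en", custom_dir), where custom_dir is the custom directory from step 3, would produce a file named en_word_forms_collection.xml containing one WORD_FORMS group. The remaining steps (the Developer Studio settings and the --lang flag) still have to be done by hand.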
