Normalizing themes

You can normalize a number of discovered themes into a single reported theme.

For example, suppose theme processing discovers iPhones, iPads, Androids, and Blackberries. Rather than reporting a separate theme for each device, you would prefer to output one theme: "smart device". Normalizing themes provides this capability.

Create a text file named normalization.dat and store it in the directory %LEXALYTICS_HOME%/data/themes. Each line of the file maps one stemmed theme to its normalized theme, where [tab] stands for a tab character:
<stemmed_theme>[tab]<normalized_theme>
For example, to implement the normalization case for smart devices described earlier, your normalization.dat file would include the following data:
iPhone[tab]smart device
iPad[tab]smart device
Android[tab]smart device
Blackberry[tab]smart device
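
Because tab characters are easy to lose when a file is edited by hand, one option is to generate normalization.dat from a script. The following Python sketch writes the smart device mappings shown above. The function name write_normalization_file, the UTF-8 encoding, and the use of a LEXALYTICS_HOME environment variable are assumptions made for illustration; they are not part of the product.

```python
import os
from pathlib import Path


def write_normalization_file(mapping, themes_dir):
    """Write one '<stemmed_theme>\t<normalized_theme>' line per mapping entry.

    Hypothetical helper: the name and the UTF-8 encoding are assumptions.
    """
    themes_dir = Path(themes_dir)
    themes_dir.mkdir(parents=True, exist_ok=True)
    target = themes_dir / "normalization.dat"

    lines = []
    for stemmed, normalized in mapping.items():
        # A tab inside a theme would make the line ambiguous, so reject it.
        if "\t" in stemmed or "\t" in normalized:
            raise ValueError(f"theme contains a tab: {stemmed!r} -> {normalized!r}")
        lines.append(f"{stemmed}\t{normalized}")

    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    smart_devices = {
        "iPhone": "smart device",
        "iPad": "smart device",
        "Android": "smart device",
        "Blackberry": "smart device",
    }
    # Assumes LEXALYTICS_HOME is set in the environment; falls back to the
    # current directory so the sketch can still be tried out.
    home = os.environ.get("LEXALYTICS_HOME", ".")
    print(write_normalization_file(smart_devices, Path(home) / "data" / "themes"))
```

Keeping the mappings in one dictionary also makes it easier to spot inconsistent normalized names, such as "smartdevice" next to "smart device", which would otherwise be reported as two different themes.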

In the text enrichment properties file, add the property te.theme-type.enabled with the value normalized. If you do not specify this property, stemmed themes will be output. If you specify the property incorrectly, the graph will fail.

Note: Any theme specified in the normalization.dat file will be normalized. As a result, you may see normalization occurring even in graphs where you have not specified te.theme-type.enabled=normalized. If you do not want this behavior, delete normalization.dat and use standard themes.
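
If themes are being normalized unexpectedly, a quick first step is to confirm whether a normalization.dat file is present. The following Python sketch only checks the location described above and does not delete anything. The function name find_normalization_file and the use of a LEXALYTICS_HOME environment variable are assumptions made for illustration.

```python
import os
from pathlib import Path


def find_normalization_file(lexalytics_home):
    """Return the path to normalization.dat if it exists, otherwise None.

    Hypothetical helper: it only looks in <home>/data/themes.
    """
    candidate = Path(lexalytics_home) / "data" / "themes" / "normalization.dat"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    home = os.environ.get("LEXALYTICS_HOME")
    if home is None:
        print("LEXALYTICS_HOME is not set; cannot locate data/themes.")
    else:
        found = find_normalization_file(home)
        if found is None:
            print("No normalization.dat found; themes will not be normalized.")
        else:
            print(f"Found {found}; themes listed in it will be normalized.")
```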