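The parse-plugins.xml file provides mappings of MIME types to parsers.
The file has two purposes: it maps each MIME type to one or more parser plugins, and it sets the order in which those plugins are invoked for that MIME type.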
Note that the name of this file is specified to the Web Crawler via the parse.plugin.file property in the default.xml configuration file.
This entry from the file shows how these parsing rules are set:
<mimeType name="text/xml">
  <plugin id="parse-html" />
  <plugin id="endeca-searchexport-converter-parser" />
</mimeType>
In this entry, the HtmlParser plugin (plugin id parse-html) is invoked first for a text/xml MIME type. If that plugin succeeds, parsing is finished. If it fails, the endeca-searchexport-converter-parser plugin is invoked.
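The ordered fallback described above can be sketched in a few lines of Python. This is only an illustration of the "try each parser in turn, stop at the first success" behavior. The function names, the parser signature, and the use of None to signal failure are assumptions made for the example, not the Web Crawler's actual plugin API.

```python
from typing import Callable, Optional, Sequence

# Hypothetical parser signature: returns the parsed text, or None on failure.
Parser = Callable[[bytes], Optional[str]]


def parse_with_fallback(content: bytes, parsers: Sequence[Parser]) -> Optional[str]:
    """Try each parser in order and return the first successful result."""
    for parser in parsers:
        result = parser(content)
        if result is not None:
            return result  # the first success ends the chain
    return None  # every parser in the chain failed


# Stand-ins for the two plugins in the text/xml entry (not the real plugins).
def html_parser(content: bytes) -> Optional[str]:
    # Pretend this parser only succeeds on content that contains an <html tag.
    return content.decode("utf-8") if b"<html" in content.lower() else None


def converter_parser(content: bytes) -> Optional[str]:
    # Pretend this parser accepts anything it is given.
    return "converted:" + content.decode("utf-8", errors="replace")


# Same order as the <plugin> elements in the text/xml entry.
chain = [html_parser, converter_parser]
```

With this chain, content that the first stand-in rejects is handed to the second one, mirroring the order of the plugin elements in the entry.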
Note that this entry:
<mimeType name="*">
  <plugin id="endeca-searchexport-converter-parser" />
</mimeType>
indicates that the endeca-searchexport-converter-parser plugin is invoked for any unmatched MIME type.
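To make the lookup rule concrete, the following Python sketch reads the two entries quoted in this topic and resolves the plugin order for a given MIME type, falling back to the "*" entry when there is no exact match. It is a model of the rule as described here, not the Web Crawler's own code. The parse-plugins root element and the helper names load_mappings and plugins_for are assumptions introduced for the example.

```python
import xml.etree.ElementTree as ET

# The two entries quoted in this topic, wrapped in an assumed root element
# so that the fragment is well-formed XML.
SAMPLE = """
<parse-plugins>
  <mimeType name="text/xml">
    <plugin id="parse-html" />
    <plugin id="endeca-searchexport-converter-parser" />
  </mimeType>
  <mimeType name="*">
    <plugin id="endeca-searchexport-converter-parser" />
  </mimeType>
</parse-plugins>
"""


def load_mappings(xml_text):
    """Return a dict of MIME type name -> plugin ids in document order."""
    root = ET.fromstring(xml_text)
    return {
        mime.get("name"): [plugin.get("id") for plugin in mime.findall("plugin")]
        for mime in root.findall("mimeType")
    }


def plugins_for(mime_type, mappings):
    """Return the plugin ids for mime_type, or the "*" entry if unmatched."""
    if mime_type in mappings:
        return mappings[mime_type]
    return mappings.get("*", [])


mappings = load_mappings(SAMPLE)
```

Calling plugins_for("text/xml", mappings) yields both plugin ids in the order they are listed, while an unlisted type such as application/pdf resolves to the single plugin in the "*" entry.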
In general, you should not modify the contents of this file unless you have written your own parser plugin.

