The Endeca Web Crawler is based on the Apache Nutch open-source project. As a result, its major functionality is implemented as plugins. The same plugin framework lets you write your own plugins, for example plugins that extract additional content from Web pages.
The sample plugin demonstrates how to integrate custom plugins into the Web Crawler. The Endeca Web Crawler APIs contain sample code and documentation to help you create your own plugins.
All plugins, both the default plugins and user-created plugins, reside in the IAS\<version>\lib\web-crawler\plugins directory. Each plugin directory contains one or more JAR files and a plugin descriptor file named plugin.xml.
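
The layout described above can be checked with a small standalone program. The following is a minimal sketch, not part of the Endeca Web Crawler API: the class name PluginDirectoryCheck and the method findCompletePlugins are hypothetical, and the sketch assumes that each plugin sits in its own subdirectory directly under the plugins directory. It uses only the standard java.nio.file classes and reports which subdirectories contain both a plugin.xml file and at least one JAR file.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the Endeca Web Crawler API.
public class PluginDirectoryCheck {

    // Returns the sorted names of subdirectories of pluginsRoot that
    // contain a plugin.xml file and at least one *.jar file.
    // Assumption: one subdirectory per plugin, directly under pluginsRoot.
    public static List<String> findCompletePlugins(Path pluginsRoot) throws IOException {
        List<String> complete = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(pluginsRoot)) {
            for (Path dir : dirs) {
                if (!Files.isDirectory(dir)) {
                    continue; // skip loose files in the plugins directory
                }
                boolean hasDescriptor = Files.isRegularFile(dir.resolve("plugin.xml"));
                boolean hasJar;
                try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
                    hasJar = jars.iterator().hasNext();
                }
                if (hasDescriptor && hasJar) {
                    complete.add(dir.getFileName().toString());
                }
            }
        }
        Collections.sort(complete);
        return complete;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to the plugins directory as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "plugins");
        if (!Files.isDirectory(root)) {
            System.out.println("Not a directory: " + root);
            return;
        }
        for (String name : findCompletePlugins(root)) {
            System.out.println(name);
        }
    }
}
```

To use the sketch, pass the full path of the plugins directory as the first argument. A custom plugin directory that does not appear in the output is missing either its JAR file or its plugin.xml descriptor.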