These key tests should be performed on every content crawler.
All of the following tests should be performed in multiple implementations of the portal.
Test the entire crawl depth. Confirm that documents are structured correctly at every level. Crawl depth should be as shallow as possible. If there are problems, check the filters on the target folders. If nothing is returned, check the authentication settings in the associated Content Source and Web Service - Content objects.
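
One way to review the crawl depth is to group the crawled document paths by folder level and look at the deepest level reached. The sketch below assumes the crawled paths can be exported as plain strings; the function name depth_histogram and the sample paths are hypothetical and are not part of the portal.

```python
from collections import Counter


def depth_histogram(paths, sep="/"):
    """Count crawled documents per folder level (0 = the start folder)."""
    counts = Counter()
    for path in paths:
        parts = [p for p in path.split(sep) if p]
        # The last component is the document itself; the rest are folders.
        counts[max(len(parts) - 1, 0)] += 1
    return dict(sorted(counts.items()))


# Hypothetical export of crawled document paths.
crawled = ["readme.txt", "hr/policy.doc", "hr/forms/leave.pdf"]
histogram = depth_histogram(crawled)
deepest = max(histogram)
```

A level with no documents, or a deepest level larger than expected, suggests rechecking the folder filters or reducing the configured depth.
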
Check the document metadata. Is it stored in the appropriate properties? Does it match the metadata in the source repository? If there are problems, check the Content Type settings in the Content Crawler editor, and check the mappings for each associated Content Type.
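
The metadata comparison can be scripted once both sides are available as key/value pairs. This is a minimal sketch, assuming the source metadata and the crawled properties have each been exported to a dictionary; diff_metadata, the field names, and the mapping are hypothetical illustrations, not portal APIs.

```python
def diff_metadata(source, crawled, mapping):
    """Return (source_field, property, expected, actual) for each mismatch.

    mapping is an assumed dict of source field name -> portal property name.
    """
    mismatches = []
    for field, prop in mapping.items():
        expected = source.get(field)
        actual = crawled.get(prop)
        if expected != actual:
            mismatches.append((field, prop, expected, actual))
    return mismatches


# Hypothetical sample data for one document.
source_doc = {"title": "Leave Policy", "author": "HR"}
portal_doc = {"Name": "Leave Policy", "Author": None}
mapping = {"title": "Name", "author": "Author"}
problems = diff_metadata(source_doc, portal_doc, mapping)
```

Each reported mismatch names the property to look up in the Content Type mappings.
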
Click through to crawled documents from each crawled directory. If there are problems, check the gateway settings in the Web Service - Content editor.
Test refreshing documents to confirm that they reflect modifications. If there are problems, make sure you are providing the correct document signature.
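
This section does not describe how the document signature is built. The sketch below assumes the signature only needs to be a string that changes whenever the document changes, and derives it from a hash of the content plus the last-modified timestamp. document_signature is a hypothetical helper, not a portal API.

```python
import hashlib


def document_signature(content: bytes, last_modified: str) -> str:
    """Build a string that changes when the content or timestamp changes."""
    digest = hashlib.sha256()
    digest.update(last_modified.encode("utf-8"))
    digest.update(b"\0")  # separator so the two inputs cannot run together
    digest.update(content)
    return digest.hexdigest()


# Hypothetical document states before and after an edit.
before = document_signature(b"draft 1", "2024-01-01T00:00:00Z")
unchanged = document_signature(b"draft 1", "2024-01-01T00:00:00Z")
after = document_signature(b"draft 2", "2024-01-02T00:00:00Z")
```

If a modified document produces the same signature as before, a refresh has no way to tell that it changed, which is one likely cause of stale documents.
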
Check logs after every crawl. The log can reveal problems even if the portal reports a successful crawl.
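
A quick scan can make the log check routine. This is a rough sketch that assumes the crawl log is plain text and that problems are marked with words such as ERROR, WARN, FATAL, or Exception; the actual log format is not given here, so the pattern and the sample lines are assumptions to adjust.

```python
import re

# Assumed markers; adjust to whatever your crawl logs actually contain.
PROBLEM_PATTERN = re.compile(
    r"\b(ERROR|WARN(?:ING)?|FATAL)\b|Exception", re.IGNORECASE
)


def find_log_problems(lines):
    """Return (line_number, text) for every line with a problem marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PROBLEM_PATTERN.search(line)
    ]


# Hypothetical log excerpt from a crawl that was reported as successful.
sample_log = [
    "INFO  crawl started\n",
    "WARN  skipped document: access denied\n",
    "INFO  crawl finished successfully\n",
]
problems = find_log_problems(sample_log)
```

The function accepts any iterable of lines, so an open log file can be passed to it directly.
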