The robot extracts and follows links to the various sites selected for indexing. As the system administrator, you can control these processes through a number of settings, including:
Starting, stopping, and scheduling the robot
Defining the sites the robot visits
Setting crawling attributes that determine how aggressively the robot crawls
Defining filters that determine the types of resources the robot indexes
Defining indexing attributes that determine what kind of entries the robot creates in the database
See the Sun Java System Portal Server 7.1 Technical Reference for descriptions of the robot crawling attributes.
Filters identify a resource so that it can be excluded or included by comparing an attribute of the resource against a filter definition. The robot provides the following predefined filters. Filters marked with an asterisk are enabled by default.
Archive Files*
Audio Files*
Backup Files*
Binary Files*
CGI Files*
Image Files*
Java, JavaScript, Style Sheet Files*
Log Files*
Lotus Domino Documents
Lotus Domino OpenViews
Plug-in Files
Power Point Files
Revision Control Files*
Source Code Files*
Spreadsheet Files
System Directories (UNIX)
System Directories (NT)
Temporary Files*
Video Files*
You can create new filter definitions, modify a filter definition, or enable or disable filters. See Resource Filtering Process for detailed information.
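The comparison that a filter performs can be illustrated with a minimal Python sketch. This is not the robot's implementation or its configuration format. The Filter class, the first_matching_filter function, and the file suffixes are hypothetical, chosen only to show how an attribute of a resource (here, the URL path) might be compared against a filter definition, and how a disabled filter is skipped. The sketch models exclusion only.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Filter:
    """Hypothetical filter definition: matches on the suffix of the URL path."""
    name: str
    suffixes: tuple          # assumed suffixes, not the product's actual definitions
    enabled: bool = True

    def matches(self, url: str) -> bool:
        # Compare one attribute of the resource (its path) against the definition.
        path = urlparse(url).path.lower()
        return path.endswith(self.suffixes)


def first_matching_filter(url, filters):
    """Return the name of the first enabled filter that matches, or None."""
    for f in filters:
        if f.enabled and f.matches(url):
            return f.name
    return None


# Illustrative definitions only; the names echo the predefined list above.
FILTERS = [
    Filter("Archive Files", (".zip", ".tar", ".gz")),
    Filter("Image Files", (".gif", ".jpg", ".png")),
    Filter("Spreadsheet Files", (".xls",), enabled=False),
]

for url in ("http://example.com/docs/manual.zip",
            "http://example.com/report.xls",
            "http://example.com/index.html"):
    print(url, "->", first_matching_filter(url, FILTERS))
```

In this sketch, a resource that matches an enabled filter would be excluded, and a resource that matches only a disabled filter, or no filter, would pass through.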