Sun GlassFish Web Space Server 10.0 Microsoft Sharepoint Add-On Guide

Control Crawler Tab

You can start or stop the crawler from this tab. You can start the crawler after you add a Sharepoint site. The crawler crawls the information available on the Sharepoint site added from the Manage Sites tab in the Sharepoint Integration Admin portlet.

Web Space Server uses the Sharepoint search crawler to examine Sharepoint site URLs and associated network resources such as Sharepoint lists (like calendars and alerts), so that they can be indexed in a site search database and a content search database used by the membership and the search portlets.

Figure 3–5 Control Crawler Tab


The Ready status means that the crawler has not started. To start crawling Sharepoint sites, click Start Crawler from the Actions menu. After the crawling is complete, the Crawler status is Stopped.

The tab also displays other information about the crawling status, including the number of sites crawled, the number of sites that are enabled for crawling, and the number of sites for which crawling is complete.

Indexed displays the number of metadata items collected. When a new metadata item is added, all the metadata is automatically reindexed. However, if changes are made only in Active Directory, the crawler might not reindex the metadata. If this situation occurs, choose Clean Running Status and then Remove Index from the Actions menu. Then restart the crawler by choosing Start Crawler.

Figure 3–6 Crawling a Sharepoint Site


Choosing Actions Menu Options

The Actions menu has Start Crawler, Clean Running Status, and Remove Index options.

Start Crawler

When you select Start Crawler, the crawler starts indexing all the metadata. The Crawler Status changes from Ready to Running, and when the crawling is completed, it changes to Stopped.
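
The status transitions described above can be pictured as a small state machine. The following Python sketch is illustrative only: the `CrawlerStatus` enum and the `Crawler` class are hypothetical names, not part of any Web Space Server API, and the real crawler's internals are not described in this guide.

```python
from enum import Enum


class CrawlerStatus(Enum):
    """Status values as they appear on the Control Crawler tab."""
    READY = "Ready"
    RUNNING = "Running"
    STOPPED = "Stopped"


class Crawler:
    """Hypothetical model of the crawler status (an assumption, not the product's code)."""

    def __init__(self):
        self.status = CrawlerStatus.READY
        self.history = [self.status]

    def _set(self, status):
        # Record every status change so the sequence can be inspected later.
        self.status = status
        self.history.append(status)

    def start(self, sites):
        """Model Start Crawler: the status moves to Running, then to Stopped."""
        self._set(CrawlerStatus.RUNNING)
        crawled = list(sites)  # placeholder for the real crawling work
        self._set(CrawlerStatus.STOPPED)
        return crawled
```

In this model, a new `Crawler()` reports Ready, and a call to `start()` leaves it in Stopped once all sites are processed.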

Clean Running Status

The status records the last-modified time stamp of each crawled object (site, list, or item), so that the next crawl updates the index only for objects that have changed since the last crawl. To crawl all objects regardless of their time stamps, you need to clean the status (database). Choosing Clean Running Status cleans the status.
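
The incremental behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the product's implementation: the functions `crawl` and `clean_running_status`, and the use of plain dictionaries for the status and the index, are hypothetical.

```python
def crawl(objects, status, index):
    """Index only the objects modified since the previous crawl.

    objects: dict mapping object id -> (last_modified, metadata)
    status:  dict mapping object id -> last_modified recorded by the previous crawl
    index:   dict mapping object id -> indexed metadata
    Returns the number of objects indexed or reindexed in this run.
    """
    updated = 0
    for obj_id, (last_modified, metadata) in objects.items():
        previous = status.get(obj_id)
        if previous is not None and last_modified <= previous:
            continue  # unchanged since the last crawl, so skip it
        index[obj_id] = metadata
        status[obj_id] = last_modified
        updated += 1
    return updated


def clean_running_status(status):
    """Forget all recorded time stamps so the next crawl visits every object."""
    status.clear()


# Hypothetical data: two crawled objects with integer time stamps.
objects = {"site": (1, "site metadata"), "calendar": (1, "calendar metadata")}
status, index = {}, {}

first = crawl(objects, status, index)    # 2: nothing was recorded yet
second = crawl(objects, status, index)   # 0: no object has changed
objects["calendar"] = (2, "calendar metadata v2")
third = crawl(objects, status, index)    # 1: only the calendar changed
clean_running_status(status)
fourth = crawl(objects, status, index)   # 2: every object is crawled again
```

After the status is cleaned, the fourth run processes every object even though none of them changed, which is the effect that Clean Running Status is meant to have.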

Remove Index

After the status is cleaned, the crawler runs as if for the very first time. To reindex all the metadata, select the Clean Running Status and Remove Index options consecutively. If the Sharepoint site has not changed, you see exactly the same number of indexed items.