E Crawler Performance Metrics

The crawler reports performance metrics in the crawler log file for use in case of a crawl performance problem. These metrics messages are logged hourly and are prefixed with "#%#%#", which makes them easy to identify and to extract from the log file.
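Because every metrics message carries the "#%#%#" prefix, the messages can be pulled out of a log with a simple filter. The following is a minimal Python sketch, not an Oracle-supplied tool. The function name and the sample lines are made up for illustration, and the sketch assumes only that the marker appears somewhere on each metrics line; the actual layout of the rest of the line is not specified here.

```python
from typing import Iterable, Iterator

METRICS_MARKER = "#%#%#"  # the prefix described in the text above


def extract_metric_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text that follows the marker on each metrics line."""
    for line in lines:
        # partition() splits at the first occurrence of the marker;
        # `found` is empty when the marker is absent from the line.
        _, found, rest = line.partition(METRICS_MARKER)
        if found:
            yield rest.strip()


# Hypothetical sample lines; real crawler log lines may look different.
sample_log = [
    "some ordinary crawler log message",
    "#%#%# Average caching time: 144.7 ms",
    "another ordinary message",
]
metric_lines = list(extract_metric_lines(sample_log))
print(metric_lines)
```

To scan an actual log, an open file object can be passed in place of the sample list, since the function accepts any iterable of lines.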

The following table lists the most common crawler performance metrics reported in the crawler log file.

The "ms" suffix in some of the sample metric values in the table denotes a unit of time in milliseconds. Sample metric values in the format "00:00:00" denote a duration in hours:minutes:seconds.
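The two duration notations can be normalized to a single type before values are compared. The following Python sketch assumes that a value is either a number followed by "ms" or an hours:minutes:seconds string, as in the sample values; the function name is hypothetical.

```python
import re
from datetime import timedelta


def parse_metric_duration(value: str) -> timedelta:
    """Convert '143.4 ms' or '9:49:21' style values to a timedelta."""
    value = value.strip()

    # Millisecond form: a number, optional decimals, then "ms".
    ms_match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*ms", value)
    if ms_match:
        return timedelta(milliseconds=float(ms_match.group(1)))

    # hours:minutes:seconds form; hours may have one or more digits.
    hms_match = re.fullmatch(r"(\d+):(\d{2}):(\d{2})", value)
    if hms_match:
        hours, minutes, seconds = map(int, hms_match.groups())
        return timedelta(hours=hours, minutes=minutes, seconds=seconds)

    # Plain counts such as "100331" are not durations.
    raise ValueError(f"unrecognized duration format: {value!r}")


print(parse_metric_duration("143.4 ms"))
print(parse_metric_duration("9:49:21").total_seconds())
```

Values that are plain call counts raise a ValueError, so they should be handled separately from durations.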

| Crawler Metrics | Sample Value | Description |
|---|---|---|
| Average document processing time | 143.4 ms | Average time taken to save a document's content into the Oracle SES eq$cache table and to update the corresponding URLs. It is a measure of database performance when saving a cache file. |
| Average wait time on cache queue | 89.01 ms | Average time from when the caching thread requests a document from the in-memory cache queue until the document is available. |
| Average caching time | 144.7 ms | Average time taken to get the next document from the cache and save it into the database. It is approximately equal to [document processing time] + [wait time on cache queue] + [other overheads]. |
| Cumulative number of page submit call | 100331 | Total number of document submissions from the connector to the crawler. Each call submits one document for indexing. |
| Cumulative time for executing page submit | 9:49:21 | Total time taken by the crawler to process the submitted documents. This includes the time for filtering, submitting the document cache to the database for caching, and so on. If this time is too long, it indicates a possible bottleneck in the caching thread, where the cache is not being saved to the database fast enough. |
| Average time (ms) per page submit call | 352 ms | Average time taken by the crawler to finish processing one document. |
| Cum. number of processUrlEntry call | 100331 | A sub-task of "page submit call". This metric shows the total number of calls that convert the submitted document into HTML format, perform filtering if necessary, and process attachments, if any. |
| Cum. processUrlEntry time | 5:06:38 | Total time taken for processing URLs. It is about 50% of the page submit time. |
| Average time (ms) per processUrlEntry call | 183 ms | Average time taken for processing a URL. |
| Cum. number of processUrlInfo call | 100331 | A sub-task of "page submit call". This metric shows the total number of calls for parsing HTML-formatted documents and applying document service processing to them, such as extracting topics. For non-connector data sources, it includes calls related to extracting attribute values, such as the title, from a document. |
| Cum. processUrlInfo time | 4:39:00 | Total time taken for executing processUrlInfo. Together, the crawler tasks processUrlEntry and processUrlInfo make a document ready for indexing. Generally, their combined time is more than 95% of the page submit time. |
| Average time (ms) per filtering call | 253 ms | Average time taken for filtering a document. Filtering is a sub-task of the "processUrlEntry" task. |
| Cum. number of storing attr. values call | 100331 | Total number of calls made for storing attribute values in the database. |
| Cum. time for storing attr. values | 0:06:04 | Total time taken for storing attribute values in the database. |
| Average time (ms) per storing attr. values call | 3 ms | Average time taken per call for storing attribute values in the database. |
| Cum. number of set attr. values call | 100331 | Total number of calls made for setting attribute values. |
| Cum. time for set attr. values | 0:05:25 | Total time taken for setting attribute values. |
| Average time (ms) per set attr. values call | 3 ms | Average time taken per call for setting attribute values. |
| Cum. number of adding attr. values call | 0 | Total number of calls made for adding attribute values. |
| Cum. time for adding attr. values | 0 ms | Total time taken for adding attribute values. |
| Average time (ms) per adding attr. values call | 0 ms | Average time taken per call for adding attribute values. |
| Cum. number of set 4k text call | 100331 | Total number of calls made for storing 4 KB of text in the database. |
| Cum. time for set 4k text | 0:01:13 | Total time taken for storing 4 KB of text in the database. |
| Average time (ms) per set 4k text call | 0 ms | Average time taken per call for storing 4 KB of text in the database. |
| Cum. number of storing security attr. values call | 100331 | Total number of calls made for storing security attribute values in the database. |
| Cum. time for storing security attr. values | 0:02:58 | Total time taken for storing security attribute values in the database. |
| Average time (ms) per storing security attr. values call | 1 ms | Average time taken per call for storing security attribute values in the database. |
| Cum. number of insert URL entry call | 100331 | Total number of calls made for storing URLs in the database. |
| Cum. time for insert URL entry | 0:21:51 | Total time taken for storing URLs in the database. |
| Average time (ms) per insert URL entry call | 13 ms | Average time taken for storing a URL in the database. |
| Cum. number of pagesProcessed call | 100331 | Total number of pages processed. |
| Cum. time for executing pagesProcessed | 0:01:54 | Total time taken for processing pages. |
| Cum. time for executing page update | 0:01:12 | Total time taken for updating pages. |
| Cum. time for executing copy URL stage | 0:00:36 | Total time taken for copying URLs. |
| Average time (ms) per pagesProcessed call | 1 ms | Average time taken for processing a page. |
| Average time (ms) per page update call | 0 ms | Average time taken for updating a page. |
| Average time (ms) per URL stage copy call | 0 ms | Average time taken for copying a URL. |
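
The relationships among the cumulative times, call counts, and averages in the table can be checked with simple arithmetic on the sample values. The Python sketch below is illustrative only and uses hypothetical helper names. It assumes that each average is the cumulative time divided by the number of calls; in the sample values the reported averages match that quotient with the fractional milliseconds dropped, which is an observation about this sample rather than documented crawler behavior.

```python
def hms_to_seconds(value: str) -> int:
    """Convert an hours:minutes:seconds string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_ms(cumulative: str, calls: int) -> float:
    """Average time per call in milliseconds."""
    return hms_to_seconds(cumulative) * 1000 / calls


calls = 100331                              # sample call count from the table
page_submit = hms_to_seconds("9:49:21")     # cumulative page submit time
url_entry = hms_to_seconds("5:06:38")       # cumulative processUrlEntry time
url_info = hms_to_seconds("4:39:00")        # cumulative processUrlInfo time

avg_page_submit = average_ms("9:49:21", calls)   # table reports 352 ms
avg_url_entry = average_ms("5:06:38", calls)     # table reports 183 ms

# Share of the page submit time spent in processUrlEntry (about 50%).
entry_share = url_entry / page_submit

# Combined share of processUrlEntry and processUrlInfo (more than 95%).
combined_share = (url_entry + url_info) / page_submit

print(f"page submit average:     {avg_page_submit:.1f} ms")
print(f"processUrlEntry average: {avg_url_entry:.1f} ms")
print(f"processUrlEntry share:   {entry_share:.1%}")
print(f"combined share:          {combined_share:.1%}")
```

If the combined share falls well below 95%, the remaining page submit time is being spent outside these two sub-tasks, which is worth investigating alongside the caching metrics.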