Examining the Endeca CAS Service log

The Endeca CAS Service logs messages for all CAS components and crawls in the cas-service.log file.

Location of the CAS Service log

The Endeca CAS Server writes to exactly one log, regardless of how many crawls have been configured. The log is named cas-service.log and is located in the logs directory of the CAS workspace directory. If you are using the default workspace directory name, the pathname of the log file is similar to this:
C:\Endeca\CAS\workspace\logs\cas-service.log

Format of log entries

The log contains two types of log entries:
  • CAS component log entries, which are entries that pertain to starting and stopping CAS components.
  • crawl log entries, which are entries that pertain to a specific crawl.
By default, crawl log entries have the format:
yy-MM-dd HH-mm-ss logLevel crawl-name message (module)
where:
  • yy-MM-dd HH-mm-ss is the timestamp of the entry. You can change the format by editing the cas-service-log4j.properties file.
  • logLevel is the log level of the entry, such as INFO or FATAL.
  • crawl-name is the name of the crawl.
  • message is the message returned by a CAS Server module.
  • module is the CAS Server module that was working on the document when it returned the message.
For example, a log entry for a crawl named Crawl07 might look like this (assuming a DEBUG log level and omitting timestamps for ease of reading):
[Crawl07] Starting work: Processing C:\Work\Plans.doc (WorkExecutor$WorkRunnable)
[Crawl07] Processing record C:\Work\Plans.doc (FileCrawlSource)
[Crawl07] Extracting text from file: C:\Work\Plans.doc of size 82K (DocumentConversionProcessor)
[Crawl07] Stellent converting file: C:\Work\Plans.doc (StellentDocumentConverter)
[Crawl07] Successfully converted file: C:\Work\Plans.doc (StellentDocumentConverter)
[Crawl07] Finished work: Processing C:\Work\Plans.doc (WorkExecutor$WorkRunnable)
The entries show that text extraction for the file Plans.doc completed successfully.
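
To work with entries like these in a script, a small parser can split each line into its crawl name, message, and module. The following is a minimal Python sketch, not part of the CAS tooling. It assumes the timestamp and log level have already been stripped and that the crawl name appears in square brackets, as in the sample above; the names ENTRY_RE, CrawlEntry, and parse_entry are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed line shape (taken from the sample entries, timestamps omitted):
#   [crawl-name] message (module)
ENTRY_RE = re.compile(
    r"^\[(?P<crawl>[^\]]+)\]\s+(?P<message>.*)\s+\((?P<module>[^()]+)\)\s*$"
)


class CrawlEntry(NamedTuple):
    crawl: str
    message: str
    module: str


def parse_entry(line: str) -> Optional[CrawlEntry]:
    """Split one crawl log entry into its parts, or return None if it does not match."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    return CrawlEntry(match.group("crawl"), match.group("message"), match.group("module"))


if __name__ == "__main__":
    sample = r"[Crawl07] Stellent converting file: C:\Work\Plans.doc (StellentDocumentConverter)"
    print(parse_entry(sample))
```

Lines that do not follow the bracketed form, such as CAS component entries, return None, so a caller can filter them out.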

Enabling crawl statistics

If a crawl log level is set to INFO, TRACE, or DEBUG, the crawl statistics are entered as INFO entries in the log when the crawl finishes, as in this example (timestamps and log levels are omitted for ease of reading):
Crawl Mode = FULL_CRAWL (MetricsReport)
Crawl Stop Cause = Completed (MetricsReport)
Directories Filtered from Archives = 0 (MetricsReport)
Directories Filtered = 0 (MetricsReport)
Total Records Output = 423 (MetricsReport)
Files Filtered from Archives = 124 (MetricsReport)
Directories Crawled Not from Archives = 55 (MetricsReport)
Documents Unsuccessfully Converted = 9 (MetricsReport)
Files Crawled from Archives = 65 (MetricsReport)
Files Crawled Not from Archives = 285 (MetricsReport)
Delete Records Output = 0 (MetricsReport)
Files Filtered Not from Archives = 51 (MetricsReport)
Directories Crawled = 73 (MetricsReport)
Directories Filtered Not from Archives = 0 (MetricsReport)
Documents Converted = 333 (MetricsReport)
Files Crawled = 350 (MetricsReport)
Documents Converted After Retry = 0 (MetricsReport)
New or Updated Records Output = 423 (MetricsReport)
Directories Crawled from Archives = 18 (MetricsReport)
Files Filtered = 175 (MetricsReport)
Crawl Seconds = 71 (MetricsReport)
Start Time = 5/23/08 9:23:59 AM EDT (MetricsReport)
End Time = 5/23/08 9:25:10 AM EDT (MetricsReport)
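
The statistics can be collected into a dictionary for reporting or sanity checks. The Python sketch below is illustrative only. It assumes lines in the timestamp-free form shown above ("name = value (MetricsReport)"), and it assumes that each overall count equals the sum of its "from Archives" and "Not from Archives" counterparts. That relationship is inferred from the sample numbers (for example, 65 + 285 = 350 files crawled) and is not stated by the CAS documentation. The function names are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Assumed line shape: "<name> = <value> (MetricsReport)"
METRIC_RE = re.compile(r"^(?P<name>[^=]+?)\s*=\s*(?P<value>.*?)\s+\(MetricsReport\)\s*$")


def parse_metrics(lines: Iterable[str]) -> Dict[str, str]:
    """Collect MetricsReport entries into a name -> value dictionary."""
    metrics: Dict[str, str] = {}
    for line in lines:
        match = METRIC_RE.match(line.strip())
        if match:
            metrics[match.group("name")] = match.group("value")
    return metrics


def check_totals(metrics: Dict[str, str]) -> List[str]:
    """Return the totals that differ from archive + non-archive counts.

    The additive relationship is an assumption based on the sample output.
    """
    mismatches = []
    for total in ("Files Crawled", "Files Filtered",
                  "Directories Crawled", "Directories Filtered"):
        parts = (int(metrics[f"{total} from Archives"])
                 + int(metrics[f"{total} Not from Archives"]))
        if int(metrics[total]) != parts:
            mismatches.append(total)
    return mismatches
```

Applied to the sample statistics above, check_totals returns an empty list, because every total matches the sum of its two parts.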

Note that for incremental crawls, the Delete Records Output statistic is also included and indicates how many files were deleted since the previous crawl. An Endeca record is created for each deleted file; the record has the Endeca.Action property set to DELETE.

The Crawl Stop Cause statistic has one of the following values:
  • Completed
  • Failed
  • Aborted
If a crawl fails, the Crawl Failure Reason statistic provides a message from the CAS Server explaining the failure.
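
A monitoring script might pull these two statistics out of the log to decide whether a crawl needs attention. The Python sketch below is a hedged illustration. It assumes the Crawl Failure Reason entry uses the same "name = value (MetricsReport)" layout as the other statistics, which the example output above does not show, and the function name crawl_outcome is hypothetical.

```python
from typing import Iterable, Optional, Tuple


def crawl_outcome(lines: Iterable[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (stop cause, failure reason) found in the given log lines.

    Either value is None when the corresponding statistic is absent.
    """
    cause: Optional[str] = None
    reason: Optional[str] = None
    for line in lines:
        # Drop the trailing module name; any timestamp prefix is ignored below.
        text = line.rsplit("(MetricsReport)", 1)[0]
        if "Crawl Stop Cause =" in text:
            cause = text.split("Crawl Stop Cause =", 1)[1].strip()
        elif "Crawl Failure Reason =" in text:  # assumed layout
            reason = text.split("Crawl Failure Reason =", 1)[1].strip()
    return cause, reason
```

A caller could treat any stop cause other than Completed as a signal to inspect the failure reason and the surrounding log entries.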

If the log is too verbose, which makes errors harder to find, you can change the log level of the crawl. The default log level is INFO.

The CAS logging configuration file is cas-service-log4j.properties and is located in the <install path>\CAS\workspace\conf directory. You can also change the log level on a per-crawl basis using the CAS Console, the CAS API, or the CAS command-line utilities.