Oracle Commerce Guided Search - Archived output files

Archived output files

The first time that the CAS Server crawls a data source, the output file is named as described in the previous section. For example, if you run a full crawl, the output filename might be CrawlerOutput-FULL-sgmt000.bin.gz. If you then run a second crawl (for example, an incremental crawl), the CAS Server works as follows:

1. A directory named archive is created under the output directory.
2. The original CrawlerOutput-FULL-sgmt000.bin.gz file is moved to the archive directory and renamed by adding a timestamp to the name; for example:
```
CrawlerOutput-FULL-20071026140235-sgmt000.bin.gz
```
3. The output file from the second (incremental) run is named CrawlerOutput-INCR-sgmt000.bin.gz and is stored in the output directory.
4. For every subsequent crawl that uses the same output directory, steps 2 and 3 are repeated.
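
The renaming performed when a file is archived can be illustrated with a short Python sketch. This is not the CAS Server implementation. The function name archived_name is hypothetical, and the assumption that the timestamp is always inserted immediately before the -sgmt segment suffix is inferred only from the example filename above.

```python
from datetime import datetime


def archived_name(filename: str, when: datetime) -> str:
    """Return the archived form of a crawl output filename.

    Assumption: the timestamp goes right before the '-sgmt' suffix,
    as in the documented example. Hypothetical helper, not a CAS API.
    """
    prefix, sep, segment = filename.rpartition("-sgmt")
    if not sep:
        raise ValueError(f"unexpected output filename: {filename!r}")
    # %Y%m%d%H%M%S corresponds to the YYYYMMDDHHmmSS layout.
    stamp = when.strftime("%Y%m%d%H%M%S")
    return f"{prefix}-{stamp}{sep}{segment}"


print(archived_name("CrawlerOutput-FULL-sgmt000.bin.gz",
                    datetime(2007, 10, 26, 14, 2, 35)))
# CrawlerOutput-FULL-20071026140235-sgmt000.bin.gz
```

Because the file in the output directory keeps an untimestamped name, the most recent output is the file in the output directory, and earlier runs are the timestamped files under archive.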

The timestamp format used for renaming is:

YYYYMMDDHHmmSS

YYYY is a four-digit year, such as 2009.
MM is the month as a number (01-12), such as 10 for October.
DD is the day of the month, such as 25 (for October 25th).
HH is the hour of the day in a 24-hour format (00-23), such as 14 (for 2 p.m.).
mm is the minute of the hour (00-59).
SS is the second of the minute (00-59).

The timestamp format is not configurable.
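
Because the format is fixed, a script can recover the archive time from a filename. The following Python sketch shows one way to do so. The regular expression and the function name archive_time are assumptions based on the example filename, not part of any CAS interface. Note that the minute field (mm) maps to %M in strptime, while %m is the month.

```python
import re
from datetime import datetime

# Assumed pattern: a 14-digit timestamp followed by the segment suffix.
# Inferred from the documented example; not an official specification.
ARCHIVED = re.compile(r"-(\d{14})-sgmt\d+")


def archive_time(filename: str) -> datetime:
    """Parse the YYYYMMDDHHmmSS timestamp from an archived filename."""
    match = ARCHIVED.search(filename)
    if match is None:
        raise ValueError(f"no archive timestamp in {filename!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


stamp = archive_time("CrawlerOutput-FULL-20071026140235-sgmt000.bin.gz")
print(stamp.isoformat())
# 2007-10-26T14:02:35
```

Since every field is zero-padded and ordered from year to second, sorting the timestamp strings alphabetically also orders them chronologically.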
