The first time that CAS Server crawls a data source, the output file is named as described in the previous section.
For example, if you run a full crawl, the output filename might be CrawlerOutput-FULL-sgmt000.bin.gz.
If you then run a second crawl (for example, an incremental crawl), the CAS Server works as follows:
1. A directory named archive is created under the output directory.
2. The original CrawlerOutput-FULL-sgmt000.bin.gz file is moved to the archive directory and is renamed by adding a timestamp to the name; for example: CrawlerOutput-FULL-20071026140235-sgmt000.bin.gz
3. The output file from the second (incremental) run is named CrawlerOutput-INCR-sgmt000.bin.gz and is stored in the output directory.
For every subsequent crawl using the same output directory, steps 2 and 3 are repeated.
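The archiving steps above can be illustrated with a short Python sketch. It is not CAS Server code: the function name archive_previous_output, the regular expression, and the assumption that the timestamp is inserted immediately before the segment suffix are all hypothetical, inferred only from the example filenames in this section.

```python
import re
import shutil
from pathlib import Path

# Hypothetical pattern for names such as CrawlerOutput-FULL-sgmt000.bin.gz
OUTPUT_NAME = re.compile(r"^(CrawlerOutput-(?:FULL|INCR))-(sgmt\d+\.bin\.gz)$")


def archive_previous_output(output_dir, timestamp):
    """Move earlier crawl output into archive/, adding the timestamp to each name."""
    output_dir = Path(output_dir)
    archive_dir = output_dir / "archive"
    archive_dir.mkdir(exist_ok=True)  # step 1: create archive/ if it is missing
    moved = []
    for path in sorted(output_dir.iterdir()):
        match = OUTPUT_NAME.match(path.name)
        if path.is_file() and match:
            # step 2: CrawlerOutput-FULL-sgmt000.bin.gz becomes
            #         CrawlerOutput-FULL-<timestamp>-sgmt000.bin.gz
            new_name = f"{match.group(1)}-{timestamp}-{match.group(2)}"
            shutil.move(str(path), str(archive_dir / new_name))
            moved.append(new_name)
    return moved
```

Calling archive_previous_output(output_dir, "20071026140235") before a new crawl writes its file would leave the output directory holding only the newest output, with earlier files kept under archive, which mirrors the behavior described in steps 1 through 3.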
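The timestamp format used for renaming is:
YYYYMMDDHHmmSS
where:
YYYY is the four-digit year.
MM is the two-digit month.
DD is the two-digit day of the month.
HH is the two-digit hour on a 24-hour clock.
mm is the two-digit minutes.
SS is the two-digit seconds.
For example, 20071026140235 represents October 26, 2007 at 14:02:35.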
The timestamp format is not configurable.
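If you need to read or generate these timestamps in your own scripts, the following Python sketch shows one way to do it. The helper names are hypothetical, and the parser assumes the timestamp is the third hyphen-separated field of an archived filename, as in the example earlier in this section. The source does not state whether CAS Server uses local time or UTC, so the sketch makes no assumption about time zones.

```python
from datetime import datetime

# YYYYMMDDHHmmSS expressed as a strftime/strptime pattern
ARCHIVE_TIMESTAMP = "%Y%m%d%H%M%S"


def format_archive_timestamp(moment):
    """Render a datetime in the YYYYMMDDHHmmSS form."""
    return moment.strftime(ARCHIVE_TIMESTAMP)


def parse_archive_timestamp(file_name):
    """Extract the timestamp from a name such as
    CrawlerOutput-FULL-20071026140235-sgmt000.bin.gz."""
    stamp = file_name.split("-")[2]  # third hyphen-separated field
    return datetime.strptime(stamp, ARCHIVE_TIMESTAMP)
```

Because the fields run from most significant to least significant and are zero-padded, archived filenames of the same crawl type also sort chronologically when sorted as plain strings.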