The first time a crawl is run in a given workspace directory, the output file is named as described in the previous section. For example, if you run a full crawl, the output filename might be endecaOut-sgmt000.bin.gz. If you then run a second crawl (full or resumable), the Web Crawler works as follows:

The timestamp format used for renaming is:

YYYYMMDDHHmmSS

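where YYYY is the four-digit year, MM is the two-digit month, DD is the two-digit day, HH is the hour, mm is the minutes, and SS is the seconds.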

Note that the timestamp format is hard-coded and cannot be reconfigured.
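
As an illustration of the pattern above, the following minimal Python sketch produces and parses a 14-digit timestamp in the YYYYMMDDHHmmSS layout. It is not part of the Web Crawler: the function names are hypothetical, the mapping to strftime codes is an assumption, and a 24-hour clock is assumed for HH. It does not show where the Web Crawler places the timestamp in a renamed file.

```python
from datetime import datetime

# Assumed strftime equivalent of the documented YYYYMMDDHHmmSS pattern
# (24-hour clock assumed for HH).
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def format_crawl_timestamp(moment: datetime) -> str:
    """Return the 14-digit timestamp string for the given moment."""
    return moment.strftime(TIMESTAMP_FORMAT)


def parse_crawl_timestamp(text: str) -> datetime:
    """Parse a 14-digit timestamp string back into a datetime."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


if __name__ == "__main__":
    example = datetime(2009, 10, 15, 17, 35, 54)
    print(format_crawl_timestamp(example))  # 20091015173554
```

A script that post-processes crawl output could use a parser like this to sort older, timestamped output files chronologically.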

