The Content Acquisition System automatically creates directories under <install path>\CAS\workspace\state that you can use to store state information for a data source or manipulator extension. An extension can read, write, or delete state information from these directories as necessary.
A data source may require state information to run an incremental acquisition. For example, a data source can store the date of its last read from a CMS in a file in its state directory. On a later run, the data source reads the date from the file and passes it in to run an incremental acquisition.
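The last-read-date pattern described above could be sketched in plain Java as follows. This is a minimal illustration, not part of the CAS Extension API: the class name LastReadState and the file name last-read.txt are hypothetical, and the sketch assumes the extension has already obtained its state directory as a java.nio.file.Path.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.Optional;

/** Hypothetical helper that keeps the time of the last read in a state directory. */
public class LastReadState {
    // Hypothetical file name; CAS does not prescribe one.
    private static final String FILE_NAME = "last-read.txt";

    private final Path stateDir;
    private final Path stateFile;

    public LastReadState(Path stateDir) {
        this.stateDir = stateDir;
        this.stateFile = stateDir.resolve(FILE_NAME);
    }

    /** Returns the stored timestamp, or empty if no earlier run recorded one. */
    public Optional<Instant> read() throws IOException {
        if (!Files.exists(stateFile)) {
            return Optional.empty();
        }
        String text = new String(Files.readAllBytes(stateFile), StandardCharsets.UTF_8).trim();
        return Optional.of(Instant.parse(text));
    }

    /** Stores the timestamp in ISO-8601 form, replacing any earlier value. */
    public void write(Instant lastRead) throws IOException {
        Files.createDirectories(stateDir);
        Files.write(stateFile, lastRead.toString().getBytes(StandardCharsets.UTF_8));
    }

    /** Deletes the stored timestamp, for example to force a full acquisition. */
    public void clear() throws IOException {
        Files.deleteIfExists(stateFile);
    }
}
```

A data source built this way would call read() at the start of a run, fall back to a full acquisition when the result is empty, and call write() only after the run completes successfully, so that a failed run does not advance the stored date.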
The path for a data source's state directory is <install path>\CAS\workspace\state\cas\crawls\crawlId\source\.
The path for a manipulator's state directory is <install path>\CAS\workspace\state\cas\crawls\crawlId\manipulators\manipulatorId.
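To locate these directories from outside an extension, for example when inspecting state on disk while debugging, the two layouts above can be expressed as a small path helper. The class and method names here are hypothetical, and the install path, crawl ID, and manipulator ID are supplied by the caller; only the directory layout comes from the paths listed above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that builds state directory paths from the documented layout. */
public class StatePaths {
    /** <install path>\CAS\workspace\state\cas\crawls\crawlId */
    static Path crawlStateRoot(Path installPath, String crawlId) {
        return installPath.resolve(
                Paths.get("CAS", "workspace", "state", "cas", "crawls", crawlId));
    }

    /** State directory of the data source for the given crawl. */
    static Path dataSourceStateDir(Path installPath, String crawlId) {
        return crawlStateRoot(installPath, crawlId).resolve("source");
    }

    /** State directory of one manipulator in the given crawl. */
    static Path manipulatorStateDir(Path installPath, String crawlId, String manipulatorId) {
        return crawlStateRoot(installPath, crawlId)
                .resolve("manipulators")
                .resolve(manipulatorId);
    }
}
```

Building the paths with Path.resolve rather than string concatenation leaves the choice of separator to the platform.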
At the end of an extension's life cycle, CAS calls PipelineComponent.deleteInstance() and then deletes the contents of the state directory.