About threading

Data sources and manipulators must be thread safe.

The stop() method can be called concurrently when any of the following methods are running:

Recommendations for data sources

The requirement to be thread safe has a few implementation implications for data sources:
  • Any state that is shared with runFullAcquisition() needs to be synchronized with stop(). State may also be shared with checkFullAcquisitionRequired() and the binary content interfaces (BinaryContentFileProvider and BinaryContentInputStreamProvider).
  • If you are supporting text extraction by implementing either the BinaryContentFileProvider interface or the BinaryContentInputStreamProvider interface, the data source must be thread safe because IAS Server calls BinaryContentFileProvider.getBinaryContentFile() or BinaryContentInputStreamProvider.getBinaryContentInputStream() from multiple threads.
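
As a rough illustration of sharing state between runFullAcquisition() and stop(), the sketch below uses an atomic flag that the acquisition loop checks on each iteration. The class StoppableAcquisitionSketch, its method signatures, and the itemCount parameter are hypothetical stand-ins invented for this example. They do not implement the real IAS data source interfaces.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a data source runtime (assumption: not the real IAS interface).
class StoppableAcquisitionSketch {
    // Shared between the acquisition thread and whichever thread calls stop().
    // AtomicBoolean makes a write by one thread visible to the other.
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    // Plays the role of runFullAcquisition(): processes items until it
    // finishes or a stop is requested. Returns the number of items processed.
    int runFullAcquisition(int itemCount) {
        int processed = 0;
        for (int i = 0; i < itemCount; i++) {
            if (stopRequested.get()) {
                break; // stop() was called, possibly from another thread
            }
            processed++; // placeholder for real per-item acquisition work
        }
        return processed;
    }

    // Plays the role of stop(): safe to call while runFullAcquisition() is running.
    void stop() {
        stopRequested.set(true);
    }
}
```

An atomic flag is enough here because the only shared state is a single boolean. If stop() also had to release resources used by the acquisition loop, a lock around that shared state would probably be needed instead.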

Recommendations for manipulators

The requirement to be thread safe has a few implementation implications for manipulators:
  • If possible, use only local variables or final immutable fields.
  • Persist internal state across calls to processRecord() or onInputClose() only if it is absolutely necessary. If it is necessary, access state in a synchronized way.
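
The two recommendations above might look like the following sketch. The class UpperCaseManipulatorSketch and its String-based processRecord() signature are hypothetical simplifications, not the real manipulator API. The sketch keeps configuration in a final immutable field, does the per-record work in local variables, and guards the one piece of persisted state with a lock.

```java
import java.util.Locale;

// Hypothetical stand-in for a manipulator runtime (assumption: not the real IAS interface).
class UpperCaseManipulatorSketch {
    // Final immutable field: safe to read from any thread without locking.
    private final String prefix;

    // State persisted across calls; every access goes through the lock.
    private final Object lock = new Object();
    private long recordsSeen;

    UpperCaseManipulatorSketch(String prefix) {
        this.prefix = prefix;
    }

    // Simplified processRecord(): takes and returns a String instead of a record object.
    String processRecord(String value) {
        // Local variable: each calling thread gets its own copy.
        String result = prefix + value.toUpperCase(Locale.ROOT);
        synchronized (lock) {
            recordsSeen++; // the lock is held only for the counter update
        }
        return result;
    }

    long getRecordsSeen() {
        synchronized (lock) {
            return recordsSeen;
        }
    }
}
```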

For optimal performance, it is a good idea to minimize the time you hold locks in processRecord().

Manipulators should not hold locks when calling OutputChannel.output() from processRecord(). The call to output() may take a while to return, and holding a lock during that time blocks other threads that are concurrently calling processRecord(). One way of holding locks is to declare a method with the Java synchronized keyword. However, synchronizing processRecord() adversely affects performance, because it effectively makes the manipulator single threaded by preventing other threads from entering processRecord().
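
One way to follow this advice is to update shared state inside a short synchronized block and call output() only after the lock is released. In the sketch below, OutputChannelSketch is a hypothetical single-method stand-in for OutputChannel, and the String-based processRecord() signature is likewise an assumption made to keep the example self-contained.

```java
// Hypothetical stand-in for OutputChannel (assumption: the real interface differs).
interface OutputChannelSketch {
    void output(String record);
}

// Hypothetical manipulator that tags each record with a sequence number.
class SequencingManipulatorSketch {
    private final Object lock = new Object();
    private long sequence; // guarded by lock

    void processRecord(String record, OutputChannelSketch channel) {
        long assigned;
        synchronized (lock) {
            // Keep the critical section short: only the shared counter is touched.
            assigned = ++sequence;
        }
        // The lock is released before the potentially slow call to output(),
        // so other threads can enter processRecord() in the meantime.
        channel.output(assigned + ":" + record);
    }
}
```

Because the method itself is not declared synchronized, several threads can be inside processRecord() at once, and they contend only for the brief counter update.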

Configuration and context synchronization

As part of the implementation of an extension, the IAS Server passes a PipelineComponentConfiguration object and a PipelineComponentRuntimeContext object to either DataSource.createDataSourceRuntime() (in the case of data sources) or Manipulator.createManipulatorRuntime() (in the case of manipulators). The IAS Server does not modify the PipelineComponentConfiguration after createManipulatorRuntime() or createDataSourceRuntime() has been called.

When the IAS Server runs an acquisition, the PipelineComponentRuntimeContext and everything accessible from it are thread safe.