This section provides an overview of pipeline components and helps you identify when to use them.
Once the source data passes through a record adapter or, optionally, through your own content adapter, it is converted into Endeca records. These records may need further modification within the Forge pipeline.
For example, you may determine that some properties of your records must be cleaned or changed.
To change records, the Endeca software offers several methods:
Java manipulators: The most generic way of cleaning or changing your records in the pipeline. They use Java classes to specify the record transformations you need.
Record manipulators: These contain expressions, which are evaluated against each record as it flows through the pipeline. When Forge evaluates an expression, it may change the current record. The changes take a variety of forms, from adjustments of property values to creation of new data.
Perl manipulators: These use Perl to manipulate source records as part of Forge's data processing. You can use a Perl manipulator to add, remove, and reformat properties, to join record sources, and to perform other such tasks.
You can use any one of these methods, based on your preferences. Use the following guidelines to decide which type of manipulator you need:
If you prefer to use Java, use Java manipulators for adding, removing, or changing properties of your records. Create Java manipulators with the CADK when your records require further transformations after passing through other pipeline components, such as record and content adapters.
If you prefer to use Perl, use Perl manipulators to perform your source record management tasks.
If you prefer to use Data Foundry expressions and would rather edit XML expressions directly within the files (or within the Expressions Editor in Developer Studio), use record manipulators to transform your records.
In general, record manipulators are far more limited than Java manipulators in what they let you do. On the other hand, transforming records with record manipulators is faster than with Java manipulators, which in turn are faster than Perl manipulators.
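To make the kind of per-record change described in this section more concrete, the following is a minimal plain-Java sketch of removing, reformatting, and adding properties on a single record. It does not use the CADK or any Endeca API: the class `RecordCleanupSketch`, the method `clean`, the map-based record representation, and the property names (`Name`, `Price`, `InternalNotes`, `PriceRange`) are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hypothetical stand-in, NOT the CADK or Forge API.
public class RecordCleanupSketch {

    // A record is modeled here as property name -> list of values
    // (an assumption made for this sketch).
    static Map<String, List<String>> clean(Map<String, List<String>> record) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : record.entrySet()) {
            String name = entry.getKey();
            // Remove a property that should not reach the index.
            if (name.equals("InternalNotes")) {
                continue;
            }
            // Reformat values: trim whitespace and drop empty values.
            List<String> values = new ArrayList<>();
            for (String value : entry.getValue()) {
                String trimmed = value.trim();
                if (!trimmed.isEmpty()) {
                    values.add(trimmed);
                }
            }
            if (!values.isEmpty()) {
                out.put(name, values);
            }
        }
        // Add a new property derived from an existing one.
        List<String> price = out.get("Price");
        if (price != null) {
            double amount = Double.parseDouble(price.get(0));
            out.put("PriceRange", List.of(amount < 10.0 ? "Under 10" : "10 and over"));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> record = new LinkedHashMap<>();
        record.put("Name", List.of("  Merlot  "));
        record.put("Price", List.of("12.50"));
        record.put("InternalNotes", List.of("do not publish"));
        System.out.println(clean(record));
    }
}
```

Whichever manipulator type you choose, the transformation follows the same pattern: each record enters the component, its properties are inspected and adjusted, and the modified record continues down the pipeline.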