Adding a record adapter to load data

Record adapters read and write record data. A record adapter describes where the data is located (or where it will be saved), its format, and how it is processed.

Forge can read source data from a variety of file formats and source systems. Each data source needs a corresponding input record adapter describing the particulars of that source. Based on this information, Forge parses the data and turns it into Endeca records. Input record adapters automatically decompress source data that is compressed in the gzip format.
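The automatic gzip handling described above can be illustrated with a short Python sketch. It is not Forge's implementation, and how Forge detects compression internally is not stated here; the sketch assumes detection by the two-byte gzip signature, and the function names open_source and read_source are hypothetical.

```python
import gzip
import io

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def open_source(raw: bytes):
    """Return a binary stream over raw, decompressing it if it looks like gzip.

    Assumption of this sketch: compression is recognized by the gzip
    signature alone, not by file extension or by a configuration setting.
    """
    if raw[:2] == GZIP_MAGIC:
        return gzip.GzipFile(fileobj=io.BytesIO(raw))
    return io.BytesIO(raw)


def read_source(raw: bytes) -> bytes:
    """Read the full, decompressed content of a source."""
    with open_source(raw) as stream:
        return stream.read()
```

With this approach, a caller passes plain or gzip-compressed bytes to read_source and gets the same uncompressed content back in both cases, which is why no per-source compression setting is needed on the input side.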

Note: Output record adapters are generally used for diagnostic purposes. Hence, this section focuses on input record adapters. See Writing out record data for more information on output record adapters.

To add an input record adapter to your pipeline:

  1. In the Pipeline Diagram editor, choose New > Record > Adapter. The Record Adapter editor appears.
  2. In the Name text box, type a unique name for this record adapter.
  3. In the General tab, do the following:
    1. In the Direction frame, choose Input.
    2. In the Format list, choose one of the following: XML, binary, fixed-width, delimited, vertical, document, JDBC adapter, Exchange, ODBC (Windows only), or custom adapter (available only by request from Endeca).
    3. In the URL text box, type the location of the source data.
    4. In the Delimiters frame, if the format is delimited, add row and column delimiters. If the format is vertical, add row, column, and record delimiters.
    5. (Optional) In the Encoding text box, define the encoding of the input data. If Encoding is not set, the input data is assumed to be Latin-1.
      Note: This setting is ignored by the XML format, because the encoding is specified in the XML header. It is also ignored for binary format because Forge detects the binary format's encoding automatically. The Document format also ignores the Encoding setting.
    6. If any of the text boxes in the Java properties frame are made available by your format selection, type in the required information.
    7. (Optional) Check Require Data if you want Forge to exit with an error if the URL does not exist or is empty.
    8. Check Filter empty properties.
      Note: If this option is not checked, the adapter assigns a value of '' (an empty string) to any property for which a record has no value.
    9. (Optional) Check Multi File if Forge should read data from more than one input file.
      Note: When Multi File is checked, the URL is treated as a wildcard pattern, and all matching files are read in alphabetical order.
    10. Check Maintain State if you are using the Endeca Application Controller (EAC) environment.
      Note: This setting specifies that the records are output in the directory structure the EAC requires.
    11. (Optional) Check Custom Compression Level if your input file is compressed.
      Note: For an input record adapter, this setting has no practical effect. The compression level is ignored, and compression of input files is detected automatically.
  4. Ignore the Sources tab. Its settings are not used by an input record adapter.
  5. (Optional) In the Record Index tab, do the following:
    1. Specify which properties or dimensions you want to use as the record index for this component.
    2. Indicate whether you want to discard records with duplicate keys.
      Note: Developer Studio performs a case-insensitive search for duplicate keys.
  6. If you are using XSLT to transform your XML into Endeca-compatible XML, in the Transformer tab, specify the type (XSLT) and the location of the stylesheet.
  7. If your format is ODBC, fixed-width, delimited, JDBC, custom, or Exchange, in the Pass Through tab, enter the necessary information.
  8. (Optional) In the Comment tab, add a comment for the component.
  9. Click OK.
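To make the effect of several General tab settings more concrete, the following Python sketch mimics how a delimited source might be turned into records. It is an illustration only, not Forge code: the functions parse_delimited and read_multi_file are hypothetical, the first row is assumed to hold the property names, and sorted() is used as a stand-in for "alphabetical order".

```python
import glob


def parse_delimited(data, row_delim="\n", col_delim="|",
                    encoding="latin-1", filter_empty=True):
    """Turn delimited bytes into a list of {property: value} records.

    encoding defaults to Latin-1, mirroring the Encoding text box default.
    filter_empty mirrors the Filter empty properties check box.
    Assumption of this sketch: the first row contains the property names.
    """
    text = data.decode(encoding)
    rows = [row for row in text.split(row_delim) if row]
    if not rows:
        return []
    header = rows[0].split(col_delim)
    records = []
    for row in rows[1:]:
        values = row.split(col_delim)
        # Pad short rows so every property has at least an empty value.
        values += [""] * (len(header) - len(values))
        record = dict(zip(header, values))
        if filter_empty:
            # Drop properties whose value is the empty string.
            record = {k: v for k, v in record.items() if v != ""}
        records.append(record)
    return records


def read_multi_file(pattern, **options):
    """Treat pattern as a wildcard and read every matching file in sorted order."""
    records = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as handle:
            records.extend(parse_delimited(handle.read(), **options))
    return records
```

For example, parsing the row 1|widget| under the header id|name|color yields a record with only id and name when filter_empty is true, and a record whose color is '' when it is false.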