Record Adapter editor

In the Record Adapter editor, you specify a unique name for this record adapter.

The Record Adapter editor contains the following tabs:

General

The General tab contains the following options:

Option Description

Direction

Input Adapter

Required. Set to Input.

Output Adapter

Required. Set to Output.

Format

Input Adapter

Required. The format type of the raw data to be loaded. One of the following: delimited, XML, binary, fixed width, document, ODBC (Windows only), vertical, JDBC Adapter, Exchange, or Custom Adapter. The record format determines which delimiter options, if any, are necessary.

Note: The custom adapter option is only available by request from Endeca.

Output Adapter

Required. Can be set to delimited, XML, binary, fixed width, or vertical.

URL

Input Adapter

Required for delimited, XML, binary, fixed-width, and vertical input adapters. Location of the file being loaded. The path can be either an absolute path, or a path relative to the Pipeline.epx file. With an absolute path, the protocol can be specified in RFC 2396 syntax. Usually this means the file:/// prefix precedes the path to the data file. Relative paths should not specify the protocol. Any paths that are part of this URL will be overridden if the Forge --inputDir option is specified.

Note: Exchange input adapters also require a URL, but it is specified in a pass-through element using the Pass Throughs tab.
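The path rules above can be illustrated with a short Python sketch. It is not Forge's implementation: the function name `resolve_adapter_url`, the `pipeline_dir` parameter (the directory assumed to contain Pipeline.epx), and the example paths are all hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def resolve_adapter_url(url: str, pipeline_dir: str) -> str:
    """Return an absolute file:/// URL for a record adapter URL value.

    Hypothetical helper for illustration only; Forge's real resolution
    logic is not documented here.
    """
    # A value that already carries the protocol is taken as-is.
    if url.startswith("file:///"):
        return url
    path = PurePosixPath(url)
    # Relative paths are interpreted relative to the directory that
    # holds Pipeline.epx (assumed to be an absolute POSIX path).
    if not path.is_absolute():
        path = PurePosixPath(pipeline_dir) / path
    return "file://" + str(path)


print(resolve_adapter_url("data/products.txt", "/opt/app/pipeline"))
print(resolve_adapter_url("file:///tmp/products.txt", "/opt/app/pipeline"))
```

The first call shows a relative path being anchored at the pipeline directory; the second shows a URL with an explicit protocol passing through unchanged.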

Output Adapter

Required. Location to which the data will be saved, using the same path caveats as input adapters.

Row, column, and record delimiters

Input Adapter

Optional. Used by input adapters only if the data is in delimited or vertical format. The Row and Column delimiters apply to both the delimited and vertical formats. The Record delimiter applies to the vertical format only.
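The following Python sketch shows one way the three delimiters could divide raw text into records. The file layouts are assumptions made for illustration: the delimited sample uses a header row of property names, and the vertical sample uses one name/value pair per row with the record delimiter on a row of its own. The delimiter values ("|", newline, "REC") and both function names are hypothetical.

```python
def parse_delimited(text, row_delim="\n", col_delim="|"):
    """Split delimited text into records (assumes the first row is a header)."""
    rows = [row for row in text.split(row_delim) if row]
    if not rows:
        return []
    header = rows[0].split(col_delim)
    return [dict(zip(header, row.split(col_delim))) for row in rows[1:]]


def parse_vertical(text, row_delim="\n", col_delim="|", rec_delim="REC"):
    """Split vertical text into records.

    Assumes each row is a name/value pair and that a row consisting
    only of the record delimiter ends the current record.
    """
    records, current = [], {}
    for row in text.split(row_delim):
        if row == rec_delim:
            if current:
                records.append(current)
            current = {}
        elif row:
            name, value = row.split(col_delim, 1)
            # A property may occur more than once, so collect values in a list.
            current.setdefault(name, []).append(value)
    if current:
        records.append(current)
    return records


print(parse_delimited("id|color\n1|red\n2|blue\n"))
print(parse_vertical("id|1\ncolor|red\nREC\nid|2\ncolor|blue\nREC\n"))
```

The delimited parser needs only the row and column delimiters, while the vertical parser also needs the record delimiter to know where one record ends and the next begins.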

Output Adapter

Not used.

Java properties

Input Adapter

Required as follows:

  • Java home

    Used by JDBC, Exchange, and Custom adapters. Specifies the location of the Java Runtime Environment (JRE).

  • Class

    Used by Custom adapters. Specifies the name of the adapter class to load within the .jar file indicated in Class path.

  • Class path

    Used by JDBC and Custom adapters. For Custom adapters, this setting specifies a path to a .jar file (or a set of .jar files, separated by colons, ":") containing the classes required by the Custom adapter. For JDBC, this attribute specifies the location of the .jar file containing the JDBC driver.

Note: When running your pipeline through Forge, you can override the Java home and Class path settings using command-line options. See Overriding Java home and class path settings.

Output Adapter

Not used.

Encoding

Input Adapter

Optional. Defines the encoding of the input data. Several hundred encodings are supported; the following are typical examples.

  • ISO8859-1 (Latin-1)
  • ISO8859-15 (Latin-9)
  • CP1252 (WINDOWS-1252)
  • ASCII
  • UTF-8

If Encoding is not set, it is assumed to be Latin-1. If an incorrect encoding is specified, then Forge will generate warnings about any characters that do not make sense in the specified encoding. For example, in the ASCII encoding, any character with a number above 127 is considered invalid.

Note: This setting is ignored by the XML format, because the encoding is specified in the XML header, and by Output record adapters. It is also ignored for the binary format, because encoding applies only to text files.
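The ASCII example above can be reproduced with a few lines of Python. This is only an illustration of why a mismatched encoding produces invalid characters; it does not reproduce Forge's warnings, the function name `find_invalid_bytes` is hypothetical, and the encoding names are Python's codec names rather than necessarily the values Forge accepts.

```python
def find_invalid_bytes(data: bytes, encoding: str) -> list:
    """Return the offsets of bytes that cannot be decoded in `encoding`."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode(encoding)
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice being decoded.
            offsets.append(pos + err.start)
            pos += err.end
    return offsets


raw = "café".encode("latin-1")           # the last byte is 0xE9 (233)
print(find_invalid_bytes(raw, "ascii"))    # 0xE9 is above 127, so it is invalid
print(find_invalid_bytes(raw, "latin-1"))  # every byte is valid in Latin-1
```

The same bytes are valid under Latin-1 but contain one invalid character under ASCII, which is the kind of mismatch that an incorrect Encoding setting produces.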

Output Adapter

Required. Set to UTF-8.

Require data

Input Adapter

Optional. If checked, Forge exits immediately with an error if the URL does not exist or is empty. The error is sent to wherever logging is configured to send errors, typically to the console or stderr.

Output Adapter

Not used.

Filter empty properties

Input Adapter

Optional. Determines whether source properties with empty property values are assigned to Endeca Records:

  • If unchecked, the record adapter does not filter empty properties. In this case, the record adapter assigns an empty string ("") to the current record for any empty properties.
  • If checked, properties with empty property values are filtered, or ignored.

For a filtering example, see Filtering empty properties.
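As a rough illustration of the two settings, the following Python sketch models a source row as a dictionary. It assumes that "empty" means a zero-length string; the function name `assign_properties` and the sample property names are hypothetical.

```python
def assign_properties(source_row: dict, filter_empty: bool) -> dict:
    """Build a record from a source row, optionally dropping empty values.

    Assumption: a property counts as empty when its value is "".
    """
    record = {}
    for name, value in source_row.items():
        if filter_empty and value == "":
            continue  # checked: the empty property is ignored
        record[name] = value  # unchecked: the empty string is kept
    return record


row = {"id": "1", "color": "", "size": "M"}
print(assign_properties(row, filter_empty=False))
print(assign_properties(row, filter_empty=True))
```

With the option unchecked, the record keeps `color` with an empty value; with it checked, `color` is absent from the record.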

Output Adapter

Not used.

Multi file

Input Adapter

Optional. Specifies whether Forge can read data from more than one input file. If checked, the input URL is interpreted as a pattern, and Forge reads each file matching the pattern in alphabetical order. For example, the record adapter may specify a URL pattern of "*.update.txt", in which case Forge reads any file in the given directory that has the .update.txt suffix.
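The pattern matching and ordering described above can be approximated in Python as follows. The sketch assumes shell-style wildcard matching and plain string sorting; the exact pattern syntax and collation that Forge uses are not stated here. The function name `select_input_files` and the file names are hypothetical.

```python
import fnmatch


def select_input_files(filenames, pattern):
    """Return the names matching `pattern`, sorted alphabetically."""
    # fnmatchcase applies shell-style wildcards with case-sensitive matching.
    return sorted(name for name in filenames if fnmatch.fnmatchcase(name, pattern))


files = ["b.update.txt", "a.update.txt", "a.full.txt", "notes.txt"]
print(select_input_files(files, "*.update.txt"))
```

Only the two files with the .update.txt suffix are selected, and `a.update.txt` is read before `b.update.txt`.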

Output Adapter

Not used.

Maintain state

Input Adapter

Not used.

Output Adapter

Optional. If checked, indicates that the value of URL is relative to the directory specified by the Forge --stateDir flag. This allows you to change your state directory using the --stateDir flag without modifying your record adapter configuration.

Compression level

Input Adapter

Not used. Compression of input files is detected automatically.

Output Adapter

Sets the level of compression to be performed on the record data when it is written to disk. To reduce the amount of disk space used, check Custom compression level and slide the bar to the recommended value of 7.

Note: Compressed data consumes less disk space but takes longer to read and write.
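The trade-off in the note can be observed with any level-based compressor. The sketch below uses Python's zlib module purely as a stand-in; the compression format that Forge writes is not stated here, so the sizes are illustrative only. The function name `compressed_sizes` and the sample data are hypothetical.

```python
import zlib


def compressed_sizes(data: bytes, levels=(1, 7, 9)) -> dict:
    """Return the compressed size in bytes at each zlib compression level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


# Repetitive sample data standing in for record output.
sample = b"id|1|color|red|size|M\n" * 1000
print(len(sample), compressed_sizes(sample))
```

Higher levels generally spend more CPU time to produce smaller output, which is the disk space versus read/write time trade-off that the note describes.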

Sources

The Sources tab contains the following options:

Option Description

Record source

A choice of the record servers in the project. Used for output record adapters only.

Dimension source

A choice of the dimension adapters and dimension servers in the project. Generally used for output record adapters only. Input record adapters only require a dimension source if they implement a record index that includes dimensions.

Record Index

Optional. The Record Index tab allows you to add or remove dimensions or properties used in a component's record index, and to change their order. Record indexes support join functionality. See "Join sources must have matching join keys and record indexes" for more details.

The Record Index tab contains the following fields:

Field Description

Discard records with duplicate keys

When checked, Forge discards any records with duplicate keys and logs a warning that specifies the number of records discarded.
Note: Developer Studio performs a case-insensitive search for duplicate keys.
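The effect of the Discard records with duplicate keys option can be sketched in Python as follows. The sketch assumes that the first record with a given key is kept and later ones are discarded; the source does not say which record survives. The function name `discard_duplicate_keys` and the sample records are hypothetical.

```python
def discard_duplicate_keys(records, key_names):
    """Drop records whose record-index key has already been seen.

    Assumption: the first record with a given key is the one kept.
    Returns the kept records and the number discarded.
    """
    seen = set()
    kept = []
    discarded = 0
    for record in records:
        # The key is the tuple of values for the record index entries, in order.
        key = tuple(record.get(name) for name in key_names)
        if key in seen:
            discarded += 1
            continue
        seen.add(key)
        kept.append(record)
    return kept, discarded


records = [
    {"id": "1", "color": "red"},
    {"id": "2", "color": "blue"},
    {"id": "1", "color": "green"},
]
kept, discarded = discard_duplicate_keys(records, ["id"])
print(kept)
print(f"warning: {discarded} record(s) discarded")
```

The returned count corresponds to the number that the warning reports.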

Transformer

The Transformer tab is for the XML format. XML adapters assume that data is in the Endeca Record XML format and, without transformation, other XML formats cannot be read by the Data Foundry. To support these situations, an XSLT transformation can be applied to the source data to convert it into Endeca Records XML, which the Data Foundry can read.

The Transformer tab has the following options:

Option Description

Type

Must be XSLT.

URL

Location of the stylesheet to use.

Pass Throughs

The Pass Throughs tab is used with certain formats to pass additional information to Forge. It contains text boxes where you can add, modify, or delete key/value pairs. Pass throughs are required for ODBC, fixed-width, delimited, JDBC, Custom, and Exchange adapters.

Comment

Optional. Provides a way to associate comments with a pipeline component.