Record Adapter editor

The Record Adapter editor contains the following tabs:

General

The General tab contains the following options:

Option	Description
Direction	Input Adapter Required. Set to Input. Output Adapter Required. Set to Output.
Format	Input Adapter Required. The format type of the raw data to be loaded. One of the following: delimited, XML, binary, fixed width, document, ODBC (Windows only), vertical, JDBC Adapter, Exchange, or Custom Adapter. Your record format affects what delimiter options, if any, are necessary. Note: The custom adapter option is only available by request from Endeca. Output Adapter Required. Can be set to delimited, XML, binary, fixed width, or vertical.
URL	Input Adapter Required for delimited, XML, binary, fixed-width, and vertical input adapters. Location of the file being loaded. The path can be either an absolute path, or a path relative to the Pipeline.epx file. With an absolute path, the protocol can be specified in RFC 2396 syntax. Usually this means the file:/// prefix precedes the path to the data file. Relative paths should not specify the protocol. Any paths that are part of this URL will be overridden if the Forge --inputDir option is specified. Note: Exchange input adapters also require a URL but the URL is specified in a pass through element using the Pass Throughs tab. Output Adapter Required. Location to which the data will be saved, using the same path caveats as input adapters.
Row, column, and record delimiters	Input Adapter Optional. Used by input adapters only if the data is in delimited or vertical format. Row and Column are used for Delimited and Vertical formats. Record is used for Vertical formats. Output Adapter Not used.
Java properties	Input Adapter Required as follows: Java home: Used by JDBC, Exchange, and Custom adapters. Specifies the location of the Java Runtime Environment (JRE). Class: Used by Custom adapters. Specifies the name of the adapter class to load within the .jar file indicated in Class path. Class path: Used by JDBC and Custom adapters. For Custom adapters, this setting specifies a path to a .jar file (or a set of .jar files, separated by colons, ":") containing the classes required by the Custom adapter. For JDBC, this setting specifies the location of the .jar file containing the JDBC driver. Note: When running your pipeline through Forge, you can override the Java home and Class path settings using command-line options. See Overriding Java home and class path settings. Output Adapter Not used.
Encoding	Input Adapter Optional. Defines the encoding of the input data. Several hundred encodings are supported; the following are typical examples: ISO8859-1 (Latin-1), ISO8859-15 (Latin-9), CP1252 (WINDOWS-1252), ASCII, and UTF-8. If Encoding is not set, it is assumed to be Latin-1. If an incorrect encoding is specified, Forge generates warnings about any characters that do not make sense in the specified encoding. For example, in the ASCII encoding, any character with a number above 127 is considered invalid. Note: This setting is ignored by the XML format, because the encoding is specified in the XML header, and by Output record adapters. It is also ignored for binary format, because encoding applies only to text files. Output Adapter Required. Set to UTF-8.
Require data	Input Adapter Optional. If checked, Forge exits immediately with an error if the URL does not exist or is empty. The error is sent to wherever logging is configured to send errors, typically to the console or stderr. Output Adapter Not used.
Filter empty properties	Input Adapter Optional. Determines whether source properties with empty property values are assigned to Endeca Records: If unchecked, the record adapter does not filter empty properties. In this case, the record adapter assigns an empty string (" ") to the current record for any empty properties. If checked, properties with empty property values are filtered, or ignored. For a filtering example, see Filtering empty properties. Output Adapter Not used.
Multi file	Input Adapter Optional. Specifies whether Forge can read data from more than one input file. If checked, the input URL is interpreted as a pattern, and Forge reads each file matching the pattern in alphabetical order. For example, the record adapter may specify a URL pattern of "*.update.txt", in which case Forge reads any file in the given directory that has the .update.txt suffix. Output Adapter Not used.
Maintain state	Input Adapter Not used. Output Adapter Optional. If checked, indicates that the value of URL is relative to the Forge flag --stateDir. (This allows you to change your state directory using the --stateDir flag without having to modify your record adapter configuration.)
Compression level	Input Adapter Not used. Compression of input files is detected automatically. Output Adapter Sets the level of compression to be performed on the record data when it is written to disk. To reduce the amount of disk space used, check Custom compression level and slide the bar to the recommended value of 7. Note: Compressed data consumes less disk space but takes longer to read and write.
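
The Encoding behavior described in the table can be illustrated outside Forge. The following Python sketch is not Forge code and does not reproduce Forge's warnings. It only shows why a byte with a value above 127 is invalid in ASCII yet decodes under Latin-1, the encoding that is assumed when none is set. The function names and the sample bytes are hypothetical.

```python
def can_decode(raw: bytes, encoding: str) -> bool:
    """Return True if every byte sequence in `raw` is valid in `encoding`."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def non_ascii_offsets(raw: bytes) -> list:
    """Return the offsets of bytes whose value is above 127."""
    return [i for i, b in enumerate(raw) if b > 127]


# Hypothetical sample: "café" stored as Latin-1, so the last byte is 0xE9 (233).
sample = "café".encode("latin-1")

print(can_decode(sample, "latin-1"))  # True
print(can_decode(sample, "ascii"))    # False
print(non_ascii_offsets(sample))      # [3]
```

A check of this kind, run on a source file before it is loaded, may help you choose the Encoding value that matches the data.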

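The Multi file option in the General tab treats the URL as a pattern and reads the matching files in alphabetical order. The following Python sketch approximates that selection so you can preview which files a pattern such as "*.update.txt" would pick up. It is an illustration only: the function name matching_files is hypothetical, and the plain, case-sensitive string sort is an assumption about what "alphabetical" means, not a statement of how Forge orders files.

```python
import fnmatch
import os
import tempfile


def matching_files(directory, pattern):
    """Return the names in `directory` that match `pattern`, sorted as strings."""
    names = [n for n in os.listdir(directory) if fnmatch.fnmatchcase(n, pattern)]
    return sorted(names)


# Demonstration in a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.update.txt", "a.update.txt", "notes.txt"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as fh:
            fh.write("placeholder\n")
    print(matching_files(tmp, "*.update.txt"))  # ['a.update.txt', 'b.update.txt']
```

Because the read order follows the file names, a naming scheme that sorts in the intended load order (for example, a zero-padded sequence number) keeps the order predictable.
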
Sources

The Sources tab contains the following options:

Option	Description
Record source	A choice of the record servers in the project. Used for output record adapters only.
Dimension source	A choice of the dimension adapters and dimension servers in the project. Generally used for output record adapters only. Input record adapters only require a dimension source if they implement a record index that includes dimensions.

Record Index

Optional. The Record Index tab allows you to add or remove dimensions or properties used in a component's record index, and to change their order. Record indexes support join functionality. See Join sources must have matching join keys and record indexes for more details.

The Record Index tab contains the following fields:

Field	Description
Discard records with duplicate keys	When checked, Forge discards any records with duplicate keys and logs a warning that specifies the number of records discarded. Note: Developer Studio performs a case-insensitive search for duplicate keys.
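
The effect of discarding records with duplicate keys can be sketched as follows. This Python example is not Forge's implementation. The function name discard_duplicates and the sample records are hypothetical, and keeping the first record seen for each key is an assumption, because the description above does not say which duplicate is retained.

```python
def discard_duplicates(records, key_names):
    """Keep one record per record-index key and count the discarded ones.

    Assumption: the first record seen for a key is the one kept.
    """
    seen = set()
    kept = []
    discarded = 0
    for record in records:
        # The key is the tuple of values for the indexed properties, in order.
        key = tuple(record.get(name) for name in key_names)
        if key in seen:
            discarded += 1
            continue
        seen.add(key)
        kept.append(record)
    return kept, discarded


# Hypothetical records keyed on a single "id" property.
records = [
    {"id": "1", "name": "first"},
    {"id": "2", "name": "second"},
    {"id": "1", "name": "duplicate of first"},
]
kept, discarded = discard_duplicates(records, ["id"])
print(len(kept), discarded)  # 2 1
```

The discarded count corresponds to the number reported in the warning that Forge logs.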

Transformer

The Transformer tab is for the XML format. XML adapters assume that data is in the Endeca Record XML format and, without transformation, other XML formats cannot be read by the Data Foundry. To support these situations, an XSLT transformation can be applied to the source data to convert it into Endeca Records XML, which the Data Foundry can read.

The Transformer tab has the following options:

Option	Description
Type	Must be XSLT.
URL	Location of the stylesheet to use.

Pass Throughs

The Pass Throughs tab is used with certain formats to pass additional information to Forge. It contains text boxes where you can add, modify, or delete key/value pairs. Pass throughs are required for ODBC, fixed-width, delimited, JDBC, Custom, and Exchange adapters.

Comment

Optional. Provides a way to associate comments with a pipeline component.