You add an Endeca Record File data source to CAS Console by specifying one or more Endeca record files and a record ID property. Valid file types are .xml, .xml.gz, .bin, .bin.gz, .binary, and .binary.gz. Wildcard characters may be used in the file name to specify multiple files in a given directory, but they cannot be used in the directory portion of the path to specify multiple directories.
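The file type and wildcard rules above can be sketched as a small pre-check on a candidate path. This is an illustrative sketch only, not part of CAS: the function name `check_input_path` is hypothetical, the wildcard characters are assumed to be `*` and `?`, and file extensions are assumed to be compared without regard to case.

```python
import ntpath

# File types listed as valid for an Endeca Record File data source.
VALID_SUFFIXES = (".xml", ".xml.gz", ".bin", ".bin.gz", ".binary", ".binary.gz")

# Assumption: these are the wildcard characters in question.
WILDCARD_CHARS = set("*?")


def check_input_path(path):
    """Return a list of problems found in a candidate input path (empty if none)."""
    problems = []
    # ntpath.split accepts both backslash and forward-slash separators.
    directory, filename = ntpath.split(path)
    # Wildcards are allowed in the file name only, not in the directory part.
    if any(ch in WILDCARD_CHARS for ch in directory):
        problems.append("wildcard in directory part")
    # Assumption: the extension check ignores case.
    if not filename.lower().endswith(VALID_SUFFIXES):
        problems.append("unsupported file type")
    return problems
```

With the hypothetical path `/data/records/*.xml` the function returns an empty list, while `/data/*/records.xml` is reported as having a wildcard in the directory part.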
To add a new Endeca Record File data source:
Select Endeca Record File from the list and click Add.
The Data Source tab displays.
In Name, specify a unique name for the data source to distinguish it from others in the CAS Console.
A data source name may contain only alphanumeric characters, underscores, dashes, and periods. All other characters are invalid.
In Path to Input File(s), specify an absolute path to the files you want to crawl.
Wildcards may be used in the filename but not in the path preceding the filename.
Examples of local folders on Windows:
Examples of syntax for network drives:
In Record Id Property, specify the name of the source property that you want to map to the record ID property in the generated records.
The value of this property must be unique for each record across all files being crawled.
The data source displays Acquisition Steps where you can add manipulators, revise the data source configuration if necessary, or start acquiring data from the data source.
At this point, you can add manipulators, acquire data from the data source, and monitor its status.
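The naming rule and the record ID uniqueness requirement from the steps above can be sketched as checks to run before configuring the data source. This is a minimal sketch under stated assumptions, not CAS behavior: both function names are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, an empty name is assumed to be invalid, and records are assumed to be already parsed into dictionaries keyed by property name (reading the record files themselves is out of scope here).

```python
import re
from collections import defaultdict

# Assumption: "alphanumeric" means ASCII letters and digits.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_data_source_name(name):
    """True if the name uses only letters, digits, underscores, dashes, and periods."""
    return NAME_PATTERN.fullmatch(name) is not None


def find_duplicate_ids(records_by_file, id_property):
    """Map each record ID value seen more than once to the files it appears in.

    records_by_file maps a file name to a list of record dictionaries.
    A record that lacks id_property raises KeyError.
    """
    seen = defaultdict(list)
    for filename, records in records_by_file.items():
        for record in records:
            seen[record[id_property]].append(filename)
    return {rid: files for rid, files in seen.items() if len(files) > 1}


# Hypothetical sample data: "sku" stands in for the source property
# entered in Record Id Property.
sample = {
    "a.xml": [{"sku": "1001"}, {"sku": "1002"}],
    "b.xml": [{"sku": "1002"}, {"sku": "1003"}],
}
duplicates = find_duplicate_ids(sample, "sku")
```

In the sample data, the value `1002` appears in both files, so `duplicates` flags it together with the two file names. The name check does not test whether the name is already in use by another data source in CAS Console.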