Basic Pipeline Development
This part contains the following sections:
Guided Search Platform Services Forge Guide
The Guided Search ITL
Introduction to the Guided Search ITL
Content Acquisition System
Guided Search Data Foundry
Guided Search ITL components
Data Foundry programs
Configuration files
Pipeline
Dimension hierarchy
Index configuration
Guided Search ITL Development
Guided Search ITL development process
Guided Search tools suite
Developer Studio
Workbench
About system provisioning tasks in Workbench
About system operations tasks in Workbench
Finding more information about tools setup and usage
About controlling your environment
About using the Endeca Application Controller
Application Controller architecture
Ways of communicating with the Endeca Application Controller
About using Workbench to communicate with the EAC Central Server
A closer look at data processing and indexing
Data processing
Source data
About loading source data
Standardizing source records
About mapping source properties and property values
About writing out tagged data
About indexing
Overview of Source Property Mapping
About source property mapping
About using a single property mapper
About using explicit mapping
Minimum configuration
About mapping unwanted properties
About removing source properties after mapping
Types of source property mapping
Priority order of source property mapping
About adding a property mapper
Determining where to add the property mapper
Creating the property mapper
The Mappings editor
Creating new source mappings
Using null mappings to override implicit and default mappings
About assigning multiple mappings
Match Modes
About choosing a match mode for dimensions
Normal mode
Must Match mode
Auto Generate mode
Rules of thumb for dimension mapping
Dimension mapping example
Wine_Type dimension
Country dimension
Body dimension
Advanced Mapping Techniques
The Property Mapper editor Advanced tab
About enabling implicit mapping
Enabling default mapping
About the default maximum length for source property values
About overriding the default maximum length setting
Before Building Your Instance Configuration
Endeca Application Controller directory structure
Pipeline overview
About adding and editing pipeline components
About creating a data flow using component names
URLs in the pipeline
About Creating a Basic Pipeline
The Basic Pipeline template
Record adapters
About the Record Index tab
Dimension adapter
Dimension server
Property mapper
Indexer adapter
About Running Your Basic Pipeline
Running a pipeline
Viewing pipeline results in a UI reference implementation
After Your Basic Pipeline Is Running
Additional tasks
About source property mapping
Adding and mapping Guided Search properties
Adding and mapping dimensions
About synonyms
About null mappings
Setting the record specifier property
About specifying dimensions and dimension value order
Additional pipeline components
Additional index configuration options