To implement a join, add the join and the records it will process to your pipeline, and then configure the join.
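Implementing a join is a three-step process:
Add a record cache for each record source that will feed the join.
Add a record assembler to your pipeline.
Configure the join in the record assembler.
Each step is described in the following sections.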
Use the options in the Record Cache editor to add and configure a record cache for each of your record sources.
To add a record cache for each record source that will feed the join:
In the Pipeline Diagram editor, click New, and then choose → .
The Record Cache editor appears.
In the Name text box, type a unique name for this record cache.
(Optional) In the General tab, you may do the following:
If the cache should load fewer than the total number of records from the record source, type the number of records to load in the Maximum Records text box. This feature is provided for testing purposes.
If you want to merge records with equivalent record index key values into a single record, check the Combine Records option. For one-to-many or many-to-many joins, leave Combine Records unchecked.
In the Sources tab, select a record source and, optionally, a dimension source.
If a component's record index contains dimension values, you must provide a dimension source. Generally, this is only the case if you are caching data that has been previously processed by Forge.
(Optional) In the Comment tab, add a comment for the component.
Repeat these steps for all record sources that will be part of the join.
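The effect of the Maximum Records and Combine Records options can be sketched in a few lines of Python. This is only an illustrative model, not Forge's implementation: the function name build_record_cache, the representation of a record as a dictionary of property names to lists of values, and the choice to drop repeated values when merging are all assumptions made for the example.

```python
def build_record_cache(records, key, max_records=None, combine_records=False):
    """Hypothetical model of a record cache keyed on one record index property."""
    # Maximum Records: load only the first N records (intended for testing).
    if max_records is not None:
        records = records[:max_records]

    cache = {}  # key value -> list of records that share that key value
    for record in records:
        key_value = tuple(record[key])
        bucket = cache.setdefault(key_value, [])
        if combine_records and bucket:
            # Combine Records: fold this record into the one already cached.
            merged = bucket[0]
            for prop, values in record.items():
                target = merged.setdefault(prop, [])
                for value in values:
                    if value not in target:  # assumption: repeated values are dropped
                        target.append(value)
        else:
            # Copy the record so the caller's data is never modified.
            bucket.append({prop: list(values) for prop, values in record.items()})
    return cache


rows = [
    {"Id": ["1"], "Color": ["red"]},
    {"Id": ["1"], "Color": ["blue"]},
    {"Id": ["2"], "Color": ["green"]},
]
combined = build_record_cache(rows, "Id", combine_records=True)
separate = build_record_cache(rows, "Id")
print(combined[("1",)])       # one merged record for Id 1
print(len(separate[("1",)]))  # two separate records for Id 1
```

In this model, leaving Combine Records unchecked keeps both records for Id 1 available, which is what a one-to-many or many-to-many join needs.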
Use the Record Assembler editor to add and configure a new record assembler for your pipeline.
To add a record assembler to your pipeline:
In the Pipeline Diagram editor, click New, and then choose → .
The Record Assembler editor appears.
In the Name text box, type a unique name for the new record assembler.
In the Sources tab, do the following:
In the Record Sources list, select a record source and click Add. Repeat as necessary to add additional record sources.
With two exceptions, record assemblers must use record caches as their source of record data.
In the Dimension Source list, select a dimension source.
If the key on which a join is performed contains dimension values, you must provide a dimension source. Generally, this is only the case if you are joining data that has already been processed once by Forge.
(Optional) In the Record Index tab, do the following:
Specify which properties or dimensions you want to use as the record index for this component.
An assembler's record index does not affect the join; it only affects the order in which downstream components retrieve records from the assembler.
Indicate whether you want to discard records with duplicate keys.
(Optional) In the Comment tab, add a comment for the component.
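The role of the assembler's record index can be pictured with a short Python sketch. It is a simplified model rather than Forge's behavior: the function emit_in_index_order, the dictionary-of-lists record layout, and the rule that the first record with a given key is the one kept when duplicates are discarded are all assumptions made for illustration. The point it shows is that the index is applied to records that have already been joined, so it changes their order (and optionally removes duplicates) without changing how they were matched.

```python
def emit_in_index_order(joined_records, index_props, discard_duplicates=False):
    """Hypothetical model: order already-joined records by a record index."""

    def index_key(record):
        # A record index may span several properties or dimensions.
        return tuple(tuple(record.get(prop, [])) for prop in index_props)

    # sorted() is stable, so records with equal keys keep their relative order.
    ordered = sorted(joined_records, key=index_key)
    if not discard_duplicates:
        return ordered

    seen = set()
    unique = []
    for record in ordered:
        k = index_key(record)
        if k not in seen:  # assumption: the first record with each key is kept
            seen.add(k)
            unique.append(record)
    return unique


joined = [
    {"Id": ["2"], "Name": ["b"]},
    {"Id": ["1"], "Name": ["a"]},
    {"Id": ["1"], "Name": ["c"]},
]
for record in emit_in_index_order(joined, ["Id"], discard_duplicates=True):
    print(record)
```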
Use the Record Assembler and Join Type editors to select and configure the type of join you want to perform.
To configure the join in the record assembler:
Use the Join Type list to select the kind of join you want to perform.
If you are performing a left join and the left record can be joined to more than one right record, check the Multi Sub-records option.
The Join Entries list represents the record sources that will participate in the join, as specified on the Sources tab. In the Join Entries list, define the order of your join entries by selecting an entry and clicking Up or Down.
For all joins, properties are processed from join sources in the order in which they appear in the list. The first entry is the Left entry for a left join.
To define the join key for a join entry, select the entry from the Join Entries list and click Edit.
The Join Entry editor appears.
The Key Component editor appears.
Using the steps below, create a join key that is identical to the record index key for the join entry you selected.
Repeat steps 5 through 7 for each record source that is participating in the join.
When you are done configuring your join, click OK to close the Record Assembler editor.
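As a rough illustration of what the configured join does, the following Python sketch models a left join over in-memory records. It is not Forge code. The function left_join, the dictionary-of-lists record layout, and in particular the behavior when Multi Sub-records is off (only the first matching right record is used) are assumptions made for the example. It does reflect the points in the procedure above: the first join entry is the left source, the remaining sources are processed in list order, and every source is matched on the same join key.

```python
def left_join(left_records, right_sources, key, multi_sub_records=False):
    """Hypothetical model of a left join on a single join key property."""
    results = []
    for left in left_records:
        # Every left record is emitted, whether or not anything matches it.
        combined = {prop: list(values) for prop, values in left.items()}
        key_value = tuple(left[key])

        # Right sources are processed in the order of the Join Entries list.
        for source in right_sources:
            matches = [r for r in source if tuple(r[key]) == key_value]
            if not multi_sub_records:
                matches = matches[:1]  # assumption: use only the first match
            for match in matches:
                for prop, values in match.items():
                    target = combined.setdefault(prop, [])
                    for value in values:
                        if value not in target:
                            target.append(value)
        results.append(combined)
    return results


products = [{"Id": ["1"], "Name": ["Shirt"]}, {"Id": ["2"], "Name": ["Hat"]}]
prices = [{"Id": ["1"], "Price": ["10"]}]
reviews = [{"Id": ["1"], "Review": ["good"]}, {"Id": ["1"], "Review": ["ok"]}]

for record in left_join(products, [prices, reviews], "Id", multi_sub_records=True):
    print(record)
```

In this model the record for Id 2 passes through unchanged because no right record matches it, and the record for Id 1 picks up both reviews only when multi_sub_records is enabled.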