The Bulk Add/Replace Records component adds new records or replaces existing records in an Endeca data domain.
The default behavior of the Bulk Load Interface is to force a merge to a single generation at the end of every bulk-load ingest operation. This behavior is intended to maximize query performance at the end of a single, large, homogeneous data update that would occur during a regularly scheduled update window.
The input metadata schema for the Bulk Add/Replace Records component is not fixed. Each metadata field represents a property on a data domain record.
The metadata type of the Integrator ETL field (as shown in the Edit Metadata dialog on the edge connecting to the connector) translates to the mdex property type. For example, the Integrator ETL integer data type translates to the mdex:int data type. Note that you must override this behavior to support Integrator ETL non-native types (such as mdex:duration, mdex:time, and mdex:geocode). For details, see Creating mdexType Custom properties.
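As a rough illustration, the default translation can be pictured as a lookup table. Only the integer to mdex:int pair is stated above; every other pair in this sketch, and the names `DEFAULT_MDEX_TYPES`, `default_mdex_type`, and `needs_override`, are assumptions made for illustration and should be checked against the product documentation.

```python
# Illustrative lookup of Integrator ETL field types to mdex property types.
# Only "integer" -> "mdex:int" comes from the text above; every other pair
# is an assumption for this sketch.
DEFAULT_MDEX_TYPES = {
    "integer": "mdex:int",      # stated in the documentation
    "long": "mdex:long",        # assumed
    "number": "mdex:double",    # assumed
    "boolean": "mdex:boolean",  # assumed
    "string": "mdex:string",    # assumed
    "date": "mdex:dateTime",    # assumed
}

# Non-native types named in the text above; these require an override.
NON_NATIVE_MDEX_TYPES = {"mdex:duration", "mdex:time", "mdex:geocode"}


def default_mdex_type(etl_type: str) -> str:
    """Return the mdex type a field of the given ETL type maps to by default."""
    try:
        return DEFAULT_MDEX_TYPES[etl_type]
    except KeyError:
        raise ValueError(f"no default mdex type for ETL type {etl_type!r}")


def needs_override(target_mdex_type: str) -> bool:
    """True when the target type is one of the non-native types listed above."""
    return target_mdex_type in NON_NATIVE_MDEX_TYPES
```

For example, `default_mdex_type("integer")` returns `"mdex:int"`, while `needs_override("mdex:geocode")` returns `True`.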
Multi-value input fields can be defined using either the multi-assign delimiter or a list data type. See Processing multi-value data for additional details.
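The delimiter approach can be sketched as follows. This is a minimal illustration that assumes the delimiter is the Unicode DELETE character (U+007F), which the configuration table lists as the default; the helper names `join_values` and `split_values` are hypothetical and are not part of the product.

```python
# Default multi-assign delimiter per the configuration table: Unicode DELETE (U+007F).
MULTI_ASSIGN_DELIMITER = "\u007f"


def join_values(values, delimiter=MULTI_ASSIGN_DELIMITER):
    """Pack several values into one field string for a multi-assign property."""
    for value in values:
        if delimiter in value:
            # A value containing the delimiter would be split incorrectly later.
            raise ValueError("value contains the multi-assign delimiter")
    return delimiter.join(values)


def split_values(field, delimiter=MULTI_ASSIGN_DELIMITER):
    """Recover the individual values from a delimited field string."""
    return field.split(delimiter) if field else []
```

A source field built with `join_values(["red", "green", "blue"])` carries three assignments for the same property, and `split_values` recovers them in order.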
Use the Bulk Add/Replace Records component to load data in bulk when it is acceptable to delay the visibility of the updates and for query processing to stop while data is loaded.
The following table describes the configuration properties available for the Bulk Add/Replace Records component.
Name | Description | Valid Values | Example |
---|---|---|---|
Endeca Server Host | Identifies the machine on which the Endeca Server is running. | The name or IP address of the machine. You can use localhost. | MyEndecaServer, 255.255.255.0 |
Endeca Server Port | Identifies the port on which the Endeca Server is listening. | Valid ports. The default Endeca Server port is 7001, but it can be changed to another port. | 7001 |
Endeca Server Context Root | Identifies the WebLogic application root context of the Endeca Server. | Valid root context names in WebLogic | /endeca-server |
Data Domain Name | Name of the data domain that will be modified. The data domain should be running when the graph containing the connector is run. | Valid data domain names | quickstart |
Spec Attribute | Specifies the primary key (record spec) for the records that will be uploaded. | Name of the primary key attribute. If the primary key does not exist in the data domain, the property is created automatically with system default values. Note: When specifying the Spec Attribute in the Bulk Add/Replace Records component, you should not prepend the collection key. When you check the Prefix Attributes With Collection Key box, the collection key will be prepended automatically. If you prepend the collection key when specifying the Spec Attribute, it will be prepended twice. | FactSales_OrderNumber |
Collection key | Required. Specifies the key of the collection to which the data should be loaded. | Alphanumeric characters, but must begin with either a letter or an underscore. | New Collection |
Prefix Attributes With Collection Key | Specifies whether to prepend attribute names with the name of the collection when uploading data to the data domain. The underscore character is used as a separator between the collection key and the attribute name: "collectionkey_attributename". | Checked (True; the collection key will be prepended to attribute names when uploading data to the data domain) or Unchecked (False; the attribute name is not modified when uploading data to the data domain; default) | |
Post Ingest Query Optimization | Specifies whether to merge records immediately after the ingest operation is complete or to use the standard background merge process. | Checked (True; default) or Unchecked (False) | |
Post Ingest Dictionary Update | Specifies whether to update the spelling dictionary automatically immediately after the ingest operation is complete. If the dictionary is not updated when the ingest operation is complete, you must issue the update command manually using web services. | Checked (True; default) or Unchecked (False) | |
SSL Enabled | Enables or disables SSL for the component. SSL should only be enabled when the Endeca Server to which you are connecting has SSL enabled. | Checked (True) or Unchecked (False) | |
Stop after this many errors | Specifies the maximum number of ingest errors allowed in a single load operation. If this number of errors occurs, the ingest operation is terminated. | Either 0 (no errors are allowed) or a positive integer. | 0, 15 |
Multi-assign delimiter | Sets the character that separates multi-assign values in a property in a source record. Keep in mind that this delimiter is different from the delimiter that separates property fields on the source record. See also Multi-assign delimiter. | A single character that is the multi-assign delimiter. The default is the Unicode DELETE character (\U007F). You do not have to use this field if your data does not include multi-assign properties. | |
Timeout (ms) | Specifies the timeout of operations of the component. If timeouts occur when running graphs, operations on the Endeca Server may be taking too long. Change the value of this parameter to allow more time for operations on the Endeca Server. The default configures a one minute timeout. | Integers | 60000 |
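The interaction between the Spec Attribute and the Prefix Attributes With Collection Key option can be sketched in a few lines. This is an illustration only: the function `prefixed_name` and the collection key `Sales` are hypothetical, and the sketch models nothing more than the "collectionkey_attributename" naming rule described in the table.

```python
def prefixed_name(collection_key: str, attribute: str, prefix_enabled: bool) -> str:
    """Return the attribute name as it would be loaded into the data domain.

    Models only the naming rule from the table: when prefixing is enabled,
    the collection key and an underscore are prepended to the attribute name.
    """
    if not prefix_enabled:
        return attribute
    return f"{collection_key}_{attribute}"


# Spec Attribute given without the collection key: prefixed once, as intended.
print(prefixed_name("Sales", "OrderNumber", True))        # Sales_OrderNumber

# Spec Attribute already carrying the collection key: prefixed twice.
print(prefixed_name("Sales", "Sales_OrderNumber", True))  # Sales_Sales_OrderNumber
```

The second call shows why the Spec Attribute should be entered without the collection key when the prefix box is checked.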
When a bulk load ingest operation is terminated because of an error, records that were ingested before the error should be included in the data domain. Although the data domain may accept queries on the ingested records, you should consider the data domain to be in an inconsistent state. To restore a consistent state, review the logs to determine the problems that caused the bulk load operation to fail, correct these problems, then reload the data.
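The following simulation is a sketch of why a terminated load leaves a partially updated data domain. It does not call the Endeca Server; `simulate_ingest` and its arguments are hypothetical, and it assumes that the load stops as soon as the error count exceeds the Stop after this many errors value (so 0 stops on the first error). The product's exact threshold behavior may differ.

```python
def simulate_ingest(records, is_valid, max_errors):
    """Simulate a load and return (ingested, rejected_count, terminated).

    Assumption for this sketch: the load is terminated as soon as the number
    of rejected records exceeds max_errors.
    """
    ingested = []
    rejected = 0
    for record in records:
        if is_valid(record):
            ingested.append(record)
            continue
        rejected += 1
        if rejected > max_errors:
            # Records ingested before the failure stay in place, and the
            # remaining records are never processed: a partial load.
            return ingested, rejected, True
    return ingested, rejected, False


records = [1, 2, -1, 3]
print(simulate_ingest(records, lambda r: r > 0, max_errors=0))  # ([1, 2], 1, True)
print(simulate_ingest(records, lambda r: r > 0, max_errors=1))  # ([1, 2, 3], 1, False)
```

In the first call the load stops at the bad record, leaving records 1 and 2 loaded and record 3 missing, which is the inconsistent state described above.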
Port Field Name | Data Type | Description | Example |
---|---|---|---|
Records added | Long | Number of records added to the data domain | 984341 |
Records Queued | Long | Number of records queued for processing but not processed | 1568 |
Records Rejected | Long | Number of records submitted that were not added to the data domain | 24836 |
State | String | Data domain status string returned by the Bulk Load API | |
Port Field Name | Data Type | Description | Example |
---|---|---|---|
Fault Message | String | Error message returned by the Endeca Server | |