3.1.2.1 Configuring Pipeline Preferences
- Click the user name in the top-right corner of the screen and select System Settings from the drop-down list.
- Click Pipelines and set the following configurations:
- Batch Duration: Set the default duration of the batch for each pipeline.
- Executor Count: Set the default number of executors per pipeline.
- Cores per Executor: Set the default number of cores per executor. A minimum value of 2 is required.
- Executor Memory: Set the default memory allocated to each executor instance, in megabytes.
- Cores per Driver: Set the default number of cores per driver. A minimum value of 1 is required.
- Driver Memory: Set the default memory allocated to each driver instance, in megabytes.
- Log Level: Select a log level for unpublished pipelines from the drop-down list.
Note:
Reset the default log level of draft pipelines to WARNING, and of published pipelines to ERROR.
- Check Pointing (High Availability): Turn this option on or off to enable or disable checkpointing by default for each pipeline.
- Pipeline Topic Retention: Set the retention period for intermediate stage topics, in milliseconds. The default is one hour (3,600,000 ms).
- Enable Pipeline Topics: Select this option to enable the creation of intermediate Kafka topics. It is selected by default.
- Input Topics Offset: Select the Kafka topic offset value from the drop-down list. The default value is latest.
Note:
When you publish the pipeline for the first time, the input stream is read from the offset value selected in this drop-down list. On subsequent publishes, the value selected here is ignored and the input stream is read from where it last left off.
- Reset Offset: Select this option to read the input stream based on the offset value selected in the Input Topics Offset drop-down list.
Note:
If the pipeline uses two Kafka streams as input, the offset is not preserved and the pipeline starts from the current timestamp. With a single stream, the offset is maintained and the pipeline can resume reading from its previous state.
- Datastream Reset Offset: Enable resetting offsets for data streams. Use this option to choose the starting point for ingesting data from the stream. Select a value from the drop-down list:
- Now: Select this option for the pipeline to start ingesting the stream data from the current time.
- Earliest: Select this option for the pipeline to start ingesting the earliest available data.
- Timestamp: Provide a specific UTC timestamp to start ingesting data from the stream.
- LCR: Set the Log Change Record (LCR) position for the pipeline to start ingesting data from the stream.
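
The numeric defaults above carry two stated minimums (2 cores per executor, 1 core per driver) and one stated default (a one-hour topic retention, expressed in milliseconds). As a rough illustration only, the following Python sketch checks a set of values against those rules before they are entered in the dialog. The class name, field names, and the "must be positive" check are assumptions made for this example and are not part of the product.

```python
from dataclasses import dataclass
from typing import List

MIN_CORES_PER_EXECUTOR = 2  # minimum stated for Cores per Executor
MIN_CORES_PER_DRIVER = 1    # minimum stated for Cores per Driver


@dataclass
class PipelineDefaults:
    """Hypothetical container mirroring the settings listed above."""
    batch_duration: int       # unit is not stated in the text
    executor_count: int
    cores_per_executor: int
    executor_memory_mb: int
    cores_per_driver: int
    driver_memory_mb: int
    topic_retention_ms: int = 60 * 60 * 1000  # one hour, the stated default


def validate(defaults: PipelineDefaults) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if defaults.cores_per_executor < MIN_CORES_PER_EXECUTOR:
        problems.append(
            f"cores_per_executor must be at least {MIN_CORES_PER_EXECUTOR}")
    if defaults.cores_per_driver < MIN_CORES_PER_DRIVER:
        problems.append(
            f"cores_per_driver must be at least {MIN_CORES_PER_DRIVER}")
    # Assumption: the remaining numeric settings only make sense when positive.
    for name in ("batch_duration", "executor_count", "executor_memory_mb",
                 "driver_memory_mb", "topic_retention_ms"):
        if getattr(defaults, name) <= 0:
            problems.append(f"{name} must be positive")
    return problems
```

Calling `validate` on a candidate set of values returns the list of violated rules, so an empty list indicates that the values satisfy the minimums described above.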
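
The notes on Input Topics Offset and Reset Offset describe a small decision rule for where a Kafka-backed pipeline starts reading. The sketch below is one possible reading of that rule, written as a plain Python function. The function name, parameter names, and returned labels are hypothetical, and the precedence between the conditions (in particular, how the two-stream case interacts with a first publish or a reset) is an assumption, since the text does not spell it out.

```python
def resolve_start_position(selected_offset: str,
                           first_publish: bool,
                           reset_offset: bool,
                           kafka_input_count: int) -> str:
    """Return a label describing where the input stream is read from.

    selected_offset is the Input Topics Offset value, for example
    "latest" (the stated default).
    """
    # First publish, or Reset Offset selected: the chosen offset applies.
    if first_publish or reset_offset:
        return selected_offset
    # Republishing with two Kafka input streams: the offset is not
    # preserved, so reading starts from the current timestamp.
    if kafka_input_count >= 2:
        return "current-timestamp"
    # Republishing with a single stream: continue where reading stopped.
    return "resume-from-last-position"
```

For example, republishing a single-stream pipeline without Reset Offset yields `"resume-from-last-position"`, while selecting Reset Offset returns whatever value was chosen in the Input Topics Offset list.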