Specify a New Data Indicator for a Data Source

To configure incremental processing in a data flow, you select the column in the data source that the flow uses as its new data indicator. The data flow uses this indicator to detect data added since the last time the flow was executed. For example, you might select a timestamp column.

Before you start, create a connection to one of the supported databases, for example, Oracle, Oracle Autonomous Data Warehouse, Apache Hive, Hortonworks Hive, or MapR Hive.
  1. On the home page, click Navigator, then click Data.
  2. Hover over a dataset, click Actions (Actions menu ellipsis icon), then select Open.
  3. In the Join Diagram, double-click the table that includes the column you'd like to use as the new data indicator.
  4. Click Edit Definition.
  5. If the data access panel isn't displayed, go to the center of the right edge of the window to locate the Expand option, then click Expand.

    You can now view the caching options and the Flow New Data Indicator field under Advanced.

  6. In the Flow New Data Indicator field, select the column that indicates when new data has been added.
  7. Click OK.
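
To illustrate why a timestamp column works well as a new data indicator, the following Python sketch shows the general idea of incremental processing with a stored high-water mark. It is only a conceptual illustration and does not show how the product implements the feature. The table name orders, the column name updated_at, and the function fetch_new_rows are hypothetical, and an in-memory SQLite database stands in for the real data source.

```python
import sqlite3


def fetch_new_rows(conn, last_seen):
    """Return rows newer than last_seen and the updated high-water mark.

    Assumption: 'orders' and 'updated_at' are hypothetical names; 'updated_at'
    plays the role of the new data indicator column.
    """
    cur = conn.execute(
        "SELECT id, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    rows = cur.fetchall()
    # Keep the old mark if nothing new arrived since the previous run.
    new_mark = rows[-1][1] if rows else last_seen
    return rows, new_mark


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "2024-01-01T09:00:00"), (2, "2024-01-02T09:00:00")],
    )

    # First run: an empty mark sorts before every timestamp, so all rows qualify.
    first, mark = fetch_new_rows(conn, "")

    # A row arrives between runs.
    conn.execute("INSERT INTO orders VALUES (3, '2024-01-03T09:00:00')")

    # Second run: only the row added after the stored mark is returned.
    second, mark = fetch_new_rows(conn, mark)

    # Third run: nothing new, so the result is empty and the mark is unchanged.
    third, mark = fetch_new_rows(conn, mark)
    return first, second, third, mark
```

A column whose values only increase as rows are added, such as a load timestamp, suits this pattern because each run can compare against the largest value seen previously.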