4.2.1.4 Creating a Kafka Stream

Prerequisite: A Kafka connection.

To create a Kafka stream:

  1. On the Catalog page, click Create New Item.
  2. Hover the mouse over Stream and select Kafka from the submenu.
  3. On the Type Properties screen, enter the following details:
    • Name: Enter a unique name for the stream. This is a mandatory field.
    • Display Name: Enter a display name for the stream. If left blank, the Name field value is copied.
    • Description
    • Tags
    • Stream Type: Displays the stream type you selected (Kafka).
  4. Click Next.
  5. On the Source Details screen, enter the following details:
    • Connection: Select a Kafka connection for the stream.

    • Topic name: Enter a name for the Kafka topic that will store the stream.

    • Data Format: Select CSV, JSON, or AVRO as the data format for the stream. The format you select determines the options shown on the Data Format screen.
  6. Click Next.
  7. On the Data Format screen, enter the shape details for the stream, based on the data format you have selected.
    • For JSON:
      • Allow Missing Column Names: Select this option to allow an input stream that contains a column that is not defined in the shape.
    • For CSV:
      • CSV Predefined Format: Select one of the predefined data formats from the drop-down list. For more information, see Predefined CSV Data Formats.
      • First record as header: Select this option to use the first record as the header row.
    • For AVRO:
      • Schema Namespace: Enter the schema name combined with the namespace, to uniquely identify the schema within the store.
      • Schema (optional): Upload a schema file from which to infer the shape.
  8. Click Next.
  9. On the Shape screen, select one of the methods to define the shape:
    • Infer Shape: Select this option to detect the shape automatically from the input data stream.

    • Select Existing Shape: Select one of the existing shapes from the drop-down list.

    • Manual Shape: Select this option to define the fields manually, based on a stream or file. You can also update the data type of the fields.

      Note:

      • To retrieve the entire JSON payload, add a new field with path $.
      • To retrieve the content of the array, add a new field with path $[arrayField].

      In both cases, the value returned is of type Text.

    • From Stream: Select this option to detect the shape based on the earliest or the latest offset of the Kafka topic. The default is earliest. Use latest to infer the shape from the most recent records in the Kafka topic.

      This option is currently available only for JSON data format.

    • From File: Select this option to infer the shape from Kafka, a JSON schema file, or a JSON or CSV data file. You can also save the auto-detected shape and use it later.

      This option is enabled if you have selected CSV as the data format.

    • From Schema: Select this option to infer the shape based on the schema you uploaded in Step 7. This option is enabled if you have selected AVRO as the data format.
  10. Click Save.
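
The note under Manual Shape in Step 9 describes two special paths, $ and $[arrayField], that return their result as Text. The following Python sketch only illustrates that behavior with the standard json module. It is not the product's implementation. The sample record, its field names, and the helper field_as_text are hypothetical.

```python
import json

# Hypothetical sample record; the field names are illustrative only.
record = (
    '{"orderId": 17, "customer": "acme", '
    '"items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]}'
)


def field_as_text(raw: str, path: str) -> str:
    """Return the JSON text selected by a '$' or '$[name]' style path."""
    payload = json.loads(raw)
    if path == "$":
        # '$' selects the entire payload.
        selected = payload
    elif path.startswith("$[") and path.endswith("]"):
        # '$[name]' selects the content of the named top-level field.
        selected = payload[path[2:-1]]
    else:
        raise ValueError(f"unsupported path: {path}")
    # Serialize back to a string, mirroring the Text type of the result.
    return json.dumps(selected, separators=(",", ":"))


whole = field_as_text(record, "$")
items = field_as_text(record, "$[items]")
print(whole)
print(items)
```

In this sketch both results are strings, so any downstream logic that needs the individual array elements would have to parse the text again.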
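
The First record as header option in Step 7 changes how the first CSV record is interpreted. The following Python sketch shows the general idea with the standard csv module, assuming that without the option the first record is treated as ordinary data. The sample data, the placeholder column names, and the helper read_rows are hypothetical and do not reflect how the product names columns.

```python
import csv
import io

# Hypothetical CSV sample; the values are illustrative only.
sample = "orderId,customer,amount\n17,acme,12.50\n18,globex,7.25\n"


def read_rows(text, first_record_as_header, field_names=None):
    """Parse CSV text into dictionaries keyed by column name."""
    rows = list(csv.reader(io.StringIO(text)))
    if first_record_as_header:
        # The first record supplies the column names and is not data.
        header, data = rows[0], rows[1:]
    else:
        # Every record is data; column names must come from elsewhere.
        header, data = list(field_names), rows
    return [dict(zip(header, row)) for row in data]


with_header = read_rows(sample, first_record_as_header=True)
without_header = read_rows(
    sample, first_record_as_header=False, field_names=["c1", "c2", "c3"]
)
print(len(with_header), with_header[0])
print(len(without_header), without_header[0])
```

With the option enabled the sample yields two data records. With it disabled the header line is read as a third data record.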
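
For the AVRO options in Steps 7 and 9, the following Python sketch shows how a schema name and namespace combine into a single identifier, and how field names and types can be read from a schema file. It assumes that the Schema Namespace value corresponds to the Avro full name (namespace, a dot, then the record name), which this procedure does not state explicitly. The schema contents and the helper full_name are hypothetical.

```python
import json

# Hypothetical schema file contents; the names are illustrative only.
schema_text = """
{
  "type": "record",
  "name": "Order",
  "namespace": "com.example.sales",
  "fields": [
    {"name": "orderId", "type": "long"},
    {"name": "customer", "type": "string"},
    {"name": "amount", "type": "double"}
  ]
}
"""


def full_name(schema: dict) -> str:
    """Combine namespace and name following the Avro full-name rule."""
    name = schema["name"]
    namespace = schema.get("namespace", "")
    # A name that already contains a dot is treated as a full name.
    if "." in name or not namespace:
        return name
    return f"{namespace}.{name}"


schema = json.loads(schema_text)
schema_full_name = full_name(schema)
# A simple shape: one (field name, Avro type) pair per schema field.
shape = [(field["name"], field["type"]) for field in schema["fields"]]

print(schema_full_name)
for field_name, avro_type in shape:
    print(f"{field_name}: {avro_type}")
```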