4.1.1.13 Creating a Connection to Microsoft Azure Data Lake-Gen2

To create a connection to Microsoft Azure Data Lake-Gen2:
  1. On the Catalog page, click Create New Item.
  2. Hover the mouse over Connection and select HDFS from the submenu.
  3. On the Type Properties screen, enter the following details:
    • Name: Enter a unique name for the connection. This is a mandatory field.
    • Display Name: Enter a display name for the connection. If you leave this field blank, the value of the Name field is used.
    • Description
    • Tags
    • Connection Type: The selected connection is displayed.
  4. Click Next.
  5. On the Connection Details screen, enter the following details:
    • core-site.xml: Upload the core-site.xml file that defines the fs.defaultFS, fs.default.name, hadoop.security.authentication, and fs.AbstractFileSystem.hdfs.impl properties.

      Note:

      Download the core-site.xml and hdfs-site.xml files from the Azure website.
    • hdfs-site.xml: Set client-specific properties. This is an optional field.
    • Use Kerberos: Select this option if the cluster is kerberized. Selecting it enables the Kerberos principal and Kerberos keytab fields. Provide the following details:
      • Kerberos KDC
      • Kerberos Realm
      • Kerberos Principal: Set the HDFS service principal.
      • Kerberos Key Tab: Upload the keytab file that contains the HDFS service principal.
  6. Click Test Connection to verify that the connection works and to download the third-party libraries required to connect to Azure Data Lake-Gen2. Targets also use these libraries to write to HDFS paths.

    Note:

    Retain the core-site.xml and hdfs-site.xml file names exactly as they are.
  7. Click Save.
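
Before uploading core-site.xml in step 5, it can help to confirm that the file actually defines the four properties listed there. The following is a minimal sketch in Python, not part of the product. It assumes the file uses the standard Hadoop configuration layout (a configuration root element containing property elements, each with a name and a value). The function names and the placeholder values in the sample are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Property names taken from the core-site.xml bullet in step 5.
REQUIRED_PROPERTIES = (
    "fs.defaultFS",
    "fs.default.name",
    "hadoop.security.authentication",
    "fs.AbstractFileSystem.hdfs.impl",
)

# Illustrative sample only: every value is a placeholder, not a real setting.
SAMPLE_CORE_SITE = """
<configuration>
  <property><name>fs.defaultFS</name><value>PLACEHOLDER</value></property>
  <property><name>fs.default.name</name><value>PLACEHOLDER</value></property>
  <property><name>hadoop.security.authentication</name><value>PLACEHOLDER</value></property>
  <property><name>fs.AbstractFileSystem.hdfs.impl</name><value>PLACEHOLDER</value></property>
</configuration>
"""


def read_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            properties[name.strip()] = (prop.findtext("value") or "").strip()
    return properties


def missing_properties(xml_text, required=REQUIRED_PROPERTIES):
    """List the required names that are absent or have an empty value."""
    found = read_properties(xml_text)
    return [name for name in required if not found.get(name)]


if __name__ == "__main__":
    missing = missing_properties(SAMPLE_CORE_SITE)
    if missing:
        print("Missing or empty properties:", ", ".join(missing))
    else:
        print("All required properties are present.")
```

To check a real file, read its contents and pass the text to missing_properties. The check only confirms that each property is present and non-empty; it does not validate the values, which Test Connection in step 6 exercises.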