Enabling Kerberos

BDD supports Kerberos 5+ to authenticate its communications with Hadoop. You can enable this for BDD to improve the security of your cluster and data.

Before you can configure Kerberos for BDD, you must install it on your Hadoop cluster. If your Hadoop cluster already uses Kerberos, you must enable it for BDD so it can access the Hive tables it requires.

To enable Kerberos:

  1. Install the kinit and kdestroy utilities on all BDD nodes.
  2. Create the following directories in HDFS:
    • /user/<bdd>, where <bdd> is the name of the bdd user.
    • /user/<HDFS_DP_USER_DIR>, where <HDFS_DP_USER_DIR> is the value of HDFS_DP_USER_DIR defined in bdd.conf.
    The owner of both directories must be the bdd user. Their group must be the HDFS super users group, which is defined by the dfs.permissions.supergroup configuration parameter. The default value is supergroup.
  3. Add the bdd user to the hdfs and hive groups on all BDD nodes.
  4. If you use HDP, add the bdd user's group to the hadoop.proxyuser.hive.groups property in core-site.xml.
    You can do this in Ambari.
  5. Create a principal for BDD.
    The primary component must be the name of the bdd user and the realm must be your default realm.
  6. Generate a keytab file for the BDD principal and move it to the Admin Server.
    The name and location of this file are arbitrary, because you pass this information to the bdd-admin script at runtime.
  7. Copy your krb5.conf file to the same location on all BDD nodes.
    The location is arbitrary, but the default is /etc.
  8. If your Dgraph databases are stored on HDFS, you must also enable Kerberos for the Dgraph. On the Admin Server, make a copy of $BDD_HOME/BDD_manager/conf/bdd.conf and edit the following properties in the copy:
    Property                          Description
    KERBEROS_TICKET_REFRESH_INTERVAL  The interval (in minutes) at which the Dgraph's Kerberos ticket is refreshed. For example, if set to 60, the ticket is refreshed every 60 minutes (every hour).
    KERBEROS_TICKET_LIFETIME          The amount of time that the Dgraph's Kerberos ticket is valid. This must be given as a number followed by a supported unit of time: s, m, h, or d. For example, 10h (10 hours) or 10m (10 minutes).
    Then go to $BDD_HOME/BDD_manager/bin and run:
    ./bdd-admin.sh publish-config <path>
    Where <path> is the absolute path to the modified copy of bdd.conf.
  9. Go to $BDD_HOME/BDD_manager/bin and run:
    ./bdd-admin.sh publish-config kerberos on -k <krb5> -t <keytab> -p <principal>
    Where:
    • <krb5> is the absolute path to krb5.conf on all BDD nodes
    • <keytab> is the absolute path to the BDD keytab file on the Admin Server
    • <principal> is the BDD principal
    The script updates BDD's configuration files with the name of the principal and the location of the krb5.conf file. It also renames the keytab file to bdd.keytab and distributes it to $BDD_HOME/common/kerberos on all BDD nodes.
  10. If you use HDP, publish the change you made to core-site.xml:
    ./bdd-admin.sh publish-config hadoop
  11. Restart your cluster for the changes to take effect:
    ./bdd-admin.sh restart [-t <minutes>]
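
The preparation in steps 2, 5, and 6 can be sketched as shell commands. The sketch below only prints the commands so they can be reviewed before anything is run. It assumes an MIT Kerberos KDC administered with the kadmin client (other KDCs use different tools), and every value at the top (user name, directory name, realm, keytab path) is a placeholder chosen for illustration, not a value taken from your installation.

```shell
#!/bin/sh
# Dry-run sketch of steps 2, 5, and 6. All values are placeholders (assumptions).
BDD_USER="bdd"              # name of the bdd user
DP_USER_DIR="bdd_dp"        # hypothetical value of HDFS_DP_USER_DIR in bdd.conf
SUPERGROUP="supergroup"     # value of dfs.permissions.supergroup (default shown)
REALM="EXAMPLE.COM"         # your default realm
KEYTAB="/tmp/bdd.keytab"    # arbitrary name and location for the keytab

# Print the commands instead of executing them.
print_prep_commands() {
    # Step 2: create both HDFS directories and set owner and group.
    for dir in "/user/${BDD_USER}" "/user/${DP_USER_DIR}"; do
        echo "hdfs dfs -mkdir -p ${dir}"
        echo "hdfs dfs -chown ${BDD_USER}:${SUPERGROUP} ${dir}"
    done
    # Step 5: create the principal (primary = bdd user, realm = default realm).
    echo "kadmin -q \"addprinc -randkey ${BDD_USER}@${REALM}\""
    # Step 6: generate the keytab for that principal.
    echo "kadmin -q \"ktadd -k ${KEYTAB} ${BDD_USER}@${REALM}\""
}

print_prep_commands
```

The HDFS commands would need to be run as a user with HDFS superuser rights, and the kadmin commands as a Kerberos administrator. The keytab must still be moved to the Admin Server as described in step 6.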

Once Kerberos is enabled, you can use the bdd-admin script to update its configuration as needed. For more information, see kerberos.
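
Before publishing an edited bdd.conf (step 8), it may help to check that the KERBEROS_TICKET_LIFETIME value has the documented form: a number followed by s, m, h, or d. The function below is a hypothetical helper, not part of BDD, and it assumes the number must be a whole number, which the text above does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the argument is a whole
# number followed by exactly one of the units s, m, h, or d.
is_valid_ticket_lifetime() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+[smhd]$'
}

# Example: check a few candidate values.
for value in 10h 10m 90 2w; do
    if is_valid_ticket_lifetime "$value"; then
        echo "$value: ok"
    else
        echo "$value: not a supported format"
    fi
done
```

With these sample inputs, 10h and 10m pass, while 90 (no unit) and 2w (unsupported unit) are rejected.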