Sentry

Sentry provides role-based authorization in Hadoop clusters. Among other things, it can be used to restrict access to Hive data at a granular level.

Oracle strongly recommends using Sentry to protect your data from outside users. If Sentry is already set up in your Hadoop cluster, you must complete the following procedure so that BDD can work with it.

Note: The first two steps in this procedure are also required to enable Kerberos. If you've already done them, you can skip them.

To enable BDD to work with Sentry:

  1. If you haven't already, create the following directories in HDFS:
    • /user/<bdd user>, where <bdd user> is the name of the bdd user.
    • /user/<HDFS_DP_USER_DIR>, where <HDFS_DP_USER_DIR> is the value of HDFS_DP_USER_DIR in BDD's configuration file.
    Both directories must be owned by the bdd user, and their group must be the HDFS superuser group. This group is defined by the dfs.permissions.supergroup configuration parameter, whose default value is supergroup.
  2. If you haven't already, add the bdd user to the hive group.
  3. Create a new role for BDD, grant it all privileges on the server, and assign the role to the hive group:
    create role <BDD_role>;
    grant all on server server1 to role <BDD_role>;
    show grant role <BDD_role>;
    grant role <BDD_role> to group hive;
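
Steps 1 and 2 above can be sketched as the following shell commands. This is a minimal illustration, not part of BDD's installer. The values bdd and bdd_dp are placeholders for your actual bdd user and HDFS_DP_USER_DIR value, and the function name prepare_bdd_for_sentry is hypothetical. The sketch assumes it is run by a user with HDFS superuser privileges and that group membership is resolved from local Linux accounts; clusters that use LDAP or another group mapping need a different approach for step 2. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Dry-run sketch of steps 1 and 2. Every value below is a placeholder.
BDD_USER="bdd"               # assumed name of the bdd user
HDFS_DP_USER_DIR="bdd_dp"    # hypothetical value of HDFS_DP_USER_DIR
SUPERGROUP="supergroup"      # value of dfs.permissions.supergroup (default shown)
RUN="${RUN-echo}"            # prints commands by default; export RUN= to execute them

prepare_bdd_for_sentry() {
  # Step 1: create both directories and set their owner and group.
  for dir in "/user/$BDD_USER" "/user/$HDFS_DP_USER_DIR"; do
    $RUN hdfs dfs -mkdir -p "$dir"
    $RUN hdfs dfs -chown "$BDD_USER:$SUPERGROUP" "$dir"
  done
  # Step 2: add the bdd user to the hive group (local Linux group assumed).
  $RUN usermod -a -G hive "$BDD_USER"
}

prepare_bdd_for_sentry
```

Once the printed commands look right for your cluster, setting RUN to an empty value runs them for real. Running hdfs dfs -ls -d on each directory afterward is one way to confirm the owner and group.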