DBMS_DCAT Package
The DBMS_DCAT package provides functions and procedures to
help Autonomous AI Database users leverage the data
discovery and centralized metadata management system of OCI Data Catalog.
Note:
Support forDBMS_DCAT is available in Oracle AI Database 19c starting with
version 19.30, and in Autonomous AI Database 26ai
starting with version 23.26.1.
Data Catalog harvests metadata from a data lake's object storage assets. The
harvesting process creates logical entities, which can be thought of as tables with
columns and associated data types. DBMS_DCAT procedures and functions
connect Autonomous AI Database to Data Catalog and
then synchronize the assets with the database, creating protected schemas and external
tables. You can then query object store using those external tables, easily joining
external data with data stored in Autonomous AI Database. This dramatically simplifies the management process; there is a
single, centrally managed metadata store that is shared across multiple OCI services
(including Autonomous AI Databases). There are also
Autonomous AI Database dictionary views that allow
you to inspect the contents of Data Catalog using SQL, and show you how these Data
Catalog entities map to your Autonomous AI Database
schemas and tables.
Data Catalog Users and Roles
The DBMS_DCAT package supports synced users/schemas,
dcat_admin users and local users. Users must have the
dcat_sync role to be able to use this package.
Data Catalog Users
- Synced users/schemas
The synced external tables are organized into database schemas corresponding to Data Asset/Bucket combinations, or according to custom properties set by the user. The synced schemas are automatically created/dropped during Data Catalog synchronization. They are created as no authentication users without the
CREATE SESSIONprivilege. The synced schemas are also created using the protected clause, so that they cannot be altered by local users (not even the PDB admin) and can only be modified through the synchronization. - User
dcat_adminUser
dcat_adminis a local database user that can run a sync and grantREADprivilege on synced tables to other users or roles. The user is created as a no authentication user without theCREATE SESSIONprivilege. - Local users
Database users querying the external tables must be explicitly granted
READprivileges on the synced external tables by usersdcat_adminor ADMIN. By default, after the sync is completed, only usersdcat_adminandADMINhave access to the synced external tables.
Data Catalog Roles
dcat_syncThe
dcat_syncrole has all the required privileges for using theDBMS_DCATpackage. Users must have this role to be able to use the API for navigating the Data Catalog and running the sync.
Required Credentials and IAM Policies
This topic describes the Oracle Cloud Infrastructure Identity and Access Management (IAM) user credentials and policies required to give Autonomous AI Database users permission to manage a data catalog and to read from object storage.
OCI Data Catalog Credential and Policy Requirements:
- A credential object with permission to manage a Data Catalog
instance is required. Credential objects containing OCI native
authentication is supported. Credential objects based on authentication
token user principals and resource principals credentials are not
supported.
For information on managing credentials, see DBMS_CLOUD for Access Management.
For OCI native authentication examples, see Example: Creating an OCI Native Authentication Credential Object and Autonomous AI Database Now Supports Accessing the Object Storage with OCI Native Authentication.
-
The manage Data Catalog privilege is required in order for Autonomous AI Database to add custom properties to the Data Catalog namespace. These privileges allow you to override schema names, table names, column names and more.
For further information on Data Catalog permissions, see Permissions Required for Each API Operation.
-
The read object storage privilege on buckets is required so that Autonomous AI Database can query data files.
For further Oracle Object Storage Policy Examples, see Policy Examples.
AWS Glue Data Catalog Credential and Policy Requirements
- A credential object with permission to access an AWS Glue Data
Catalog is required. For information on managing credentials, see DBMS_CLOUD for Access Management.
For accessing an AWS Glue Data Catalog the following privileges are required: glue:GetDatabases , glue:GetTables , and glue:GetTable.
In addition, privilege s3:GetBucketLocation is needed during synchronization for generating resolvable https urls pointing to the underlying S3 objects. - A credential object with permission to access the files stored in S3 is required so that Autonomous AI Database can query data files.
- AWS credentials are supported. AWS Amazon Resource Names (ARN) credentials are not supported.
Example: Creating an OCI Native Authentication Credential Object
In this example, we create an OCI native authentication credential that can be used when creating a data catalog or an object store credential object. For more details, see DBMS_DCAT. SET_DATA_CATALOG_CREDENTIAL Procedure and DBMS_DCAT.SET_OBJECT_STORE_CREDENTIAL Procedure respectively.
In OCI native authentication, the
DBMS_CLOUD.CREATE_CREDENTIAL procedure includes these
parameters: credential_name, user_ocid,
tenancy_ocid, private_key, and
fingerprint. See DBMS_CLOUD CREATE_CREDENTIAL Procedure for a
complete description of this procedure.
The credential_name is the name of the credential
object. The user_ocid and tenancy_ocid parameters
correspond to the user's and tenancy's OCIDs respectively.
The private_key parameter specifies the generated
private key in PEM format. Private keys created with a passphrase are not supported.
Therefore, we need to make sure we generate a key with no passphrase. See How to Generate an API Signing
Key for more details on how to create a private key with no passphrase.
Also, the private key that we provide for this parameter must only contain the key
itself without any header or footer (e.g. ‘-----BEGIN RSA PRIVATE KEY-----',
‘-----END RSA PRIVATE KEY-----’).
The fingerprint parameter specifies the fingerprint
that is obtained either after uploading the public key to the console, or using the
OpenSSL commands. See How to Upload the Public Key
and How to Get the Key's
Fingerprint for further details on obtaining the fingerprint.
Once all the necessary information is gathered and the private key is
generated, we're ready to run the following CREATE_CREDENTIAL
procedure:
BEGIN
DBMS_CLOUD.CREATE_CREDENTIAL (
credential_name => 'OCI_NATIVE_CRED',
user_ocid => 'ocid1.user.oc1..aaaaaaaatfn77fe3fxux3o5lego7glqjejrzjsqsrs64f4jsjrhbsk5qzndq',
tenancy_ocid => 'ocid1.tenancy.oc1..aaaaaaaapwkfqz3upqklvmelbm3j77nn3y7uqmlsod75rea5zmtmbl574ve6a',
private_key => 'MIIEogIBAAKCAQEA...t9SH7Zx7a5iV7QZJS5WeFLMUEv+YbYAjnXK+dOnPQtkhOblQwCEY3Hsblj7Xz7o=',
fingerprint => '4f:0c:d6:b7:f2:43:3c:08:df:62:e3:b2:27:2e:3c:7a');
END;
/
dba_credentials
table:SELECT owner, credential_name
FROM dba_credentials
WHERE credential_name LIKE '%NATIVE%';
OWNER CREDENTIAL_NAME
----- ---------------
ADMIN OCI_NATIVE_CREDExample: Using User Principals
In this example, user1 is a member of the group
adb-admins. All members of this group are given permission to
manage all data catalogs in mycompartment, and to read from
object-store in mycompartment.
- Allow users that are members of
adb-adminsto manage all data catalogs withinmycompartment.allow group adb-admins to manage data-catalog-family in compartment mycompartment - Allow users that are members of
adb-adminsto read any object in any bucket withinmycompartment.allow group adb-admins to read objects in compartment mycompartment