Query External Data on Oracle Compute Cloud at Customer
Describes packages and tools to allow the Autonomous Database deployments on Exadata Cloud@Customer to load, query, and manage unstructured data by building external tables and external partitioned tables from objects in Data Lakes.
You can use the DBMS_CLOUD
package to configure Autonomous Database deployments on Exadata Cloud@Customer to access the unstructured data
stored on the Oracle Compute Cloud at Customer (C3) Object Storage Bucket.
Overview
You require a data platform that can extract business insights from structured and unstructured data. The combination of structured data available in a data warehouse and unstructured data available in a Data Lake is known as Data Lakehouse.
An Autonomous Data Lakehouse allows data analytics on both structured
and unstructured data. The structured data is stored in an Autonomous Data Warehouse, while the
unstructured data is stored in a Data Lake configured on C3 Object Storage Buckets.
The Autonomous Data Warehouse can query,
load, and manage objects in the Data Lake by using a database package called
DBMS_CLOUD
. This package allows the Autonomous Database to create external and
external partitioned tables on data lake objects such as csv, txt, Avro, and Parquet
files. You can build Autonomous Data Lakehouses by combining the capabilities of Autonomous Database deployments on Exadata Cloud@Customer and Oracle C3.
An Autonomous Data Lakehouse has two main components: An Autonomous Data Warehouse and a Data
Lake. The Autonomous Data Warehouse is
built using Autonomous Database deployments on
Exadata Cloud@Customer, while the Data Lake is
built using C3 Object Storage Buckets. The Autonomous Database hosts the structured
relational data, while the Data Lake hosts collections of unstructured data in the
form of txt, csv, Avro, Parquet, and other file types. You can query the Autonomous Database, and the database determines
where to find the results, whether in the structured data, the unstructured data, or
a combination of the two. You can use the DBMS_CLOUD
package to
configure Autonomous Database deployments on Exadata Cloud@Customer to load, query, and manage
unstructured data by building external and external partitioned tables from objects
in Data Lakes.
Prerequisites
- Create a C3 Object Storage Bucket.
- Create a user and group. Add the user to the group and configure the
required policies to allow the user to manage files (add/read/remove) to the
C3 Object Storage Bucket.
Example:
Allow group <group_name> to manage objects in tenancy where target.bucket.name='<new_bucket_name_created>'
- Create an API-Key for the user
created. From the API-key generation process, note the following
information:
user_ocid
,tenancy_ocid
,fingerprint
,region
. In addition, save the Public and Private PEM key files created during the API-key generation process. This information is required to configure theDBMS_CLOUD
package later. - Download the https certificate authority chain as shown below. This
certificate authority chain allows client applications to authenticate the
C3 Object Storage Bucket via https protocol. The https certificates are
usually saved on a crt file, for example
c3_ca.crt
. Obtain the certificate authority chain by pointing a browser to the following URL:https://iaas.domainname/cachain
wheredomainname
is the name of the DNS domain configured during the C3 infrastructure deployment. - Create a new Compute VM. Once the new VM starts, copy the Private PEM key file and the C3 https certificate file created in the above steps to the VM.
- Create an Exadata Infrastructure Resource.
- Create an Autonomous VM Cluster.
- Create an Autonomous Container Database (ACD).
- Create an Autonomous Database.
Configuring Autonomous Database to communicate with the C3 Object Storage Bucket
- Add the C3 Object Storage https certificate to the Autonomous VM Cluster
Grid Infrastructure TCPS Wallet. For example, the wallet location is
/var/opt/oracle/dbaas_acfs/grid/tcps_wallets
. This is needed so that the Autonomous Database can authenticate the https connection to the C3 Object Storage. - Add an entry to the
C##CLOUD$SERVICE_dbms_cloud_store
table on the ACD created above. This is to tell theDBMS_CLOUD
package that the C3 Object Storage bucket is a valid cloud store. A sample SQL command to do this is:SQL>INSERT INTO C##CLOUD$SERVICE.dbms_cloud_store VALUES('ORACLE_BMC','<objectstorage.mydomain.com>',null,1);
- OCID of Exadata Infrastructure resource
- OCID of the Autonomous VM Cluster
- OCID of the ACD
- C3 Object Storage URL
- C3 Object Storage https certificate file
After Cloud Operations completes the above tasks, log in to the Autonomous Database.
Use the DBMS_CLOUD.CREATE_CREDENTIAL
procedure to provide the Autonomous Database with the authentication
information required to connect to the C3 Object Storage Bucket. See CREATE_CREDENTIAL for more details.