Harvesting from Oracle Object Storage
Harvesting is a process that extracts technical metadata from your data assets into your data catalog. A data asset represents a data source, such as a database, an object store, a file or document store, a message queue, or an application.
In this tutorial, you:
- Allow Data Catalog to access any object in your Oracle Object Storage, in any bucket, in any compartment within the tenancy where the policy is created.
- Create an Oracle Object Storage data asset.
- Add one default connection for the data asset.
- Harvest the data asset by running the harvest job immediately.
You can harvest Object Storage files as logical data entities.
Before You Begin
To successfully perform this tutorial, you must have the following:
1. Creating an Access Policy
You create a policy to allow Data Catalog to access your Object Storage resources.
At a minimum, you must have READ permission for the Object Storage aggregate resource type object-family, or for all the individual resource types that it includes.
To create an access policy that grants READ permission for the Object Storage aggregate resource type object-family, perform the following steps:
- Open the Console navigation menu and then select Policies under Identity.
- Click Create Policy.
- In the Create Policy panel, enter a unique name for the policy, for example, data-catalog-policy. The name must be unique across all policies in your tenancy. You cannot change the name later.
- Next, enter a description such as Grant access to all resources in any compartment in the tenancy and then select Keep Policy Current.
- In the Policy Statements field, enter the following policy rule:
allow service datacatalog to read object-family in tenancy
Note
This policy allows access to any object, in any bucket, in any compartment within the tenancy where the policy is created. For more examples, see policy examples.
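The policy rule above is a plain text statement, so it can be generated rather than typed by hand. The following is a minimal Python sketch, not part of the Console workflow. The helper name datacatalog_read_policy and the optional compartment argument are hypothetical, and the compartment-scoped form is an assumption about how you might narrow access, not a step in this tutorial.

```python
def datacatalog_read_policy(compartment=None):
    """Build the policy statement that lets Data Catalog read Object Storage.

    With compartment=None the result is the tenancy-wide rule used in this
    tutorial. Passing a compartment name narrows the scope to that
    compartment (an assumption for illustration, not a tutorial step).
    """
    scope = "tenancy" if compartment is None else f"compartment {compartment}"
    return f"allow service datacatalog to read object-family in {scope}"


# Tenancy-wide rule, identical to the statement entered in the Console.
print(datacatalog_read_policy())
# Narrower rule for a hypothetical compartment named "analytics".
print(datacatalog_read_policy("analytics"))
```

Paste the printed statement into the Policy Statements field. A compartment-scoped rule limits what Data Catalog can read, so the tenancy-wide description used earlier would no longer be accurate for it.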
2. Creating a Data Asset
You are now ready to register your Oracle Object Storage data sources with Data Catalog as a data asset.
To create an Oracle Object Storage data asset, perform the following steps:
- In the Console, open the navigation menu, and then under Data and AI, click Data Catalog.
- Click the data catalog instance where you want to create your data asset.
- On your data catalog instance Home page, click Create Data Asset from the Quick Actions tile.
- In the Create Data Asset panel, enter a name to uniquely identify your data asset. Optionally, enter a description. Then, from the Type drop-down list, select Oracle Object Storage. More fields display.
- In the URL field, enter the Swift URI for your Oracle Cloud Infrastructure Object Storage resource.
- In the Namespace field, enter the object storage namespace for the specified Oracle Cloud Infrastructure Object Storage resource.
To view your Object Storage namespace string in the Console, from the Profile menu click Tenancy: <your_tenancy_name>. The namespace is listed under Object Storage Settings.
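The value for the URL field depends on the region that hosts your Object Storage resource. The Python sketch below shows one way to derive it from a region identifier. It assumes the commonly used commercial-realm endpoint pattern https://swiftobjectstorage.<region-identifier>.oraclecloud.com, and the function name swift_url and the region check are hypothetical helpers. Other realms may use a different domain, so confirm the endpoint for your tenancy before using the result.

```python
import re

# Assumed endpoint pattern for the Swift-compatible API in the commercial
# realm; verify it against the endpoint list for your own tenancy.
SWIFT_URL_TEMPLATE = "https://swiftobjectstorage.{region}.oraclecloud.com"

# Loose shape check for identifiers such as "us-phoenix-1".
REGION_PATTERN = re.compile(r"^[a-z]+(-[a-z]+)+-\d+$")


def swift_url(region_identifier):
    """Return the assumed Swift URL for a region identifier."""
    region = region_identifier.strip().lower()
    if not REGION_PATTERN.match(region):
        raise ValueError(f"not a region identifier: {region_identifier!r}")
    return SWIFT_URL_TEMPLATE.format(region=region)


print(swift_url("us-phoenix-1"))
# prints: https://swiftobjectstorage.us-phoenix-1.oraclecloud.com
```

The check only rejects values that do not look like a region identifier, such as a display name. It does not confirm that the region exists or that your tenancy is subscribed to it.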
3. Adding a Connection
After creating the Oracle Object Storage data asset, you create a connection for the data asset.
To create a connection for your Oracle Object Storage data asset, perform the following steps:
- On the Home page, click Data Assets to access the Data Assets page.
- In the Data Assets list, select the data asset you created previously.
- In the Summary tab on the data asset details page, under Connection Information, click Add Connection.
- In the Add Connection panel, enter a unique name for your connection. Optionally, enter a short description and then ensure that S2S Principal is selected for Type.
- In the OCI Region field, enter the region identifier for your Object Storage resource.
To view the region identifier for your region in the Console, from the Profile menu click Tenancy: <your_tenancy_name>. The region name and identifier are displayed.
- In the Compartment OCID field, enter the compartment OCID for your Object Storage resource.
To view the compartment OCID in the Console, navigate to Identity → Compartments. Click the compartment link for your Object Storage resource. From the Compartment Details page, copy the OCID under Compartment Information.
- Select Make this connection the default connection for the data asset.
- Click Test Connection. A notification displays indicating whether the test connection was successful or failed. Next, click Add.
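Before you click Test Connection, it can help to check that the values you copied from the Console have the expected shape. The Python sketch below gathers the Add Connection inputs into a dictionary and rejects a compartment OCID that is malformed. The function name connection_fields and the dictionary keys are hypothetical labels that mirror the panel fields, not an API of Data Catalog. The check relies only on the general OCID layout (ocid1, resource type, realm, optional region, unique ID).

```python
def connection_fields(name, region, compartment_ocid, description=""):
    """Collect the Add Connection panel inputs after a basic OCID check."""
    parts = compartment_ocid.strip().split(".")
    # An OCID looks like ocid1.<resource type>.<realm>.[region].<unique id>.
    # The root compartment of a tenancy uses the "tenancy" resource type.
    well_formed = (
        len(parts) >= 5
        and parts[0] == "ocid1"
        and parts[1] in ("compartment", "tenancy")
        and parts[-1] != ""
    )
    if not well_formed:
        raise ValueError(f"not a compartment OCID: {compartment_ocid!r}")
    return {
        "name": name,
        "description": description,
        "type": "S2S Principal",        # connection type selected in the panel
        "oci_region": region,
        "compartment_ocid": compartment_ocid.strip(),
        "is_default": True,             # default connection for the data asset
    }


# The OCID below is a made-up placeholder, not a real compartment.
fields = connection_fields(
    "object-storage-connection",
    "us-phoenix-1",
    "ocid1.compartment.oc1..exampleuniqueid",
)
print(fields["type"], fields["oci_region"])
```

A value that passes this check can still fail the connection test, for example when the access policy does not cover that compartment.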
4. Harvesting the Data Asset
You are now ready to harvest your Oracle Object Storage data asset.
To harvest your Oracle Object Storage data asset, perform the following steps:
- On the data asset details page for your data asset, click Harvest.
- The Select Connection page displays and the default connection is selected. Click Next.
- The Select Data Entities page displays. View and add all the data entities you want to harvest from the Available Folders / Data Entities section:
  - Click the add icon for each data entity you want to include in the harvest job.
  - Click Add All to select all the entities for harvesting.
  - To find a data entity from the available data entities, use the Filter folders / data entities box.
  - Use the page navigation icons to browse all the data entities.
  - Click the remove icon for any selected data entity that you want to remove from the harvest job.
  - If you need to start over, click Remove All.
- After you have reviewed the data entities you want to harvest in the Selected Folders / Data Entities section, click Next.
- The Create Job page displays. In the Job Name field, enter a unique name to identify the harvest job. Optionally, enter a Description. Next, select Run job now and then click Create Job.
- The job to harvest your Oracle Object Storage data asset is created successfully. Click the job name.