6 Auto Populate Catalog

This chapter contains information about creating and managing automated extractors to pull data into your catalogs.

About Auto Populate

You can automate the process of extracting metadata from sources directly to your data catalogs.

Manually creating schemas, tables, and partitions from your data sources is time-consuming and complicated. Oracle AI Data Platform can automatically extract metadata from data sources and create entities in the catalogs that you specify in the metadata extractor.

You automatically populate this metadata in your catalog by creating a metadata extractor. As part of creating the extractor, you specify the target catalog to extract metadata to and the source for the metadata. You can choose to have the extractor create tables in a specified schema, or let the system suggest where the tables are created if no schema is specified or detected.

Auto populate can extract metadata from the following file types:

  • CSV
  • JSON
  • Avro
  • ORC
  • Parquet
  • Delta Lake
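
To check ahead of time which objects in a folder fall under the file types above, a minimal sketch such as the following could be used. It is an illustration only and does not describe how the extractor itself detects formats: the function name `guess_format`, the extension mapping, and the sample object names are all assumptions. The one external fact it relies on is that a Delta Lake table keeps its transaction log in a `_delta_log` directory.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional

# Hypothetical extension-to-format mapping (an assumption for illustration,
# not a list published for the product).
SUPPORTED_EXTENSIONS = {
    ".csv": "CSV",
    ".json": "JSON",
    ".avro": "Avro",
    ".orc": "ORC",
    ".parquet": "Parquet",
}


def guess_format(object_name: str, all_names: Iterable[str] = ()) -> Optional[str]:
    """Guess the file type of one object, or return None if it is not recognized.

    all_names is the optional listing of every object in the same bucket or
    folder; it is only used to spot Delta Lake tables.
    """
    path = PurePosixPath(object_name)

    # Any folder that contains a _delta_log directory is treated as a Delta table root.
    delta_roots = set()
    for name in all_names:
        parts = PurePosixPath(name).parts
        if "_delta_log" in parts:
            delta_roots.add(parts[: parts.index("_delta_log")])

    # Objects under a Delta table root belong to that table, whatever their extension.
    for root in delta_roots:
        if path.parts[: len(root)] == root:
            return "Delta Lake"

    # Otherwise fall back to the file extension (case-insensitive).
    return SUPPORTED_EXTENSIONS.get(path.suffix.lower())
```

For example, `guess_format("sales/2024/orders.csv")` returns `"CSV"`, while a Parquet file that sits next to a `_delta_log` directory is reported as `"Delta Lake"`.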

You can either manually review the extracted entities or let the system automatically create entities from the extracted metadata. During extraction, entities that cause errors are captured in the log. You can view the log to see which entities encountered errors and take corrective action.

Manually reviewing entities allows you to approve or reject entities on an individual basis. You can view entities that are already approved or rejected in the Reviewed entities tab.

Extractors display their status to show which stage they are in and whether user intervention is required.

Extractor Status    Description
Not Started         The extractor has not started. Start the extractor to begin.
Running             The extractor is in progress.
Ready for review    The extractor has run and you have chosen manual approval. Extracted entities must be reviewed and either approved or rejected.
Reviewing           The extractor has run and you have chosen manual approval. Some entities have been approved or rejected by a user, but other entities still require review.
Completed           The extractor has run and entities have either been approved automatically or manually approved by a user.
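
The statuses above can be read as a small state model. The sketch below is one way to express that reading in Python; it is not the product's implementation. The names `ExtractorStatus`, `status_after_run`, and `needs_user_action` are hypothetical, and the handling of edge cases (for example, a manual-approval run that leaves nothing to review being treated as Completed, and rejected entities counting as reviewed) is an assumption.

```python
from enum import Enum


class ExtractorStatus(Enum):
    NOT_STARTED = "Not Started"
    RUNNING = "Running"
    READY_FOR_REVIEW = "Ready for review"
    REVIEWING = "Reviewing"
    COMPLETED = "Completed"


def status_after_run(manual_approval: bool, pending: int, reviewed: int) -> ExtractorStatus:
    """Derive the status of an extractor that has finished running.

    pending  -- number of extracted entities still awaiting review
    reviewed -- number of entities a user has already approved or rejected
    """
    # Automatic approval, or nothing left to review: the extractor is done.
    if not manual_approval or pending == 0:
        return ExtractorStatus.COMPLETED
    # Manual approval and no entity reviewed yet.
    if reviewed == 0:
        return ExtractorStatus.READY_FOR_REVIEW
    # Manual approval with a mix of reviewed and pending entities.
    return ExtractorStatus.REVIEWING


def needs_user_action(status: ExtractorStatus) -> bool:
    """Statuses in which the extractor is waiting for a user to start or review."""
    return status in {
        ExtractorStatus.NOT_STARTED,
        ExtractorStatus.READY_FOR_REVIEW,
        ExtractorStatus.REVIEWING,
    }
```

Under these assumptions, an extractor with manual approval, three pending entities, and none reviewed is in the Ready for review status, and it moves to Reviewing as soon as one entity has been approved or rejected.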

You can view and use metadata extractors created by other users if you have the requisite permissions.

Create Metadata Extractor

You can create metadata extractors to automate extracting entities such as schemas and tables to your catalogs.

  1. On the Home page, click Auto populate catalog.
  2. Click Create Metadata Extractor.
  3. Enter a name for the metadata extractor.
  4. Select the target catalog from the Catalog dropdown.
  5. Select the appropriate source type from the Source Type dropdown.
  6. Next to Compute, click Browse and choose the cluster the extractor should use. Click Select.
  7. Next to Compartment, click Browse and choose the compartment to extract your metadata to. Click Select.
  8. Next to Bucket, click Browse and choose the bucket within the compartment to extract your metadata to. Click Select.
  9. Optional: Next to Folder, click Browse and choose the folder within the bucket to extract your metadata to. Click Select.
  10. Select whether entities are created with manual approval or automatically approved by the system.
  11. Optional: Select the schema where external tables are created. If no schema is specified, the system creates tables in a schema based on the folder structure, or in the default schema if no schema is detected.
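
The schema fallback described in the last step can be sketched as a simple order of precedence: an explicitly selected schema first, then a schema detected from the folder structure, then the default schema. The code below is only an illustration of that order. How the system actually derives a schema from folders is not documented here, so the convention used (the first folder under the selected base folder is the schema, the next one is the table) is an assumption, as are the names `resolve_schema` and `DEFAULT_SCHEMA` and the sample paths.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical name for the catalog's default schema.
DEFAULT_SCHEMA = "default"


def resolve_schema(
    object_name: str,
    base_folder: str = "",
    specified_schema: Optional[str] = None,
) -> str:
    """Pick the schema for a table created from object_name."""
    # 1. A schema selected in the extractor always wins.
    if specified_schema:
        return specified_schema

    parts = PurePosixPath(object_name).parts
    base = PurePosixPath(base_folder).parts
    if parts[: len(base)] != base:
        raise ValueError(f"{object_name!r} is not under {base_folder!r}")
    relative = parts[len(base):]

    # 2. Assumed layout: <schema folder>/<table folder>/.../<file>.
    #    At least three path components are needed to detect a schema folder.
    if len(relative) >= 3:
        return relative[0]

    # 3. No schema detected: fall back to the default schema.
    return DEFAULT_SCHEMA
```

With a base folder of `landing`, the object `landing/sales/orders/part-0.parquet` resolves to the schema `sales`, while `landing/orders/part-0.parquet` has no schema folder under this convention and falls back to the default schema.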

Manually Review Extracted Metadata Entities

When you choose the manual method of creating entities in a metadata extractor, you need to review the extracted entities and approve or reject adding them to your catalog.

  1. On the Home page, click Auto populate catalog.
  2. Click the name of the metadata extractor.
  3. Click the Entities awaiting review tab.
  4. For each entity, select Approve or Reject.
  5. Optional: Select Approve All or Reject All to set all entities under review to the selected status.
  6. Click Submit.
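
The review steps above amount to recording a decision per entity and then submitting the batch. The following sketch models that flow locally so the behavior is easy to reason about. It is not an API of the product: `ReviewSession`, `Decision`, and the entity names are hypothetical, and the assumption is that entities left undecided at submit time stay in the awaiting-review list.

```python
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class Decision(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


class ReviewSession:
    """In-memory model of the entities awaiting review for one extractor."""

    def __init__(self, entity_names: Iterable[str]) -> None:
        self.decisions: Dict[str, Decision] = {
            name: Decision.PENDING for name in entity_names
        }

    def decide(self, name: str, decision: Decision) -> None:
        """Approve or reject a single entity (step 4)."""
        if name not in self.decisions:
            raise KeyError(name)
        self.decisions[name] = decision

    def decide_all(self, decision: Decision) -> None:
        """Set every entity under review to the same status (step 5)."""
        for name in self.decisions:
            self.decisions[name] = decision

    def submit(self) -> Tuple[List[str], List[str]]:
        """Submit the decisions (step 6) and return (approved, rejected) names."""
        approved = sorted(n for n, d in self.decisions.items() if d is Decision.APPROVED)
        rejected = sorted(n for n, d in self.decisions.items() if d is Decision.REJECTED)
        # Assumption: undecided entities remain in the awaiting-review list.
        self.decisions = {
            n: d for n, d in self.decisions.items() if d is Decision.PENDING
        }
        return approved, rejected
```

In this model, a session that still holds undecided entities after a submit corresponds to the Reviewing status, and a session with no entities left corresponds to Completed.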

View Reviewed Entities

You can see entities that have been manually or automatically reviewed as part of metadata extraction and see log details, table details, or column schema for that entity.

  1. On the Home page, click Auto populate catalog.
  2. Click the name of the metadata extractor.
  3. Click the Reviewed entities tab.
  4. Next to an entity, click Actions (the three-dot icon), then choose one of the following:
    • Click View table details to see the table details for the selected entity.
    • Click View column schema to see the column schema for the selected entity.
    • Click View logs to see the metadata extractor logs for the selected entity.

View Metadata Extractor Details

You can view the details of a metadata extractor to see its status, metadata creation method, base location, and creation details.

  1. On the Home page, click Auto populate catalog.
  2. Click the name of the metadata extractor.
  3. Click the Details tab.

Delete Metadata Extractor

You can delete metadata extractors that are no longer needed.

  1. On the Home page, click Auto populate catalog.
  2. Next to the metadata extractor you want to delete, click Actions (the three-dot icon), then click Delete.
  3. Click Delete to confirm.