Knowledge Bases

Knowledge bases use the Vector Search capability of Oracle Database 26ai to store vector embeddings of documents stored in AI Data Platform Workbench.

With these vector search capabilities, AI agents can run semantic searches against a knowledge base and retrieve the documents most relevant to a query.


[Image: AI Data Platform Workbench Master Catalog page open with a catalog selected and Knowledge Bases highlighted]

In AI Data Platform Workbench, knowledge bases are created in schemas of standard catalogs using the Knowledge Base type. Knowledge bases can ingest PDF, DOCX, and TXT files stored in either managed or external volumes. By default, vectors are stored in the Oracle Database 26ai Vector Search instance that is provisioned in your tenancy when your instance of AI Data Platform is created.

AI Data Platform Workbench supports two embedding models:
  • ALL_MINILM_L12_V2: A sentence-transformers model that maps sentences and paragraphs to a 384-dimensional dense vector space. Used for tasks like clustering or semantic search.
  • MULTILINGUAL_E5_SMALL: Generates vector embeddings for text in multiple languages. Its compact design enables effective performance across various languages, suitable for diverse datasets and multilingual scenarios.
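
To illustrate what semantic search over stored embeddings means, the following minimal sketch ranks documents by cosine similarity to a query vector. It is not how AI Data Platform Workbench or Oracle Database 26ai implements vector search. The function names, file names, and toy 3-dimensional vectors are assumptions for illustration only; real embeddings from ALL_MINILM_L12_V2 have 384 dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query_vec, stored, k=2):
    """Return the k (doc_id, score) pairs most similar to query_vec."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical documents with toy 3-dimensional vectors standing in
# for real 384-dimensional embeddings.
stored_vectors = {
    "invoice_policy.pdf": [0.9, 0.1, 0.0],
    "travel_guide.txt": [0.0, 0.2, 0.9],
    "billing_faq.docx": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for an embedded user question
results = top_k(query, stored_vectors, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

In this sketch the two billing-related documents rank above the travel document because their vectors point in nearly the same direction as the query vector.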

Note:

By itself, a knowledge base object in AI Data Platform Workbench cannot be directly queried. You query a knowledge base by creating a RAG tool attached to an agent in an agent flow and selecting the relevant knowledge base. For more information about RAG tools, see RAG Tool. For more information about AI Agents, see AI Agents.

Ingest Data Sources

After you create a knowledge base in AI Data Platform Workbench, open it and specify a data source to ingest data from. You can select an entire volume or a folder in a volume as a source for ingestion, but you cannot select individual files.

You can see your data sources in the Data Source tab of your knowledge base and see the information about that data source by clicking its name. The Parameters tab provides information about the selected volume, file path, attached cluster, and file types.

Note:

AI Data Platform Workbench does not support scheduled ingestion jobs. You can ingest data immediately by clicking Ingest now in the Parameters tab of your data source.

You can see more detailed information about your data source in the Details tab and see a history of data ingestion jobs in the Job runs tab.

Create a Knowledge Base

Creating a knowledge base in AI Data Platform Workbench is a one-time setup that allows you to register a document source, automatically chunk, embed, and index files, and enable semantic search and RAG retrieval via agent flows.

You cannot directly query knowledge bases in AI Data Platform Workbench. You can query knowledge bases by creating a RAG tool attached to an AI agent. For more information, see AI Agents.
  1. Click Master catalog.
  2. Navigate to the standard catalog and schema where you want to create your knowledge base.
  3. Click Knowledge Bases.
  4. Click Create Knowledge Base.

    [Image: Create Knowledge Base dialog]

  5. Provide a name and description for your knowledge base.
  6. Select a workspace and Spark cluster for file ingestion. If no cluster is selected, the Default Master Catalog Compute is used.
  7. Select the embedding model used, if necessary.
  8. Provide the chunk size and chunk overlap, if necessary.
  9. Click Create.
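
The chunk size and chunk overlap settings in step 8 control how each document is split before embedding. The following sketch shows one common way such settings interact, using a sliding window. It assumes both values are measured in characters, which the product documentation does not state, and the function name `chunk_text` is hypothetical rather than part of AI Data Platform Workbench.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so content near
    a boundary appears in both neighboring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

A larger overlap reduces the chance that a sentence is cut off at a chunk boundary, at the cost of producing more chunks and therefore more vectors to store.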

Edit a Knowledge Base

You can modify the name, description, cluster, model, or chunking details for an existing knowledge base if you have the relevant permissions.

  1. Navigate to your knowledge base folder.
  2. Next to the knowledge base you want to edit, click Actions (three-dot icon), then click Edit.
  3. Make any changes to the attributes of the knowledge base.
  4. Click Save.

Delete a Knowledge Base

You can delete knowledge bases you no longer need or use from your catalog.

  1. Navigate to your knowledge base folder.
  2. Next to the knowledge base you want to delete, click Actions (three-dot icon), then click Delete.
  3. Click Delete.

Add a Data Source to a Knowledge Base

After you have created a knowledge base, you need to assign it a data source for ingestion.

  1. Navigate to your knowledge base.
  2. Click the Data Source tab.
  3. Click Add data source to knowledge base.

    [Image: Add data source to knowledge base dialog]

  4. In the Master Catalog, select the volume or folder in a volume you want to ingest into your knowledge base. You cannot select individual files.
  5. If necessary, select the compute cluster to use for data ingestion.
  6. Select which file types to ingest. Supported file types are PDF, TXT, and DOCX.
  7. Select Start ingestion job on add to start ingestion immediately after adding the data source.
  8. Click Add.
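
The selection rules in steps 4 and 6 (a folder or volume as the source, filtered by file type) can be pictured with the sketch below. It is an illustration only, not the ingestion logic of AI Data Platform Workbench. The function name, the example paths, and the choice to include files in subfolders are assumptions; only the three supported file types come from this documentation.

```python
from pathlib import PurePosixPath

# File types the documentation lists as supported for ingestion.
SUPPORTED_TYPES = {"pdf", "docx", "txt"}

def select_files(paths, folder, file_types):
    """Return the paths under folder whose extension is in file_types.

    Assumption: files in subfolders of folder are included.
    """
    wanted = {t.lower() for t in file_types}
    unsupported = wanted - SUPPORTED_TYPES
    if unsupported:
        raise ValueError(f"unsupported file types: {sorted(unsupported)}")

    root = PurePosixPath(folder)
    selected = []
    for p in paths:
        path = PurePosixPath(p)
        extension = path.suffix.lower().lstrip(".")
        if root in path.parents and extension in wanted:
            selected.append(p)
    return selected

# Hypothetical volume contents.
volume_files = [
    "/volumes/hr/policies/leave.pdf",
    "/volumes/hr/policies/archive/old.DOCX",
    "/volumes/hr/policies/logo.png",
    "/volumes/hr/other/notes.txt",
]
selected = select_files(volume_files, "/volumes/hr/policies", ["pdf", "docx"])
print(selected)
```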

Ingest Data to a Knowledge Base

Once a data source is added to a knowledge base, you can manually start a data ingestion job run from the Parameters tab.

  1. Navigate to your knowledge base.
  2. In the Data Source tab, click the name of the data source you want to run an ingestion job for.
  3. In the Parameters tab, click Ingest now.

View Ingestion Job Run Status

You can view a list of all ingestion job runs for a data source from the Job runs tab of that data source.

  1. Navigate to your knowledge base.
  2. In the Data Source tab, click the name of the data source you want to view the status for.
  3. Click the Job runs tab.
  4. Use the filters to narrow the list of displayed job runs.

Delete a Data Source

You can delete data sources you no longer need or use from your knowledge base.

Deleting a data source also deletes the corresponding vector embeddings from your AI Data Platform.
  1. Navigate to your knowledge base, then click the Data Source tab.
  2. Next to the data source you want to delete, click Actions (three-dot icon), then click Delete.
  3. Click Delete.