Overview of Performing Similarity Analysis in Oracle Analytics

In Oracle Analytics, you can perform similarity analysis on your data using a variety of vector embedding models.

For example, you might want to answer questions such as:
  • Which patients have similar symptoms or track records to a given patient?
  • Which customers have a similar profile to a given customer?
  • Which insurance claims are similar in profile to a given insurance claim?

How does it work?

Oracle Database 23ai supports vector search and provides SQL functions that calculate the distance between vectors. This distance quantifies how closely data records resemble one another: the smaller the distance, the more similar the records. Oracle Analytics uses vector search behind the scenes to perform similarity analysis on datasets.
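To make the idea of vector distance concrete, the following is a minimal Python sketch, not the mechanism Oracle Analytics or Oracle Database actually runs. It assumes each record has already been turned into a numeric embedding and uses cosine distance as one common distance metric. The customer names, the embedding values, and the helper functions `cosine_distance` and `most_similar` are hypothetical and exist only for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors.

    A value near 0 means the vectors point in nearly the same direction
    (very similar records); larger values mean less similar records.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def most_similar(target, records, top_k=3):
    """Return the top_k (name, distance) pairs closest to the target vector."""
    scored = [(name, cosine_distance(target, vec)) for name, vec in records.items()]
    return sorted(scored, key=lambda pair: pair[1])[:top_k]


# Hypothetical embeddings; real embeddings typically have far more dimensions.
customers = {
    "customer_a": [0.9, 0.1, 0.3],
    "customer_b": [0.1, 0.8, 0.5],
    "customer_c": [0.8, 0.2, 0.4],
}
target = [0.85, 0.15, 0.35]

ranking = most_similar(target, customers, top_k=2)
print(ranking)
```

In this sketch, the question "Which customers have a similar profile to a given customer?" becomes "Which stored vectors have the smallest distance to the target vector?" A database with vector search answers the same kind of question at scale, without comparing every pair of records in application code.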

Performance Considerations

Processing time for similarity analysis will vary depending on:
  • The number of rows in your source dataset.
  • The number of columns you select for use in your data flow. Not every column in your source dataset is necessarily used in your similarity analysis model.
  • (Specific to Oracle Autonomous Data Warehouse) The number of ECPUs allocated to your Oracle Autonomous Data Warehouse instance.

Data flows have a maximum timeout of 2.5 hours, which limits the amount of data that can be processed. Refer to the parameters section for additional details.
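As a rough planning aid, the arithmetic behind the 2.5-hour limit can be sketched in Python. This is a back-of-the-envelope estimate only: it assumes processing time scales roughly linearly with row count, which may not hold in practice, and it relies on a throughput figure you measure yourself from a smaller sample run. The function `estimated_max_rows` and its `safety_factor` parameter are hypothetical and not part of any Oracle API.

```python
# 2.5 hours expressed in seconds (the data flow timeout stated above).
TIMEOUT_SECONDS = 2.5 * 60 * 60


def estimated_max_rows(sample_rows, sample_seconds, safety_factor=0.8):
    """Estimate how many rows fit within the timeout.

    sample_rows / sample_seconds come from a sample run you time yourself.
    safety_factor (assumed, 0 < value <= 1) reserves headroom for variance.
    Assumes linear scaling, which is an approximation.
    """
    if sample_rows <= 0 or sample_seconds <= 0:
        raise ValueError("sample_rows and sample_seconds must be positive")
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in the range (0, 1]")
    budget_seconds = TIMEOUT_SECONDS * safety_factor
    return int(sample_rows * budget_seconds / sample_seconds)


# Hypothetical measurement: a 10,000-row sample took 60 seconds.
print(estimated_max_rows(sample_rows=10_000, sample_seconds=60))
```

Because the number of selected columns and, on Oracle Autonomous Data Warehouse, the allocated ECPUs also affect processing time, repeat the sample measurement whenever either of those changes.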