1 About Oracle Private AI Services Container

The Private AI Services Container is a lightweight, containerized web service that provides a REST interface for performing inference on models in Oracle ONNX Pipeline format. This container can run in your data center or on compute nodes in the public cloud.

Using the Private AI Services Container allows you to offload expensive AI computation, such as vector embedding generation, out of the database. This frees up database compute resources for indexing and similarity search.

You can run multiple containers simultaneously using Podman, and you can use Oracle Database Kubernetes Operator for container orchestration.

Management of the AI Services Container is organized around three logical roles: the Container Admin, the Model Creator, and the Inference Client. The same Linux user can assume more than one of these roles at the same time.
  • Container Admin: Acting as container admin, a user configures, starts, stops, and performs general management of a container.
  • Model Creator: As model creator, a user provisions model files. This includes tasks such as creating an ONNX pipeline and providing metadata and model-specific configurations when necessary.
  • Inference Client: As inference client, a user queries the available models and performs inference on data using the REST APIs.

The container supports multiple concurrent users. The effective number of concurrent users depends on the number of CPU cores and on the embedding model in use.

Note:

User data submitted for inference is processed transiently and is never stored. All requests to the container are stateless.

The AI Services Container supports a fixed set of Oracle Machine Learning (OML) model types and mining functions. The model type is automatically determined by the container based on the data type and shape of the model inputs. The supported model types are as follows:

Model Type | Model Function | Goal
ONNX_TXT   | EMBEDDING      | Generate text embeddings. The model takes text as input and produces embeddings as output. Example models include sentence transformers and CLIP (text).
ONNX_IMG   | EMBEDDING      | Generate image embeddings. The model takes an image as input and produces embeddings as output. Example models include Vision Transformers and CLIP (image).

As of Oracle AI Database 26ai, only Oracle ONNX Pipeline format models produced by OML4Py are supported for deployment in the Private AI Services Container and in the database. Both text and image embedding pipelines are supported. For more information about ONNX pipeline models, see Oracle Machine Learning for Python User's Guide. For more information about importing pretrained models in ONNX format, see Oracle AI Database AI Vector Search User's Guide.

The container can be called from the database using the UTL_TO_EMBEDDING and UTL_TO_EMBEDDINGS functions of the DBMS_VECTOR PL/SQL package. It can also be called by REST clients such as curl, or by clients that use the OpenAI SDK. For the syntax and for examples of using these PL/SQL functions with the Private AI Services Container, see Oracle AI Database AI Vector Search User's Guide.
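To illustrate the REST client path, the sketch below builds an embedding request using only the Python standard library. The host, port, endpoint path (/v1/embeddings), and JSON field names here are assumptions modeled on the OpenAI-style embeddings convention that the OpenAI SDK expects; verify the actual values against the AI Vector Search documentation for your release.

```python
import json
import urllib.request

def build_embedding_request(base_url: str, model: str,
                            texts: list[str]) -> urllib.request.Request:
    """Build a POST request asking the container to embed a batch of texts.

    The endpoint path and payload field names below are illustrative
    assumptions (OpenAI-compatible style), not confirmed product values.
    """
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/embeddings",  # assumed path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending the request requires a running container, for example:
#   with urllib.request.urlopen(build_embedding_request(...)) as resp:
#       vectors = json.load(resp)
req = build_embedding_request("http://localhost:8080", "my-text-model",
                              ["hello world"])
```

Because the payload is plain JSON over HTTP, the same request can be issued from curl or any other HTTP client without a language-specific SDK.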

Note:

Support may be available for ancillary use of this program in conjunction with a supported Oracle product, to the extent described in that product's documentation. If support is provided it will be in accordance with Oracle’s technical support policies which may be found at https://www.oracle.com/support/policies/.

For licensing information related to Oracle Private AI Services Container, see Licensing Information User Manual for Oracle Private AI Services Container.