2 Install the Private AI Services Container
Follow these steps to configure and install the container according to best practices, using the included bash scripts.
The Oracle Private AI Services Container is AI infrastructure designed to run on-premises and, optionally, in air-gapped environments. The container can create vector embeddings with low latency and at no cost, securely and within the privacy of your own environment.
The container uses TLS 1.3 and API keys for security. A PKCS12 keystore is used by default, but JKS is also supported. The container can also work with Security-Enhanced Linux (SELinux) in enforcing mode.
The included scripts, provided for HTTP and HTTP/SSL, offer examples of how to use curl to create vector embeddings, list the loaded embedding models, check on the health of the container, and produce runtime metrics.
Prerequisites
The Oracle Private AI Services Container uses Oracle Linux 8 within the container and works with the following host Linux x86_64 distributions:
- Oracle Linux 8.6+
- Oracle Linux 9
- Oracle Linux 10
The AI Services Container uses TLS 1.3. This means that you need a version of OpenSSL that supports TLS 1.3 to create certificates, for example:
- OpenSSL 1.1.1k+
- OpenSSL 3.0+
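The OpenSSL requirement above can be checked before you create certificates. The following is a minimal sketch, not part of the included scripts: the function name `openssl_version_ok` is hypothetical, and it assumes that `openssl version` prints the version number as its second field, as it does on Oracle Linux.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the given OpenSSL version string
# meets the minimums listed above (1.1.1k or later, or 3.0 or later).
openssl_version_ok() {
  case "$1" in
    [3-9].*|[1-9][0-9].*) return 0 ;;  # 3.0 and later
    1.1.1[k-z]*)          return 0 ;;  # 1.1.1k and later letter releases
    *)                    return 1 ;;  # anything older
  esac
}

# Check the installed OpenSSL, if there is one on the PATH.
if command -v openssl >/dev/null 2>&1; then
  installed=$(openssl version | awk '{print $2}')
  if openssl_version_ok "$installed"; then
    echo "OpenSSL $installed meets the documented minimum"
  else
    echo "OpenSSL $installed is older than the documented minimum" >&2
  fi
fi
```

The comparison uses only shell pattern matching, so it works on a host with no extra tools beyond `awk`.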
The AI Services Container can be used with the following software:
- Podman
- Kubernetes
- Red Hat OpenShift
The chosen software must be one of the following versions:
- Podman 3.4.4+, 4.4+, 5+
- Kubernetes 1.31.1+
- Red Hat OpenShift 4.19+
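Because the included examples use Podman, it can help to confirm the Podman version on the host first. The sketch below is illustrative only: `podman_version_ok` is a hypothetical name, and it reads the list above as three separate ranges (3.4.4 or later within 3.x, 4.4 or later within 4.x, and any 5.x or later release), which is an assumption about how the list should be interpreted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a Podman version string such as
# "4.9.4-rhel" falls in one of the ranges listed above.
# The body runs in a subshell so that IFS and the variables stay local.
podman_version_ok() (
  IFS=.
  set -- $1                       # split "major.minor.patch" on dots
  major=${1:-0}; minor=${2:-0}; patch=${3:-0}
  minor=${minor%%[!0-9]*}; minor=${minor:-0}   # drop suffixes like "-rhel"
  patch=${patch%%[!0-9]*}; patch=${patch:-0}
  if [ "$major" -ge 5 ]; then true
  elif [ "$major" -eq 4 ] && [ "$minor" -ge 4 ]; then true
  elif [ "$major" -eq 3 ] && [ "$minor" -gt 4 ]; then true
  elif [ "$major" -eq 3 ] && [ "$minor" -eq 4 ] && [ "$patch" -ge 4 ]; then true
  else false
  fi
)

# "podman --version" prints a line such as "podman version 4.9.4-rhel".
if command -v podman >/dev/null 2>&1; then
  installed=$(podman --version | awk '{print $3}')
  if podman_version_ok "$installed"; then
    echo "Podman $installed is in a supported range"
  else
    echo "Podman $installed is not in a supported range" >&2
  fi
fi
```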
The included examples use Podman with Oracle Linux 8 as the host operating system on OCI. The example OCI VM where the container is installed is called privateaivm.
- Memory: You must have at least 16 GB of free memory to effectively use the container to create vector embeddings. If you want to use large embedding models or create multiple different types of vectors at the same time, then you will need more memory. Insufficient memory can result in reduced performance.
- CPU cores: Although you can use a single CPU core with the ONNX Runtime to create vector embeddings, having more CPU cores will enhance performance. Multiple CPU cores enable either a single vector or many different vectors to be created at the same time using multi-threading.
- Disk space: You need at least 22 GB of disk space to effectively use the container. You will need additional disk space for each additional embedding model that you use. You will also need additional disk space depending on the log level for long-running containers.
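The memory, CPU, and disk minimums above can be checked on a candidate host before installation. This is a rough sketch rather than one of the included scripts: the helper names and the `STORAGE_DIR` variable are hypothetical, the home directory is only an assumed stand-in for wherever your container storage lives, and values are rounded down to whole GiB.

```shell
#!/usr/bin/env bash
# Minimums taken from the requirements listed above.
MIN_MEM_GB=16
MIN_DISK_GB=22

# meets_minimum ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED (integers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# report LABEL ACTUAL REQUIRED: print a one-line verdict for one resource.
report() {
  if meets_minimum "$2" "$3"; then
    echo "OK: $1 = $2 (minimum $3)"
  else
    echo "LOW: $1 = $2 (minimum $3)"
  fi
}

# Assumption: check the file system that holds the home directory.
# Point STORAGE_DIR at your actual container storage location instead.
STORAGE_DIR=${STORAGE_DIR:-${HOME:-/}}

if [ -r /proc/meminfo ]; then
  # MemAvailable and df -Pk both report kB; convert to whole GiB.
  mem_gb=$(awk '/^MemAvailable:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
  disk_gb=$(df -Pk "$STORAGE_DIR" | awk 'NR == 2 {print int($4 / 1024 / 1024)}')
  cores=$(nproc 2>/dev/null || echo 1)
  report "available memory (GiB)" "${mem_gb:-0}" "$MIN_MEM_GB"
  report "free disk space (GiB)" "${disk_gb:-0}" "$MIN_DISK_GB"
  report "CPU cores" "$cores" 1
fi
```

A "LOW" line is only a warning. As noted above, larger embedding models, additional models, and verbose logging all raise the real requirement beyond these minimums.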
Note that the container should not run on the same machine as the Oracle AI Database server. Instead, run it on a Linux machine that is close to the database server. The container is designed to accelerate resource-intensive tasks, such as creating vectors or vector index offload, by moving them to machines other than the Oracle AI Database server. This enables the Oracle AI Database to maintain low latency and high throughput without being burdened by these resource-intensive AI infrastructure tasks.