Considerations when Configuring the Private AI Services Container

When you configure the Private AI Services Container, consider the following options and background information to get the most out of the service.

Available Embedding Models
Several embedding models can be used with the Private AI Services Container. Some are shipped with the container, others are pre-built and available for download, and others are known to work with the container but must be built using Oracle Machine Learning for Python (OML4Py) Client 2.1.
Private AI Environment Variables
Environment variables set configuration properties such as file locations and logging settings. Pass them to the container at run time using the standard -e flag.
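As a rough sketch of the -e mechanism, the following Python helper assembles a container run command with one -e flag per variable. The helper name build_run_command, the image name private-ai-example:latest, and the choice of docker as the runtime are assumptions made for illustration; PRIVATE_AI_AUTHENTICATION_ENABLED is the only variable name taken from this guide.

```python
import shlex


def build_run_command(image, env, runtime="docker"):
    """Return an argv list that passes each variable with its own -e flag."""
    cmd = [runtime, "run", "--rm"]
    for name, value in sorted(env.items()):
        cmd += ["-e", f"{name}={value}"]
    cmd.append(image)
    return cmd


# Hypothetical image name; replace it with the image you actually use.
command = build_run_command(
    "private-ai-example:latest",
    {"PRIVATE_AI_AUTHENTICATION_ENABLED": "true"},
)
print(shlex.join(command))
```

Building the command as a list rather than a single string keeps each variable in its own argument, which avoids quoting problems when a value contains spaces.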
Use API Keys in Clients
Clients of the Private AI Services Container must send a valid API key when the service has PRIVATE_AI_AUTHENTICATION_ENABLED set to true.
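A minimal client-side sketch follows, assuming the key travels in an Authorization: Bearer header. This overview does not state the header scheme, so treat the header format, the URL, and the JSON payload shape as placeholders and confirm the real values in the API key topic. The helper name build_request is hypothetical.

```python
import json
import urllib.request


def build_request(url, api_key, payload):
    """Build (but do not send) a POST request that carries an API key."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token scheme; the service may expect another header.
            "Authorization": f"Bearer {api_key}",
        },
    )


# Hypothetical endpoint and payload, for illustration only.
request = build_request(
    "https://localhost:8443/example-endpoint",
    api_key="REPLACE_WITH_YOUR_KEY",
    payload={"input": "hello"},
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

In practice the key would be read from a secret store or an environment variable rather than written into the client source.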
Container Input Validation
Inputs to the Private AI Services Container can come from a variety of sources, each of which is validated using a different method.
Container Automatic Image Conversion
The request header x-convert-images can be used to instruct the container to examine the format of input images and convert them as necessary. The JPEG format is supported by default.
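To illustrate what examining an image's format can involve, the sketch below identifies a few common formats from their leading signature bytes. It is not the container's implementation. The function names sniff_image_format and needs_conversion are hypothetical, and the default supported set simply mirrors the statement above that JPEG is supported by default.

```python
def sniff_image_format(data: bytes) -> str:
    """Guess an image format from its leading signature bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    return "unknown"


def needs_conversion(data: bytes, supported=("jpeg",)) -> bool:
    """True when the detected format is outside the supported set.

    Unrecognized data also returns True, so the caller must still decide
    whether a conversion is actually possible.
    """
    return sniff_image_format(data) not in supported
```

Checking signature bytes rather than the file extension is the usual choice here, because an extension can be missing or wrong while the leading bytes reflect the actual encoding.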
Rate Limiting
The Private AI Services Container provides a configurable limit on the number of requests per minute that it accepts for each category of endpoint. This improves scalability and helps prevent abuse, including denial-of-service (DoS) attacks.
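The idea of a per-category, per-minute limit can be sketched with a simple fixed-window counter. This is an illustration of the general technique only, not the algorithm the container uses; the class name PerMinuteLimiter and the category names are hypothetical.

```python
import time
from collections import defaultdict


class PerMinuteLimiter:
    """Fixed-window limiter: at most `limit` requests per category per minute."""

    def __init__(self, limits, clock=time.monotonic):
        self.limits = limits    # hypothetical categories, e.g. {"embeddings": 60}
        self.clock = clock      # injectable so tests can control time
        # category -> (window id, requests counted in that window)
        self.windows = defaultdict(lambda: (None, 0))

    def allow(self, category):
        """Return True and count the request if the category is under its limit."""
        window = int(self.clock() // 60)
        current, count = self.windows[category]
        if current != window:
            count = 0           # a new one-minute window started; reset the counter
        if count >= self.limits[category]:
            self.windows[category] = (window, count)
            return False
        self.windows[category] = (window, count + 1)
        return True
```

A fixed window is the simplest variant; it can admit a burst of up to twice the limit across a window boundary, which sliding-window or token-bucket designs avoid at the cost of more bookkeeping.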
Transport Layer Security
The container administrator can manage Transport Layer Security (TLS) using environment variables and a keystore file, along with a password file used to access the keystore.
Container Log Files
Diagnostic information about the Private AI Services Container is recorded in log messages.
Multi-threaded Scaling
The ONNX Runtime enables multi-threading and can benefit from multiple CPU cores.