Private AI Services Container API Reference

models

Use GET requests to return a list of all currently deployed models.

Syntax

/v1/models

Returns

A list of deployed models.

Example Output

{
  "data": [
    {
      "id": "h126414603234059290",
      "modelSize": "string",
      "modelDeployedTime": "2025-12-22T15:49:11.745Z",
      "modelCapabilities": [
        "TEXT_EMBEDDINGS"
      ]
    }
  ]
}
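As an illustration of consuming this response, the following Python sketch parses the example output and picks out the IDs of models that advertise a given capability. The helper name model_ids_with_capability is hypothetical and not part of the API; the sample body is copied from the example output.

```python
import json

# Sample body copied from the example output for GET /v1/models.
sample = """
{
  "data": [
    {
      "id": "h126414603234059290",
      "modelSize": "string",
      "modelDeployedTime": "2025-12-22T15:49:11.745Z",
      "modelCapabilities": ["TEXT_EMBEDDINGS"]
    }
  ]
}
"""


def model_ids_with_capability(body: str, capability: str) -> list[str]:
    """Return the IDs of models whose modelCapabilities include `capability`.

    Hypothetical helper: it only reads the fields shown in the example output.
    """
    models = json.loads(body)["data"]
    return [m["id"] for m in models if capability in m.get("modelCapabilities", [])]


embedding_models = model_ids_with_capability(sample, "TEXT_EMBEDDINGS")
print(embedding_models)
```

The returned IDs are the values expected by the id parameter of /v1/models/{id}.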

models/{id}

Use GET requests to return information about a specific model.

Syntax

/v1/models/{id}

Parameters

id (string): A unique model name. This parameter is required.

Returns

Information about the model specified by model ID.

Example Output

{
  "id": "L55808652807957200809612118083123839056757756025",
  "modelSize": "string",
  "modelDeployedTime": "2025-12-22T16:52:28.365Z",
  "modelCapabilities": [
    "TEXT_EMBEDDINGS"
  ]
}
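A minimal Python sketch of working with this endpoint follows. It builds the request URL for a model ID and converts modelDeployedTime into a timezone-aware datetime. The base URL http://localhost:8080 and both helper names are assumptions made for illustration, and the timestamp format is inferred from the example output only.

```python
from datetime import datetime, timezone
from urllib.parse import quote

# Assumption: the container is reachable at this address; adjust as needed.
BASE_URL = "http://localhost:8080"


def model_url(model_id: str, base_url: str = BASE_URL) -> str:
    """Build the /v1/models/{id} URL, percent-encoding the ID for use in a path."""
    return f"{base_url}/v1/models/{quote(model_id, safe='')}"


def parse_deployed_time(value: str) -> datetime:
    """Parse a modelDeployedTime value such as 2025-12-22T16:52:28.365Z.

    Assumes the value always carries fractional seconds and a trailing Z,
    as in the example output.
    """
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


url = model_url("h126414603234059290")
deployed = parse_deployed_time("2025-12-22T16:52:28.365Z")
print(url)
print(deployed.astimezone(timezone.utc).isoformat())
```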

embeddings

Use POST requests to generate embeddings for the input using the specified model.

Syntax

/v1/embeddings

Parameters

x-convert-images (boolean): Indicates whether images in the input list require conversion to JPG. The default value is false.

Example Input

Note that input can be a string or an array of strings.

{
  "input": "string",
  "model": "string"
}

Example Output

Embedding results:

{
  "data": [
    {
      "embedding": [
        0
      ],
      "index": 0
    }
  ],
  "model": "string"
}

400: Error processing the request data.
404: Model not found.
500: An error occurred during the score operation for this model.
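The request and response shapes above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive client: the base URL http://localhost:8080, the Content-Type header, and the helper names are not taken from this reference. The x-convert-images parameter is left at its default, and the request is only constructed, not sent.

```python
import json
from urllib.request import Request

# Assumption: the container is reachable at this address; adjust as needed.
BASE_URL = "http://localhost:8080"


def build_embeddings_request(model, inputs, base_url=BASE_URL) -> Request:
    """Build a POST /v1/embeddings request.

    `inputs` may be a single string or a list of strings, matching the
    documented input field.
    """
    payload = {"input": inputs, "model": model}
    return Request(
        f"{base_url}/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the JSON body is declared with this content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embeddings_by_index(body: str) -> dict:
    """Map each result's index to its embedding vector."""
    return {item["index"]: item["embedding"] for item in json.loads(body)["data"]}


request = build_embeddings_request("h126414603234059290", ["first text", "second text"])

# Response body copied from the example output.
sample_response = '{"data": [{"embedding": [0], "index": 0}], "model": "string"}'
vectors = embeddings_by_index(sample_response)
print(request.get_method(), request.full_url)
print(vectors)
```

Passing the request to urllib.request.urlopen would send it; a 400, 404, or 500 response would then surface as urllib.error.HTTPError, with the status available in its code attribute.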

health

Use GET requests to verify that the container is ready to use.

Syntax

/health

Example Output

200: Private AI Services Container is up and running.
401: Unauthorized
500: Internal server error
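A readiness probe built on these status codes might look like the Python sketch below. The function names and the five-second timeout are assumptions for illustration. Because this reference does not describe the authentication scheme behind the 401 response, the sketch sends no credentials.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# Status descriptions taken from the list above.
HEALTH_STATUS = {
    200: "Private AI Services Container is up and running.",
    401: "Unauthorized",
    500: "Internal server error",
}


def describe_health_status(status: int) -> str:
    """Translate a /health status code into its documented description."""
    return HEALTH_STATUS.get(status, f"Unexpected status {status}")


def is_ready(status: int) -> bool:
    """Only a 200 response means the container is ready to use."""
    return status == 200


def check_health(base_url: str) -> int:
    """Call GET /health and return the HTTP status code.

    Hypothetical helper: connection failures (urllib.error.URLError) are left
    to propagate to the caller.
    """
    try:
        with urlopen(f"{base_url}/health", timeout=5) as response:
            return response.status
    except HTTPError as err:
        # urllib raises for 4xx and 5xx responses; the code is still usable.
        return err.code


print(describe_health_status(200), is_ready(200))
```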

metrics

Use GET requests to return a list of metric names exposed by the application.

Syntax

/metrics

Returns

A list of metric names exposed by the application.

Example Output

Successful response with metric names:

{
  "names": [
    "embeddings_call_error_total",
    "embeddings_call_latency",
    "embeddings_call_success_total",
    "embeddings_call_total",
    "embeddings_last_latency",
    "http.server.requests",
    "jvm.memory.used",
    "process.cpu.usage",
    "system.cpu.usage"
  ]
}

401: Unauthorized
500: Internal server error

metrics/{metricName}

Use GET requests to return detailed information for a metric, including its measurements and available tags. To filter by tag, add one or more tag query parameters, each in the form key:value.

Syntax

/metrics/{metricName}

Parameters

metricName (string): Metric name as returned by GET /metrics. This parameter is required.

tag (array<string>): Tag filter(s) in the form key:value. Repeat for multiple tags.

Example Output

Metric details:

{
  "name": "embeddings_call_error_total",
  "description": "Total number of errors from embeddings calls.",
  "baseUnit": "count",
  "measurements": [
    {
      "statistic": "COUNT",
      "value": 3
    }
  ],
  "availableTags": [
    {
      "tag": "model"
    },
    {
      "tag": "status",
      "values": [
        "success",
        "error"
      ]
    }
  ]
}

400: Invalid tag filter
404: Metric not found
500: Internal server error
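The repeated tag query parameter can be sketched in Python as shown below. The base URL http://localhost:8080 and the helper names are assumptions for illustration. Note that urlencode percent-encodes the colon in key:value as %3A; this is standard query-string encoding, and the sketch assumes the server decodes it.

```python
import json
from urllib.parse import quote, urlencode

# Assumption: the container is reachable at this address; adjust as needed.
BASE_URL = "http://localhost:8080"


def metric_url(metric_name, tags=None, base_url=BASE_URL) -> str:
    """Build the /metrics/{metricName} URL with one tag parameter per filter."""
    url = f"{base_url}/metrics/{quote(metric_name, safe='')}"
    if tags:
        # Each filter becomes its own tag=key:value pair in the query string.
        pairs = [("tag", f"{key}:{value}") for key, value in tags.items()]
        url = f"{url}?{urlencode(pairs)}"
    return url


def measurement(body: str, statistic: str):
    """Return the value of the named statistic, or None if it is absent."""
    for entry in json.loads(body)["measurements"]:
        if entry["statistic"] == statistic:
            return entry["value"]
    return None


filtered_url = metric_url(
    "embeddings_call_error_total",
    {"status": "error", "model": "h126414603234059290"},
)

# Trimmed copy of the example output.
sample = '{"name": "embeddings_call_error_total", "measurements": [{"statistic": "COUNT", "value": 3}]}'
error_count = measurement(sample, "COUNT")
print(filtered_url)
print(error_count)
```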

api

Use GET requests to return the OpenAPI specification for this API in YAML format.

Syntax

/v1/api

Returns

The OpenAPI document in YAML format, as a string.
