39.2 Examples of Using ONNX Models
The following examples use the Iris data set to showcase loading and inference from ONNX format machine learning models for machine learning techniques such as Classification, Regression, and Clustering in your Oracle AI Database instance.
ONNX Classification Examples
The following examples showcase various JSON metadata parameters that can be defined for ONNX models.
Example: Specifying JSON Metadata for Classification Models
The following example illustrates JSON metadata parameters with
                Classification as the function. Assume the model has an output named
                    probabilities for the probability of the prediction. To use the
                    PREDICTION_PROBABILITY scoring function, you must set the field
                    classificationProbOutput to the name of the model output that
                holds the probability.
                  
BEGIN 
DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model', 
JSON('{"function" : "classification",
       "classificationProbOutput": "probabilities"}'));
END;
/
                  Example: Specifying labels in JSON Metadata for Classification Models
The following example illustrates how you can specify custom labels in the JSON metadata.
BEGIN 
DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model',
JSON('{"function" : "classification",
	   "classificationProbOutput": "probabilities",
	   "labels": ["Setosa", "Versicolour", "Virginica"]}'));
END;
/
                  You can use the PREDICTION and
                    PREDICTION_PROBABILITY functions for inference or scoring:
                  
SELECT
    iris.*,
    PREDICTION(doc_model USING *) as predicted_species_id,
    PREDICTION_PROBABILITY(doc_model, 'setosa' USING *) as setosa_probability
FROM iris;
                  The query predicts iris species and the probability of
                    setosa species using the iris data set. The data from
                    iris table is used in a SELECT query to
                predict a species ID and the probability that the species is setosa using a
                machine learning model named doc_model. The
                    PREDICTION function predicts the species based on the
                attributes in the table, and the PREDICTION_PROBABILITY function
                computes the probability that the predicted species is setosa. The result
                includes all columns from the iris view along with the predicted
                species ID and the probability of the species being setosa.
                  
Example: Specifying input in JSON Metadata for Classification Models
The following example illustrates how you can specify input attribute
                names that map to the actual ONNX model input names. This example assumes a model
                with four inputs named SEPAL_LENGTH, SEPAL_WIDTH,
                    PETAL_LENGTH, and PETAL_WIDTH. You can specify
                alternative input attribute names using the JSON metadata as shown in this example.
                Here, each input is assumed to be a tensor with a dimension of 1. The
                    input field must be a JSON object where each field is a model
                input name (For example, SEPAL_LENGTH), and its value is a JSON
                array sized according to the tensor's dimension (here, 1) with one attribute name
                per element in the array.
                  
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model', 
JSON('{"function" : "classification", 
       "classificationProbOutput": "probabilities", 
       "input": { "SEPAL_LENGTH": ["SEPAL_LENGTH_CM"], 
                  "SEPAL_WIDTH": ["SEPAL_WIDTH_CM"], 
                  "PETAL_LENGTH": ["PETAL_LENGTH_CM"], 
                  "PETAL_WIDTH": ["PETAL_WIDTH_CM"] } }'));
END;
/
                  You can also have a different order of the columns as input.
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model', 
	JSON('{"function" : "classification", 
		   "classificationProbOutput": "probabilities", 
		   "input": { "SEPAL_WIDTH": ["SEPAL_WIDTH_CM"],
			       "PETAL_LENGTH": ["PETAL_LENGTH_CM"],
			       "PETAL_WIDTH": ["PETAL_WIDTH_CM"],
				"SEPAL_LENGTH": ["SEPAL_LENGTH_CM"] } }'));
END;
/
                  Example: Specifying a Single input With Four Dimensions
Here is an example where the model has a single input tensor named
                    x with four dimensions. The corresponding JSON metadata for
                this scenario is:
                  
JSON('{"function" : "classification",
       "classificationProbOutput": "probabilities",
       "input": { "x": ["SEPAL_LENGTH_CM",
                        "SEPAL_WIDTH_CM",
                        "PETAL_LENGTH_CM",
                        "PETAL_WIDTH_CM"]
                     }'));
                  You can use PREDICTION and
                    PREDICTION_PROBABILITY functions for inference or scoring.
                  
WITH
dummy_iris AS (
    SELECT
    4.5 as petal_length_cm,
    1.5 as petal_width_cm,
    4.3 as sepal_length_cm,
    2.9 as sepal_width_cm
    FROM iris
)
SELECT
    dummy_iris.*,
    PREDICTION(doc_model USING *) as predicted_species_id,
    PREDICTION_PROBABILITY(doc_model 'setosa' USING *) as setosa_probability
FROM dummy_iris;
                  The query predicts iris species and the probability of
                    setosa species using specified attributes in a temporary data set. The
                query creates a temporary dummy_iris view with attributes values
                set. This temporary view is then used in a SELECT query to predict
                a species ID and the probability that the species is setosa using a machine
                learning model named doc_model. The PREDICTION
                function predicts the species based on the attributes provided, and the
                    PREDICTION_PROBABILITY function computes the probability that
                the predicted species is setosa. The result includes all columns from the
                    dummy_iris view along with the predicted species ID and the
                probability of the species being setosa.
                  
Example: Specifying defaultOnNull in JSON Metadata for Classification Models
The following examples illustrates how you can specify
                    defaulOnNull provides default values to be used for specific
                attributes when their values are NULL in the data set. Use the names
                    SEPAL_LENGTH, SEPAL_WIDTH,
                    PETAL_LENGTH, and PETAL_WIDTH as fields in the
                    defaultOnNull object, which are the assumed input attribute
                names for a ONNX model with four inputs. These names serve as the default input
                attribute names, so you can use them as fields in the
                defaultOnNull.
                  
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model', 
	JSON('{"function" : "classification", 
		   "classificationProbOutput": "probabilities", 
		   "defaultOnNull": {"SEPAL_LENGTH": "5.1", 
		   "SEPAL_WIDTH": "3.5", 
		   "PETAL_LENGTH": "1.4", 
		   "PETAL_WIDTH": "0.2"}}'));
END;
/
                  - 
                        
                        
"SEPAL_LENGTH": "5.1": If the sepal length is null, use 5.1 as the default value. - 
                        
                        
"SEPAL_WIDTH": "3.5": If the sepal width is null, use 3.5 as the default value. - 
                        
                        
"PETAL_LENGTH": "1.4": If the petal length is null, use 1.4 as the default value. - 
                        
                        
"PETAL_WIDTH": "0.2": If the petal width is null, use 0.2 as the default value. 
Example: Specifying input and defaultOnNull JSON Metadata for Classification Models
Here is a combined example of specifying input and
                    defaultOnNull values. This example uses the values that were
                illustrated in the earlier examples where input and
                    defaultOnNull values are specified: 
                  
JSON('{"function" : "classification",
               "classificationProbOutput": "probabilities",
               "input": { "SEPAL_WIDTH": ["SEPAL_WIDTH_CM"],
                          "PETAL_LENGTH": ["PETAL_LENGTH_CM"],
                          "PETAL_WIDTH": ["PETAL_WIDTH_CM"],
                          "SEPAL_LENGTH": ["SEPAL_LENGTH_CM"] },
                    "defaultOnNull": {"SEPAL_LENGTH_CM": "5.1",
                                      "SEPAL_WIDTH_CM": "3.5"}}')
                  ONNX Clustering Examples
The following examples showcase various JSON metadata parameters that can be defined for ONNX models.
Example: Specifying JSON Metadata for Clustering Models
The following example illustrates JSON metadata parameters with
                Clustering as the function. Assume the model has an output named
                    probabilities for the probability of the prediction. To use the
                    CLUSTER_PROBABILITY scoring function, you must set the field
                    clusteringProbOutput to the name of the model output that holds
                the probability.
                  
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL('clustering_model.onnx','doc_model',
    JSON('{"function": "clustering",
           "clusteringProbOutput": "probabilities"
        }
    ')
);
END;
/
                  You can use CLUSTER_ID and
                    CLUSTER_PROBABILITY functions for inference or scoring.
                  
SELECT
    iris.*,
    CLUSTER_ID(doc_model USING *) as cluster_id,
    CLUSTER_PROBABILITY(doc_model, 1 USING *) as cluster_1_probability
FROM iris;
                  This query predicts the cluster assignments and the probabilities of
                belonging to a specific cluster for each record of the iris data
                set. The query retrieves all columns of each record (iris.*) and
                applies the clustering model named doc_model to each record of the
                    iris data set and predicts the cluster ID. The USING
                    * clause tells the model to use all available columns in the
                    iris table for this prediction. The
                    CLUSTER_PROBABILITY(doc_model, 1 USING *) as
                    cluster_1_probability part of the query calculates the probability that
                each record belongs to cluster 1, according to the doc_model from
                the iris data set. This provides insights into how likely each
                record is to be part of cluster 1, giving a quantitative measure of membership
                strength.
                  
Example: Specifying clusteringDistanceOutput in JSON Metadata for Clustering Models
The following example illustrates how you can specify
                    clusteringDistanceOutput and for ONNX Clustering models.
                  
In this model, an output tensor named distances
                provides distances for the input, which is a single tensor named
                    float_input with a dimension of 4. The JSON metadata
                    input field must map attribute names to entries of the tensor,
                such as "SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH",
                    "PETAL_WIDTH".
                  
BEGIN 
DBMS_VECTOR.LOAD_ONNX_MODEL('clustering_model.onnx', 'doc_model', 
JSON('{"function" : "clustering", 
"clusteringDistanceOutput": "distances", 
"normalizeProb": "softmax", 
"input": { "float_input": ["SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH", "PETAL_WIDTH"] } 
      }')
);
END;
/
                  You can use
                CLUSTER_DISTANCE
                function for inference or scoring. These SQL queries utilize clustering models to
                predict cluster distances from the IRIS
                data set.
                  
SELECT CLUSTER_DISTANCE(doc_model USING *) AS predicted_target_value,
CLUSTER_DISTANCE (doc_model,1 USING *) AS dist1,
CLUSTER_DISTANCE (doc_model,2 USING *) AS dist2,
CLUSTER_DISTANCE (doc_model,3 USING *) AS dist3
FROM IRIS
ORDER BY ID
FETCH NEXT 10 ROWS ONLY;
                  Here, the query focuses on understanding the physical distance of data
                points from cluster centroids, which is particularly useful for identifying outliers
                or for performing detailed cluster analysis. The query calculates the distance of
                each record in the IRIS data set from the centroids of different
                clusters using the doc_model. The USING * syntax
                indicates that the model must use all available columns of the IRIS
                data set for making the prediction. CLUSTER_DISTANCE(doc_model, n USING
                    *) computes the distance from cluster n
                    (n being 1, 2, and 3 in this query). Each distance is selected
                as a separate column (dist1, dist2,
                    dist3). 
                  
The output is limited to the first 10 rows of the result set ordered by
                the ID column of the IRIS table.
                  
Example: Specifying clusteringProbOutput and normalizeProb in JSON Metadata for Clustering Models
The following example illustrates how you can specify
                    clusteringProbOutput and normalizeProb for
                ONNX Clustering models.
                  
BEGIN 
DBMS_VECTOR.LOAD_ONNX_MODEL('clustering_model.onnx', 'doc_model',
               JSON('{"function" :
			   "clustering",
			   "clusteringProbOutput": "probabilities",
                        "normalizeProb" : "softmax",
			   "input": { "float_input": ["SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH", "PETAL_WIDTH"] } }')
      );
END;
/
                  You can use CLUSTER_PROBABILITY and
                    CLUSTER_SET functions for inference or
                scoring:
                  
SELECT CLUSTER_ID (doc_model USING *) AS predicted_target_value,
                    CLUSTER_PROBABILITY (doc_model,1 USING *) AS prob1,
                    CLUSTER_PROBABILITY (doc_model,2 USING *) AS prob2,
                    CLUSTER_PROBABILITY (doc_model,3 USING *) AS prob3
                    FROM IRIS
                    ORDER BY ID
                    FETCH NEXT 10 ROWS ONLY;
                In this case, a clustering model is used to predict the cluster IDs and
                associated probabilities for records from the IRIS data set.
                Because the JSON metadata specifies softmax for the
                    normalizeProb field, the model applies softmax normalization to
                the probabilities before returning them as the result of the
                    CLUSTER_PROBABILITY scoring operator.
                  
The
                SQL query selects CLUSTER_ID column from the IRIS
                table and adds a new column, predicted_target_value, which contains
                predictions made by the doc_model. The USING *
                syntax means that all columns of the current row are used as input features for the
                    doc_model model to predict the value as
                    predicted_target_value. The result of this prediction is then
                included as a new column in the output of the query. 
                  
CLUSTER_PROBABILITY(model, n USING *): Computes the
                probability that the record belongs to cluster n
                    (n being 1, 2, and 3 in this query). This is done for three
                different clusters, and each probability is selected as a separate column
                    (prob1, prob2, prob3).
                  
The output is limited to the first 10 rows of the result set ordered by
                the ID column of the IRIS table.
                  
SELECT S.CLUSTER_ID, S.PROBABILITY
 FROM (SELECT CLUSTER_SET(doc_model USING *) pset
          FROM IRIS ORDER BY ID) T,
       TABLE(T.pset) S
FETCH NEXT 10 ROWS ONLY;
                  The CLUSTER_SET query generates a set of cluster data
                using the doc_model. The resultant column pset
                represents all possible cluster assignments for each record, which includes cluster
                IDs and their respective probabilities ordered by the ID column.
                The SELECT S.CLUSTER_ID, S.PROBABILITY part of the query selects
                the cluster ID and probability from the resultant column set. The output is limited
                to the first 10 rows of the result set.
                  
ONNX Regression Examples
The following examples showcase various JSON metadata parameters that can
                be defined for ONNX Regression models. All examples assume an ONNX model that has
                one output named regressionOutput and four input tensors of
                dimension 1 whose name match exactly the name of the IRIS table
                columns, namely, SEPAL_LENGTH, SEPAL_WIDTH, PETAL_LENGTH,
                    PETAL_WIDTH.
                  
Example: Specifying JSON Metadata for Regression Models
The following is a simple example illustrating JSON metadata parameters
                with Regression as the function. Assume the ONNX model features one output named
                    regressionOutput and four input tensors of dimension 1, whose
                names match exactly after the IRIS table columns
                    ("SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH", "PETAL_WIDTH").
                The JSON metadata can be as simple as the following:
                  
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL(
    'regression_model.onnx',
    'doc_model',
    JSON('{"function": "regression"}
    ')
);
END;
/
                  You can use the PREDICTION function for inference or
                scoring:
                  
SELECT
    iris.*,
    PREDICTION(doc_model USING *) as predicted_petal_width_cm
FROM iris;
                  In this case, the SQL query selects all columns from the
                    iris table and adds a new column,
                    predicted_petal_width_cm, which contains predictions made by
                the doc_model. The USING * syntax means that all
                columns of the current row are used as input features for the
                    doc_model model to predict the value of
                    PETAL_WIDTH as predicted_petal_width_cm. The
                result of this prediction is then included as a new column in the output of the
                query. 
                  
Example: Specifying input and defaultOnNull in JSON Metadata for Regression Models
The following example illustrates how you can specify input attribute
                names that map to the actual ONNX model input names. The
                    defaulOnNull providing default values to be used for specific
                attributes when their values are NULL in the data set.
                  
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL('regression_model.onnx','doc_model',
    JSON('{"function": "regression",
            "input": {
                "SEPAL_LENGTH": ["dummy_sepal_length_cm"],
                "SEPAL_WIDTH": ["dummy_sepal_width_cm"]
                      },
            "defaultOnNull": {
                "dummy_sepal_length_cm": "5.1",
                "dummy_sepal_width_cm": "3.5",
                 }
           }
        ')
);
END;
/
                  You can use the PREDICTION function for inference or
                scoring.
                  
WITH
dummy_iris AS (
    SELECT
    (CASE WHEN petal_length > 5 THEN 4.9 ELSE NULL END)
        as dummy_sepal_length_cm,
    (CASE WHEN petal_length < 4 THEN 2.5 ELSE NULL END)
        as dummy_sepal_width_cm,
    petal_length
    petal_width
    FROM iris
)
SELECT
    dummy_iris.*,
    PREDICTION(doc_model USING *) as predicted_petal_width_cm
FROM dummy_iris;
                  In this case, a temporary dummy_iris table is created
                with three columns: dummy_sepal_length_cm,
                    dummy_sepal_width_cm, and petal_length. The
                values of the dummy_sepal_length_cm and
                    dummy_sepal_width_cm are based on petal_length
                values of the iris table. If petal_length is
                greater than 5, dummy_sepal_length_cm is set to 4.9, otherwise it
                is NULL. If petal_length is less than 4,
                    dummy_sepal_width_cm is set to 2.5, otherwise it remains NULL. 
                  
Then the SELECT query retrieves all columns from the
                    dummy_iris table and uses the doc_model to
                predict petal_width, adding this prediction as a new column named
                    predicted_petal_width_cm. The model uses the derived dummy
                columns, petal_length and petal_width for its
                predictions.
                  
See Also:
- 
                        
                        
LOAD_ONNX_MODEL in Oracle AI Database PL/SQL Packages and Types Reference