10.6.2.6 pyqIndexEval Function (Autonomous Database)

The function pyqIndexEval when used in Oracle Autonomous Database, runs a user-defined Python function multiple times as required in the Python engines spawned by the database environment.

The function pyqIndexEval runs the user-defined Python function specified by the SCR_NAME parameter. The PAR_LST parameter specifies the special control argument oml_graphics_flag to capture images rendered in the script, and the oml_parallel_flag and oml_service_level flags enable parallelism using the MEDIUM service level. See also: Special Control Arguments (Autonomous Database).

Syntax

FUNCTION PYQSYS.pyqIndexEval(
    PAR_LST VARCHAR2,
    OUT_FMT VARCHAR2,
    TIMES_NUM NUMBER,
    SCR_NAME VARCHAR2,
    SCR_OWNER VARCHAR2 DEFAULT NULL,
    ENV_NAME VARCHAR2 DEFAULT NULL
)
RETURN SYS.AnyDataSet

Parameters

Parameter Description

PAR_LST

A JSON string that contains additional parameters to pass to the user-defined Python function specified by the SCR_NAME parameter. Special control arguments, which start with oml_, are not passed to the function specified by SCR_NAME, but instead control what happens before or after the invocation of the function.

For example, to specify the input data type as pandas.DataFrame, use:

'{"oml_input_type":"pandas.DataFrame"}'

See also: Special Control Arguments (Autonomous Database).

OUT_FMT

The format of the output returned by the function. It can be one of the following:

  • A JSON string that specifies the column names and data types of the table returned by the function. Any image data is discarded. The Python function must return a pandas.DataFrame, a numpy.ndarray, a tuple, or a list of tuples.
  • The string 'JSON', which specifies that the table returned contains a CLOB that is a JSON string.

  • The string 'XML', which specifies that the table returned contains a CLOB that is an XML string. The XML can contain both structured data and images, with structured or semi-structured Python objects first, followed by the image or images generated by the Python function.
  • The string 'PNG', which specifies that the table returned contains a BLOB that has the image or images generated by the Python function. Images are returned as a base 64 encoding of the PNG representation.

See also: Output Formats (Autonomous Database).

TIMES_NUM

The number of times to execute the Python script.

SCR_NAME

The name of a user-defined Python function in the OML4Py script repository.

SCR_OWNER

The owner of the registered Python script. The default value is NULL. If NULL, will search for the Python script in the user’s script repository.

ENV_NAME

The name of the conda environment that should be used when running the named user-defined Python function.

Example

Define the Python function fit_lm and store it with the name myFitMultiple in the script repository. The function returns a pandas.DataFrame containing the index and prediction score of the fitted model on the data sampled from scikit-learn’s IRIS dataset.

begin
    sys.pyqScriptCreate('myFitMultiple',
        'def fit_lm(i, sample_size):
            from sklearn import linear_model
            from sklearn.datasets import load_iris
            import pandas as pd
            
            import random
            random.seed(10)
            
            iris = load_iris()
            x = pd.DataFrame(iris.data, columns = ["Sepal_Length",\
                        "Sepal_Width","Petal_Length","Petal_Width"])
            y = pd.DataFrame(list(map(lambda x: {0:"setosa", 1: "versicolor",\
                                  2: "virginica"}[x], iris.target)),\
                         columns = ["Species"])
            dat = pd.concat([y, x], axis=1).sample(sample_size)
            regr = linear_model.LinearRegression()
            regr.fit(x.loc[:, ["Sepal_Length", "Sepal_Width", \
                              "Petal_Length"]],
                     x.loc[:,["Petal_Width"]])
            sc = regr.score(dat.loc[:, ["Sepal_Length", "Sepal_Width", \
                              "Petal_Length"]],
                            dat.loc[:,["Petal_Width"]])
            return pd.DataFrame([[i,sc]],columns=["id","score"])
        ',FALSE,TRUE); -- V_GLOBAL, V_OVERWRITE
end;
/

Issue a query that invokes the pyqIndexEval function. In the function, the PAR_LST argument specifies the function argument sample_size. The OUT_FMT argument specifies a JSON string that contains the column names and data types of the table returned by pyqIndexEval. The TIMES_NUM parameter specifies the number of times to execute the script. The SCR_NAME parameter specifies the user-defined Python function stored with the name myFitMultiple in the script repository.

select *
    from table(pyqIndexEval(
        par_lst => '{"sample_size":80,
                        "oml_parallel_flag":true", "oml_service_level":"MEDIUM"}',
        out_fmt => '{"id":"number","score":"number"}',
        times_num => 3,
        scr_name => 'myFitMultiple'));

The output is the following:

        id score
---------- ----------
         1 .943550631
         2 .927836941
         3 .937196049
3 rows selected.