10.6.2.5 pyqGroupEval Function (Autonomous Database)

The function pyqGroupEval when used in Oracle Autonomous Database, groups data by one or more columns and runs a user-defined Python function on each group.

The function pyqGroupEval runs the user-defined Python function specified by the SCR_NAME parameter. Pass data to the user-defined Python function with the INP_NAM parameter. The PAR_LST parameter specifies the special control argument oml_graphics_flag to capture images rendered in the script, and the oml_parallel_flag and oml_service_level flags enable parallelism using the MEDIUM service level. See also: Special Control Arguments (Autonomous Database). Specify one or more grouping columns with the GRP_COL parameter.

The user-defined Python function can return a boolean, a dict, a float, an int, a list, a str, a tuple or a pandas.DataFrame object. Define the form of the returned value with the OUT_FMT parameter.

Syntax

FUNCTION PYQSYS.pyqGroupEval(
    INP_NAM    VARCHAR2,
    PAR_LST    VARCHAR2,
    OUT_FMT    VARCHAR2,
    GRP_COL    VARCHAR2,
    ORD_COL    VARCHAR2,
    SCR_NAME   VARCHAR2,
    SCR_OWNER  VARCHAR2 DEFAULT NULL,
    ENV_NAME   VARCHAR2 DEFAULT NULL
    )
    RETURN SYS.AnyDataSet

Parameters

Parameter Description

INP_NAM

The name of a table or view that specifies the data to pass to the Python function specified by the SCR_NAME parameter. If using a table or view owned by another user, use the format <owner name>.<table/view name>. You must have read access to the specified table or view.

PAR_LST

A JSON string that contains additional parameters to pass to the user-defined Python function specified by the SCR_NAME parameter. Special control arguments, which start with oml_, are not passed to the function specified by SCR_NAME, but instead control what happens before or after the invocation of the function.

For example, to specify the input data type as pandas.DataFrame, use:

'{"oml_input_type":"pandas.DataFrame"}'

See also: Special Control Arguments (Autonomous Database).

OUT_FMT

The format of the output returned by the function. It can be one of the following:

  • A JSON string that specifies the column names and data types of the table returned by the function. Any image data is discarded. The Python function must return a pandas.DataFrame, a numpy.ndarray, a tuple, or a list of tuples.
  • The string 'JSON', which specifies that the table returned contains a CLOB that is a JSON string.

  • The string 'XML', which specifies that the table returned contains a CLOB that is an XML string. The XML can contain both structured data and images, with structured or semi-structured Python objects first, followed by the image or images generated by the Python function.
  • The string 'PNG', which specifies that the table returned contains a BLOB that has the image or images generated by the Python function. Images are returned as a base 64 encoding of the PNG representation.

See also: Output Formats (Autonomous Database).

GRP_COL

The names of the grouping columns by which to partition the data. Use commas to separate multiple columns. For example, to group by GENDER and YEAR:

"GENDER,YEAR"

ORD_COL

Comma-separated column names to order the input data. For example to order by GENDER:

"GENDER"

If specified, the input data will first be ordered by the ORD_COL columns and then grouped by the GRP_COL columns.

SCR_NAME

The name of a user-defined Python function in the OML4Py script repository.

SCR_OWNER

The owner of the registered Python script. The default value is NULL. If NULL, will search for the Python script in the user’s script repository.

ENV_NAME

The name of the conda environment that should be used when running the named user-defined Python function.

Example

This example uses the IRIS table created in the example shown in pyqEval Function (Autonomous Database).

Define the Python function group_count and store it with the name mygroupcount in the script repository. The function returns a pandas.DataFrame generated on each group of data dat. The function also plots the sepal length with the petal length values on each group.

BEGIN
    sys.pyqScriptCreate('mygroupcount',
        'def group_count(dat):
        import pandas as pd
        import matplotlib.pyplot as plt
        plt.plot(dat[["Sepal_Length"]], dat[["Petal_Length"]], ".")
        plt.xlabel("Sepal Length")
        plt.ylabel("Petal Length")
        plt.title("{}".format(dat["Species"][0]))
        return pd.DataFrame([(dat["Species"][0], dat.shape[0])],\
        columns = ["Species", "CNT"]) ',
        FALSE, TRUE); -- V_GLOBAL, V_OVERWRITE
END;
/

Issue a query that invokes the pyqGroupEval function. In the function, the INP_NAM argument specifies the data in the IRIS table to pass to the function.

The PAR_LST argument specifies the special control argument oml_input_type.

The OUT_FMT argument specifies a JSON string that contains the column names and data types of the table returned by pyqGroupEval.

The GRP_COL parameter specifies the column to group by.

The SCR_NAME parameter specifies the user-defined Python function stored with the name mygroupcount in the script repository.

SELECT *
    FROM table(
        pyqGroupEval(
            inp_nam => 'IRIS',
            par_lst => '{"oml_input_type":"pandas.DataFrame"}',
            out_fmt => '{"Species":"varchar2(10)", "CNT":"number"}',
            grp_col => 'Species',
            ord_col => NULL,
            scr_name => 'mygroupcount'));

The output is the following:

Species CNT
---------- ----------
virginica 50
setosa 50
versicolor 50
3 rows selected.

Run the same script with IRIS data and return the XML output. The PAR_LST argument specifies the special control argument oml_graphics_flag to capture images rendered in the script. Both structured data and images are included in the XML output. The XML output is a CLOB; you can call set long [length] to get more output.

set long 300
SELECT *
    FROM table(
        pyqGroupEval(
            inp_nam => 'IRIS',
            par_lst => '{"oml_input_type":"pandas.DataFrame", "oml_graphics_flag":true, "oml_parallel_flag":true", "oml_service_level":"MEDIUM"}',
            out_fmt => 'XML',
            grp_col => 'Species',
            ord_col => NULL,
            scr_name => 'mygroupcount'));

The output is the following.

NAME VALUE
virginica <root><Py-data><pandas_dataFrame><ROW-pandas_dataFrame><Species>virginica</Species><CNT>50</CNT></ROW-pandas_dataFrame></pandas_dataFrame></Py-data><images><image><img src="data:image/pngbase64"><![CDATA[iVBORw0KGgoAAAANSUhEUgAAAoAAAAHgCAYAAAA10dzkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMu
setosa <root><Py-data><pandas_dataFrame><ROW-pandas_dataFrame><Species>setosa</Species><CNT>50</CNT></ROW-pandas_dataFrame></pandas_dataFrame></Py-data><images><image><img src="data:image/pngbase64"><![CDATA[iVBORw0KGgoAAAANSUhEUgAAAoAAAAHgCAYAAAA10dzkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMyw
versicolor <root><Py-data><pandas_dataFrame><ROW-pandas_dataFrame><Species>versicolor</Species><CNT>50</CNT></ROW-pandas_dataFrame></pandas_dataFrame></Py-data><images><image><img src="data:image/pngbase64"><![CDATA[iVBORw0KGgoAAAANSUhEUgAAAoAAAAHgCAYAAAA10dzkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjM

Run the same script with IRIS data and get the PNG output. The PAR_LST argument specifies the special control argument oml_graphics_flag to capture images.

column name format a7
column value format a15
column title format a16
column image format a15
SELECT *
    FROM table(
        pyqGroupEval(
            inp_nam => 'IRIS',
            par_lst => '{"oml_input_type":"pandas.DataFrame", "oml_graphics_flag":true}',
            out_fmt => 'PNG',
            grp_col => 'Species',
            ord_col => NULL,
            scr_name => 'mygroupcount'));

The output is the following:

NAME            ID VALUE           TITLE            IMAGE
------- ---------- --------------- ---------------- ---------------
GROUP_s          1 [{"Species":"se setosa           89504E470D0A1A0
etosa              tosa","CNT":50}                  A0000000D494844
                   ]                                520000028000000
                                                    1E0080600000035
                                                    D1DCE4000000397
                                                    4455874536F6674
                                                    77617265004D617
                                                    4706C6F746C6962
                                                    2076657273696F6
                                                    E332E332E332C20
                                                    6874747073

NAME            ID VALUE           TITLE            IMAGE
------- ---------- --------------- ---------------- ---------------

GROUP_v          1 [{"Species":"ve versicolor       89504E470D0A1A0
ersicol            rsicolor","CNT"                  A0000000D494844
or                 :50}]                            520000028000000
                                                    1E0080600000035
                                                    D1DCE4000000397
                                                    4455874536F6674
                                                    77617265004D617
                                                    4706C6F746C6962
                                                    2076657273696F6
                                                    E332E332E332C20

NAME            ID VALUE           TITLE            IMAGE
------- ---------- --------------- ---------------- ---------------
                                                    6874747073

GROUP_v          1 [{"Species":"vi virginica        89504E470D0A1A0
irginic            rginica","CNT":                  A0000000D494844
a                  50}]                             520000028000000
                                                    1E0080600000035
                                                    D1DCE4000000397
                                                    4455874536F6674
                                                    77617265004D617
                                                    4706C6F746C6962
                                                    2076657273696F6