10.5.5 pyqGroupEval Function (On-Premises Database)
This topic describes the pyqGroupEval
function when used
in an on-premises Oracle Database. The pyqGroupEval
function groups data by one
or more columns and runs a user-defined Python function on each group.
The pyqGroupEval
function runs the user-defined Python
function specified by the EXP_NAM
parameter. Pass data to the Python function with the INP_NAM
parameter. Pass arguments to the Python function with the
PAR_QRY
parameter. Specify one or
more grouping columns with the GRP_COL
parameter.
The Python function can return a boolean
, a
dict
, a float
, an int
, a
list
, a str
, a tuple
or a
pandas.DataFrame
object. Define the form of the returned value with the
OUT_QRY
parameter.
Syntax
pyqGroupEval (
INP_NAM VARCHAR2 IN
PAR_QRY VARCHAR2 IN
OUT_QRY VARCHAR2 IN
GRP_COL VARCHAR2 IN
EXP_NAM VARCHAR2 IN)
Parameters
Parameter | Description |
---|---|
|
The name of a table or view that specifies the data
to pass to the Python function specified by the
|
|
A JSON string that contains additional parameters to
pass to the user-defined Python function specified by the
For example, to specify the input data type as
|
|
The format of the output returned by the function. It can be one of the following:
|
GRP_COL |
The names of the grouping columns by which to partition the data. Use
commas to separate multiple columns. For example, to group by
|
|
The name of a user-defined Python function in the OML4Py script repository. |
Returns
Function pyqGroupEval
returns a table that has the structure
specified by the OUT_QRY
parameter value.
Example 10-18 Using the pyqGroupEval Function
This example defines the Python function create_iris_table
and
stores it with the name create_iris_table in the OML4Py script
repository. It then invokes pyqEval
, which invokes the user-definded Python
function and creates the IRIS database table. The example creates the package
irisPkg
and uses that package in specifying the data cursor to pass to
the irisGroupEval
function, which is a user-defined
pyqGroupEval
function. It defines another Python function,
group_count
and stores it in the script repository with the name
mygroupcount. The example then invokes the irisGroupEval
function and
passes it the Python function saved with the name mygroupcount.
In a PL/SQL block, define the Python function
create_iris_table
and store in the script repository with the name
create_iris_table.
BEGIN
sys.pyqScriptCreate('create_iris_table',
'def create_iris_table():
from sklearn.datasets import load_iris
import pandas as pd
iris = load_iris()
x = pd.DataFrame(iris.data, columns = ["Sepal_Length",\
"Sepal_Width","Petal_Length","Petal_Width"])
y = pd.DataFrame(list(map(lambda x: {0:"setosa", 1: "versicolor",\
2: "virginica"}[x], iris.target)),\
columns = ["Species"])
return pd.concat([y, x], axis=1)');
END;
/
Invoke the pyqEval
function to create the database table
IRIS
, using the Python function stored with the name
create_iris_table
in the script repository.
CREATE TABLE IRIS AS
(SELECT * FROM pyqEval(
NULL,
'{"Species":"VARCHAR2(10)","Sepal_Length":"number",
"Sepal_Width":"number","Petal_Length":"number",
"Petal_Width":"number"}',
'create_iris_table'
));
Define the Python function group_count
and store it with the
name mygroupcount
in the script repository. The function returns a
pandas.DataFrame
generated on each group of data
dat
.
BEGIN
sys.pyqScriptCreate('mygroupcount',
'def group_count(dat):
import pandas as pd
return pd.DataFrame([(dat["Species"][0], dat.shape[0])],\
columns = ["Species", "CNT"]) ');
END;
/
Issue a query that invokes the pyqGroupEval
function. In the
function, the INP_NAM
argument specifies
the data in the IRIS table to pass to the function.
The PAR_QRY
argument
specifies the special control argument oml_input_type
.
The OUT_QRY
argument
specifies a JSON string that contains the column names and data types of the table returned
by pyqGroupEval
.
The GRP_COL
parameter
specifies the column to group by.
The EXP_NAM
parameter
specifies the user-defined Python function stored with the name
mygroupcount
in the script repository.
SELECT *
FROM table(
pyqGroupEval(
'IRIS',
'{"oml_input_type":"pandas.DataFrame"}',
'{"Species":"varchar2(10)", "CNT":"number"}',
'Species',
'mygroupcount'));
The output is the following.
Species CNT
---------- ----------
setosa 50
versicolor 50
virginica 50