A.6 rqRowEval Function

The rqRowEval function executes the R function in the script specified by the EXP_NAM parameter.

You pass data to the R function with the INP_CUR parameter. You can pass arguments to the R function with the PAR_CUR parameter. The ROW_NUM parameter specifies the number of rows that should be passed to each invocation of the R function. The last chunk may have fewer rows than the number specified.

The rqRowEval function supports data-parallel execution, in which one or more R engines perform the same R function, or task, on disjoint chunks of data. Oracle Database handles the management and control of the potentially multiple R engines that run on the database server machine, automatically chunking and passing data to the R engines executing in parallel. Oracle Database ensures that R function executions for all chunks of rows complete, or the rqRowEval function returns an error.

The R function returns an R data.frame object, which appears as a SQL table in the database. You define the form of the returned value with the OUT_QRY parameter.

Syntax

rqRowEval (
     INP_CUR     REF CURSOR     IN
     PAR_CUR     REF CURSOR     IN
     OUT_QRY     VARCHAR2       IN
     ROW_NUM     NUMBER         IN
     EXP_NAM     VARCHAR2       IN)

Parameters

Table A-1 Parameters of the rqRowEval Function

Parameter Description

INP_CUR

A cursor that specifies the data to pass to the R function specified by the EXP_NAME parameter.

PAR_CUR

A cursor that contains argument values to pass to the R function.

OUT_QRY

One of the following:

  • NULL, which returns a serialized object that can contain both data and image objects.

  • A SQL SELECT statement that specifies the column names and data types of the table returned by rqEval. Any image data is discarded. You can provide a prototype row using the dual dummy table or you can base the SELECT statement on an existing table or view. The R function must return a data.frame.

  • The string 'XML', which specifies that the table returned contains a CLOB that is an XML string. The XML can contain both structured data and images, with structured or semi-structured R objects first, followed by the image or images generated by the R function. Images are returned as a base 64 encoding of the PNG representation.

  • The string 'PNG', which specifies that the table returned contains a BLOB that has the image or images generated by the R function in PNG format.

ROW_NUM

The number of rows to include in each invocation of the R function.

EXP_NAM

The name of a script in the OML4R script repository.

Return Value

Function rqRowEval returns a table that has the structure specified by the OUT_QRY parameter value.

Examples

This example uses the C50 package to score churn data (that is, to predict which customers are likely to churn) using C5.0 decision tree models. The example scores the customers from the specified state in parallel. This example produces the same result as the invocation of function ore.rowApply in Example 6-16.

Tip:

This example uses the CHURN_TEST table and the myXLevels datastore created by Example 6-16 so in R you should invoke the functions that create the table and that get the xlevels object and save it in the myXLevels datastore in Example 6-16 before running this example.

Example A-8 Using an rqRowEval Function

This example creates a user-defined function and saves the function in the OML4R script repository. The user-defined function creates a C5.0 model for a state and saves the model in a datastore. In this example, the user-defined function myC5.0FunctionForLevels uses the list of levels created in Example 6-16. The function myC5.0FunctionForLevels returns the value TRUE.

This example creates the PL/SQL package churnPkg and the function churnGroupEval. The example declares a cursor to get the names of the datastores that include the string myC5.0modelFL and then executes a PL/SQL block that deletes those datastores. The example next executes a SELECT statement that invokes the churnGroupEval function. The churnGroupEval function invokes the myC5.0FunctionForLevels function to generate the C5.0 models and save them in datastores.

The example then creates the myScoringFunction function and stores it in the script repository. The function scores a C5.0 model for the levels of a state and returns the results in a data.frame.

Finally, the example executes a SELECT statement that invokes the rqRowEval function. The input cursor to the rqRowEval function uses the PARALLEL hint to specify the degree of parallelism to use. The cursor specifies the CHURN_TEST table as the data source and filters the rows to include only those for Massachusetts. All rows processed use the same predictive model.

The parameters cursor specifies the ore.connect control argument to connect to OML4R on the database server and specifies values for the datastorePrefix and xlevelsDatastore arguments to the myScoringFunction function.

The SELECT statement for the OUT_QRY parameter specifies the format of the output. The ROW_NUM parameter specifies 200 as the number of rows to process at a time in each parallel R engine. The EXP_NAME parameter specifies myScoringFunction in the script repository as the R function to invoke.

BEGIN
  sys.rqScriptDrop('myC5.0FunctionForLevels');
  sys.rqScriptCreate('myC5.0FunctionForLevels',
    'function(dat, xlevelsDatastore, datastorePrefix) {
       library(C50)
       state <- dat[1,"state"]
       datastoreName <- paste(datastorePrefix, dat[1, "state"], sep = "_")
       dat$state <- NULL
       ore.load(name = xlevelsDatastore) # To get the xlevels object.
       for (j in names(xlevels))
         dat[[j]] <- factor(dat[[j]], levels = xlevels[[j]])
       c5mod <- C5.0(churn ~ ., data = dat, rules = TRUE)
       ore.save(c5mod, name = datastoreName)
       TRUE
      }');
END;
/

CREATE OR REPLACE PACKAGE churnPkg AS
  TYPE cur IS REF CURSOR RETURN CHURN_TEST%ROWTYPE;
END churnPkg;
/
CREATE OR REPLACE FUNCTION churnGroupEval(
  inp_cur churnPkg.cur,
  par_cur SYS_REFCURSOR, 
  out_qry VARCHAR2,
  grp_col VARCHAR2,
  exp_txt CLOB)
RETURN SYS.AnyDataSet
PIPELINED PARALLEL_ENABLE (PARTITION inp_cur BY HASH ("state"))
CLUSTER inp_cur BY ("state")
USING rqGroupEvalImpl;
/

DECLARE
 CURSOR c1
  IS
    SELECT dsname FROM RQUSER_DATASTORELIST WHERE dsname like 'myC5.0modelFL%';

BEGIN
  FOR dsname_st IN c1
  LOOP
    rqDropDataStore(dsname_st.dsname);
  END LOOP;
END;

SELECT *
  FROM table(churnGroupEval(
    cursor(SELECT * /*+ parallel(t,4) */ FROM CHURN_TEST t),
    cursor(SELECT 1 AS "ore.connect",
           'myXLevels' as "xlevelsDatastore",
           'myC5.0modelFL' AS "datastorePrefix" FROM dual),
    'XML', 'state', 'myC5.0FunctionForLevels'));

BEGIN
  sys.rqScriptDrop('myScoringFunction');
  sys.rqScriptCreate('myScoringFunction',
    'function(dat, xlevelsDatastore, datastorePrefix) {
       library(C50)
       state <- dat[1, "state"]
       datastoreName <- paste(datastorePrefix, state, sep = "_")
       dat$state <- NULL
       ore.load(name = xlevelsDatastore) # To get the xlevels object.
       for (j in names(xlevels))
         dat[[j]] <- factor(dat[[j]], levels = xlevels[[j]])
       ore.load(name = datastoreName)
       res <- data.frame(pred = predict(c5mod, dat, type = "class"),
                         actual= dat$churn,
                         state = state)
       res
     }');
END;
/

SELECT * FROM table(rqRowEval(
  cursor(select /*+ parallel(t, 4) */ *
         FROM CHURN_TEST t
         WHERE "state" = 'MA'),
  cursor(SELECT 1 as "ore.connect", 
         'myC5.0modelFL' as "datastorePrefix",
         'myXLevels' as "xlevelsDatastore"
         FROM dual),
  'SELECT ''aaa'' "pred",''aaa'' "actual" , ''aa'' "state" FROM dual',
   200, 'myScoringFunction'));

In Oracle SQL Developer, the results of the last SELECT statement are:

pred actual state
---- ------ -----
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   yes    MA    
yes  yes    MA    
yes  yes    MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
yes  yes    MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   no     MA    
no   yes    MA    
no   no     MA    
no   no     MA    
no   no     MA    
yes  yes    MA    
no   no     MA    
no   no     MA    
no   no     MA    
 
 38 rows selected