6.2.1 Arguments for Functions that Run Scripts

The Oracle R Enterprise embedded R execution functions ore.doEval, ore.tableApply, ore.groupApply, ore.rowApply, and ore.indexApply have arguments that are common to some or all of the functions. Some of the functions also have an argument that is unique to the function.

The topics in this section describe these arguments.

See Also:

  • For function signatures and more details about function arguments, see the online help displayed by invoking help(ore.doEval)

  • For examples of the use of the arguments, see "Using the ore.doEval Function" and the other topics on using the embedded R execution functions

6.2.1.1 Input Function to Execute

The embedded R execution functions all require a function to apply during the execution of the script. You specify the input function with one of the following mutually exclusive arguments:

  • FUN

  • FUN.NAME (and optional FUN.OWNER)

The FUN argument takes a function object as a directly specified function or as one assigned to an R variable. Only a user with the RQADMIN role can use the FUN argument when invoking an embedded R function.

The FUN.NAME argument specifies a script that is stored in the Oracle R Enterprise R script repository. A stored script contains the function to apply when the script runs. Any Oracle R Enterprise user can use the FUN.NAME argument when invoking an embedded R function.

The optional argument FUN.OWNER specifies the owner of a script in the R script repository. The owner is the user who created the script. Use this argument only with the FUN.NAME argument. When FUN.NAME is a private script to which you have been granted read privilege, use FUN.OWNER to specify the owner of the private script.

The RQSYS schema is the owner of public scripts and the predefined Oracle R Enterprise scripts. For a list of the predefined scripts, invoke help("ore.doEval") and see the description of the FUN.NAME argument. If FUN.OWNER is not specified or is NULL, then Oracle R Enterprise looks for the owner in the following order: the user of the current session, then RQSYS. If the owner of the script is neither the current user nor RQSYS, then an error occurs.
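The three ways of naming the input function can be sketched as follows. This is a minimal illustration, not Oracle's own example: make_squares, the script name "demoSquares", and the schema "OTHER_USER" are hypothetical, and the overwrite argument of ore.scriptCreate is assumed to be available in your release. The guard keeps the snippet harmless in an R session that has no Oracle R Enterprise connection.

```r
# Input function: builds a small data.frame and needs no database input.
make_squares <- function(n = 3) {
  data.frame(x = seq_len(n), y = seq_len(n)^2)
}

# The function can be checked locally before it is sent to the database.
local_check <- make_squares(3)

# Run the database part only in a connected Oracle R Enterprise session.
if (exists("ore.is.connected") && ore.is.connected()) {
  # 1. Pass the function object directly with FUN (requires the RQADMIN role).
  res_fun <- ore.doEval(FUN = make_squares)

  # 2. Store the function in the R script repository, then refer to it by name.
  #    "demoSquares" is a hypothetical script name.
  ore.scriptCreate("demoSquares", make_squares, overwrite = TRUE)
  res_name <- ore.doEval(FUN.NAME = "demoSquares")

  # 3. For a private script that another user has granted you read access to,
  #    name the owner explicitly. "OTHER_USER" is a placeholder schema name.
  # res_other <- ore.doEval(FUN.NAME = "demoSquares", FUN.OWNER = "OTHER_USER")
}
```

FUN and FUN.NAME are mutually exclusive, so each call above supplies only one of them.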

Note:

The Oracle R Enterprise advanced analytics functions in the OREmodels package, ore.glm, ore.lm, ore.neural, and ore.randomForest, use the embedded R execution framework internally and cannot be used in embedded R execution functions.

6.2.1.2 Optional and Control Arguments

All of the embedded R execution functions take optional arguments, which can be named or not. Oracle R Enterprise passes user-defined optional arguments to the input function. You can pass any number of optional arguments to the input function, including complex R objects such as models.
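As a hedged sketch of passing optional arguments, the following hands both a plain vector and a fitted model to the input function. The function predict_dist and its argument names speeds and model are invented for this illustration; the model is built from the cars data set that ships with R.

```r
# Input function with two user-defined arguments: new data and a fitted model.
predict_dist <- function(speeds, model) {
  predict(model, newdata = data.frame(speed = speeds))
}

# A complex R object (a linear model) created in the client session.
fit <- lm(dist ~ speed, data = cars)

# Local check of the input function.
local_pred <- predict_dist(c(10, 20), fit)

# Run the database part only in a connected Oracle R Enterprise session.
if (exists("ore.is.connected") && ore.is.connected()) {
  # speeds and model do not start with "ore.", so they are forwarded
  # unchanged to predict_dist on the database server.
  res <- ore.doEval(FUN = predict_dist, speeds = c(10, 20), model = fit)
}
```

Naming the optional arguments, as done here, avoids any ambiguity about which parameter of the input function each value is matched to.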

Arguments that start with ore. are special control arguments. Oracle R Enterprise does not pass them to the input function, but instead uses them to control what happens before or after the execution of that function. The following control arguments are supported:

  • ore.connect controls whether to automatically connect to Oracle R Enterprise inside the embedded R execution function. This is equivalent to doing an ore.connect call with the same credentials as the client session. The default value is FALSE.

    If an automatic connection is enabled, the following functionality occurs:

    • The embedded R script is connected to the database.

    • The connection has the same credentials as the session that invokes the embedded R SQL function.

    • The script runs in an autonomous transaction.

    • ROracle queries can work with the automatic connection.

    • Oracle R Enterprise transparency layer functionality is enabled in the embedded script.

  • ore.drop controls the form of the input data. If the argument value is TRUE, then a one-column data.frame is converted to a vector. The default value is TRUE.

  • ore.envAsEmptyenv controls whether an environment referenced in an object is replaced with an empty environment during serialization. Some types of input parameters and returned objects, such as list and formula, are serialized before being saved to the database. If the control argument value is TRUE, then the referenced environment in the object is replaced with an empty environment whose parent is .GlobalEnv and the objects in the original referenced environment are not serialized. In some cases, this can significantly reduce the size of serialized objects. If the control argument value is FALSE, then all of the objects in the referenced environment are serialized and can be unserialized and recovered later. The default value is regulated by the global option ore.envAsEmptyenv.

  • ore.na.omit controls the handling of missing values in the input data. If you specify ore.na.omit = TRUE, then rows or vector elements, depending on the ore.drop setting, that contain missing values are removed from the input data. If all of the rows in a chunk contain missing values, then the input data for that chunk will be an empty data.frame or vector. The default value is FALSE.

  • ore.graphics controls whether to start a graphical driver and look for images. The default value is TRUE.

  • ore.png.* specifies additional arguments for the png graphics driver if ore.graphics is TRUE. The naming convention for these arguments is to add an ore.png. prefix to the arguments of the png function. For example, if ore.png.height is supplied, argument height is passed to the png function. If not set, the standard default values for the png function are used.
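The following sketch shows how several of these control arguments might be supplied. The function describe_input and the sample data are hypothetical. The "local emulation" line only imitates, in plain R, the behavior described above for ore.drop and ore.na.omit; it is not an Oracle R Enterprise call.

```r
# Input function: reports what kind of object it received.
describe_input <- function(dat) {
  list(class = class(dat), n = NROW(dat))
}

one_col <- data.frame(x = c(1, NA, 3))

# Local emulation of the documented behavior (not an ORE call):
# ore.drop = TRUE turns a one-column data.frame into a vector, and
# ore.na.omit = TRUE removes the elements that are missing.
emulated <- describe_input(one_col$x[!is.na(one_col$x)])

# Run the database part only in a connected Oracle R Enterprise session.
if (exists("ore.is.connected") && ore.is.connected()) {
  X <- ore.push(one_col)

  # Control arguments are consumed by the framework, not passed to FUN.
  res_drop <- ore.tableApply(X, describe_input,
                             ore.drop = TRUE, ore.na.omit = TRUE)

  # With ore.connect = TRUE the server-side R session connects automatically,
  # so transparency layer functions can be used inside the input function.
  conn <- ore.doEval(function() ore.is.connected(), ore.connect = TRUE)

  # ore.png.height and ore.png.width become the height and width
  # arguments of the png function.
  img <- ore.doEval(function() { plot(1:10); invisible(NULL) },
                    ore.graphics = TRUE,
                    ore.png.height = 600, ore.png.width = 800)
}
```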

See Also:

For more details about control arguments, see the online help displayed by invoking help(ore.doEval)

6.2.1.3 Structure of Return Value

Another argument that applies to all of the embedded R execution functions is FUN.VALUE. If the FUN.VALUE argument is NULL, then the ore.doEval and ore.tableApply functions return a serialized R object as an ore.object class object, and the ore.groupApply, ore.indexApply, and ore.rowApply functions return an ore.list object. However, if you specify a data.frame or an ore.frame with the FUN.VALUE argument, then the function returns an ore.frame that has the structure of the specified data.frame or ore.frame object.

To specify that the corresponding output column of an ore.frame have a CLOB or BLOB database data type, you can apply the attribute ora.type to a column of a FUN.VALUE data.frame. For an example of using ora.type, see Example 6-11.
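A minimal sketch of the two return forms follows. The function make_table and the template objects are hypothetical, and the attribute value "clob" is an assumption about the accepted spelling; check the referenced example and the online help before relying on it.

```r
# Input function: returns a two-column data.frame.
make_table <- function(n = 3, scale = 100) {
  id <- seq_len(n)
  data.frame(id = id, res = id / scale)
}

# Template that describes the column names and types of the result.
# Only the structure matters; the values themselves are not used.
template <- data.frame(id = 1, res = 1)

# Template with a text column that should map to a CLOB column.
# The value "clob" is an assumption made for this sketch.
clob_template <- data.frame(id = 1, text = "a", stringsAsFactors = FALSE)
attr(clob_template$text, "ora.type") <- "clob"

# Run the database part only in a connected Oracle R Enterprise session.
if (exists("ore.is.connected") && ore.is.connected()) {
  # FUN.VALUE left as NULL: the result is a serialized ore.object.
  as_object <- ore.doEval(FUN = make_table)

  # FUN.VALUE given: the result is an ore.frame shaped like the template.
  as_frame <- ore.doEval(FUN = make_table, FUN.VALUE = template)
}
```

The template should have the same column names and types as the data.frame that the input function returns.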

6.2.1.4 Input Data

The ore.doEval and ore.indexApply functions do not automatically receive any data from the database. They simply execute the function specified by the FUN or FUN.NAME argument. Any data needed by the input function is either generated within that function or explicitly retrieved from a data source such as Oracle Database, other databases, or flat files. The input function can load data from a file, or from a table by using the ore.pull function or another transparency layer function.

The ore.tableApply, ore.groupApply, and ore.rowApply functions require a database table as input data. The table is represented by an ore.frame. You supply that data with an ore.frame object that you specify with the X argument, which is the first argument to the embedded R execution function. The embedded R execution function passes the ore.frame object to the user-defined input function as the first argument to that function.
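As an illustration of the X argument, the sketch below pushes the cars data set to the database and fits a model to it on the server. The function fit_speed is hypothetical; in practice X is usually an ore.frame for an existing table rather than pushed data.

```r
# Input function: its first argument receives the input data as a data.frame.
fit_speed <- function(dat) {
  coef(lm(dist ~ speed, data = dat))
}

# Local check with an ordinary data.frame.
local_coef <- fit_speed(cars)

# Run the database part only in a connected Oracle R Enterprise session.
if (exists("ore.is.connected") && ore.is.connected()) {
  # ore.push creates a temporary database table and returns an ore.frame.
  CARS <- ore.push(cars)

  # X is the first argument; its data reaches fit_speed as the argument dat.
  res <- ore.tableApply(CARS, fit_speed)
}
```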

Note:

The data represented by the ore.frame object passed to the user-defined R function is copied from Oracle Database to the database server R engine. The R memory limitations apply. If your database server machine has 32 GB RAM and your data table is 64 GB, then Oracle R Enterprise cannot load the data into the R engine memory.

6.2.1.5 Parallel Execution

The ore.groupApply, ore.indexApply, and ore.rowApply functions take the parallel argument. That argument specifies the degree of parallelism to use in the embedded R execution of the input function. See "Support for Parallel Execution".
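A hedged sketch of the parallel argument with ore.indexApply follows. The function simulate_mean is invented for the illustration, and the value 2 is only a requested degree of parallelism; see the online help for the full set of accepted values.

```r
# Input function: the first argument receives the index of the invocation.
simulate_mean <- function(i, n = 100) {
  set.seed(i)          # make each invocation reproducible
  mean(rnorm(n))
}

# Local, sequential equivalent of four invocations.
local_means <- vapply(1:4, simulate_mean, numeric(1))

# Run the database part only in a connected Oracle R Enterprise session.
if (exists("ore.is.connected") && ore.is.connected()) {
  # Ask for the four invocations to run with a degree of parallelism of 2.
  res <- ore.indexApply(4, simulate_mean, n = 100, parallel = 2)
}
```

Seeding by index keeps the results independent of which server-side R engine runs which invocation.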

6.2.1.6 Unique Arguments

The ore.groupApply, ore.indexApply, and ore.rowApply functions each take an argument unique to the function.

The ore.groupApply function takes the INDEX argument, which specifies the name of a column by which the rows of the input data are partitioned for processing by the input function.

The ore.indexApply function takes the times argument, which specifies the number of times to execute the input function.

The ore.rowApply function takes the rows argument, which specifies the number of rows to pass to each invocation of the input function.
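The three unique arguments can be sketched side by side. The function count_rows is hypothetical, and the local split calls only imitate the partitioning in plain R. INDEX is passed here as the grouping column of the ore.frame, which is the form this sketch assumes; confirm the accepted forms in the online help.

```r
# Input function: counts the rows it receives.
count_rows <- function(dat) nrow(dat)

# Local emulation of partitioning by a column (compare INDEX).
local_groups <- vapply(split(iris, iris$Species), count_rows, integer(1))

# Local emulation of passing at most 60 rows at a time (compare rows).
chunk_id <- ceiling(seq_len(nrow(iris)) / 60)
local_chunks <- vapply(split(iris, chunk_id), count_rows, integer(1))

# Run the database part only in a connected Oracle R Enterprise session.
if (exists("ore.is.connected") && ore.is.connected()) {
  IRIS <- ore.push(iris)

  # INDEX: one invocation per value of the Species column.
  by_species <- ore.groupApply(IRIS, INDEX = IRIS$Species, FUN = count_rows)

  # rows: each invocation receives up to 60 rows.
  by_chunk <- ore.rowApply(IRIS, FUN = count_rows, rows = 60)

  # times: invoke the input function three times, passing the index each time.
  repeated <- ore.indexApply(3, function(i) i^2)
}
```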