2 Get Started with Oracle Machine Learning for R

Start using OML4R by connecting to an Oracle Database instance, creating OML4R objects, and storing them in the database.

This chapter discusses these topics:

2.1 Connect to an Oracle Database Instance

To use Oracle Machine Learning for R, you first connect to an Oracle Database instance.

2.1.1 About Connecting to the Database

Oracle Machine Learning for R client components connect an R session to an Oracle Database instance and the OML4R server components.

The connection makes the data in a database schema available to the R user. It also makes the processing power, memory, and storage capacities of the database server available to the R session through the OML4R client interface.

The following topics discuss connecting and disconnecting an R session to an Oracle Database instance:

2.1.1.1 About Using the ore.connect Function

To begin using OML4R, you first connect to a schema in an Oracle Database instance with the ore.connect function.

Only one OML4R connection can exist at a time during an R session. If an R session is already connected to the database, then invoking ore.connect terminates the active connection before opening a new connection. Before attempting to connect, you can discover whether an active connection exists by using the ore.is.connected function.

You explicitly end a connection with the ore.disconnect function. If you do not invoke ore.disconnect, then the connection is automatically terminated when the R session ends.

With the type argument of ore.connect, you specify the type of connection, either ORACLE or HIVE. A HIVE type of connection connects to Hive tables in a Hadoop cluster. An ORACLE type of connection connects to a schema in an Oracle Database instance. The default value of type is "ORACLE".

If the connection type is HIVE, then ore.connect ignores all other arguments. The HIVE option applies only if you are using Oracle Machine Learning for Spark (OML4Spark) in conjunction with a Hadoop cluster. OML4Spark is part of the Oracle Big Data Connectors option to the Big Data Appliance.

If the connection type is ORACLE, then you do the following:

  • Use the logical all argument to specify whether OML4R automatically creates an ore.frame object for each table to which the user has access in the schema and makes those ore.frame objects visible in the current R session. The ore.frame objects contain metadata about the tables. The default value of the all argument is FALSE.

    If all = TRUE, then OML4R implicitly invokes the ore.sync and ore.attach functions. If all = FALSE, then the user must explicitly invoke ore.sync to create ore.frame objects. To access these objects by name, the user must invoke ore.attach to include the names in the search path.

  • Use either the conn_string argument, or various combinations of the user, sid, host, password, port, service_name, and conn_string arguments to specify information that identifies the connection.

    To avoid using a clear-text password, you can specify an Oracle wallet password with the conn_string argument. No other arguments are needed. By specifying an Oracle wallet password, you can avoid embedding a database user password in application code, batch jobs, or scripts.

    With the other connection identifier arguments, you specify a database user name, host name, and password, and either a system identifier (SID) or service name, and, optionally, a TCP port, or you specify a database user name, password, and a conn_string argument.

    The default value of the port argument is 1521, the default value of host is "localhost", which specifies the local host, and the default value of conn_string is NULL. You specify the local host when your R session is running on the same computer as the Oracle Database instance to which you want to connect.

2.1.1.2 About Using the ore.disconnect Function

To explicitly end the connection between an R session and the Oracle Database instance, invoke the ore.disconnect function.

OML4R implicitly invokes ore.disconnect if you do either of the following:

  • Quit the R session.

  • Invoke ore.connect while an OML4R connection is already active.

When you disconnect the active connection, OML4R discards all OML4R objects that you have not explicitly saved in an OML4R datastore.

2.1.2 Use the ore.connect and ore.disconnect Functions

The examples in this section demonstrate the various ways of specifying an OML4R connection to an Oracle Database instance.

The examples use sample values for the ore.connect argument values. Replace the sample values with the appropriate values for connecting to your database.

Example 2-1 Using ore.connect and Specifying a SID

This example invokes the ore.connect function and specifies the user, sid, host, password, and port arguments.

ore.connect(user = "oml_user", sid = "sales", host = "sales-server",
            password = "oml_userStrongPassword", port = 1521 )

Example 2-2 Using ore.connect and Specifying a Service Name

This example demonstrates using a service name rather than a SID. It also specifies connecting to the local host.

ore.connect(user = "oml_user", host = "localhost", 
            password = "oml_userStrongPassword",
            service_name = "sales.example.com")

Example 2-3 Using ore.connect and Specifying an Easy Connect String

This example uses the conn_string argument to specify an easy connect string that identifies the connection.

ore.connect(user = "oml_user", password = "oml_userStrongPassword", 
            conn_string = "sales-server:1521:sales
              (ADDRESS=(PROTOCOL=tcp) (HOST=sales-server) (PORT=1521))
              (CONNECT_DATA=(SERVICE_NAME=sales.example.com)))")

Example 2-4 Using ore.connect and Specifying a Full Connection String

This example uses the conn_string argument to specify a full connection string that identifies the connection.

ore.connect(user = "oml_user", password = "oml_userStrongPassword", 
            conn_string = "DESCRIPTION=
              (ADDRESS=(PROTOCOL=tcp) (HOST=sales-server) (PORT=1521))
              (CONNECT_DATA=(SERVICE_NAME=myserver.example.com))")

Example 2-5 Using the conn_string Argument to Specify an Oracle Wallet

This example uses the conn_string argument to specify an Oracle wallet. The mydb_test string is the connection identifier for the Oracle database. The Oracle wallet contains the information needed to create the connection. For information on creating an Oracle wallet for an OML4R connection, see Oracle Machine Learning for R Installation and Administration Guide.

ore.connect(conn_string = "mydb_test")

Example 2-6 Using the conn_string Argument and Specifying an Empty Connection String

This example uses an empty connection string to connect to the local host.

ore.connect(user = "oml_user", password = "oml_userStrongPassword", conn_string = "")

Example 2-7 Using the conn_string Argument in Connecting to a Pluggable Database

This example connects to a pluggable database using the conn_string argument to specify a service name.

ore.connect(conn_string = "pdb1.example.com")

Example 2-8 Using the service_name Argument in Connecting to a Pluggable Database

This example invokes ore.connect using a service name, host name, and port number to connect to a pluggable database.

ore.connect(service_name = "pdb1.example.com", host = "mypdb", port = 1521)

Example 2-9 Disconnecting an OML4R Session

This example explicitly disconnects an OML4R session from an Oracle database.

ore.disconnect()

2.2 Create and Manage R Objects in Oracle Database

With transparency layer functions you can connect to an Oracle Database instance and interact with data structures in a database schema.

You can move data to and from the database and create database tables. You can also save R objects in the database. The OML4R functions that perform these actions are described in the following topics.

2.2.1 Create R Objects for In-Database Data

Using Oracle Machine Learning for R, you can create R proxy objects in your R session for database-resident data.

Creating proxy objects is described in the following topics.

2.2.1.1 About Creating R Objects for Database Objects

To gain access to the data in the database tables in the schema, you use the ore.sync function.

When you invoke ore.connect in an R session, Oracle Machine Learning for R creates a connection to a schema in an Oracle Database instance. The ore.sync function creates an ore.frame object that is a proxy for a table in a schema. You can use the ore.attach function to add an R environment that represents a schema in the R search path.

When you use the ore.sync function to create an ore.frame object as a proxy for a database table, the name of the ore.frame proxy object is the same as the name of the database object. Each ore.frame proxy object contains metadata about the corresponding database object.

You can use the proxy ore.frame object to select data from the table. When you execute an R operation that selects data from the table, the operation returns the current data from the database object. However, if some application has added a column to the table, or has otherwise changed the metadata of the database object, the ore.frame proxy object does not reflect such a change until you again invoke ore.sync for the database object.

If you invoke the ore.sync function with no tables specified, and if the value of the all argument was FALSE in the ore.connect function call that established the connection to the Oracle database instance, then the ore.sync function creates a proxy object for each table in the schema specified by ore.connect. You can use the table argument to specify the tables for which you want to create ore.frame proxy objects.

Tip:

To conserve memory resources and save time, you should only add proxies for the tables that you want to use in your R session.

With the schema argument, you can specify the schema for which you want to create an R environment and proxy objects. Only one environment for a given database schema can exist at a time. With the use.keys argument, you can specify whether you want to use primary keys in the table to order the ore.frame object.

Tip:

Ordering is expensive in the database. Because most operations in R do not need ordering, you should generally set use.keys to FALSE unless you need ordering for sampling data or some other purpose.

With the query argument, you can specify a SQL SELECT statement. This enables you to create an ore.frame for a query without creating a view in the database. This can be useful when you do not have the CREATE VIEW system privilege for the current schema. You cannot use the schema argument and the query argument in the same ore.sync invocation.

You can use the ore.ls function to list the ore.frame proxy objects that correspond to database tables in the environment for a schema. You can use the ore.exists function to find out if an ore.frame proxy object for a database table exists in an R environment. The function returns TRUE if the proxy object exists or FALSE if it does not. You can remove an ore.frame proxy object from an R environment with the ore.rm function.

2.2.1.2 Synchronize Data with the ore.sync Function

The following example demonstrates the use of the ore.sync function.

The example first invokes the ore.exec function to create some tables to represent tables existing in the OML_USER database schema. The example then invokes ore.sync and specifies three tables of the schema. The ore.sync invocation creates an R environment for the OML_USER schema and creates proxy ore.frame objects for the specified tables in that schema. The example lists the ore.frame proxy objects in the current environment. The TABLE3 table exists in the schema but does not have an ore.frame proxy object because it was not included in the ore.sync invocation.

The example next invokes ore.sync with the query argument to create ore.frame objects for the specified SQL queries. The example lists the ore.frame objects again.

The example then invokes ore.sync again and creates an R environment for the SH schema and proxy objects in that environment for the specified tables in that schema. The example invokes the ore.exists function to find out if the specified table exists in the current environment and then in the SH environment. The example lists the R objects in the SH environment.

The example next removes the ore.frame objects QUERY1, QUERY2, and TABLE4 from the OML_USER environment. Finally, the example lists the proxy objects in the environment again.

Note:

The ore.rm function invocation removes the ore.frame that is a proxy for the TABLE4 table from the environment. It does not delete the table from the schema.

Example 2-10 Using ore.sync to Add ore.frame Proxy Objects to an R Environment

# After connecting to a database as OML_USER, create some tables.
ore.exec("CREATE TABLE TABLE1 AS SELECT * FROM dual")
ore.exec("CREATE TABLE TABLE2 AS SELECT * FROM dual")
ore.exec("CREATE TABLE TABLE3 AS SELECT * FROM dual")
ore.exec("CREATE TABLE TABLE4 AS SELECT * FROM dual")
# Create ore.frame objects for the specified tables.
ore.sync(table = c("TABLE1", "TABLE3", "TABLE4"))
# List the ore.frame proxy objects in the current environment.
ore.ls()
# Create ore.frame objects for the specified queries.
ore.sync(query = c("QUERY1" = "SELECT 0 X, 1 Y FROM dual",
                   "QUERY2" = "SELECT 1 X, 0 Y FROM dual"))
ore.ls()
# The OML_USER user has been granted SELECT permission on the tables in the 
# SH schema.
ore.sync("SH", table = c("CUSTOMERS", "SALES"))
# Find out if the CUSTOMERS ore.frame exists in the OML_USER environment.
ore.exists("CUSTOMERS")
# Find out if it exists in the SH environment.
ore.exists("CUSTOMERS", schema = "SH")
# List the ore.frame proxy objects in the SH environment.
ore.ls("SH")
# Remove the ore.frame objects for the specified objects.
ore.rm(c("QUERY1", "QUERY2", "TABLE4"))
# List the ore.frame proxy objects in the current environment again.
ore.ls()
Listing for This Example
R> # After connecting to a database as OML_USER, create some tables.
R> ore.exec("CREATE TABLE TABLE1 AS SELECT * FROM dual")
R> ore.exec("CREATE TABLE TABLE2 AS SELECT * FROM dual")
R> ore.exec("CREATE TABLE TABLE3 AS SELECT * FROM dual")
R> ore.exec("CREATE TABLE TABLE4 AS SELECT * FROM dual")
R> # Create ore.frame objects for the specified tables.
R> ore.sync(table = c("TABLE1", "TABLE3", "TABLE4"))
R> # List the ore.frame proxy objects in the current environment.
R> ore.ls()
 [1] "TABLE1"     "TABLE3"     "TABLE4"
R> # Create ore.frame objects for the specified queries.
R> ore.sync(query = c("QUERY1" = "SELECT 0 X, 1 Y FROM dual",
+                     "QUERY2" = "SELECT 1 X, 0 Y FROM dual"))
R> ore.ls()
 [1] "QUERY1"     "QUERY2"     "TABLE1"     "TABLE3"     "TABLE4"
R> # The OML_USER user has been granted SELECT permission on the tables in the 
R> # SH schema.
R> ore.sync("SH", table = c("CUSTOMERS", "SALES"))
R> # Find out if the CUSTOMERS ore.frame exists in the OML_USER environment.
R> ore.exists("CUSTOMERS")
[1] FALSE
R> # Find out if it exists in the SH environment.
R> ore.exists("CUSTOMERS", schema = "SH")
[1] TRUE
R> # List the ore.frame proxy objects in the SH environment.
R> ore.ls("SH")
[1] "CUSTOMERS" "SALES"
R> # Remove the ore.frame objects for the specified objects.
R> ore.rm(c("QUERY1", "QUERY2", "TABLE4"))
R> # List the ore.frame proxy objects in the current environment again.
R> ore.ls()
 [1] "TABLE1"     TABLE3"
2.2.1.3 Get Objects with the ore.get Function

After you have created an R environment and ore.frame proxy objects with ore.sync, you can get a proxy object by name with the ore.get function.

You can use ore.get to get the proxy ore.frame for a table and assign it to a variable in R, as in SH_CUST <- ore.get(name = "CUSTOMERS", schema = "SH"). The ore.frame exists in the R global environment, which can be referred to using .GlobalEnv, and so it appears in the list returned by the ls function. Also, because this object exists in the R global environment, as opposed an R environment that represents a database schema, it is not listed by the ore.ls function.

Example 2-11 Using ore.get to Get a Database Table

This example invokes the ore.sync function to create an ore.frame object that is a proxy for the CUSTOMERS table in the SH schema. The example then gets the dimensions of the proxy object.

ore.sync(schema = "SH", table = "CUSTOMERS", use.keys = FALSE)
dim(ore.get(name = "CUSTOMERS", schema = "SH"))
Listing for Example 2-11
R> ore.sync(schema = "SH", table = "CUSTOMERS", use.keys = FALSE)
R> dim(ore.get(name = "CUSTOMERS", schema = "SH"))
[1] 630  15
2.2.1.4 Add a Schema with the ore.attach Function

With ore.attach, you add an R environment for a database schema to the R search path.

When you add the R environment, you have access to database tables by name through the proxy objects created by the ore.sync function without needing to specify the schema environment.

The default schema is the one specified in creating the connection and the default position in the search path is 2. You can specify the schema and the position in the ore.attach function invocation.. You can also specify whether you want the ore.attach function to indicate whether a naming conflict occurs when adding the environment. You can detach the environment for a schema from the R search path with the ore.detach function.

Example 2-12 Using ore.attach to Add an Environment for a Database Schema

This example demonstrates the use of the ore.attach function. Comments in the example explain the function invocations.

# Connected as oml_user.
# Add the environment for the oml_user schema to the R search path.
ore.attach()
# Create an unordered ore.frame proxy object in the SH environment for the
# specifed table.
ore.sync(schema = "SH", table = "CUSTOMERS", use.keys = FALSE)
# Add the environment for the SH schema to the search path and warn if naming
# conflicts exist.
ore.attach("SH", 3, warn.conflicts = TRUE)
# Display the number of rows and columns in the proxy object for the table.
dim(CUSTOMERS)
# Remove the environment for the SH schema from the search path.
ore.detach("SH")
# Invoke the dim function again.
dim(CUSTOMERS)
Listing for Example 2-12
R> # Connected as oml_user.
R> # Add the environment for the oml_user schema to the R search path.
R> ore.attach()
R> # Create an unordered ore.frame proxy object in the SH environment for the
R> # specifed table.
R> ore.sync(schema = "SH", table = "CUSTOMERS", use.keys = FALSE)
R> # Add the environment for the SH schema to the search path and warn if naming
R> # conflicts exist.
R> ore.attach("SH", 3, warn.conflicts = TRUE)
R> # Display the number of rows and columns in the proxy object for the table.
R> dim(CUSTOMERS)
[1] 630  15
R> # Remove the environment for the SH schema from the search path.
R> ore.detach("SH")
R> # Invoke the dim function again.
R> dim(CUSTOMERS)
Error: object 'CUSTOMERS' not found

2.2.2 Create Ordered and Unordered ore.frame Objects

Oracle Machine Learning for R provides the ability to create ordered or unordered ore.frame objects.

The following topics describe this feature.

2.2.2.1 About Ordering in ore.frame Objects

R objects such as vector and data.frame have an implicit ordering of their elements.

The data in an Oracle Database table is not necessarily ordered. For some R operations, ordering is useful whereas for other operations it is unnecessary. By ordering an ore.frame, you are able to index the ore.frame object by using either integer or character indexes.

Using an ordered ore.frame object that is a proxy for a SQL query can be time-consuming for a large data set. Therefore, although OML4R attempts to create ordered ore.frame objects by default, it also provides the means of creating an unordered ore.frame object.

When you invoke the ore.sync function to create an OML4R ore.frame object as a proxy for a SQL query, you can use the use.keys argument to specify whether the ore.frame can be ordered or must be unordered.

An ore.frame object can be ordered if one or more of the following conditions are true:

  • The value of the use.keys argument of the ore.sync function is TRUE and a primary key is defined on the underlying table

  • The row names of the ore.frame constitute a unique tuple

  • The ore.frame object is produced by certain functions such as aggregate and cbind

  • All of the ore.frame objects that are input arguments to relevant OML4R functions are ordered

An ore.frame object is unordered if one or more of the following conditions are true:

  • The value of the use.keys argument of the ore.sync function is FALSE

  • No primary key is defined on the underlying table and either the row names of the ore.frame object are not specified or the row names of the ore.frame object are set to NULL

  • One or more of the ore.frame objects that are input arguments to relevant OML4R functions are unordered

An unordered ore.frame object has null row names. You can determine whether an ore.frame object is ordered by invoking is.null on the row names of the objects, as shown in the last lines of Example 2-13. If the ore.frame object is unordered, is.null returns an error.

2.2.2.2 Global Options Related to Ordering

OML4R has options that relate to the ordering of an ore.frame object.

The ore.warn.order global option specifies whether you want OML4R to display a warning message if you use an unordered ore.frame object in a function that requires ordering. If you know what to expect in an operation, then you might want to turn the warnings off so they do not appear in the output. For examples of the warning messages, see Example 2-13 and Example 2-14.

You can see what the current setting is, or turn the option on or off, as in the following example.

R> options("ore.warn.order")
$ore.warn.order
[1] TRUE
R> options("ore.warn.order" = FALSE)
R> options("ore.warn.order" = TRUE)

With the ore.sep option, you can specify the separator between the row name values that you use for multi-column keys, as in the following example.

R> options("ore.sep")
$ore.sep
[1] "|"

R> options("ore.sep" = "/")
R> options("ore.sep" = "|")
2.2.2.3 Ordering Using Keys

You can use the primary key of a database table to order an ore.frame object.

The following example loads the spam data set from the kernlab package. It adds two columns to the data set.

The example invokes ore.drop to drop the named tables if they exist. It then invokes ore.create to create two tables from the data set. It invokes ore.exec to make the USERID and TS columns a composite primary key of the SPAM_PK table, and invokes ore.sync to synchronize the table with its ore.frame proxy.

Note:

The ore.exec function executes a SQL statement in the Oracle Database schema. The function is intended for database definition language (DDL) statements that have no return value.

The example then displays the first eight rows of each table. The proxy object for the SPAM_PK table is an ordered ore.frame object. It has row names that are a combination of the TS and USERID column values separated by the "|" character. The proxy object for the SPAM_NOPK table is an unordered ore.frame object that has the symbol SPAM_NOPK. By default, SPAM_NOPK has row names that are sequential numbers.

Example 2-13 Ordering Using Keys

# Prepare the data.
library(kernlab)
data(spam)
s <- spam
# Create a column that has integer values.
s$TS <- 1001:(1000 + nrow(s))
# Create a column that has integer values with each number repeated twice.
s$USERID <- rep(351:400, each=2, len=nrow(s))
# Ensure that the database tables do not exist.
ore.drop(table='SPAM_PK')
ore.drop(table='SPAM_NOPK')
# Create database tables.
ore.create(s[,c(59:60,1:28)], table="SPAM_PK")
ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK")
# Using a SQL statement, alter the SPAM_PK table to add a composite primary key.
ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key
           (\"USERID\",\"TS\")")
# Synchronize the table to get the change to it.
ore.sync(table = "SPAM_PK")
# View the data in the tables.
# The row names of the ordered SPAM_PK are the primary key column values.
head(SPAM_PK[,1:8])
# The row names of the unordered SPAM_NOPK are sequential numbers.
# The first warning results from the inner accessing of SPAM_NOPK to subset
# the columns. The second warning is for the invocation of the head
# function on that subset.
head(SPAM_NOPK[,1:8])
# Verify that SPAM_NOPK is unordered.
is.null(row.names(SPAM_NOPK))
Listing for This Example
R> # Prepare the data.
R> library(kernlab)
R> data(spam)
R> s <- spam
R> # Create a column that has integer values.
R> s$TS <- 1001:(1000 + nrow(s))
R> # Create a column that has integer values with each number repeated twice.
R> s$USERID <- rep(351:400, each=2, len=nrow(s))
R> # Ensure that the database tables do not exist.
R> ore.drop(table='SPAM_PK')
R> ore.drop(table='SPAM_NOPK')
R> # Create database tables.
R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK")
R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK")
R> # Using a SQL statement, alter the SPAM_PK table to add a composite primary key.
R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key
+  (\"USERID\",\"TS\")")
R> # Synchronize the table to get the change to it.
R> ore.sync(table = "SPAM_PK")
R> # View the data in the tables.
R> # The row names of the ordered SPAM_PK are the primary key column values.
R> head(SPAM_PK[,1:8])
           TS USERID make address  all num3d  our over
1001|351 1001    351 0.00    0.64 0.64     0 0.32 0.00
1002|351 1002    351 0.21    0.28 0.50     0 0.14 0.28
1003|352 1003    352 0.06    0.00 0.71     0 1.23 0.19
1004|352 1004    352 0.00    0.00 0.00     0 0.63 0.00
1005|353 1005    353 0.00    0.00 0.00     0 0.63 0.00
1006|353 1006    353 0.00    0.00 0.00     0 1.85 0.00
R> # The row names of the unordered SPAM_NOPK are sequential numbers.
R> # The first warning results from the inner accessing of SPAM_NOPK to subset
R> # the columns. The second warning is for the invocation of the head
R> # function on that subset.
R> head(SPAM_NOPK[,1:8])
    TS USERID make address  all num3d  our over
1 1001    351 0.00    0.64 0.64     0 0.32 0.00
2 1002    351 0.21    0.28 0.50     0 0.14 0.28
3 1003    352 0.06    0.00 0.71     0 1.23 0.19
4 1004    352 0.00    0.00 0.00     0 0.63 0.00
5 1005    353 0.00    0.00 0.00     0 0.63 0.00
6 1006    353 0.00    0.00 0.00     0 1.85 0.00
Warning messages:
1: ORE object has no unique key - using random order 
2: ORE object has no unique key - using random order
R> # Verify that SPAM_NOPK is unordered.
R> is.null(row.names(SPAM_NOPK))
Error: ORE object has no unique key
2.2.2.4 Ordering Using Row Names

You can use row names to order an ore.frame object.

The following example creates a data.frame object in the local R session memory and pushes it to the ore.frame object with the symbol a, which exists in the memory of the Oracle database to which the R session is connected. The example shows that the ore.frame object has the default row names of the R data.frame object. Because the ore.frame object is ordered, invoking the row.names function on it does not produce a warning message.

The example uses the ordered SPAM_PK and unordered SPAM_NOPK ore.frame objects to show that invoking row.names on the unordered SPAM_NOPK produces a warning message but invoking it on the ordered SPAM_PK does not.

The SPAM_PK object is ordered by the row names, which are the combined values of the TS and USERID column values separated by the "|" character. The example shows that you can change the row names.

Example 2-14 Ordering Using Row Names

# Prepare the data.
library(kernlab)
data(spam)
s <- spam
# Create a column that has integer values.
s$TS <- 1001:(1000 + nrow(s))
# Create a column that has integer values with each number repeated twice.
s$USERID <- rep(351:400, each=2, len=nrow(s))
# Ensure that the database tables do not exist.
ore.drop(table='SPAM_PK')
ore.drop(table='SPAM_NOPK')
# Create database tables.
ore.create(s[,c(59:60,1:28)], table="SPAM_PK")
ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK")
# Using a SQL statement, alter the SPAM_PK table to add a composite primary key.
ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key
           (\"USERID\",\"TS\")")
# Synchronize the table to get the change to it.
ore.sync(table = "SPAM_PK")
# Create an ordered ore.frame by default.
a <- ore.push(data.frame(a=c(1:10,10:1), b=letters[c(1:10,10:1)]))
# Display the values in the b column. Note that because the ore.frame is
# ordered, no warnings appear.
a$b
# Display the default row names for the first six rows of the a column.
row.names(head(a))
# SPAM_NOPK has no unique key, so row.names raises error messages.
row.names(head(SPAM_NOPK))
# Row names consist of TS ‘|' USERID.
# For display on this page, only the first four row names are shown.
row.names(head(SPAM_PK))
# Reassign the row names to the TS column only
row.names(SPAM_PK) <- SPAM_PK$TS
# The row names now correspond to the TS values only.
row.names(head(SPAM_PK[,1:4]))
head(SPAM_PK[,1:4])
Listing for This Example
R> # Prepare the data.
R> library(kernlab)
R> data(spam)
R> s <- spam
R> # Create a column that has integer values.
R> s$TS <- 1001:(1000 + nrow(s))
R> # Create a column that has integer values with each number repeated twice.
R> s$USERID <- rep(351:400, each=2, len=nrow(s))
R> # Ensure that the database tables do not exist.
R> ore.drop(table='SPAM_PK')
R> ore.drop(table='SPAM_NOPK')
R> # Create database tables.
R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK")
R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK")
R> # Using a SQL statement, alter the SPAM_PK table to add a composite primary key.
R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key
+  (\"USERID\",\"TS\")")
R> # Synchronize the table to get the change to it.
R> ore.sync(table = "SPAM_PK")
R> # Create an ordered ore.frame by default.
R> a <- ore.push(data.frame(a=c(1:10,10:1), b=letters[c(1:10,10:1)]))
R> # Display the values in the b column. Note that because the ore.frame is
R> # ordered, no warnings appear.
R> a$b
 [1] a b c d e f g h i j j i h g f e d c b aLevels: a b c d e f g h i j
R> # Display the default row names for the first six rows of the a column.
R> row.names(head(a))
[1] 1 2 3 4 5 6
R> # SPAM_NOPK has no unique key, so row.names raises error messages.
R> row.names(head(SPAM_NOPK))
Error: ORE object has no unique key
In addition: Warning message:
ORE object has no unique key - using random order
R> # Row names consist of TS ‘|' USERID.
R> # For display on this page, only the first four row names are shown.
R> row.names(head(SPAM_PK))
        1001|351         1002|351         1003|352         1004|352
"1001|3.51E+002" "1002|3.51E+002" "1003|3.52E+002" "1004|3.52E+002"
R> # Reassign the row names to the TS column only
R> row.names(SPAM_PK) <- SPAM_PK$TS
R> # The row names now correspond to the TS values only.
R> row.names(head(SPAM_PK[,1:4]))
[1] 1001 1002 1003 1004 1005 1006
R> head(SPAM_PK[,1:4])
       TS USERID make address
1001 1001    351 0.00    0.64
1002 1002    351 0.21    0.28
1003 1003    352 0.06    0.00
1004 1004    352 0.00    0.00
1005 1005    353 0.00    0.00
1006 1006    353 0.00    0.00
2.2.2.5 Using Ordered Frames

This example shows the result of merging two ordered ore.frame objects and two unordered ore.frame objects.

Example 2-15 Merging Ordered and Unordered ore.frame Objects

# Prepare the data.
library(kernlab)
data(spam)
s <- spam
# Create a column that has integer values.
s$TS <- 1001:(1000 + nrow(s))
# Create a column that has integer values with each number repeated twice.
s$USERID <- rep(351:400, each=2, len=nrow(s))
# Ensure that the database tables do not exist.
ore.drop(table='SPAM_PK')
ore.drop(table='SPAM_NOPK')
# Create database tables.
ore.create(s[,c(59:60,1:28)], table="SPAM_PK")
ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK")
# Using a SQL statement, alter the SPAM_PK table to add a composite primary key.
ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key
           (\"USERID\",\"TS\")")
# Synchronize the table to get the change to it.
ore.sync(table = "SPAM_PK")
# Create objects for merging data from unordered ore.frame objects.
x <- SPAM_NOPK[,1:4]
y <- SPAM_NOPK[,c(1,2,4,5)]
m1 <- merge(x, y, by="USERID")
# The merged result m1 produces a warning because it is not an ordered frame.
head(m1,3)
# Create objects for merging data from ordered ore.frame objects.
x <- SPAM_PK[,1:4]
y <- SPAM_PK[,c(1,2,4,5)]
# The merged result m1 does not produce a warning now because it is an 
# ordered frame.
m1 <- merge(x, y, by="USERID")
head(m1,3)
Listing for This Example
R> # Prepare the data.
R> library(kernlab)
R> data(spam)
R> s <- spam
R> # Create a column that has integer values.
R> s$TS <- 1001:(1000 + nrow(s))
R> # Create a column that has integer values with each number repeated twice.
R> s$USERID <- rep(351:400, each=2, len=nrow(s))
R> # Ensure that the database tables do not exist.
R> ore.drop(table='SPAM_PK')
R> ore.drop(table='SPAM_NOPK')
R> # Create database tables.
R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK")
R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK")
R> # Uing a SQL statement, alter the SPAM_PK table to add a composite primary key.
R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key
+          (\"USERID\",\"TS\")")
R> # Synchronize the table to get the change to it.
R> ore.sync(table = "SPAM_PK")
R> # Create objects for merging data from unordered ore.frame objects.
R> x <- SPAM_NOPK[,1:4]
R> y <- SPAM_NOPK[,c(1,2,4,5)]
R> m1 <- merge(x, y, by="USERID")
R> # The merged result m1 produces a warning because it is not an ordered frame.
R> head(m1,3)
  USERID TS.x make address.x TS.y address.y  all
1    351 5601 0.00         0 1001      0.64 0.64
2    351 5502 0.00         0 1001      0.64 0.64
3    351 5501 0.78         0 1001      0.64 0.64
Warning messages:
1: ORE object has no unique key - using random order 
2: ORE object has no unique key - using random order
R> # Create objects for merging data from ordered ore.frame objects.
R> x <- SPAM_PK[,1:4]
R> y <- SPAM_PK[,c(1,2,4,5)]
R> # The merged result m1 does not produce a warning now because it is an 
R> # ordered frame.
R> m1 <- merge(x, y, by="USERID")
R> head(m1,3)
          USERID TS.x make address.x TS.y address.y  all
1001|1001    351 1001    0      0.64 1001      0.64 0.64
1001|1002    351 1001    0      0.64 1002      0.28 0.50
1001|1101    351 1001    0      0.64 1101      0.00 0.00

2.2.3 Move Data to and from the Database

You can create a temporary database table, and its corresponding proxy ore.frame object, from a local R object with the ore.push function.

With the ore.pull function you can create a local R object that contains a copy of data represented by an OML4R proxy object.

The ore.push function translates an R object into an OML4R object of the appropriate data type. The ore.pull function takes an ore class object and returns an R object. If the input object is an ore.list, the ore.pull function creates a data.frame and translates each the data of each database column into the appropriate R representation.

Note:

You can pull data to a local R data.frame only if the data can fit into the R session memory. Also, even if the data fits in memory but is still very large, you may not be able to perform many, or any, R functions in the client R session.

Unless you explicitly save them, the temporary database tables and their corresponding OML4R proxy objects that you create with the ore.push function are discarded when you quit the R session.

See Also:

Example 2-16 Using ore.push and ore.pull to Move Data

This example demonstrates pushing an R data.frame object to the database as a temporary database table with an associated ore.frame object, iris_of, then creating another ore.frame object, iris_of_setosa, by selecting one column from iris_of, and then pulling the iris_of_setosa object into the local R session memory as a data.frame object. The example displays the class of some of the objects.

class(iris)
# Push the iris data frame to the database.
iris_of <- ore.push(iris)
class(iris_of)
# Display the data type of the Sepal.Length column in the data.frame.
class(iris$Sepal.Length)
# Display the data type of the Sepal.Length column in the ore.frame.
class(iris_of$Sepal.Length)
# Filter one column of the data set.
iris_of_setosa <- iris_of[iris_of$Species == "setosa", ]
class(iris_of_setosa)
# Pull the selected column into the local R client memory.
local_setosa = ore.pull(iris_of_setosa)
class(local_setosa)
Listing for This Example
R> class(iris)
[1] "data.frame"
R> # Push the iris data frame to the database.
R> iris_of <- ore.push(iris)
R> class(iris_of)
[1] "ore.frame"
attr(,"package")
[1] "OREbase"
R> # Display the data type of the Sepal.Length column in the data.frame.
R> class(iris$Sepal.Length)
[1] "numeric"
R> # Display the data type of the Sepal.Length column in the ore.frame.
R> class(iris_of$Sepal.Length)
[1] "ore.numeric"
attr(,"package")
[1] "OREbase"
R> # Filter one column of the data set.
R> iris_of_setosa <- iris_of[iris_of$Species == "setosa", ]
R> class(iris_of_setosa)
[1] "ore.frame"
attr(,"package")
[1] "OREbase"
R> # Pull the selected column into the local R client memory.
R> local_setosa = ore.pull(iris_of_setosa)
R> class(local_setosa)
[1] "data.frame"

2.2.4 Create and Delete Database Tables

Use the ore.create function to create a persistent table in an Oracle Database schema.

Note:

When creating a table in Oracle Machine Learning for R, if you use lowercase or mixed case for the name of the table, then you must use the same lowercase or mixed case name in double quotation marks when using the table in a SQL query or function. If, instead, you use an all uppercase name when creating the table, then the table name is case-insensitive: you can use uppercase, lowercase, or mixed case when using the table without using double quotation marks. The same is true for naming columns in a table.

Creating the table automatically creates an ore.frame proxy object for the table in the R environment that represents your database schema. The proxy ore.frame object has the same name as the table. You can delete the persistent table in an Oracle Database schema with the ore.drop function.

Caution:

Only use the ore.drop function to delete a database table and its associated ore.frame proxy object. Never use it to remove an ore.frame object that is not associated with a permanent database table. To remove an ore.frame object for a temporary database table, use the ore.rm function.

Example 2-17 Using ore.create and ore.drop to Create and Drop Tables

This example creates tables in the database and drops some of them.

# Create the AIRQUALITY table from the data.frame for the airquality data set.
ore.create(airquality, table = "AIRQUALITY")
# Create data.frame objects.
df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
# Create the DF1 and DF2 tables from the data.frame objects.
ore.create(df1, "DF1")
ore.create(df2, "DF2")
# Create the CARS93 table from the data.frame for the Cars93 data set.
ore.create(Cars93, table = "CARS93")
# List the OML4R proxy objects.
ore.ls()
# Drop the CARS93 object.
ore.drop(table = "CARS93")
# List the OML4R proxy objects again.
ore.ls()

Listing for This Example

R> # Create the AIRQUALITY table from the data.frame for the airquality data set.
R> ore.create(airquality, table = "AIRQUALITY")
R> # Create data.frame objects.
R> df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
R> df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
R> # Create the DF1_TABLE and DF2_TABLE tables from the data.frame objects.
R> ore.create(df1, "DF1")
R> ore.create(df2, "DF2")
R> # Create the CARS93 table from the data.frame for the Cars93 data set.
R> ore.create(Cars93, table = "CARS93")
R> # List the OML4R proxy objects.
R> ore.ls()
[1] "AIRQUALITY"  "CARS93"  "DF1"  "DF2_"
R> # Drop the CARS93 object.
R> ore.drop(table = "CARS93")
R> # List the OML4R proxy objects again.
R> ore.ls()
[1] "AIRQUALITY"  "DF1_"  "DF2"

Note:

A text query having more than 4000 characters or storing a value of over 4000 characters in a CLOB column will result in an error stating “ORA-01704: string literal too long”. Use a bind variable if the data is large as shown below. For more information on bind variables see ROracle.

library(ROracle)
options(error = expression(NULL))
Sys.setlocale(‘LC_ALL’, ‘C’)
cat(‘\n Welcome to ROracle(OCI) World\n’);
cat(‘\n DBI Version : ’);
print(packageVersion(‘DBI’));
cat(‘\n’);
#Creating table whose fields are of different type
createStr <- ‘create table TMRQORABND1_TAB(row_num number, id1 clob)’;
insStr <- ‘insert into TMRQORABND1_TAB values(:1, :2)’;
selStr <- ‘select * from TMRQORABND1_TAB order by row_num’;
y <- ‘1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef’;
z <- y
z <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
c32767 <- paste(z, z, z, z, z, z, z, z, y, y, y, y, y, y, y, y, y, y, y,
          ‘1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcde’,
          sep = ‘’)
print(nchar(c32767))
c32766 <- paste(z, z, z, z, z, z, z, z, y, y, y, y, y, y, y, y, y, y, y,
          ‘1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcd’,
          sep = ‘’)
print(nchar(c32766))
c32768 <- paste(z, z, z, z, z, z, z, z, y, y, y, y, y, y, y, y, y, y, y,
          ‘1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef’,
          sep = ‘’)
print(nchar(c32768))
y <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
y1 <- y
y <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
y2 <- y
y <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
          y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
y3 <- y
y4 <- paste(y3, y3, y3, y3, y3)
r1c2 <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
             y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
print(nchar(y));
drv <- dbDriver(‘Oracle’);
cat(‘\n ROracle driver allocated.\n’);
con <- dbConnect(drv,‘scott’,‘tiger’);
cat(‘\n One database connection object created.\n’);
#tryCatch(
#{
 if (dbExistsTable(con, ‘TMRQORABND1_TAB’))
   dbGetQuery(con,‘drop table TMRQORABND1_TAB’);
 dbGetQuery(con, createStr);
 cat(‘\nTable created with columns data type as raw(n) \n’);
 x <- 1;
 dbGetQuery(con, insStr, data.frame(x,r1c2));
 dbCommit(con);
 x <- c(2, 3, 4, 5, 6, 7, 8, 9, 10);
 yy <- c(y1, y2, y3, z, y4, c32767, c32766, c32768, ‘’);
 dbGetQuery(con, insStr, data.frame(x, yy));
 dbCommit(con)
 print(dbGetQuery(con, ‘select row_num, length(id1) from TMRQORABND1_TAB’));
 x <- 100;
 y <- paste(y, c32767, sep = ‘’);
 dbGetQuery(con, insStr, data.frame(x,y));
 dbCommit(con)
 s <- dbSendQuery(con, selStr)
 cinfo <- dbColumnInfo(s)
 print(dbGetQuery(con, ‘select row_num, length(id1) from TMRQORABND1_TAB’));
 res <- dbGetQuery(con, selStr)
 if (res[,2][1] != r1c2) {
   print(paste(‘Row’, res[,1][1], cinfo[,1][2], res[,2][1],
               ‘not equal to’, r1c2))
 } else {
   print(paste(‘Row’, res[,1][1], cinfo[,1][2], ‘length is’,
               nchar(res[,2][1]),
               ‘length of data is’, nchar(r1c2)), sep = ‘’)
 }
 for (i in 2:9)
 {
   if (res[,2][i] != yy[i-1]) {
     print(paste(‘Row’, res[,1][i], cinfo[,1][2], res[,2][i],
                 ‘not equal to’, yy[i-1]))
   } else {
     print(paste(‘Row’, res[,1][i], cinfo[,1][2], ‘length is’,
                 nchar(res[,2][i]),
                 ‘length of data is’, nchar(yy[i-1])), sep = ‘’)
   }
 }
 if (!is.na(res[,2][10])) {
   print(paste(‘Row’, res[,1][10], cinfo[,1][2], res[,2][10],
               ‘not equal to’, yy[9]))
 } else {
   print(paste(‘Row’, res[,1][10], cinfo[,1][2], ‘length is’,
               nchar(res[,2][10]),
               ‘length of data is’, nchar(yy[9])), sep = ‘’)
 }
 if (res[,2][11] != y) {
   print(paste(‘Row’, res[,1][11], cinfo[,1][2], res[,2][11],
               ‘not equal to’, y))
 } else {
   print(paste(‘Row’, res[,1][11], cinfo[,1][2], ‘length is’,
               nchar(res[,2][11]),
               ‘length of data is’, nchar(y)), sep = ‘’)
 }
#}, finally = {
 dbGetQuery(con,‘drop table TMRQORABND1_TAB’);
 cat(‘\n ROracle driver deallocated successfully.\n’);
 cat(‘Releasing resources...‘);
 dbDisconnect(con);
 cat(‘\n Connection with database removed successfully.\n’);
 dbUnloadDriver(drv);
 cat(‘done\n’);
#}) # tryCatch()

2.2.5 Save and Manage R Objects in the Database

Oracle Machine Learning for R provides datastores that you can use to save OML4R proxy objects, as well as any R object, in an Oracle database.

You can grant or revoke read privilege access to a datastore for one or more users. You can restore the saved objects in another R session. The objects in a datastore are also accessible to embedded R execution through both the R and the SQL interfaces.

This section describes the OML4R functions that you can use to create and manage datastores. The section contains the following topics:

2.2.5.1 About Persisting Oracle Machine Learning for R Objects

With OML4R datastores, you can save R objects in the database.

R objects, including OML4R proxy objects, exist for the duration of the current R session unless you explicitly save them. The standard R functions for saving and restoring R objects, save and load, serialize objects in R memory to store them in a file and deserialize them to restore them in memory. However, for OML4R proxy objects, those functions do not save the database objects associated with the proxy objects in an Oracle database; therefore the saved proxy objects do not behave properly in a different R session.

You can save OML4R proxy objects, as well as any R object, with the ore.save function. The ore.save function specifies an OML4R datastore. A datastore persists in the database when you end the R session. The datastore maintains the referential integrity of the objects it contains. Using the ore.load function, you can restore in another R session the objects in the datastore.

Using a datastore, you can do the following:

  • Save OML4R and other R objects that you create in one R session and restore them in another R session.

  • Pass arguments to R functions for use in embedded R execution.

  • Pass objects for use in embedded R execution. You could, for example, use a function in the OREdm package to build an Oracle Machine Learning for SQL model and save it in a datastore. You could then use that model to score data in the database through embedded R execution. For an example of using a datastore in an embedded R execution function, see Example 6-10.

The following table lists the functions that manipulate datastores and provides brief descriptions of them.

Table 2-1 Functions that Manipulate Datastores

Function Description

ore.datastore

Lists information about a datastore in the current Oracle database schema.

ore.datastoreSummary

Provides detailed information about the specified datastore in the current Oracle database schema.

ore.delete

Deletes a datastore from the current Oracle database schema.

ore.grant Grants read access to a datastore.

ore.lazyLoad

Lazily restores objects from a datastore into an R environment.

ore.load

Restores objects from a datastore into an R environment.

ore.revoke Revokes read access to a datastore.

ore.save

Saves R objects in a new or existing datastore.

See Also:

"Using Oracle R Enterprise Embedded R Execution" for information on using the R and the SQL interfaces to embedded R execution

2.2.5.2 About OML4R Datastores

Each database schema has a table that stores named OML4R datastores.

A datastore can contain OML4R objects and standard R objects.

You create a datastore with the ore.save function. When you create a datastore, you specify a name for it. You can save objects in one or more datastores.

As long as a datastore contains an OML4R proxy object for a database object, the database object persists between R sessions. For example, you could use the ore.odmNB function in the OREdm package to build an Oracle Machine Learning for SQL Naive Bayes model. If you save the resulting ore.odmNB object in a datastore and end the R session, then Oracle Database does not delete the OML4SQL model. If no datastore contains the ore.odmNB object and the R session ends, then the database automatically drops the model.

2.2.5.3 Save Objects to a Datastore

The ore.save function saves one or more R objects in the specified datastore.

By default, OML4R creates the datastore in the current user schema. With the arguments to ore.save, you can provide the names of specific objects, or provide a list of objects. You can specify whether read privilege access to the datastore can be granted to other users. You can specify a particular R environment to search for the objects you would like to save. The overwrite and append arguments are mutually exclusive. If you set the overwrite argument to TRUE, then you can replace an existing datastore with another datastore of the same name. If you set the append argument to TRUE, then you can add objects to an existing datastore. With the description argument, you can provide some descriptive text that appears when you get information about the datastore. The description argument has no effect when used with the append argument.

Example 2-18 Saving Objects and Creating a Datastore

This example demonstrates creating datastores using different combinations of arguments.

# Create some R objects.
df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
iris_of <- ore.push(iris)

# Create a database table and an OML4R proxy object for the table.
ore.drop("AIRQUALITY")
ore.create(airquality, table = "AIRQUALITY")

# List the R objects.
ls()

# List the OML4R proxy objects.
ore.ls()

# Save the proxy object and all objects in the current workspace environment
# to the datastore named ds1 and supply a description.
ore.save(AIRQUALITY, list = ls(), name = "ds1", description = "My private datastore")

# Create some more objects.
x <- stats::runif(20)  # x is an object of type numeric.
y <- list(a = 1, b = TRUE, c = "hoopsa")
z <- ore.push(x)  # z is an object of type ore.numeric.

# Create another datastore.
ore.save(x, y, name = "ds2", description = "x and y")

# Overwrite the contents of datastore ds2.
ore.save(x, name = "ds2", overwrite = TRUE, description = "only x")

# Append object z to datastore ds2.
ore.save(z, name = "ds2", append = TRUE)

Listing for This Example

R> # Create some R objects.
R> df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
R> df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
R> iris_of <- ore.push(iris)
R> 
R> # Create a database table and an OML4R proxy object for the table.
R> ore.drop("AIRQUALITY")
R> ore.create(airquality, table = "AIRQUALITY")
R> 
R> # List the R objects.
R> ls()
[1] "df1"     "df2"     "iris_of"
R> 
R> # List the OML4R proxy objects.
R> ore.ls()
[1] "AIRQUALITY"
R> 
R> # Save the proxy object and all objects in the current workspace environment
R> # to the datastore named ds1 and supply a description.
R> ore.save(AIRQUALITY, list = ls(), name = "ds1", description = "My datastore")
R> 
R> # Create some more objects.
R> x <- stats::runif(20)  # x is an object of type numeric.
R> y <- list(a = 1, b = TRUE, c = "hoopsa")
R> z <- ore.push(x)  # z is an object of type ore.numeric.
R> 
R> # Create another datastore.
R> ore.save(x, y, name = "ds2", description = "x and y")
R> 
R> # Overwrite the contents of datastore ds2.
R> ore.save(x, name = "ds2", overwrite = TRUE, description = "only x")
R> 
R> # Append object z to datastore ds2.
R> ore.save(z, name = "ds2", append = TRUE)
2.2.5.4 Control Access to Datastores

With the ore.grant and ore.revoke functions you can grant or revoke access to an OML4R datastore.

With the ore.grant and ore.revoke functions, you can control access to datastores. You can grant read access to a specified user to a datastore that you own or revoke the access privilege. The functions ore.save, ore.load, ore.datastore, and ore.datastoreSummary have arguments related to the accessibility of datastores.

Note:

If you use ore.create to create a persistent database table and its proxy ore.frame object, then save the proxy ore.frame object in a grantable datastore, and then use ore.grant to grant read privilege access to the datastore, the access applies only to the ore.frame object. The read access does not extend to the persistent database table. To grant read permission to the table itself, you must execute an appropriate SQL command.

Example 2-19 Granting and Revoking Access to a Datastore

This example pushes the airquality data set from the local R session to the Oracle database, where it exists as the ore.frame object AIRQUALITY and as a temporary database table with the same name. The example then saves the AIRQUALITY object to the datastore ds3 and specifies that access to the datastore can be granted to other users. It invokes function ore.datastore with type = grantable to display all of the datastores to which read access has been granted. It grants the read privilege for the ds3 datastore to SCOTT. It then invokes ore.datastore with type = grant to display the datastores to which read access has been granted. It revokes the read privilege for SCOTT, and again displays the datastores to which access has been granted.

AIRQUALITY <- ore.push(airquality)
ore.save(AIRQUALITY, name = "ds3", 
         description = "My datastore 3", grantable  = TRUE)
ore.datastore(type = "grantable")
ore.datastore(type = "grant")
ore.grant("ds3", type = "datastore", user = "SCOTT")
ore.datastore(type = "grant")
ore.revoke("ds3", type = "datastore", user = "SCOTT")
ore.datastore(type = "grant")

Listing for This Example

R> AIRQUALITY <- ore.push(airquality)
R> ore.save(AIRQUALITY, name = "ds3", 
+           description = "My datastore 3", grantable  = TRUE)
R> ore.datastore(type = "grantable")
  datastore.name object.count  size       creation.date    description
1            ds3            1  1451 2015-11-30 18:48:25 My datastore 3
R> ore.datastore(type = "grant")
[1] datastore.name grantee       
<0 rows> (or 0-length row.names)
R> ore.grant("ds3", type = "datastore", user = "SCOTT")
R> ore.datastore(type = "grant")
  datastore.name grantee
1            ds3   SCOTT
R> ore.revoke("ds3", type = "datastore", user = "SCOTT")
R> ore.datastore(type = "grant")
[1] datastore.name grantee       
<0 rows> (or 0-length row.names)
2.2.5.5 Get Information about Datastore Contents

You can get information about a datastore in the current user schema by using the ore.datastore and ore.datastoreSummary functions.

Using the ore.datastore function, you can list basic information about datastores. To get information about a specific type of datastore, you can use the optional character string type argument. The valid values for typeare the following:

  • user, which lists the datastores created by current session user. This is the default value.

  • private, which lists the datastores for which read access cannot be granted by the current session user to other users.

  • all, which lists all of the datastores to which the current session user has read access.

  • grantable, which lists the datastores the read privilege for which can be granted by the current session user to other users.

  • grant, which lists the datastores the read privilege for which has been granted by the current session user to other users.

  • granted, which lists the datastores the read privilege for which has been granted by other users to the current session user.

If you do not specify a type, then function ore.datastore returns a data.frame object with columns that correspond to the datastore name, the number of objects in the datastore, the datastore size, the creation date, and a description. Rows are sorted by column datastore.name in alphabetical order. If you do specify a type, then the function returns a data.framethat has a column for the specified type.

You can search for a datastore by name or by using a regular expression pattern.

The ore.datastoreSummary function returns information about the R objects saved within a datastore in the user schema in the connected database. The function returns a data.frame with columns that correspond to object name, object class, object size, and either the length of the object, if it is a vector, or the number of rows and columns, if it is a data.frame object. It takes one required argument, the name of a datastore, and has an optional argument, the owner of the datastore.

Example 2-20 Using the ore.datastore Function

This example demonstrates using the ore.datastore function.

# Create some R objects.df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
iris_of <- ore.push(iris)

# Create a database table and an OML4R proxy object for the table.
ore.drop("AIRQUALITY")
ore.create(airquality, table = "AIRQUALITY")
                                                                            
# Save the objects to a datastore named ds1 and supply a description.
ore.save(AIRQUALITY, list = ls(), name = "ds1", description = "My private datastore")

# Create some more objects.
x <- stats::runif(20) # x is an object of type numeric.
y <- list(a = 1, b = TRUE, c = "hoopsa")
z <- ore.push(x) # z is an object of type ore.numeric.

# Create other datastores.
ore.save(x, y, name = "ds2", description = "x and y")
ore.save(df1, df2, name = "dfobj", description = "df objects")
ore.save(x, y, z, name = "another_ds", description = "For pattern matching")

# List all of the datastore objects.
ore.datastore()

# List the specified datastore.
ore.datastore("ds1")

# List the datastore objects with names that include "ds".
ore.datastore(pattern = "ds")

Listing for This Example

R> # Create some R objects.
R> df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
R> df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
R> iris_of <- ore.push(iris)
R> 
R> # Create a database table and an OML4R proxy object for the table.
R> ore.drop("AIRQUALITY")
R> ore.create(airquality, table = "AIRQUALITY")
R>                                                                             
R> # Save the objects to a datastore named ds1 and supply a description.
R> ore.save(AIRQUALITY, list = ls(), name = "ds1", description = "My private datastore")
R>
R> # Create some more objects.
R> x <- stats::runif(20) # x is an object of type numeric.
R> y <- list(a = 1, b = TRUE, c = "hoopsa")
R> z <- ore.push(x) # z is an object of type ore.numeric.
R>
R> # Create other datastores.
R> ore.save(x, y, name = "ds2", description = "x and y")
R> ore.save(df1, df2, name = "dfobj", description = "df objects")
R> ore.save(x, y, z, name = "another_ds", description = "For pattern matching")
R> 
R> # List all of the datastore objects.
R> ore.datastore()
  datastore.name object.count size       creation.date          description
1     another_ds            3 1284 2017-04-21 16:08:57 For pattern matching
2          dfobj            2  656 2017-04-21 16:08:38           df objects
3            ds1            4 3439 2017-04-21 16:03:55 My private datastore
4            ds2            2  314 2017-04-21 16:04:32              x and y

R> # List the specified datastore.
R> ore.datastore("ds1")
  datastore.name object.count size       creation.date          description
1            ds1            4 3439 2017-04-21 16:03:55 My private datastore

R> # List the datastore objects with names that include "ds".
R> ore.datastore(pattern = "ds")
  datastore.name object.count size       creation.date          description
1     another_ds            3 1284 2017-04-21 16:08:57 For pattern matching
2            ds1            4 3439 2017-04-21 16:03:55 My private datastore
3            ds2            2  314 2017-04-21 16:04:32              x and y

Example 2-21 Using the ore.datastoreSummary Function

This example demonstrates using the ore.datastoreSummary function. The example uses the datastores created in the previous example.

ore.datastoreSummary("ds1")
ore.datastoreSummary("ds2")

Listing for This Example

R> ore.datastoreSummary("ds1")
  object.name      class size length row.count col.count
1  AIRQUALITY  ore.frame 1213      6        NA         6
2         df1 data.frame  328      2         5         2
3         df2 data.frame  328      2         5         2
4     iris_of  ore.frame 1570      5        NA         5
R> ore.datastoreSummary("ds2")
  object.name   class size length row.count col.count
1           x numeric  182     20        NA        NA
2           y    list  132      3        NA        NA
2.2.5.6 Restore Objects from a Datastore

The ore.load function restores R objects saved in a datastore to the R global environment, .GlobalEnv.

The function returns a character vector that contains the names of the restored objects.

You can load all of the saved objects or you can use the list argument to specify the objects to load. With the envir argument, you can specify an environment in which to load objects.

Example 2-22 Using the ore.load Function to Restore Objects from a Datastore

This example demonstrates using the ore.load function to restore objects from datastores that were created in Example 2-20. The example runs in the same R session as that example.

# List the R objects.
ls()

# List the datastores.
ore.datastore()

# Delete the x and z objects.
rm(x, z)
ls()

# Restore all of the objects in datastore ds2.
ore.load("ds2")

ls()

# After ending the R session and starting another session.
ls()
# The datastore objects persist between sessions.
ore.datastore()

# Restore some of the objects from datastore ds1.
ore.load("ds1", list = c("df1", "df2", "iris_of"))
ls()
Listing for Example 2-22
R> # List the R objects.
R> ls()
[1] "df1"     "df2"     "iris_of" "x"       "y"       "z"
R> 
R> # List the datastores.
R> ore.datastore()
  datastore.name object.count size       creation.date  description
1     another_ds            3 1243 2014-07-24 13:31:56 For pattern mattching
2          dfobj            2  656 2014-07-24 13:31:46            df objects
3            ds1            4 3162 2014-07-24 13:25:17          My datastore
4            ds2            2 1111 2014-07-24 13:27:26                only x
R> 
R> # Delete the x and z objects.
R> rm(x, z)
R> ls()
[1] "df1"     "df2"     "iris_of" "y"
R> 
R> # Restore all of the objects in datastore ds2.
R> ore.load("ds2")
[1] "x" "z"
R> 
R> ls()
[1] "df1"     "df2"     "iris_of" "x"       "y"       "z"
R> 
R> # After ending the R session and starting another session.
R> ls()
character(0)
R> # The datastore objects persist between sessions.
R> ore.datastore()
  datastore.name object.count size       creation.date           description
1     another_ds            3 1243 2014-07-24 13:31:56 For pattern mattching
2          dfobj            2  656 2014-07-24 13:31:46            df objects
3            ds1            4 3162 2014-07-24 13:25:17          My datastore
4            ds2            2 1111 2014-07-24 13:27:26                only x

R> # Restore some of the objects from datastore ds1.
R> ore.load("ds1", list = c("df1", "df2", "iris_of"))
[1] "df1"     "df2"     "iris_of"
R> ls()
[1] "df1"     "df2"     "iris_of"
2.2.5.7 Delete a Datastore

With the ore.delete function, you can delete objects from an OML4R datastore or you can delete the datastore itself.

To delete a datastore, you specify the name of it. To delete one or more objects from the datastore, you specify the list argument. The ore.delete function returns the name of the deleted objects or datastore.

When you delete a datastore, OML4R discards all temporary database objects that were referenced by R objects in the deleted datastore. If you have saved an R object in more than one datastore, then OML4R discards a temporary database object only when no object in a datastore references the temporary database object.

Example 2-23 Using the ore.delete Function

This example demonstrates using ore.delete to delete an object from a datastore and then to delete the entire datastore. The example uses objects created in Example 2-18.

# Delete the df2 object from the ds1 datastore.
ore.delete("ds1", list = "df2")
# Delete the datastore named ds1.
ore.delete("ds1")
Listing for Example 2-23
R> # Delete the df2 object from the ds1 datastore.
R> ore.delete("ds1", list = "df2")[1] "df2"
R> # Delete the datastore named ds1.
R> ore.delete("ds1")
[1] "ds1"
2.2.5.8 About Using a Datastore in Embedded R Execution

Saving objects in a datastore makes it very easy to pass arguments to, and reference R objects with, embedded R execution functions.

You can save objects that you create in one R session in a single datastore in the database. You can pass the name of this datastore to an embedded R function as an argument for loading within that function. You can use a datastore to easily pass one object or multiple objects.