2 Get Started with Oracle Machine Learning for R
Start using OML4R by connecting to an Oracle Database instance, creating OML4R objects, and storing them in the database.
This chapter discusses these topics:
- Connect to an Oracle Database Instance
To use Oracle Machine Learning for R, you first connect to an Oracle Database instance. - Create and Manage R Objects in Oracle Database
With transparency layer functions you can connect to an Oracle Database instance and interact with data structures in a database schema.
2.1 Connect to an Oracle Database Instance
To use Oracle Machine Learning for R, you first connect to an Oracle Database instance.
- About Connecting to the Database
Oracle Machine Learning for R client components connect an R session to an Oracle Database instance and the OML4R server components. - Use the ore.connect and ore.disconnect Functions
The examples in this section demonstrate the various ways of specifying an OML4R connection to an Oracle Database instance.
Parent topic: Get Started with Oracle Machine Learning for R
2.1.1 About Connecting to the Database
Oracle Machine Learning for R client components connect an R session to an Oracle Database instance and the OML4R server components.
The connection makes the data in a database schema available to the R user. It also makes the processing power, memory, and storage capacities of the database server available to the R session through the OML4R client interface.
The following topics discuss connecting and disconnecting an R session to an Oracle Database instance:
- About Using the ore.connect Function
To begin using OML4R, you first connect to a schema in an Oracle Database instance with theore.connect
function. - About Using the ore.disconnect Function
To explicitly end the connection between an R session and the Oracle Database instance, invoke theore.disconnect
function.
Parent topic: Connect to an Oracle Database Instance
2.1.1.1 About Using the ore.connect Function
To begin using OML4R, you first connect to a schema in an Oracle Database instance with the ore.connect
function.
Only one OML4R connection can exist at a time during an R session. If an R session is already connected to the database, then invoking ore.connect
terminates the active connection before opening a new connection. Before attempting to connect, you can discover whether an active connection exists by using the ore.is.connected
function.
You explicitly end a connection with the ore.disconnect
function. If you do not invoke ore.disconnect
, then the connection is automatically terminated when the R session ends.
With the type
argument of ore.connect
, you specify the type of connection, either ORACLE or HIVE. A HIVE type of connection connects to Hive tables in a Hadoop cluster. An ORACLE type of connection connects to a schema in an Oracle Database instance. The default value of type
is "ORACLE"
.
If the connection type is HIVE, then ore.connect
ignores all other arguments. The HIVE option applies only if you are using Oracle Machine Learning for Spark (OML4Spark) in conjunction with a Hadoop cluster. OML4Spark is part of the Oracle Big Data Connectors option to the Big Data Appliance.
If the connection type is ORACLE, then you do the following:
-
Use the logical
all
argument to specify whether OML4R automatically creates anore.frame
object for each table to which the user has access in the schema and makes thoseore.frame
objects visible in the current R session. Theore.frame
objects contain metadata about the tables. The default value of theall
argument isFALSE
.If
all = TRUE
, then OML4R implicitly invokes theore.sync
andore.attach
functions. Ifall = FALSE
, then the user must explicitly invokeore.sync
to createore.frame
objects. To access these objects by name, the user must invokeore.attach
to include the names in the search path. -
Use either the
conn_string
argument, or various combinations of theuser
,sid
,host
,password
,port
,service_name
, andconn_string
arguments to specify information that identifies the connection.To avoid using a clear-text password, you can specify an Oracle wallet password with the
conn_string
argument. No other arguments are needed. By specifying an Oracle wallet password, you can avoid embedding a database user password in application code, batch jobs, or scripts.With the other connection identifier arguments, you specify a database user name, host name, and password, and either a system identifier (SID) or service name, and, optionally, a TCP port, or you specify a database user name, password, and a
conn_string
argument.The default value of the
port
argument is 1521, the default value ofhost
is"localhost"
, which specifies the local host, and the default value ofconn_string
is NULL. You specify the local host when your R session is running on the same computer as the Oracle Database instance to which you want to connect.
See Also:
-
"Using the ore.connect and ore.disconnect Functions" for examples of using the various connection identifiers
-
Oracle Machine Learning for R Installation and Administration Guide for information on creating an Oracle wallet.
Parent topic: About Connecting to the Database
2.1.1.2 About Using the ore.disconnect Function
To explicitly end the connection between an R session and the Oracle Database instance, invoke the ore.disconnect
function.
OML4R implicitly invokes ore.disconnect
if you do either of the following:
-
Quit the R session.
-
Invoke
ore.connect
while an OML4R connection is already active.
When you disconnect the active connection, OML4R discards all OML4R objects that you have not explicitly saved in an OML4R datastore.
Parent topic: About Connecting to the Database
2.1.2 Use the ore.connect and ore.disconnect Functions
The examples in this section demonstrate the various ways of specifying an OML4R connection to an Oracle Database instance.
The examples use sample values for the ore.connect
argument values. Replace the sample values with the appropriate values for connecting to your database.
Example 2-1 Using ore.connect and Specifying a SID
This example invokes the ore.connect
function and specifies the user
, sid
, host
, password
, and port
arguments.
ore.connect(user = "oml_user", sid = "sales", host = "sales-server", password = "oml_userStrongPassword", port = 1521 )
Example 2-2 Using ore.connect and Specifying a Service Name
This example demonstrates using a service name rather than a SID. It also specifies connecting to the local host.
ore.connect(user = "oml_user", host = "localhost", password = "oml_userStrongPassword", service_name = "sales.example.com")
Example 2-3 Using ore.connect and Specifying an Easy Connect String
This example uses the conn_string
argument to specify an easy connect string that identifies the connection.
ore.connect(user = "oml_user", password = "oml_userStrongPassword", conn_string = "sales-server:1521:sales (ADDRESS=(PROTOCOL=tcp) (HOST=sales-server) (PORT=1521)) (CONNECT_DATA=(SERVICE_NAME=sales.example.com)))")
Example 2-4 Using ore.connect and Specifying a Full Connection String
This example uses the conn_string
argument to specify a full connection string that identifies the connection.
ore.connect(user = "oml_user", password = "oml_userStrongPassword", conn_string = "DESCRIPTION= (ADDRESS=(PROTOCOL=tcp) (HOST=sales-server) (PORT=1521)) (CONNECT_DATA=(SERVICE_NAME=myserver.example.com))")
Example 2-5 Using the conn_string Argument to Specify an Oracle Wallet
This example uses the conn_string
argument to specify an Oracle wallet. The mydb_test
string is the connection identifier for the Oracle database. The Oracle wallet contains the information needed to create the connection. For information on creating an Oracle wallet for an OML4R connection, see Oracle Machine Learning
for R Installation and Administration Guide.
ore.connect(conn_string = "mydb_test")
Example 2-6 Using the conn_string Argument and Specifying an Empty Connection String
This example uses an empty connection string to connect to the local host.
ore.connect(user = "oml_user", password = "oml_userStrongPassword", conn_string = "")
Example 2-7 Using the conn_string Argument in Connecting to a Pluggable Database
This example connects to a pluggable database using the conn_string
argument to specify a service name.
ore.connect(conn_string = "pdb1.example.com")
Example 2-8 Using the service_name Argument in Connecting to a Pluggable Database
This example invokes ore.connect
using a service name, host name, and port number to connect to a pluggable database.
ore.connect(service_name = "pdb1.example.com", host = "mypdb", port = 1521)
Example 2-9 Disconnecting an OML4R Session
This example explicitly disconnects an OML4R session from an Oracle database.
ore.disconnect()
Parent topic: Connect to an Oracle Database Instance
2.2 Create and Manage R Objects in Oracle Database
With transparency layer functions you can connect to an Oracle Database instance and interact with data structures in a database schema.
You can move data to and from the database and create database tables. You can also save R objects in the database. The OML4R functions that perform these actions are described in the following topics.
- Create R Objects for In-Database Data
Using Oracle Machine Learning for R, you can create R proxy objects in your R session for database-resident data. - Create Ordered and Unordered ore.frame Objects
Oracle Machine Learning for R provides the ability to create ordered or unorderedore.frame
objects. - Move Data to and from the Database
You can create a temporary database table, and its corresponding proxyore.frame
object, from a local R object with theore.push
function. - Create and Delete Database Tables
Use theore.create
function to create a persistent table in an Oracle Database schema. - Save and Manage R Objects in the Database
Oracle Machine Learning for R provides datastores that you can use to save OML4R proxy objects, as well as any R object, in an Oracle database.
Parent topic: Get Started with Oracle Machine Learning for R
2.2.1 Create R Objects for In-Database Data
Using Oracle Machine Learning for R, you can create R proxy objects in your R session for database-resident data.
Creating proxy objects is described in the following topics.
- About Creating R Objects for Database Objects
To gain access to the data in the database tables in the schema, you use theore.sync
function. - Synchronize Data with the ore.sync Function
The following example demonstrates the use of theore.sync
function. - Get Objects with the ore.get Function
After you have created an R environment andore.frame
proxy objects withore.sync
, you can get a proxy object by name with theore.get
function. - Add a Schema with the ore.attach Function
Withore.attach
, you add an R environment for a database schema to the R search path.
Parent topic: Create and Manage R Objects in Oracle Database
2.2.1.1 About Creating R Objects for Database Objects
To gain access to the data in the database tables in the schema, you use the ore.sync
function.
When you invoke ore.connect
in an R session, Oracle Machine Learning for R creates a connection to a schema in an Oracle Database instance. The ore.sync
function creates an ore.frame
object that is a proxy for a table in a schema. You can use the ore.attach
function to add an R environment that represents a schema in the R search path.
When you use the ore.sync
function to create an ore.frame
object as a proxy for a database table, the name of the ore.frame
proxy object is the same as the name of the database object. Each ore.frame
proxy object contains metadata about the corresponding database object.
You can use the proxy ore.frame
object to select data from the table. When you execute an R operation that selects data from the table, the operation returns the current data from the database object. However, if some application has added a column to the table, or has otherwise changed the metadata of the database object, the ore.frame
proxy object does not reflect such a change until you again invoke ore.sync
for the database object.
If you invoke the ore.sync
function with no tables specified, and if the value of the all
argument was FALSE
in the ore.connect
function call that established the connection to the Oracle database instance, then the ore.sync
function creates a proxy object for each table in the schema specified by ore.connect
. You can use the table
argument to specify the tables for which you want to create ore.frame
proxy objects.
Tip:
To conserve memory resources and save time, you should only add proxies for the tables that you want to use in your R session.
With the schema
argument, you can specify the schema for which you want to create an R environment and proxy objects. Only one environment for a given database schema can exist at a time. With the use.keys
argument, you can specify whether you want to use primary keys in the table to order the ore.frame
object.
Tip:
Ordering is expensive in the database. Because most operations in R do not need ordering, you should generally set use.keys
to FALSE
unless you need ordering for sampling data or some other purpose.
With the query
argument, you can specify a SQL SELECT
statement. This enables you to create an ore.frame
for a query without creating a view in the database. This can be useful when you do not have the CREATE VIEW
system privilege for the current schema. You cannot use the schema
argument and the query
argument in the same ore.sync
invocation.
You can use the ore.ls
function to list the ore.frame
proxy objects that correspond to database tables in the environment for a schema. You can use the ore.exists
function to find out if an ore.frame
proxy object for a database table exists in an R environment. The function returns TRUE
if the proxy object exists or FALSE
if it does not. You can remove an ore.frame
proxy object from an R environment with the ore.rm
function.
Parent topic: Create R Objects for In-Database Data
2.2.1.2 Synchronize Data with the ore.sync Function
The following example demonstrates the use of the ore.sync
function.
The example first invokes the ore.exec
function to create some tables to represent tables existing in the OML_USER database schema. The example then invokes ore.sync
and specifies three tables of the schema. The ore.sync
invocation creates an R environment for the OML_USER schema and creates proxy ore.frame
objects for the specified tables in that schema. The example lists the ore.frame
proxy objects in the current environment. The TABLE3 table exists in the schema but does not have an ore.frame
proxy object because it was not included in the ore.sync
invocation.
The example next invokes ore.sync
with the query
argument to create ore.frame
objects for the specified SQL queries. The example lists the ore.frame
objects again.
The example then invokes ore.sync
again and creates an R environment for the SH schema and proxy objects in that environment for the specified tables in that schema. The example invokes the ore.exists
function to find out if the specified table exists in the current environment and then in the SH environment. The example lists the R objects in the SH environment.
The example next removes the ore.frame
objects QUERY1
, QUERY2
, and TABLE4
from the OML_USER environment. Finally, the example lists the proxy objects in the environment again.
Note:
The ore.rm
function invocation removes the ore.frame
that is a proxy for the TABLE4 table from the environment. It does not delete the table from the schema.
Example 2-10 Using ore.sync to Add ore.frame Proxy Objects to an R Environment
# After connecting to a database as OML_USER, create some tables.
ore.exec("CREATE TABLE TABLE1 AS SELECT * FROM dual")
ore.exec("CREATE TABLE TABLE2 AS SELECT * FROM dual")
ore.exec("CREATE TABLE TABLE3 AS SELECT * FROM dual")
ore.exec("CREATE TABLE TABLE4 AS SELECT * FROM dual")
# Create ore.frame objects for the specified tables.
ore.sync(table = c("TABLE1", "TABLE3", "TABLE4"))
# List the ore.frame proxy objects in the current environment.
ore.ls()
# Create ore.frame objects for the specified queries.
ore.sync(query = c("QUERY1" = "SELECT 0 X, 1 Y FROM dual",
"QUERY2" = "SELECT 1 X, 0 Y FROM dual"))
ore.ls()
# The OML_USER user has been granted SELECT permission on the tables in the
# SH schema.
ore.sync("SH", table = c("CUSTOMERS", "SALES"))
# Find out if the CUSTOMERS ore.frame exists in the OML_USER environment.
ore.exists("CUSTOMERS")
# Find out if it exists in the SH environment.
ore.exists("CUSTOMERS", schema = "SH")
# List the ore.frame proxy objects in the SH environment.
ore.ls("SH")
# Remove the ore.frame objects for the specified objects.
ore.rm(c("QUERY1", "QUERY2", "TABLE4"))
# List the ore.frame proxy objects in the current environment again.
ore.ls()
Listing for This ExampleR> # After connecting to a database as OML_USER, create some tables.
R> ore.exec("CREATE TABLE TABLE1 AS SELECT * FROM dual")
R> ore.exec("CREATE TABLE TABLE2 AS SELECT * FROM dual")
R> ore.exec("CREATE TABLE TABLE3 AS SELECT * FROM dual")
R> ore.exec("CREATE TABLE TABLE4 AS SELECT * FROM dual")
R> # Create ore.frame objects for the specified tables.
R> ore.sync(table = c("TABLE1", "TABLE3", "TABLE4"))
R> # List the ore.frame proxy objects in the current environment.
R> ore.ls()
[1] "TABLE1" "TABLE3" "TABLE4"
R> # Create ore.frame objects for the specified queries.
R> ore.sync(query = c("QUERY1" = "SELECT 0 X, 1 Y FROM dual",
+ "QUERY2" = "SELECT 1 X, 0 Y FROM dual"))
R> ore.ls()
[1] "QUERY1" "QUERY2" "TABLE1" "TABLE3" "TABLE4"
R> # The OML_USER user has been granted SELECT permission on the tables in the
R> # SH schema.
R> ore.sync("SH", table = c("CUSTOMERS", "SALES"))
R> # Find out if the CUSTOMERS ore.frame exists in the OML_USER environment.
R> ore.exists("CUSTOMERS")
[1] FALSE
R> # Find out if it exists in the SH environment.
R> ore.exists("CUSTOMERS", schema = "SH")
[1] TRUE
R> # List the ore.frame proxy objects in the SH environment.
R> ore.ls("SH")
[1] "CUSTOMERS" "SALES"
R> # Remove the ore.frame objects for the specified objects.
R> ore.rm(c("QUERY1", "QUERY2", "TABLE4"))
R> # List the ore.frame proxy objects in the current environment again.
R> ore.ls()
[1] "TABLE1" TABLE3"
Parent topic: Create R Objects for In-Database Data
2.2.1.3 Get Objects with the ore.get Function
After you have created an R environment and ore.frame
proxy objects with ore.sync
, you can get a proxy object by name with the ore.get
function.
You can use ore.get
to get the proxy ore.frame
for a table and assign it to a variable in R, as in SH_CUST <- ore.get(name = "CUSTOMERS", schema = "SH")
. The ore.frame
exists in the R global environment, which can be referred to using .GlobalEnv
, and so it appears in the list returned by the ls
function. Also, because this object exists in the R global environment, as opposed an R environment that represents a database schema, it is not listed by the ore.ls
function.
Example 2-11 Using ore.get to Get a Database Table
This example invokes the ore.sync
function to create an ore.frame
object that is a proxy for the CUSTOMERS table in the SH schema. The example then gets the dimensions of the proxy object.
ore.sync(schema = "SH", table = "CUSTOMERS", use.keys = FALSE) dim(ore.get(name = "CUSTOMERS", schema = "SH"))Listing for Example 2-11
R> ore.sync(schema = "SH", table = "CUSTOMERS", use.keys = FALSE) R> dim(ore.get(name = "CUSTOMERS", schema = "SH")) [1] 630 15
Parent topic: Create R Objects for In-Database Data
2.2.1.4 Add a Schema with the ore.attach Function
With ore.attach
, you add an R environment for a database schema to the R search path.
When you add the R environment, you have access to database tables by name through the proxy objects created by the ore.sync
function without needing to specify the schema environment.
The default schema is the one specified in creating the connection and the default position in the search path is 2. You can specify the schema and the position in the ore.attach
function invocation.. You can also specify whether you want the ore.attach
function to indicate whether a naming conflict occurs when adding the environment. You can detach the environment for a schema from the R search path with the ore.detach
function.
Example 2-12 Using ore.attach to Add an Environment for a Database Schema
This example demonstrates the use of the ore.attach
function. Comments in the example explain the function invocations.
# Connected as oml_user. # Add the environment for the oml_user schema to the R search path. ore.attach() # Create an unordered ore.frame proxy object in the SH environment for the # specifed table. ore.sync(schema = "SH", table = "CUSTOMERS", use.keys = FALSE) # Add the environment for the SH schema to the search path and warn if naming # conflicts exist. ore.attach("SH", 3, warn.conflicts = TRUE) # Display the number of rows and columns in the proxy object for the table. dim(CUSTOMERS) # Remove the environment for the SH schema from the search path. ore.detach("SH") # Invoke the dim function again. dim(CUSTOMERS)Listing for Example 2-12
R> # Connected as oml_user. R> # Add the environment for the oml_user schema to the R search path. R> ore.attach() R> # Create an unordered ore.frame proxy object in the SH environment for the R> # specifed table. R> ore.sync(schema = "SH", table = "CUSTOMERS", use.keys = FALSE) R> # Add the environment for the SH schema to the search path and warn if naming R> # conflicts exist. R> ore.attach("SH", 3, warn.conflicts = TRUE) R> # Display the number of rows and columns in the proxy object for the table. R> dim(CUSTOMERS) [1] 630 15 R> # Remove the environment for the SH schema from the search path. R> ore.detach("SH") R> # Invoke the dim function again. R> dim(CUSTOMERS) Error: object 'CUSTOMERS' not found
Parent topic: Create R Objects for In-Database Data
2.2.2 Create Ordered and Unordered ore.frame Objects
Oracle Machine Learning for R provides the ability to create ordered or unordered ore.frame
objects.
- About Ordering in ore.frame Objects
R objects such asvector
anddata.frame
have an implicit ordering of their elements. - Global Options Related to Ordering
OML4R has options that relate to the ordering of anore.frame
object. - Ordering Using Keys
You can use the primary key of a database table to order anore.frame
object. - Ordering Using Row Names
You can use row names to order anore.frame
object. - Using Ordered Frames
This example shows the result of merging two orderedore.frame
objects and two unorderedore.frame
objects.
Parent topic: Create and Manage R Objects in Oracle Database
2.2.2.1 About Ordering in ore.frame Objects
R objects such as vector
and data.frame
have an implicit ordering of their elements.
The data in an Oracle Database table is not necessarily ordered. For some R operations, ordering is useful whereas for other operations it is unnecessary. By ordering an ore.frame
, you are able to index the ore.frame
object by using either integer or character indexes.
Using an ordered ore.frame
object that is a proxy for a SQL query can be time-consuming for a large data set. Therefore, although OML4R attempts to create ordered ore.frame
objects by default, it also provides the means of creating an unordered ore.frame
object.
When you invoke the ore.sync
function to create an OML4R ore.frame
object as a proxy for a SQL query, you can use the use.keys
argument to specify whether the ore.frame
can be ordered or must be unordered.
An ore.frame
object can be ordered if one or more of the following conditions are true:
-
The value of the
use.keys
argument of theore.sync
function isTRUE
and a primary key is defined on the underlying table -
The row names of the
ore.frame
constitute a unique tuple -
The
ore.frame
object is produced by certain functions such asaggregate
andcbind
-
All of the
ore.frame
objects that are input arguments to relevant OML4R functions are ordered
An ore.frame
object is unordered if one or more of the following conditions are true:
-
The value of the
use.keys
argument of theore.sync
function isFALSE
-
No primary key is defined on the underlying table and either the row names of the
ore.frame
object are not specified or the row names of theore.frame
object are set toNULL
-
One or more of the
ore.frame
objects that are input arguments to relevant OML4R functions are unordered
An unordered ore.frame
object has null row names. You can determine whether an ore.frame
object is ordered by invoking is.null
on the row names of the objects, as shown in the last lines of Example 2-13. If the ore.frame
object is unordered, is.null
returns an error.
See Also:
Parent topic: Create Ordered and Unordered ore.frame Objects
2.2.2.2 Global Options Related to Ordering
OML4R has options that relate to the ordering of an ore.frame
object.
The ore.warn.order
global option specifies whether you want OML4R to display a warning message if you use an unordered ore.frame
object in a function that requires ordering. If you know what to expect in an operation, then you might want to turn the warnings off so they do not appear in the output. For examples of the warning messages, see Example 2-13 and Example 2-14.
You can see what the current setting is, or turn the option on or off, as in the following example.
R> options("ore.warn.order") $ore.warn.order [1] TRUE R> options("ore.warn.order" = FALSE) R> options("ore.warn.order" = TRUE)
With the ore.sep
option, you can specify the separator between the row name values that you use for multi-column keys, as in the following example.
R> options("ore.sep") $ore.sep [1] "|" R> options("ore.sep" = "/") R> options("ore.sep" = "|")
Parent topic: Create Ordered and Unordered ore.frame Objects
2.2.2.3 Ordering Using Keys
You can use the primary key of a database table to order an ore.frame
object.
The following example loads the spam data set from the kernlab
package. It adds two columns to the data set.
The example invokes ore.drop
to drop the named tables if they exist. It then invokes ore.create
to create two tables from the data set. It invokes ore.exec
to make the USERID and TS columns a composite primary key of the SPAM_PK table, and invokes ore.sync
to synchronize the table with its ore.frame
proxy.
Note:
The ore.exec
function executes a SQL statement in the Oracle Database schema. The function is intended for database definition language (DDL) statements that have no return value.
The example then displays the first eight rows of each table. The proxy object for the SPAM_PK table is an ordered ore.frame
object. It has row names that are a combination of the TS and USERID column values separated by the "|" character. The proxy object for the SPAM_NOPK table is an unordered ore.frame
object that has the symbol SPAM_NOPK
. By default, SPAM_NOPK
has row names that are sequential numbers.
Example 2-13 Ordering Using Keys
# Prepare the data. library(kernlab) data(spam) s <- spam # Create a column that has integer values. s$TS <- 1001:(1000 + nrow(s)) # Create a column that has integer values with each number repeated twice. s$USERID <- rep(351:400, each=2, len=nrow(s)) # Ensure that the database tables do not exist. ore.drop(table='SPAM_PK') ore.drop(table='SPAM_NOPK') # Create database tables. ore.create(s[,c(59:60,1:28)], table="SPAM_PK") ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key (\"USERID\",\"TS\")") # Synchronize the table to get the change to it. ore.sync(table = "SPAM_PK") # View the data in the tables. # The row names of the ordered SPAM_PK are the primary key column values. head(SPAM_PK[,1:8]) # The row names of the unordered SPAM_NOPK are sequential numbers. # The first warning results from the inner accessing of SPAM_NOPK to subset # the columns. The second warning is for the invocation of the head # function on that subset. head(SPAM_NOPK[,1:8]) # Verify that SPAM_NOPK is unordered. is.null(row.names(SPAM_NOPK))Listing for This Example
R> # Prepare the data. R> library(kernlab) R> data(spam) R> s <- spam R> # Create a column that has integer values. R> s$TS <- 1001:(1000 + nrow(s)) R> # Create a column that has integer values with each number repeated twice. R> s$USERID <- rep(351:400, each=2, len=nrow(s)) R> # Ensure that the database tables do not exist. R> ore.drop(table='SPAM_PK') R> ore.drop(table='SPAM_NOPK') R> # Create database tables. R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK") R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") R> # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key + (\"USERID\",\"TS\")") R> # Synchronize the table to get the change to it. R> ore.sync(table = "SPAM_PK") R> # View the data in the tables. R> # The row names of the ordered SPAM_PK are the primary key column values. R> head(SPAM_PK[,1:8]) TS USERID make address all num3d our over 1001|351 1001 351 0.00 0.64 0.64 0 0.32 0.00 1002|351 1002 351 0.21 0.28 0.50 0 0.14 0.28 1003|352 1003 352 0.06 0.00 0.71 0 1.23 0.19 1004|352 1004 352 0.00 0.00 0.00 0 0.63 0.00 1005|353 1005 353 0.00 0.00 0.00 0 0.63 0.00 1006|353 1006 353 0.00 0.00 0.00 0 1.85 0.00 R> # The row names of the unordered SPAM_NOPK are sequential numbers. R> # The first warning results from the inner accessing of SPAM_NOPK to subset R> # the columns. The second warning is for the invocation of the head R> # function on that subset. R> head(SPAM_NOPK[,1:8]) TS USERID make address all num3d our over 1 1001 351 0.00 0.64 0.64 0 0.32 0.00 2 1002 351 0.21 0.28 0.50 0 0.14 0.28 3 1003 352 0.06 0.00 0.71 0 1.23 0.19 4 1004 352 0.00 0.00 0.00 0 0.63 0.00 5 1005 353 0.00 0.00 0.00 0 0.63 0.00 6 1006 353 0.00 0.00 0.00 0 1.85 0.00 Warning messages: 1: ORE object has no unique key - using random order 2: ORE object has no unique key - using random order R> # Verify that SPAM_NOPK is unordered. R> is.null(row.names(SPAM_NOPK)) Error: ORE object has no unique key
Parent topic: Create Ordered and Unordered ore.frame Objects
2.2.2.4 Ordering Using Row Names
You can use row names to order an ore.frame
object.
The following example creates a data.frame
object in the local R session memory and pushes it to the ore.frame
object with the symbol a
, which exists in the memory of the Oracle database to which the R session is connected. The example shows that the ore.frame
object has the default row names of the R data.frame
object. Because the ore.frame
object is ordered, invoking the row.names
function on it does not produce a warning message.
The example uses the ordered SPAM_PK
and unordered SPAM_NOPK
ore.frame
objects to show that invoking row.names
on the unordered SPAM_NOPK
produces a warning message but invoking it on the ordered SPAM_PK
does not.
The SPAM_PK
object is ordered by the row names, which are the combined values of the TS and USERID column values separated by the "|" character. The example shows that you can change the row names.
Example 2-14 Ordering Using Row Names
# Prepare the data. library(kernlab) data(spam) s <- spam # Create a column that has integer values. s$TS <- 1001:(1000 + nrow(s)) # Create a column that has integer values with each number repeated twice. s$USERID <- rep(351:400, each=2, len=nrow(s)) # Ensure that the database tables do not exist. ore.drop(table='SPAM_PK') ore.drop(table='SPAM_NOPK') # Create database tables. ore.create(s[,c(59:60,1:28)], table="SPAM_PK") ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key (\"USERID\",\"TS\")") # Synchronize the table to get the change to it. ore.sync(table = "SPAM_PK") # Create an ordered ore.frame by default. a <- ore.push(data.frame(a=c(1:10,10:1), b=letters[c(1:10,10:1)])) # Display the values in the b column. Note that because the ore.frame is # ordered, no warnings appear. a$b # Display the default row names for the first six rows of the a column. row.names(head(a)) # SPAM_NOPK has no unique key, so row.names raises error messages. row.names(head(SPAM_NOPK)) # Row names consist of TS ‘|' USERID. # For display on this page, only the first four row names are shown. row.names(head(SPAM_PK)) # Reassign the row names to the TS column only row.names(SPAM_PK) <- SPAM_PK$TS # The row names now correspond to the TS values only. row.names(head(SPAM_PK[,1:4])) head(SPAM_PK[,1:4])Listing for This Example
R> # Prepare the data. R> library(kernlab) R> data(spam) R> s <- spam R> # Create a column that has integer values. R> s$TS <- 1001:(1000 + nrow(s)) R> # Create a column that has integer values with each number repeated twice. R> s$USERID <- rep(351:400, each=2, len=nrow(s)) R> # Ensure that the database tables do not exist. R> ore.drop(table='SPAM_PK') R> ore.drop(table='SPAM_NOPK') R> # Create database tables. R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK") R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") R> # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key + (\"USERID\",\"TS\")") R> # Synchronize the table to get the change to it. R> ore.sync(table = "SPAM_PK") R> # Create an ordered ore.frame by default. R> a <- ore.push(data.frame(a=c(1:10,10:1), b=letters[c(1:10,10:1)])) R> # Display the values in the b column. Note that because the ore.frame is R> # ordered, no warnings appear. R> a$b [1] a b c d e f g h i j j i h g f e d c b aLevels: a b c d e f g h i j R> # Display the default row names for the first six rows of the a column. R> row.names(head(a)) [1] 1 2 3 4 5 6 R> # SPAM_NOPK has no unique key, so row.names raises error messages. R> row.names(head(SPAM_NOPK)) Error: ORE object has no unique key In addition: Warning message: ORE object has no unique key - using random order R> # Row names consist of TS ‘|' USERID. R> # For display on this page, only the first four row names are shown. R> row.names(head(SPAM_PK)) 1001|351 1002|351 1003|352 1004|352 "1001|3.51E+002" "1002|3.51E+002" "1003|3.52E+002" "1004|3.52E+002" R> # Reassign the row names to the TS column only R> row.names(SPAM_PK) <- SPAM_PK$TS R> # The row names now correspond to the TS values only. R> row.names(head(SPAM_PK[,1:4])) [1] 1001 1002 1003 1004 1005 1006 R> head(SPAM_PK[,1:4]) TS USERID make address 1001 1001 351 0.00 0.64 1002 1002 351 0.21 0.28 1003 1003 352 0.06 0.00 1004 1004 352 0.00 0.00 1005 1005 353 0.00 0.00 1006 1006 353 0.00 0.00
Parent topic: Create Ordered and Unordered ore.frame Objects
2.2.2.5 Using Ordered Frames
This example shows the result of merging two ordered ore.frame
objects and two unordered ore.frame
objects.
Example 2-15 Merging Ordered and Unordered ore.frame Objects
# Prepare the data.
library(kernlab)
data(spam)
s <- spam
# Create a column that has integer values.
s$TS <- 1001:(1000 + nrow(s))
# Create a column that has integer values with each number repeated twice.
s$USERID <- rep(351:400, each=2, len=nrow(s))
# Ensure that the database tables do not exist.
ore.drop(table='SPAM_PK')
ore.drop(table='SPAM_NOPK')
# Create database tables.
ore.create(s[,c(59:60,1:28)], table="SPAM_PK")
ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK")
# Using a SQL statement, alter the SPAM_PK table to add a composite primary key.
ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key
(\"USERID\",\"TS\")")
# Synchronize the table to get the change to it.
ore.sync(table = "SPAM_PK")
# Create objects for merging data from unordered ore.frame objects.
x <- SPAM_NOPK[,1:4]
y <- SPAM_NOPK[,c(1,2,4,5)]
m1 <- merge(x, y, by="USERID")
# The merged result m1 produces a warning because it is not an ordered frame.
head(m1,3)
# Create objects for merging data from ordered ore.frame objects.
x <- SPAM_PK[,1:4]
y <- SPAM_PK[,c(1,2,4,5)]
# The merged result m1 does not produce a warning now because it is an
# ordered frame.
m1 <- merge(x, y, by="USERID")
head(m1,3)
Listing for This ExampleR> # Prepare the data.
R> library(kernlab)
R> data(spam)
R> s <- spam
R> # Create a column that has integer values.
R> s$TS <- 1001:(1000 + nrow(s))
R> # Create a column that has integer values with each number repeated twice.
R> s$USERID <- rep(351:400, each=2, len=nrow(s))
R> # Ensure that the database tables do not exist.
R> ore.drop(table='SPAM_PK')
R> ore.drop(table='SPAM_NOPK')
R> # Create database tables.
R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK")
R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK")
R> # Uing a SQL statement, alter the SPAM_PK table to add a composite primary key.
R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key
+ (\"USERID\",\"TS\")")
R> # Synchronize the table to get the change to it.
R> ore.sync(table = "SPAM_PK")
R> # Create objects for merging data from unordered ore.frame objects.
R> x <- SPAM_NOPK[,1:4]
R> y <- SPAM_NOPK[,c(1,2,4,5)]
R> m1 <- merge(x, y, by="USERID")
R> # The merged result m1 produces a warning because it is not an ordered frame.
R> head(m1,3)
USERID TS.x make address.x TS.y address.y all
1 351 5601 0.00 0 1001 0.64 0.64
2 351 5502 0.00 0 1001 0.64 0.64
3 351 5501 0.78 0 1001 0.64 0.64
Warning messages:
1: ORE object has no unique key - using random order
2: ORE object has no unique key - using random order
R> # Create objects for merging data from ordered ore.frame objects.
R> x <- SPAM_PK[,1:4]
R> y <- SPAM_PK[,c(1,2,4,5)]
R> # The merged result m1 does not produce a warning now because it is an
R> # ordered frame.
R> m1 <- merge(x, y, by="USERID")
R> head(m1,3)
USERID TS.x make address.x TS.y address.y all
1001|1001 351 1001 0 0.64 1001 0.64 0.64
1001|1002 351 1001 0 0.64 1002 0.28 0.50
1001|1101 351 1001 0 0.64 1101 0.00 0.00
Parent topic: Create Ordered and Unordered ore.frame Objects
2.2.3 Move Data to and from the Database
You can create a temporary database table, and its corresponding proxy ore.frame
object, from a local R object with the ore.push
function.
With the ore.pull
function you can create a local R object that contains a copy of data represented by an OML4R proxy object.
The ore.push
function translates an R object into an OML4R object of the appropriate data type. The ore.pull
function takes an ore
class object and returns an R object. If the input object is an ore.list
, the ore.pull
function creates a data.frame
and translates each the data of each database column into the appropriate R representation.
Note:
You can pull data to a local R data.frame
only if the data can fit into the R session memory. Also, even if the data fits in memory but is still very large, you may not be able to perform many, or any, R functions in the client R session.
Unless you explicitly save them, the temporary database tables and their corresponding OML4R proxy objects that you create with the ore.push
function are discarded when you quit the R session.
See Also:
-
"Transparency Layer Support for R Data Types and Classes" for information on data type mappings
-
"Saving and Managing R Objects in the Database" for information on permanently saving the OML4R objects in the database
-
The
push_pull.R
example script
Example 2-16 Using ore.push and ore.pull to Move Data
This example demonstrates pushing an R data.frame
object to the database as a temporary database table with an associated ore.frame
object, iris_of
, then creating another ore.frame
object, iris_of_setosa
, by selecting one column from iris_of
, and then pulling the iris_of_setosa
object into the local R session memory as a data.frame
object. The example displays the class of some of the objects.
class(iris) # Push the iris data frame to the database. iris_of <- ore.push(iris) class(iris_of) # Display the data type of the Sepal.Length column in the data.frame. class(iris$Sepal.Length) # Display the data type of the Sepal.Length column in the ore.frame. class(iris_of$Sepal.Length) # Filter one column of the data set. iris_of_setosa <- iris_of[iris_of$Species == "setosa", ] class(iris_of_setosa) # Pull the selected column into the local R client memory. local_setosa = ore.pull(iris_of_setosa) class(local_setosa)Listing for This Example
R> class(iris) [1] "data.frame" R> # Push the iris data frame to the database. R> iris_of <- ore.push(iris) R> class(iris_of) [1] "ore.frame" attr(,"package") [1] "OREbase" R> # Display the data type of the Sepal.Length column in the data.frame. R> class(iris$Sepal.Length) [1] "numeric" R> # Display the data type of the Sepal.Length column in the ore.frame. R> class(iris_of$Sepal.Length) [1] "ore.numeric" attr(,"package") [1] "OREbase" R> # Filter one column of the data set. R> iris_of_setosa <- iris_of[iris_of$Species == "setosa", ] R> class(iris_of_setosa) [1] "ore.frame" attr(,"package") [1] "OREbase" R> # Pull the selected column into the local R client memory. R> local_setosa = ore.pull(iris_of_setosa) R> class(local_setosa) [1] "data.frame"
Parent topic: Create and Manage R Objects in Oracle Database
2.2.4 Create and Delete Database Tables
Use the ore.create
function to create a persistent table in an Oracle Database schema.
Note:
When creating a table in Oracle Machine Learning for R, if you use lowercase or mixed case for the name of the table, then you must use the same lowercase or mixed case name in double quotation marks when using the table in a SQL query or function. If, instead, you use an all uppercase name when creating the table, then the table name is case-insensitive: you can use uppercase, lowercase, or mixed case when using the table without using double quotation marks. The same is true for naming columns in a table.Creating the table automatically creates an ore.frame
proxy object for the table in the R environment that represents your database
schema. The proxy ore.frame
object has the same name as the table.
You can delete the persistent table in an Oracle Database schema with the
ore.drop
function.
Caution:
Only use the ore.drop
function to delete a database
table and its associated ore.frame
proxy object. Never use it
to remove an ore.frame
object that is not associated with a
permanent database table. To remove an ore.frame
object for a
temporary database table, use the ore.rm
function.
Example 2-17 Using ore.create and ore.drop to Create and Drop Tables
This example creates tables in the database and drops some of them.
# Create the AIRQUALITY table from the data.frame for the airquality data set.
ore.create(airquality, table = "AIRQUALITY")
# Create data.frame objects.
df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
# Create the DF1 and DF2 tables from the data.frame objects.
ore.create(df1, "DF1")
ore.create(df2, "DF2")
# Create the CARS93 table from the data.frame for the Cars93 data set.
ore.create(Cars93, table = "CARS93")
# List the OML4R proxy objects.
ore.ls()
# Drop the CARS93 object.
ore.drop(table = "CARS93")
# List the OML4R proxy objects again.
ore.ls()
Listing for This Example
R> # Create the AIRQUALITY table from the data.frame for the airquality data set.
R> ore.create(airquality, table = "AIRQUALITY")
R> # Create data.frame objects.
R> df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
R> df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
R> # Create the DF1_TABLE and DF2_TABLE tables from the data.frame objects.
R> ore.create(df1, "DF1")
R> ore.create(df2, "DF2")
R> # Create the CARS93 table from the data.frame for the Cars93 data set.
R> ore.create(Cars93, table = "CARS93")
R> # List the OML4R proxy objects.
R> ore.ls()
[1] "AIRQUALITY" "CARS93" "DF1" "DF2_"
R> # Drop the CARS93 object.
R> ore.drop(table = "CARS93")
R> # List the OML4R proxy objects again.
R> ore.ls()
[1] "AIRQUALITY" "DF1_" "DF2"
Note:
A text query having more than 4000 characters or storing a value of over 4000 characters in a CLOB column will result in an error stating “ORA-01704: string literal too long”. Use a bind variable if the data is large as shown below. For more information on bind variables see ROracle.
library(ROracle)
options(error = expression(NULL))
Sys.setlocale(‘LC_ALL’, ‘C’)
cat(‘\n Welcome to ROracle(OCI) World\n’);
cat(‘\n DBI Version : ’);
print(packageVersion(‘DBI’));
cat(‘\n’);
#Creating table whose fields are of different type
createStr <- ‘create table TMRQORABND1_TAB(row_num number, id1 clob)’;
insStr <- ‘insert into TMRQORABND1_TAB values(:1, :2)’;
selStr <- ‘select * from TMRQORABND1_TAB order by row_num’;
y <- ‘1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef’;
z <- y
z <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
c32767 <- paste(z, z, z, z, z, z, z, z, y, y, y, y, y, y, y, y, y, y, y,
‘1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcde’,
sep = ‘’)
print(nchar(c32767))
c32766 <- paste(z, z, z, z, z, z, z, z, y, y, y, y, y, y, y, y, y, y, y,
‘1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcd’,
sep = ‘’)
print(nchar(c32766))
c32768 <- paste(z, z, z, z, z, z, z, z, y, y, y, y, y, y, y, y, y, y, y,
‘1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef’,
sep = ‘’)
print(nchar(c32768))
y <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
y1 <- y
y <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
y2 <- y
y <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
y3 <- y
y4 <- paste(y3, y3, y3, y3, y3)
r1c2 <- paste(y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y, y,
y, y, ‘1234567890abcdef1234567890abcdef’, sep = ‘’);
print(nchar(y));
drv <- dbDriver(‘Oracle’);
cat(‘\n ROracle driver allocated.\n’);
con <- dbConnect(drv,‘scott’,‘tiger’);
cat(‘\n One database connection object created.\n’);
#tryCatch(
#{
if (dbExistsTable(con, ‘TMRQORABND1_TAB’))
dbGetQuery(con,‘drop table TMRQORABND1_TAB’);
dbGetQuery(con, createStr);
cat(‘\nTable created with columns data type as raw(n) \n’);
x <- 1;
dbGetQuery(con, insStr, data.frame(x,r1c2));
dbCommit(con);
x <- c(2, 3, 4, 5, 6, 7, 8, 9, 10);
yy <- c(y1, y2, y3, z, y4, c32767, c32766, c32768, ‘’);
dbGetQuery(con, insStr, data.frame(x, yy));
dbCommit(con)
print(dbGetQuery(con, ‘select row_num, length(id1) from TMRQORABND1_TAB’));
x <- 100;
y <- paste(y, c32767, sep = ‘’);
dbGetQuery(con, insStr, data.frame(x,y));
dbCommit(con)
s <- dbSendQuery(con, selStr)
cinfo <- dbColumnInfo(s)
print(dbGetQuery(con, ‘select row_num, length(id1) from TMRQORABND1_TAB’));
res <- dbGetQuery(con, selStr)
if (res[,2][1] != r1c2) {
print(paste(‘Row’, res[,1][1], cinfo[,1][2], res[,2][1],
‘not equal to’, r1c2))
} else {
print(paste(‘Row’, res[,1][1], cinfo[,1][2], ‘length is’,
nchar(res[,2][1]),
‘length of data is’, nchar(r1c2)), sep = ‘’)
}
for (i in 2:9)
{
if (res[,2][i] != yy[i-1]) {
print(paste(‘Row’, res[,1][i], cinfo[,1][2], res[,2][i],
‘not equal to’, yy[i-1]))
} else {
print(paste(‘Row’, res[,1][i], cinfo[,1][2], ‘length is’,
nchar(res[,2][i]),
‘length of data is’, nchar(yy[i-1])), sep = ‘’)
}
}
if (!is.na(res[,2][10])) {
print(paste(‘Row’, res[,1][10], cinfo[,1][2], res[,2][10],
‘not equal to’, yy[9]))
} else {
print(paste(‘Row’, res[,1][10], cinfo[,1][2], ‘length is’,
nchar(res[,2][10]),
‘length of data is’, nchar(yy[9])), sep = ‘’)
}
if (res[,2][11] != y) {
print(paste(‘Row’, res[,1][11], cinfo[,1][2], res[,2][11],
‘not equal to’, y))
} else {
print(paste(‘Row’, res[,1][11], cinfo[,1][2], ‘length is’,
nchar(res[,2][11]),
‘length of data is’, nchar(y)), sep = ‘’)
}
#}, finally = {
dbGetQuery(con,‘drop table TMRQORABND1_TAB’);
cat(‘\n ROracle driver deallocated successfully.\n’);
cat(‘Releasing resources...‘);
dbDisconnect(con);
cat(‘\n Connection with database removed successfully.\n’);
dbUnloadDriver(drv);
cat(‘done\n’);
#}) # tryCatch()
Parent topic: Create and Manage R Objects in Oracle Database
2.2.5 Save and Manage R Objects in the Database
Oracle Machine Learning for R provides datastores that you can use to save OML4R proxy objects, as well as any R object, in an Oracle database.
You can grant or revoke read privilege access to a datastore for one or more users. You can restore the saved objects in another R session. The objects in a datastore are also accessible to embedded R execution through both the R and the SQL interfaces.
This section describes the OML4R functions that you can use to create and manage datastores. The section contains the following topics:
- About Persisting Oracle Machine Learning for R Objects
With OML4R datastores, you can save R objects in the database. - About OML4R Datastores
Each database schema has a table that stores named OML4R datastores. - Save Objects to a Datastore
Theore.save
function saves one or more R objects in the specified datastore. - Control Access to Datastores
With theore.grant
andore.revoke
functions you can grant or revoke access to an OML4R datastore. - Get Information about Datastore Contents
You can get information about a datastore in the current user schema by using theore.datastore
andore.datastoreSummary
functions. - Restore Objects from a Datastore
Theore.load
function restores R objects saved in a datastore to the R global environment,.GlobalEnv
. - Delete a Datastore
With theore.delete
function, you can delete objects from an OML4R datastore or you can delete the datastore itself. - About Using a Datastore in Embedded R Execution
Saving objects in a datastore makes it very easy to pass arguments to, and reference R objects with, embedded R execution functions.
Parent topic: Create and Manage R Objects in Oracle Database
2.2.5.1 About Persisting Oracle Machine Learning for R Objects
With OML4R datastores, you can save R objects in the database.
R objects, including OML4R proxy objects, exist for the duration of the current R session unless you explicitly save them. The standard R functions for saving and restoring R objects, save
and load
, serialize objects in R memory to store them in a file and deserialize them to restore them in memory. However, for OML4R proxy objects, those functions do not save the database objects associated with the proxy objects in an Oracle database; therefore the saved proxy objects do not behave properly in a different R session.
You can save OML4R proxy objects, as well as any R object, with the ore.save
function. The ore.save
function specifies an OML4R datastore. A datastore persists in the database when you end the R session. The datastore maintains the referential integrity of the objects it contains. Using the ore.load
function, you can restore in another R session the objects in the datastore.
Using a datastore, you can do the following:
-
Save OML4R and other R objects that you create in one R session and restore them in another R session.
-
Pass arguments to R functions for use in embedded R execution.
-
Pass objects for use in embedded R execution. You could, for example, use a function in the
OREdm
package to build an Oracle Machine Learning for SQL model and save it in a datastore. You could then use that model to score data in the database through embedded R execution. For an example of using a datastore in an embedded R execution function, see Example 6-10.
The following table lists the functions that manipulate datastores and provides brief descriptions of them.
Table 2-1 Functions that Manipulate Datastores
Function | Description |
---|---|
|
Lists information about a datastore in the current Oracle database schema. |
|
Provides detailed information about the specified datastore in the current Oracle database schema. |
|
Deletes a datastore from the current Oracle database schema. |
ore.grant |
Grants read access to a datastore. |
|
Lazily restores objects from a datastore into an R environment. |
|
Restores objects from a datastore into an R environment. |
ore.revoke |
Revokes read access to a datastore. |
|
Saves R objects in a new or existing datastore. |
See Also:
"Using Oracle R Enterprise Embedded R Execution" for information on using the R and the SQL interfaces to embedded R execution
Parent topic: Save and Manage R Objects in the Database
2.2.5.2 About OML4R Datastores
Each database schema has a table that stores named OML4R datastores.
A datastore can contain OML4R objects and standard R objects.
You create a datastore with the ore.save
function. When you create a datastore, you specify a name for it. You can save objects in one or more datastores.
As long as a datastore contains an OML4R proxy object for a database object, the database object persists between R sessions. For example, you could use the ore.odmNB
function in the OREdm
package to build an Oracle Machine Learning for SQL Naive Bayes model. If you save the resulting ore.odmNB
object in a datastore and end the R session, then Oracle Database does not delete the OML4SQL model. If no datastore contains the ore.odmNB
object and the R session ends, then the database automatically drops the model.
Parent topic: Save and Manage R Objects in the Database
2.2.5.3 Save Objects to a Datastore
The ore.save
function saves one or more R objects in the specified datastore.
By default, OML4R creates the datastore in the current user schema. With the arguments to ore.save
, you can provide the names of specific objects, or provide a list of objects. You can specify whether read privilege access to the datastore can be granted to other users. You can specify a particular R environment to search for the objects you would like to save. The overwrite
and append
arguments are mutually exclusive. If you set the overwrite
argument to TRUE
, then you can replace an existing datastore with another datastore of the same name. If you set the append
argument to TRUE
, then you can add objects to an existing datastore. With the description
argument, you can provide some descriptive text that appears when you get information about the datastore. The description
argument has no effect when used with the append
argument.
Example 2-18 Saving Objects and Creating a Datastore
This example demonstrates creating datastores using different combinations of arguments.
# Create some R objects.
df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
iris_of <- ore.push(iris)
# Create a database table and an OML4R proxy object for the table.
ore.drop("AIRQUALITY")
ore.create(airquality, table = "AIRQUALITY")
# List the R objects.
ls()
# List the OML4R proxy objects.
ore.ls()
# Save the proxy object and all objects in the current workspace environment
# to the datastore named ds1 and supply a description.
ore.save(AIRQUALITY, list = ls(), name = "ds1", description = "My private datastore")
# Create some more objects.
x <- stats::runif(20) # x is an object of type numeric.
y <- list(a = 1, b = TRUE, c = "hoopsa")
z <- ore.push(x) # z is an object of type ore.numeric.
# Create another datastore.
ore.save(x, y, name = "ds2", description = "x and y")
# Overwrite the contents of datastore ds2.
ore.save(x, name = "ds2", overwrite = TRUE, description = "only x")
# Append object z to datastore ds2.
ore.save(z, name = "ds2", append = TRUE)
Listing for This Example
R> # Create some R objects.
R> df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
R> df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
R> iris_of <- ore.push(iris)
R>
R> # Create a database table and an OML4R proxy object for the table.
R> ore.drop("AIRQUALITY")
R> ore.create(airquality, table = "AIRQUALITY")
R>
R> # List the R objects.
R> ls()
[1] "df1" "df2" "iris_of"
R>
R> # List the OML4R proxy objects.
R> ore.ls()
[1] "AIRQUALITY"
R>
R> # Save the proxy object and all objects in the current workspace environment
R> # to the datastore named ds1 and supply a description.
R> ore.save(AIRQUALITY, list = ls(), name = "ds1", description = "My datastore")
R>
R> # Create some more objects.
R> x <- stats::runif(20) # x is an object of type numeric.
R> y <- list(a = 1, b = TRUE, c = "hoopsa")
R> z <- ore.push(x) # z is an object of type ore.numeric.
R>
R> # Create another datastore.
R> ore.save(x, y, name = "ds2", description = "x and y")
R>
R> # Overwrite the contents of datastore ds2.
R> ore.save(x, name = "ds2", overwrite = TRUE, description = "only x")
R>
R> # Append object z to datastore ds2.
R> ore.save(z, name = "ds2", append = TRUE)
Parent topic: Save and Manage R Objects in the Database
2.2.5.4 Control Access to Datastores
With the ore.grant
and ore.revoke
functions you can grant or revoke access to an OML4R datastore.
With the ore.grant
and ore.revoke
functions, you can control access to datastores. You can grant read access to a specified user to a datastore that you own or revoke the access privilege. The functions ore.save
, ore.load
, ore.datastore
, and ore.datastoreSummary
have arguments related to the accessibility of datastores.
Note:
If you use ore.create
to create a persistent database table and its proxy ore.frame
object, then save the proxy ore.frame
object in a grantable datastore, and then use ore.grant
to grant read privilege access to the datastore, the access applies only to the ore.frame
object. The read access does not extend to the persistent database table. To grant read permission to the table itself, you must execute an appropriate SQL command.
Example 2-19 Granting and Revoking Access to a Datastore
This example pushes the airquality data set from the local R session to the Oracle database, where it exists as the ore.frame
object AIRQUALITY
and as a temporary database table with the same name. The example then saves the AIRQUALITY
object to the datastore ds3
and specifies that access to the datastore can be granted to other users. It invokes function ore.datastore
with type = grantable
to display all of the datastores to which read access has been granted. It grants the read privilege for the ds3
datastore to SCOTT. It then invokes ore.datastore
with type = grant
to display the datastores to which read access has been granted. It revokes the read privilege for SCOTT, and again displays the datastores to which access has been granted.
AIRQUALITY <- ore.push(airquality)
ore.save(AIRQUALITY, name = "ds3",
description = "My datastore 3", grantable = TRUE)
ore.datastore(type = "grantable")
ore.datastore(type = "grant")
ore.grant("ds3", type = "datastore", user = "SCOTT")
ore.datastore(type = "grant")
ore.revoke("ds3", type = "datastore", user = "SCOTT")
ore.datastore(type = "grant")
Listing for This Example
R> AIRQUALITY <- ore.push(airquality)
R> ore.save(AIRQUALITY, name = "ds3",
+ description = "My datastore 3", grantable = TRUE)
R> ore.datastore(type = "grantable")
datastore.name object.count size creation.date description
1 ds3 1 1451 2015-11-30 18:48:25 My datastore 3
R> ore.datastore(type = "grant")
[1] datastore.name grantee
<0 rows> (or 0-length row.names)
R> ore.grant("ds3", type = "datastore", user = "SCOTT")
R> ore.datastore(type = "grant")
datastore.name grantee
1 ds3 SCOTT
R> ore.revoke("ds3", type = "datastore", user = "SCOTT")
R> ore.datastore(type = "grant")
[1] datastore.name grantee
<0 rows> (or 0-length row.names)
Parent topic: Save and Manage R Objects in the Database
2.2.5.5 Get Information about Datastore Contents
You can get information about a datastore in the current user schema by using the ore.datastore
and ore.datastoreSummary
functions.
Using the ore.datastore
function, you can list basic information about datastores. To get information about a specific type of datastore, you can use the optional character string type
argument. The valid values for type
are the following:
-
user
, which lists the datastores created by current session user. This is the default value. -
private
, which lists the datastores for which read access cannot be granted by the current session user to other users. -
all
, which lists all of the datastores to which the current session user has read access. -
grantable
, which lists the datastores the read privilege for which can be granted by the current session user to other users. -
grant
, which lists the datastores the read privilege for which has been granted by the current session user to other users. -
granted
, which lists the datastores the read privilege for which has been granted by other users to the current session user.
If you do not specify a type, then function ore.datastore
returns a data.frame
object with columns that correspond to the datastore name, the number of objects in the datastore, the datastore size, the creation date, and a description. Rows are sorted by column datastore.name
in alphabetical order. If you do specify a type, then the function returns a data.frame
that has a column for the specified type.
You can search for a datastore by name or by using a regular expression pattern.
The ore.datastoreSummary
function returns information about the R objects saved within a datastore in the user schema in the connected database. The function returns a data.frame
with columns that correspond to object name, object class, object size, and either the length of the object, if it is a vector
, or the number of rows and columns, if it is a data.frame
object. It takes one required argument, the name of a datastore, and has an optional argument, the owner of the datastore.
Example 2-20 Using the ore.datastore Function
This example demonstrates using the ore.datastore
function.
# Create some R objects.df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
iris_of <- ore.push(iris)
# Create a database table and an OML4R proxy object for the table.
ore.drop("AIRQUALITY")
ore.create(airquality, table = "AIRQUALITY")
# Save the objects to a datastore named ds1 and supply a description.
ore.save(AIRQUALITY, list = ls(), name = "ds1", description = "My private datastore")
# Create some more objects.
x <- stats::runif(20) # x is an object of type numeric.
y <- list(a = 1, b = TRUE, c = "hoopsa")
z <- ore.push(x) # z is an object of type ore.numeric.
# Create other datastores.
ore.save(x, y, name = "ds2", description = "x and y")
ore.save(df1, df2, name = "dfobj", description = "df objects")
ore.save(x, y, z, name = "another_ds", description = "For pattern matching")
# List all of the datastore objects.
ore.datastore()
# List the specified datastore.
ore.datastore("ds1")
# List the datastore objects with names that include "ds".
ore.datastore(pattern = "ds")
Listing for This Example
R> # Create some R objects.
R> df1 <- data.frame(x1 = 1:5, y1 = letters[1:5])
R> df2 <- data.frame(x2 = 5:1, y2 = letters[11:15])
R> iris_of <- ore.push(iris)
R>
R> # Create a database table and an OML4R proxy object for the table.
R> ore.drop("AIRQUALITY")
R> ore.create(airquality, table = "AIRQUALITY")
R>
R> # Save the objects to a datastore named ds1 and supply a description.
R> ore.save(AIRQUALITY, list = ls(), name = "ds1", description = "My private datastore")
R>
R> # Create some more objects.
R> x <- stats::runif(20) # x is an object of type numeric.
R> y <- list(a = 1, b = TRUE, c = "hoopsa")
R> z <- ore.push(x) # z is an object of type ore.numeric.
R>
R> # Create other datastores.
R> ore.save(x, y, name = "ds2", description = "x and y")
R> ore.save(df1, df2, name = "dfobj", description = "df objects")
R> ore.save(x, y, z, name = "another_ds", description = "For pattern matching")
R>
R> # List all of the datastore objects.
R> ore.datastore()
datastore.name object.count size creation.date description
1 another_ds 3 1284 2017-04-21 16:08:57 For pattern matching
2 dfobj 2 656 2017-04-21 16:08:38 df objects
3 ds1 4 3439 2017-04-21 16:03:55 My private datastore
4 ds2 2 314 2017-04-21 16:04:32 x and y
R> # List the specified datastore.
R> ore.datastore("ds1")
datastore.name object.count size creation.date description
1 ds1 4 3439 2017-04-21 16:03:55 My private datastore
R> # List the datastore objects with names that include "ds".
R> ore.datastore(pattern = "ds")
datastore.name object.count size creation.date description
1 another_ds 3 1284 2017-04-21 16:08:57 For pattern matching
2 ds1 4 3439 2017-04-21 16:03:55 My private datastore
3 ds2 2 314 2017-04-21 16:04:32 x and y
Example 2-21 Using the ore.datastoreSummary Function
This example demonstrates using the ore.datastoreSummary
function. The example uses the datastores created in the previous example.
ore.datastoreSummary("ds1")
ore.datastoreSummary("ds2")
Listing for This Example
R> ore.datastoreSummary("ds1")
object.name class size length row.count col.count
1 AIRQUALITY ore.frame 1213 6 NA 6
2 df1 data.frame 328 2 5 2
3 df2 data.frame 328 2 5 2
4 iris_of ore.frame 1570 5 NA 5
R> ore.datastoreSummary("ds2")
object.name class size length row.count col.count
1 x numeric 182 20 NA NA
2 y list 132 3 NA NA
Parent topic: Save and Manage R Objects in the Database
2.2.5.6 Restore Objects from a Datastore
The ore.load
function restores R objects saved in a datastore to the R global environment, .GlobalEnv
.
The function returns a character vector that contains the names of the restored objects.
You can load all of the saved objects or you can use the list
argument to specify the objects to load. With the envir
argument, you can specify an environment in which to load objects.
Example 2-22 Using the ore.load Function to Restore Objects from a Datastore
This example demonstrates using the ore.load
function to restore objects from datastores that were created in Example 2-20. The example runs in the same R session as that example.
# List the R objects. ls() # List the datastores. ore.datastore() # Delete the x and z objects. rm(x, z) ls() # Restore all of the objects in datastore ds2. ore.load("ds2") ls() # After ending the R session and starting another session. ls() # The datastore objects persist between sessions. ore.datastore() # Restore some of the objects from datastore ds1. ore.load("ds1", list = c("df1", "df2", "iris_of")) ls()Listing for Example 2-22
R> # List the R objects. R> ls() [1] "df1" "df2" "iris_of" "x" "y" "z" R> R> # List the datastores. R> ore.datastore() datastore.name object.count size creation.date description 1 another_ds 3 1243 2014-07-24 13:31:56 For pattern mattching 2 dfobj 2 656 2014-07-24 13:31:46 df objects 3 ds1 4 3162 2014-07-24 13:25:17 My datastore 4 ds2 2 1111 2014-07-24 13:27:26 only x R> R> # Delete the x and z objects. R> rm(x, z) R> ls() [1] "df1" "df2" "iris_of" "y" R> R> # Restore all of the objects in datastore ds2. R> ore.load("ds2") [1] "x" "z" R> R> ls() [1] "df1" "df2" "iris_of" "x" "y" "z" R> R> # After ending the R session and starting another session. R> ls() character(0) R> # The datastore objects persist between sessions. R> ore.datastore() datastore.name object.count size creation.date description 1 another_ds 3 1243 2014-07-24 13:31:56 For pattern mattching 2 dfobj 2 656 2014-07-24 13:31:46 df objects 3 ds1 4 3162 2014-07-24 13:25:17 My datastore 4 ds2 2 1111 2014-07-24 13:27:26 only x R> # Restore some of the objects from datastore ds1. R> ore.load("ds1", list = c("df1", "df2", "iris_of")) [1] "df1" "df2" "iris_of" R> ls() [1] "df1" "df2" "iris_of"
Parent topic: Save and Manage R Objects in the Database
2.2.5.7 Delete a Datastore
With the ore.delete
function, you can delete objects from an OML4R datastore or you can delete the datastore itself.
To delete a datastore, you specify the name of it. To delete one or more objects from the datastore, you specify the list
argument. The ore.delete
function returns the name of the deleted objects or datastore.
When you delete a datastore, OML4R discards all temporary database objects that were referenced by R objects in the deleted datastore. If you have saved an R object in more than one datastore, then OML4R discards a temporary database object only when no object in a datastore references the temporary database object.
Example 2-23 Using the ore.delete Function
This example demonstrates using ore.delete
to delete an object from a datastore and then to delete the entire datastore. The example uses objects created in Example 2-18.
# Delete the df2 object from the ds1 datastore. ore.delete("ds1", list = "df2") # Delete the datastore named ds1. ore.delete("ds1")Listing for Example 2-23
R> # Delete the df2 object from the ds1 datastore. R> ore.delete("ds1", list = "df2")[1] "df2" R> # Delete the datastore named ds1. R> ore.delete("ds1") [1] "ds1"
Parent topic: Save and Manage R Objects in the Database
2.2.5.8 About Using a Datastore in Embedded R Execution
Saving objects in a datastore makes it very easy to pass arguments to, and reference R objects with, embedded R execution functions.
You can save objects that you create in one R session in a single datastore in the database. You can pass the name of this datastore to an embedded R function as an argument for loading within that function. You can use a datastore to easily pass one object or multiple objects.
Parent topic: Save and Manage R Objects in the Database