Oracle R Enterprise provides the ability to create ordered or unordered ore.frame
objects. The following topics describe this feature.
R objects such as vector
and data.frame
have an implicit ordering of their elements. The data in an Oracle Database table is not necessarily ordered. For some R operations, ordering is useful whereas for other operations it is unnecessary. By ordering an ore.frame
, you are able to index the ore.frame
object by using either integer or character indexes.
Using an ordered ore.frame
object that is a proxy for a SQL query can be time-consuming for a large data set. Therefore, although Oracle R Enterprise attempts to create ordered ore.frame
objects by default, it also provides the means of creating an unordered ore.frame
object.
When you invoke the ore.sync
function to create an Oracle R Enterprise ore.frame
object as a proxy for a SQL query, you can use the use.keys
argument to specify whether the ore.frame
can be ordered or must be unordered.
An ore.frame
object can be ordered if one or more of the following conditions are true:
The value of the use.keys
argument of the ore.sync
function is TRUE
and a primary key is defined on the underlying table
The row names of the ore.frame
constitute a unique tuple
The ore.frame
object is produced by certain functions such as aggregate
and cbind
All of the ore.frame
objects that are input arguments to relevant Oracle R Enterprise functions are ordered
An ore.frame
object is unordered if one or more of the following conditions are true:
The value of the use.keys
argument of the ore.sync
function is FALSE
No primary key is defined on the underlying table and either the row names of the ore.frame
object are not specified or the row names of the ore.frame
object are set to NULL
One or more of the ore.frame
objects that are input arguments to relevant Oracle R Enterprise functions are unordered
An unordered ore.frame
object has null row names. You can determine whether an ore.frame
object is ordered by invoking is.null
on the row names of the objects, as shown in the last lines of Example 2-13. If the ore.frame
object is unordered, is.null
returns an error.
See Also:
Oracle R Enterprise has options that relate to the ordering of an ore.frame
object. The ore.warn.order
global option specifies whether you want Oracle R Enterprise to display a warning message if you use an unordered ore.frame
object in a function that requires ordering. If you know what to expect in an operation, then you might want to turn the warnings off so they do not appear in the output. For examples of the warning messages, see Example 2-13 and Example 2-14.
You can see what the current setting is, or turn the option on or off, as in the following example.
R> options("ore.warn.order") $ore.warn.order [1] TRUE R> options("ore.warn.order" = FALSE) R> options("ore.warn.order" = TRUE)
With the ore.sep
option, you can specify the separator between the row name values that you use for multi-column keys, as in the following example.
R> options("ore.sep") $ore.sep [1] "|" R> options("ore.sep" = "/") R> options("ore.sep" = "|")
You can use the primary key of a database table to order an ore.frame
object.
The following example loads the spam data set from the kernlab
package. It adds two columns to the data set.
The example invokes ore.drop
to drop the named tables if they exist. It then invokes ore.create
to create two tables from the data set. It invokes ore.exec
to make the USERID and TS columns a composite primary key of the SPAM_PK table, and invokes ore.sync
to synchronize the table with its ore.frame
proxy.
Note:
The ore.exec
function executes a SQL statement in the Oracle Database schema. The function is intended for database definition language (DDL) statements that have no return value.
The example then displays the first eight rows of each table. The proxy object for the SPAM_PK table is an ordered ore.frame
object. It has row names that are a combination of the TS and USERID column values separated by the "|" character. The proxy object for the SPAM_NOPK table is an unordered ore.frame
object that has the symbol SPAM_NOPK
. By default, SPAM_NOPK
has row names that are sequential numbers.
Example 2-13 Ordering Using Keys
# Prepare the data. library(kernlab) data(spam) s <- spam # Create a column that has integer values. s$TS <- 1001:(1000 + nrow(s)) # Create a column that has integer values with each number repeated twice. s$USERID <- rep(351:400, each=2, len=nrow(s)) # Ensure that the database tables do not exist. ore.drop(table='SPAM_PK') ore.drop(table='SPAM_NOPK') # Create database tables. ore.create(s[,c(59:60,1:28)], table="SPAM_PK") ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key (\"USERID\",\"TS\")") # Synchronize the table to get the change to it. ore.sync(table = "SPAM_PK") # View the data in the tables. # The row names of the ordered SPAM_PK are the primary key column values. head(SPAM_PK[,1:8]) # The row names of the unordered SPAM_NOPK are sequential numbers. # The first warning results from the inner accessing of SPAM_NOPK to subset # the columns. The second warning is for the invocation of the head # function on that subset. head(SPAM_NOPK[,1:8]) # Verify that SPAM_NOPK is unordered. is.null(row.names(SPAM_NOPK))Listing for This Example
R> # Prepare the data. R> library(kernlab) R> data(spam) R> s <- spam R> # Create a column that has integer values. R> s$TS <- 1001:(1000 + nrow(s)) R> # Create a column that has integer values with each number repeated twice. R> s$USERID <- rep(351:400, each=2, len=nrow(s)) R> # Ensure that the database tables do not exist. R> ore.drop(table='SPAM_PK') R> ore.drop(table='SPAM_NOPK') R> # Create database tables. R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK") R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") R> # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key + (\"USERID\",\"TS\")") R> # Synchronize the table to get the change to it. R> ore.sync(table = "SPAM_PK") R> # View the data in the tables. R> # The row names of the ordered SPAM_PK are the primary key column values. R> head(SPAM_PK[,1:8]) TS USERID make address all num3d our over 1001|351 1001 351 0.00 0.64 0.64 0 0.32 0.00 1002|351 1002 351 0.21 0.28 0.50 0 0.14 0.28 1003|352 1003 352 0.06 0.00 0.71 0 1.23 0.19 1004|352 1004 352 0.00 0.00 0.00 0 0.63 0.00 1005|353 1005 353 0.00 0.00 0.00 0 0.63 0.00 1006|353 1006 353 0.00 0.00 0.00 0 1.85 0.00 R> # The row names of the unordered SPAM_NOPK are sequential numbers. R> # The first warning results from the inner accessing of SPAM_NOPK to subset R> # the columns. The second warning is for the invocation of the head R> # function on that subset. R> head(SPAM_NOPK[,1:8]) TS USERID make address all num3d our over 1 1001 351 0.00 0.64 0.64 0 0.32 0.00 2 1002 351 0.21 0.28 0.50 0 0.14 0.28 3 1003 352 0.06 0.00 0.71 0 1.23 0.19 4 1004 352 0.00 0.00 0.00 0 0.63 0.00 5 1005 353 0.00 0.00 0.00 0 0.63 0.00 6 1006 353 0.00 0.00 0.00 0 1.85 0.00 Warning messages: 1: ORE object has no unique key - using random order 2: ORE object has no unique key - using random order R> # Verify that SPAM_NOPK is unordered. R> is.null(row.names(SPAM_NOPK)) Error: ORE object has no unique key
You can use row names to order an ore.frame
object.
The following example creates a data.frame
object in the local R session memory and pushes it to the ore.frame
object with the symbol a
, which exists in the memory of the Oracle database to which the R session is connected. The example shows that the ore.frame
object has the default row names of the R data.frame
object. Because the ore.frame
object is ordered, invoking the row.names
function on it does not produce a warning message.
The example uses the ordered SPAM_PK
and unordered SPAM_NOPK
ore.frame
objects to show that invoking row.names
on the unordered SPAM_NOPK
produces a warning message but invoking it on the ordered SPAM_PK
does not.
The SPAM_PK
object is ordered by the row names, which are the combined values of the TS and USERID column values separated by the "|" character. The example shows that you can change the row names.
Example 2-14 Ordering Using Row Names
# Prepare the data. library(kernlab) data(spam) s <- spam # Create a column that has integer values. s$TS <- 1001:(1000 + nrow(s)) # Create a column that has integer values with each number repeated twice. s$USERID <- rep(351:400, each=2, len=nrow(s)) # Ensure that the database tables do not exist. ore.drop(table='SPAM_PK') ore.drop(table='SPAM_NOPK') # Create database tables. ore.create(s[,c(59:60,1:28)], table="SPAM_PK") ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key (\"USERID\",\"TS\")") # Synchronize the table to get the change to it. ore.sync(table = "SPAM_PK") # Create an ordered ore.frame by default. a <- ore.push(data.frame(a=c(1:10,10:1), b=letters[c(1:10,10:1)])) # Display the values in the b column. Note that because the ore.frame is # ordered, no warnings appear. a$b # Display the default row names for the first six rows of the a column. row.names(head(a)) # SPAM_NOPK has no unique key, so row.names raises error messages. row.names(head(SPAM_NOPK)) # Row names consist of TS ‘|' USERID. # For display on this page, only the first four row names are shown. row.names(head(SPAM_PK)) # Reassign the row names to the TS column only row.names(SPAM_PK) <- SPAM_PK$TS # The row names now correspond to the TS values only. row.names(head(SPAM_PK[,1:4])) head(SPAM_PK[,1:4])Listing for This Example
R> # Prepare the data. R> library(kernlab) R> data(spam) R> s <- spam R> # Create a column that has integer values. R> s$TS <- 1001:(1000 + nrow(s)) R> # Create a column that has integer values with each number repeated twice. R> s$USERID <- rep(351:400, each=2, len=nrow(s)) R> # Ensure that the database tables do not exist. R> ore.drop(table='SPAM_PK') R> ore.drop(table='SPAM_NOPK') R> # Create database tables. R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK") R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") R> # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key + (\"USERID\",\"TS\")") R> # Synchronize the table to get the change to it. R> ore.sync(table = "SPAM_PK") R> # Create an ordered ore.frame by default. R> a <- ore.push(data.frame(a=c(1:10,10:1), b=letters[c(1:10,10:1)])) R> # Display the values in the b column. Note that because the ore.frame is R> # ordered, no warnings appear. R> a$b [1] a b c d e f g h i j j i h g f e d c b aLevels: a b c d e f g h i j R> # Display the default row names for the first six rows of the a column. R> row.names(head(a)) [1] 1 2 3 4 5 6 R> # SPAM_NOPK has no unique key, so row.names raises error messages. R> row.names(head(SPAM_NOPK)) Error: ORE object has no unique key In addition: Warning message: ORE object has no unique key - using random order R> # Row names consist of TS ‘|' USERID. R> # For display on this page, only the first four row names are shown. R> row.names(head(SPAM_PK)) 1001|351 1002|351 1003|352 1004|352 "1001|3.51E+002" "1002|3.51E+002" "1003|3.52E+002" "1004|3.52E+002" R> # Reassign the row names to the TS column only R> row.names(SPAM_PK) <- SPAM_PK$TS R> # The row names now correspond to the TS values only. R> row.names(head(SPAM_PK[,1:4])) [1] 1001 1002 1003 1004 1005 1006 R> head(SPAM_PK[,1:4]) TS USERID make address 1001 1001 351 0.00 0.64 1002 1002 351 0.21 0.28 1003 1003 352 0.06 0.00 1004 1004 352 0.00 0.00 1005 1005 353 0.00 0.00 1006 1006 353 0.00 0.00
This example shows the result of merging two ordered ore.frame
objects and two unordered ore.frame
objects.
Example 2-15 Merging Ordered and Unordered ore.frame Objects
# Prepare the data. library(kernlab) data(spam) s <- spam # Create a column that has integer values. s$TS <- 1001:(1000 + nrow(s)) # Create a column that has integer values with each number repeated twice. s$USERID <- rep(351:400, each=2, len=nrow(s)) # Ensure that the database tables do not exist. ore.drop(table='SPAM_PK') ore.drop(table='SPAM_NOPK') # Create database tables. ore.create(s[,c(59:60,1:28)], table="SPAM_PK") ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") # Using a SQL statement, alter the SPAM_PK table to add a composite primary key. ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key (\"USERID\",\"TS\")") # Synchronize the table to get the change to it. ore.sync(table = "SPAM_PK") # Create objects for merging data from unordered ore.frame objects. x <- SPAM_NOPK[,1:4] y <- SPAM_NOPK[,c(1,2,4,5)] m1 <- merge(x, y, by="USERID") # The merged result m1 produces a warning because it is not an ordered frame. head(m1,3) # Create objects for merging data from ordered ore.frame objects. x <- SPAM_PK[,1:4] y <- SPAM_PK[,c(1,2,4,5)] # The merged result m1 does not produce a warning now because it is an # ordered frame. m1 <- merge(x, y, by="USERID") head(m1,3)Listing for This Example
R> # Prepare the data. R> library(kernlab) R> data(spam) R> s <- spam R> # Create a column that has integer values. R> s$TS <- 1001:(1000 + nrow(s)) R> # Create a column that has integer values with each number repeated twice. R> s$USERID <- rep(351:400, each=2, len=nrow(s)) R> # Ensure that the database tables do not exist. R> ore.drop(table='SPAM_PK') R> ore.drop(table='SPAM_NOPK') R> # Create database tables. R> ore.create(s[,c(59:60,1:28)], table="SPAM_PK") R> ore.create(s[,c(59:60,1:28)], table="SPAM_NOPK") R> # Uing a SQL statement, alter the SPAM_PK table to add a composite primary key. R> ore.exec("alter table SPAM_PK add constraint SPAM_PK primary key + (\"USERID\",\"TS\")") R> # Synchronize the table to get the change to it. R> ore.sync(table = "SPAM_PK") R> # Create objects for merging data from unordered ore.frame objects. R> x <- SPAM_NOPK[,1:4] R> y <- SPAM_NOPK[,c(1,2,4,5)] R> m1 <- merge(x, y, by="USERID") R> # The merged result m1 produces a warning because it is not an ordered frame. R> head(m1,3) USERID TS.x make address.x TS.y address.y all 1 351 5601 0.00 0 1001 0.64 0.64 2 351 5502 0.00 0 1001 0.64 0.64 3 351 5501 0.78 0 1001 0.64 0.64 Warning messages: 1: ORE object has no unique key - using random order 2: ORE object has no unique key - using random order R> # Create objects for merging data from ordered ore.frame objects. R> x <- SPAM_PK[,1:4] R> y <- SPAM_PK[,c(1,2,4,5)] R> # The merged result m1 does not produce a warning now because it is an R> # ordered frame. R> m1 <- merge(x, y, by="USERID") R> head(m1,3) USERID TS.x make address.x TS.y address.y all 1001|1001 351 1001 0 0.64 1001 0.64 0.64 1001|1002 351 1001 0 0.64 1002 0.28 0.50 1001|1101 351 1001 0 0.64 1101 0.00 0.00