Copies data from an Oracle database to HDFS.
This operation requires authentication by Oracle Database. See orch.connect.
An ore.frame object with the data in an Oracle database to be pushed.
The index or name of the key column.
Unique name for the object in HDFS.
TRUE to allow dfs.name to overwrite an object with the same name, or FALSE to signal an error (default).
Identifies the driver used to copy the data. This argument is currently ignored because Sqoop is the only supported driver.
The column to use for data partitioning (required).
Because this operation is synchronous, copying a large data set may take a while. The prompt reappears and you regain use of R when copying is complete.
An ore.frame object is an Oracle R Enterprise metadata object that points to a database table. It corresponds to an R data.frame object.
If you omit the split.by argument, then hdfs.push might import only a portion of the data into HDFS.
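Because omitting split.by can silently produce a partial copy, one option is to wrap the call in a guard that stops before any data is copied. The following is a minimal sketch, not part of the connector: push_with_split is a hypothetical helper, and its push parameter exists only so the check can be exercised without a database connection. It assumes hdfs.push accepts the key, dfs.name, and split.by arguments by name, as in the example in this section.

```r
# Hypothetical helper (not part of the connector): refuse to push when
# split.by is missing, instead of risking a partial copy.
# `push` defaults to hdfs.push and is only looked up when actually called,
# so this definition loads even where the connector is not installed.
push_with_split <- function(x, dfs.name, split.by = NULL, key = NULL,
                            push = hdfs.push) {
  if (is.null(split.by) || !nzchar(split.by)) {
    stop("split.by is required; omitting it may copy only part of the data")
  }
  # Forward the arguments by name to the real (or substituted) push function.
  push(x, key = key, dfs.name = dfs.name, split.by = split.by)
}
```

In a connected session, a call such as push_with_split(ontime_s2000, dfs.name='ontime2000_DB', split.by='YEAR', key='DEST') would forward to hdfs.push unchanged, while a call without split.by fails immediately.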
This example creates an ore.frame object named ontime_s2000 that contains the rows from the ONTIME_S database table where the year equals 2000. Then hdfs.push uses ontime_s2000 to create /user/oracle/xq/ontime2000_DB in HDFS.
R> ontime_s2000 <- ONTIME_S[ONTIME_S$YEAR == 2000,]
R> class(ontime_s2000)
[1] "ore.frame"
attr(,"package")
[1] "OREbase"
R> ontime2000.dfs <- hdfs.push(ontime_s2000, key='DEST', dfs.name='ontime2000_DB', split.by='YEAR')
R> ontime2000.dfs
[1] "/user/oracle/xq/ontime2000_DB"
attr(,"dfs.id")
[1] TRUE