hdfs.get

Copies data from HDFS into a data frame in the local R environment. If the data originated in an R environment, all metadata is extracted and all attributes, such as column names and data types, are restored. Otherwise, generic column names such as val1 and val2 are assigned.

Usage

hdfs.get(
        dfs.id,
        sep)

Arguments

dfs.id

The name of a file in HDFS. The file name can include a path that is either absolute or relative to the current path.

sep

The symbol used to separate fields in the file. A comma (,) is the default separator.

Usage Notes

If the HDFS file is small enough to fit into an in-memory R data frame, then you can copy the file using this function instead of the hdfs.pull function. The hdfs.get function can be faster, because it does not use Sqoop and thus does not have the overhead incurred by hdfs.pull.

Return Value

A data.frame object in the memory of the local R environment that contains the exported data set, or NULL if the operation failed.
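
Because a failed copy yields NULL rather than an error, it can help to check the result before using it. The following is a minimal sketch, not part of the connector: get_or_stop is a hypothetical helper, and its getter parameter exists only so that the sketch can be tried with a stand-in function when the connector is not loaded.

```r
# Hypothetical helper (not part of the connector): stop with an error
# when the copy from HDFS fails instead of silently carrying a NULL forward.
# 'getter' defaults to hdfs.get, which requires the connector to be loaded;
# the default is only looked up when the helper is called without a getter.
get_or_stop <- function(dfs.id, sep = ",", getter = hdfs.get) {
  df <- getter(dfs.id, sep)
  if (is.null(df)) {
    stop("hdfs.get returned NULL: the copy from HDFS failed")
  }
  df
}
```

With the connector loaded, a call such as get_or_stop(res) would return the data frame or stop with an error; res here stands for whatever HDFS identifier you would otherwise pass to hdfs.get.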

Example

This example copies the HDFS data identified by res into a data frame and prints it. The columns have the generic names val1 and val2.

R> print(hdfs.get(res))
  val1      val2
1   AA 1361.4643
2   AS  515.8000
3   CO 2507.2857
4   DL 1601.6154
5   HP  549.4286
6   NW 2009.7273
7   TW 1906.0000
8   UA 1134.0821
9   US 2387.5000
10  WN  541.1538
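
When the result carries the generic names val1 and val2, you may want to assign descriptive column names before further analysis. The sketch below uses base R only and a small stand-in data frame built from the first rows of the output above, so it runs without HDFS access. The names code and value are hypothetical placeholders, not names defined by the connector.

```r
# Stand-in for the first rows of the printed result (no HDFS access needed).
df <- data.frame(val1 = c("AA", "AS", "CO"),
                 val2 = c(1361.4643, 515.8000, 2507.2857),
                 stringsAsFactors = FALSE)

# Replace the generic names with descriptive ones.
# "code" and "value" are hypothetical; choose names that match your data.
names(df) <- c("code", "value")

print(df)
```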