Changes in This Release for Oracle Big Data Connectors User's Guide

This preface lists changes in the Oracle Big Data Connectors User's Guide.

Changes in Oracle Big Data Connectors Release 2 (2.0)

The following are changes in Oracle Big Data Connectors User's Guide for Oracle Big Data Connectors Release 2 (2.0).

New Features

Oracle Big Data Connectors support Cloudera's Distribution including Apache Hadoop version 4 (CDH4). For other supported platforms, see the individual connectors in Chapter 1.

The name of Oracle Direct Connector for Hadoop Distributed File System changed to Oracle SQL Connector for Hadoop Distributed File System.

  • Oracle SQL Connector for Hadoop Distributed File System

    • Automatic creation of Oracle Database external tables from Hive tables, Data Pump files, or delimited text files.

    • Management of location files.

    See Chapter 2.

  • Oracle Loader for Hadoop

    • Support for Sockets Direct Protocol (SDP) for direct path loads

    • Support for secondary sort on user-specified columns

    • New input formats for regular expressions and Oracle NoSQL Database. The Avro record InputFormat is now supported code instead of sample code.

    • Simplified date format specification

    • New reject limit threshold

    • Improved job reporting and diagnostics

    See Chapter 3.

  • Oracle Data Integrator Application Adapter for Hadoop

    • Uses Oracle SQL Connector for HDFS or Oracle Loader for Hadoop to load data from Hadoop into an Oracle database.

  • Oracle R Connector for Hadoop

    Several analytic algorithms are now available: linear regression, neural networks for prediction, matrix completion using low rank matrix factorization, clustering, and non-negative matrix factorization.

    Oracle R Connector for Hadoop supports Hive data sources in addition to HDFS files.

    Oracle R Connector for Hadoop can move data between HDFS and Oracle Database. Oracle R Enterprise is not required for this basic transfer of data.

    The following functions are new in this release:

    as.ore.*
    hadoop.jobs
    hdfs.head
    hdfs.tail
    is.ore.*
    orch.connected
    orch.dbg.lasterr
    orch.evaluate
    orch.export.fit
    orch.lm
    orch.lmf
    orch.neural
    orch.nmf
    orch.nmf.NMFalgo
    orch.temp.path
    ore.*
    predict.orch.lm
    print.summary.orch.lm
    summary.orch.lm

    See Chapter 5.

Deprecated Features

The following features are deprecated in this release and may be desupported in a future release:

  • Oracle SQL Connector for Hadoop Distributed File System

    • Location file format (version 1): Existing external tables with content published using Oracle Direct Connector for HDFS version 1 must be republished using Oracle SQL Connector for HDFS version 2, because of incompatible changes to the location file format.

      When Oracle SQL Connector for HDFS creates new location files, it does not delete the old location files.

      See Chapter 2.

    • oracle.hadoop.hdfs.exttab namespace (version 1): Oracle SQL Connector for HDFS uses the following new namespaces for all configuration properties:

      • oracle.hadoop.connection: Oracle Database connection and wallet properties

      • oracle.hadoop.exttab: All other properties

      See Chapter 2.

    • HDFS_BIN_PATH directory: The preprocessor directory name is now OSCH_BIN_PATH.

      See "Oracle SQL Connector for Hadoop Distributed File System Setup."

  • Oracle R Connector for Hadoop
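
The namespace change listed above for Oracle SQL Connector for HDFS can be checked mechanically against an existing set of configuration properties. The following is a minimal sketch in Python, not part of the product. It relies only on the three namespace prefixes named in this preface; the function names, the category labels, and the example property names (everything ending in `exampleProperty`) are hypothetical and chosen for illustration.

```python
# Namespace prefixes as named in this preface.
DEPRECATED_PREFIX = "oracle.hadoop.hdfs.exttab."   # version 1 namespace (deprecated)
CONNECTION_PREFIX = "oracle.hadoop.connection."    # connection and wallet properties
EXTTAB_PREFIX = "oracle.hadoop.exttab."            # all other properties


def classify_property(name: str) -> str:
    """Return an illustrative label for the namespace a property name belongs to."""
    if name.startswith(DEPRECATED_PREFIX):
        return "deprecated-v1"
    if name.startswith(CONNECTION_PREFIX):
        return "connection"
    if name.startswith(EXTTAB_PREFIX):
        return "exttab"
    return "other"


def find_deprecated(properties: dict) -> list:
    """Return the sorted property names that still use the version 1 namespace."""
    return sorted(
        name for name in properties
        if classify_property(name) == "deprecated-v1"
    )


if __name__ == "__main__":
    # Hypothetical property names; only the prefixes come from the preface.
    config = {
        "oracle.hadoop.hdfs.exttab.exampleProperty": "value1",
        "oracle.hadoop.connection.exampleProperty": "value2",
        "oracle.hadoop.exttab.exampleProperty": "value3",
    }
    for name in find_deprecated(config):
        print("uses deprecated namespace:", name)
```

A check like this only flags names that need attention. The mapping from each version 1 property to its version 2 equivalent is not given in this preface; see Chapter 2.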

Desupported Features

The following features are no longer supported by Oracle:

  • Oracle Loader for Hadoop

    • oracle.hadoop.loader.configuredCounters

      See Chapter 3.

Other Changes

The following are additional changes in the release:

  • Oracle Loader for Hadoop

    The installation zip archive now contains two kits:

    • oraloader-2.0.0-2.x86_64.zip for CDH4

    • oraloader-2.0.0-1.x86_64.zip for Apache Hadoop 0.20.2 and CDH3

    See "Oracle Loader for Hadoop Setup."
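
An installation script could choose between the two Oracle Loader for Hadoop kits along the following lines. This is a sketch in Python under stated assumptions: the kit file names and their target distributions come from the list above, while the function name `kit_for` and the distribution labels used as dictionary keys are hypothetical.

```python
# Kit file names and target distributions as listed in this preface.
# The dictionary keys are illustrative labels, not values the product reads.
KITS = {
    "CDH4": "oraloader-2.0.0-2.x86_64.zip",
    "CDH3": "oraloader-2.0.0-1.x86_64.zip",
    "Apache Hadoop 0.20.2": "oraloader-2.0.0-1.x86_64.zip",
}


def kit_for(distribution: str) -> str:
    """Return the kit file name for a distribution label, or raise ValueError."""
    try:
        return KITS[distribution]
    except KeyError:
        raise ValueError(f"no Oracle Loader for Hadoop kit listed for {distribution!r}")


if __name__ == "__main__":
    print(kit_for("CDH4"))
```

Raising an error for an unlisted distribution, rather than falling back to one of the kits, keeps the script from silently installing a kit built for a different Hadoop version.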