Oracle® Big Data Connectors User's Guide
Release 2 (2.0)
Part Number E36961-03
Contents
Title and Copyright Information
Preface
Audience
Documentation Accessibility
Related Documents
Text Conventions
Syntax Conventions
Changes in This Release for Oracle Big Data Connectors User's Guide
Changes in Oracle Big Data Connectors Release 2 (2.0)
1 Getting Started with Oracle Big Data Connectors
1.1 About Oracle Big Data Connectors
1.2 Big Data Concepts and Technologies
1.2.1 What is MapReduce?
1.2.2 What is Apache Hadoop?
1.3 Downloading the Oracle Big Data Connectors Software
1.4 Oracle SQL Connector for Hadoop Distributed File System Setup
1.4.1 Software Requirements
1.4.2 Installing and Configuring Hadoop
1.4.3 Installing Oracle SQL Connector for HDFS
1.4.4 Granting User Access to Oracle SQL Connector for HDFS
1.5 Oracle Loader for Hadoop Setup
1.5.1 Software Requirements
1.5.2 Installing Oracle Loader for Hadoop
1.6 Oracle Data Integrator Application Adapter for Hadoop Setup
1.6.1 System Requirements and Certifications
1.6.2 Technology-Specific Requirements
1.6.3 Location of Oracle Data Integrator Application Adapter for Hadoop
1.6.4 Setting Up the Topology
1.7 Oracle R Connector for Hadoop Setup
1.7.1 Installing the Software on Hadoop
1.7.1.1 Software Requirements for a Third-Party Hadoop Cluster
1.7.1.2 Installing Sqoop on a Hadoop Cluster
1.7.1.3 Installing Hive on a Hadoop Cluster
1.7.1.4 Installing R on a Hadoop Cluster
1.7.1.5 Installing the ORCH Package on a Hadoop Cluster
1.7.2 Installing Additional R Packages
1.7.3 Providing Remote Client Access to R Users
1.7.3.1 Software Requirements for Remote Client Access
1.7.3.2 Configuring the Server as a Hadoop Client
1.7.3.3 Installing Sqoop on a Hadoop Client
1.7.3.4 Installing R on a Hadoop Client
1.7.3.5 Installing the ORCH Package on a Hadoop Client
1.7.3.6 Installing the Oracle R Enterprise Client Packages (Optional)
2 Oracle SQL Connector for Hadoop Distributed File System
2.1 About Oracle SQL Connector for HDFS
2.2 About External Tables
2.2.1 What Are Location Files?
2.2.2 Enabling Parallel Processing
2.2.3 Location File Management
2.2.4 Location File Names
2.3 Using the ExternalTable Command-Line Tool
2.3.1 About ExternalTable
2.3.2 Altering HADOOP_CLASSPATH
2.3.3 ExternalTable Command-Line Tool Syntax
2.4 Creating External Tables
2.4.1 Creating External Tables with the ExternalTable Tool
2.4.2 Creating External Tables from Data Pump Format Files
2.4.2.1 Required Properties
2.4.2.2 Optional Properties
2.4.2.3 Example
2.4.3 Creating External Tables from Hive Tables
2.4.3.1 Hive Table Requirements
2.4.3.2 Required Properties
2.4.3.3 Optional Properties
2.4.3.4 Example
2.4.4 Creating External Tables from Delimited Text Files
2.4.4.1 Required Properties
2.4.4.2 Optional Properties
2.4.4.3 Example
2.4.5 Creating External Tables in SQL
2.5 Publishing the HDFS Data Paths
2.6 Listing Location File Metadata and Contents
2.7 Describing External Tables
2.8 Querying Data in HDFS
2.9 Configuring Oracle SQL Connector for HDFS
2.9.1 Creating a Configuration File
2.9.2 Configuration Properties
3 Oracle Loader for Hadoop
3.1 What Is Oracle Loader for Hadoop?
3.2 Using Oracle Loader for Hadoop
3.2.1 Implementing InputFormat
3.2.1.1 HiveToAvroInputFormat
3.2.1.2 DelimitedTextInputFormat
3.2.1.3 RegexInputFormat
3.2.1.4 AvroInputFormat
3.2.1.5 KVAvroInputFormat
3.2.2 Creating the loaderMap Document
3.2.2.1 Example loaderMap Document
3.2.3 Accessing Table Metadata
3.2.3.1 Running the OraLoaderMetadata Utility
3.2.4 Invoking OraLoader
3.2.5 Loading Files Into an Oracle Database (Offline Loads Only)
3.2.5.1 Loading From Delimited Text Files Into an Oracle Database
3.3 Output Modes During OraLoader Invocation
3.3.1 JDBC Output
3.3.2 Oracle OCI Direct Path Output
3.3.3 Delimited Text Output
3.3.4 Oracle Data Pump Output
3.4 Error Handling and Diagnostics
3.4.1 Logging Rejected Records in Bad Files
3.4.2 Setting a Job Reject Limit
3.5 Balancing Loads When Loading Data into Partitioned Tables
3.5.1 Using the Sampling Feature
3.5.2 Tuning Load Balancing and Sampling Behavior
3.5.2.1 Properties to Tune Load Balancing
3.5.2.2 Properties to Tune Sampling Behavior
3.5.3 Does Oracle Loader for Hadoop Always Use the Sampler's Partitioning Scheme?
3.5.4 What Happens When a Sampling Feature Property Has an Invalid Value?
3.5.5 Primary Configuration Properties for the Load Balancing Feature
3.6 OraLoader Configuration Properties
3.6.1 Primary Job Configuration Properties
3.6.2 General Properties
3.7 Example of Using Oracle Loader for Hadoop
3.8 Target Table Characteristics
3.8.1 Supported Data Types
3.8.2 Supported Partitioning Strategies
3.9 Loader Map XML Schema Definition
3.10 XML Document for the Configuration Properties
3.11 Third-Party Licenses for Bundled Software
3.11.1 Apache Licensed Code
3.11.2 Apache Avro 1.6.3
3.11.3 Apache Commons Mathematics Library 2.2
3.11.4 Jackson JSON 1.8.8
4 Oracle Data Integrator Application Adapter for Hadoop
4.1 Introduction
4.1.1 Concepts
4.1.2 Knowledge Modules
4.1.3 Security
4.2 Setting Up the Topology
4.2.1 Setting Up File Data Sources
4.2.2 Setting Up Hive Data Sources
4.2.3 Setting Up the Oracle Data Integrator Agent to Execute Hadoop Jobs
4.2.4 Configuring Oracle Data Integrator Studio for Executing Hadoop Jobs on the Local Agent
4.3 Setting Up an Integration Project
4.4 Creating an Oracle Data Integrator Model from a Reverse-Engineered Hive Model
4.4.1 Creating a Model
4.4.2 Reverse Engineering Hive Tables
4.5 Designing the Interface
4.5.1 Loading Data from Files into Hive
4.5.2 Validating and Transforming Data Within Hive
4.5.2.1 IKM Hive Control Append
4.5.2.2 CKM Hive
4.5.2.3 IKM Hive Transform
4.5.3 Loading Data into an Oracle Database from Hive and HDFS
5 Oracle R Connector for Hadoop
5.1 About Oracle R Connector for Hadoop
5.2 Access to HDFS Files
5.3 Access to Hive
5.3.1 ORE Functions for Hive
5.3.2 Generic R Functions Supported in Hive
5.3.3 Support for Hive Data Types
5.3.4 Usage Notes for Hive Access
5.3.5 Example: Loading Hive Tables into Oracle R Connector for Hadoop
5.4 Access to Oracle Database
5.4.1 Usage Notes for Oracle Database Access
5.4.2 Scenario for Using Oracle R Connector for Hadoop with Oracle R Enterprise
5.5 Analytic Functions in Oracle R Connector for Hadoop
5.6 ORCH mapred.config Class
5.7 Examples and Demos of Oracle R Connector for Hadoop
5.7.1 Using the Demos
5.7.2 Using the Examples
5.8 Security Notes for Oracle R Connector for Hadoop
6 ORCH Library Reference
6.1 Functions in Alphabetical Order
6.2 Functions by Category
6.2.1 Making Connections
6.2.2 Copying Data
6.2.3 Exploring Files
6.2.4 Writing MapReduce Functions
6.2.5 Debugging Scripts
6.2.6 Using Hive Data
6.2.7 Writing Analytical Functions
hadoop.exec
hadoop.run
hdfs.attach
hdfs.cd
hdfs.cp
hdfs.describe
hdfs.download
hdfs.exists
hdfs.get
hdfs.head
hdfs.id
hdfs.ls
hdfs.mkdir
hdfs.mv
hdfs.parts
hdfs.pull
hdfs.push
hdfs.put
hdfs.pwd
hdfs.rm
hdfs.rmdir
hdfs.root
hdfs.sample
hdfs.setroot
hdfs.size
hdfs.tail
hdfs.upload
is.hdfs.id
orch.connect
orch.connected
orch.dbcon
orch.dbg.lasterr
orch.dbg.off
orch.dbg.on
orch.dbg.output
orch.dbinfo
orch.disconnect
orch.dryrun
orch.export
orch.keyval
orch.keyvals
orch.pack
orch.reconnect
orch.temp.path
orch.unpack
orch.version
Index