Contents
Title and Copyright Information
Preface
Audience
Documentation Accessibility
Related Documents
Conventions
1 Introducing Oracle Big Data Appliance
1.1 What Is Big Data?
1.1.1 High Variety
1.1.2 High Complexity
1.1.3 High Volume
1.1.4 High Velocity
1.2 The Oracle Big Data Solution
1.3 Software for Big Data
1.3.1 Software Component Overview
1.4 Acquiring Data for Analysis
1.4.1 Hadoop Distributed File System
1.4.2 Hive
1.4.3 Oracle NoSQL Database
1.5 Organizing Big Data
1.5.1 MapReduce
1.5.2 Oracle R Support for Big Data
1.5.3 Oracle Big Data Connectors
1.5.3.1 Oracle SQL Connector for Hadoop Distributed File System
1.5.3.2 Oracle Loader for Hadoop
1.5.3.3 Oracle R Connector for Hadoop
1.5.3.4 Oracle Data Integrator Application Adapter for Hadoop
1.6 Analyzing and Visualizing Big Data
2 Administering Oracle Big Data Appliance
2.1 Monitoring a Cluster Using Oracle Enterprise Manager
2.1.1 Using the Enterprise Manager Web Interface
2.1.2 Using the Enterprise Manager Command-Line Interface
2.2 Managing CDH Operations Using Cloudera Manager
2.2.1 Monitoring the Status of Oracle Big Data Appliance
2.2.2 Performing Administrative Tasks
2.2.3 Managing Services With Cloudera Manager
2.3 Using Hadoop Monitoring Utilities
2.3.1 Monitoring the JobTracker
2.3.2 Monitoring the TaskTracker
2.4 Using Hue to Interact With Hadoop
2.5 About the Oracle Big Data Appliance Software
2.5.1 Software Components
2.5.2 Logical Disk Layout
2.6 About the CDH Software Services
2.6.1 Monitoring the CDH Services
2.6.2 Where Do the CDH Services Run?
2.6.2.1 Service Locations on a Single Rack
2.6.2.2 Service Locations in Multirack Clusters
2.6.3 Automatic Failover of the NameNode
2.6.4 Automatic Failover of the JobTracker
2.6.5 Unconfigured Software
2.6.6 Map and Reduce Resource Configuration
2.7 Configuring HBase
2.8 Effects of Hardware on Software Availability
2.8.1 Critical and Noncritical Nodes
2.8.2 First NameNode
2.8.3 Second NameNode
2.8.4 First JobTracker
2.8.5 Second JobTracker
2.8.6 Noncritical Nodes
2.9 Collecting Diagnostic Information for Oracle Customer Support
2.10 Security on Oracle Big Data Appliance
2.10.1 About Predefined Users and Groups
2.10.2 Port Numbers Used on Oracle Big Data Appliance
2.10.3 About CDH Security Using Kerberos
2.10.4 About Puppet Security
3 Supporting User Access to Oracle Big Data Appliance
3.1 Providing Remote Client Access to CDH
3.1.1 Prerequisites
3.1.2 Installing CDH on Oracle Exadata Database Machine
3.1.3 Installing a CDH Client on Any Supported Operating System
3.1.4 Configuring CDH
3.2 Managing User Accounts
3.2.1 Creating Hadoop Cluster Users
3.2.2 Providing User Login Privileges (Optional)
3.3 Recovering Deleted Files
3.3.1 Restoring Files from the Trash
3.3.2 Changing the Trash Interval
3.3.3 Disabling the Trash Facility
3.3.3.1 Completely Disabling the Trash Facility
3.3.3.2 Disabling the Trash Facility for Local HDFS Clients
3.3.3.3 Disabling the Trash Facility for a Remote HDFS Client
4 Optimizing MapReduce Jobs Using Perfect Balance
4.1 What is Perfect Balance?
4.1.1 About Balancing Jobs Across Map and Reduce Tasks
4.1.2 Methods of Running Perfect Balance
4.1.3 Perfect Balance Components
4.2 Getting Started with Perfect Balance
4.3 About the Perfect Balance Examples
4.3.1 About the Examples in this Chapter
4.3.2 Extracting the Example Data Set
4.4 Analyzing a Job for Imbalanced Reducer Loads
4.4.1 About Job Analyzer
4.4.1.1 Methods of Running Job Analyzer
4.4.2 Running Job Analyzer as a Standalone Utility
4.4.2.1 Job Analyzer Utility Example
4.4.2.2 Job Analyzer Utility Syntax
4.4.3 Running Job Analyzer With the Perfect Balance Driver
4.4.3.1 Job Analyzer Example
4.4.3.2 Collecting Additional Metrics
4.4.4 Reading the Job Analyzer Report
4.5 Running a Balanced MapReduce Job
4.5.1 Using the Perfect Balance Driver
4.5.2 Using the Perfect Balance API
4.5.2.1 Modifying Your Java Code to Use Perfect Balance
4.5.2.2 Running Your Modified Java Code with Perfect Balance
4.6 About Perfect Balance Reports
4.7 About Configuring Perfect Balance
4.8 Perfect Balance Configuration Property Reference
5 Configuring Oracle Exadata Database Machine for Use with Oracle Big Data Appliance
5.1 About Optimizing Communications
5.1.1 About Applications that Pull Data Into Oracle Exadata Database Machine
5.1.2 About Applications that Push Data Into Oracle Exadata Database Machine
5.2 Prerequisites
5.3 Specifying the InfiniBand Connections to Oracle Big Data Appliance
5.4 Enabling SDP on Exadata Database Nodes
5.5 Configuring a JDBC Client for SDP
5.6 Creating an SDP Listener on the InfiniBand Network
Glossary
Index