Index

A

activity reports, 2.10.3
application adapters, 1.5.3.5
applications
    data pull, 5.1, 5.1.1
    data push, 5.1.2
Audit Vault plug-in, 2.10.2
auditing data collected from services, 2.10.1
authentication, 3.1
autoAnalyze configuration property, 4.6, 4.12
autoAnalyze property, 4.4.3.2
autoBalance configuration property, 4.6, 4.12
Automated Service Manager
    See OASM

B

Backup and Disaster Recovery, 2.6.5
BALANCER_HOME environment variable, 4.3
bdadiag utility, 2.11
Berkeley DB, 1.4.3
best practices, 4.1
big data description, 1.1
business intelligence, 1.2, 1.4, 1.6
byteWeight configuration property, 4.12

C

CDH
    about, 1.3
    diagnostics, 2.11
    file system, 1.4.1
    remote client access, 3.2
    security, 3.1
    version, 2.5.1
chopped keys, 4.12
chunking files, 1.4.1
client access
    HDFS cluster, 3.2.4
    HDFS secured cluster, 3.2.5
    Hive, 3.3
client configuration, 3.2
Cloudera's Distribution including Apache Hadoop
    See CDH
Cloudera Manager
    about, 2.2
    accessing administrative tools, 2.2.2
    connecting to, 2.2
    effect of hardware failure on, 2.7.4
    software dependencies, 2.7.4
    starting, 2.2
    UI overview, 2.2.1
    version, 2.5.1
clusters, definition, 1.3
coefficients, key load, 4.4.3.2
confidence configuration property, 4.12
Counting Reducer, 4.1.3
CSV files, 1.5.3.1

D

Data Pump files, 1.5.3.2
data replication, 1.4.1
data skew, 4.1
DataNode, 2.7.1
dba group, 2.9.1
diagnostics, collecting, 2.11
disks, 2.5.2
dnsmasq service, 5.4
duplicating data, 1.4.1

E

emcli utility, 2.1.2
enableSorting configuration property, 4.12
engineered systems, 1.2
Exadata Database Machine, 1.2
Exadata InfiniBand connections, 5.3
Exalytics In-Memory Machine, 1.2
external tables, 1.5.3.1

F

failover
    JobTracker, 2.6.4
    NameNode, 2.6.3
files, recovering HDFS, 3.5
first NameNode, 2.7.2
Flume, 2.6.5, 2.9.1
ftp.oracle.com, 2.11

G

GC overhead limit exceeded error, 4.9.1
groups, 2.9.1, 3.4

H

Hadoop Distributed File System
    See HDFS
hadoop group, 3.4
Hadoop Map/Reduce Administration, 2.3.1
Hadoop version, 1.3
HADOOP_CLASSPATH environment variable, 4.3, 4.11.1
HADOOP_USER_CLASSPATH_FIRST environment variable, 4.3
HBase, 2.6.5, 2.9.1
HDFS
    about, 1.3, 1.4.1
    auditing, 2.10.1
    data files, 1.5.3.1
    user identity, 2.9.1
help from Oracle Support, 2.11
Hive, 2.9.1
    about, 1.4.2
    auditing, 2.10.1
    client access, 3.3
    node location, 2.7.5
    software dependencies, 2.7.4
    tables, 3.4.1
    user identity, 2.9.1
hive group, 3.4
HiveQL, 1.4.2
HotSpot
    See Java HotSpot Virtual Machine
Hue, 2.7.5
    user identity, 2.9.1
    users, 3.4.1

I

Impala, 2.6.5
InfiniBand connections to Exadata, 5.3
InfiniBand network configuration, 5
inputFormat.mapred.* configuration properties, 4.12
installing CDH client, 3.2

J

Java HotSpot Virtual Machine, 2.5.1
JDBC client, configuring for SDP, 5.6
Job Analyzer, 4.1.3, 4.4.1
job duration, 4.1
jobconfPath property, 4.12
jobHistoryPath configuration property, 4.12
JobTracker
    failover, 2.6.4
    monitoring, 2.3.1
    opening, 2.3.1
    security, 3.1
    user identity, 2.9.1
JobTracker node, 2.7.4

K

Kerberos authentication, 3.1
Kerberos commands, 3.1
Kerberos user setup, 3.4.1.2
key chopping, 4.1.1
key load coefficients, 4.4.3.2
keyLoad.minChopBytes configuration property, 4.12
keys, assigning to reducers, 4.1.1
key-value database, 1.4.3
keyWeight configuration property, 4.12
knowledge modules, 1.5.3.5

L

linearKeyLoad properties, 4.4.3.2
linearKeyLoad.* configuration properties, 4.12
Linux
    disk location, 2.5.2
    installation, 2.5.1
load, 4.1
Load Balancer, 4.1.3
loading data, 1.5.3.1, 1.5.3.2
login privileges, 3.4.2

M

Mahout, 2.6.5
mapper workload, 4.1.1
mapred configuration properties, 4.12
mapred user, 2.9.1
mapred.map.tasks configuration property, 4.12
MapReduce, 1.3, 1.5.1, 2.10.1, 3.1, 3.4.1
mapreduce configuration properties, 4.12
map.tasks property, 4.12
maxLoadFactor configuration property, 4.12
maxSamplesPct configuration property, 4.12
max.split.size configuration property, 4.12
minChopBytes configuration property, 4.12
minSplits configuration property, 4.12
monitoring
    JobTracker, 2.3.1
    TaskTracker, 2.3.2
monitoring activity, 2.10.3
multirack clusters
    service locations, 2.6.2.2
MySQL Database
    about, 2.7.4
    port number, 2.9.3
    user identity, 2.9.1
    version, 2.5.1

N

NameNode, 3.1
    first, 2.7.2
NameNode failover, 2.6.3
Navigator, 2.6.5
NoSQL databases
    See Oracle NoSQL Database
numThreads configuration property, 4.12

O

OASM, port number, 2.9.3
ODI
    See Oracle Data Integrator
oinstall group, 2.9.1, 3.4
Oozie, 2.7.5
    auditing, 2.10.1
    software dependencies, 2.7.4
    software services, 2.9.1
    user identity, 2.9.1
openib.conf file, 5.5
operating system users, 2.9.1
Oracle Audit Vault and Database Firewall, 2.10
Oracle Automated Service Manager
    See OASM
Oracle Data Integrator
    about, 1.5.3.1, 1.5.3.5
    node location, 2.7.5
    software dependencies, 2.7.4
    version, 2.5.1
Oracle Data Integrator agent, 2.9.3
Oracle Data Pump files, 1.5.3.2
Oracle Database Instant Client, 2.5.1
Oracle Exadata Database Machine, 1.2, 5
    using as a CDH client, 3.2.2
Oracle Exalytics In-Memory Machine, 1.2
Oracle Linux
    about, 1.3
    relationship to HDFS, 1.3
    version, 2.5.1
Oracle Loader for Hadoop, 1.5.3.2, 2.5.1
Oracle NoSQL Database
    about, 1.4.3, 1.5.3.3
    port numbers, 2.9.3
    version, 2.5.1
Oracle R Advanced Analytics for Hadoop, 1.5.3.4, 2.5.1
Oracle R Enterprise, 1.5.2
Oracle SQL Connector for HDFS, 1.5.3.1
Oracle Support, creating a service request, 2.11
oracle user, 2.9.1, 3.4
Oracle XQuery for Hadoop, 1.5.3.3, 2.5.1
oracle.hadoop.balancer.* configuration properties, 4.12
oracle.hadoop.balancer.autoAnalyze configuration property, 4.6
oracle.hadoop.balancer.autoAnalyze property, 4.4.3.2
oracle.hadoop.balancer.autoBalance configuration property, 4.6
oracle.hadoop.balancer.Balancer class, 4.10
oracle.hadoop.balancer.KeyLoadLinear class, 4.12
oracle.hadoop.balancer.linearKeyLoad.* properties, 4.4.3.2
oracle.hadoop.balancer.tools.printRecommendation property, 4.4.3.2
out of heap space errors, 4.9.2

P

partitioning, 2.5.2, 4.1.1
Perfect Balance
    application requirements, 4.2
    basic steps, 4.3
    description, 4.1
planning applications, 1.2
port map, 2.9.3
port numbers, 2.9.3
printRecommendation configuration property, 4.12
printRecommendation property, 4.4.3.2
pulling data into Exadata, 5.1, 5.1.1
puppet
    port numbers, 2.9.3
    security, 2.9.4
    user identity, 2.9.1
puppet master
    node location, 2.7.2
pushing data into Exadata, 5.1.2

R

R Connector
    See Oracle R Advanced Analytics for Hadoop
R distribution, 2.5.1
R language support, 1.5.2
range partitioning, 4.1.1
recovering HDFS files, 3.5
reducer load, 4.1
remote client access, 3.2, 3.3
replicating data, 1.4.1
report.overwrite configuration property, 4.12
reportPath configuration property, 4.12
rowWeight configuration property, 4.12
rpc.statd service, 2.9.3

S

SDP listener configuration, 5.7
SDP over InfiniBand, 5
SDP, enabling on Exadata, 5.5
Search, 2.6.5
security, 2.9
service requests, creating for CDH, 2.11
service tags, 2.9.3
services
    auditing, 2.10.1
    node locations, 2.6.2
    See also software services
skew, 4.1
Sockets Direct Protocol, 5.1
software components, 2.5.1
software framework, 1.3
software services
    monitoring, 2.6.1
    node locations, 2.6.2
    port numbers, 2.9.3
Sqoop, 2.6.5, 2.9.1
ssh service, 2.9.3
svctag user, 2.9.1

T

tables, 1.5.3.1, 1.5.3.2, 3.4.1
Task Tracker Status interface, 2.3.2
TaskTracker
    monitoring, 2.3.2
    user identity, 2.9.1
tmpDir configuration property, 4.12
tools.* configuration properties, 4.12
trash facility, 3.5
trash facility, disabling, 3.5.3.1
trash interval, 3.5.2
troubleshooting CDH, 2.11

U

uploading diagnostics, 2.11
useClusterStats configuration property, 4.12
useMapreduceApi configuration property, 4.12
user accounts, 3.4.1
user groups, 3.4
users
    Cloudera Manager, 2.2.2
    operating system, 2.9.1

W

Whirr, 2.6.5
writeKeyBytes configuration property, 4.12

X

xinetd service, 2.9.3
XQuery connector
    See Oracle XQuery for Hadoop

Y

YARN support, 1.5.1

Z

ZooKeeper, 2.9.1