Use the Oracle Shell for Hadoop Loaders Interface (OHSH)

The following sections describe how to use the Oracle Shell for Hadoop Loaders (OHSH) interface.

Oracle Shell for Hadoop Loaders is the preferred way to use the Oracle Loader for Hadoop and Copy to Hadoop database connectors. It includes a command line interface (whose simple command syntax can also be scripted) for moving data between Hadoop and Oracle Database using the database connectors.

About Oracle Shell for Hadoop Loaders

Oracle Shell for Hadoop Loaders is a helper shell that provides an easy-to-use command line interface to Oracle Loader for Hadoop and Copy to Hadoop. It has basic shell features such as command line recall, history, inheriting environment variables from the parent process, setting new or existing environment variables, and performing environmental substitution in the command line.

The core functionality of OHSH includes the following:

  • Defining named external resources with which OHSH interacts to perform loading tasks.

  • Setting default values for load operations.

  • Running load commands.

  • Delegating simple pre and post load tasks to the Operating System, HDFS, Hive, and Oracle. These tasks include viewing the data to be loaded, and viewing the data in the target table after loading.

Configure Big Data Cloud for Oracle Shell for Hadoop Loaders

To get started with OHSH in Oracle Big Data Cloud:

  1. SSH to a node on Big Data Cloud and log in, then execute the following:
    sudo su oracle
  2. Add /opt/oracle/dbconnector/ohsh/bin to your PATH variable. The OHSH executable is at this location.
  3. Start OHSH with the following command:
    ohsh
You’re now ready to run OHSH commands to move data between Big Data Cloud and Oracle Database.

Get Started with Oracle Shell for Hadoop Loaders

Starting an OHSH Interactive Session

To start an interactive session, enter ohsh on the command line. This brings you to the OHSH shell (if you have ohsh in your path):

$ ohsh
ohsh>

You can execute OHSH commands in this shell (using the OHSH syntax). You can also execute commands for Beeline/Hive, Hadoop, Bash, and SQL*Plus. For non-OHSH commands, you add a delegation operator prefix (“%”) to the name of the resource used to execute the command. For example:

ohsh> %bash0 ls —l

Scripting OHSH

You can also script the same commands that work in the CLI. The ohsh command provides three parameters for working with scripts.

  • ohsh —i <filename>.ohsh

    The —i parameter tells OHSH to initialize an interactive session with the commands in the script before the prompt appears. This is a useful way to set up the required session resources and automate other preliminary tasks before you start working within the shell.

    $ ohsh –i initresources.ohsh
  • ohsh —f <filename>.ohsh

    The ohsh command with the —f parameter starts a non-interactive session and runs the commands in the script.

    $ ohsh –f myunattendedjobs.ohsh
  • ohsh —i —f <filename>.ohsh

    You can use —i and —f together to initialize a non-interactive session and then run another script in the session.

    $ ohsh -i mysetup.ohsh –f myunattendedjobs.ohsh
  • ohsh —c

    This command dumps all Hadoop configuration properties that an OHSH session inherits at start up.

Working With OHSH Resources

A resource is some named entity that OHSH interacts with. For example: a Hadoop cluster is a resource, as is a JDBC connection to an Oracle database, a Hive database, a SQL*Plus session with an Oracle database, and a Bash shell on the local OS.

OHSH provides two default resources at start up: hive0 (to connect to the default Hive database) and hadoop0.

  • Using hive0 resource to execute a Hive command:

    ohsh> %hive0 show tables;

    You can create additional Hive resources to connect to other Hive databases.

  • Using the hadoop0 resource to execute a Hadoop command:

    ohsh> %hadoop0 fs -ls

Within an interactive or scripted session, you can create instances of additional resources, such as SQL*Plus and JDBC. You need to create these two resources in order to connect to Oracle Database through OHSH.

  • Creating an SQL*Plus resource:

    ohsh> create sqlplus resource sql0 connectid=”bigdatalite.localdomain:1521/orcl”

  • Creating a JDBC resource:

    ohsh> create jdbc resource jdbc0 connectid=<database connection URL>
  • Showing resources:

    ohsh> show resources

    This command lists default resources and any additional resources created within the session.

Getting Help

The OHSH shell provides online help for all commands.

To get a list of all OHSH commands:

ohsh> help

To get help on a specific command, enter help, followed by the command:    

ohsh> help show

The table below describes the help categories available.

Help Command Description
help load Describes load commands for Oracle and Hadoop tables.
help set Shows help for setting defaults for load operations. It also describes what load methods are impacted by a particular setting.
help show Shows help for inspecting default settings.
help shell Shows shell-like commands.
help resource Show commands for creating and dropping named resources.