The following table summarizes the steps you need to perform to execute Oozie Workflows with Oracle Data Integrator.
Table 5-1 Executing Oozie Workflows
Step | Description |
---|---|
Set up the Oozie runtime engine | Set up the Oozie runtime engine to configure the connection to the Hadoop data server where the Oozie engine is installed. This Oozie runtime engine is used to execute ODI Design Objects or Scenarios on the Oozie engine as Oozie workflows. |
Execute or deploy an Oozie workflow | Run the ODI Design Objects or Scenarios using the Oozie runtime engine created in the previous step to execute or deploy an Oozie workflow. |
Audit Hadoop Logs | Audit the Hadoop Logs to monitor the execution of the Oozie workflows from within Oracle Data Integrator. See Auditing Hadoop Logs. |
Before you set up the Oozie runtime engine, ensure that the Hadoop data server where the Oozie engine is deployed is available in the topology. The Oozie engine must be associated with this Hadoop data server.
To set up the Oozie runtime engine:
The following table describes the fields that you need to specify on the Definition tab when defining a new Oozie runtime engine. An Oozie runtime engine models an actual Oozie server in a Hadoop environment.
Table 5-2 Oozie Runtime Engine Definition
Field | Values |
---|---|
Name | Name of the Oozie runtime engine that appears in Oracle Data Integrator. |
Host | Name or IP address of the machine on which the Oozie runtime agent has been launched. |
Port | Listening port used by the Oozie runtime engine. Default Oozie port value is 11000. |
Web application context | Name of the web application context. Type |
Protocol | Protocol used for the connection. Possible values are |
Hadoop Server | Name of the Hadoop server where the Oozie engine is installed. This Hadoop server is associated with the Oozie runtime engine. |
Poll Frequency | Frequency at which the Hadoop audit logs are retrieved and stored in the ODI repository as session logs. The poll frequency can be specified in seconds (s), minutes (m), hours (h), days (d), and years (y). For example, 5m or 4h. |
Lifespan | Time period for which the Hadoop audit logs retrieval coordinator stays enabled to schedule audit logs retrieval workflows. Lifespan can be specified in minutes (m), hours (h), days (d), and years (y). For example, 4h or 2d. |
Schedule Frequency | Frequency at which the Hadoop audit logs retrieval workflow is scheduled as an Oozie Coordinator Job. Schedule frequency can be specified in minutes (m), hours (h), days (d), and years (y). For example, 20m or 5h. |
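
The duration notation used by Poll Frequency, Lifespan, and Schedule Frequency can be checked before the values are entered. The following is a minimal sketch, not part of Oracle Data Integrator: the function name `duration_to_seconds` is hypothetical, and the sketch assumes a value is a whole number followed by a single unit letter. It covers seconds, minutes, hours, and days only, and leaves years out because their length depends on the calendar.

```python
import re

# Seconds per unit letter, using the letters named in the table above.
# Years are deliberately left out of this sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

# Assumed format: a whole number followed by exactly one unit letter.
_DURATION = re.compile(r"^(\d+)([smhd])$")


def duration_to_seconds(value: str) -> int:
    """Convert a string such as '5m' or '4h' to a number of seconds."""
    match = _DURATION.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


if __name__ == "__main__":
    # The example values come from the table above.
    for example in ("5m", "4h", "2d", "20m"):
        print(example, "=", duration_to_seconds(example), "seconds")
```

A check of this kind makes it easy to compare settings, for example to confirm that a Schedule Frequency of 20m is shorter than a Lifespan of 4h.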
The following table describes the properties that you can configure on the Properties tab when defining a new Oozie runtime engine.
Table 5-3 Oozie Runtime Engine Properties
Field | Values |
---|---|
OOZIE_WF_GEN_MAX_DETAIL | Limits the maximum detail (session level or fine-grained task level) allowed when generating ODI Oozie workflows for an Oozie engine. Set the value of this property to TASK to generate an Oozie action for every ODI task or to SESSION to generate an Oozie action for the entire session. |
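
The difference between the two OOZIE_WF_GEN_MAX_DETAIL values can be illustrated with a small model. This is only a sketch of the granularity described above, not the ODI workflow generator: the function `plan_oozie_actions`, the task names, and the action naming scheme are all hypothetical.

```python
from typing import List


def plan_oozie_actions(task_names: List[str], max_detail: str) -> List[str]:
    """Return illustrative action names for one ODI session.

    TASK yields one action per ODI task; SESSION yields a single
    action that stands for the whole session.
    """
    level = max_detail.strip().upper()
    if level == "TASK":
        # Hypothetical naming: one action per task, in task order.
        return [f"action_{index}_{name}" for index, name in enumerate(task_names, 1)]
    if level == "SESSION":
        # Hypothetical naming: one action covering every task.
        return ["action_session"]
    raise ValueError(f"expected TASK or SESSION, got {max_detail!r}")


if __name__ == "__main__":
    tasks = ["create_work_table", "load_data", "drop_work_table"]  # made-up tasks
    print(plan_oozie_actions(tasks, "TASK"))
    print(plan_oozie_actions(tasks, "SESSION"))
```

In this model, TASK gives three actions for the three made-up tasks, while SESSION gives one.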
To create a logical Oozie agent:
You can run an ODI design object or scenario using the Oozie runtime engine to execute an Oozie Workflow on the Oozie engine. When running the ODI design object or scenario, you can choose to only deploy the Oozie workflow without executing it.
To deploy or execute an ODI Oozie workflow:
When the ODI Oozie workflows are executed, log information is retrieved and captured according to the frequency properties on the Oozie runtime engine. This information relates to the state, progress, and performance of the Oozie job.
You can retrieve the log data of an active Oozie session by clicking Retrieve Log Data in the Operator menu. You can also view information about the Oozie session in the Oozie web console or the MapReduce web console by clicking the URL available in the Definition tab of the Session Editor.
The Details tab in the Session Editor, Session Step Editor, and Session Task Editor provides a summary of the Oozie and MapReduce jobs.
Support of userlib jars for ODI Oozie workflows allows a user to copy jar files into a userlib HDFS directory, which is referenced by ODI Oozie workflows that are generated and submitted with the oozie.libpath property. This avoids replicating the libs/jars in each workflow app's lib HDFS directory. The userlib directory is located in HDFS in the following location:
<ODI HDFS Root>/odi_<version>/userlib
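
The location pattern above can be expanded as in the following sketch. It is an illustration only: the helper names `userlib_path` and `with_libpath` are hypothetical, and the root directory and version string in the example are made-up values, not defaults. ODI sets the oozie.libpath property itself when it generates and submits workflows, so the second helper only shows where the directory ends up in a set of job properties.

```python
import posixpath
from typing import Dict


def userlib_path(odi_hdfs_root: str, odi_version: str) -> str:
    """Expand <ODI HDFS Root>/odi_<version>/userlib for the given values."""
    # posixpath is used because HDFS paths always use forward slashes.
    return posixpath.join(odi_hdfs_root, f"odi_{odi_version}", "userlib")


def with_libpath(job_properties: Dict[str, str], libpath: str) -> Dict[str, str]:
    """Return a copy of the job properties with oozie.libpath set."""
    merged = dict(job_properties)
    merged["oozie.libpath"] = libpath
    return merged


if __name__ == "__main__":
    # Both values below are placeholders chosen for the example.
    path = userlib_path("/user/odi_root", "12.2.1")
    print(path)
    print(with_libpath({}, path))
```

Jar files copied to the resulting directory are shared by every workflow that references it, which is what removes the need for a copy in each workflow app's lib directory.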