12 Working with Procedures, Variables, Sequences, and User Functions

This chapter describes how to work with procedures, variables, sequences, and user functions. An overview of these components and how to work with them is provided.

This chapter includes the following sections:

12.1 Working with Procedures

This section provides an introduction to procedures and describes how to create and use procedures in Oracle Data Integrator.

12.1.1 Introduction to Procedures

A Procedure is a set of commands that can be executed by an agent. These commands concern all technologies accessible by Oracle Data Integrator (OS, JDBC, JMS commands, etc).

A Procedure is a reusable component that allows you to group actions that do not fit in the Interface framework. Procedures should be considered only when what you need to do can't be achieved in an interface. In this case, rather than writing an external program or script, you would include the code in Oracle Data Integrator and execute it from your packages. Procedures require you to develop all your code manually, as opposed to interfaces.

A procedure is composed of command lines, possibly mixing different languages. Every command line may contain two commands that can be executed on a source and on a target. The command lines are executed sequentially. Some command lines may be skipped if they are controlled by an option. These options parameterize whether or not a command line should be executed as well as the code of the commands.

The code within a procedure can be made generic by using string options and the ODI Substitution API.

Before creating a procedure, note the following:

  • Although you can perform data transformations in procedures, using them for this purpose is not recommended; use interfaces instead.

  • If you start writing a complex procedure to automate a particular recurring task for data manipulation, you should consider converting it into a Knowledge Module. Refer to the Knowledge Module Developer's Guide for Oracle Data Integrator for more information.

  • Whenever possible, try to avoid operating-system-specific commands. Using them makes your code dependent on the operating system that runs the agent. The same procedure executed by 2 agents on 2 different operating systems (such as Unix and Windows) will not work properly.

The following sections describe how to create and use procedures.

12.1.2 Creating Procedures

Creating a procedure follows a standard process which can vary depending on the use case. The following step sequence is usually performed when creating a procedure:

  1. Create a New Procedure

  2. Define the Procedure's Options

  3. Create and Manage the Procedure's Commands.

When creating procedures, it is important to understand the following coding guidelines:

12.1.2.1 Create a New Procedure

To create a new procedure:

  1. In Designer Navigator select the Procedures node in the folder under the project where you want to create the procedure.

  2. Right-click and select New Procedure.

  3. On the Definition tab fill in the procedure Name.

  4. Check the Multi-Connections if you want the procedure to manage more than one connection at a time.

    Multi-Connections: It is useful to choose a multi-connection procedure if you wish to use data that is retrieved by a command sent on a connection (the source connection, indicated on the Source tab) in a command sent to another connection (the target connection, indicated on the Target tab). This data will pass though the execution agent. If you access one connection at a time (which enables you to access different connections, but only one at a time) leave the Multi-Connections box unchecked.

  5. Select the Target Technology and if the Multi-Connections box is checked also select the Source Technology. Each new Procedure line will be based on this technology. You can also leave these fields empty and specify the technologies in each procedure command.

    Caution:

    Source and target technologies are not mandatory for saving the Procedure. However, the execution of the Procedure might fail, if the related commands require to be associated with certain technologies and logical schemas.
  6. From the File menu, click Save.

A new procedure is created, and appears in the Procedures list in the tree under your Project.

12.1.2.2 Define the Procedure's Options

Procedure options act like parameters for your steps and improve the code reusability.

There are two types of options:

  • Boolean options called Check Boxes. Their value can be used to determine whether individual command are executed or not. They act like an “if” statement.

  • Value and Text options used to pass in short or long textual information respectively. The values of these options can only be recovered in the code of the procedure's commands, using the getOption() substitution method. When using your procedure in a package, its values can be set on the step.

To create procedure's options:

  1. In Designer Navigator select your procedure's node.

  2. Right-click and select New Option. The Procedure Option Editor is displayed.

  3. Fill in the following fields:

    • Name: Name of the option as it appears in the graphical interface

    • Type: Type of the option.

      • Check Box: The option is boolean: Yes = 1/No = 0. These are the only options used for procedures and KMs to determine if such tasks should be executed or not.

      • Default Value: It is an alphanumerical option. Maximum size is 250 characters.

      • Text: It is an alphanumerical option. Maximum size is not limited. Accessing this type of option is slower than for default value options.

    • Description: Short description of the option. For Check Box options, this is displayed on the Command editor where you select which options will trigger the execution of the command.

    • Position: Determines the order of appearance of the option when the procedure or KM options list is displayed.

    • Help: Descriptive help on the option. For KMs, is displayed in the properties pane when the KM is selected in an interface.

    • Default value: Value that the option will take, if no value has been specified by the user of the procedure or the KM.

  4. From the File menu, click Save.

  5. Repeat these operations for each option that is required for the procedure.

Options appear in the Projects accordion under the Procedure node and on the Options tab of the Procedure Editor.

12.1.2.3 Create and Manage the Procedure's Commands

To create a procedure's command line:

  1. In Designer Navigator double-click the procedure for which you want to create a command. The Procedure Editor opens.

  2. In the Procedure Editor, go to the Details tab

  3. Click Add. Enter the name for your new command. The Command Line Editor opens.

  4. In the Command Line Editor fill in the following fields:

    • Log Counter: Shows which counter (Insert, Update, Delete or Errors) will record the number of rows processed by this command. Note that the Log Counter works only for Insert, Update, Delete, and Errors rows resulting from an Insert or Update SQL statement.

      After executing the Procedure, you can view the counter results in Operator Navigator. They are displayed on the Definition tab of the Step and Task editor in the Record Statistics section.

    • Log level: Logging level of the command. At execution time, the task generated for this command will be kept in the Session log based on this value and the log level defined in the execution parameters. Refer to Table 19-1 for more details on the execution parameters.

    • Ignore Errors must be checked if you do not want the procedure to stop if this command returns an error. If this box is checked, the procedure command will go into "warning" instead of "error", and the procedure will not be stopped.

  5. In the Command on Target tab, fill in the following fields:

    • Technology: Technology used for this command. If it is not set, the technology specified on the Definition tab of the Procedure editor is used.

    • Transaction Isolation: The transaction isolation level for the command.

    • Context: Forced Context for the execution. If it is left undefined, the execution context will be used. You can leave it undefined to ensure the portability of the code in any context.

    • Schema: Logical schema for execution of the command.

    • Transaction: Transaction where the command will be executed.

    • Commit: Indicates the commit mode of the command in the transaction.

    • Command: Text of the command to execute. You may call the expression Editor by clicking Launch the Expression Editor.

    The command must be entered in a language appropriate for the selected technology. Refer to Writing Code in Procedures for more information.

    It is advised to use the substitution methods to make the code generic and dependent on the topology information. Refer to Using the Substitution API.

    The Transaction and Commit options allow you to run commands within transactions. Refer to Handling RDBMS Transactions for more information.

    Note:

    The transaction, commit and transaction isolation options work only for technologies supporting transactions.

    Most of the procedures use only commands on the target. In some cases, it is required to read data and perform actions using this data. For these use case, you specify the command to read the data in the Command on Source tab and the actions performed with this data in the Command on Target tab. Refer to "Binding Source and Target Data" for more information. Skip step 6 if you are not in this use case.

  6. For Multi-Connections Procedures, repeat step 5 for the Command on Source tab.

  7. In the Options section, check the Always Execute box if you want this command to be executed all the time regardless of the option values. Otherwise, check the options of type boolean that control the command execution. At run-time, if any of the selected options is set to Yes, the command is executed.

  8. From the File menu, click Save.

To duplicate a command:

  1. Go to the Details tab of the Procedure.

  2. Select the command to duplicate.

  3. Right-click then select Duplicate. The Command Line Editor opens. It is a copy of the selected command.

  4. Make the necessary modifications and from the File menu, click Save.

The new command is listed on the Details tab.

To delete a command line:

  1. Go to the Details tab of the Procedure.

  2. Select the command line to delete.

  3. From the Editor toolbar, click Delete.

The command line will disappear from the list.

To order the command lines:

The command lines are executed in the order displayed in the Details tab of the Procedure Editor. It may be necessary to reorder them.

  1. Go to the Details tab of the Procedure.

  2. Click on the command line you wish to move.

  3. From the Editor toolbar, click the arrows to move the command line to the appropriate position.

Writing Code in Procedures

Commands within a procedure can be written in several languages. These include:

  • SQL: or any language supported by the targeted RDBMS such as PL/SQL, Transact SQL etc. Usually these commands can contain Data Manipulation Language (DML) or Data Description Language (DDL) statements. Using SELECT statements or stored procedures that return a result set is subject to some restrictions. To write a SQL command, you need to select:

    • A valid RDBMS technology that supports your SQL statement, such as Teradata or Oracle etc.

    • A logical schema that indicates where it should be executed. At runtime, this logical schema will be converted to the physical data server location selected to execute this statement.

    • Additional information for transaction handling as described further in section Handling RDBMS Transactions.

  • Operating System Commands: Useful when you want to run an external program. In this case, your command should be the same as if you wanted to execute it from the command interpreter of the operating system of the Agent in charge of the execution. When doing so, your objects become dependent on the platform on which the agent is running. To write an operating system command, select “Operating System” from the list of technologies of you current step. It is recommended to use for these kind of operations the OdiOSCommand tool as this tool prevents you from calling and setting the OS command interpreter.

  • ODI Tools: ODI offers a broad range of built-in tools that you can use in procedures to perform some specific tasks. These tools include functions for file manipulation, email alerts, event handling, etc. They are described in detail in the online documentation. To use an ODI Tool, select ODITools from the list of technologies of your current step.

  • Scripting Language: You can write a command in any scripting language supported by Oracle Data Integrator. By default, ODI includes support for the following scripting languages that you can access from the technology list box of the current step: Jython, JavaScript, NetRexx and Java BeanShell.

Using the Substitution API

It is recommended that you use the ODI substitution API when writing commands in a procedure to keep it independent of the context of execution. You can refer to the online documentation for information about this API. Common uses of the substitution API are given below:

  • Use getObjectName()to obtain the qualified name of an object in the current logical schema regardless of the execution context, rather than hard coding it.

  • Use getInfo() to obtain general information such as driver, URL, user etc. about the current step

  • Use getSession() to obtain information about the current session

  • Use getOption() to retrieve the value of a particular option of your procedure

  • Use getUser() to obtain information about the ODI user executing your procedure.

Handling RDBMS Transactions

Oracle Data Integrator procedures include an advanced mechanism for transaction handling across multiple steps or even multiple procedures. Transaction handling applies only for RDBMS steps and often depends on the transaction capabilities of the underlying database. Within procedures, you can define for example a set of steps that would be committed or rolled back in case of an error. You can also define up to 10 (from 0 to 9) independent sets of transactions for your steps on the same server. Using transaction handling is of course recommended when your underlying database supports transactions. Note that each transaction opens a connection to the database.

However, use caution when using this mechanism as it can lead to deadlocks across sessions in a parallel environment.

Binding Source and Target Data

Data binding in Oracle Data Integrator is a mechanism in procedures that allows performing an action for every row returned by a SQL SELECT statement.

To bind source and target data:

  1. Open the Command Line Editor.

  2. In the Command on Source tab, specify the SELECT statement.

  3. In the Command on Target tab, specify the action code. The action code can itself be an INSERT, UPDATE or DELETE SQL statement or any other code such as an ODI Tool call or Jython. Refer to Appendix A, "Oracle Data Integrator Tools Reference" for details about the ODI Tools syntax.

The values returned by the source result set can be referred to in the action code using the column names returned by the SELECT statement. They should be prefixed by colons “:” whenever used in a target INSERT, UPDATE or DELETE SQL statement and will act as “bind variables”. If the target statement is not a DML statement, then they should be prefixed by a hash “#” sign and will act as substituted variables. Note also that if the resultset of the Source tab is passed to the Target tab using a hash "#" sign, the target command is executed as many times as there are values returned from the Source tab command.

The following examples give you common uses for this mechanism. There are, of course, many other applications for this powerful mechanism.

Example 12-1 Loading Data from a Remote SQL Database

Suppose you want to insert data into the Teradata PARTS table from an Oracle PRODUCT table. Table 12-1 gives details on how to implement this in a procedure step.

Table 12-1 Procedure Details for Loading Data from a Remote SQL Database

Source Technology

Oracle

Source Logical Schema

ORACLE_INVENTORY

Source Command

select PRD_ID       MY_PRODUCT_ID,
       PRD_NAME     PRODUCT_NAME,
from   <%=odiRef.getObjectName("L","PRODUCT","D")%>

Target Technology

Teradata

Target Logical Schema

TERADATA_DWH

Target Command

insert into PARTS
(PART_ID, PART_ORIGIN, PART_NAME)
values
(:MY_PRODUCT_ID, 'Oracle Inventory',
:PRODUCT_NAME)

ODI will implicitly loop over every record returned by the SELECT statement and bind its values to “:MY_PRODUCT_ID” and “:PRODUCT_NAME” bind variables. It then triggers the INSERT statement with these values after performing the appropriate data type translations.

When batch update and array fetch are supported by the target and source technologies respectively, ODI prepares arrays in memory for every batch, making the overall transaction more efficient.

Note:

This mechanism is known to be far less efficient than a fast or multi load in the target table. You should only consider it for very small volumes of data.

The section Using the Agent in the Loading Strategies further discusses this mechanism.

Example 12-2 Sending Multiple Emails

Suppose you have a table that contains information about all the people that need to be warned by email in case of a problem during the loading of your Data Warehouse. You can do it using a single procedure step as described in Table 12-2.

Table 12-2 Procedure Details for Sending Multiple Emails

Source Technology

Oracle

Source Logical Schema

ORACLE_DWH_ADMIN

Source Command

Select FirstName FNAME, EMailaddress EMAIL 
From <%=odiRef.getObjectName("L","Operators","D")%>
Where RequireWarning = 'Yes'

Target Technology

ODITools

Target Logical Schema

None

Target Command

OdiSendMail -MAILHOST=my.smtp.com -FROM=admin@mycompany.com “-TO=#EMAIL” “-SUBJECT=Job Failure”
Dear #FNAME,
I'm afraid you'll have to take a look at ODI Operator, because session <%=snpRef.getSession(“SESS_NO”)%> has just failed!
-Admin

The “–TO” parameter will be substituted by the value coming from the “Email” column of your source SELECT statement. The “OdiSendMail” command will therefore be triggered for every operator registered in the “Operators” table.

12.1.3 Using Procedures

A procedure can be used in the following ways:

12.1.3.1 Executing the Procedure

To run a procedure:

  1. In the Project view of the Designer Navigator, select the procedure you want to execute.

  2. Right-click and select Execute.

  3. In the Execution dialog, set the execution parameters. Refer to Table 19-1 for more information.

  4. Click OK.

  5. The Session Started Window appears.

  6. Click OK.

Note:

During this execution the Procedure uses the option values set on the Options tab of the Procedure editor.

12.1.3.2 Using a Procedure in a Package

Procedures can be used as package steps. Refer to the Section 10.3.1.2, "Executing a Procedure" for more information on how to execute a procedure in a package step. Note that if you use a procedure in a package step, the procedure is not a copy of the procedure you created but a link to it. If this procedure is modified outside of the package, the package using the procedure will be changed, too.

Note:

If you don't want to use the option values set on the Options tab of the Procedure, set the new options values directly in the Options tab of the Procedure step.

12.1.3.3 Generating a Scenario for a Procedure

It is possible to generate a scenario to run a procedure in production environment, or to schedule its execution without having to create a package using this procedure. The generated scenario will be a scenario with a single step running this procedure. How to generate a scenario for a procedure is covered in Section 13.2, "Generating a Scenario".

12.1.4 Encrypting and Decrypting Procedures

Encrypting a Knowledge Module (KM) or a procedure allows you to protect valuable code. An encrypted KM or procedure can neither be read nor modified if it is not decrypted. The commands generated in the log by an Encrypted KM or procedure are also unreadable.

Oracle Data Integrator uses a DES Encryption algorithm based on a personal encryption key. This key can be saved in a file and reused to perform encryption or decryption operations.

WARNING:

There is no way to decrypt an encrypted KM or procedure without the encryption key. It is therefore strongly advised to keep this key in a safe location. It is also advised to use a unique key for all the developments.

12.1.4.1 Encrypting a KM or Procedure

To encrypt a KM or a Procedure:

  1. Right-click the KM or procedure you wish to encrypt.

  2. Select Encrypt.

  3. In the Encryption Option dialog, either:

    • Select the Encrypt with a personal key option and select an existing Encryption Key file

    • Select the Encrypt with a personal key option and then type in (or paste) the string corresponding to your personal key

    • or let Oracle Data Integrator generate a key using the Get a new encryption key option.

  4. The Encryption Key dialog appears when the encryption is finished. From this dialog, you can save the key.

Note that if you type in a personal key with too few characters, an invalid key size error appears. In this case, please type in a longer personal key. A personal key of 10 or more characters is required.

12.1.4.2 Decrypting a KM or Procedure

To decrypt a KM or a procedure:

  1. Right-click the KM or procedure that you wish to decrypt.

  2. Select Decrypt.

  3. In the KM/Procedure Decryption dialog, either:

    • Select an existing encryption key file

    • or type in (or paste) the string corresponding to your personal key.

A message appears when the decryption has finished.

12.2 Working with Variables

This section provides an introduction to variables and describes how to create and use variables in Oracle Data Integrator.

12.2.1 Introduction to Variables

A variable is an object that stores a single value. This value can be a string, a number or a date. The variable value is stored in Oracle Data Integrator. It can be used in several places in your projects, and its value can be updated at run-time.

Depending on the variable type, a variable can have the following characteristics:

  • It has a default value defined at creation time.

  • Its value can be passed as a parameter when running a scenario using the variable.

  • Its value can be refreshed with the result of a statement executed on one of your data servers. For example, it can retrieve the current date and time from a database.

  • Its value can be set or incremented in package steps.

  • It can be evaluated to create conditions and branches in packages.

  • It can be used in the expressions and code of interfaces, procedures, steps,...

Variables can be used in any expression (SQL or others), as well as within the metadata of the repository. A variable is resolved when the command containing it is executed by the agent or the graphical interface.

A variable can be created as a global variable or in a project. This defines the variable scope. Global variables can be used in all projects, while project variables can only be used within the project in which they are defined.

The variable scope is detailed in Section 12.2.3, "Using Variables".

The following section describes how to create and use variables.

12.2.2 Creating Variables

To create a variable:

  1. In Designer Navigator select the Variables node in a project or the Global Variables node in the Others view.

  2. Right-click and select New Variable. The Variable Editor opens.

  3. Specify the following variable parameters:

    Properties Description
    Name Name of the variable, in the form it will be used. This name should not contain characters that could be interpreted as word separators (blanks, etc.) by the technologies the variable will be used on. Variable names are case-sensitive. That is, "YEAR" and "year" are considered to be two different variables. The variable name is limited to a length of 400 characters.
    Datatype Type of variable:
    • Alphanumeric (255 characters)

    • Date (This format is a Java date format that matches your machine's local parameters. Note that you may need to adapt the format depending to the RDBMS)

    • Numeric (Maximum 10 digits)

    • Text (Unlimited length)

    Action This parameter shows the length of time the value of a variable is kept for:
    • Non-persistent: The value of the variable is kept in memory for a whole session.

    • Last value: Oracle Data Integrator stores in its repository the latest value held by the variable.

    • Historize: Oracle Data Integrator keeps a history of all the values held by this variable. Use this option if you want to keep a history of the variable values or for debugging purposes.

      Note that Historize should be used selectively as the variable history is not automatically purged and grows each time the variable is used. Using history for many variables has an impact on the size of the work repository.

    Default Value The value assigned to the variable by default.
    Description Detailed description of the variable

  4. If you want the variable's value to be set by a query:

    1. Select the Refreshing tab.

    2. Select the logical Schema where the command will be executed, then edit the command text in the language of the schema's technology. You can use the Expression Editor for editing the command text. It is recommended to use Substitution methods such as getObjectName in the syntax of your query expression.

    3. Click Testing query on the DBMS to check the syntax of your expression.

    4. Click Refresh to test the variable by executing the query immediately. If the variable action is set to Historize or Latest Value, you can view the returned value on the History tab of the Variable editor.

  5. From the File menu, click Save.

The variable appears in the Projects or Others tree in Designer Navigator.

Note:

It is advised to use the Expression editor when you refer to variables. By using the Expression editor, you can avoid the most common syntax errors. For example, when selecting a variable in the Expression editor, the variable name will be automatically prefixed with the correct code depending on the variable scope. Refer to Variable scope for more information on how to refer to your variables.

12.2.3 Using Variables

Using Variables is highly recommended to create reusable packages or packages with a complex conditional logic, interfaces and procedures. Variables can be used everywhere within ODI. Their value can be stored persistently in the ODI Repository if their action type is set to “Historize” or “Last Value”. Otherwise, with an action type of “Non-Persistent”, their value will only be kept in the memory of the agent during the execution of the current session.

This section provides an overview of how to use variables in Oracle Data Integrator. Variables can be used in the following cases:

Variable scope

Use the Expression editor to refer to your variables in Packages, integration interfaces, and procedures. When you use the Expression editor the variables are retrieved directly from the repository.

You should only manually prefix variable names with GLOBAL or the PROJECT_CODE, when the Expression editor is not available.

Referring to variable MY_VAR in your objects should be done as follows:

  • #MY_VAR: With this syntax, the variable must be in the same project as the object referring to it. Its value will be substituted. To avoid ambiguity, consider using fully qualified syntax by prefixing the variable name with the project code.

  • #MY_PROJECT_CODE.MY_VAR: Using this syntax allows you to use variables by explicitly stating the project that contains the variable. It prevents ambiguity when 2 variables with the same name exist for example at global and project level. The value of the variable will be substituted at runtime.

  • #GLOBAL.MY_VAR: This syntax allows you to refer to a global variable. Its value will be substituted in your code. Refer to section Global Objects for details.

  • Using “:” instead of “#”: You can use the variable as a SQL bind variable by prefixing it with a colon rather than a hash. However this syntax is subject to restrictions as it only applies to SQL DML statements, not for OS commands or ODI API calls and using the bind variable may result in performance loss. It is advised to use ODI variables prefixed with the '#'character to ensure optimal performance at runtime.

    • When you reference an ODI Variable prefixed with the ':' character, the name of the Variable is NOT substituted when the RDBMS engine determines the execution plan. The variable is substituted when the RDBMS executes the request. This mechanism is called Binding. If using the binding mechanism, it is not necessary to enclose the variables which store strings into delimiters (such as quotes) because the RDBMS is expecting the same type of data as specified by the definition of the column for which the variable is used.

      For example, if you use the variable TOWN_NAME = :GLOBAL.VAR_TOWN_NAME the VARCHAR type is expected.

    • When you reference an ODI variable prefixed with the "#" character, ODI substitutes the name of the variable by the value before the code is executed by the technology. The variable reference needs to be enclosed in single quote characters, for example TOWN = '#GLOBAL.VAR_TOWN'. The call of the variable works for OS commands, SQL, and ODI API calls.

12.2.3.1 Using Variables in Packages

Variables can be used in packages for different purposes:

  • Declaring a variable: When a variable is used in a package (or in certain elements of the topology that are used in the package), it is strongly recommended that you insert a Declare Variable step in the packaget. This step explicitly declares the variable in the package. How to create a Declare Variable step is covered in "Declaring a Variable". Other variables that you explicitly use in your packages for setting, refreshing or evaluating their values do not need to be declared.

  • Refreshing a variable from its SQL SELECT statement: A Refresh Variable step allows you to re-execute the command or query that computes the variable value. How to create a Refresh Variable step is covered in "Refreshing a Variable".

  • Assigning the value of a variable: A Set Variable step of type Assign sets the current value of a variable.

    In Oracle Data Integrator you can assign a value to a variable in the following ways:

    • Retrieving the variable value from a SQL SELECT statement: When creating your variable, define a SQL statement to retrieve its value. For example, you can create a variable NB_OF_OPEN_ORDERS and set its SQL statement to: select COUNT(*) from <%=odiRef.getObjectName("L","ORDERS","D")%> where STATUS = 'OPEN'.

      Then in your package, you will simply drag and drop your variable and select the “Refresh Variable” option in the Properties panel. At runtime, the ODI agent will execute the SQL statement and assign the first returned value of the result set to the variable.

    • Explicitly setting the value in a package: You can also manually assign a value to your variable for the scope of your package. Simply drag and drop your variable into your package and select the “Set Variable” and “Assign” options in the Properties panel as well as the value you want to set.

    • Incrementing the value: Incrementing only applies to variables defined with a numeric data type. Drag and drop your numeric variable into the package and select the “Set Variable” and “Assign” options in the Properties panel as well as the desired increment. Note that the increment value can be positive or negative.

    • Assigning the value at runtime: When you start a scenario generated from a package containing variables, you can set the values of its variables. You can do that in the StartScenario command by specifying the VARIABLE=VALUE list. Refer to the OdiStartScen API command and the section Section 19.3.2, "Executing a Scenario from a Command Line".

      How to create a Assign Variable step is covered in "Setting a Variable".

  • Incrementing a numeric value: A Set Variable step of type Increment increases or decreases a numeric value by the specified amount. How to create a Set Variable step is covered in "Setting a Variable".

  • Evaluating the value for conditional branching: An Evaluate Variable step acts like an IF-ELSE step. It tests the current value of a variable and branches in a package depending on the result of the comparison. For example, you can choose to execute interfaces A and B of your package only if variable EXEC_A_AND_B is set to “YES”, otherwise you would execute interfaces B and C. To do this, you would simply drag and drop the variable in your package diagram, and select the “Evaluate Variable” type in the properties panel. Evaluating variables in a package allows great flexibility in designing reusable, complex workflows. How to create an Evaluate Variable step is covered in Evaluating a Variable.

12.2.3.2 Using Variables in Interfaces

Variables can be used in interfaces in two different ways:

  1. As a value for a textual option of a Knowledge Module.

  2. In all Oracle Data Integrator expressions such as mappings, filters, joins, and constraints.

To substitute the value of the variable into the text of an expression, precede its name by the '#' character. The agent or the graphical interface will substitute the value of the variable in the command before executing it.

The following example shows the use of a global variable named 'YEAR':

Update CLIENT set LASTDATE = sysdate where DATE_YEAR = '#GLOBAL.YEAR' /* DATE_YEAR is CHAR type */
Update CLIENT set LASTDATE = sysdate where DATE_YEAR = #GLOBAL.YEAR /* DATE_YEAR is NUMERIC type */

The "bind variable" mechanism of the SQL language can also be used, however, this is less efficient, because the relational database engine does not know the value of the variable when it constructs the execution plan for the query. To use this mechanism, precede the variable by the ':' character, and make sure that the datatype being searched is compatible with that of the variable. For example:

update CLIENT set LASTDATE = sysdate where DATE_YEAR =:GLOBAL.YEAR

You can drag-and-drop a variable into most expressions with the Expression Editor.

Table 12-3 Examples of how to use Variables in Interfaces

Type Expression

Mapping

'#PRODUCT_PREFIX' || PR.PRODUCT_CODE

Concatenates the current project's product prefix variable with the product code. As the value of the variable is substituted, you need to enclose the variable with single quotes because it returns a string.

Join

CUS.CUST_ID = #DEMO.UID * 1000000 + FF.CUST_NO

Multiply the value of the UID variable of the DEMO project by 1000000 and add the CUST_NO column before joining it with the CUST_ID column.

Filter

ORDERS.QTY between #MIN_QTY and #MAX_QTY

Filter orders according to the MIN_QTY and MAX_QTY thresholds.

Option Value

TEMP_FILE_NAME: #DEMO.FILE_NAME

Use the FILE_NAME variable as the value for the TEMP_FILE_NAME option.


12.2.3.3 Using Variables in Object Properties

It is also possible to use variables as substitution variables in graphical module fields such as resource names or schema names in the topology. You must use the fully qualified name of the variable (Example: #GLOBAL.MYTABLENAME) directly in the Oracle Data Integrator graphical module's field.

Using this method, you can parameterize elements for execution, such as:

  • The physical names of files and tables (Resource field in the datastore) or their location (Physical schema's schema (data) in the topology)

  • Physical Schema

  • Data Server URL

12.2.3.4 Using Variables in Procedures

You can use variables anywhere within your procedures' code as illustrated in the Table 12-4.

Table 12-4 Example of how to use Variables in a Procedure

Step ID: Step Type Step Code Description

1

SQL

Insert into #DWH.LOG_TABLE_NAME

Values (1, 'Loading Step Started', current_date)

Add a row to a log table that has a name only known at runtime

2

Jython

f = open('#DWH.LOG_FILE_NAME', 'w')

f.write('Inserted a row in table %s' % ('#DWH.LOG_TABLE_NAME') )

f.close()

Open file defined by LOG_FILE_NAME variable and write the name of the log table into which we have inserted a row.


You should consider using options rather than variables whenever possible in procedures. Options act like input parameters. Therefore, when executing your procedure in a package you would set your option values to the appropriate values.

In the example of Table 12-4, you would write Step 1's code as follows:

Insert into <%=snpRef.getOption(“LogTableName”)%> 
Values (1, 'Loading Step Started', current_date)

Then, when using your procedure as a package step, you would set the value of option LogTableName to #DWH.LOG_TABLE_NAME.

12.2.3.5 Using Variables within Variables

It is sometimes useful to have variables depend on other variable values as illustrated in Table 12-5.

Table 12-5 Example of how to use a variable within another variable

Variable Name Variable Details Description

STORE_ID

Alphanumeric variable. Passed as a parameter to the scenario

Gives the ID of a store

STORE_NAME

Alphanumeric variable.

SELECT statement:

Select name
From <%=odiRef.getObjectName("L","STORES","D")%>
Where id='#DWH.STORE_ID'||'#DWH.STORE_CODE'

The name of the current store is derived from the Stores table filtered by the value returned by the concatenation of the STORE_ID and STORE_CODE variables.


In Table 12-5, you would build your package as follows:

  1. Drag and drop the STORE_ID variable to declare it. This would allow you to pass it to your scenario at runtime.

  2. Drag and drop the STORE_NAME variable to refresh its value. When executing this step, the agent will run the select query with the appropriate STORE_ID value. It will therefore retrieve the corresponding STORE_NAME value.

  3. Drag and drop the other interfaces or procedures that use any of these variables.

12.2.3.6 Using Variables in the Resource Name of a Datastore

You may face some situations where the names of your source or target datastores are dynamic. A typical example of this is when you need to load flat files into your Data Warehouse with a file name composed of a prefix and a dynamic suffix such as the current date. For example the order file for March 26 would be named ORD2009.03.26.dat.

Note that you can only use variables in the resource name of a datastore in a scenario when the variable has been previously declared.

To develop your loading interfaces, you would follow these steps:

  1. Create the FILE_SUFFIX variable in your DWH project and set its SQL SELECT statement to select current_date (or any appropriate date transformation to match the actual file suffix format)

  2. Define your ORDERS file datastore in your model and set its resource name to: ORD#DWH.FILE_SUFFIX.dat.

  3. Use your file datastore normally in your interfaces.

  4. Design a package as follows:

    1. Drag and drop the FILE_SUFFIX variable to refresh it.

    2. Drag and drop all interfaces that use the ORDERS datastore.

At runtime, the source file name will be substituted to the appropriate value.

Note:

The variable in the datastore resource name must be fully qualified with its project code.

When using this mechanism, it is not possible to view the data of your datastore from within Designer.

12.2.3.7 Using Variables in a Server URL

There are some cases where using contexts for different locations is less appropriate than using variables in the URL definition of your data servers. For example, when the number of sources is high (> 100), or when the topology is defined externally in a separate table. In these cases, you can refer to a variable in the URL of a server's definition.

Suppose you want to load your warehouse from 250 source applications - hosted in Oracle databases - used within your stores. Of course, one way to do it would be to define one context for every store. However, doing so would lead to a complex topology that would be difficult to maintain. Alternatively, you could define a table that references all the physical information to connect to your stores and use a variable in the URL of your data server's definition. Example 12-3 illustrates how you would implement this in Oracle Data Integrator:

Example 12-3 Referring to a Variable in the URL of a Server's Definition

  1. Create a StoresLocation table as follows:

    Table 12-6 Stores Location table

    StoreID Store Name Store URL IsActive

    1

    Denver

    10.21.32.198:1521:ORA1

    YES

    2

    San Francisco

    10.21.34.119:1525:SANF

    NO

    3

    New York

    10.21.34.11:1521:NY

    YES

    ...

    ...

    ...

    ...


  2. Create three variables in your EDW project:

    • STORE_ID: takes the current store ID as an input parameter

    • STORE_URL: refreshes the current URL for the current store ID with SELECT statement: select StoreUrl from StoresLocation where StoreId = #EDW.STORE_ID

    • STORE_ACTIVE: refreshes the current activity indicator for the current store ID with SELECT statement: select IsActive from StoresLocation where StoreId = #EDW.STORE_ID

  3. Define one physical data server for all your stores and set its JDBC URL to:

    jdbc:oracle:thin:@#EDW.STORE_URL

  4. Define your package for loading data from your store.

    The input variable STORE_ID will be used to refresh the values for STORE_URL and STORE_ACTIVE variables from the StoresLocation table. If STORE_ACTIVE is set to “YES”, then the next 3 steps will be triggered. The interfaces refer to source datastores that the agent will locate according to the value of the STORE_URL variable.

    To start such a scenario on Unix for the New York store, you would issue the following operating system command:

    startscen.sh LOAD_STORE 1 PRODUCTION “EDW.STORE_ID=3”

    If you want to trigger your LOAD_STORE scenario for all your stores in parallel, you would simply need to create a procedure with a single SELECT/action command as follows:

    Table 12-7 SELECT/action command

    Source Technology

    Oracle (technology of the data server containing the StoresLocation table).

    Source Logical Schema

    Logical schema containing the StoresLocation table.

    Source Command

    Select StoreId
    From StoresLocation
    

    Target Technology

    ODITools

    Target Logical Schema

    None

    Target Command

    SnpsStartScen “-SCEN_NAME=LOAD_STORE” “-SCEN_VERSION=1” “-SYNC_MODE=2” “-EDW.STORE_ID=#StoreId”
    

The LOAD_STORE scenario will then be executed for every store with the appropriate STORE_ID value. The corresponding URL will be set accordingly.

Refer to "Binding Source and Target Data" and Section 4.3, "Managing Agents" for further details.

12.2.3.8 Passing a Variable to a Scenario

It is also possible to pass a variable to a scenario in order to customize its behavior. To do this, pass the name of the variable and its value on the OS command line which executes the scenario. For more information, see Section 19.3.2, "Executing a Scenario from a Command Line".

12.2.3.9 Generating a Scenario for a Variable

It is possible to generate a single step scenario for running a variable.

How to generate a scenario for a variable is covered in Section 13.2, "Generating a Scenario".

12.3 Working with Sequences

This section provides an introduction to sequences and describes how to create and use sequences in Oracle Data Integrator.

12.3.1 Introduction to Sequences

A Sequence is a variable that increments itself automatically each time it is used. Between two uses, the value can be stored in the repository or managed within an external RDBMS table. Sequences can be strings, lists, tuples or dictionaries.

Oracle Data Integrator sequences are intended to map native sequences from RDBMS engines, or to simulate sequences when they do not exist in the RDBMS engine. Non-native sequences' values can be stored in the Repository or managed within a cell of an external RDBMS table.

A sequence can be created as a global sequence or in a project. Global sequences are common to all projects, whereas project sequences are only available in the project where they are defined.

Oracle Data Integrator supports three types of sequences:

  • Standard sequences, whose current values ares stored in the Repository.

  • Specific sequences, whose current values are stored in an RDBMS table cell. Oracle Data Integrator reads the value, locks the row (for concurrent updates) and updates the row after the last increment.

  • Native sequence, that maps a RDBMS-managed sequence.

Note the following on standard and specific sequences:

  • Oracle Data Integrator locks the sequence when it is being used for multi-user management, but does not handle the sequence restart points. In other words, the SQL statement ROLLBACK does not return the sequence to its value at the beginning of the transaction.

  • Oracle Data Integrator standard and specific sequences were developed to compensate for their absence on some RDBMS. If native sequences exist, they should be used. This may prove to be faster because it reduces the dialog between the agent and the database.

The following sections describe how to create and use sequences.

12.3.2 Creating Sequences

The procedure for creating sequences vary depending on the sequence type. Refer to the corresponding section:

12.3.2.1 Creating Standard Sequences

To create a standard sequence:

  1. In Designer Navigator select the Sequences node in a project or the Global Sequences node in the Others view.

  2. Right-click and select New Sequence. The Sequence Editor opens.

  3. Enter the sequence Name, then select Standard Sequence.

  4. Enter the Increment.

  5. From the File menu, click Save.

The sequence appears in the Projects or Others tree in Designer Navigator.

12.3.2.2 Creating Specific Sequences

Select this option for storing the sequence value in a table in a given data schema.

To create a specific sequence:

  1. In Designer Navigator select the Sequences node in a project or the Global Sequences node in the Others view.

  2. Right-click and select New Sequence. The Sequence Editor opens.

  3. Enter the sequence Name, then select Specific Sequence.

  4. Enter the Increment value.

  5. Specify the following sequence parameters:

    Schema Logical schema containing the sequences table
    Table Table containing the sequence value
    Column Name of the column containing the sequence value.
    Filter to retrieve a single row Type in a Filter which will allow Oracle Data Integrator to locate a specific row in the table when the sequence table contains more than one row. This filter picks up the SQL syntax of the data server.

    For example: CODE_TAB = '3'

    You can use the Expression Editor to edit the filter. Click Testing query on the DBMS to check the syntax of your expression.


  6. From the File menu, click Save.

The sequence appears in the Projects or Others tree in Designer Navigator.

Note:

When Oracle Data Integrator wants to access the specific sequence value, the query executed on the schema will be SELECT column FROM table WHERE filter.

12.3.2.3 Creating Native Sequences

Select this option if your sequence is implemented in the database engine. Position and increment are fully handled by the database engine.

To create a native sequence:

  1. In Designer Navigator select the Sequences node in a project or the Global Sequences node in the Others view.

  2. Right-click and select New Sequence. The Sequence Editor opens.

  3. Enter the sequence Name, then select Native Sequence.

  4. Select the logical Schema containing your native sequence.

  5. Type in the Native Sequence Name or click the browse button to select a sequence from the list pulled from the data server.

  6. If you clicked the Browse button, in the Native Sequence Choice dialog, select a Context to display the list of sequences in this context for your logical schema.

  7. Select one of these sequences and click OK.

  8. From the File menu, click Save.

The sequence appears in the Projects or Others tree in Designer Navigator.

12.3.3 Using Sequences and Identity Columns

In order to increment sequences, the data needs to be processed row-by-row by the agent. Therefore, using sequences is not recommended when dealing with large numbers of records. In this case, you would use database-specific sequences such as identity columns in Teradata, IBM DB2, Microsoft SQL Server or sequences in Oracle.

The sequences can be used in all Oracle Data Integrator expressions, such as in:

  • Mappings,

  • Filters,

  • Joins,

  • Constraints,

  • ...

Sequences can be used either as:

  • A substituted value, using the #<SEQUENCE_NAME>_NEXTVAL syntax

  • A bind variable in SQL statements, using the :<SEQUENCE_NAME>_NEXTVAL syntax

Using a sequence as a substituted value

A sequence can be used in all statements with the following syntax: #<SEQUENCE_NAME>_NEXTVAL

With this syntax, the sequence value is incremented only once before the command is run and then substituted by its valued into the text of the command. The sequence value is the same for all records.

Using a sequence as a bind variable

Only for SQL statements on a target command of a KM or procedure, sequences can be used with the following syntax: :<SEQUENCE_NAME>_NEXTVAL

With this syntax, the sequence value is incremented, then passed as a bind variable of the target SQL command. The sequence value is incremented in each record processed by the command. The behavior differs depending on the sequence type:

  • Native sequences are always incremented for each processed record.

  • Standard and specific sequences are resolved by the run-time agent and are incremented only when records pass through the agent. The command in a KM or procedure that uses such a sequence must use a SELECT statement on the source command and an INSERT or UPDATE statement on the target command rather than a single INSERT/UPDATE... SELECT in the target command.

For example:

  • In the SQL statement insert into fac select :NO_FAC_NEXTVAL, date_fac, mnt_fac the value of a standard or specific sequence will be incremented only once, even if the SQL statement processes 10,000 rows, because the agent does not process each record, but just sends the command to the database engine. A native sequence will be incremented for each row.

  • To increment the value of a standard or specific sequence for each row, the data must pass through the agent. To do this, use a KM or procedure that performs a SELECT on the source command and an INSERT on the target command:

    SELECT date_fac, mnt_fac /* on the source connection */
    
    INSERT into FAC (ORDER_NO, ORDER_DAT, ORDER_AMNT) values (:NO_FAC_NEXTVAL, :date_fac, :mnt_fac) /* on the target connection */
    

Sequence Scope

Unlike for variables, you do not need to state the scope of sequences explicitly in code.

12.3.3.1 Tips for Using Standard and Specific Sequences

To make sure that a sequence is updated for each row inserted into a table, each row must be processed by the Agent. To make this happen, follow the steps below:

  1. Make the mapping containing the sequence be executed on the target.

  2. Set the mapping to be active for inserts only. Updates are not supported for sequences.

  3. If you are using an "incremental update" IKM, you should make sure that the update key in use does not contain a column populated with the sequence. For example, if the sequence is used to load the primary key for a datastore, you should use an alternate key as the update key for the interface.

  4. If using Oracle Data Integrator sequences with bind syntax (:<SEQUENCE_NAME>_NEXTVAL), you must configure the data flow such that the IKM transfers all the data through the agent. You can verify this by checking the generated integration step in Operator. It should have separate INSERT and SELECT commands executed on different connections, rather than a single SELECT...INSERT statement.

Limitations of Sequences

Sequences have the following limitations:

  • A column mapped with a sequence should not be checked for not null.

  • Similarly, static control and flow control cannot be performed on a primary or alternate key that references the sequence.

12.3.3.2 Identity Columns

Certain databases also natively provide identity columns, which are automatically populated with unique, self-incrementing values.

When populating an identity column, you should follow these steps:

  1. The mapping loading the identity column should be blank and inactive. It should not be activated for inserts or updates.

  2. If you are using "incremental update" IKMs, make sure that the update key in use does not contain the identity column. If the identity column is part of the primary key, you should define an alternate key as the update key for the interface.

Limitations of Identity Columns

Identity columns have the following limitations:

  • Not null cannot be checked for an identity column.

  • Static and flow control cannot be performed on a primary or alternate key containing the identity column.

12.4 Working with User Functions

This section provides an introduction to user functions and describes how to create and use user functions in Oracle Data Integrator.

12.4.1 Introduction User Functions

User functions are used for defining customized functions that can be used in interfaces or procedures. It is recommended to use them in your projects when the same complex transformation pattern needs to be assigned to different datastores within different interfaces. User functions improve code sharing and reusability and facilitate the maintenance and the portability of your developments across different target platforms.

User functions are implemented in one or more technologies and can be used anywhere in mappings, joins, filters and conditions. Refer to Section 12.4.3, "Using User Functions".

A function can be created as a global function or in a project. In the first case, it is common to all projects, and in the second, it is attached to the project in which it is defined.

User functions can call other user functions. A user function cannot call itself recursively.

Note:

Aggregate functions are not supported User Functions. The aggregate function code will be created, but the GROUP BY expression will not be generated.

The following sections describe how to create and use user functions.

12.4.2 Creating User Functions

To create a user function:

  1. In Designer Navigator select the User Functions node in a project or the Global User Functions node in the Others view.

  2. Right-click and select New User Function. The User Function Editor opens.

  3. Fill in the following fields:

    • Name: Name of the user function, for example NullValue

    • Group: Group of the user function. If you type a group name that does not exist, a new group will be created with this group name when the function is saved.

    • Syntax: Syntax of the user function that will appear in the expression Editor; The arguments of the function must be specified in this syntax, for example NullValue($(variable), $(default))

  4. From the File menu, click Save.

The function appears in the Projects or Others tree in Designer Navigator. Since it has no implementation, it is unusable.

To create an implementation:

  1. In Designer Navigator double-click the User Function for which you want to create the implementation. The User Function Editor opens.

  2. In the Implementations tab of the User Function Editor, click Add Implementation. The Implementation dialog opens.

  3. In the Implementation syntax field, type the code of the implementation, for example nvl($(variable), $(default))

  4. Check the boxes for the implementation's Linked technologies

  5. Check Automatically include new technologies if you want the new technologies to use this syntax.

  6. Click OK.

  7. From the File menu, click Save.

To change an implementation:

  1. In the Implementations tab of the User Function Editor, select an implementation, then click Edit.

  2. In the Implementations tab of the user function, select an implementation, then click Edit Implementation. The Implementation dialog opens.

  3. Change the Implementation syntax and the Linked technologies of this implementation

  4. Check Automatically include new technologies if you want the new technologies to use this syntax.

  5. Click OK.

  6. From the File menu, click Save.

To remove an implementation:

In the implementations tab of the user function, select an implementation, then click Delete Implementation.

12.4.3 Using User Functions

The user functions can be used in all Oracle Data Integrator expressions:

  • Mappings,

  • Filters,

  • Joins,

  • Constraints,

  • ...

A user function can be used directly by specifying its syntax, for example: NullValue(CITY_NAME, 'No City')

User functions are implemented in one or more technologies. For example, the Oracle nvl(VARIABLE,DEFAULT_VALUE), function - which returns the value of VARIABLE, or DEFAULT_VALUE if VARIABLE is null - has no equivalent in all technologies and must be replaced by the formula:

case when VARIABLE is null
then DEFAULT_VALUE
else VARIABLE
end

With user functions, it is possible to declare a function called NullValue(VARIABLE,DEFAULT_VALUE) and to define two implementations for the syntax above. When executing, depending on the technology on which the order will be executed, the NullValue function will be replaced by one syntax or the other.

The next example illustrates how to implement a user function that would be translated into code for different technologies:

Suppose you want to define a function that, given a date, gives you the name of the month. You want this function to be available for your mappings when executed on Oracle, Teradata or Microsoft SQL Server. Table 12-8 shows how to implement this as a user function.

Table 12-8 User Function Translated into Code for Different Technologies (Example 1)

Function Name

GET_MONTH_NAME

Function Syntax

GET_MONTH_NAME($(date_input))

Description

Retrieves the month name from a date provided as date_input

Implementation for Oracle

Initcap(to_char($(date_input), 'MONTH'))

Implementation for Teradata

case 
        when extract(month from $(date_input)) = 1 then 'January'
        when extract(month from $(date_input)) = 2 then 'February'
        when extract(month from $(date_input)) = 3 then 'March'
        when extract(month from $(date_input)) = 4 then 'April'
        when extract(month from $(date_input)) = 5 then 'May'
        when extract(month from $(date_input)) = 6 then 'June'
        when extract(month from $(date_input)) = 7 then 'July'
        when extract(month from $(date_input)) = 8 then 'August'
        when extract(month from $(date_input)) = 9 then 'September'
        when extract(month from $(date_input)) = 10 then 'October'
        when extract(month from $(date_input)) = 11 then 'November'
        when extract(month from $(date_input)) = 12 then 'December'
end

Implementation for Microsoft SQL

datename(month, $(date_input))


You can now use this function safely in your interfaces for building your mappings, filters and joins. Oracle Data Integrator will generate the appropriate code depending on the execution location of your expression.

Another example of a user function translated into code for different technologies is defining the following mapping:

substring(GET_MONTH_NAME(CUSTOMER.LAST_ORDER_DATE), 1, 3), Oracle Data Integrator will generate code similar to the following, depending on your execution technology:

Table 12-9 User Function Translated into Code for Different Technologies (Example 2)

Implementation for Oracle

substring(Initcap(to_char(CUSTOMER.LAST_ORDER_DATE 'MONTH')) , 1, 3)

Implementation for Teradata

substring(case    when extract(month from CUSTOMER.LAST_ORDER_DATE) = 1 then 'January' when extract(month from CUSTOMER.LAST_ORDER_DATE) = 2 then 'February'        ...end, 1, 3)

Implementation for Microsoft SQL

substring(datename(month, CUSTOMER.LAST_ORDER_DATE) , 1, 3)


A function can be created as a global function or in a project. In the first case, it is common to all projects, and in the second, it is attached to the project in which it is defined.

User functions can call other user functions.