Component: LOAD DATA

The following table describes the fields of the Load Data component.

Table 11-10 Fields in the Load Data and their Descriptions

Field Description
Load Mode

Select the load mode from the drop-down list. The options are Table to Table and File to Table.

Table to Table should be selected for Data Mapping definitions such as T2T, T2H, H2T, H2H, and L2H.

File to Table should be selected for Data File Mapping definitions such as F2T and F2H.

Source Name

From the drop-down list, select the source on which the Data Mapping or Data File Mapping definition you want to execute is defined.

Based on the selection of Load Mode, the list displays the corresponding sources.

File Name

Select the Data Mapping or Data File Mapping definition you want to execute, from the drop-down list. Based on the selected Load Mode and Source Name, the list displays the corresponding definitions.
Data File Name

The data filename refers to the .dat file that exists in the database. Specifying Data File Name is mandatory when the Load Mode is File to Table and optional when the Load Mode is Table to Table. If the file name or the .dat file name is incorrect, the task fails during execution.

In case of L2H, you can specify the WebLog name.

Default Value

Used to pass values to the parameters defined in the Load Data definition.

You can pass multiple runtime parameters while defining a batch by specifying the values separated by commas.

For example, $MISDATE=value,$RUNSKEY=value,[DLCY]=value, and so on.

Note the following:

  • The parameters can be specified either with $ or within [ ]. For example, $RUNSKEY=value or [RUNSKEY]=value. When the definition is saved from the UI, no value is assigned to these parameters; they are passed only for a syntax check. Actual values are passed to these parameters while defining an ICC batch or a RUN.
  • The valid default parameters are:
    • RUNID – Data type is String and can be mapped to VARCHAR2.
    • PHID – Data type is String and can be mapped to VARCHAR2.
    • EXEID – Data type is String and can be mapped to VARCHAR2.
    • RUNSK – Data type is Integer and can be mapped to VARCHAR2 or INTEGER.
    • SYSDATE – Data type is Date and can be mapped to DATE or VARCHAR2.
    • TASKID – Data type is String and can be mapped to VARCHAR2.
    • MISDATE – Data type is Date and can be mapped to DATE or VARCHAR2.
    • BATCHRUNID – Data type is String and can be mapped to VARCHAR2.

    Note: RUNID, PHID, EXEID, RUNSK, MISDATE, and BATCHRUNID are implicitly passed through RRF. The rest must be passed explicitly.

  • EXEC_ENV_SOURCE – This parameter is used to replace an external Data Source or Infodom-based Data Source of the T2T, T2H, H2T, or H2H definition during run time, provided the structure of the source in the mapping definition is the same as that of the replacing source. Hence you can convert a T2T definition into H2T, or T2H into H2H, and so on. If the resultant definition is T2T, then T2T execution using the CPP engine is not supported.

    For external Data Source, prefix it with ‘EXT.’ and for Infodom based sources, prefix it with ‘INF.’. For example, [EXEC_ENV_SOURCE]=EXT.<newSourceName>

    or

    [EXEC_ENV_SOURCE]=INF.<newSourceName>

    Additionally, Cluster properties of the current logged-in Infodom will be considered for the execution of the Data Mapping definition.

  • EXEC_ENV_SOURCE_OWNER_INFODOM – This parameter is used to specify the Infodom where the Data Source being replaced (<newSourceName>) was created, in case that Infodom is different from the current Infodom where the batch is executed. If this is not provided, the system looks for the Data Source in the current Infodom, which may result in a failed execution.
  • EXEC_ENV_TARGET – This parameter is used to replace the target Infodom of the T2T, T2H, H2T, or H2H definition during run time, provided the structure of the target in the mapping definition is the same as that of the replacing target. Hence you can convert a T2T definition into T2H, or H2T into H2H, and so on. But if the resultant definition is T2T, then T2T execution using the CPP engine is not supported.

    For example, [EXEC_ENV_TARGET]=newTargetName

    Also, DMT Configurations and Cluster properties of the new target Infodom will be considered for the execution of the Data Mapping definition.

    Note: You can use EXEC_ENV_SOURCE and EXEC_ENV_TARGET together as well. The only limitation is that if the resultant definition is T2T, execution using the CPP engine is not supported.

    Note: If you are converting a mapping definition to T2H using EXEC_ENV_SOURCE/EXEC_ENV_TARGET, there is no provision in the UI to specify the Split By Column or Generic Options. In such scenarios, execution through Sqoop may fail when the Split By Column defaults to a string or date column.

  • EXECUTION_ENGINE_MODE – This parameter is used to execute H2H on Spark. For example, [EXECUTION_ENGINE_MODE]=SPARK
  • CLOSE_SPARK_SESSION – This parameter is used to close the Spark session after executing the last H2H-Spark task in the batch.

    In a batch execution, a new Spark session is created when the first H2H-Spark task is encountered, and the same Spark session is reused for the rest of the H2H-Spark tasks in the same run. For the Spark session to close at the end of the run, set CLOSE_SPARK_SESSION to YES in the last H2H-Spark task in the batch.

    For example, [CLOSE_SPARK_SESSION]=YES

  • SRCHINT – This parameter is used to provide Source Hints. For example, [SRCHINT]=FIRST_ROWS(2)

    Note that the value should not contain /*+ */. Only the content should be given.

  • SRCPRESCRIPT – This parameter is used to provide Source Prescript.

    Note: ALTER keyword is not supported.

  • TARGETHINT – This parameter is used to provide Target Hints. For example, [TARGETHINT]=FIRST_ROWS(2)

    Note that the value should not contain /*+ */. Only the content should be given.

  • TARGETPRESCRIPT – This parameter is used to provide Target Prescript.

    Note: ALTER keyword is not supported.

    Apart from these, L2H/H2H/T2H/H2T/F2H data mappings also support the following additional default parameters. Values for these are implicitly passed from ICC/RRF.

  • $MISDT_YYYY-MM-DD – Data type is String and can be mapped to VARCHAR2. Value will be the MISDATE in 'yyyy-MM-dd' format.
  • $MISYEAR_YYYY – Data type is String and can be mapped to VARCHAR2. Value will be the year value in 'yyyy' format from MISDATE.
  • $MISMONTH_MM – Data type is String and can be mapped to VARCHAR2. Value will be the month value in 'MM' format from MISDATE.
  • $MISDAY_DD – Data type is String and can be mapped to VARCHAR2. Value will be the date value in 'dd' format from MISDATE.
  • $SYSDT_YYYY-MM-DD – Data type is String and can be mapped to VARCHAR2. Value will be the System date in 'yyyy-MM-dd' format.
  • $SYSHOUR_HH24 – Data type is String and can be mapped to VARCHAR2. Value will be the hour value in 'HH24' format from the System date.

Note: The aforementioned parameters are not supported for T2T and F2T.
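As a quick illustration of these date formats (a sketch only, not how the engine itself computes them; the function name is hypothetical), the parameter values can be derived from a MISDATE string and the system date:

```python
from datetime import datetime

def derived_date_params(misdate):
    """Illustrative sketch: derive the additional default parameter
    values from a MISDATE string in yyyymmdd form and the system date."""
    mis = datetime.strptime(misdate, "%Y%m%d")
    now = datetime.now()
    return {
        "MISDT_YYYY-MM-DD": mis.strftime("%Y-%m-%d"),   # 'yyyy-MM-dd'
        "MISYEAR_YYYY": mis.strftime("%Y"),             # 'yyyy'
        "MISMONTH_MM": mis.strftime("%m"),              # 'MM'
        "MISDAY_DD": mis.strftime("%d"),                # 'dd'
        "SYSDT_YYYY-MM-DD": now.strftime("%Y-%m-%d"),   # system date
        "SYSHOUR_HH24": now.strftime("%H"),             # 24-hour value
    }

print(derived_date_params("20240315")["MISDT_YYYY-MM-DD"])  # 2024-03-15
```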

  • Only those variables that start with $ or [ are replaced at run time. The value of such a variable is everything after "=" and before the next comma ",".

    For example, if $DCCY/[DCCY] =’USD’, $RUNSKEY=1, then the replaced value in query for $DCCY will be ‘USD’ and for $RUNSKEY will be 1.

  • If you are using the "RUNSKEY" parameter in an ICC batch, ensure that you specify its actual value instead of specifying $RUNSKEY / [RUNSKEY] (for example, FCT_STANDARD_ACCT_HEAD.N_RUN_SKEY='$RUNSKEY'), because the value of RUNSKEY will not be replaced during runtime.
  • If quotes are included in the parameter value, ensure that you do not use quotes again while defining the expression (or vice versa), to avoid SQL errors. For example, if the parameter is $DCCY='USD' and the expression is defined using '$DCCY' instead of $DCCY, the final value will be ' 'USD' '.
  • When you execute a RUN, the run is always tagged with a RUNSK value (a unique value for each run fired directly from the RRF). You might have a DERIVED COLUMN in your T2T with expression like $RUNSK. If you execute this T2T through a RUN, a unique RUNSK value is passed implicitly to the T2T engine, which then assigns that value wherever $RUNSK is found. But if you try to execute the T2T through ICC, then you need to explicitly pass a $RUNSK as a parameter so that the T2T engine can use it.
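The substitution rules above can be sketched as follows. This is an illustrative parser, not the actual ICC/RRF engine (the function name is an assumption): it splits the Default Value string on commas, accepts both the $NAME and [NAME] forms, and takes the value as everything after "=" up to the next comma.

```python
def parse_default_values(raw):
    """Illustrative sketch of the substitution rules: split the Default
    Value string on commas; each token is $NAME=value or [NAME]=value."""
    params = {}
    for token in raw.split(","):
        name, _, value = token.strip().partition("=")
        name = name.strip()
        if name.startswith("$"):
            name = name[1:]
        elif name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
        else:
            continue  # only $NAME or [NAME] forms are replaced at run time
        params[name] = value.strip()
    return params

print(parse_default_values("$DCCY='USD',$RUNSKEY=1"))
# {'DCCY': "'USD'", 'RUNSKEY': '1'}
```

Note how the quotes in $DCCY='USD' are kept as part of the value, which is why quoting the variable again in an expression produces the doubled quotes described above.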

Two additional parameters are now supported for L2H mappings:

  • [INCREMENTALLOAD] – Specify the value as TRUE/FALSE. If set to TRUE, historically loaded data files will not be loaded again (load history is checked against the combination of definition name, source name, target Infodom, target table name, and file name). If set to FALSE, the execution is similar to a snapshot load, and everything in the source folder/file will be loaded irrespective of load history.
  • [FOLDERNAME] – Value provided will be used to pick up the data folder to be loaded.
    • For an HDFS-based Weblog source: the value will be suffixed to the HDFS File Path specified during source creation.
    • For a Local File System-based Weblog source: by default, the system looks for the execution-date folder (MISDATE: yyyymmdd) under STAGE/<source name>. If you have specified FOLDERNAME for this source, the system ignores the MISDATE folder and looks for the directory provided as [FOLDERNAME].
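The folder resolution for a Local File System based Weblog source can be sketched as below. This is illustrative only; the function name and the stage root path used in the example are assumptions, not the product's actual values.

```python
import os

def resolve_weblog_folder(stage_root, source_name, misdate_yyyymmdd,
                          folder_name=None):
    """Illustrative sketch: FOLDERNAME, when given, overrides the default
    MISDATE (yyyymmdd) folder under STAGE/<source name>."""
    subfolder = folder_name if folder_name else misdate_yyyymmdd
    return os.path.join(stage_root, source_name, subfolder)

# Default behavior: execution-date folder under STAGE/<source name>.
# The "/ftpshare/STAGE" root is a hypothetical path for illustration.
print(resolve_weblog_folder("/ftpshare/STAGE", "WEBLOG_SRC", "20240315"))

# With [FOLDERNAME] specified, the MISDATE folder is ignored.
print(resolve_weblog_folder("/ftpshare/STAGE", "WEBLOG_SRC", "20240315",
                            folder_name="reload_batch1"))
```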