Customizing Intra-ETL for Oracle Communications Data Model

The Oracle Communications Data Model uses workflow implemented using PL/SQL packages to execute the intra-ETL process. The workflow consists of four major components:

  1. Lookup Values

  2. Executing Derived Intra-ETL Programs:

    1. Independent Derived intra-ETL programs - Level 0

    2. First level dependent Derived intra-ETL programs - Level 1

    3. Second level dependent Derived intra-ETL programs - Level 2

  3. Refreshing Aggregate Materialized Views:

    1. Independent Aggregate materialized views - Level 0

    2. First level dependent Aggregate materialized views - Level 1

  4. Refreshing Data mining models

  5. Refreshing OLAP Cubes

Figure 4-2 illustrates the Oracle Communications Data Model intra-ETL workflow.

Figure 4-2 Oracle Communications Data Model Intra-ETL Workflow

Description of Figure 4-2 follows
Description of "Figure 4-2 Oracle Communications Data Model Intra-ETL Workflow"

Handling Lookup Values in Staging

Some Intra-ETLs expect some default values in order to work properly. They are usually associated with codes and stored as numbers but saved as text. For example, a typical status code (STAT_CD) is expected to start with 1 for pre-activated status, 2 with active status, 4 with deactivated status, and 5 with canceled status.

The advantage of defining numbers saved as text is that this allows the addition of custom codes that can be associated with an active state (for example 21, 2199, 21000001, and so on) that will be taken into account without having to change anything in the codes.

But of course, you will need to map the original codes to the text values. When you use an ETL lookup Matrix, as table, this allows the identifier of the source system, source table, the source column and the source code, and the Oracle Communications Data Model target table, column, and code.

You can define a simple function to search and leverage the correct code and map it to a number (stored as text to allow the use of SQL TEXT functions). You might want to add Oracle Communications Data Model Lookup tables with a SHORT_NAME column that would represent the original code of the source system, or the one that the end-users use, for reporting purposes only.

An alternative approach is to change the default value in the Intra-ETLs. Note that as soon as multiple sources map to the same Oracle Communications Data Model table, you will need some similar lookup code unification process.

Executing Derived Intra-ETL Programs

The first workflow component is the Derived intra-ETL programs. This component has three subcomponents that handle the dependency among Derived intra-ETL programs:

  1. Independent Derived intra-ETL programs, the first subcomponent, has Derived intra-ETL programs that get data from foundation layer tables, that is base, lookup, and reference tables.
  2. Derived intra-ETL programs, the second subcomponent, handles parts that depend on the first subcomponent, Independent Derived intra-ETL programs. The second subcomponent intra-ETL programs get data from foundation layer tables, that is base, lookup, and reference tables and also from derived tables that have intra-ETL programs in first subcomponent.
  3. The third subcomponent has Derived intra-ETL programs that depend on the both the first (Independent Derived intra-ETL programs) and the second subcomponents (First level dependent Derived intra-ETL programs). The third subcomponent intra-ETL programs get data from foundation layer tables, that is base, lookup, and reference tables and also from derived tables that have intra-ETL programs in both first and second subcomponents.

Intra-ETL programs in all three subcomponents are implemented using PL/SQL packages. All Intra-ETL packages except two (PKG_DWD_CNT_DAY and PKG_DWD_CUST_DNA) insert data for the ETL period mentioned in DWC_ETL_PARAMETER table for "OCDM-INTRA-ETL" process. The process name for DWD_CNT_DAY table is "DWD_CNT_DAY" and for DWD_CUST_DNA table it is "DWD_CUST_DNA". Modify the ETL period of all three processes according to your data load requirements. If you are trying to load data for ETL period, for which data is already loaded, intra-ETL program first truncates the partitions existing for the ETL period, and then loads data into the target derived table.

Modifying existing or adding new intra-ETLs is a common customization of Oracle Communications Data Model. If new data marts are required or if some existing data warehouse requires modifications, it is usual to either create an Intra-ETL from scratch or to copy an existing Intra-ETL and modify it. In both cases, the new or modified intra-ETL program needs to be added in the Package PKG_INTRA_ETL_PROCESS, and the old one needs to be switched off or commented, at the correct level of dependency.

Be sure to create or modify the target entity as required.

Refreshing Aggregate Materialized Views

This is the second component of the workflow. This component depends on the first component, "Executing Derived Intra-ETL Programs". The execution of this component happens only when the execution of the first component completes successfully. This component has two subcomponents to deal with the dependency among the Aggregate materialized views:

  1. Independent Aggregate materialized views, the first subcomponent, has aggregate materialized views that do not depend on any other aggregate materialized views and most of them get data from derived tables and reference tables. Whereas few materialized views get data from foundation layer tables and derived tables.
  2. First level dependent Aggregate materialized views, the second component, has aggregate materialized views that depend on the first subcomponent, Independent Aggregate materialized views. The aggregate materialized views in this subcomponent get data from aggregate materialized views in first the subcomponent

Modifications or additions in this layer follow the same principle as the ones in the derived layer.

Refreshing Data mining models

This is the third component of the workflow. This component depends on the first component, "Executing Derived Intra-ETL Programs". The execution of this component happens only when the execution of the first component completes successfully. This component refreshes data mining models based on the training day and apply day specified in ETL parameter table, DWC_ETL_PARAMETER table for BUILD-MINING-MODELS process.

The creation of new mining models or the adaptation of existing mining models for a specific business need could be seen as typical "customization" (or configuration in case of existing models). One should follow the standard mining process as described in the specific documentation for the Advanced Analytics option of the database (because data mining is a process as such, before being automated as part of the customized Intra-ETL processes).

Additional models, once finalized, should be seen as normal personalization of Oracle Communications Data Model to one's business.

As your model changes over time any customized models need to be reviewed and fine tuned to conform and provide useful information with the new data available, as part of the standard reprocessing of the mining models on a regular basis. It is usual to re-run and fine tune any given mining model at least every quarter, to make sure the current mining model takes into account any new trends from available data.

The mining intra-ETLs should be customized to correspondingly consider any model data additions or changes.

Refreshing OLAP Cubes

This is the fourth component of the workflow. This component depends on the second component, "Refreshing Aggregate Materialized Views", which in turn depends on the first component, "Executing Derived Intra-ETL Programs". The execution of this component happens only when the execution of the second component completes successfully. This component refreshes data in OLAP cubes and dimensions based on the parameters given in DWC_OLAP_ETL_PARAMETER table.

Similarly to data mining and aggregate customization, when you add a cube or change an existing cube is part of typical customization of Oracle Communications Data Model, your refresh must follows the same process as a cube creation or modification as described in the Oracle OLAP User's Guide. The Intra-ETL that fills the cubes should be correspondingly modified.

Executing Intra-ETL Workflow

Oracle Communications Data Model intra-ETL workflow is implemented using a PL/SQL package, PKG_INTRA_ETL_PROCESS. Each component and their subcomponents of intra-ETL workflow have one procedure each. All these procedures are private to the package. The package has only one public procedure, which invokes all the private procedures as depicted in Figure 4-2. Before executing the workflow, ensure that you set all ETL parameters in DWC_OLAP_PARAMETER and DWC_OLAP_ETL_PARAMETER tables. Invoking PKG_INTRA_ETL_PROCESS.RUN procedure starts the workflow execution.