Oracle® Health Sciences Pharmacovigilance Operational Analytics User and Administrator Guide
Release 1.0

E23554-01

4 Extract Transform Load Programs

This chapter contains the following topics:

  • ETL Architecture

  • Executing the ETL Execution Plans

  • Customizing an ETL Execution Plan

  • Creating an ETL Execution Plan

  • Modifying an ETL Execution Plan

  • Scheduling an ETL Execution Plan

  • Customizable User Exits in OPVA ETLs

To load data from the source systems to the data warehouse, OPVA uses Extract Transform and Load (ETL) programs that extract data from the source database, transform it into a standardized format, and load it into the star-schema tables of the data warehouse.

In OPVA, Oracle Argus Safety is the source system for which Oracle provides pre-defined ETL.

4.1 ETL Architecture

Figure 4-1 displays the ETL process delivered with OPVA.

Figure 4-1 The OPVA ETL Architecture


Set up as a recurring job in DAC, the Extraction, Transformation, and Load process (ETL) is designed to periodically capture targeted metrics (dimension and fact data) from multiple Safety databases, transform and organize them for efficient query, and populate the star-schema tables.

While the OPVA data model supports data extraction from multiple sources, OPVA only includes source-dependent extract (SDE) mappings for the Oracle Argus Safety database. However, you can also define SDE mappings from additional external sources that write to the appropriate staging tables. Note that you are responsible for resolving any duplicate records that may be created as a consequence.

The SDE programs map the transactional data to staging tables, in which the data must conform to a standardized format, effectively merging the data from multiple, disparate database sources. This is the architectural feature that accommodates external database sourcing.

The staged data is then transformed by the source-independent loads (SILs) into the star-schema tables, where it is organized for efficient query by the Oracle BI Server.

There is one SDE mapping for each target table; it extracts data from the source system and loads it into the corresponding staging table.

There is one SIL mapping for each target table; it extracts the normalized data from the staging table and inserts it into the data warehouse star-schema target table. A minimal sketch of this two-stage flow follows.
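
The following Oracle SQL sketch illustrates the SDE-to-SIL pattern described above. The table and column names (argus_case, w_case_ds, w_case_f) are assumptions made for the example, not the shipped OPVA schema.

    -- SDE (illustrative): extract source rows into a staging table,
    -- stamping each row with its data source identifier.
    INSERT INTO w_case_ds (case_id, case_version, datasource_num_id)
    SELECT c.case_id, c.version_num, 1           -- 1 = Oracle Argus Safety
    FROM   argus_case c;

    -- SIL (illustrative): merge the conformed staging rows into the
    -- star-schema target so that reloads update rather than duplicate.
    MERGE INTO w_case_f tgt
    USING w_case_ds stg
    ON (tgt.case_id = stg.case_id
        AND tgt.datasource_num_id = stg.datasource_num_id)
    WHEN MATCHED THEN
      UPDATE SET tgt.case_version = stg.case_version
    WHEN NOT MATCHED THEN
      INSERT (case_id, case_version, datasource_num_id)
      VALUES (stg.case_id, stg.case_version, stg.datasource_num_id);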

4.1.1 Oracle Health Sciences Pharmacovigilance Operational Analytics Hierarchy

Figure 4-2 displays the OPVA hierarchy:

Figure 4-2 OPVA Hierarchy


Following is the OPVA hierarchy:

  • CONTAINER (OPVA_Datawarehouse) - A single container that holds all objects used for OPVA.

  • EXECUTION PLAN (OPVA Data Warehouse Load) - A data transformation plan defined on subject areas that need to be transformed at certain frequencies. An execution plan is defined based on business requirements for when the data warehouse needs to be loaded. OPVA ships a single execution plan that loads the complete warehouse.

  • SUBJECT AREAS - A logical grouping of tables related to a particular subject or application context. It also includes the tasks that are associated with the tables, as well as the tasks required to load the tables. Subject areas are assigned to execution plans, which can be scheduled for full or incremental loads.

  • TASK GROUPS - This is a group of tasks that should be run in a given order.

  • TASKS - A unit of work for loading one or more tables. A task comprises the following: source and target tables, phase, execution type, truncate properties, and commands for full or incremental loads. Each task corresponds to a single Informatica workflow.

4.2 Executing the ETL Execution Plans

To load data from the source system to the target tables in the data warehouse, run the Execution Plan packaged with OPVA. Perform the following tasks in DAC:

  1. Navigate to the Execute view.

  2. Select the OPVA Data Warehouse Load execution plan.

  3. Set the parameter values under the Parameter tab.

  4. Build the execution plan.

  5. Click Run.

    Note:

    The p_last_extract_date parameter provides information about the last time the ETL was performed.
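
    For example, an SDE mapping can use this parameter as an incremental extraction filter, so that only records changed since the previous ETL run are read from the source. The following sketch uses illustrative names; neither the table nor the date format is taken from the shipped mappings:

      -- Illustrative incremental filter; argus_case and last_modified_date
      -- are assumed names, not the shipped mapping.
      SELECT c.*
      FROM   argus_case c
      WHERE  c.last_modified_date >
             TO_DATE(:p_last_extract_date, 'MM/DD/YYYY HH24:MI:SS');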

4.3 Customizing an ETL Execution Plan

When you customize an ETL Execution Plan, it is your responsibility to maintain version control over changes to ETL mappings.

Oracle recommends that you carefully track the changes you make to Oracle-supplied ETL so that you can re-apply these changes in subsequent releases.

4.4 Creating an ETL Execution Plan

Though OPVA includes ETL Execution Plans for extracting data from Oracle Argus Safety to the OPVA data warehouse, you may want to create your own ETL to extract data from other data sources.

Note:

The value of DATASOURCE_NUM_ID is set to 1 for Oracle Argus Safety. If you want to add your own data sources, set this value to a number greater than 100.

See Also:

To add one or more tables or columns, along with the associated ETL Execution Plans that populate them, perform the following tasks (a sketch of a custom source extract follows these steps):

  1. Create the new source and target table metadata inside Informatica.

  2. In Informatica PowerCenter, create the ETL components (mappings and transformations) for the new program.

  3. Create the required workflow for this mapping.

  4. Connect to DAC and create a new task for this new mapping.

  5. Synchronize the task.

  6. Add the task to subject area.

  7. Build the Execution Plan (OPVA Data Warehouse Load).
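
As a sketch, a custom SDE for a non-Argus source writes to the same staging tables while stamping its rows with a DATASOURCE_NUM_ID greater than 100, per the note above. All other names below are illustrative assumptions:

    -- Illustrative custom SDE; external_safety_case and its columns are
    -- assumed names for an additional source system.
    INSERT INTO w_case_ds (case_id, case_version, datasource_num_id)
    SELECT s.case_ref, s.revision_num, 101    -- custom sources use values > 100
    FROM   external_safety_case s
    WHERE  s.updated_on > :p_last_extract_date;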

4.5 Modifying an ETL Execution Plan

You may also want to modify an existing ETL to meet your reporting requirements.

See Also:

To modify an ETL without any changes to the associated tables or columns, perform the following tasks:

  1. Identify the Execution Plan that needs to be modified in the Informatica repository.

  2. Open and modify the ETLs (transformations and/or workflows).

  3. Test and save the changes in the repository.

  4. Connect to DAC and navigate to the corresponding task.

  5. Right-click the task and synchronize it.

  6. Navigate to the execution plan and run the ETL to verify the changes.

Note:

The ETL Execution Plans that extract data for the warehouse fact tables assume that the dimensions to which each fact is related are up-to-date at the time the fact ETL Execution Plans are executed. This assumption is the basis for certain fact calculations that would provide erroneous results if the assumption were not true.

As shipped, OPVA ETL workflows ensure this interlock by executing the ETL for related dimensions immediately before running the ETL for a fact. This is standard warehouse management practice, but especially important given the interdependence of the dimensions and the fact. The need to execute dimension ETL immediately before corresponding fact ETL, and the danger of not doing it, is emphasized here because it is possible (though discouraged) to modify these shipped workflows.

To modify one or more tables or columns without any changes to the associated ETL programs (typically to widen a column):

  1. Change the table properties as needed.

  2. Save the mapping and refresh the workflow.

  3. Connect to DAC, navigate to the corresponding task, and refresh it.

Note:

If the changes to the tables or columns are not compatible with the table that is installed in the data warehouse schema, you will get a warning while making the change. For example, if you are reducing the length of a number column from 15 to 10, the change is not compatible with the data existing in the table.
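
For instance, widening a column is a compatible change, whereas narrowing one is not once the table holds data. A sketch against an illustrative table:

    -- Compatible: widening a numeric column (illustrative names).
    ALTER TABLE w_case_f MODIFY (case_serial_num NUMBER(15));

    -- Incompatible: decreasing precision, for example NUMBER(15) to
    -- NUMBER(10), raises ORA-01440 unless the column is empty.
    ALTER TABLE w_case_f MODIFY (case_serial_num NUMBER(10));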

4.6 Scheduling an ETL Execution Plan

When you submit an Execution Plan for execution in DAC, you can schedule it to execute at regular intervals. To schedule an Execution Plan, perform the following tasks:

  1. Navigate to the Scheduler tab within the Execute view.

  2. Create a new schedule plan.

  3. Enter the required details and click Save.

4.7 Customizable User Exits in OPVA ETLs

The opvaUtilCustomisableValues package can be used to customize some of the attribute values used in OPVA, per your organization's requirements. It is called during ETL execution for the Case Version Fact table in the Retrospective Metrics, and in the source code of the Pending Cases Fact view in the Current Metrics of OPVA.

It is available in the OPVA_SRC schema created on the Argus Safety DB Instance specified during OPVA Installation.

The package contains the following two functions (a sketch of a customized package body follows the list):

  1. fgetDefaultSeriousTgtDayValues

    In a default OPVA installation, this function returns a value of 10 days for a SERIOUS case and 30 days otherwise, for each enterprise present. You can modify it to suit your organization's needs.

    Table 4-1 OPVA Presentation Catalog Attributes Affected - Lock Target Days

    ID        Fact            Column
    --------  --------------  ----------------
    M-CVH01   Case Version    Lock Target Days
    M-PC04    Pending Cases   Lock Target Days


    Note:

    Refer to the Facts in OPVA Presentation Catalog for the other attributes that are dependent on the attributes mentioned in the above table.
  2. fgetDefaultDueSoonValue

    In a default OPVA installation, this function returns a value of 2 days for the attributes listed below, for each enterprise present. You can modify it to suit your organization's needs.

    Table 4-2 OPVA Presentation Catalog Attributes Affected - Due Soon Days

    ID        Fact            Column
    --------  --------------  ---------------------------------------------
    M-PC02    Pending Cases   Earliest Exp Report Due Soon Date
    M-PC06    Pending Cases   Lock Due Soon Date
    M-PC09    Pending Cases   State Due Soon Date - I
    M-PC12    Pending Cases   State Due Soon Date - II
    M-PC16    Pending Cases   # Unlocked Cases w Expedited Report Due Soon
    M-PC19    Pending Cases   # Unlocked Cases Due Soon
    M-PC22    Pending Cases   # Unlocked Cases w State Due Soon - I
    M-PC25    Pending Cases   # Unlocked Cases w State Due Soon - II


    Note:

    Refer to the Facts in OPVA Presentation Catalog for the other attributes that are dependent on the attributes mentioned in the above table.
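
To illustrate, the following is a minimal sketch of what a customized package body might look like. The parameter lists shown here are assumptions; check the actual specification of opvaUtilCustomisableValues in the OPVA_SRC schema before making changes.

    -- Sketch only: the shipped signatures in OPVA_SRC may differ.
    CREATE OR REPLACE PACKAGE BODY opvaUtilCustomisableValues AS

      FUNCTION fgetDefaultSeriousTgtDayValues (
        pSeriousFlag  IN VARCHAR2,  -- assumed parameter: 'Y' for a SERIOUS case
        pEnterpriseId IN NUMBER     -- assumed parameter: enterprise identifier
      ) RETURN NUMBER IS
      BEGIN
        -- Default behavior: 10 days for a SERIOUS case, 30 days otherwise.
        IF pSeriousFlag = 'Y' THEN
          RETURN 10;
        END IF;
        RETURN 30;
      END fgetDefaultSeriousTgtDayValues;

      FUNCTION fgetDefaultDueSoonValue (
        pEnterpriseId IN NUMBER     -- assumed parameter: enterprise identifier
      ) RETURN NUMBER IS
      BEGIN
        RETURN 2;                   -- default "due soon" window of 2 days
      END fgetDefaultDueSoonValue;

    END opvaUtilCustomisableValues;
    /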