Oracle® Business Intelligence Applications Configuration Guide for Informatica PowerCenter Users
Release 7.9.6.4

Part Number E35272-01

19 Customizing ETL Mappings for Oracle BI Applications

This chapter provides a high-level overview of some of the common concepts and tasks required to customize Extract, Transform, and Load (ETL) mappings for Oracle Business Intelligence Applications.

Customizing Oracle BI Applications is a multifaceted process. Before implementing customizations, you should have a thorough understanding of the following:

This chapter includes the following main topics:

19.1 Overview of ETL Customization for Oracle BI Applications

In Oracle BI Applications, ETL customization is defined as changing the preconfigured ETL logic to enable you to analyze information in new ways in your business intelligence dashboards.

This chapter explains how to customize ETL functionality after you have performed a business analysis and technical analysis. It does not cover the following tasks that you also need to perform:

19.2 Before You Begin ETL Customization for Oracle BI Applications

This section explains important concepts that you need to know before beginning to customize ETL for Oracle BI Applications. This section contains the following topics:

19.2.1 Upgrade Considerations

One of the most difficult aspects about working with customizations is handling the customizations at the time of an upgrade. Informatica does not provide a 'diff-merge' capability that would automatically detect changes introduced by customers and add them into upgraded mappings. Therefore, customizations must be reapplied manually to upgraded mappings. Oracle BI Applications attempts to minimize the amount of effort required to reapply customizations after an upgrade. Following the customization standards described in this chapter should help to reduce the effort required at the time of upgrade.

19.2.1.1 How Upgrade Impacts Mappings

When upgrading, you will deploy customized mappings on an individual basis. By encapsulating the logic as much as possible, any changes made to the preconfigured logic can be delivered as part of a patch release or an upgrade without impacting any extension logic, as shown in the following figure.

Figure 19-1 Encapsulating Logic

This screenshot or diagram is described in surrounding text.

If there is a change to an exposed object, the new logic will always take precedence over the extension logic. However, rather than losing all of the extensions, much of the extension logic is retained and only has to be reapplied to the exposed objects. For example, if you add an additional column from the source and load it into the target, then during an upgrade, the upgraded mapping brings the additional column from the source and loads it into the target.

Figure 19-2 Encapsulated Logic and Extension Logic

This screenshot or diagram is described in surrounding text.

The source and target are completely replaced so any extensions to these are lost in Informatica (note that the columns will still exist in the database). However, the extension logic itself still exists after the upgrade. The source and target must be re-extended and then reconnected to the extension logic.

Figure 19-3 Re-extending and Reconnecting to Extension Logic

This screenshot or diagram is described in surrounding text.

If you extend a mapping and the mapping does not change during an upgrade, all extensions are retained. If changes to the encapsulated logic occur, all extensions are also retained. However, if exposed objects change during the upgrade, extensions to these objects are lost, but the underlying extension logic is retained. Extensions to exposed objects must be manually reapplied.

19.2.1.1.1 Points to Remember

Note the following points:

  • Encapsulated objects must never be customized unless directed by Oracle. Encapsulated objects are usually mapplets and reusable transformations.

  • Exposed objects can be extended but must never be otherwise modified. Exposed objects might be completely replaced at upgrade.

  • Custom objects are never changed during an upgrade.

  • To minimize the work required for upgrading, try to minimize the number of changes to exposed objects by using custom objects. For example, rather than adding a table to the Source Qualifier to bring in a column from a related table, add a lookup to that table in the mapping.

  • In customizing objects, you must evaluate the options and determine the best approach for your environment. If you find the custom object approach allows the ETL to run in an acceptable amount of time, then this is the preferred approach. If the custom object causes the ETL process to take too long, you might want to consider incorporating the extension into an exposed object.

  • For Siebel sources, when you add custom columns to the Oracle Business Analytics Data Warehouse, you do not need to manually add change capture views. DAC automatically creates change capture views for all columns (including new columns) at runtime.

Note:

Most SDE adapter folders use the concept of Business Component mapplets. These are extract mapplets that might contain relational, application, or flat file sources. The Siebel adapter folders do not use Business Component mapplets; the sources are exposed directly in the mapping. Usually, the Business Component mapplet can be treated as an exposed object and is the only mapplet object that should be modified.

19.2.2 Using Custom Folders in the Oracle Business Analytics Warehouse

If you want to make changes to the Oracle Business Analytics Warehouse, you must create a custom folder and make the changes in it. Do not change objects in any of the shipped folders unless explicitly directed by Oracle. The shipped folders and the objects within them might be overwritten in future upgrades.

The deployed Informatica Repository does not include any custom folders; you must create your own. You should create a custom folder for each SDE folder you have deployed that will have customizations. These folders hold the extract mappings to various sources. You should also create a separate custom folder for customizations you want to make to the SILOS folder. Do not store customized extract and load mappings in the same folder.

The easiest way to modify an object is to copy an existing object from the shipped folder into the corresponding custom folder and to reuse existing business components, source and target definitions, transformations, mapplets, and mappings.

Note:

If source tables are extended, then the source tables require manual editing in Informatica PowerCenter Designer. Do not import the tables from the database into the repository, because it changes the source table definitions for the entire Oracle Business Analytics Warehouse.

When importing new tables from the database into the custom folder, use the Oracle Business Analytics Warehouse and transactional database ODBC database connections (using the ODBC drivers supplied by your database vendor) to connect to the source and target databases.

After importing new table definitions, change the database type to Oracle in the Informatica Repository, regardless of the database platform you are using. This setting does not affect which relational database you actually use. This step is very important because, in Informatica, if the database types of the source tables are not identical, all mappings and workflows that refer to those source tables become invalid.

19.2.3 Creating Custom Informatica Workflows

You must create custom workflows for all customized mappings. The general requirements for creating custom workflows are as follows:

  • Create each workflow so that it loads only one table. This helps to integrate the workflow with DAC.

  • The workflow name should match a session name that is used inside the workflow. This helps DAC to collect certain statistics.

  • The flag 'Fail parent if this task fails' should be checked for all the sessions within a workflow.

  • The flag 'Fail parent if this task does not run' should be checked for all the sessions within a workflow.

  • The Stop on Errors parameter should be set to 1 for all sessions within a workflow. This parameter is located in the Error Handling area of the Config Object tab in Informatica PowerCenter Designer.

  • Set the appropriate source and target connection values in Informatica PowerCenter Designer.

  • If the workflow is going to be used for the full load command in DAC, you can choose to load in bulk mode (applicable only to Oracle and DB2-UDB databases). To do so, in the Properties tab of Informatica PowerCenter Designer, set the Target Load type to Bulk. Bulk loading requires that the target table have no indices during the load. DAC drops the indices automatically; no action is required on your part.

  • For all entities, such as dimension and fact tables, create two workflows, one to be used for a full load and the other to be used for an incremental load. Both workflows are based on the same mapping. The same mapping gets executed during both full and incremental loads. This provides an opportunity to tune each of these load scenarios.

  • If the workflow is designed for loading a dimension in full mode, make sure the workflow also includes a session for creating the unspecified row.

  • When defining a task in DAC, you must choose the appropriate truncate option. This helps DAC to decide whether to drop and create indices on the table.

  • You should not truncate target tables using the Informatica 'truncate target' option. It is especially important that DAC handle truncating tables when extracting from more than a single source system and loading into the same data warehouse. DAC will dynamically determine when tables need to be truncated. By setting the truncate option in the Informatica workflow, the table will always be truncated, limiting your ability to extract and load data from multiple sources. For example, you cannot extract data from two systems and load the data into the same staging table in parallel as the Informatica session will truncate the staging table when another session is running.

  • If some sessions have to be run in serial order and if all the sessions need to be rerun upon failure of the workflow, then you should consider designing a single workflow that has sessions run in serial. If upon failure, not all of the sessions need to be run again, then consider designing separate workflows and define dependencies in DAC.

  • The custom workflows can be plugged into an ETL process by registering them in DAC. All new tasks need to be registered in DAC, with the appropriate properties set. Also, you must register the source and target tables, task definitions, and dependencies in DAC. For instructions on registering objects in DAC, see Oracle Business Intelligence Data Warehouse Administration Console User's Guide.

19.2.4 Additional Points to Consider When Customizing Oracle BI Applications

You should consider the following points before customizing Oracle BI Applications:

  • Table definitions in Informatica. Ensure that the SQL style is set to Oracle while importing the table definitions from external data sources. Even if the actual data source is another database type, such as DB2 or MSSQL, this does not affect the logic of how data gets loaded.

  • Update strategy. For loading new fact and dimension tables, design a custom process on the source side to detect the new and modified records. The SDE process should be designed to load only the changed data (new and modified). If the data is loaded without this incremental process, the data that was previously loaded will be updated again, which is a costly operation. For example, the logic in the preconfigured SIL mappings looks up the destination table based on the INTEGRATION_ID and DATASOURCE_NUM_ID and returns the ROW_WID if the combination exists, in which case it updates the record. If the lookup returns null, it inserts the record instead. In some cases, last update dates stored in the target table are also compared, in addition to the columns specified above, to determine whether to insert or update. Look at similar mappings in the preconfigured folder for more details, and see the SQL sketch following this list.

  • ETL process. When using multiple sources for the data warehouse, you can decide to load from all of them at the same time or at different time frequencies using different Execution Plans.

  • Truncating target tables. Truncating should be done through DAC. A single task has placeholders for a full load command and for an incremental load command.

    • For SDE workflows, the commands for full load and incremental load are the same and are based on the same mapping. These tasks should have the Truncate Always flag checked in DAC.

    • For SIL workflows, the command can be different for full and incremental loads. They should have the Truncate For Full Load option checked in DAC. When a table gets truncated, the indices are automatically dropped and created after the data is loaded. The workflow associated with the full load command can have the Bulk Load option turned on for an optimized version of the mapping that quickly inserts data. Note that if there are indexes on the table, the bulk load might fail, so it is very important that the indexes are registered in DAC and that you drop all of the indexes on this table during a full load if you use the bulk load option.

    • If a source requires an auxiliary task, it must be run only during incremental mode. So, for these tasks, the full load command is empty. No truncate options should be set.

  • DATASOURCE_NUM_ID. Use a parameter to define this value in the mapping. DAC will automatically create a parameter file with the correct DATASOURCE_NUM_ID, which will be picked up by the parameter in the mapping. This enables you to make multiple copies of the same mapping when you have multiple instances of the same transactional database type. You do not have to do any additional hard-coding other than to register the sources in DAC.

  • Creating indices and naming conventions. Staging tables typically do not require any indices; use care to determine whether indices are really needed on them. Create indices on all the columns that the ETL will use for dimensions and facts (for example, ROW_WIDs of dimensions and facts, INTEGRATION_ID and DATASOURCE_NUM_ID, and flags). Carefully consider which columns or combinations of columns will appear in filter conditions, and define indices to improve query performance. Inspect the OTB objects for guidance. Prefix the names of all newly created tables with WC_. This helps visually isolate the new tables from the OTB tables. Keep good documentation of the customizations you make; this helps when upgrading your data warehouse. Once the indices are decided upon, register them in DAC, either manually or by right-clicking the table and invoking the Import Indices command.

  • Currency. For currency-related data, populate the table with the base currency and exchange date field (in order to convert the data appropriately). The data for currency conversion should be maintained in the main data source. (The currency data is maintained by converting all the currency information into a single Base Currency code specified in DAC.)

  • Day Dimension. For data that is related to W_DAY_D, use the reusable transformation EXP_DAY_DIMENSION_FK_RESOLUTION. This transformation takes a date as input and returns the foreign key to the Day dimension as output in the appropriate format (YYYYMMDD), avoiding the need for a costly join or lookup to the W_DAY_D dimension table each time. Copy the reusable transformation and use it.

  • List Of Values. This applies in particular to the Category 1 and 2 customization scenarios. The preconfigured columns that depend on a list of values have a language-dependent column and a language-independent column. Use the mapplet MPLT_LOV_TRANSLATION to populate the language-dependent and independent columns in the dimension tables. For fact tables, use MPLT_LOV_D_ROW_WID to create a new foreign key to the LOV dimension. You could also deal with translations directly in SQL overrides to improve performance.
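
As referenced in the update strategy item above, the following is a minimal sketch of the insert-or-update decision that a SIL mapping makes, expressed here as standalone SQL. The table name and literal values are illustrative only.

SELECT ROW_WID
FROM W_ORG_D
WHERE INTEGRATION_ID = '1-ABC'        -- key of the incoming staging record
AND DATASOURCE_NUM_ID = 1             -- value supplied by the DAC parameter file

-- If a ROW_WID is returned, the existing record is updated;
-- if the lookup returns no row, the incoming record is inserted.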

19.3 Oracle BI Applications Customization Scenarios

In customizing Oracle BI Applications ETL mappings, various scenarios are available based on the type of your data source:

Figure 19-4 shows the categories of supported customization scenarios, based on the data source.

Figure 19-4 Supported Customizations Based on Data Source

This screenshot or diagram is described in surrounding text.

For detailed information about tables and naming conventions, see Oracle Business Analytics Warehouse Data Model Reference.

19.4 Category 1 Customizations: Adding Columns to Existing Fact or Dimension Tables

This section includes the following topics:

19.4.1 About Extending Oracle Business Analytics Warehouse

Category 1 customizations involve extracting additional columns from source systems for which pre-packaged adapters are included (for example, Oracle JD Edwards EnterpriseOne or Oracle) and loading the data into existing data warehouse tables. For Category 1 customizations, data can also come from non-packaged sources, but this section assumes that the sources have already been mapped with a Universal adapter and only need to be extended to capture additional columns. (The initial mapping of a Universal adapter is considered a Category 3 customization. For information, see Section 19.7, "Category 3 Customizations: Adding New Data as a Whole Row into a Standard Dimension Table.")

In order to see additional columns in the data warehouse, the columns must first be passed through the ETL process. The existing mappings and tables are extensible. Sample placeholders demonstrate how to pass and store additional data. Oracle BI Applications provides a methodology to extend preconfigured mappings to include these additional columns and load the data into existing tables.

Oracle BI Applications recognizes two types of customization: extension, which is the supported customization method; and modification, which is not supported. The supported extension logic enables you to add to existing objects. For example, you can extract additional columns from a source, pass them through existing mappings, and populate new columns added to an existing table.

In contrast, Oracle BI Applications does not support modification of existing logic or columns. You should not change existing calculations to use different columns, and you should not remap existing columns to be loaded from different sources. For example, if you want to calculate revenue differently from the existing logic, you should create a new transformation and connect that calculation to a new column, for example, X_REVENUE. You can then remap the Oracle Business Intelligence repository to point to the new X_REVENUE column.

Most mappings have a single placeholder column, named X_CUSTOM, that marks a safe path through the mapping. All extension logic should follow the same route through the mapping as X_CUSTOM. You can add additional transformations to the mapping, but they should follow the same route through the mapping as X_CUSTOM.

In the following figure, the preconfigured logic is shaded in gray. You should not modify anything contained within these objects. You should add customizations to the existing mapping, which allows them to run parallel to the existing logic.

Figure 19-5 Preconfigured Logic and Customizations

This screenshot or diagram is described in surrounding text.

Because some objects need to be modified in order to allow for extensions, Oracle BI Applications breaks down extensions into the following categories:

  • Exposed objects. These objects allow changes, but the changes must be in the form of extensions (additive), and must never modify existing preconfigured logic. These objects are included in the mapping when shipped and are usually sources, targets, and nonreusable transformations.

    Note: Most SDE adapter folders use the concept of Business Component mapplets. These are extract mapplets that contain relational, application, or flat file sources. Generally, the Business Component mapplet can be treated as an exposed object and is the only mapplet object that should be modified. Keep in mind that you can modify exposed objects, but there is a risk that these changes might be lost at the time of upgrade.

  • Encapsulated objects. These objects cannot be extended. They attempt to hide as much of the shipped transformation logic as possible to prevent breaking the preconfigured logic. These objects are included in the mapping when shipped and are usually mapplets and reusable transformations.

  • Custom objects. You add custom objects to a mapping. (They are not shipped by Oracle.) Custom objects can be sources, transformations (reusable and nonreusable), or mapplets. Reusable transformations and mapplets that are shipped are considered encapsulated objects, but when you add such objects to an existing mapping, they are considered custom objects to that particular mapping. For example, if you want to add another amount to a fact table and that amount must be converted from its original currency to the data warehouse currency, you would normally add the existing Currency Exchange mapplet to the mapping to convert this new amount. In this case, the mapplet is considered a custom object to this particular mapping; however, it is also encapsulated, so the internal logic must not be changed.

    Note:

    Targets should not be added to a mapping.

19.4.2 High-Level Steps to Extend Oracle Business Analytics Warehouse

The most common scenario for extending the data warehouse is to extract additional columns from a source and pass them through to an existing data warehouse table (either fact or dimension). This type of change requires extending the source dependent extract (SDE) and source independent load (SIL) mappings.

This section provides typical, high-level instructions for extending the SDE and SIL mappings to add an additional column to a data warehouse table.

If the data is coming from a non-packaged source, then you must use a Universal adapter mapping. (You will need to create a Universal adapter mapping if an appropriate one does not already exist).

To extend the SDE and SIL mappings:

  1. In Informatica PowerCenter Repository Manager, create a custom folder for the custom SDE mappings and a custom folder for the custom SIL mappings.

    Common naming conventions for custom folders are the following:

    • CUSTOM_SDE_<source system>_Adapter for custom SDE mappings.

    • CUSTOM_SILOS for custom SIL mappings.

  2. Copy the SDE mapping you want to customize:

    1. Locate the SDE mapping you want to extend in the standard extract folder, for example, SDE_<source system>_Adapter.

    2. In the Repository Manager toolbar, select Edit, then select Copy.

    3. Open the custom SDE folder, and select Edit, then Paste.

      The mapping, any mapplets, sources, and targets are copied to the custom folder.

  3. Copy the SIL mapping you want to customize from the standard SIL folder and paste it into the custom SIL folder.

  4. Use a database client tool to add the new columns to the source and target tables in the database. (A DDL sketch follows these steps.)

    Note: The last column in data warehouse tables is X_CUSTOM. As a best practice, Oracle recommends that you add new columns after X_CUSTOM and use X_ as a prefix for all custom columns to ensure there are no name conflicts with any columns Oracle might add later to that table.

  5. In Informatica PowerCenter Designer, use the Source Analyzer and the Target Designer to import the source and target definitions into the custom folders.

  6. Extend the SDE mapping by bringing in the new columns.

    1. Open the mapping, and then open the business component mapplet. The naming convention for the business component mapplet is mplt_bc_<name>.

    2. Drag and drop the new columns from the source definition to the Source Qualifier.

    3. Modify the SQL override in the Source Qualifier to include the new columns.

      For tips on modifying the SQL override, see Section 19.4.3.1, "Tips for Modifying the SQL Override in a Source Qualifier."

    4. Drag and drop the new columns from the Source Qualifier to the 'Exp_Custom' expression.

    5. Drag and drop the new columns from the 'Exp_Custom' expression to the Output transformation, and then close the mapplet editor.

    6. Add a new custom Expression transformation to the mapping.

    7. Drag and drop the new columns from the mapplet to the new custom Expression transformation and then to the target definition.

  7. Extend the SIL mapping by bringing in the new columns.

    1. Open the mapping, and drag and drop the new columns from the source definition to the Source Qualifier.

    2. Edit the SQL override in the Source Qualifier to include the new columns.

    3. Drag and drop the new columns from the Source Qualifier to the 'Exp_Custom' expression.

    4. Drag and drop the new columns from the 'Exp_Custom' expression to the Update Strategy transformation.

    5. Drag and drop the new columns from the update strategy expression to the target definition.

  8. Copy the SDE workflow into the custom SDE folder and the SIL workflow into the custom SIL folder. In Informatica PowerCenter Workflow Manager, refresh and validate the workflows.

  9. Register the customized mappings in DAC. You must include the table definition, with any additional columns or indexes, and the required changes so the tasks execute the modified sessions in the new custom folders.

    For instructions on registering data warehouse objects in DAC, see Oracle Business Intelligence Data Warehouse Administration Console User's Guide.
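
Referring back to step 4, the following is a minimal sketch of adding the new columns with a database client tool, assuming an Oracle database and the illustrative column names used later in this chapter:

ALTER TABLE W_ORG_DS ADD (X_ACCOUNT_LOG VARCHAR2(10), X_LAST_LOGIN VARCHAR2(10));
ALTER TABLE W_ORG_D ADD (X_ACCOUNT_LOG VARCHAR2(10), X_LAST_LOGIN VARCHAR2(10));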

19.4.3 Example of Extending Oracle Business Analytics Warehouse

In this example, data is passed from an existing source to an existing data warehouse table. The company in this example has identified additional fields in a base table that need to be added to the data warehouse table W_ORG_D. The company used an extension field to capture information related to organizations referred to as 'ACCOUNT_LOG.' In addition, the company wants to include the name of the person who last updated the record as an attribute of the organization.

This scenario is applied to examples with the different source types that are supported: Prepackaged adapters for Siebel, non-Siebel prepackaged application sources, and non-packaged data.

This section includes the following topics:

19.4.3.1 Tips for Modifying the SQL Override in a Source Qualifier

  • It is very important that the connected columns in the Source Qualifier appear in the same order in the SQL Override. A common mistake is to have the ports appear in the Source Qualifier in a different order than in the SQL Override.

  • The column in the SELECT clause must reference the aliased name of the table if an alias is used. In the example in Section 19.4.3.2, "Example of Extracting Data from a Siebel Source," the actual source for the new column LOGIN is S_CONTACT, but the SELECT clause references the alias LAST_UPDATE_BY.

  • Make sure you include a comma before new columns in a SELECT clause or before a new table in a FROM clause.

  • A new table should always be defined using LEFT OUTER join syntax. Do not use INNER join or RIGHT OUTER join syntax, because you could lose records as a result.

  • Make sure you define joins to match on a unique set of values. If you do not define a join that ensures a unique relationship, you might get a cartesian product, which changes the granularity and will result in duplicate errors downstream. If you cannot define a unique join, then you should bring the data in with a Lookup Transformation, which guarantees that at most one record will be returned.

  • As a best practice, you should comment custom code you have introduced. Comments should include at least the developer's name and the date the code was added.

19.4.3.2 Example of Extracting Data from a Siebel Source

In this example, the company is using Siebel Business Analytics version 7.8 and has used the S_ORG_EXT_X.ATTRIB_04 field in the S_ORG_EXT extension table to capture data related to 'ACCOUNT_LOG'. The name of the last person to update the record is retrieved by joining to the S_CONTACT table on S_ORG_EXT.LAST_UP_BY.

Note:

The Siebel adapter folders do not use Business Component mapplets; the sources are exposed directly in the mapping.

To extract data from a Siebel source:

  1. In Informatica PowerCenter Repository Manager, create a new folder named CUSTOM_SDE_SBL_78_Adapter.

  2. Copy the SDE_OrganizationDimension mapping and workflow to this folder.

  3. Edit the target definition W_ORG_DS to include the following columns:

    Column Name        Data Type
    X_ACCOUNT_LOG      VARCHAR2(10)
    X_LAST_LOGIN       VARCHAR2(10)


    Note:

    If the source table has been customized, it would be necessary to re-import the source table into the custom folder, replacing the existing version. For this example, the source table has not changed.

  4. The extension table S_ORG_EXT_X is already joined in this mapping. Drag the ATTRIB_04 column from the source definition to the Source Qualifier. The column should appear after the X_CUSTOM column.

  5. Since S_CONTACT is not already joined for the last update person, drag a copy of S_CONTACT into the mapping. (Delete the new Source Qualifier associated with this source if one exists.)

  6. As a best practice, you should rename the source to indicate what it represents. In this case, rename the source to S_CONTACT_LAST_UPDATE_BY.

  7. Drag the LOGIN column from the source definition to the Source Qualifier.

  8. Drag ATTRIB_04 and LOGIN to the EXP_Custom expression.

  9. As a best practice, rename these ports to indicate both the table and column they come from.

    In case the mapping is changed and the related exposed objects are replaced, this will make it easier to reconnect because the custom expression will not be replaced.

  10. Connect the appropriate ports to the target definition.

  11. Edit the SQL override in the Source Qualifier.

    1. Add the ATTRIB_04 and LOGIN columns immediately after X_CUSTOM in the SELECT clause.

    2. Add the table to the FROM clause.

    3. Add the join criteria.

      The Siebel application can run on various database platforms. To maintain database independence, you should write the SQL using Informatica's join syntax, which will automatically be translated to the appropriate database syntax when run. When adding tables, be sure to follow the Informatica syntax when defining the join.

      Below is an example of the SQL; the modified code is marked with comments.

      S_ADDR_ORG.LAST_UPD,
      S_ORG_EXT_T.LAST_UPD,
      0 AS X_CUSTOM
      -- Added by J.Smith on 1/10/2007
      ,S_ORG_EXT_X.ATTRIB_04
      ,LAST_UPDATE_BY.LOGIN
      
      FROM
      V_ORG_EXT S_ORG_EXT,
      S_ORG_EXT BU_NAME,
      ...
      
      S_ORG_EXT_T,
      S_ORG_EXT_X,
      S_ADDR_ORG, 
      …
      
      S_MKT_SEG PRTNR_MKTSEG,
      S_MKT_SEG TRGT_MKTSEG
      -- Added by J.Smith on 1/10/2007
      ,S_CONTACT LAST_UPDATE_BY
      
      WHERE
      {
      V_ORG_EXT S_ORG_EXT
      LEFT OUTER JOIN S_ORG_EXT_X ON
      S_ORG_EXT.ROW_ID = S_ORG_EXT_X.PAR_ROW_ID
      …
      
      LEFT OUTER JOIN S_MKT_SEG TRGT_MKTSEG ON
      ORG.PR_ORG_TRGT_MKT_ID = TRGT_MKTSEG.ROW_ID 
      
      -- Added by J.Smith on 1/10/2007
      LEFT OUTER JOIN S_CONTACT LAST_UPDATE_BY ON
      S_ORG_EXT.LAST_UPD_BY = LAST_UPDATE_BY.ROW_ID
      }
      
  12. Save your changes.

  13. Refresh and validate the session in Informatica PowerCenter Workflow Manager.

    This is necessary because it is possible that changes made to the mapping might invalidate the session.

19.4.3.3 Including a Source Table for the Change Capture Process

If you are bringing in data from a new table that was not previously included in an existing SDE mapping, it might be necessary to create an auxiliary change capture mapping so that when a row changes in the new table, the corresponding row in the main table is marked as changed. If you do not create an auxiliary process, a change to the new column in the new table might not be picked up, because the base table itself has not changed. Keep in mind that auxiliary processing can have a negative impact on ETL performance, so if it is not necessary to flag the main record as changed when there is a change in the related table, you might want to avoid building this mapping.

19.4.3.4 Example of Extracting Data from a Non-Siebel Packaged Source

In this example, the company is using Oracle Applications version 11.5.8 and has used the HZ_CUST_ACCOUNTS.ATTRIBUTE1 field to capture data related to 'ACCOUNT_LOG.' The name of the last person to update the record is already stored in the HZ_CUST_ACCOUNTS.LAST_UPDATE_LOGIN field. There is no need to join to any additional tables.

For Oracle Applications that run on an Oracle database, you do not need to use Informatica's SQL syntax when defining joins in a SQL Override. If you need to add additional tables, you can define the join using standard Oracle syntax.

If you add another table as a source, in addition to defining the join, you must also include the table's 'LAST_UPDATE_DATE' in the WHERE clause using the following syntax:

OR TABLE_NAME.LAST_UPDATE_DATE > TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS')
)
AND
…

This ensures that a change to a record in that table triggers an extract. Without this condition, if that were the only table to be updated while the other tables were not, the change would not be detected.
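
For illustration, a hypothetical WHERE clause after adding a new table might look like the following, assuming the preconfigured extract already filters on the LAST_UPDATE_DATE of HZ_CUST_ACCOUNTS and HZ_PARTIES (the NEW_TABLE alias is illustrative):

WHERE
(
HZ_CUST_ACCOUNTS.LAST_UPDATE_DATE > TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS')
OR HZ_PARTIES.LAST_UPDATE_DATE > TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS')
OR NEW_TABLE.LAST_UPDATE_DATE > TO_DATE('$$LAST_EXTRACT_DATE', 'MM/DD/YYYY HH24:MI:SS')
)
AND
…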

Note:

Most SDE adapter folders use the concept of Business Component mapplets. These are extract mapplets that contain relational, application, or flat file sources. Generally, the Business Component mapplet can be treated as an exposed object and is the only mapplet object that should be modified. Keep in mind that you can modify exposed objects, but there is a risk that these changes might be lost at upgrade time.

To extract data from a non-Siebel packaged source:

  1. In Informatica PowerCenter Repository Manager, create a new folder named CUSTOM_SDE_ORA1158_Adapter.

  2. Copy the SDE_ORA_OrganizationDimension_Customer mapping and workflow to this folder.

  3. Edit the target definition W_ORG_DS to include the following columns:

    Column Name        Data Type
    X_ACCOUNT_LOG      VARCHAR2(10)
    X_LAST_LOGIN       VARCHAR2(10)


    Note:

    If the source table has been customized, it would be necessary to re-import the source table into the custom folder, replacing the existing version. For this example, the source table has not changed.

  4. Open the mapping.

  5. Edit the Business Component 'mplt_BC_ORA_OrganizationDimension_Customer' by right-clicking the mapplet and selecting 'Open Mapplet.'

    Remember, the Business Component mapplets are the only mapplets you can normally edit. You should not edit any other mapplets unless directed by Oracle.

  6. Drag the columns LAST_UPDATE_LOGIN and ATTRIBUTE1 to the Source Qualifier, and then drag these columns to the Mapplet Output.

  7. Edit the Source Qualifier to include the new columns, as shown below:

    SELECT
    …
    
    HZ_PARTIES.SIC_CODE
    -- Added by J.Smith on 1/10/2007
    , HZ_CUST_ACCOUNTS.LAST_UPDATE_LOGIN
    , HZ_CUST_ACCOUNTS.ATTRIBUTE1
    
    FROM
    HZ_CUST_ACCOUNTS, HZ_PARTIES
    WHERE
    …
    
  8. Return to the mapping.

  9. Add a new expression and rename it to 'X_CUSTOM.'

  10. Connect the new columns from the Business Component mapplet to this expression.

  11. As a best practice, you should rename these ports to indicate both the table and column they came from. If the mapping is changed and the related exposed objects are replaced, this will make it easier to reconnect, because the custom expression will not be replaced.

  12. Connect these columns to the appropriate columns in the target definition.

  13. Save your changes.

  14. Refresh and validate the session in Informatica PowerCenter Workflow Manager.

    This is necessary because it is possible that changes made to the mapping might invalidate the session.

19.4.3.5 Example of Extracting Data from a Universal Source

In this example, the company has a legacy mainframe with data they would like to incorporate into the data warehouse. This requires that data be preformatted to match the source definition for FILE_ORG. Since the preexisting source definition does not include columns for the additional data the company desires, the source definition must be modified to include these columns.

Note:

The Universal adapter folders do not use Business Component mapplets; the sources are exposed directly in the mapping.

To extract data from a Universal source:

  1. In Informatica PowerCenter Repository Manager, create a new folder named CUSTOM_SDE_Universal_Adapter.

  2. Copy the SDE_Universal_OrganizationDimension mapping and workflow to this folder.

  3. Edit the source definition to include the following columns:

    Column Name        Data Type
    ACCOUNT_LOG        String(10)
    LAST_LOGIN         String(10)


  4. Edit the target definition W_ORG_DS to include the following columns:

    Column Name        Data Type
    X_ACCOUNT_LOG      VARCHAR2(10)
    X_LAST_LOGIN       VARCHAR2(10)


  5. Open the mapping.

  6. Drag the columns ACCOUNT_LOG and LAST_LOGIN to the Source Qualifier.

  7. Add a new expression and rename it to 'EXP_CUSTOM.'

  8. Connect the new columns from the Source Qualifier to this expression.

  9. Connect these columns to the appropriate columns in the target definition.

  10. Save your changes.

  11. Refresh and validate the session in Informatica PowerCenter Workflow Manager.

    This is necessary because it is possible that changes made to the mapping might invalidate the session.

19.4.3.6 Example of Loading Data into an Existing Target Table

Once the required data has been extracted and staged, it must be loaded into an existing target table in the data warehouse.

To load data into an existing target table in the data warehouse:

  1. Create a new folder named CUSTOM_SILOS.

  2. Copy the SIL_OrganizationDimension mapping and workflow to this folder.

  3. Edit the source definition W_ORG_DS to include the following columns:

    Column Name        Data Type
    X_ACCOUNT_LOG      VARCHAR2(10)
    X_LAST_LOGIN       VARCHAR2(10)


  4. Edit the target definition W_ORG_D to include the following columns:

    Column Name        Data Type
    X_ACCOUNT_LOG      VARCHAR2(10)
    X_LAST_LOGIN       VARCHAR2(10)


  5. Open the mapping.

  6. Drag the columns X_ACCOUNT_LOG and X_LAST_LOGIN to the Source Qualifier.

  7. Drag the columns X_ACCOUNT_LOG and X_LAST_LOGIN from the Source Qualifier to the Filter.

    Normally, existing transformations should not be modified. Filters are active transformations, and it is not possible to route data around an active transformation and bring it back to the same data flow. In this case, the filter is considered an exposed object and may be modified, but any changes are at risk of being lost at upgrade time.

  8. Drag the columns X_ACCOUNT_LOG and X_LAST_LOGIN from the Filter to the expression EXP_Custom. If you need to apply any transformations, you should do so in this expression.

  9. Drag the columns X_ACCOUNT_LOG and X_LAST_LOGIN from the expression to the Update Strategy.

    The Update Strategy is another active transformation and is, therefore, considered to be an exposed object, just like the Filter.

  10. Connect these columns to the appropriate columns in the target definition.

  11. Save your changes.

  12. Refresh and validate the session in Informatica PowerCenter Workflow Manager.

    This is necessary because it is possible that changes made to the mapping might invalidate the session.

19.4.3.7 Updating DAC

After making these changes to the mappings, you must register the changes in DAC. You must include the table definition, with any additional columns or indexes, and the required changes so the tasks execute the modified sessions in the new custom folders. For information about registering data warehouse objects in DAC, see Oracle Business Intelligence Data Warehouse Administration Console User's Guide.

19.5 Other Types of Customizations Requiring Special Handling

This section includes the following topics:

19.5.1 Modifying Category 2 SCD Triggers

If you want to capture historical changes with dimensions that are configured as Type 1 dimensions, you need to modify them so that they can capture Type 2 changes. A common form of customization is to change the criteria that triggers a Type 2 change in a dimension. Most changes in a dimension are treated as Type 1 changes in that the existing column is simply overwritten with the new value. Once enabled, there are only a small number of columns that will trigger a Type 2 change. You can extend the logic that triggers a Type 2 change by adding additional columns to the logic that tracks Type 2 changes. In addition, you can remove columns from this logic in case you do not want these types of changes to trigger a Type 2 change. Modifying the Type 2 tracking logic is one of the only exceptions to the rule that you should not make changes to shipped logic. The logic that tracks Type 2 changes is contained in exposed objects in each SIL dimension mapping that supports Type 2 changes.

There is a lookup between the Source Qualifier and the Filter. This lookup is used to determine whether the record already exists in the target and, therefore, must be updated; it also returns other system columns. Columns that track Type 2 changes are returned by this lookup and passed to the next expression. The columns returned by the lookup are compared with the columns passed from the staging table. If any of these columns are different, the record is flagged for a Type 2 change.

This expression contains a variable port named 'TYPE2_COLS_DIFF'. If this port is flagged as 'Y' then a Type 2 change will be triggered. If it is flagged as 'N' then a Type 1 change will be triggered.

To change the columns used to determine a Type 2 change, modify the lookup to pass any additional columns you want to be evaluated for Type 2 changes. Then, modify the variable port 'TYPE2_COLS_DIFF' to include this column when being evaluated.

For example, the SIL_BusinessLocationDimension mapping compares the following columns:

BUSN_LOC_NAME
CITY_NAME
POSTAL_CODE

If you wanted to include COUNTY as part of Type 2 logic, you would change the expression for 'TYPE2_COLS_DIFF' from the following:

IIF(VAR_BUSN_LOC_NAME != VAR_LKP_BUSN_LOC_NAME, 'Y',
IIF(VAR_CITY_NAME != VAR_LKP_CITY_NAME, 'Y',
IIF(VAR_POSTAL_CODE != VAR_LKP_POSTAL_CODE, 'Y', 'N')))

To this:

IIF(VAR_BUSN_LOC_NAME != VAR_LKP_BUSN_LOC_NAME, 'Y',
IIF(VAR_CITY_NAME != VAR_LKP_CITY_NAME, 'Y',
IIF(VAR_POSTAL_CODE != VAR_LKP_POSTAL_CODE, 'Y',
IIF(VAR_COUNTY != VAR_LKP_COUNTY, 'Y', 'N'))))

As mentioned previously, you would have to modify the lookup transformation to pass the COUNTY column and store it in LKP_COUNTY, then null-evaluate it and store it in a variable port VAR_LKP_COUNTY. The COUNTY column that comes in from the stage table should also be null-evaluated and stored in another variable port called VAR_COUNTY.

Now that COUNTY is part of Type 2 logic, you need to remove this column from any other mappings that still assume this is a Type 1 column. The only other mapping that makes this assumption is SIL_BusinessLocationDimension_SCDUpdate. This mapping consists of the source table, followed by a Source Qualifier transformation, an Expression transformation, a Filter Transformation, and finally the target table.

To ensure that the newly added column COUNTY is treated as a Type 2 column in SIL_BusinessLocationDimension_SCDUpdate, make the following required changes:

  1. Source Qualifier transformation changes:

    • Delete the COUNTY port.

    • From the SQL Override, remove the item "TARGET_TABLE.COUNTY" from the SELECT clause.

  2. Expression transformation changes:

    • Delete the input-only port COUNTY.

    • Delete the corresponding variable port COUNTY_VAR.

    • Delete the corresponding output-only port COUNTY_OUT.

  3. Filter transformation changes:

    • Delete the port COUNTY_OUT.

Note:

The previous example illustrates a situation where you want to "add" a new column to the Type 2 set. In situations where you want to "remove" a column from the Type 2 set, you would need to do the opposite steps. For example, assume that you want to remove CITY_NAME from being a Type 2 column:

  1. In the mapping SIL_BusinessLocationDimension, the TYPE2_COLS_DIFF expression would reduce to:

    IIF(VAR_BUSN_LOC_NAME != VAR_LKP_BUSN_LOC_NAME, 'Y',
    IIF(VAR_POSTAL_CODE != VAR_LKP_POSTAL_CODE, 'Y', 'N'))
    
  2. In the mapping SIL_BusinessLocationDimension_SCDUpdate, the following changes are required:

    - Source Qualifier: Instead of removing a column, you will now add CITY_NAME to the select clause.

    - Expression Transformation: Follow other examples of Type 1 columns to add three new ports corresponding to CITY_NAME.

    - Filter Transformation: Add the new port CITY_NAME.

    - Make the appropriate connections.

19.5.2 Adding Dimensions to Existing Facts

This section covers adding a dimension (preexisting or custom) to an existing fact. It assumes you have already built the required process to populate this dimension.

This process involves extending both the fact staging table and the fact data warehouse table to include the new column. In Informatica, remember to define the tables using the Oracle database type. In the staging table, the column should be defined as a VARCHAR2(80) field and named with an _ID suffix. In the data warehouse table, the column should be defined as an integer and named with a _WID suffix.
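
As a sketch of this step, assuming a hypothetical Project dimension is being added to an existing fact, the new columns might be created as follows (the table and column names are illustrative):

ALTER TABLE W_XYZ_FS ADD (X_PROJECT_ID VARCHAR2(80));    -- fact staging table: source identifier of the dimension record
ALTER TABLE W_XYZ_F ADD (X_PROJECT_WID NUMBER(10));      -- fact table: resolved dimension key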

The SDE fact mapping must be modified to pass through the unique identifier of the dimension key. This assumes that there is some relationship between the base table and this unique identifier. It might already be stored in the base table or stored by joining to a related table. Depending on the source system, this identifier might be based on a single column or derived from multiple columns. Table 19-1 depicts various formats used to derive the INTEGRATION_ID, which is used to identify a dimension key. The INTEGRATION_ID value should be passed to the fact staging table.

Table 19-1 Formats to Derive INTEGRATION_ID

Dimension | Foreign Key | When Source is Oracle Application | When Source is Siebel Application | When Source is Oracle's JD Edwards EnterpriseOne or JD Edwards World Application
W_AP_TERMS_D | - | TO_CHAR(TERM_ID) | Not applicable | PNPTC
W_BUSN_LOCATION_D | ASSET_LOC_WID | 'ASSET_LOC~' || LOCATION_ID | Not applicable | Not applicable
W_BUSN_LOCATION_D | EMP_LOC_WID | 'EMP_LOC~' || LOCATION_ID | Not applicable | Not applicable
W_BUSN_LOCATION_D | INVENTORY_LOC_WID | 'STORAGE_LOC' || '~' || ORGANIZATION_ID || '~' || SUBINVENTORY_CODE || '~' || INVENTORY_LOCATION_ID | Not applicable | Not applicable
W_BUSN_LOCATION_D | PLANT_LOC_WID | 'PLANT' || '~' || TO_CHAR(ORGANIZATION_ID) | Not applicable | Not applicable
W_BUSN_LOCATION_D | RECEIVING_LOC_WID | 'RECIPIENT_LOC' || '~' || TO_CHAR(LOCATION_ID) | Not applicable | Not applicable
W_BUSN_LOCATION_D | STORAGE_LOC_WID | 'STORAGE_LOC' || '~' || ORGANIZATION_ID || '~' || SECONDARY_INVENTORY_NAME || '~' | Not applicable | Not applicable
W_CUSTOMER_FIN_PROFL_D | CUSTOMER_FIN_PROFL_WID | 'P'||'~'||TO_CHAR(CUSTOMER_ID)||'~'||TO_CHAR(SITE_USE_ID)||'~'||CURRENCY_CODE (CUSTOMER_ID is CUST_ACCOUNT_ID from HZ_CUST_ACCOUNTS, and CURRENCY_CODE is from HZ_CUST_PROF_CLASS_AMTS) | Not applicable | AN8 ||'~'|| CO
W_CUSTOMER_LOC_D | - | To get the customer location key, look up W_CUSTOMER_LOC_USE_D as below | Not applicable | AIAN8||'~'|| AICO
W_CUSTOMER_LOC_USE_D | - | TO_CHAR(SITE_USE_ID) (Site Use Id from HZ_CUST_ACCOUNT_ROLES) | Not applicable | Not applicable
W_FREIGHT_TERMS_D | - | LOOKUP_CODE | Not applicable | Not applicable
W_GL_ACCOUNT_D | - | to_char(ccid) | Not applicable | AID||'~'|| SBL||'~'|| SBLT
W_INT_ORG_D | COMPANY_ORG_KEY | 'COMPANY'||'~'||TO_CHAR(SET_OF_BOOKS_ID) | S_ORG_EXT.ROW_ID | CO
W_INT_ORG_D | *_ORG_KEY | Remove any prefixes and use TO_CHAR() | S_ORG_EXT.ROW_ID | MCU
W_ORG_D | CUSTOMER_WID | TO_CHAR(CUSTOMER_ID) (CUSTOMER_ID is CUST_ACCOUNT_ID from HZ_CUST_ACCOUNTS) | UNION of S_ORG_EXT and S_CONTACT. When the source is S_ORG_EXT, ROW_ID is passed. When the source is S_CONTACT, use 'C-'||ROW_ID, where ROW_ID is the ROW_ID from S_PERSON (not from S_ORG_EXT). This is the value passed to look up the Contact Customers in W_ORG_D when resolving ACCOUNT_WID (read as CUSTOMER_WID) | Not applicable
W_PAYMENT_METHOD_D | - | LOOKUP_CODE | Not applicable | SY||'~'|| RT||'~'|| KY
W_PAYMENT_METHOD_D | - | TO_CHAR(TERM_ID) | Not applicable | SY||'~'|| RT||'~'|| KY
W_PERSON_D | CUST_CONTCT_WID | TO_CHAR(PARTY_ID) (PARTY_ID from HZ_PARTY_RELATIONS) | S_CONTACT.ROW_ID | Not applicable
W_PRODUCT_D | PRODUCT_WID | TO_CHAR(INVENTORY_ITEM_ID) | S_PROD_INT.ROW_ID | ITM
W_SALES_PRODUCT_D | - | TO_CHAR(INVENTORY_ITEM_ID)||'~'||TO_CHAR(ORGANIZATION_ID) | Not applicable | Not applicable


If you are adding an existing dimension, the SIL mapping should be extended to include the preexisting reusable Lookup transformation to that dimension. Pass the dimension's INTEGRATION_ID through the mapping along the path identified by the X_CUSTOM column and connect it to the Lookup after the Filter transformation. Also, connect the DATASOURCE_NUM_ID to the Lookup. If the dimension is a slowly changing dimension, the fact table's standard or 'canonical' date should be passed to the lookup as well, even if the dimension has not been enabled to capture Type 2 changes.

Remember to connect the ROW_WID of the Lookup to the X_CUSTOM transformation and include logic to default this value to 0 if no record is returned from the Lookup. Pass this column on to the Update strategy, and then on to the target.
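
Continuing the hypothetical Project dimension example, the resolution performed by the reusable Lookup is equivalent to the following SQL (table, column, and literal values are illustrative):

SELECT ROW_WID
FROM WC_PROJECT_D
WHERE INTEGRATION_ID = '101'          -- the X_PROJECT_ID passed from the fact staging row
AND DATASOURCE_NUM_ID = 1

-- If no row is returned, default the resulting foreign key to 0 before passing it
-- to the Update Strategy and on to the target.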

Update DAC to include the foreign key to this dimension in the fact table's definition. You should reassemble the Subject Area and rebuild the Execution Plan to ensure that DAC populates the dimension table before the fact table starts to load.

19.5.3 Adding Date Dimensions to Existing Facts

If adding a date dimension to a fact table, you merely have to pass the date itself through the SDE mapping to the stage table. In the SIL mapping, pass the date along the same path as X_CUSTOM. Add the reusable expression EXP_DAY_DIMENSION_FK_RESOLUTION after the Filter. Connect the date to any input and connect the appropriate output to the EXP_Custom transformation, then on to the Update Strategy and finally to the target.
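
For reference, the Day dimension key produced by this expression is the date rendered in YYYYMMDD format, which is equivalent to the following SQL (the staging table and date column names are illustrative):

SELECT TO_NUMBER(TO_CHAR(INVOICED_ON_DT, 'YYYYMMDD')) AS INVOICED_ON_DT_WID
FROM W_XYZ_FS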

19.5.4 Adding Currencies to an Existing Table

Amounts must be converted from the original currency to the data warehouse currency. Along with the amount, you must pass the currency code if it is not already connected in the mapping. Depending on the source system, there might be more than one currency code.

If the source is Siebel, there is generally only one currency type.

Other sources might have several currency types. Be sure to read the section on configuring currency codes to get a better understanding of how these work.

If the SIL mapping does not already include it, add the mapplet MPLT_CURCY_CONVERSION_RATES after the Filter and connect all required input ports.

Connect the appropriate exchange rate(s) to the EXP_Custom expression. Use the appropriate exchange rate to convert the amount to the data warehouse currency. Pass the converted amount to the Update Strategy, and then on to the target.

19.6 Category 2 Customizations: Adding Additional Tables

This section includes the following topics:

19.6.1 About Creating New Dimension or Fact Tables

This section relates to building entirely new tables that will be loaded with data from a source table from which data is not already extracted. For example, you might want to create a new Project dimension table. In this case, you create new dimension and staging tables as well as new extract and load ETL mappings.

When creating a new custom table, use the prefix WC_ to help distinguish custom tables from tables provided by Oracle as well as to avoid naming conflicts in case Oracle later releases a table with a similar name. For example, for your Project dimension you might create a WC_PROJECT_DS and a WC_PROJECT_D table.

When you create a new dimension or fact table, use the required system columns that are part of each of the data warehouse tables to maintain consistency and the ability to reference existing table structures. When you create a new table, you need to register the table and its indices in DAC. You will also have to register in DAC the new tasks for the new Informatica workflows, and then reassemble the appropriate Subject Area and rebuild the appropriate Execution Plan. For information about assembling Subject Areas and building Execution Plans, see Oracle Business Intelligence Data Warehouse Administration Console User's Guide.

Note:

If you are creating tables on a DB2-UDB database, make sure when you register the tables in DAC the Not Logged Initially option is enabled.

19.6.1.1 Required Columns

For custom staging tables, the following columns are required:

  • INTEGRATION_ID. Stores the primary key or the unique identifier of a record as in the source table.

  • DATASOURCE_NUM_ID. Stores the data source from which the data is extracted.

For dimension and fact tables, the required columns are the INTEGRATION_ID and DATASOURCE_NUM_ID columns as well as the following (a DDL sketch follows this list):

  • ROW_WID. A sequence number generated during the ETL process, which is used as a unique identifier for the data warehouse.

  • ETL_PROC_WID. Stores the ID of the ETL process information. The details of the ETL process are stored in the W_ETL_RUN_S table on the data warehouse side. This is also the Process ID on Current Run/Run History screen in DAC.
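
Under these rules, the hypothetical WC_PROJECT_DS and WC_PROJECT_D tables mentioned in Section 19.6.1 might be created as follows, assuming an Oracle database (the PROJECT_NAME column and all data types and lengths are illustrative; real tables carry the full set of standard system columns):

CREATE TABLE WC_PROJECT_DS (
INTEGRATION_ID     VARCHAR2(80) NOT NULL,      -- primary key of the record in the source table
DATASOURCE_NUM_ID  NUMBER(10) NOT NULL,        -- source system identifier assigned in DAC
PROJECT_NAME       VARCHAR2(100)
);

CREATE TABLE WC_PROJECT_D (
ROW_WID            NUMBER(10) NOT NULL,        -- sequence number generated during the ETL process
INTEGRATION_ID     VARCHAR2(80) NOT NULL,
DATASOURCE_NUM_ID  NUMBER(10) NOT NULL,
ETL_PROC_WID       NUMBER(10),                 -- ID of the ETL run, stored in W_ETL_RUN_S
PROJECT_NAME       VARCHAR2(100)
);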

19.6.1.2 About the Oracle Business Analytics Warehouse DATASOURCE_NUM_ID Column

All the tables in the Oracle Business Analytics Warehouse schema have DATASOURCE_NUM_ID as part of their unique user key. While the transactional application normally ensures that a primary key is unique, it is possible that a primary key is duplicated between transactional systems. To avoid problems when loading this data into the data warehouse, uniqueness is ensured by including the DATASOURCE_NUM_ID as part of the user key. This means that the rows can be loaded in the same data warehouse tables from different sources if this column is given a different value for each data source.

Note:

The DATASOURCE_NUM_ID is maintained by DAC. Ensure that each source system has a unique value assigned. It is possible to have multiple instances of the same source system (for example, a U.S.-based and a European-based Siebel transactional database both loading into the same data warehouse). The two different transactional database systems should be assigned different DATASOURCE_NUM_ID values in DAC. DAC is predefined with one entry for Siebel and the DATASOURCE_NUM_ID is assigned the value of 1. If you are going to extract from additional Siebel transactional database systems and load the data into the same data warehouse, a different DATASOURCE_NUM_ID must be assigned to each Siebel transactional database system.
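
As a simple illustration, if a U.S.-based and a European-based Siebel instance are assigned DATASOURCE_NUM_ID values of 1 and 2 in DAC, the same source primary key can coexist in a warehouse table without conflict (the key value shown is illustrative):

SELECT INTEGRATION_ID, DATASOURCE_NUM_ID
FROM W_ORG_D
WHERE INTEGRATION_ID = '1-2XJ4'

-- INTEGRATION_ID   DATASOURCE_NUM_ID
-- 1-2XJ4           1     (U.S.-based Siebel instance)
-- 1-2XJ4           2     (European-based Siebel instance)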

19.6.2 Including a Source Table for Change Capture Process

This procedure applies to Siebel source tables only.

To include a source table for the change capture process:

  1. Verify whether the source table is registered in DAC.

    1. If an entry is not in DAC, create a new record for the table and assign an image suffix to it.

    2. If the table is registered, make sure there is an image suffix assigned for this table.

  2. If an image suffix does not exist for the source table, assign one.

    The image suffix should be three characters long. The recommended naming convention is to start with C. For example, use C1, C2, CA1, CA2, and so on.

  3. Make sure the image suffix is not used by other tables by querying the tables list in DAC for that image suffix.

    DAC does not validate this information when entering data.

  4. Create Image tables in the Siebel transactional database.

    1. Right-click the table record in DAC, and choose Generate Change Capture Scripts.

      This will assist you in creating the image tables, necessary indices, and triggers if you plan on tracking deletes on the transactional database.

    2. Execute these scripts on the transactional database if you have the appropriate permission; otherwise, ask the OLTP DBA to create them.

  5. For the task that is created for the extract process, set the Build Image flag to True and select the new table as an auxiliary table or primary table.

    Note:

    At the end of the change capture process, DAC creates a view on the actual source table. Use this view as the main source table in all the extract procedures. For example, if your new source table is S_COMPENSATION, then the default view name is V_COMPENSATION.
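For example, the SQL override of a custom SDE mapping would read from the change-capture view rather than from the base table, so that only incremental data is extracted. The business columns below are placeholders; only ROW_ID and LAST_UPD are standard Siebel columns.

SELECT ROW_ID,
       COMP_TYPE_CD,          -- placeholder business columns
       LAST_UPD
FROM   V_COMPENSATION         -- change-capture view created over S_COMPENSATION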

19.6.3 Adding a New Dimension in the Oracle Business Analytics Warehouse

Follow this procedure to add a new dimension in the Oracle Business Analytics Warehouse.

To add a new dimension and use it with an existing fact table:

  1. Create a DDL for the new dimension based on the standard structure (with appropriate system columns). Create a staging table for this dimension.

  2. Register the new source table and its staging table (if it does not already exist) in the DAC repository and associate it with the appropriate database connection.

    For instructions on registering data warehouse objects in DAC, see Oracle Business Intelligence Data Warehouse Administration Console User's Guide.

  3. Create a new custom map SDE_XYZ to populate the dimension stage. Instead of the actual source table (for example S_ABC), use the view that will be generated by the change capture process (for example V_ABC) in the SQL so that it extracts only the incremental data. Use existing reference maps as examples of how to populate the system columns. Make sure you truncate the stage table in corresponding tasks.

  4. Create a new custom map SIL_XYZ to populate the new dimension from the stage table. Use the reference maps mentioned above as examples of how to populate the system columns (see the sketch after this procedure).

  5. Register the new dimension table in DAC and associate it with the appropriate database connection.

    If you are planning to build a new dimension incrementally, assign an image suffix to the source table.

  6. Register the workflows as tasks in DAC.

  7. For the SDE mapping of the dimension, make sure you set the Build Image flag and the Truncate Always option to True. In the list of source tables, mark the primary or auxiliary source(s) of this dimension.

  8. For SIL workflows of the dimension, make sure you set only the Truncate for Full Load option to True.

  9. Make sure the target table of SDE_XYZ is defined as a source table for SIL_XYZ.
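As a rough illustration of steps 3 and 4, the SIL logic conceptually performs an insert of the following form. In practice this is implemented as an Informatica mapping (with a sequence generator for ROW_WID and the standard reusable transformations), not as hand-written SQL, and the table names are the hypothetical ones introduced earlier.

INSERT INTO WC_CUSTOM_D
  (ROW_WID, INTEGRATION_ID, DATASOURCE_NUM_ID, ETL_PROC_WID, X_CUSTOM_ATTR)
SELECT WC_CUSTOM_D_SEQ.NEXTVAL,      -- surrogate key generated during the ETL run
       STG.INTEGRATION_ID,
       STG.DATASOURCE_NUM_ID,
       1234,                          -- ETL_PROC_WID of the current run (placeholder)
       STG.X_CUSTOM_ATTR
FROM   WC_CUSTOM_DS STG;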

19.6.4 Adding a New Fact Table in the Oracle Business Analytics Warehouse

Follow this procedure to add a new fact table in the Oracle Business Analytics Warehouse.

To add a new fact table:

  1. Create a DDL for the new fact based on the standard structure (with appropriate system columns). Create a staging table for this fact.

  2. Register the new source table (if it does not already exist) in the DAC repository and associate it with a database connection.

    For instructions on registering data warehouse objects in DAC, see Oracle Business Intelligence Data Warehouse Administration Console User's Guide.

  3. Create the change capture tables by right-clicking and selecting Generate Change Capture Scripts. For instructions, see Section 19.6.2, "Including a Source Table for Change Capture Process."

  4. Create SDE mappings to populate the custom stage table. Use the view created by change capture as the main table in the SQL so that it extracts only the incremental data. Use the reference maps (above) as examples of how to populate the system columns. Be sure to truncate the stage table in corresponding workflows.

  5. Create SIL mapping to populate the custom fact table. Use reference maps as examples of how to populate the system columns.

  6. Use lookups or SQL override joins to dimension tables to populate the dimension foreign keys (ROW_WIDs) pointing to the existing dimensions (see the example after this procedure).

  7. In DAC, register the target tables.

  8. Create new tasks for the workflows.

  9. For the SDE task, make sure you have the Build Image flag set to True, and list all the source tables that it queries from. Choose one or more tables as primary or auxiliary. For the target tables choose the staging table. Set the Truncate Always flag to True.

  10. For the SIL task, list all the dimensions that will be required under source tables.
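As an illustration of step 6, a SQL override join that resolves a dimension foreign key might look like the following. WC_CUSTOM_FS and WC_CUSTOM_D are hypothetical custom tables; the packaged mappings achieve the same result with reusable Lookup transformations.

SELECT F.INTEGRATION_ID,
       F.DATASOURCE_NUM_ID,
       D.ROW_WID AS CUSTOM_WID                    -- dimension foreign key resolved by the join
FROM   WC_CUSTOM_FS F
LEFT OUTER JOIN WC_CUSTOM_D D
  ON  F.CUSTOM_INTEGRATION_ID = D.INTEGRATION_ID
  AND F.DATASOURCE_NUM_ID     = D.DATASOURCE_NUM_ID;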

19.6.5 Adding a New Dimension Table for a New Fact Table in the Oracle Business Analytics Warehouse

The steps for creating a new dimension table are similar to the steps for incremental change capture.

To add a new dimension table for a new fact table:

  1. In the new custom fact loading mapping (SIL), use lookups for getting foreign keys to the new dimension.

  2. Use existing maps as examples.

19.7 Category 3 Customizations: Adding New Data as a Whole Row into a Standard Dimension Table

Follow this procedure to add new data as a whole row into a standard dimension table in the Oracle Business Analytics Warehouse.

To add new data as a whole row into the standard dimension table:

  1. Identify and understand the existing structure of staging tables. Refer to Oracle Business Analytics Warehouse Data Model Reference for the table structure. Non-system columns can include the null value.

  2. Create a custom SDE mapping to load the data into the staging table in the custom folder for this purpose. The staging table must be populated with incremental data (rows that have been added or changed since the last Refresh ETL process), for performance reasons.

  3. Populate the INTEGRATION_ID column with the unique identifier for the record.

    The combination of INTEGRATION_ID and DATASOURCE_NUM_ID is unique. When importing the data, ensure that a unique identifier for the external data source is inserted in the DATASOURCE_NUM_ID column. The DATASOURCE_NUM_ID is set to 1 for mappings that source data from the Siebel transactional database. This is a reserved value and is used in all standard mappings. For example, a value of 2 can be defined for DATASOURCE_NUM_ID in the custom SDE mapping. The standard SDE mappings populate the INTEGRATION_ID column of the dimension staging table (used for resolving the dimension's Siebel transactional database ROW_ID value). The custom process must be used to populate the same column with a unique identifier from the external data source.

  4. After the data is populated in the staging table, use the standard SIL mappings to populate the dimension target tables.

  5. Modify the SDE and SIL mappings of all the related fact tables (fact tables that need to be linked to this dimension).

    The custom fact SDE mappings must populate the foreign key column of the changed dimension (using a custom map table process to convert from Siebel's row IDs to the external data source row IDs). The custom SIL mapping should be modified to use the appropriate DATASOURCE_NUM_ID, because the standard SIL mappings assume DATASOURCE_NUM_ID for the dimensions are the same as the fact table's DATASOURCE_NUM_ID.

    It is very important to decide when the data is going to be loaded. If it is going to be loaded along with the Siebel source data, you must be careful with how you handle failure recovery. The preconfigured workflows truncate the target staging table prior to loading. Upon failure, when the DAC Server restarts the task, all the data is truncated and all the data is loaded again.

    If the data from the external source gets loaded into the same staging table, be careful with how you handle this situation, since you cannot use the truncate table functionality. The data migrating into the staging table is not incrementally loaded, and, therefore, should be cleaned up prior to attempting to load this table again.

    In such a case, it is recommended that you encapsulate the extract part from both sources inside a single Informatica workflow. Upon failure of either of the extracts, the whole workflow is rerun. Note that the data from both sources should always be loaded at the same time.

    If it is decided that the data is going to be loaded at different time frequencies, then the new SDE workflows need not depend on the preconfigured SDE workflows and can use the Truncate Table option for failure recovery. In this case, in the DAC Design view, define a new Execution Plan in the Execution Plans tab, and define the new data source in the Database Connections child tab. Make sure the shared SIL process depends on the SDE processes from both sources.

19.8 Configuring Extracts

Oracle BI Applications includes prepackaged logic to extract particular data from a particular source. This section discusses how to capture all data relevant to your reports and ad hoc queries by addressing what type of records you want and do not want to load into the data warehouse. This section contains the following topics:

19.8.1 Extracting Additional Data

You can configure extract mappings and mapplets in the Oracle Business Analytics Warehouse to accommodate additional source data. For example, if your business divides customer information into separate tables based on region, then you would have to set up the extract mapping to include data from these tables.

19.8.1.1 Extracting New Data Using an Existing Source Table

Extract mappings generally consist of a source table or Business Component, an Expression transformation, and a staging table. If you want to extract new data using the existing mapping, you have to modify the extract mapping to include the new data by performing the following tasks:

To modify an existing mapping to include new data:

  1. Modify the existing Business Component to extract information from the source, and add it to an appropriate extension column.

    Tip:

    You can perform calculation transformations in the Business Component mapplet or in the source adapter mapplet in the extract mapping. However, do not use performance-expensive calculations in the extract that could tie up your source transaction system. It is recommended that you perform these types of calculations in the source adapter mapplet instead.

  2. Modify the Expression transformation to perform any necessary transformations.

  3. Connect all input and output ports within the extract mapping so that the data moves from the source or Business Component to the Expression transformation and through the source adapter mapplet, and finally to the staging table's appropriate extension column.

You have to determine which type of extension column to map the data to in the staging table. After you modify the extract mapping, you must also modify the corresponding load mapping to make sure the extension columns that you added are connected all the way from the staging table to the warehouse table.

19.8.1.2 Extracting Data from a New Source Table

Business Components are packaged as mapplets, which reside in source-specific folders within the repository. Business Components are used to extract data from the source system. You can configure these mapplets to perform the following:

  • Extract data from a new source table

  • Set incremental extraction logic

The following procedure contains instructions for adding a new table to the Business Component. The procedure includes adding a new source definition, connecting the ports to the Source Qualifier, editing the Source Qualifier, connecting the ports to the Output transformation, and editing the Output transformation.

To add a new source table to an existing Business Component mapplet:

  1. In Informatica PowerCenter Designer, open the applicable source system configuration folder.

  2. Open the Mapplet Designer tool.

  3. Drag the Business Component mapplet into Mapplet Designer to view the transformations that comprise the Business Component.

  4. Expand the Sources folder, and copy a source table into the mapplet by dragging and dropping the table into Mapplet Designer.

  5. Connect the applicable ports from the new source definition to the Source Qualifier by clicking on the port in the new source table and dragging it to the connecting port in the Source Qualifier.

  6. Double-click the Source Qualifier to open the Edit Transformations box.

    In the Ports tab, make any changes to the new ports for data type, precision, scale, or all these values, as necessary.

  7. Connect the applicable ports from the Source Qualifier to the Mapplet Output transformation (MAPO).

    Note:

    In some cases, the Business Component contains an Expression transformation between the Source Qualifier and the MAPO.

  8. In the Properties tab, make changes to the SQL statement as necessary.

  9. Validate and save your changes to the repository.
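For example, the Source Qualifier SQL edited in step 8 might be extended with a join to the newly added source table, keeping the existing SELECT list intact and appending only the new columns and join condition. The tables and the ATTRIB_04 column below are purely illustrative.

SELECT S_ORG_EXT.ROW_ID,
       S_ORG_EXT.NAME,
       S_ORG_EXT_X.ATTRIB_04                  -- column brought in from the new source table
FROM   S_ORG_EXT,
       S_ORG_EXT_X
WHERE  S_ORG_EXT.ROW_ID = S_ORG_EXT_X.PAR_ROW_ID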

19.8.2 Setting Up the Delimiter for Source Files

You must ensure that the delimiter used in the CSV files is not present in the source data. Oracle Business Analytics Warehouse is preconfigured with a comma as the source file delimiter. If your data contains commas, then you need to enclose the data fields using a character that is not present in the data set. For example, common enclosing characters include single quotes and double quotes.
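For example, if the source data itself contains commas, rows in the CSV file might be enclosed in double quotes as follows (the values are illustrative only):

1001,"Acme Widgets, Inc.","Portland, OR",2013-01-15
1002,"Globex Corporation","Springfield, IL",2013-02-03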

To set up the delimiter for source files:

  1. Open the CSV file.

  2. Enclose the data fields with the enclosing character that you have identified.

    You can identify an enclosing character by identifying a character that is not present in the source data. Common enclosing characters include single quotes and double quotes.

  3. Save and close the CSV file.

  4. Identify all the source definitions associated with the modified files.

  5. Change the properties for each of these source definitions to use the enclosing character.

  6. Validate and save your changes to the repository.

19.8.3 Configuring a Source Adapter Mapplet

The majority of source-specific transformations occur in the source adapter mapplet; source-independent transformations generally take place in the Analytic Data Interface (load mapping). The source adapter mapplet converts source-specific data elements into standard formats and then stores them in a staging table. The source-independent load mapping then picks up these records, which are already transformed into a standard format.

Figure 19-6 illustrates the three components of the source adapter mapplet that allow transformations of data to occur. The three components are Mapplet Input (MAPI), Expression transformation (EXP), and Mapplet Output (MAPO).

Figure 19-6 Components of the Source Adapter Mapplet

This screenshot or diagram is described in surrounding text.

In Figure 19-6, if the input data is transformed, the data is passed to the Expression transformation (EXP) as input only. After the data is transformed, it is output through a new port, which is prefixed with EXT_. If the data is not transformed, it comes in as input-only and leaves through an output-only port.

If you want to add a new transformation, you must add a new port to contain the expression that is used to transform the data.

To add a new port to the source adapter mapplet:

  1. In Informatica PowerCenter Designer, open the applicable source system configuration folder.

  2. Open the applicable source adapter mapplet.

  3. Double-click the MAPI component of the mapplet, and add a new input port following the INP_* naming convention.

  4. Copy the new input port from the MAPI to the Expression transformation.

  5. Connect the new port from the MAPI to the Expression transformation.

  6. In the Expression transformation, uncheck the Output indicator for the new input port; you use the value from this port in a transformation expression.

  7. Perform any necessary transformations within the Expression transformation.

    The transformed data is passed out of an EXT_* output-only port.

  8. Connect the port from the Expression transformation to the MAPO.

  9. Validate and save your repository.
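As an illustration of the kind of transformation performed in step 7, suppose an incoming amount must be converted using an exchange rate passed in on another input port. The logic of the EXT_ output port would be equivalent to the following SQL expression (in the mapplet itself it is written with Informatica expression functions); the port names INP_DOC_AMT, INP_EXCHANGE_RATE, and EXT_GLOBAL_AMT are hypothetical.

CASE
  WHEN INP_DOC_AMT IS NULL THEN 0
  ELSE INP_DOC_AMT * INP_EXCHANGE_RATE
END                                   -- value exposed through the EXT_GLOBAL_AMT output-only port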

19.9 Marking Records as "Deleted" in Oracle Business Analytics Warehouse

Note:

This section does not apply to Oracle's Siebel sources.

In a typical implementation, records that are deleted from your source system are not removed from the Oracle Business Analytics Warehouse. If you want to mark as deleted in the data warehouse those records that were removed from the source system's database and archived in a separate database, you must enable the primary extract and delete mappings.

Primary extract mappings flag records that are deleted from the data warehouse. Delete mappings set the DELETE_FLG column to 'Y' for these records in the warehouse tables. When enabled, primary extract and delete mappings by default look for any records removed from the source system's database. If these mappings find that the records no longer exist in that database, the mappings mark them as deleted in the data warehouse.

Note:

When you mark a record as deleted in Oracle Business Analytics Warehouse, the delete flag is set to 'Y', but the record is not physically deleted from the table. This type of delete is often referred to as a "soft delete."

Caution:

Delete and primary extract mappings must always be disabled together; you may not disable only one type.

19.9.1 About Primary Extract and Delete Mappings Process

Note:

This section does not apply to Oracle's Siebel adapter, which does not have Primary Extract mappings.

Before you decide to enable primary extract and delete sessions, it is important to understand their function within the Oracle Business Analytics Warehouse. Primary extract and delete mappings allow your analytics system to determine which records are removed from the source system by comparing primary extract staging tables with the most current Oracle Business Analytics Warehouse table.

The primary extract mappings perform a full extract of the primary keys from the source system. Although many rows are generated from this extract, only the Key ID and Source ID information is extracted from the source table. The primary extract mappings load these two columns into staging tables that are marked with a *_PE suffix.

Figure 19-7 provides an example of the beginning of the extract process. It shows the sequence of events over a two day period during which the information in the source table has changed. On day one, the data is extracted from a source table and loaded into the Oracle Business Analytics Warehouse table. On day two, Sales Order number three is deleted and a new sales order is received, creating a disparity between the Sales Order information in the two tables.

Figure 19-7 Extract and Load Mappings

This screenshot or diagram is described in surrounding text.

Figure 19-8 shows the primary extract and delete process that occurs when day two's information is extracted and loaded into the Oracle Business Analytics Warehouse from the source. The initial extract brings record four into the Oracle Business Analytics Warehouse. Then, using a primary extract mapping, the system extracts the Key IDs and the Source IDs from the source table and loads them into a primary extract staging table.

The extract mapping compares the keys in the primary extract staging table with the keys in the most current Oracle Business Analytics Warehouse table. It looks for records that exist in the Oracle Business Analytics Warehouse but do not exist in the staging table (in the preceding example, record three), and sets the delete flag to Y in the source adapter mapplet, causing the corresponding record to be marked as deleted.

The extract mapping also looks for any new records that have been added to the source, and which do not already exist in the Oracle Business Analytics Warehouse; in this case, record four. Based on the information in the staging table, Sales Order number three is marked as deleted in the Oracle Business Analytics Warehouse, as shown in Figure 19-8. When the extract and load mappings run, the new sales order is added to the warehouse.

Figure 19-8 Primary Extract and Delete Mappings

This screenshot or diagram is described in surrounding text.
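Conceptually, the delete mapping performs a comparison similar to the following query: any warehouse row whose key pair is missing from the *_PE staging table is flagged as deleted. The table names follow the documented naming pattern but are illustrative; the actual work is performed inside the packaged mappings, not by hand-written SQL.

SELECT W.INTEGRATION_ID,
       W.DATASOURCE_NUM_ID
FROM   W_SALES_ORDER_F W                          -- illustrative warehouse table
WHERE  NOT EXISTS (
         SELECT 1
         FROM   W_SALES_ORDER_F_PE PE              -- primary extract staging table (*_PE)
         WHERE  PE.INTEGRATION_ID    = W.INTEGRATION_ID
         AND    PE.DATASOURCE_NUM_ID = W.DATASOURCE_NUM_ID
       );
-- rows returned by this comparison would have DELETE_FLG set to 'Y'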

19.9.2 About Working with Primary Extract and Delete Mappings

The primary extract (*_Primary) and delete mappings (*_IdentifyDelete and *_Softdelete) serve a critical role in identifying which records have been physically deleted from the source system. However, there are some instances when you can disable or remove the primary extract and delete mappings, such as when you want to retain records in the data warehouse that were removed from the source systems' database and archived in a separate database.

Because delete mappings use Source IDs and Key IDs to identify purged data, if you are using multiple source systems, you must modify the SQL Query statement to verify that the proper Source ID is used in the delete mapping. In addition to the primary extract and delete mappings, the configuration of the delete flag in the load mapping also determines how record deletion is handled.

You can manage the extraction and deletion of data in the following ways:

  • Deleting the configuration for source-archived records

  • Deleting records from a particular source

  • Enabling delete and primary-extract sessions

  • Configuring the Record Deletion flag

  • Configuring the Record Reject flag

This topic provides procedures for these management tasks.

19.9.2.1 Deleting the Configuration for Source-Archived Records

Some sources archive records in separate databases and retain only the current information in the main database. If you have enabled the delete mappings, you must reconfigure the delete mappings in the Oracle Business Analytics Warehouse to retain the archived data.

To retain source-archived records in the Oracle Business Analytics Warehouse, make sure the $$LAST_ARCHIVE_DATE parameter value is set properly to reflect your archive date. The delete mappings will not mark the archived records as 'deleted'. For more information about extract and delete mappings, see Section 19.9.2, "About Working with Primary Extract and Delete Mappings."

19.9.2.2 Enabling Delete and Primary Extract Sessions

If you want to mark your source-deleted records as deleted in the Oracle Business Analytics Warehouse, you need to enable the delete and primary extract tasks for your application.

As shown in the following procedures, there are two different ways to enable the sessions, depending on the modules you implemented.

To enable primary extract and delete sessions on Tasks:

  1. In DAC, go to the Design view, and select the appropriate custom container from the drop-down list.

  2. Display the Tasks tab.

  3. Query for all tasks containing the string 'Delete' or 'Primary'.

  4. Deselect the Inactive check box for those tasks.

  5. Reassemble your subject areas and rebuild your execution plans.

To enable primary extract and delete sessions on Configuration Tags:

  1. In DAC, go to the Design view, and select the appropriate custom container from the drop-down list.

  2. Display the Configuration Tags tab.

  3. Query for all configuration tags containing the string 'Identify and Soft Delete'.

  4. Select the tag of your module, and then click the subtab 'Subject Areas'.

  5. Deselect the Inactive check box for the subject areas.

  6. Reassemble your subject areas and rebuild your execution plans.

19.10 Configuring Slowly Changing Dimensions

The Oracle Business Analytics Warehouse provides Type 2 slowly changing dimension (SCD) functionality, which enables you to track the history of updates to dimension records. When a record in the Oracle Business Analytics Warehouse has an update, the updated information is posted into a new row and the old information is kept for historical reporting purposes.

The Oracle Business Analytics Warehouse identifies and applies the slowly changing dimension logic chosen by the user after data has been extracted and transformed to be source-independent. Users may configure Oracle BI Applications to support both Type 1 SCDs, in which data is overwritten with updates, and Type 2 SCDs, in which the original records are maintained while a new record stores the updated data. Choosing Type 1 or Type 2 SCDs depends on identifying your historically significant attributes.

Most dimensions use Type 1 updates by default. If you need to change a dimension to Type 2 SCD updates, use the following procedure.

To turn a dimension to Type 2 SCD update enabled:

  1. In DAC, go to the Design view, and select the appropriate custom container from the drop-down list.

  2. Display the Tasks tab.

  3. Query for the SIL task that populates the dimension.

  4. Display the Parameters subtab, and set the value for $$TYPE2_FLG to Y.

A few dimensions are set up by default to be Type 2 SCDs in the standard repository. These dimensions are listed in Table 19-2.

Table 19-2 Type 2 SCDs Set by Default

Dimension                  Adapter
W_CUSTOMER_FIN_PROFL_D     PeopleSoft
W_EMPLOYEE_D               Oracle EBS and PeopleSoft
W_INT_ORG_D                PeopleSoft
W_INVENTORY_PRODUCT_D      Oracle EBS
W_POSITION_D               All
W_PRODUCT_D                Oracle EBS, Oracle Siebel, and Oracle JD Edwards EnterpriseOne
W_Party_ORG_D              Oracle JD Edwards EnterpriseOne
W_BUSN_LOCATION_D          PeopleSoft
W_HR_POSITION_D            All
W_JOB_D                    All
W_PAY_GRADE_D              All
W_PAY_TYPE_D               All
W_POSITION_DH              All


Note:

If you want to turn off the TYPE2_FLG that is set by default for any dimension, you must customize the source-dependent extract mapping. Make the following changes:

  1. If the SQL is not yet overridden, you must override it to filter out the historical as well as the future-effective records. In other words, the resulting SQL should always fetch the latest records, as effective at the time of the session run (see the example after this list). If the SQL is already overridden, change it to achieve the same result. There should not be any duplicate INTEGRATION_ID values in the staging table as a result of the SQL. Although the unique index, if any, is typically on INTEGRATION_ID, DATASOURCE_NUM_ID, and SRC_EFF_FROM_DT, you should form your SQL in such a way that uniqueness is not violated even if SRC_EFF_FROM_DT is kept out of the equation.

    The SQL change will vary depending on the OLTP data model and the entity in question. In some cases, you can get the record based on the MAX(EFFDT) from the source, whereas in other cases, where future-dated data is supported, you can select only records effective at the time of the session run. The Informatica parameter $$$SessStartTime gives the system date and time of the computer where the Informatica Server is running. Note that this might differ from the date and time on the database hosting your OLTP data.

  2. Disconnect the ports that bring the source effective from and to dates.

  3. In a downstream transformation, hardcode the value of SRC_EFF_FROM_DT as 01/01/1899 (or use the Oracle BI Applications standard $$LOW_DT parameter value). The SRC_EFF_TO_DT can be left as NULL.

  4. Set the value of $$TYPE2_FLG as 'N' for the task loading the table where you intend this behavior.

  5. Optionally, you can disable the corresponding SCDUpdate mapping, if you want to save on overall ETL time. If you choose to disable it, you might have to rebuild your execution plan.

    Note that keeping the SCDUpdate mapping enabled does not harm the functionality.
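A minimal sketch of the SQL override described in step 1, assuming a PeopleSoft-style effective-dated source table PS_JOBCODE_TBL keyed by SETID and JOBCODE; adapt the key columns and the effective-date logic to your own OLTP data model. SYSDATE stands in for the session start time here; in practice the $$$SessStartTime parameter mentioned above would typically be used.

SELECT J.*
FROM   PS_JOBCODE_TBL J
WHERE  J.EFFDT = (SELECT MAX(J2.EFFDT)
                  FROM   PS_JOBCODE_TBL J2
                  WHERE  J2.SETID   = J.SETID
                  AND    J2.JOBCODE = J.JOBCODE
                  AND    J2.EFFDT  <= SYSDATE)    -- keep only the record effective now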

19.10.1 About Identifying Historically Significant Attributes

You might want to retain a history of all the updates to particular dimension attributes so that you can use them in reports. These attributes are known as historically significant attributes. For example, if a customer moves to a different region and you assign that customer a new regional salesperson and territory ID, you might want to keep records of that customer's account history with the original salesperson and territory ID. In this case, the salesperson and territory IDs are historically significant attributes. In contrast, you might have a load that populates the telephone number field. If your business does not perform data analysis on phone number history, then this information might be considered a historically insignificant attribute.

Identifying attributes as significant or insignificant enables you to determine the category of SCD you require. However, before you can select the appropriate type of SCD, you must understand their differences.

19.10.1.1 About the Extract View

The extract view of any given table in the staging area consists of four types of records:

  • New records

  • Changed records with data that is historically insignificant

  • Changed records having historical significance

  • Changed records whose changes have no significance of any kind and are ignored altogether

Of the four kinds of records, only the first three are of interest for the data mart. Of those three, brand new records and records whose changes are tracked as SCDs are both treated as new and become inserts into the data warehouse. Records with changes that are important but not historically tracked are overwritten in the data warehouse, based on the primary key.

19.10.2 Type 1 and Type 2 Slowly Changing Dimensions

After you have correctly identified your significant and insignificant attributes, you can configure the Oracle Business Analytics Warehouse based on the type of slowly changing dimension (SCD) that best fits your needs—Type 1 or Type 2.

19.10.2.1 Type 1 Slowly Changing Dimension

A Type 1 SCD overwrites the column's value and is the default SCD for the Oracle Business Analytics Warehouse. Although a Type 1 does not maintain history, it is the simplest and fastest way to load dimension data. Type 1 is used when the old value of the changed dimension is not deemed important for tracking or is an historically insignificant attribute. For example, you might want to use Type 1 when changing incorrect values in a column.

In Figure 19-9, the State Name column for the supplier KMT is changed in the source table Suppliers, because it was incorrectly entered as California. When the data is loaded into the data warehouse table, no historical data is retained and the value is overwritten. If you look up supplier values for California, records for KMT do not appear; they only appear for Michigan, as they have from the beginning.

Figure 19-9 An Example Type 1 Slowly Changing Dimension

This screenshot or diagram is described in surrounding text.

19.10.2.2 Type 2 Slowly Changing Dimension

A Type 2 SCD creates another record and leaves the old record intact. Type 2 is the most common SCD because it enables you to track historically significant attributes. The old records point to all history prior to the latest change, and the new record maintains the most current information.

Slowly changing dimensions work in different parts of a star schema (the fact table and the dimension table). Figure 19-10 shows how an extract table (SOURCE_CUSTOMERS) becomes a data warehouse dimension table (W_ORG_D). Although there are other attributes that are tracked, such as Customer Contact, in this example there is only one historically tracked attribute, Sales Territory. This attribute is of historical importance because businesses frequently compare territory statistics to determine performance and compensation. Then, if a customer changes region, the sales activity is recorded with the region that earned it.

This example deals specifically with a single day's extract, which brings in a new record for each customer. The extracted data from SOURCE_CUSTOMERS is loaded into the target table W_ORG_D, and each record is assigned a unique primary key (ROW_WID).

Figure 19-10 An Example Type 2 Slowly Changing Dimension

This screenshot or diagram is described in surrounding text.

However, this data is not static; the next time a data extract shows a change for your customers in W_ORG_D, the records must change. This situation occurs when slowly changing dimensions are invoked. Figure 19-11 shows that records for the two customers, ABC Co., and XYZ inc. have changed. Notice that ABC's Customer Contact has changed from Mary to Jane, and XYZ's Sales Territory has changed from West to North.

As discussed earlier in this example, the Customer Contact column is historically insignificant; therefore a Type 1 SCD is applied and Mary is overwritten with Jane. Because the change in ABC's record was a Type 1 SCD, there was no reason to create a new customer record. In contrast, the change in XYZ's record shows a change of sales territory, an attribute that is historically significant. In this example, the Type 2 slowly changing dimension is required.

As shown in Figure 19-11, instead of overwriting the Sales Territory column in the XYZ's record, a new record is added, assigning a new ROW_WID, 172, to XYZ in W_ORG_D. XYZ's original record, 102, remains and is linked to all the sales that occurred when XYZ was located in the West sales territory. However, new sales records coming in are now attributed to ROW_WID 172 in the North sales territory.

Figure 19-11 An Example Type 2 Slowly Changing Dimension

This screenshot or diagram is described in surrounding text.

19.10.2.3 Effective Dates

Effective dates specify when a record was effective. For example, if you load a new customer's address on January 10, 2003 and that customer moves locations on January 20, 2003, the address is only effective between these dates. Effective Dates are handled in the following manner:

  • If the source supplies both effective dates, these dates are used in the warehouse table.

  • If the source does not supply both the effective to and effective from dates, then the Type 2 logic creates effective dates.

  • If the source supplies one of the two effective dates, then the Oracle Business Analytics Warehouse automatically populates the missing effective dates using a wrapper mapping. This situation is discussed in this section.

For example, in the W_ORG_D table previously discussed, XYZ moved to a new sales territory.

If your source system supplied historical data on the location changes, your table might contain a record for XYZ in the West sales territory with an effective from date of January 1, 2001 and an effective to date of January 1, 3714. If the next year your source indicates XYZ has moved to the North sales territory, then a second record is inserted with an effective from date of January 1, 2002, and an effective to date of January 1, 3714, as shown in Table 19-3.

Table 19-3 Records Before a Wrapper Session in W_CUSTOMER

Customer Name   Sales Territory   Customer Contact   Effective From   Effective To   Current
ABC             East              Jane               1/1/2001         1/1/3714       Y
XYZ             West              John               1/1/2001         1/1/3714       Y
XYZ             North             John               1/1/2002         1/1/3714       Y


Note that your first record for XYZ still shows as effective from January 1, 2001 to January 1, 3714, while a second record has been added for XYZ in the North territory with the new effective from date of January 1, 2002. In this second record, the effective to date remains the same, January 1, 3714.

When the wrapper session executes, the effective dates for the first XYZ are corrected (January 1, 2001-January 1, 2002), and the Current Flag is adjusted in the Analytic Data Interface (load mapping) so that only the second record (January 1, 2002-January 1, 3714) is set to Y. After the wrapper session completes its work, you have Type 2 information for XYZ in your data warehouse rather than two disparate records, as shown in Table 19-4.

Table 19-4 Records After a Wrapper Session in W_CUSTOMER

Customer Name   Sales Territory   Customer Contact   Effective From   Effective To   Current
ABC             East              Jane               1/1/2001         1/1/3714       Y
XYZ             West              John               1/1/2001         1/1/2002       N
XYZ             North             John               1/1/2002         1/1/3714       Y


In the preceding example, the wrapper session corrected the effective to dates and the current flag. However, if the record's dates had been correct, the wrapper mapping would simply have set the current flag as needed, because its logic is set to check dates and flags and adjust only the columns that contain discrepancies.
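Conceptually, the correction applied by the wrapper can be pictured as the following query, which derives each record's effective to date from the effective from date of the next record for the same customer and resets the current flag. This is only an illustration of the logic using the example columns from Table 19-3; the actual work is performed by the wrapper mapping, not by SQL that you write.

SELECT CUSTOMER_NAME,
       SALES_TERRITORY,
       EFFECTIVE_FROM,
       COALESCE(LEAD(EFFECTIVE_FROM) OVER (PARTITION BY CUSTOMER_NAME ORDER BY EFFECTIVE_FROM),
                DATE '3714-01-01')                                    AS EFFECTIVE_TO,
       CASE WHEN LEAD(EFFECTIVE_FROM) OVER (PARTITION BY CUSTOMER_NAME ORDER BY EFFECTIVE_FROM) IS NULL
            THEN 'Y' ELSE 'N' END                                     AS CURRENT_FLG
FROM   W_CUSTOMER;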

19.11 About Stored Lookups

A lookup transformation enables you to specify a reference table, and then retrieve information such as code descriptions, exchange rates, and currency codes. The main types of preconfigured lookups within the Oracle Business Analytics Warehouse are:

19.11.1 Code Lookups

Some source systems use intelligent codes that are intuitively descriptive, such as HD for hard disks, while other systems use non-intuitive codes (like numbers, or other vague descriptors), such as 16 for hard disks. While codes are an important tool with which to analyze information, the variety of codes and code descriptions used poses a problem when performing an analysis across source systems. The lack of uniformity in source system codes must be resolved to integrate data for the Oracle Business Analytics Warehouse.

The code lookup in the load mapping integrates both intelligent and non-intuitive codes by performing a separate extract for codes, and inserting the codes and their description into a codes table. The codes table provides the load mapping with a resource from which it can automatically perform a lookup for code descriptions.

The Analytic Data Interface's architecture uses components, as well as both fact and dimension tables, to facilitate lookup functionality. The following components and process are used in a lookup:

19.11.1.1 W_CODES Table

The load control table W_CODES consolidates all codes for future reference and assigns them a category and a single language for efficient lookup capability.

19.11.1.2 Codes Mappings

The Oracle Business Analytics Warehouse uses mappings designed to extract codes from source systems and populate the W_CODES table in preparation for use by the load mapping.

To understand how codes mappings function, it is helpful to first understand the columns within W_CODES. Table 19-5 describes these columns.

Table 19-5 Columns in Code Mapplet

Column              Description
DATASOURCE_NUM_ID   Unique identifier of the source system from which data was extracted
SOURCE_CODE1        The first code in the hierarchy of the various source system codes used to identify a particular code and description combination
SOURCE_CODE2        The second code in the hierarchy of the various source system codes used to identify a particular code and description combination
SOURCE_CODE3        The third code in the hierarchy of the various source system codes used to identify a particular code and description combination
SOURCE_DESC_1       Short description of the source system code
SOURCE_DESC_2       Long description for the code


The naming convention for mappings designed for codes lookup is SDE_[SOURCE]_CodeDimension_[CATEGORY]. Figure 19-12 shows an example of a code mapping in Informatica PowerCenter Designer.

Figure 19-12 Example of a Code Mapping in Informatica PowerCenter Designer

This screenshot or diagram is described in surrounding text.

19.11.1.3 Codes Mapplets

There are several mapplets that support the codes mappings in preparation for the source-independent load mapping. They are as follows:

  • Source adapter mapplets. The source adapter mapplet connects the source-specific input attributes of CODES and the attributes from control or warehouse tables to the expression transform for mapping them. The naming convention for the source adapter codes mapplet is MPLT_SA_CODES.

  • Business Component mapplets. The Business Component mapplet makes the source system attributes of CODES_CUST_CLASS available to the extract mapping. The naming convention for the Business Component codes mapplet is MPLT_BC_CODES_[CATEGORY].

  • ADI Mapplet. The Analytic Data Interface (load mapping) mapplet is source system independent and resolves the codes for the target table. The naming convention for the load mapping codes mapplet is MPLT_ADI_CODES.

The load mapping integrates multiple source system codes by designating one source system instance as a master in a mapping. All other source system codes are then mapped to the master. When the load mapping encounters a code that requires definition, it references the load control lookup table to match the source system code to an Oracle Business Analytics Warehouse source-independent code, which retains all the source system codes' original functionality.

The following columns are used to designate a source system instance as the master source system:

  • MASTER_ID. Code for the source system designated as master.

  • DATASOURCE_NUM_ID. Unique identifier for the source system.

19.11.1.4 Configuring Extension Column Code Description Lookups

You can configure dimension and fact load sessions to perform specific lookups by editing the category of the data to be extracted from the W_CODES table and loading the code information into a target table. If the code and code name do not exist in the W_CODES table, then you must add them to the table. To configure the lookup, create a session override; do not modify the load mapping itself.

To configure sessions for lookups:

  1. In Informatica PowerCenter Workflow Manager, open the applicable source system configuration folder.

  2. Open the Edit Tasks box.

  3. In the Transformations tab, edit the SQL statement for the lookup.

    For example, you might wish to edit the following lookup:

    MPLT_ADI_SUPPLIERS.LKP_SPLR_ATTR1
    
  4. Edit the SQL statement to change the code category from 'GENERIC' to the category you wish to use for the lookup.
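For example, the lookup SQL typically selects a code description from W_CODES for a given category, and the override consists of replacing the category literal. The SELECT list and the CATEGORY column shown here are a sketch only (the column names follow Table 19-5 plus an assumed category column); the exact statement varies by lookup.

SELECT SOURCE_DESC_1
FROM   W_CODES
WHERE  DATASOURCE_NUM_ID = ?
AND    SOURCE_CODE1      = ?
AND    CATEGORY          = 'SUPPLIER_ATTRIBUTES'   -- was 'GENERIC' before the override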

19.11.2 About Resolving Dimension Keys

By default, dimension key resolution is performed by the Oracle Business Analytics Warehouse in the load mapping. The load mapping uses prepackaged, reusable lookup transformations to provide prepackaged dimension key resolution. This section describes how dimension keys are looked up and resolved.

There are two commonly used methods for resolving dimension keys. The first method, which is the primary method used, is to perform a lookup for the dimension key. The second method is to supply the dimension key directly into the fact load mapping.

19.11.2.1 Resolving the Dimension Key Using Lookup

If the dimension key is not provided to the load mapping through database joins, the load mapping performs the lookup in the dimension table. The load mapping does this using prepackaged lookup transformations.

The load mapping uses the Integration ID, the DATASOURCE_NUM_ID, and the Lookup Date when looking up the dimension key. All three columns are necessary for the load mapping to return the dimension key. The ports are described in Table 19-6.

Table 19-6 Columns Used in the Load Mapping Dimension Key Lookup

Port                Description
INTEGRATION ID      Uniquely identifies the dimension entity within its source system. Formed from the transaction in the source adapter of the fact table.
DATASOURCE_NUM_ID   Unique identifier of the source system instance.
Lookup Date         The primary date of the transaction; for example, receipt date, sales date, and so on.


In Figure 19-6, the Supplier Products Key Lookup transformation illustrates the three input columns needed for the load mapping lookup—the INTEGRATION ID, DATASOURCE_NUM_ID, and Date (lookup date). The transformation then outputs the Supplier Product key (the dimension key) to the data warehouse table W_SUPPLIER_PRODUCT_D.

If Type 2 slowly changing dimensions are enabled, the load mapping uses the unique effective dates for each update of the dimension records. When a dimension key is looked up, it uses the fact's primary date to resolve the appropriate dimension key.

The effective date range gives the effective period for the dimension record. The same entity can have multiple records in the dimension table with different effective periods due to Type 2 slowly changing dimensions. This effective date range is used to exactly identify a record in its dimension, representing the information in a historically accurate manner. In the lookup for Employee Contract Data shown in Figure 19-13, you can see the effective dates used to provide the effective period of employee contracts.

Figure 19-13 Lookup for Employee Contract Data

This screenshot or diagram is described in surrounding text.
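The lookup described above is equivalent to a query of the following form, with the bind values supplied from the fact record being loaded. W_SUPPLIER_PRODUCT_D is taken from the example; the effective-date column names are assumptions and may differ in the actual table definition.

SELECT ROW_WID
FROM   W_SUPPLIER_PRODUCT_D
WHERE  INTEGRATION_ID    = ?                                   -- from the fact source adapter
AND    DATASOURCE_NUM_ID = ?
AND    ? BETWEEN EFFECTIVE_FROM_DT AND EFFECTIVE_TO_DT;        -- the fact's primary (lookup) date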

19.12 About Domain Values

The Oracle Business Analytics Warehouse foundation comprises a data model that accommodates data from disparate source systems. Data is sourced from operational systems and systematically molded into a source-independent format. After the data is made source independent, it can then be used to create key metrics for analytic reporting, so that metric calculations are not source dependent. This clear separation enables you to swap source systems or integrate additional source systems without having to reconfigure the metric calculations to accommodate each source system's requirements.

One method for transforming source data into a source-independent format is to convert the source-supplied values to domain values. Domain values are a set of distinct values used to calculate prepackaged metrics. These values are provided by the Oracle Business Analytics Warehouse to allow you to create metric calculations independent of source system values.

19.12.1 About the Domain Value Conversion Process

To best understand the domain value conversion process, consider an example of two source systems—Source System A and Source System B. Each source system stores two types of employee events—hire and rehire. Source system A uses H to denote a hire event and R to denote a rehire event, whereas source system B uses 1 to denote a hire event and 2 to denote a rehire event. When the Oracle Business Analytics Warehouse extracts data from both systems, it ports those source values through the extract mapping until the data reaches the W_EVENT_GRP_CODE column in the W_EVENT_TYPE_DS staging table.

The load mapping then ports the extracted source values (H and R from source system A, and 1 and 2 from source system B) into the source adapter mapplet. Within the source adapter, source values are translated into domain values (HIR and REH) based on a set of rules that are particular to your business practices.
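For example, the translation rule for the two systems described above could be expressed as the following SQL CASE logic; in the source adapter mapplet the same rule is written as an Informatica expression on the port feeding W_EVENT_GRP_CODE. The DATASOURCE_NUM_ID values and the EVENT_CODE input column are illustrative assumptions.

CASE
  WHEN DATASOURCE_NUM_ID = 1 AND EVENT_CODE = 'H' THEN 'HIR'   -- Source System A hire
  WHEN DATASOURCE_NUM_ID = 1 AND EVENT_CODE = 'R' THEN 'REH'   -- Source System A rehire
  WHEN DATASOURCE_NUM_ID = 2 AND EVENT_CODE = '1' THEN 'HIR'   -- Source System B hire
  WHEN DATASOURCE_NUM_ID = 2 AND EVENT_CODE = '2' THEN 'REH'   -- Source System B rehire
END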

19.12.1.1 Preparing to Define the Rules

You must define the rules so that the source adapter knows how to map your specific source values to the given set of domain values. Before you set up the rules you must:

  1. Analyze all of your source values and how they map to the prepackaged domain values. You might find that you need to create additional domain values for particular columns. The result of this preparation work is a list of each source value and how it is mapped to a domain value.

  2. Implement this logic in the applicable source adapter mapplet. To set up the logic, modify the Expression transformation in the source adapter mapplet for each affected column. For information on setting up the rules for domain values, see Section 19.14, "Configuring the Domain Value Set Using Informatica PowerCenter Designer."

Figure 19-14 illustrates how the source values are converted to the domain values—HIR and REH.

Figure 19-14 Source Values Translated to Domain Values

This screenshot or diagram is described in surrounding text.

Figure 19-15 illustrates a different situation where the records might not contain a source value that flags the record as Hire or Rehire. In this case, the source system stores hires in one table and rehires in another table. To make this work, one possible solution is to modify the extract mappings to populate the W_EVENT_GRP_CODE column with HIR or REH. If the field is populated in the extract mapping, you can then carry those same values through the source adapter mapplet.

Figure 19-15 Source Values in Different Tables Translated to Domain Values

This screenshot or diagram is described in surrounding text.

After the source adapter mapplet converts the source-specific values to domain values, the domain values are inserted into an Oracle Business Analytics Warehouse table. In this example, the HIR and REH values populate the W_EVENT_TYPES table, as illustrated in Figure 19-16.

Figure 19-16 HIR and REH Values Populating the W_EVENT_TYPES Table

This screenshot or diagram is described in surrounding text.

19.12.2 About the Importance of Domain Values

Values in the W_EVENT_TYPES table are used to create metrics in the front end. Some metrics are defined using domain values. For example, seven metrics use the HIR and REH event group code in their calculation. The following are the seven metrics, along with their descriptions and calculations:

19.12.2.1 Hire Count

This metric counts all hires for a specified period. The calculation is:

SUM(CASE WHEN (CMMNEVTP.W_EVENT_GRP_CODE IN ('HIR','REH')) THEN EVNT.EVENT_CNT 
ELSE 0 END)

19.12.2.2 Re-hires Ratio

This metric determines the ratio of rehires to all employees hired during a specified period. The calculation is:

CASE WHEN SUM(CASE WHEN CMMNEVTP.W_EVENT_GRP_CODE IN ('REH','HIR') THEN 
EVNT.EVENT_CNT ELSE 0 END) = 0 THEN 0 ELSE SUM(CASE WHEN CMMNEVTP.W_EVENT_GRP_CODE
IN ('REH') THEN EVNT.EVENT_CNT ELSE 0 END)/SUM(CASE WHEN CMMNEVTP.W_EVENT_GRP_CODE
IN ('REH','HIR') THEN EVNT.EVENT_CNT ELSE 0 END) END

19.12.2.3 New Hire Count

This metric counts the head count hired for regular full-time positions. The calculation is:

SUM(CASE WHEN CMMNEMPT.FULL_TIME_FLAG = 'Y' AND CMMNEMPT.EMP_CAT_CODE = 'R' AND 
(CMMNEVTP.W_EVENT_GRP_CODE = 'HIR' OR CMMNEVTP.W_EVENT_GRP_CODE = 'REH') AND 
EVNT.EVENT_DK >= (CMMNDATE.DATE_KEY - 365) AND EVNT.EVENT_DK <= CMMNDATE.DATE_KEY 
THEN EVNT.EVENT_CNT ELSE 0 END)

19.12.2.4 Newly Separated Veterans - New Hires

This metric counts the regular full-time and part-time employees who belong to this category of veterans and were hired during the previous 12 months. The calculation is:

SUM(CASE WHEN CMMNEMPD.VETERAN_STAT_CODE = '4' AND CMMNEMPT.EMP_CAT_CODE = 'R' AND 
(CMMNEVTP.W_EVENT_GRP_CODE = 'HIR' OR CMMNEVTP.W_EVENT_GRP_CODE = 'REH') AND 
EVNT.EVENT_DK >= (CMMNDATE.DATE_KEY - 365) AND EVNT.EVENT_DK <= CMMNDATE.DATE_KEY 
THEN EVNT.EVENT_CNT ELSE 0 END)

19.12.2.5 Other Protected Veterans - New Hires

This metric counts regular full-time and part-time employees who belong to this category of veterans. The calculation is:

SUM(CASE WHEN CMMNEMPD.VETERAN_STAT_CODE = '3' AND CMMNEMPT.EMP_CAT_CODE = 'R' AND 
(CMMNEVTP.W_EVENT_GRP_CODE = 'HIR' OR CMMNEVTP.W_EVENT_GRP_CODE = 'REH') AND 
EVNT.EVENT_DK >= (CMMNDATE.DATE_KEY - 365) AND EVNT.EVENT_DK <= CMMNDATE.DATE_KEY 
THEN EVNT.EVENT_CNT ELSE 0 END)

19.12.2.6 Special Disabled Veteran Head count - New Hires

This metric counts regular full-time and part-time employees who belong to this category of veterans and were hired during the previous 12 months. The calculation is:

SUM(CASE WHEN CMMNEMPD.VETERAN_STAT_CODE = '1' AND CMMNEMPT.EMP_CAT_CODE = 'R' AND 
(CMMNEVTP.W_EVENT_GRP_CODE = 'HIR' OR CMMNEVTP.W_EVENT_GRP_CODE = 'REH') AND 
EVNT.EVENT_DK >= (CMMNDATE.DATE_KEY - 365) AND EVNT.EVENT_DK <= CMMNDATE.DATE_KEY 
THEN EVNT.EVENT_CNT ELSE 0 END)

19.12.2.7 Vietnam Era Veteran Head count - New Hires

This metric counts regular full-time and part-time employees who belong to this category of veterans and were hired during the previous 12 months. The calculation is:

SUM(CASE WHEN CMMNEMPD.VETERAN_STAT_CODE = '2' AND CMMNEMPT.EMP_CAT_CODE = 'R' AND 
(CMMNEVTP.W_EVENT_GRP_CODE = 'HIR' OR CMMNEVTP.W_EVENT_GRP_CODE = 'REH') AND 
EVNT.EVENT_DK >= (CMMNDATE.DATE_KEY - 365) AND EVNT.EVENT_DK <= CMMNDATE.DATE_KEY 
THEN EVNT.EVENT_CNT ELSE 0 END)

Each of these metric calculations is based on the domain values HIR and REH. All records whose source values are converted to one of these domain values are included in the metric calculations, as shown in Figure 19-17.

Figure 19-17 Metric Values From HIR and REH Domain Values

This screenshot or diagram is described in surrounding text.

19.12.3 About Extending the Domain Value Set

The Oracle Business Analytics Warehouse is also extensible in that you can create additional domain values for those columns that do not fit into the existing domain value definitions. However, before you modify the domain value set for a particular column, you should first perform an impact analysis on existing metrics. For example, the Oracle Business Analytics Warehouse prepackages the following two events:

  • New Hire. This event occurs when a new person is hired.

  • New Position. This event occurs when a position is created, but an existing employee might be hired internally.

If you have an event that represents both a New Hire and a New Position, you might have to create a third event that depicts both. If you create this new event type domain value, you need to include it in the applicable metric definitions so as to account for all hires and positions.

19.13 Configuring the Domain Value Set with CSV Worksheet Files

Domain values are a set of distinct values used to calculate prepackaged metrics. These values are provided by Oracle Business Analytics Warehouse to allow you to create metric calculations independent of source system values. Oracle Business Analytics Warehouse provides CSV worksheet files to map source system values to domain values.

You can add to these worksheet files if you need extra source system values and map them to domain values. You can also modify the worksheet files if you need to customize the domain values. You can use an existing domain value if you want to change the preconfigured metrics. Otherwise you can create a new domain value and create new metrics based on this domain value.

Source system values that are not mapped to a domain value in the CSV worksheet files have a question mark (?) as the domain value in the Oracle Business Analytics Warehouse. These values do not affect the domain value metrics.

If there are no worksheet files to map the source system values to the domain values, you need to modify the domain values using Informatica PowerCenter Designer. For more information on configuring domain values using Informatica PowerCenter Designer, see Section 19.14, "Configuring the Domain Value Set Using Informatica PowerCenter Designer."

For a list of CSV worksheet files and their domain values for your application, see your application configuration chapter.

To map source values to domain values using CSV worksheet files:

  1. Identify all the Oracle Business Analytics Warehouse table columns that use domain values.

    For a list of columns that use domain values, see the Oracle Business Analytics Warehouse Data Model Reference.

  2. List all of your source values that qualify for conversion to one of the domain values.

  3. Map each source value to a domain value.

    If any of your source system values do not map to a prepackaged domain value, and the domain value set can be modified, then create a list of new domain values and map your orphaned source system values to the newly created domain values.

    You cannot modify all domain value sets. Also, you must check which metrics are affected by the modified domain value set. For more information, see the Oracle Business Analytics Warehouse Data Model Reference.

  4. Open the CSV worksheet file in the $PMServer\LkpFiles directory (for example, INFA_HOME\server\infa_shared\LkpFiles).

  5. Edit the file to map your source values to the existing domain values.

    Alternately, if you want to add additional domain values, add them in this worksheet file.

  6. Save and close the worksheet file.
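For illustration only, a row in such a worksheet file typically pairs a source system value with its domain value. The column headings below are hypothetical; always preserve the exact layout of the worksheet file shipped for your column.

SOURCE_VALUE,W_DOMAIN_VALUE
H,HIR
R,REH
1,HIR
2,REH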

19.14 Configuring the Domain Value Set Using Informatica PowerCenter Designer

If there are no worksheet files to map the source system values to the domain values, you need to modify the values using Informatica PowerCenter Designer. For more information on configuring the domain value set with CSV worksheet files, see Section 19.13, "Configuring the Domain Value Set with CSV Worksheet Files."

Configuring the domain value set for a particular column, using Informatica PowerCenter Designer, entails one or both of the following activities:

Regardless of which activity you choose, the configuration occurs in the Expression transformation of the applicable source adapter mapplet. The following procedure shows how to configure the Expression transformation to change the domain values.

To map source values to domain values using Informatica PowerCenter Designer:

  1. Identify all the Oracle Business Analytics Warehouse table columns that use domain values.

    For a list of columns that use domain values, see the Oracle Business Analytics Warehouse Data Model Reference.

  2. List all of your source values that qualify for conversion to one of the domain values.

  3. Map each source value to a domain value.

    If any of your source system values do not map to a prepackaged domain value, and the domain value set can be modified, then create a list of new domain values and map your orphaned source system values to the newly created domain values.

    You cannot modify all domain value sets. Also, you must check which metrics are affected by the modified domain value set. For more information, see the Oracle Business Analytics Warehouse Data Model Reference.

  4. In Informatica PowerCenter Designer, open the applicable source adapter mapplet.

  5. Open the Expression transformation.

  6. Locate the applicable port's expression so that you can modify it.

  7. Edit the port's expression to map your source values to the existing domain values.

    Alternately, if you want to add additional domain values, add them in this same expression.

  8. Save and validate your changes to the repository.

19.15 Configuring Conformed Dimensions

This section provides procedures on configuring objects that apply to more than one application and contains the following topic:

19.15.1 Configuring Conformed Dimensions for Universal Adapter

This section provides configuration procedures for modifying dimensions that are loaded using Universal Adapter.

19.15.1.1 Product Effective Dates in the Products Dimension

The Oracle Business Analytics Warehouse stores product Effective To (SRC_EFF_TO_DT) and From (SRC_EFF_FROM_DT) dates in the Products dimension table, W_PRODUCTS. In addition, the Products dimension stores a Support Withdrawn Date column, SPRT_WITHDRAWN_DT.

By default, the Support Withdrawn Date takes precedence over the product Effective To Date. This prioritization means that if you supply a value for the Support Withdrawn Date column in your flat file upload, the Oracle Business Analytics Warehouse uses that value as the product Effective To value as well, overwriting anything in the SRC_EFF_TO_DT column. You can change this default behavior by modifying the Products Expression in the Universal Source Products Extract mapping.

To modify the product Effective To Date logic for a flat file extract:

  1. In Informatica PowerCenter Designer, open the Configuration for Universal Source folder.

  2. In the SDE_Universal_ProductDimension mapping, open the W_PRODUCT_D expression.

  3. Edit the logic for the SRC_EFF_TO_DT_OUT port (see the sketch after this procedure).

  4. Validate and save your changes.
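A minimal sketch of what the SRC_EFF_TO_DT_OUT logic conceptually does and how it might be changed, expressed as SQL CASE logic; the port itself uses an Informatica expression, so verify the actual expression in your repository before editing it.

CASE
  WHEN SPRT_WITHDRAWN_DT IS NOT NULL THEN SPRT_WITHDRAWN_DT   -- default: Support Withdrawn Date takes precedence
  ELSE SRC_EFF_TO_DT
END
-- To keep the source-supplied Effective To date instead, the logic would simply return SRC_EFF_TO_DT.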