This document provides an overview of the EPM lineage spreadsheets and discusses how to use the spreadsheets to:
View lineage information.
Generate lineage information for a job.
The EPM lineage spreadsheets provide information about the ETL jobs that are delivered with the EPM warehouses. The spreadsheets act like a reverse-engineering tool or family tree; they enable you to view the ancestry of source, target, and lookup tables and their relevant ETL jobs. Each spreadsheet provides lineage information for a single warehouse. The following table lists the lineage spreadsheets that are currently available:
Lineage Spreadsheet Filename |
Warehouse |
ETL_CS_Lineage_Spreadsheet.xls |
Campus Solutions Warehouse |
ETL_CRM_Lineage_Spreadsheet.xls |
CRM Warehouse |
ETL_FMS_Lineage_Spreadsheet.xls |
FMS Warehouse |
ETL_HCM_Lineage_Spreadsheet.xls |
HCM Warehouse |
ETL_SCM_Lineage_Spreadsheet.xls |
SCM Warehouse |
By using the spreadsheets, you can:
View lineage information for staging, dimension, and fact ETL jobs, or source, target, and lookup tables.
Identify the sequence of jobs to run for a specific data mart.
Identify inter-mart and cross-warehouse dependencies.
Generate lineage information for a specific ETL job.
Spreadsheet Structure
Each EPM lineage spreadsheet includes several worksheets. The following table provides a description of each worksheet, by name, listed in the order in which it appears:
Worksheet |
Description |
Template |
This worksheet contains overview information, a legend, and a definition of the columns used in the worksheets. |
Setup |
This worksheet contains ETL lineage information for all of the setup and staging jobs required for the warehouse. |
Com Dims |
This worksheet contains ETL lineage information for the common dimension jobs required for the warehouse. |
Utils |
This worksheet contains ETL lineage information for the currency conversion jobs required for the warehouse. |
Global Dims |
This worksheet contains ETL lineage information for the global dimension jobs required for the warehouse. |
Local Dims |
This worksheet contains ETL lineage information for the local dimension jobs required for the warehouse. |
<Data Mart> For example: GL & Profitablity, ESA, Campus Community, and so on. |
This worksheet contains ETL lineage information for the jobs required for a specific data mart. Note. Each spreadsheet includes several data mart worksheets. |
Dynamic_Lineage_Generator |
This worksheet provides a macro that enables you to enter the name of an ETL job and automatically generate a list of the complete lineage for that job. |
JobOrder |
This worksheet is an extension of the Dynamic_Lineage_Generator worksheet. It displays the order in which jobs need to be run. |
Column Descriptions
The following table provides descriptions of the columns in the worksheets.
Column |
Description |
Sequencer Job |
The name of the job sequencer, which is responsible for invoking and running other ETL server jobs. |
Server Job |
The name of the server job that is called by the job sequencer. |
Server Job Category |
The location of the server job in the IBM WebSphere DataStage project. |
Target Table |
The name of the target table used in the server job. |
Target Update Action |
The target load strategy for the server job. |
Source Table |
The name of the source table used in the server job. |
Source Extraction Type |
The type of extraction from the source table in the server job (for example, incremental date time or cyclical redundancy check). |
Lookup Tables |
The name of the lookup tables that are used in the server job. Lookups can be hashed files or direct DRS lookups. The lineage information captures the table names from which the hash files are populated and the table names for the direct DRS lookup. |
Setup Jobs |
The name of the setup job that populates the source and/or the lookup table. |
Setup Sequencer Job |
The name of the job sequencer that calls the setup server job. |
MDW |
The name of the MDW server job. This column has an entry if the source table or lookup table is populated from an MDW server job. |
MDW Sequencer |
The name of the MDW sequence job. |
OWS |
The name of the OWS server job. This column has an entry if the source table or lookup tables are populated from an OWS server job. |
OWS Sequencer |
The name of the OWS sequence job. |
OWE |
The name of the OWE server job. This column has an entry if the source table or lookup tables are populated from an OWE server job. |
OWE Sequencer |
The name of the OWE sequence job. |
EPM Foundation |
The application or EPM foundation setup page that populates the source table or the lookup table, such as Global Consolidations, Dimension Mapper, or setup PIA pages. |
Category |
The categories in which the setup jobs, MDW jobs, OWS jobs or OWE jobs are placed. |
Comments |
Any additional comments, if applicable. |
Note. The spreadsheet does not contain lineage details for OWE jobs and Tree jobs, except for the GL&Profitability Mart of the FMS warehouse, which does include lineage information for OWE jobs.
This section discusses how to use the spreadsheet to:
Find lineage information for a server job.
Identify the list of Jobs to be run for a data mart.
To find lineage information for a server job:
Access the worksheet in which the job is categorized.
Use Excel's Find feature to find the server job name in column B.
Type Ctrl-F to access the Find and Replace Dialog box.
Enter the name of the server job in the Find what edit box.
Click Find Next until the job name is found in the Server Job column (column B).
Close the Find dialog box.
Review the lineage information in the adjacent columns.
The Sequencer Job column (column A) lists the sequencer which calls this job. The Server Job Category column (column C) lists the category this job is associated with. The Target Table, Target Update Action, Source Table, and Source Extraction Type for this server job are listed in columns D, E, F, and G respectively. The Lookup Tables column (Column H) lists all the lookups used by this job.
The source tables and the lookup tables are placed in separate rows. This enables you to find the lineage information for each of these tables by navigating through the other subsequent columns in the same row. Columns I through R list the dependent jobs that are required to populate the source and lookup tables, and entries in these columns indicate whether the table is populated by Setup jobs, (column I), MDW jobs (column K), OWS jobs (column M), OWE jobs (column O), or Foundation setup / Apps (column Q). The Category column (column R) lists the category that the dependent job is associated with.
Source tables that are from a different data mart (inter-mart) or different warehouse (cross-warehouse) are indicated by the colors specified in the legend on the Template worksheet page.
The spreadsheet lists the lineage of a source or lookup table to the level of the job that directly populates it. The lineage information does not extend to level of the last staging job. To get the complete lineage for a fact or dimension job fully extended through the lowest staging level, you can use the dynamic lineage generator tool, which generates a list of all the required dependent jobs that need to be run in order to load a particular fact or dimension.
Example
This example, from the ETL FMS Lineage spreadsheet, takes you through the tasks you would complete to review the information for the fact job J_Fact_PS_F_APAR_NETTING_E, which is used for the AR Data Mart.
Navigate to the AR worksheet page.
Type Ctrl-F and type J_Fact_PS_F_APAR_NETTING_E into the Find and Replace dialog box.
Type Ctrl-F and type J_Fact_PS_F_APAR_NETTING_E into the Find and Replace dialog box.
Click Find Next until you access the cell in the Server Job column that contains the J_Fact_PS_F_APAR_NETTING_E job.
Close the Find and Replace dialog box. You should see the following information:
Scroll to the right to review the columns shown here:
The Target Table, Target Update Action, Source Table, and Source Extraction Type for the J_Fact_PS_F_APAR_NETTING_E server job are listed in columns D, E, F, and G, respectively.
Continue to scroll to the right to view the remaining columns.
The Lookup Tables column (Column H) lists all the lookups used in J_Fact_PS_F_APAR_NETTING_E.
In this example there are three source tables: PS_D_CUST_ORG, PS_D_SUPPLIER, PS_F_AR_AGING. The lookup table is PS_F_AP_AGING. The source tables and the lookup tables are each placed in a unique row one after the other. This enables you to view the lineage information for each of these tables by navigating through the succeeding columns within the same row.
Columns I through R list out the dependent jobs required to populate these source and lookup tables. In this example, the source table PS_D_CUST_ORG has an entry in the MDW column, which means that is is populated from the MDW dimension J_Dim_PS_D_CUST_ORG_SCM, which is placed in the category Global_Dimensions_E\OWS_To_MDW\Base\Load_Tables\Server.
As shown in the following screenshot, the source table PS_D_SUPPLIER is an SCM warehouse dimension. The cross-warehouse dependency is identified by the different color (the color legend is located on the first worksheet page).
Similarly, the lookup table PS_F_AP_AGING is populated from the fact job J_Fact_PS_F_AP_AGING placed in the category FMS_E\Payables_Mart\AP\OWS_To_MDW\Facts\Base\Load_Tables\Server. This fact job belongs to a different mart as indicated by the different color.
You can use the information in the spreadsheet to identify the list of jobs that need to be run for a specific data mart. These include common jobs that are required for every data mart, which we refer to as prerequisite jobs, as well as jobs specific to the particular data mart.
If you prefer, you can create your own master sequencers based on the information provided in this section.
Alternatively, you can generate the list of jobs by using the Dynamic Lineage Generator tool. For more information, see "Generating Lineage Information for a Job".
Note. All the server jobs relating to Hash files that are present within the Load_Hash_Files category need to be run first before running other Sequence jobs within the Load_Tables category since these hash files are being used in other server jobs.
Prerequisite Jobs
The prerequisite jobs include setup jobs, staging jobs, and dimension jobs.
The following sets of jobs need to be run for every mart, in the order that they are listed in the worksheets:
Run these setup jobs in the Setup worksheet:
All jobs within the Setup_E\OWS\<Warehouse> category.
(For example all jobs within the Setup_E\OWS\FSCM category for the FMS warehouse and all jobs within the Setup_E\OWS\CS category for the CS warehouse).
All jobs within the Setup_E\Dimension mapper category.
Note. Please ensure that you run the Business Unit Wizard before proceeding with the following steps.
See Importing Source Business Units into EPM to Create Warehouse Business Units.
All jobs within the Shared_Lookups\DimensionMapper_Lookups category.
All jobs within the Shared_Lookups\Control_Tables category.
All jobs within the Shared_Lookups\System_Lookups category.
All jobs within the Shared_Lookups\Language_Lookups category.
All jobs within the Setup_E\OWE category (this step does not apply to the Campus Solutions warehouse).
If you are implementing currency conversion, then run the jobs listed in the Utils worksheet.
Run the staging jobs listed in the OWS Sequencer column (column N) in the following worksheets:
Com Dims.
Global Dims.
Local Dims.
<Data Mart>, where <Data Mart> is the name of the data mart, for example AP, AR, Campus Community, Student Financials .
Run the Common Dimension Jobs listed in the Com Dims worksheet.
Run the Global Dimensions jobs listed in the Global Dims worksheet. (These jobs are required for running the FMS warehouse jobs.)
Run the Local Dimension Jobs placed in the Local Dims worksheet.
Data Mart Specific Jobs
Run all the Server jobs listed in column B of the worksheet for the specific data mart, to populate the corresponding Dimension and Fact tables for that mart.
Note. Do not run the jobs that are listed within the Reusable Jobs category. These jobs are not used to load target tables. They are automatically triggered by various Sequence jobs.
The Dynamic_Lineage_Generator worksheet contains a macro that generates a list of all the dependent jobs that are required for any ETL job. This will easily help you identify all the list of jobs to be run for a specific fact or dimension job.
To use the Dynamic Lineage Generator:
Access the Dynamic_Lineage_Generator worksheet.
Enter the job name in cell B1.
Click the Get Job Lineage button.
The macro retrieves the lineage required for running this fact job from the setup, staging, and the dimension jobs and displays it in the cells below. The macro also copies the entire list of dependent jobs to the JobOrder worksheet, so you can identify the complete list to be run in sequence.
You must run the following prerequisite setup jobs before you run the jobs listed in the JobOrder worksheet:
Setup_E\OWS\<Warehouse Name> Job Sequencer.
For example Setup_E\OWS\FSCM Job Sequencer or Setup_E\OWS\CS Job Sequencer.
Setup_E\Dimension mapper Job Sequencer.
Run the Business Unit Wizard to populate the Dimension mapper tables.
See Importing Source Business Units into EPM to Create Warehouse Business Units.
Shared_Lookups\DimensionMapper_Lookups
Shared_Lookups\Control_Tables
Shared_Lookups\System_Lookups
Shared_Lookups\Language_Lookups
Setup_E\OWE Job Sequencer (this step does not apply to the Campus Solutions warehouse).
After you run the prerequisite setup jobs, then run the jobs listed in the JobOrder worksheet.