3 Pre-installation and Predeployment Requirements for Oracle Business Intelligence Applications

This chapter provides information about preparing to install and deploy Oracle Business Intelligence Applications. You should review this information before you begin the installation and deployment process. For example, as a minimum you should read the general guidelines for setting up the Oracle Business Analytics Warehouse, and read the appropriate database-specific guidelines for the source OLTP databases that you are using.

You also need to satisfy the database and Informatica PowerCenter requirements that are specified in Chapter 4 (for more information, see Section 4.3, "Mandatory Requirements and Pre-installation Tasks").

This chapter contains the following topics:

  • Section 3.1, "Operating System, Driver, and Connectivity Requirements for Oracle Business Analytics Warehouse"

  • Section 3.2, "General Guidelines for Setting Up Oracle Business Analytics Warehouse"

  • Section 3.3, "IBM DB2 UDB-Specific Database Guidelines for Oracle Business Analytics Warehouse"

  • Section 3.4, "IBM DB2 UDB zOS and OS/390-Specific Database Guidelines for Oracle Business Analytics Warehouse"

  • Section 3.5, "Additional Suggestions for Optimizing Oracle Performance in Oracle Business Analytics Warehouse"

  • Section 3.6, "SQL Server-Specific Database Guidelines for Oracle Business Analytics Warehouse"

  • Section 3.7, "Teradata-Specific Database Guidelines for Oracle Business Analytics Warehouse"

  • Section 3.8, "Miscellaneous Information About Oracle Business Intelligence Applications Deployments"

  • Section 3.9, "Partitioning Guidelines For Large Fact Tables"

3.1 Operating System, Driver, and Connectivity Requirements for Oracle Business Analytics Warehouse

Table 3-1 provides the operating system, driver, and connectivity software requirements for the Oracle Business Analytics Warehouse components.

Note:

For information about the supported versions of the components listed in Table 3-1, see System Requirements and Supported Platforms for Oracle Business Intelligence Applications Fusion Edition.

Table 3-1 OS, Driver and Connectivity Requirements for Oracle Business Analytics Warehouse

Component: ETL Servers
  Operating System: Solaris, AIX, HP, Windows, Linux
  Software: Informatica PowerCenter Integration Service, DAC Server
  Connectivity and Driver: Java SDK, JDBC drivers, Javamail

Component: ETL Clients
  Operating System: Windows
  Software: Informatica client tools, DAC client
  Connectivity and Driver: Java SDK, JDBC drivers, ODBC drivers

Component: Oracle Business Analytics Warehouse
  Operating System: Solaris, AIX, HP, Windows, OS/390, z/OS
  Software: Database software with Oracle Business Analytics Warehouse
  Connectivity and Driver: Not applicable

Component: ETL Repositories
  Operating System: Solaris, AIX, HP, Windows
  Software: Database software with ETL repositories
  Connectivity and Driver: Not applicable


3.2 General Guidelines for Setting Up Oracle Business Analytics Warehouse

The Oracle Business Analytics Warehouse is a database that contains dimensional schemas. Although it is technically possible to put the Oracle Business Analytics Warehouse in the same database as the transactional database, this is not recommended for performance reasons. The transactional database is structured as an online transaction processing (OLTP) database, whereas the Oracle Business Analytics Warehouse is structured as an online analytical processing (OLAP) database, and each is optimized for its own purpose.

The Informatica repository stores all of the Informatica object definitions for the ETL mappings that populate the Oracle Business Analytics Warehouse. It is a series of repository tables that are stored in a database, which can be transactional, analytical, or a separate database.

The Oracle Business Analytics Warehouse works with relational database management systems. In addition to the general requirements, there are additional database management systems (DBMS)-specific requirements depending on the DBMS you are using.

The following general guidelines will help you set up the data warehouse physical database for performance and growth:

During the Oracle Business Analytics Warehouse configuration process, when you create the data warehouse tables using the procedure in Section 4.10.1, "How to Create Data Warehouse Tables", you can create tables in one tablespace and indices in another tablespace. However, for performance reasons, it is recommended that you create tablespaces as described in Table 3-2.

Table 3-2 Recommended Tablespace Configuration

Tablespace Name    List of Tables
DIM_STG            W_*DS
FACT_STG           W_*FS
DIM                W_*D and W_*MD
FACT               W_*F
AGG                W_*A
OTHER              Remaining W_* tables (for example, W_*G and W_*GS tables)
DIM_INDX           Indices of W_*D tables
FACT_INDX          Indices of W_*F tables
OTHER_INDX         Remaining indices of W_* tables
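
For example, on an Oracle database you could pre-create these tablespaces with DDL similar to the following sketch; the datafile paths and sizes are placeholders, not sizing recommendations:

-- Minimal Oracle sketch; adjust file locations, sizes, and autoextend
-- settings for your data volumes and storage layout.
CREATE TABLESPACE DIM_STG
  DATAFILE '/u01/oradata/olap/dim_stg01.dbf' SIZE 2000M AUTOEXTEND ON NEXT 500M;

CREATE TABLESPACE DIM
  DATAFILE '/u01/oradata/olap/dim01.dbf' SIZE 4000M AUTOEXTEND ON NEXT 1000M;

CREATE TABLESPACE DIM_INDX
  DATAFILE '/u01/oradata/olap/dim_indx01.dbf' SIZE 2000M AUTOEXTEND ON NEXT 500M;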


Note:

To avoid fatal deadlocks during the ETL, make sure that you select the 'Session Level Retry on Deadlock' option in Informatica.

3.3 IBM DB2 UDB-Specific Database Guidelines for Oracle Business Analytics Warehouse

Table 3-3 provides guidelines for parameter settings for DB2 relational database management system (RDBMS) usage. Use these guidelines as a starting point. You will need to make changes based on your specific database sizes, data shape, server size (CPU and memory), and type of storage. The database administrator should make changes to the settings based on performance monitoring and tuning considerations.

Table 3-3 Recommended DB2 Parameter Settings

Parameter          DB2 UDB V7   DB2 UDB V8    Notes
SHEAPTHRES         400000       400000
ASLHEAPSZ          15           15
RQRIOBLK           65535        65535
QUERY_HEAP_SZ      16384        16384
JAVA_HEAP_SZ       2048         2048
MAXAGENTS          400          400
NUM_INITAGENTS     10           10
NUM_POOLAGENTS     200          200
INTRA_PARALLEL     YES          YES
FCM_NUM_BUFFERS    12288        12288
SHEAPTHRES_SHR     N/A          =SHEAPTHRES
DBHEAP             16384        16384
CATALOGCACHE_SZ    5558         5558
LOGBUFSZ           2048         2048
UTIL_HEAP_SZ       10000        10000
NUM_ESTORE_SEGS    16           NIL           Extended storage (ESTORE) is not needed in DB2 V8 64-bit because the V7 limit of 1.75 GB addressable memory has been lifted.
ESTORE_SEG_SZ      65536        NIL
LOCKLIST           25000        25000
APP_CTL_HEAP_SZ    5000         5000
SORTHEAP           4000         4000
STMTHEAP           40960        40960
APPLHEAPSZ         2560         2560
PCKCACHESZ         2560         2560
STAT_HEAP_SZ       20000        20000
DLCHKTIME          10000        10000
MAXLOCKS           50           50
LOCKTIMEOUT        1200         1200
MAXAPPLS           500          500
AVG_APPLS          10           10
MAXFILOP           500          500
GROUPHEAP_RATIO    N/A          70            New in V8
APPGROUP_MEM_SZ    N/A          30000         New in V8
DATABASE_MEMORY    N/A          AUTOMATIC     New in V8


Note:

To avoid fatal deadlocks during the ETL, make sure that you select the 'Session Level Retry on Deadlock' option in Informatica.
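
For example, many of these parameters can be applied from the DB2 command line processor with commands similar to the following sketch; the database name OLAP is a placeholder for your warehouse database, and your database administrator should adjust the values per the guidance above:

-- Instance (database manager) level parameters.
UPDATE DATABASE MANAGER CONFIGURATION USING SHEAPTHRES 400000 MAXAGENTS 400 INTRA_PARALLEL YES;

-- Database level parameters (OLAP is a placeholder database name).
UPDATE DATABASE CONFIGURATION FOR OLAP USING SORTHEAP 4000 LOCKLIST 25000 LOCKTIMEOUT 1200 UTIL_HEAP_SZ 10000;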

3.4 IBM DB2 UDB zOS and OS/390-Specific Database Guidelines for Oracle Business Analytics Warehouse

The following requirements apply to IBM DB2 RDBMS usage for zOS and OS/390:

Table 3-4 Variable Settings for IBM DB2 UDB zOS and OS/390 Databases

Parameter   Recommended Setting   Notes
IDTHTOIN    1800
CDSSRDEF    Any
STARJOIN    1                     This setting indicates that star join is enabled. The one table with the largest cardinality is the fact table. However, if there is more than one table with this cardinality, star join is not enabled.


3.5 Additional Suggestions for Optimizing Oracle Performance in Oracle Business Analytics Warehouse

This section contains additional suggestions for optimizing performance.

3.6 SQL Server-Specific Database Guidelines for Oracle Business Analytics Warehouse

This section provides guidelines for SQL Server database usage.

Note:

The SQL Server database must be created with a collation sequence that supports binary sort order or case-sensitive dictionary sort order. Case-insensitive dictionary sort order is not supported. For example, for binary sort order with the U.S English character set, use the collation 'Latin1_General_BIN'. If you use the default collation setting of 'SQL_Latin1_General_CP1_CI_AS', the database is set to case-insensitive, which is not supported, and causes index creation failures.
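
For example, the following T-SQL creates, or changes, a database with a supported binary collation; the database name OLAP is a placeholder:

-- Create the warehouse database with a binary sort order (placeholder name).
CREATE DATABASE OLAP COLLATE Latin1_General_BIN;

-- Or change the default collation of an existing database (requires exclusive access).
ALTER DATABASE OLAP COLLATE Latin1_General_BIN;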

This section includes the following topics:

  • Section 3.6.1, "Setting the ANSI NULL Option"

  • Section 3.6.2, "Modifying the DB Library Options Setting"

  • Section 3.6.3, "Recommended SQL Server Database Parameters"

3.6.1 Setting the ANSI NULL Option

Oracle Business Intelligence Applications requires that SQL Server databases be created with the ANSI NULL option selected.

To set the ANSI NULL option

  1. In the SQL Server Enterprise Manager, right-click the appropriate database, and choose Properties.

  2. Click the Options tab and select the box for ANSI NULL default.
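
Alternatively, you can set the option with T-SQL; the following is a sketch that assumes the warehouse database is named OLAP:

ALTER DATABASE OLAP SET ANSI_NULL_DEFAULT ON;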

3.6.2 Modifying the DB Library Options Setting

In a SQL Server 2000 environment, when loading Analytics tables with international data, or loading more than one language, you need to modify the DB Library Options setting.

To modify the DB Library Options setting

  1. From the Microsoft SQL Server program menu, select Client Network Utility.

  2. Select the DB Library Options tab.

  3. Clear the option Automatic ANSI to OEM.

    Note:

    SQL Server 2000 automatically tunes many of the server configuration options; therefore, an administrator is required to do little, if any, tuning. Although these configuration options can be modified, the general recommendation is that these options be left at their default values, allowing SQL Server to automatically tune itself based on run-time conditions.

3.6.3 Recommended SQL Server Database Parameters

If necessary, SQL Server components can be configured to optimize performance, as shown in Table 3-5.

Table 3-5 Recommended Variable Settings for SQL Server Databases

Parameter                        Recommended Setting   Notes
Affinity mask                    0
Allow updates                    0
Awe enabled                      0
C2 audit mode                    0
Cost threshold for parallelism   5
Cursor threshold                 -1
Default full-text language       1033
Default language                 0
Fill factor                      95%                   For insert-intensive transactions, set Fill factor between 90 and 95%. For better query performance, set Fill factor to 95 or even 100%.
Index create memory              1024 KB               Default is 0.
Lightweight pooling              0
Locks                            0
Max degree of parallelism        0                     Default is 0, which allows SQL Server to use all available processors for parallel plan generation. Leave it at 0 unless you run multi-threaded components (for example, several EIM threads), in which case set it to 1 (use only one process).
Max server memory                2000 MB               Default is 2147483647.
Max text repl size               65536 B
Max worker threads               100                   Default is 255.
Media retention                  0
Min memory per query             1024 KB
Min server memory                500 MB                Default is 0.
Nested triggers                  1
Network packet size              8192 B                Default is 4096.
Open objects                     0
Priority boost                   0
Query governor cost limit        0                     Modify to 60 only if CPU usage is high.
Query wait                       -1 sec
Recovery interval                0 min
Remote access                    1
Remote login timeout             20 sec
Remote proc trans                0
Remote query timeout             600 sec
Scan for startup procs           0
Set working set size             0
Two-digit year cutoff            2049
User connections                 0
User options                     0



  • SQL Server memory: Make sure adequate memory is available.

  • Transaction logs and TempDB: Place these on a separate disk from the disks used for database data.

  • Full load: Use the Full recovery model for the database.

  • Incremental (Refresh) load: Change from the Full to the Bulk-Logged recovery model.
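
The following T-SQL sketch shows how a few of these settings could be applied with sp_configure and how the recovery model can be switched around an incremental load; the database name OLAP is a placeholder and the values simply mirror Table 3-5:

-- Server-level options (advanced options must be made visible first).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 2000;
EXEC sp_configure 'max worker threads', 100;
RECONFIGURE;

-- Switch the recovery model before an incremental (refresh) load,
-- then switch back when the load completes.
ALTER DATABASE OLAP SET RECOVERY BULK_LOGGED;
-- ... run the incremental load ...
ALTER DATABASE OLAP SET RECOVERY FULL;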

3.7 Teradata-Specific Database Guidelines for Oracle Business Analytics Warehouse

This section contains the recommended best practices and guidelines for maximizing performance in Teradata deployments, and contains the following topics:

  • Section 3.7.1, "General Guidelines for Teradata Deployments"

  • Section 3.7.2, "Best Practices For Teradata Deployments"

3.7.1 General Guidelines for Teradata Deployments

The following requirements apply to Teradata database usage:

  • Install the FastLoad and TPump Teradata utilities on the machine on which the Informatica Server is installed.

  • When installing Informatica PowerCenter, make sure there are no spaces in the Informatica Server directory name or the directory path. The default directory contains spaces that you must remove manually.

  • When you install Oracle Business Intelligence Applications in a Teradata environment, you must load the pre-built Informatica repository file \dwrep\Oracle_BI_DW_Teradata.rep.

  • In the ODBC configuration for Teradata, set the session Mode to ANSI and the DateFormat to AAA before you create the schema. Tables must be created as case specific (see the example following this list). If you do the ODBC configuration after the schema is created, tables might not be created as case specific.

  • To avoid overflows if an arithmetic calculation involves any decimal data type larger than 18,3 (precision, scale), add the following static Source System Parameters:

    • "as decimal(18,3))" for $$Hint_Tera_Post_Cast

    • "Cast (" for $$Hint_Tera_Pre_Cast

    For more information about setting Source System Parameters in the DAC Client, refer to Section 4.15.9, "How to Set DAC Source System Parameters".
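
The following Teradata sketch illustrates a case-specific character column of the kind the warehouse schema requires; the table and columns are illustrative only, not the shipped DDL:

-- In ANSI session mode, character comparisons are case specific by default;
-- CASESPECIFIC makes the requirement explicit on the column.
CREATE TABLE W_SAMPLE_DS
(
  INTEGRATION_ID     VARCHAR(30) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
  DATASOURCE_NUM_ID  DECIMAL(10,0) NOT NULL
)
PRIMARY INDEX (INTEGRATION_ID, DATASOURCE_NUM_ID);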

Note:

To avoid fatal deadlocks during the ETL, make sure that you select the 'Session Level Retry on Deadlock' option in Informatica.

3.7.2 Best Practices For Teradata Deployments

This section contains the recommended best practices for maximizing performance in Teradata deployments, and contains the following topics:

  • Section 3.7.2.1, "LEFT OUTER JOIN Issue"

  • Section 3.7.2.2, "Database Statistics"

  • Section 3.7.2.3, "Additional Statistics Required"

3.7.2.1 LEFT OUTER JOIN Issue

Teradata redistributes the OUTER table according to the join key, unless the INNER table is tiny. If the INNER table is tiny, the INNER table could be simply copied to all AMPs and the OUTER table would not be redistributed.

Many fact SIL mappings need to derive the dimensional ROW_WID from a ROW_ID or INTEGRATION_ID. For example, W_PER_RANK_FS.ACCNT_ID needs to be converted to ACCNT_WID before loading into the W_PER_RANK_F table. Because ACCNT_ID is nullable, the join between W_PER_RANK_FS and W_ORG_D is defined as a LEFT OUTER JOIN.

However, the percentage of NULLs in the ACCNT_ID column can be as high as 50% or more, depending on the data set. When W_PER_RANK_FS is redistributed according to ACCNT_ID, all rows with ACCNT_ID = NULL are put onto a single AMP.

Although a Teradata database usually has hundreds of gigabytes of spool space, the spool space is allocated across hundreds of AMPs. The spool space for each AMP is limited (for example, to two gigabytes).

When a large percentage of W_PER_RANK_FS is distributed to a single AMP, this can result in insufficient spool space being available. This happens when too much data is spooled onto one AMP, not because the spool space is too small.

To work with Teradata's parallel processing mechanism and still resolve the LEFT OUTER JOIN, the SQL must be rewritten.

As an example, refer to the following original SQL:

SELECT ...  FROM 
 W_PER_RANK_FS FS LEFT OUTER JOIN  W_ORG_D ORG ON
   FS.ACCNT_ID = ORG.INTEGRATION_ID  AND
   FS.DATASOURCE_NUM_ID = ORG.DATASOURCE_NUM_ID

 The above SQL should be re-coded as follows: 

SELECT ... FROM
 W_PER_RANK_FS FS LEFT OUTER JOIN  
   (SELECT
   FS.INTEGRATION_ID, FS.DATASOURCE_NUM_ID, ORG.ROW_WID, ORG.GEO_WID
    FROM
   W_PER_RANK_FS FS, W_ORG_D ORG
   WHERE
   FS.ACCNT_ID = ORG.INTEGRATION_ID AND FS.DATASOURCE_NUM_ID = 
   ORG.DATASOURCE_NUM_ID AND FS.ACCNT_ID IS NOT NULL)
ORG  ON
   FS.DATASOURCE_NUM_ID = ORG.DATASOURCE_NUM_ID AND
   FS.INTEGRATION_ID = ORG.INTEGRATION_ID 

The same SQL re-coding method can be used for other Source Qualifiers that run into spool space problems.

In addition, there is another solution to the lack of spool space issue. This solution is to convert the NULLs to some evenly distributed but non-matched values, as shown in the following SQL example:

SELECT ...  FROM
W_PER_RANK_FS FS LEFT OUTER JOIN  W_ORG_D ORG ON
   CASE WHEN FS.ACCNT_ID IS NOT NULL
   THEN FS.ACCNT_ID
   ELSE  '#' || FS.INTEGRATION_ID END
   = ORG.INTEGRATION_ID
AND FS.DATASOURCE_NUM_ID = ORG.DATASOURCE_NUM_ID

Note:

In the re-coded SQL, the CASE expression substitutes a unique, non-matching value ('#' concatenated with INTEGRATION_ID) for NULL ACCNT_ID values, so the rows are distributed evenly across the AMPs instead of collecting on a single AMP.

3.7.2.2 Database Statistics

This section describes database statistics.

3.7.2.3 Additional Statistics Required

Oracle Business Intelligence provides a list of column and index statistics collection statements. These are generally applicable for all situations, but should be evaluated for each site and situation. Additional statistics might be required, depending on your requirements. For example, some workflows need more statistics to avoid an out of spool space error on Teradata server.

3.7.2.3.1 Collect Statistics As A Pre-requisite

Once the tables have been created in the staging and target databases, you must run the supplied statistics collection. Failure to do so can affect ETL performance, and possibly result in a spool space error (error number 2646).

The DAC does the re-collection of statistics as part of the ETL processes. However, the DAC issues the collect statistics statement at the table level only (for example, collect statistics on w_org_d), and only for existing statistics.
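
The following statements are a sketch of the kind of collection the supplied scripts perform; the column choices here are illustrative, not the shipped list:

-- Define and collect statistics on individual columns and indexes.
COLLECT STATISTICS ON W_ORG_D COLUMN (ROW_WID);
COLLECT STATISTICS ON W_ORG_D COLUMN (INTEGRATION_ID);
COLLECT STATISTICS ON W_ORG_D INDEX (DATASOURCE_NUM_ID, INTEGRATION_ID);

-- Re-collects every statistic already defined on the table; this is the
-- table-level form the DAC issues during ETL.
COLLECT STATISTICS ON W_ORG_D;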

3.7.2.3.2 Group By Vs Distinct

When there is a low number of distinct values, it is more efficient to use the GROUP BY phrase. Do not use the DISTINCT phrase, unless the number of distinct values is high.
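
For example, the following two queries return the same result, but the GROUP BY form is generally cheaper on Teradata when the column has few distinct values (illustrative table and column):

-- Preferred when the number of distinct values is low.
SELECT ACCNT_ID
FROM W_PER_RANK_FS
GROUP BY ACCNT_ID;

-- Use DISTINCT only when the number of distinct values is high.
SELECT DISTINCT ACCNT_ID
FROM W_PER_RANK_FS;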

3.7.2.3.3 Pruning Of Mapping

If you do not use all of the OTB fields supplied, you can improve performance by stripping the extraneous fields from the mapping and the tables.

3.7.2.3.4 Loader Configurations

This section explains the loaders that are available in Teradata, and how they are used in Oracle Business Intelligence Applications.

Teradata provides three different loader processes, as follows:

  • Fastload

  • Mload

  • Tpump

Each loader process can be used in two different modes, as follows:

  • Staged Mode: The Informatica process does the following in this order:

    • Reads from the source data.

    • Creates a data file.

    • Invokes the loader process to load the table using the data file created.

    Advantages: In the event of failures, you can recover using the Teradata recovery process.

    Disadvantages: Staged mode is slower than Piped mode, and you need more disk space, as it can create large data files.

  • Piped Mode: The Informatica process reads from the source and simultaneously pipes that data to the loader to start loading the target table.

    Advantages: Quicker than Staged mode, and you do not require large amounts of disk space because no data files are created.

    Disadvantages: In the event of failures, you cannot recover using the Teradata recovery process (because tpump does row commits unlike fastload and mload).

3.7.2.3.5 Fastload

You use the Fastload process on empty tables, such as loading staging tables and in initial loads where the tables are empty.

When the Fastload process starts loading, it locks the target table, which means that processes (for example, lookups) cannot access that table. One solution to this problem is to specify dummy SQL for the look up overrides at the session level.

Tip:

If a session fails during a Fastload process, use SQL Assistant to run a simple SQL command (for example, count(*)) to determine whether the table is locked by a Fastload process.

If a table is locked (for example, W_ORG_DS), use the following script to release the lock:

LOGON DATABASEALIAS/USER,PASSWORD;
BEGIN LOADING USER.W_ORG_DS
ERRORFILES USER.ET_W_ORG_DS,USER.UV_W_ORG_DS;
END LOADING;

If you save the above text in a file called test.ctl, you would run this process by entering the following command at a command prompt:

C:\fastload\test.ctl

Tip:

To create a load script for a table, edit the test.ctl script above to change the login information, and replace all occurrences of W_ORG_DS with the required target table name.

After a load process script runs successfully, you should be able to run the command 'select count(*)' on the target table. If you are not able to release the lock, you might need to drop and re-create the table to remove the lock. If you do so, you must re-create the statistics.

Tip:

Fastload is typically used in piped mode to load staging tables and initial loads. In the event of errors, reload the entire data set.

3.7.2.3.6 Mload

The Mload process is slower than Fastload but quicker than Tpump. The Mload process can work on both empty tables as well as on tables with data. In the event of errors when running in piped mode, you cannot recover the data.

3.7.2.3.7 Tpump

The Tpump process is slower than Mload but faster than ODBC. The Tpump process does row commits, which enables you to recover processed operations even if you use piped mode. In other words, if you restart the process, Tpump resumes loading from the last commit point.

Tpump can be used in the following modes:

  • Tpump_Insert: Use to do inserts.

  • Tpump_Update: Use to do updates (this mode requires you to define the primary key in the Informatica target table definition).

  • Tpump_Upsert: Use to do update otherwise insert (this mode requires you to define the primary key in the Informatica target table definition).

  • Tpump_Delete: Use to do deletes (this mode requires you to define the primary key in the Informatica target table definition).

Informatica uses the actual target table name to generate the error table and log table names used in its control file generation. If you have two instances of Tpump loading into the same target table at the same time, you need to modify the session to use a different error table and log table name.

The Tpump load process in piped mode is useful for incremental loads, and where the table is not empty. In the event of errors, restart the process and it starts re-loading from the last committed data.

Refer to Informatica documentation for information about configuring a session to use Teradata loaders.

3.8 Miscellaneous Information About Oracle Business Intelligence Applications Deployments

This section contains the following topics:

  • Section 3.8.1, "Preconfigured Mapping for the SA System Subject Area"

  • Section 3.8.2, "Using Initialization Blocks"

  • Section 3.8.3, "Performance Improvements for Change Capture for Incremental ETL Run"

  • Section 3.8.4, "Running an ETL with a Non-English OLTP Data Source"

3.8.1 Preconfigured Mapping for the SA System Subject Area

For Oracle's Siebel Applications customers, Table 3-6 describes the preconfigured mappings for the SA system subject area. Fields that are not available in Oracle's Siebel transactional database will default to values shown in the table.

  • Overriding Defaults. You can add user-specific values for these fields by creating an extension table to the S_USER table to store the user-specific defaults. Additionally, you can change any of the default values. The metadata for the following logical table can be modified to include any physical extension table.

    SA User.(User)
    

    For instructions, refer to the documentation about configuring tables and columns for Oracle's Siebel Business Applications.

  • Setting Provider Information. Typically, the cell phone and the fax numbers in the Oracle Business Analytics Warehouse do not contain a provider name. Therefore, the Pager will typically be a numeric value such as 555-483-3843. To append a provider to this address, use the following guidelines:

    • If the entire company has the same provider, then you can append the provider in the column mappings.

    • If users can have different providers, you need to create an extension table. For instructions, refer to the documentation about configuring tables and columns for Oracle's Siebel business applications.

Table 3-6 Preconfigured Mappings for the User Table in the SA System Subject Area

Logical Column        Physical Table   Expression                  Comments
Cell Phone            -                ''                          It might be mapped to S_CONTACT.CELL_PH_NUM if this field contains an SMTP address.
Cell Phone Priority   -                ''                          Defaults to N.
Display Name          S_CONTACT        "Real Time OLTP"."".SIEBEL.S_CONTACT_User.FST_NAME || ' ' || "Real Time OLTP"."".SIEBEL.S_CONTACT_User.LAST_NAME   First Name concatenated with Last Name.
Email                 S_CONTACT        EMAIL_ADDR                  -
Email Priority        -                'HNL'                       Defaults to N.
Email Type            -                'html'                      Defaults to HTML.
Group Name            S_RESP           NAME                        -
Handheld              -                ''                          Defaults to an empty string.
Handheld Priority     -                ''                          Defaults to an empty string.
Language              -                'en'                        Defaults to 'en'.
Locale                -                'en'                        Defaults to 'en'.
Logon                 S_USER           LOGIN                       -
Pager                 -                ''                          It could be mapped to S_CONTACT.PAGER_PH_NUM if this field contains an SMTP address.
Pager Priority        -                ''                          Defaults to N.
Time Zone             S_TIMEZONE       NAME                        -



3.8.2 Using Initialization Blocks

Table 3-7 lists some of the initialization blocks common to all Oracle Business Intelligence applications and their purposes. Initialization blocks that are specific to each Oracle Business Intelligence Applications area are not listed here.

Table 3-7 Initialization Blocks and Their Purposes

Initialization Block                   Purpose
Authorization                          Calculates the user's responsibilities from the database.
Authentication                         Authenticates against the database and verifies that the user exists as an operational application user.
External Metadata Strings              Gets the translated value of metadata strings for the user's locale. This initialization block is critical to Intelligence Dashboards in international deployments.
LOCALE                                 Sets the user's locale specification in the Oracle BI Server.
Login Properties                       Gets the user's login properties, such as full name, from the database. This initialization block also sets the logging level of the users. By default, the log level is 0 for all users. If you wish to generate Oracle BI query logs for all users, update this initialization block by changing the default value as well as the value in the initialization SQL.
Default System Language ID             Sets the variable OLTP_LANG_ID by querying the operational application database.
Organizations for Org-based Security   Queries the operational application database to get the organization membership for each user. It sets the variable ORGANIZATION.
Primary Owner ID                       Sets the primary owner ID based on the user login ID.
Primary Position ID                    Queries the operational application database to set the variable PRIMARY_POSTN_ID.
Warehouse Refresh Date                 Sets several time-based variables such as CURRENT_YEAR.
ETL Run Date                           Retrieves the ETL run date.
ETL Default Currency                   Retrieves the default currency.


3.8.3 Performance Improvements for Change Capture for Incremental ETL Run

To maximize performance for Siebel (CRM) and Oracle E-Business Suite OLTP Systems, you need to implement the indexes specified in this section.

3.8.3.1 Implementing Transactional Database (OLTP) Indexes

To implement indexes, you use SQL files that are available in the \OracleBI\dwrep directory. Table 3-8 describes the SQL files that are appropriate for the specified applications.

Table 3-8 SQL files for Siebel transactional databases

Application Name         SQL File Name
Horizontal Application   PerfIndex_Horizontal.sql
Industry Application     PerfIndex_Industry.sql


The SQL files generate indexes on all S_.* tables that are being used by the out-of-the-box applications.
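
For example, on an Oracle transactional database you could run the appropriate script from a SQL*Plus session connected to the OLTP schema; the path below is an assumption based on the default \OracleBI\dwrep location:

-- Run from SQL*Plus while connected to the Siebel transactional (OLTP) schema.
@C:\OracleBI\dwrep\PerfIndex_Horizontal.sql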

Note:

If you move from a test to a production environment, you need to drop the indexes and re-create them in the production environment.

3.8.3.2 An Example Of Change Capture SQL And Required Indexes

The change capture process generates SQL similar to the following:

INSERT INTO S_ETL_I_IMG_XX (ROW_ID, LAST_UPD, MODIFICATION_NUM)
SELECT ROW_ID, LAST_UPD, MODIFICATION_NUM
FROM
S_XXX
WHERE LAST_UPD > 'LAST_REFRESH_DATE - PRUNE DAYS'
AND NOT EXISTS
(
SELECT 'X' FROM S_ETL_R_IMG_XX
WHERE S_ETL_R_IMG_XX.ROW_ID = S_XXX.ROW_ID AND
S_ETL_R_IMG_XX.MODIFICATION_NUM = S_XXX.MODIFICATION_NUM AND
S_ETL_R_IMG_XX.LAST_UPD = S_XXX.LAST_UPD
)

Table 3-9 shows the indexes that, based on the SQL above, are created on the S_CONTACT table by the SQL generation scripts.

Table 3-9 Indexes Created on the S_CONTACT Table in Siebel transactional database

Index           Index Columns
S_CONTACT_W1    LAST_UPD, ROW_ID, MODIFICATION_NUM
S_CONTACT_W11   LAST_UPD
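
The following DDL is an illustrative sketch of these indexes; the shipped scripts may add storage clauses or other options:

CREATE INDEX S_CONTACT_W1  ON S_CONTACT (LAST_UPD, ROW_ID, MODIFICATION_NUM);
CREATE INDEX S_CONTACT_W11 ON S_CONTACT (LAST_UPD);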


3.8.4 Running an ETL with a Non-English OLTP Data Source

To run an ETL with a non-English OLTP Data Source, you must make a copy of an appropriate source system container, and configure the language, country, and continent parameters.

To run an ETL with a non-English OLTP Data Source:

  1. In the Data Warehouse Administration Console (DAC), choose File, then New Source System to display the New Source System Container dialog.

  2. Select the Create as a Copy of Existing Container radio button.

  3. From the Existing Containers drop down list, select the container that you want to copy, then click OK.

  4. Display the Design view.

  5. Ensure that you have selected the correct container from the containers drop down list.

  6. Select the Source System Parameters tab.

  7. Use the Edit tab below the list of Source System Parameters to change the value of the following parameters in the list:

    • $$DFLT_LANG (Default Language) – For example, for a Japanese data source, change this value to JPN.

    • (optional) $$DFLT_COUNTRY (Default Country)

    • (optional) $$DFLT_CONTINENT (Default Continent)

    Tip: To find out the value to specify for the $$DFLT_LANG parameter, issue the query 'select VAL from S_SYS_PREF where SYS_PREF_CD=<ETL value>' against the OLTP database. For example, to find out the default ETL language, issue the following command:

    select VAL from S_SYS_PREF where SYS_PREF_CD='ETL Default Language';

  8. Create a new ETL plan for the new source system container, and edit its parameters, as follows:

    1. Display the Execute tab.

    2. Click the Execution Plans sub-tab.

    3. Click New to create a new blank execution plan, and use the sub-tabs below (for example, Subject Areas, Parameters, Ordered Tasks) to specify the execution plan details.

    4. Click Save.

  9. Click Run Now to run the new ETL plan (or use the Schedule tab to specify when you want to run the new execution plan).

3.9 Partitioning Guidelines For Large Fact Tables

This section explains how to use partitioning to maximize performance in your Oracle Business Intelligence Applications deployment, and contains the following topics:

  • Section 3.9.1, "Introduction to Partitioning Large Fact Tables"

  • Section 3.9.2, "How to Partition Large Fact Tables"

  • Section 3.9.3, "Configure DAC To Support ETL For Partitioned Tables"

  • Section 3.9.4, "Example XML Files"

3.9.1 Introduction to Partitioning Large Fact Tables

Taking advantage of range partitioning for fact tables will not only reduce index and statistics maintenance time during ETL, but will also improve web query performance. Because the majority of inserts and updates impact the last partition(s), you need to disable local indexes only on the few impacted partitions, and then rebuild the disabled indexes after the load and compute statistics on the updated partitions only. Online reports and dashboards should also render results faster, since the optimizer builds more efficient execution plans using partition elimination logic.
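
For example, a query whose filter is expressed on the partitioning key lets the optimizer scan only the matching partitions; the measure column below is illustrative:

-- Only the partitions covering 200711-200712 are scanned (partition pruning).
SELECT SUM(REVENUE)        -- REVENUE is an assumed measure column
FROM   W_REVN_F
WHERE  CLOSE_DT_WID BETWEEN 200711 AND 200712;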

Large fact tables, with more than 10 million rows, are good candidates for partitioning. To build an optimal partitioned table with reasonable data distribution, you can consider partitioning by month, quarter, year, etc. You can either identify and partition target fact tables before initial run, or convert the populated tables into partitioned objects after the full load.

To implement the support for partitioned tables in Oracle Business Analytics Data Warehouse, you need to update DAC metadata and manually convert the candidates into partitioned tables in the target database.

To Deploy Partitioned Fact Tables:

  1. Partition the Large Fact Tables (for more information, see Section 3.9.2, "How to Partition Large Fact Tables").

  2. Configure the DAC to support ETL for partitioned tables (for more information, see Section 3.9.3, "Configure DAC To Support ETL For Partitioned Tables").

3.9.2 How to Partition Large Fact Tables

If you have large fact tables that are affecting performance, you can maximize performance by partitioning the fact tables as described in this section.

  1. Identify a partitioning key and decide on a partitioning interval.

    Choosing the correct partitioning key is the most important factor for effective partitioning, since it defines how many partitions will be involved in web queries or ETL updates. Review the following guidelines for selecting a column for a partitioning key:

    • Identify eligible columns of type DATE for implementing range partitioning.

    • Connect to the Oracle BI Server repository and check the usage or dependencies on each column in the logical and presentation layers.

    • Analyze the summarized data distribution in the target table by each potential partitioning key candidate and data volumes per time range, month, quarter or year.

    • Based on the compiled data, decide on the appropriate partitioning key and partitioning range for your future partitioned table.

    • The recommended partitioning range for most implementations is a month.

    The following table summarizes the recommended partitioning keys for some large Oracle Business Intelligence Applications Fact tables:

    Table 3-10 Recommended partitioning keys for large fact tables

    Area         Table Name                Partitioning Key
    Financials   W_AP_XACT_F               POSTED_ON_DT_WID
    Financials   W_AR_XACT_F               POSTED_ON_DT_WID
    Financials   W_GL_REVN_F               POSTED_ON_DT_WID
    Financials   W_GL_COGS_F               POSTED_ON_DT_WID
    Financials   W_TAX_XACT_F              ACCT_PERIOD_END_DT_WID
    Financials   W_GL_OTHER_F              ACCT_PERIOD_END_DT_WID
    Sales        W_SALES_INVOICE_LINE_F    INVOICED_ON_DT_WID
    Sales        W_SALES_SCHEDULE_LINE_F   ORDERED_ON_DT_WID
    Sales        W_REVN_F                  CLOSE_DT_WID


  2. Identify future global and local indexes.

    Determine all active indexes on a partitioned table candidate (W_REVN_F in the example) using the following query in the DAC schema:

    select I.NAME, C.NAME, C.POSTN, I.ISUNIQUE, I.ISBITMAP
    from W_ETL_TABLE T, W_ETL_INDEX I, W_ETL_INDEX_COL C
    where T.ROW_WID = I.TABLE_WID
    and T.NAME='W_REVN_F'
    and I.ROW_WID=C.INDEX_WID
    and I.INACTIVE_FLG = 'N'
    order by I.NAME, C.POSTN;
    

    Review the query output and choose an existing unique index as a global unique index for your partitioned table. All bitmap indexes must be re-created as local partitioned indexes. If the query returns B-Tree indexes other than the chosen candidate for a global index, consider creating them as local indexes as well.

  3. Create a partitioned table.

    You can pre-create a partitioned table prior to the initial load, or load data into the regular table and then create its partitioned copy and migrate the summarized data. If you have already completed the initial load into a regular table and then decided to partition it, you do not need to re-run the initial load.

    You can consider two options to convert a table into a partitioned one:

    • Create table as select.

      This method is simpler and faster.

    • Create table exchange partition syntax and then split partitions.

      This method is suitable in high availability data warehouses when you have to carry out partitioning with end users accessing the data.

      Note:

      It is important to make sure that your partition names include the values from the DAC system variables CURRENT_MONTH_WID and PREVIOUS_MONTH_WID, which are defined in the next section. These variables will be used in custom XML files to handle local and global indexes using syntax: 'partition PART_@DAC_CURRENT_MONTH_WID' and 'partition PART_@DAC_PREVIOUS_MONTH_WID'. Note that the DDL syntax must have the prefix 'DAC_' in the custom XML files.

      You should create enough partitions to cover the time range for the summarized data. You should create additional partitions for future ranges as well. You may also consider distributing the partitions across different tablespaces.

      The syntax in the following SQL commands assumes everything in one tablespace USERS:

      SQL> rename W_REVN_F to W_REVN_F_ORIG;
      SQL> create table W_REVN_F
      partition by range (CLOSE_DT_WID) (
      partition PART_MIN    values less than (200601),
      partition PART_200601 values less than (200602),
      partition PART_200602 values less than (200603),
      ...
      partition PART_200711 values less than (200712),
      partition PART_200712 values less than (200801),
      ...
      partition PART_MAX    values less than (maxvalue)
      )
      tablespace USERS
      nologging parallel
      enable row movement
      as
      select * from W_REVN_F_ORIG
      ;
      
  4. Drop all indexes on the renamed table.

    To ensure index naming standards in the target schema, drop all indexes on the renamed table (W_REVN_F_ORIG table in the example).

  5. Create a global unique index.

    Create a global unique B-Tree index using the information from item Step 2, for example:

    create unique index W_REVN_F_P1 
    on W_REVN_F (ROW_WID)
    tablespace USERS_IDX global nologging parallel;
    
  6. Create local indexes.

    You can generate the set of required 'Create index' DDL commands for Bitmap indexes using the DAC repository, for example:

    select 'CREATE BITMAP INDEX '|| i.name || ' ON W_REVN_F ('|| c.name || ' ASC) tablespace USERS_IDX LOCAL NOLOGGING PARALLEL;'
    from W_ETL_TABLE T, W_ETL_INDEX I, W_ETL_INDEX_COL C
    where T.ROW_WID = I.TABLE_WID
    and T.NAME='W_REVN_F'
    and I.ROW_WID = C.INDEX_WID
    and I.ISBITMAP = 'Y'
    and I.INACTIVE_FLG = 'N';
    

    If there are any local B-Tree indexes identified in Step 2, create them as local partitioned indexes as well. You can modify the query above by changing I.ISBITMAP = 'Y' to I.ISBITMAP = 'N' and removing the BITMAP keyword from the generated DDL. Connect to the Target schema and execute the generated DDL to create the indexes.

  7. Compute statistics.

    Compute statistics on the newly partitioned table and its indexes using the DBMS_STATS API, for example:

    SQL> BEGIN
    DBMS_STATS.GATHER_TABLE_STATS( NULL
    , tabname=>'W_REVN_F'
    , cascade => TRUE
    , estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
    , method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO'
    );
    END;
    
  8. Enable row movement.

    Since the custom XML scripts will make local indexes unusable during incremental runs, you need to set the parameter skip_unusable_indexes to true in the session running ETL. Connect to your Informatica Workflow Manager, navigate to Connections, then Relational, and edit Data Warehouse connection. Add the following SQL statement in the field Environment SQL:

    alter session set skip_unusable_indexes=true;
    
  9. Drop the original table.

    After you confirm performance gains with the partitioned table and indexes in your target database, you can consider dropping the original renamed table (_ORIG).

You now need to configure the DAC to support ETL for partitioned tables (for more information, see Section 3.9.3, "Configure DAC To Support ETL For Partitioned Tables").

3.9.3 Configure DAC To Support ETL For Partitioned Tables

Once you have partitioned your fact tables as described in Section 3.9.2, "How to Partition Large Fact Tables", you need to configure the DAC console to support the partitioned tables, as follows:

  1. Inactivate all indexes on a partitioned target table, as follows:

    1. In the DAC Client, display the Design View.

      For more information about logging into the DAC, see Section A.1, "How to Log Into the DAC Client".

    2. Display the Indices tab in the upper right pane.

    3. Click the Query button, then enter the table name, then click Go.

    4. Select the Inactive check box for all indexes on the target table.

      These indexes will be handled outside DAC through pre-SQL and post-SQL at a task level.

    Note:

    The DAC version released with Oracle Business Intelligence Applications Version 7.9.5 does not disable computing statistics at a table level. Since computing statistics for a partitioned table will be handled by a post-SQL XML file, the Analyze Table task in DAC for the table will be redundant. To work around this limitation, you can abort the execution plan in DAC, mark the Analyze Table task for your staging table as Completed, and restart the execution plan.

  2. Create two new source system parameters in the DAC Client, as follows:

    1. In the DAC Client, display the Design View.

      For more information about logging into the DAC, see Section A.1, "How to Log Into the DAC Client".

    2. Select your custom container from the pull-down menu (next to the Execute Button).

      Note:

      You cannot create any custom parameters in the pre-packaged containers. You must first create a new container. Refer to the Oracle Business Intelligence Applications Database Administration Console Guide for information to create a Custom Container as a copy of a packaged one.
    3. Display the Source System Parameters tab in the upper right pane.

    4. Use the New button to create two new source system parameters with the following values:

      Table 3-11 Source system parameters for partitioned tables

      Name                 Data Type   Value
      CURRENT_MONTH_WID    SQL         select to_char(row_wid) from w_month_d where w_current_cal_month_code = 'Current'
                                       (When you click the Value field to display the Enter Parameter Value dialog, select DBConnection_OLAP from the Logical Data Source drop-down list.)
      PREVIOUS_MONTH_WID   SQL         select to_char(row_wid) from w_month_d where w_current_cal_month_code = 'Previous'
                                       (When you click the Value field to display the Enter Parameter Value dialog, select DBConnection_OLAP from the Logical Data Source drop-down list.)


      These source system parameters will be used in pre-SQL and post-SQL xml files during incremental loads.

  3. Prepare pre-SQL xml files and post-SQL xml files for full and incremental loads.

    You can either use the template file samplesql.xml, located in the <DAC_HOME>\CustomSQLs directory, to generate the required pre- and post-SQL XML files, or base your custom XML files on the examples in the next section. The DAC executes the SQL from the pre-SQL XML files before the Informatica tasks, and then runs the SQL from the post-SQL XML files.

    Create the following set of custom xml files, one set per partitioned table:

    • pre_<TABLE_NAME>_ini.xml - drops all local and global indexes.

    • post_<TABLE_NAME>_ini.xml - creates all local and global indexes.

    • pre_<TABLE_NAME>_incr.xml - makes local indexes unusable for the last two partitions, 'PART_@DAC_CURRENT_MONTH_WID' and 'PART_@DAC_PREVIOUS_MONTH_WID'.

    • post_<TABLE_NAME>_incr.xml - rebuilds the unusable local indexes and computes statistics on the two partitions, the rebuilt local indexes, and the global indexes.

      Important: Make sure that your partition names include the values from the DAC system variables CURRENT_MONTH_WID and PREVIOUS_MONTH_WID. These variables must be used in the custom XML files to handle local and global indexes using syntax: 'partition PART_@DAC_CURRENT_MONTH_WID' and 'partition PART_@DAC_PREVIOUS_MONTH_WID'. Note that the DDL syntax must have the prefix 'DAC_' in the custom XML files.

  4. Copy all pre- and post-SQL XML files to the <DAC_HOME>\CustomSQLs directory.

  5. Associate the four generated custom XML files with a task, which loads data into the target partitioned table.

    1. In the DAC Client, display the Design View.

      For more information about logging into the DAC, see Section A.1, "How to Log Into the DAC Client".

    2. Select your custom container from the pull-down menu (next to the Execute Button).

    3. Click the Tables menu in the upper right pane, and find or query the target partitioned Fact table.

    4. Click the menu Target for Tasks (RO) in the lower right pane.

    5. Find the SIL primary mapping name. For example the primary mapping for W_GL_REVN_F fact table is SIL_GLRevenueFact. Copy or save the mapping name.

    6. Click the menu Tasks in the upper right pane.

    7. Find or query the identified mapping name from the preceding step.

    8. Fill in the custom XML file names in the following fields on the bottom right pane and save the change:

      Table 3-12 Field Values on Tasks Tab

      Field Name on Tasks Tab         Value                        Description
      Pre-SQL for Full Load           pre_<TABLE_NAME>_ini.xml     Pre-SQL XML file for the full load. For an example, see Section 3.9.4.1, "Pre_W_REVN_F_ini.xml - Pre SQL xml file name for Full Load".
      Post-SQL for Full Load          post_<TABLE_NAME>_ini.xml    Post-SQL XML file for the full load. For an example, see Section 3.9.4.2, "Post_W_REVN_F_ini.xml - Post SQL xml file name for Full Load".
      Pre-SQL for Incremental Load    pre_<TABLE_NAME>_incr.xml    Pre-SQL XML file for the incremental load. For an example, see Section 3.9.4.3, "Pre_W_REVN_F_incr.xml - Pre SQL xml file name for Incremental Load".
      Post-SQL for Incremental Load   post_<TABLE_NAME>_incr.xml   Post-SQL XML file for the incremental load. For an example, see Section 3.9.4.4, "Post_W_REVN_F_incr.xml - Post SQL xml file name for Incremental Load".


  6. Save the changes in DAC.

  7. Select the checkbox Analyze Truncated Tables Only under the Execution Plans tab in the DAC.

    1. Display the Execute view.

    2. Click the tab Execution Plans in the upper right pane.

    3. Find your Execution Plan.

    4. Check the checkbox Analyze Truncated Tables only and save the changes.

      The DAC will compute statistics on tables that were truncated prior to an ETL and reloaded during the plan execution.

    5. Rebuild the Execution Plan in DAC.

      Click the tab Build in the upper right pane to rebuild the execution plan. When the Build process is complete, the ETL is ready to run.

3.9.4 Example XML Files

This section contains example XML files for pre-SQL and post-SQL for use in Step 5 of the task Section 3.9.3, "Configure DAC To Support ETL For Partitioned Tables", and contains the following topics:

  • Section 3.9.4.1, "Pre_W_REVN_F_ini.xml - Pre SQL xml file name for Full Load"

  • Section 3.9.4.2, "Post_W_REVN_F_ini.xml - Post SQL xml file name for Full Load"

  • Section 3.9.4.3, "Pre_W_REVN_F_incr.xml - Pre SQL xml file name for Incremental Load"

  • Section 3.9.4.4, "Post_W_REVN_F_incr.xml - Post SQL xml file name for Incremental Load"

3.9.4.1 Pre_W_REVN_F_ini.xml - Pre SQL xml file name for Full Load

<?xml version="1.0" ?>
<CustomSQLs>
   <!--  Start Drop Index -->
   <!--  Drop W_REVN_F_F1 -->
   <sql name="Drop Index W_REVN_F_F1" type="SQL" continueOnFail="true" validDBPlatforms="Oracle">
      <![CDATA[ drop index W_REVN_F_F1 ]]>
   </sql>
   <!--  Drop W_REVN_F_F2 -->
      <sql name="Drop Index W_REVN_F_F2" type="SQL" continueOnFail="true" validDBPlatforms="Oracle">
      <![CDATA[ drop index W_REVN_F_F2 ]]>
   </sql>
...
...
...
   <!--  Drop W_REVN_F_U1 -->
      <sql name="Drop Index W_REVN_F_U1" type="SQL" continueOnFail="true" validDBPlatforms="Oracle">
      <![CDATA[ drop index W_REVN_F_U1 ]]>
   </sql>
</CustomSQLs>

3.9.4.2 Post_W_REVN_F_ini.xml - Post SQL xml file name for Full Load

<?xml version="1.0" ?>
<CustomSQLs>
   <!--  Start Create Index -->
   <!--  Create global index W_REVN_F_P1 -->
      <sql name="Create Index W_REVN_F_P1" type="SQL" continueOnFail="false" validDBPlatforms="Oracle">
      <![CDATA[  create unique index W_REVN_F_P1 on W_REVN_F(ROW_WID) tablespace USERS_IDX GLOBAL NOLOGGING PARALLEL ]]>
   </sql>
   <!--  Create local index W_REVN_F_F1 -->
      <sql name="Create Index W_REVN_F_F1" type="SQL" continueOnFail="false" validDBPlatforms="Oracle">
      <![CDATA[ CREATE BITMAP INDEX W_REVN_F_F1 ON W_REVN_F(COLUMN_NAME1 ASC) TABLESPACE USERS_IDX LOCAL NOLOGGING PARALLEL ]]>
   </sql>
   <!--  Create local index W_REVN_F_F2 -->
      <sql name="Create Index W_REVN_F_F2" type="SQL" continueOnFail="false" validDBPlatforms="Oracle">
      <![CDATA[ CREATE BITMAP INDEX W_REVN_F_F2 ON W_REVN_F(COLUMN_NAME2 ASC) TABLESPACE USERS_IDX LOCAL NOLOGGING PARALLEL ]]>
   </sql>
...
...
...
</CustomSQLs>

3.9.4.3 Pre_W_REVN_F_incr.xml - Pre SQL xml file name for Incremental Load

<?xml version="1.0" ?>
<CustomSQLs>
   <!--  Start W_REVN_F_F1 -->
   <sql name="W_REVN_F_F1 Disable Previous" type="SQL" continueOnFail="true" validDBPlatforms="Oracle">
      <![CDATA[ alter index W_REVN_F_F1 modify partition PART_@DAC_PREVIOUS_MONTH_WID unusable ]]>
   </sql>
   <sql name="W_REVN_F_F1 Disable Current" type="SQL" continueOnFail="true" validDBPlatforms="Oracle">
      <![CDATA[ alter index W_REVN_F_F1 modify partition PART_@DAC_CURRENT_MONTH_WID unusable ]]>
   </sql>
   <!--  Start W_REVN_F_F2 -->
   <sql name="W_REVN_F_F2 Disable Previous" type="SQL" continueOnFail="true" validDBPlatforms="Oracle">
      <![CDATA[ alter index W_REVN_F_F2 modify partition PART_@DAC_PREVIOUS_MONTH_WID unusable ]]>
   </sql>
   <sql name="W_REVN_F_F2 Disable Current" type="SQL" continueOnFail="true" validDBPlatforms="Oracle">
      <![CDATA[ alter index W_REVN_F_F2 modify partition PART_@DAC_CURRENT_MONTH_WID unusable ]]>
   </sql>
...
...
...
</CustomSQLs>

3.9.4.4 Post_W_REVN_F_incr.xml - Post SQL xml file name for Incremental Load

<?xml version="1.0" ?>
<CustomSQLs>
   <!--  Start W_REVN_F PART_PREVIOUS_MONTH_WID -->
   <sql name="W_REVN_F Rebuild Previous" type="SQL" continueOnFail="false" validDBPlatforms="Oracle">
      <![CDATA[ alter table W_REVN_F modify partition PART_@DAC_PREVIOUS_MONTH_WID rebuild unusable local indexes]]>
   </sql>
   <!--  Start W_REVN_F PART_CURRENT_MONTH_WID -->
   <sql name=" W_REVN_F Rebuild Current " type="SQL" continueOnFail="false" validDBPlatforms="Oracle">
      <![CDATA[alter table W_REVN_F modify partition PART_@DAC_CURRENT_MONTH_WID rebuild unusable local indexes]]>
   </sql>
   <!--  Gather Stats on global indexes -->
   <sql name="Gather Stats on W_REVN_F_U1" type="Stored Procedure" continueOnFail="false" validDBPlatforms="Oracle">
      <![CDATA[
               DBMS_STATS.GATHER_INDEX_STATS( NULL
               , indname => 'W_REVN_F_P1    '
               , estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
               , degree => DBMS_STATS.DEFAULT_DEGREE
               )
      ]]>
   </sql>
   <!--  Gather Stats on W_REVN_F table -->
   <sql name="W_REVN_F Gather Stats Previous" type="Stored Procedure" continueOnFail="false" validDBPlatforms="Oracle">
      <![CDATA[ 
               DBMS_STATS.GATHER_TABLE_STATS( NULL
               , tabname=>'W_REVN_F'
               , cascade => TRUE
               , partname => 'PART_@DAC_PREVIOUS_MONTH_WID'
               , estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
               , granularity => 'PARTITION'
               , method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO'
               , degree => DBMS_STATS.DEFAULT_DEGREE
               )
      ]]>
   </sql>
   <sql name="W_REVN_F Gather Stats Current" type="Stored Procedure" continueOnFail="false" validDBPlatforms="Oracle">
      <![CDATA[ 
               DBMS_STATS.GATHER_TABLE_STATS( NULL
               , tabname=>'W_REVN_F'
               , cascade => TRUE
               , partname => 'PART_@DAC_CURRENT_MONTH_WID'
               , estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
               , granularity => 'PARTITION'
               , method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO'
               , degree => DBMS_STATS.DEFAULT_DEGREE
               )
      ]]>
   </sql>
</CustomSQLs>