Oracle® Business Intelligence Applications Data Warehouse Administration Console Guide Version 7.9.4 E10759-01
This chapter provides an overview of the Oracle Business Analytics Warehouse and the Data Warehouse Administration Console (DAC).
The Oracle Business Analytics Warehouse is a unified data repository for all customer-centric data. The purpose of the Oracle Business Analytics Warehouse is to support the analytical requirements of Oracle Business Intelligence Applications.
The Oracle Business Analytics Warehouse includes the following:
A data integration engine that combines data from multiple source systems to build a data warehouse.
An open architecture to allow organizations to use third-party analytical tools in conjunction with the Oracle Business Analytics Warehouse using the Oracle Business Intelligence Server.
Prebuilt data extractors to incorporate data from external applications into the Oracle Business Analytics Warehouse.
A set of ETL (extract-transform-load) processes that takes data from multiple source systems and creates the Oracle Business Analytics Warehouse tables.
The DAC, a centralized console for schema management as well as configuration, administration, loading, and monitoring of the Oracle Business Analytics Warehouse.
High-level analytical queries, like those commonly used in the Oracle Business Analytics Warehouse, scan and analyze large volumes of data using complex formulas. Such queries can take a long time when run against a transactional database, which degrades overall system performance.
For this reason, the Oracle Business Analytics Warehouse was constructed using dimensional modeling techniques to allow for fast access to information required for decision making. The Oracle Business Analytics Warehouse derives its data from operational applications, and uses Informatica's data integration technology to extract, transform, and load data from transactional databases into the Oracle Business Analytics Warehouse.
Figure 2-1 illustrates how the Oracle Business Analytics Warehouse interacts with the other components of Oracle BI Applications.
The Oracle Business Analytics Warehouse architecture comprises the following components:
DAC client. A command and control interface for the data warehouse. It provides schema management as well as configuration, administration, and monitoring of data warehouse processes. It also enables you to design subject areas and build execution plans.
DAC server. Executes the instructions from the DAC client. The DAC server manages data warehouse processes, including running the ETL loads and scheduling execution plans. It dynamically adjusts its actions based on information in the DAC repository. Depending on your business needs, you might incrementally refresh the Oracle Business Analytics Warehouse once a day, once a week, once a month, or on another similar schedule.
DAC repository. Stores the metadata (semantics of the Oracle Business Analytics Warehouse) that represents the data warehouse processes.
Informatica Server. Loads and refreshes the Oracle Business Analytics Warehouse.
Informatica Repository Server. Manages the Informatica repository.
Informatica Repository. Stores the metadata related to Informatica workflows.
Informatica client utilities. Tools that enable you to create and manage the Informatica repository.
The DAC provides a framework for the entire life cycle of data warehouse implementations. It enables you to create, configure, execute, and monitor modular data warehouse applications in a parallel, high-performing environment. For information about the DAC process life cycle, see "About the DAC Process Life Cycle".
The DAC complements the Informatica ETL platform. It provides application-specific capabilities that are not prebuilt into ETL platforms. For example, ETL platforms are not aware of the semantics of the subject areas being populated in the data warehouse, or of the method by which they are populated. The DAC provides the following application capabilities at a layer of abstraction above the ETL execution platform:
Dynamic generation of subject areas and execution plans
Dynamic settings for parallelism and load balancing
Intelligent task queue engine based on user-defined and computed scores
Automatic awareness of full and incremental load modes
Index management for ETL and query performance
Embedded high-performance Siebel OLTP change capture techniques
Ability to restart at any point of failure
Phase-based analysis tools for isolating ETL bottlenecks
Important DAC features enable you to do the following:
Minimize installation, setup, and configuration time
Create a physical data model in the data warehouse
Set language, currency, and other settings
Design subject areas and build execution plans
Manage metadata-driven dependencies and relationships
Generate custom ETL execution plans
Automate change capture for the Siebel transactional database
Capture deleted records
Assist in index management
Perform dry runs and test runs of execution plans
Provide reporting and monitoring to isolate bottlenecks
Perform error monitoring and email alerting
Perform structured ETL analysis and reporting
Utilize performance execution techniques
Automate full and incremental mode optimization rules
Set the level of Informatica session concurrency
Load balance across multiple Informatica servers
Restart from point of failure
Queue execution tasks for performance (See Figure 2-2.)
The DAC manages the task execution queue based on metadata-driven priorities and scores computed at runtime. This combination allows for flexible and optimized execution. Tasks are dynamically assigned a priority based on their number of dependents, number of sources, and average duration.
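The following Python sketch illustrates the general idea of combining a user-defined priority with a score computed from the number of dependents, the number of sources, and the average duration. It is not the DAC's actual algorithm: this guide does not give the scoring formula, so the weights, the `Task` fields, and the task names are all assumptions made for illustration. The sketch also orders only tasks that are already eligible to run and does not model dependencies between them.

```python
import heapq
from dataclasses import dataclass

# Hypothetical weights; the DAC's real scoring formula is not documented here.
WEIGHT_DEPENDENTS = 10.0
WEIGHT_SOURCES = 2.0
WEIGHT_DURATION = 0.1  # applied per second of average duration


@dataclass
class Task:
    name: str
    dependents: int          # number of tasks waiting on this one
    sources: int             # number of source tables the task reads
    avg_duration_s: float    # average duration of past runs, in seconds
    user_priority: int = 0   # user-defined (metadata-driven) priority

    def score(self) -> float:
        """Add the computed score to the user-defined priority."""
        computed = (
            WEIGHT_DEPENDENTS * self.dependents
            + WEIGHT_SOURCES * self.sources
            + WEIGHT_DURATION * self.avg_duration_s
        )
        return self.user_priority + computed


def execution_order(tasks):
    """Return task names from the highest score to the lowest."""
    # heapq is a min-heap, so the score is negated; the index breaks ties.
    heap = [(-task.score(), index, task.name) for index, task in enumerate(tasks)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


tasks = [
    Task("Load_Dim_Customer", dependents=5, sources=2, avg_duration_s=120),
    Task("Load_Fact_Orders", dependents=0, sources=4, avg_duration_s=600),
    Task("Load_Dim_Date", dependents=8, sources=1, avg_duration_s=30),
]
print(execution_order(tasks))
```

With these assumed weights, a task that many other tasks depend on is queued first, and a long-running task is started ahead of a short one with a similar number of dependents.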
Source system containers hold repository objects that correspond to a specific source system. For information about the different kinds of repository objects, see "About DAC Repository Objects".
You cannot modify objects in the preconfigured source system containers. To make changes, you must first make a copy of a preconfigured container. The copy becomes your own source system container, which you can then modify.
For instructions on creating a new source system container or copying an existing container, see "Creating or Copying a Source System Container".
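The copy-before-modify rule for preconfigured containers can be pictured with the short Python sketch below. The `SourceSystemContainer` class, its methods, and the container and object names are hypothetical and exist only for this illustration; they are not part of the DAC or of any Oracle API.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SourceSystemContainer:
    """Hypothetical stand-in for a DAC source system container."""
    name: str
    preconfigured: bool = False
    objects: Dict[str, dict] = field(default_factory=dict)

    def update_object(self, object_name: str, properties: dict) -> None:
        # Preconfigured containers are read-only; changes require a copy.
        if self.preconfigured:
            raise PermissionError(
                f"{self.name} is preconfigured; copy it before making changes"
            )
        self.objects[object_name] = properties

    def copy_as(self, new_name: str) -> "SourceSystemContainer":
        # Deep copy so that edits to the copy never reach the original.
        return SourceSystemContainer(
            name=new_name,
            preconfigured=False,
            objects=copy.deepcopy(self.objects),
        )


shipped = SourceSystemContainer(
    "Shipped Container",
    preconfigured=True,
    objects={"Example Task": {"phase": "Load Dimension"}},
)
custom = shipped.copy_as("Custom Container")
custom.update_object("Example Task", {"phase": "Load Fact"})

print(shipped.objects["Example Task"]["phase"])
print(custom.objects["Example Task"]["phase"])
```

In this sketch, calling `update_object` on the preconfigured container raises an error, while the same call on the copy succeeds and leaves the original untouched.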
All DAC repository objects are associated with a source system container. For more information about source system containers, see "About Source System Containers" and "About Object Ownership in the DAC".
The DAC repository stores application objects in a hierarchical framework that defines a data warehouse application. The DAC enables you to view the repository application objects based on the source system container you specify. The source system container holds the metadata that corresponds to the source system with which you are working.
A data warehouse application comprises the following repository objects:
Subject area. A logical grouping of tables related to a particular subject or application context, together with the tasks required to load those tables. Subject areas are assigned to execution plans, which can be scheduled for full or incremental loads.
Tables. Physical database tables defined in the database schema. These can be transactional database tables or data warehouse tables. Table types include fact, dimension, hierarchy, aggregate, and so on. Flat files can also be sources or targets.
Task. A unit of work for loading one or more tables. A task comprises the following: source and target tables, phase, execution type, truncate properties, and commands for full or incremental loads. When you assemble a subject area, the DAC automatically assigns tasks to it. Tasks that are automatically assigned to the subject area by the DAC are indicated by the Autogenerated flag in the Tasks subtab of the Subject Areas tab.
Task group. A group of tasks that you define in order to impose a specific order of execution. A task group is considered to be a "special task."
Execution plan. A data transformation plan defined on subject areas that need to be transformed at certain frequencies. An execution plan is defined based on business requirements for when the data warehouse needs to be loaded. An execution plan comprises the following: ordered tasks, indexes, tags, parameters, source system folders, and phases.
Schedule. A schedule specifies when and how often an execution plan runs. An execution plan can be scheduled for different frequencies or recurrences by defining multiple schedules.
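The relationships among these repository objects can be sketched as a small Python data model. This is an illustration only, not the DAC repository schema: every class, field, and sample name below is an assumption, and the `assemble` rule (a task belongs to a subject area if it loads one of that subject area's tables) is a simplification of how the DAC assigns tasks.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Table:
    name: str
    table_type: str  # for example "fact", "dimension", or "aggregate"


@dataclass
class Task:
    name: str
    source_tables: List[str]
    target_tables: List[str]
    phase: str
    execution_type: str
    truncate_for_full_load: bool
    full_load_command: str
    incremental_load_command: str

    def command(self, full_load: bool) -> str:
        # A task carries one command per load mode.
        return self.full_load_command if full_load else self.incremental_load_command


@dataclass
class SubjectArea:
    name: str
    tables: List[Table]
    tasks: List[Task] = field(default_factory=list)

    def assemble(self, candidate_tasks: List[Task]) -> None:
        # Simplifying assumption: keep tasks that load one of our tables.
        names = {table.name for table in self.tables}
        self.tasks = [t for t in candidate_tasks if names & set(t.target_tables)]


@dataclass
class Schedule:
    name: str
    recurrence: str  # for example "daily" or "weekly"


@dataclass
class ExecutionPlan:
    name: str
    subject_areas: List[SubjectArea]
    schedules: List[Schedule] = field(default_factory=list)

    def tasks(self) -> List[Task]:
        # Collect each task once, even if several subject areas share it.
        seen, result = set(), []
        for area in self.subject_areas:
            for task in area.tasks:
                if task.name not in seen:
                    seen.add(task.name)
                    result.append(task)
        return result


all_tasks = [
    Task("Load Customer Dimension", ["SRC_CUSTOMER"], ["DIM_CUSTOMER"],
         "Load Dimension", "Informatica", True,
         "full_customer_workflow", "incr_customer_workflow"),
    Task("Load Product Dimension", ["SRC_PRODUCT"], ["DIM_PRODUCT"],
         "Load Dimension", "Informatica", True,
         "full_product_workflow", "incr_product_workflow"),
]
sales = SubjectArea("Sales", [Table("DIM_CUSTOMER", "dimension")])
sales.assemble(all_tasks)
plan = ExecutionPlan("Nightly Sales", [sales], [Schedule("Nightly", "daily")])

print([task.name for task in plan.tasks()])
print(plan.tasks()[0].command(full_load=False))
```

Defining a second `Schedule` on the same `ExecutionPlan` mirrors the point above that one execution plan can run at several frequencies.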
The DAC is used by different user groups to design, execute, monitor, and diagnose execution plans. These phases together make up the DAC process life cycle, as shown in Figure 2-3.
The phases of the process and the actions associated with them are as follows:
Setup. Set up database connections, ETL processes (Informatica), and email recipients.
Design. Define application objects and design execution plans.
Execute. Define scheduling parameters to run execution plans, and access runtime controls to restart or stop currently running schedules.
Monitor. Monitor runtime execution of data warehouse applications, and monitor users, the DAC repository, and application maintenance jobs.