Pre-Data Warehouse Load Validations
Before loading data into the data warehouse, the mandatory global parameters must be configured and the initial setup job must have completed successfully. This section validates all mandatory parameters and pre-ETL jobs before the data warehouse load begins.
Global Parameters
Ensure that the following global parameters are set to the proper values for the target system. Use these steps to perform all of the validations listed in this table:
- Log in to the Oracle Utilities Analytics Administration tool.
- Click Global Configuration under ETL Configuration.
- Validate that the default value is present for each record with product B1.
- Edit the default values as appropriate if required; otherwise, skip these scenarios.
# | Validation |
---|---|
J01 | Database Edition Type: the type of the Oracle Database, either Standard Edition (SE) or Enterprise Edition (EE). |
J02 | Product Sync Mode Type: valid values are DPUMP_EXPORT, DPUMP_IMPORT, DBLINK, and DPUMP_SHARED. Default: DBLINK. |
J03 | Maximum number of MV interfaces that can run in parallel: MV jobs take time depending on the volume of data in the target dimensions and facts, and running too many MV jobs in parallel might impact target system performance. The default value is 4; it can be increased or decreased depending on the target system hardware. |
J04 | Global ODI execution log level: the default value is 6. It is recommended to keep the log level at 6 during the initial stage, since detailed logs help in finding the root cause of failures; the level can be reduced later. |
J05 | Number of days for which the ODI operator logs are retained: the default value is 7. It can be increased or decreased depending on the target system hardware. |
J06 | Cut Off Time: the time of day up to which all changes from the source are loaded by the ETL jobs. Any changes after this time are not loaded on the same day. Set this to the default value (i.e., 23:59:59) to keep loading data continuously. |
J07 | Extract Start Date: this date is used to slice the initial source data in the replication schema. |
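The Cut Off Time parameter (J06) amounts to a simple time-of-day comparison: a source change is picked up by the same day's load only if it occurred at or before the cut-off. The helper below is a hypothetical illustration of that rule, not part of the product; the function name and defaults are assumptions.

```python
from datetime import datetime, time

def loaded_same_day(change_ts: datetime, cutoff: time = time(23, 59, 59)) -> bool:
    """Return True if a source change made at change_ts falls within the
    day's load window, i.e. at or before the configured Cut Off Time.
    Changes after the cut-off are deferred to the next day's load."""
    return change_ts.time() <= cutoff

# With the default cut-off (23:59:59) effectively every change is loaded
# the same day; an earlier cut-off defers late changes to the next day.
assert loaded_same_day(datetime(2024, 1, 1, 18, 30))
assert not loaded_same_day(datetime(2024, 1, 1, 18, 30), cutoff=time(18, 0))
```

This is why the default of 23:59:59 keeps data loading continuously: no change made during the day ever falls past the cut-off.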
Pre-ETL Jobs
# | Validation | Script or Steps to Validate |
---|---|---|
J08 | Validate that all GoldenGate replicat processes are in a stopped state. | Open the GoldenGate command-line interface on the target and run `ggsci> info all`. Validate that all replicat processes report the STOPPED status. |
J09 | Validate that no incremental data was loaded into the replication schema before the initial load. | Run the following query in the target database as the MDADM user: `select s.prod_flg, 'select count(1) from '||i.context_cd||'REP.'||tbl_name||';' from b1_source_table s, b1_prod_instance i where base_replicate_flg='Y' and s.prod_flg=i.prod_flg and i.context_cd='CCB1';` 1. The query displays a count-of-rows SQL statement for each table to be replicated. 2. Run these statements and validate that no query returns a count greater than 0. 3. If the count is greater than 0 for a table, incremental data was synced before the initial sync started; delete the data from those tables. |
J10 | Validate that the initial setup package executed successfully. The initial setup package populates the null and missing keys in the target dimensions. | Run the following query in the target metadata schema and validate that at least one row is returned: `select * from oua_work.snp_session where sess_name = 'B1_INITIAL_SETUP_PKG' and sess_status = 'D';` |
J11 | Validate that the not-available and missing key records (0, -99) exist in each target dimension. | Run the following query in the target metadata schema: `select 'select count(1) from dwadm.'||entity_name||';' dim_row_count_sql from b1_target_entity where entity_name like 'CD%';` 1. The query displays a count-of-rows SQL statement for each target dimension. 2. Run these statements and validate that every dimension has exactly two records. |
J12 | Validate that data is loaded into the Date and Time dimensions. | Run the following queries in the target metadata schema and validate that the Date and Time dimensions each have more than two records: `select count(1) from dwadm.cd_date having count(1)>2;` `select count(1) from dwadm.cd_time having count(1)>2;` |
J13 | Validate that all ETL jobs are in a disabled state. This ensures that the target ETL jobs are not executed by the scheduler agent until the initial data replication is completed. | Run the following query in the target metadata schema and validate that no rows are returned: `select * from b1_config_vw where entity_active_flg='Y' and context_cd='CCB1';` |
J14 | Validate that the source application is available after the configuration. The source application was disabled before running the source configuration; it can now be enabled for business transactions. | At the time of source configuration with Oracle Utilities Analytics Warehouse, the source application should be down but the source database should be in read/write mode. The source application can now be enabled for business transactions. |
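The metadata-driven checks in J09 and J11 follow the same pattern: generate one count query per table from the metadata, run each generated query, and flag any table whose count violates the expectation. A minimal Python sketch of that pattern follows; the helper names and the sample table names (`CI_ACCT`, `CI_PER`) are illustrative assumptions, not product APIs, and the real counts must of course come from running the generated SQL against the database.

```python
def build_count_sql(context_cd: str, tbl_name: str) -> str:
    # Mirrors the string concatenation in the J09 metadata query:
    # 'select count(1) from '||i.context_cd||'REP.'||tbl_name||';'
    return f"select count(1) from {context_cd}REP.{tbl_name};"

def tables_with_premature_data(counts: dict) -> list:
    # J09 rule: any replicated table that already has rows before the
    # initial load indicates incremental data synced too early; those
    # rows must be deleted before proceeding.
    return [tbl for tbl, n in counts.items() if n > 0]

assert build_count_sql("CCB1", "CI_ACCT") == "select count(1) from CCB1REP.CI_ACCT;"
assert tables_with_premature_data({"CI_ACCT": 0, "CI_PER": 12}) == ["CI_PER"]
```

The J11 check is the same loop with a different predicate: each `CD%` dimension must return exactly two rows (the 0 and -99 key records).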
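For J08, the output of `ggsci> info all` can also be scanned programmatically when many replicats are involved. The sketch below assumes the usual GGSCI column layout (Program, Status, Group, ...); verify the exact output format against your GoldenGate release before relying on it.

```python
def stopped_replicats_ok(info_all_output: str) -> bool:
    """Return True if every REPLICAT line in `ggsci> info all` output
    reports STOPPED. Assumes whitespace-separated columns in the order
    Program, Status, Group, ... (an assumption to verify per release)."""
    for line in info_all_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "REPLICAT" and fields[1] != "STOPPED":
            return False
    return True

# Sample output shape (illustrative, not captured from a live system):
sample = """\
Program     Status      Group       Lag at Chkpt  Time Since Chkpt
MANAGER     RUNNING
REPLICAT    STOPPED     REPCCB1     00:00:00      00:12:34
"""
assert stopped_replicats_ok(sample)
assert not stopped_replicats_ok(sample.replace("STOPPED", "RUNNING"))
```

Note that the manager process may legitimately be RUNNING; only the replicat processes must be stopped for this validation.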
This concludes the Pre-Data Warehouse Load validations. The next step is to load the target dimensions, facts, and materialized views.