4.4.1.1 Additional Configurations

To enhance the efficiency of history maintenance and delta processing, perform the following:

Note:

The default values are based on a hardware configuration of an eight-core CPU with 64 GB RAM and a delta size of ten million records.
  1. Log in to the ER schema.
  2. Navigate to the FCC_ER_CONFIG table and set the V_PARAM_VALUE column based on database performance.
    Based on your volume of data, you can modify the following parameters in rows where the Pipeline_ID is CSA_812 before executing the job (see the example statements after this procedure):
    • PREV_CHUNKS: The number of chunks the history tables were split into during the last execution of the job. By default, it is set to 10. Do not modify this value; it is updated automatically when you modify the TODAY_CHUNKS value and the job executes successfully.
    • TODAY_CHUNKS: The number of chunks the history tables are split into for the current date. By default, it is set to 10. You can modify this value to change the number of chunks processed in the respective history tables when job execution takes too long.

      Note:

      The chunk value depends on the volume of data being processed. It is recommended to increase the value to 15 when the volume of data being processed exceeds 50 million records, and then monitor the performance.
    • MAX_JOBS: The maximum number of jobs to schedule in the database at a time. By default, it is set to 35. You can increase this value to reduce job execution time.

      Note:

      It is recommended to increase this value only when the database is not shared with other processes.
    • CHUNK_SIZE: The number of records to process in one chunk. By default, it is set to 2000000 (2 million records in each chunk).

      Note:

      It is recommended to retain the default value. Decrease it for better performance only when the server where the database is installed has fewer than eight CPUs.
    • MAX_HISTORY_PARTITIONS: The maximum number of partitions to be retained in the H$ tables.

      The minimum allowed value is 1. If you provide a smaller value, 1 partition is retained by default.

      The maximum allowed value is 3. If you provide a larger value, 3 partitions are retained by default.

      Note:

      • The value of the MAX_HISTORY_PARTITIONS parameter must be a positive integer; the valid range is 1 to 3.
      • Tables matching the pattern H$STG_%_PRE_DELETED are excluded from the MAX_HISTORY_PARTITIONS limit.
    • DB_PARALLEL_LEVEL: Configures the degree of parallelism for data survival (Job 3). By default, it is set to 8; you can modify this value to change the level of parallelism.
    • BULK_APPLY_DS_FOR_SINGLETON_PARTIES: Configures whether data survival (Job 3) is applied to singleton parties. By default, it is set to "N", and data survival is not applied to singleton parties.

      If the value is set to "Y", data survival is applied to singleton parties.

    • F_ER_DS_SUBSEQUENT_BATCH: This parameter is used when the user approves a record from the Pending - System Requests tab of the Merge & Split Global Entities UI.

      The valid values are True and False. By default, the value is set to False.

      If it is set to True, then data survival is applied to the approved system request on the subsequent day's batch run.

      If it is set to False, data survival is applied immediately when the system request is approved from the UI.

    • ER_DS_SYSTEM_PENDING_MAX_NO_REC: The maximum number of records that can be approved at once from the Pending - System Requests tab of the Merge & Split Global Entities UI.

      By default, the value is set to 10. The valid range is 1 to 100.

      If the user tries to approve more records than this parameter allows, an alert is displayed in the UI.

      This is applicable only when F_ER_DS_SUBSEQUENT_BATCH is set to False.

      If F_ER_DS_SUBSEQUENT_BATCH is set to True, this limit is overridden and all records can be approved from the UI using the Approve All button.

    • F_CAPTURE_COUNT_STAT: This flag indicates whether count statistics are captured during entity resolution job execution. If it is set to true, count statistics are logged in the FCC_ER_JOB_VOL_STATS table of the ER/FSDF schema. By default, this value is set to true.

      Note:

      This parameter applies to all entity resolution jobs.
    • SINGLETON_TASK_PARALLEL_LEVEL: The maximum number of parallel processes spawned through DBMS_PARALLEL_EXECUTE. By default, the value is 8. This configuration is populated with the default value when no value is available during an immediate run of the Create Index and Load the Data job (Job 1). This is a one-time configuration that applies to all runs, including bulk volume runs.
    • CAN_SEL_BUCKET_SIZE: The maximum bucket size in the source data. By default, it is 2000.
  3. Save the changes.
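
As an illustration, the following SQL sketch shows one way to review and update these parameters from the ER schema. The FCC_ER_CONFIG table, the V_PARAM_VALUE column, the parameter names, and the CSA_812 pipeline are documented above; the V_PARAM_NAME and PIPELINE_ID column names are assumptions for illustration, so verify them against your schema before running the statements.

  -- Review the current parameter values for the CSA_812 pipeline.
  -- V_PARAM_NAME and PIPELINE_ID are assumed column names; verify them first.
  SELECT V_PARAM_NAME, V_PARAM_VALUE
    FROM FCC_ER_CONFIG
   WHERE PIPELINE_ID = 'CSA_812';

  -- Example: raise TODAY_CHUNKS from the default 10 to 15 when processing
  -- more than 50 million records, per the recommendation above.
  UPDATE FCC_ER_CONFIG
     SET V_PARAM_VALUE = '15'
   WHERE PIPELINE_ID = 'CSA_812'
     AND V_PARAM_NAME = 'TODAY_CHUNKS';
  COMMIT;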

Profiler Table

The ER_PERFORMANCE_TIME_PROFILER table in the ER schema helps you track the current status of the batch and debug performance issues.

The ER_PERFORMANCE_TIME_PROFILER table stores the queries that are executed during delta processing. The following columns help you identify which query failed:
  • V_TABLE_NAME: The name of the table for which the query was executed.
  • N_CHUNK: The number of the chunk being executed.
  • D_STARTTIME: The database time when the query started executing.
  • D_ENDTIME: The database time when the query finished executing.
  • V_TOTAL_TIME: The duration of the query execution.
  • V_STATUS: The current status of the query. The values are START, RUNNING, or END.
  • V_QUERY: The query that was executed.
  • N_RUN_SKEY: The runSKey value of the currently executing job.
To check the query status, perform the following:
  1. Log in to the ER schema.
  2. Run the following command:

    SELECT * FROM ER_PERFORMANCE_TIME_PROFILER WHERE N_RUN_SKEY = <CURRENT_RUNSKEY>

    For example, SELECT * FROM ER_PERFORMANCE_TIME_PROFILER WHERE N_RUN_SKEY = 200

  3. Check V_STATUS. A status other than END indicates the failed query; the sketch after this procedure shows one way to filter for such entries.

    Note:

    If any unexpected failure occurs, no explicit cleanup is required for the Create Index and Load Data job; cleanup is handled automatically when the job is re-run.
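
As a further debugging aid, the following sketch uses only the columns described above to list the entries for a run that have not reached the END status, which typically points to the failed or still-running chunks. The run skey value 200 is the same illustrative value used in step 2.

  -- List profiler entries for run 200 that have not reached END,
  -- ordered by when each query started.
  SELECT V_TABLE_NAME, N_CHUNK, V_STATUS, D_STARTTIME, D_ENDTIME, V_TOTAL_TIME
    FROM ER_PERFORMANCE_TIME_PROFILER
   WHERE N_RUN_SKEY = 200
     AND V_STATUS <> 'END'
   ORDER BY D_STARTTIME;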

Cleanup Steps for Job Termination

Manual cleanup is required in the case of a fatal user error that is not caught by the semantic and logic validations, such as executing the job against an incorrect FIC_MIS_DATE. Contact My Oracle Support before performing the cleanup steps. For more information about the cleanup steps, see the Cleanup Steps When the Create Index and Load Data Job Terminated Manually section.

For more information about parameters, see the Parameters for Entity Resolution Job execution section.