Adjusting the Master Index Configuration - Analyzing and Cleansing Data for a Master Index

Adjusting the Master Index Configuration

Based on the results of the final frequency analyses (see Performing Frequency Analyses on Cleansed Data), you might need to adjust the configuration of the master index application. This can mean changing the block fields if the frequencies are too high, and setting the relative match weights according to how unique each match field is. The results might also indicate that you need to define exclusion files for the Initial Bulk Match and Load tool, or filters for the single best record (SBR), so that certain values are not used for matching. For example, if a large number of SSN fields contain the default value “000-00-0000”, you can exclude that value from the blocking process, the match process, or the survivor calculation for the SBR.
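As a rough illustration of how frequency results can inform these decisions, the following Python sketch flags over-represented values as exclusion candidates and computes a simple uniqueness ratio per field. It is not part of the Data Profiler or the master index application. The function names, the `max_share` threshold, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def high_frequency_values(values, max_share=0.05):
    """Return {value: share} for values whose share of all records exceeds max_share.

    Hypothetical helper: max_share is an arbitrary cutoff chosen for illustration.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {v: c / total for v, c in counts.items() if c / total > max_share}


def uniqueness(values):
    """Distinct values divided by record count; closer to 1.0 means more unique."""
    values = list(values)
    return len(set(values)) / len(values) if values else 0.0


# Made-up sample data, for illustration only.
ssns = ["000-00-0000"] * 6 + [
    "123-45-6789", "987-65-4321", "555-12-3456", "111-22-3333",
]
genders = ["F", "M"] * 5

# Values that dominate a field are candidates for an exclusion list or filter.
exclusion_candidates = high_frequency_values(ssns, max_share=0.5)
print(exclusion_candidates)

# A field with a higher uniqueness ratio would typically merit a higher relative match weight.
print(uniqueness(ssns), uniqueness(genders))
```

In this sample, the default SSN accounts for more than half of the records, so it is the only value flagged. The appropriate threshold depends on the data and has to be judged against the actual Data Profiler reports.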

Copyright © 2009, 2011, Oracle and/or its affiliates. All rights reserved.
