Analyzing and Cleansing Data for Sun Master Index

Data Profiler Processing Attributes

The attributes of the profilerVariable element in the configuration file define the data source, path names, and batch size for the Data Profiler. The sample below shows these attributes in use; the table that follows describes each one.


<profilerVariable objectdefFilePath="../../src/Configuration"
    DBconnection="../StagingDB" startFrom="50001" profileSize="50000"
    reportFilePath="/Reports"/>

Attribute 

Description 

objectdefFilePath 

The path and file name of the object.xml file used to profile the data.

DBconnection 

The path to the staging database, or the path and name of the flat file containing the data to be profiled. In this path, use forward slashes rather than backslashes.

startFrom 

The record number at which the Data Profiler will start analyzing data. Use this attribute, along with the profileSize attribute, if you are running the process in batches.

profileSize 

The number of records to process in one batch. This attribute is optional.  

reportFilePath 

The path where the Data Profiler reports are stored. The profiler generates three types of reports: Simple Frequency Reports, Constrained Frequency Reports, and Pattern Frequency Reports. One report is generated for each frequency rule you define. For examples of these reports, see Data Profiler Report Samples.
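
When the process runs in batches, each run needs its own startFrom and profileSize pair. The following Python sketch shows one way to compute those pairs and print a matching profilerVariable element for each batch. It is an illustration only, not part of the product: the function names batch_ranges and profiler_variable are hypothetical, startFrom is assumed to be a 1-based record number (as the sample value 50001 with a batch size of 50000 suggests), and trimming the final batch to the remaining record count is an assumption rather than documented behavior.

```python
from xml.sax.saxutils import quoteattr


def batch_ranges(total_records, batch_size):
    """Yield (start_from, profile_size) pairs covering records 1..total_records.

    Assumption: startFrom is 1-based, so the second batch of 50000
    records starts at 50001, matching the sample element.
    """
    if total_records < 0 or batch_size <= 0:
        raise ValueError("total_records must be >= 0 and batch_size > 0")
    start = 1
    while start <= total_records:
        # Assumption: the last batch is trimmed to the records that remain.
        yield start, min(batch_size, total_records - start + 1)
        start += batch_size


def profiler_variable(start_from, profile_size,
                      objectdef="../../src/Configuration",
                      db_connection="../StagingDB",
                      report_path="/Reports"):
    """Render one profilerVariable element as a string (hypothetical helper)."""
    # The DBconnection path should use forward slashes, not backslashes.
    db_connection = db_connection.replace("\\", "/")
    attrs = [
        ("objectdefFilePath", objectdef),
        ("DBconnection", db_connection),
        ("startFrom", start_from),
        ("profileSize", profile_size),
        ("reportFilePath", report_path),
    ]
    rendered = " ".join(f"{name}={quoteattr(str(value))}" for name, value in attrs)
    return f"<profilerVariable {rendered}/>"


if __name__ == "__main__":
    # Example: 120000 staged records profiled in batches of 50000.
    for start, size in batch_ranges(120000, 50000):
        print(profiler_variable(start, size))
```

Each printed element can be copied into the configuration file for the corresponding run. Only the startFrom and profileSize values change between batches; the paths stay the same.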