Analyzing and Cleansing Data for Sun Master Index

Simple Frequency Analysis Report Samples

Simple frequency analysis reports list how often each data value occurs in the specified fields, without applying any data verification, transformation, or conditional rules. You can specify a sort order and a frequency threshold for this type of report. The sample frequency rule below analyzes first and last names and reports only the top six frequencies, and only those of three or more.


<SimpleFrequencyAnalysis>
  <fields>
    <field fieldName="Person.FirstName"/>
    <field fieldName="Person.LastName"/>
  </fields>
  <sortOrder fieldName="Person.FirstName" increasing="false"/>
  <threshold value="3" more="true"/>
  <topNpatterns value="6" showall="false"/> 
</SimpleFrequencyAnalysis>

This analysis generates a report similar to the following:

SF_PROFILE_SIMPLE_FRQ_1_1–0.csv

PERSON.LASTNAME    PERSON.FIRSTNAME    FREQUENCY
SMITH              ANN                 38
JONES              SUSAN               31
SMITH              JOHN                31
THOMPSON           JAMES               28
JOHNSON            BETH                26
MILLER             FRANK               25
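
To make the threshold and top-N settings concrete, here is a minimal Python sketch of the same kind of count. It is not the Data Profiler's implementation: the function name `simple_frequency`, its parameters, and the sample records are hypothetical. Ordering rows by descending frequency is an assumption that mirrors the sample report above, not the behavior of the `sortOrder` element.

```python
from collections import Counter


def simple_frequency(records, fields, threshold=1, top_n=None):
    """Count value combinations of `fields`, keep those that occur at least
    `threshold` times, and return at most `top_n` rows, most frequent first."""
    counts = Counter(tuple(rec[f] for f in fields) for rec in records)
    rows = [(values, n) for values, n in counts.items() if n >= threshold]
    # Assumption: order by descending frequency, breaking ties by value.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows if top_n is None else rows[:top_n]


# Hypothetical sample data, not taken from a real index.
records = (
    [{"FirstName": "ANN", "LastName": "SMITH"}] * 4
    + [{"FirstName": "JOHN", "LastName": "SMITH"}] * 3
    + [{"FirstName": "BETH", "LastName": "JOHNSON"}] * 2
)

# Mirrors threshold value="3" and topNpatterns value="6" from the rule above.
report = simple_frequency(records, ["LastName", "FirstName"], threshold=3, top_n=6)
for (last, first), freq in report:
    print(last, first, freq)
```

In this sketch the JOHNSON/BETH combination is dropped because it occurs only twice, which is below the threshold of three.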

The sample frequency rule below analyzes social security numbers and reports duplicates (values that occur two or more times).


<SimpleFrequencyAnalysis>
  <fields>
    <field fieldName="Person.SSN"/>
  </fields>
  <sortOrder fieldName="Person.SSN" increasing="false"/>
  <threshold value="2" more="true"/>
</SimpleFrequencyAnalysis>

This analysis generates a report similar to the following excerpt:

SF_PROFILE_SIMPLE_FRQ_2_1–0.csv

PERSON.SSN    FREQUENCY
999999999     457
000000000     125
123456789     41
222423535
992203847
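
The duplicate check can be sketched outside the product as well. The Python below reads the SSN rule with the standard `xml.etree.ElementTree` module and applies its threshold to a list of values. The helper names `load_rule` and `find_duplicates` and the sample SSNs are hypothetical. The sketch assumes `more="true"` means "the threshold value or more", as the text describes, and it ignores the `sortOrder` element.

```python
import xml.etree.ElementTree as ET
from collections import Counter

RULE = """
<SimpleFrequencyAnalysis>
  <fields>
    <field fieldName="Person.SSN"/>
  </fields>
  <sortOrder fieldName="Person.SSN" increasing="false"/>
  <threshold value="2" more="true"/>
</SimpleFrequencyAnalysis>
"""


def load_rule(xml_text):
    """Extract the field names, threshold, and optional top-N limit."""
    root = ET.fromstring(xml_text)
    fields = [f.get("fieldName") for f in root.findall("./fields/field")]
    threshold = root.find("threshold")
    top_n = root.find("topNpatterns")  # optional; absent in the SSN rule
    return {
        "fields": fields,
        "threshold": int(threshold.get("value")) if threshold is not None else 1,
        "top_n": int(top_n.get("value")) if top_n is not None else None,
    }


def find_duplicates(values, threshold):
    """Return (value, count) pairs that occur at least `threshold` times."""
    counts = Counter(values)
    dups = [(v, n) for v, n in counts.items() if n >= threshold]
    # Assumption: most frequent first, as in the sample report.
    return sorted(dups, key=lambda item: (-item[1], item[0]))


rule = load_rule(RULE)
ssns = ["999999999"] * 3 + ["000000000"] * 2 + ["123456789"]  # hypothetical data
duplicates = find_duplicates(ssns, rule["threshold"])
print(rule["fields"], duplicates)
```

Values such as 999999999 and 000000000 that top a report like this are usually placeholders entered when the real number was unknown, so a high count points to default data to cleanse, not to one person recorded many times.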