Analyzing and Cleansing Data for Sun Master Index

Required Format for Flat Data Files

Both the Data Cleanser and the Data Profiler are designed to read data from a staging database created by extracting data from your source database using Data Integrator. You can also extract your data to a flat file using the extractor of your choice. If you use Data Integrator to extract the data to be analyzed and cleansed, the extracted data is written to an Axion flat-file database in the required format for the Data Profiler and Data Cleanser.

If you use a data extractor other than Data Integrator, the data needs to be placed in a flat file in a format that the Data Profiler and Data Cleanser can read. If your data is in a different format, you can define a custom data reader to read the flat file into the Data Profiler and Data Cleanser. The analysis tools can read a flat file in the following format without any additional configuration:


GID|SystemCode|LocalID|UpdateDate|UserID|ObjectFields

where:

Below is an example of a valid input record based on the standard master index Person template, which includes alias, address, and phone objects. Note the empty fields after the first and last names for the phonetic and standardized data that will be inserted by the Data Cleanser. There are also empty fields after the street address for the parsed street address components that will also be inserted by the Data Cleanser.


28|ORACLE|00160419|11/14/1999 08:41:10|GSMYTHE|P|ELIZABETH|||ANN|WARREN||||MRS
|554-44-55555|08/18/1977|Y|F|M|W|13|BAP|ENG|STEVE|ANN|MARCH|GEORGE|CAHILL|SHEFFIELD
|CT|USA|E|Y||C4411444|CA|07/21/2018||ENG|USA#$BETH||CAHILL$LIZ|ANN|CAHILL#$H|1519 
BOARDWALK||||||Unit 5|SHEFFIELD|CT|09876|1075|CAPE BURR|USA$W|12500 EAST RIVER ST.
||||||Suite 1310|CAPE BURR|CT|09877||CAPE BURR|USA#$CH|9895557848|$CB|9895551500|19
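To make the layout of the record above easier to see, the following Python sketch splits it into the five leading fields, the parent object fields, and the child objects. It is an illustration only and is not part of the Data Profiler or Data Cleanser. The function name parse_record is hypothetical. The delimiter roles are inferred from the sample record rather than taken from a specification: the pipe (|) is assumed to separate fields, the pound sign (#) to separate the parent object from each group of child objects, and the dollar sign ($) to introduce each child object instance. The sketch also assumes the record occupies a single line in the file and is wrapped here only for display.

```python
# Illustrative sketch: delimiter roles are inferred from the sample record,
# not taken from a specification.
HEADER_FIELDS = ["GID", "SystemCode", "LocalID", "UpdateDate", "UserID"]

# The sample record from this section, joined back into one line.
SAMPLE = (
    "28|ORACLE|00160419|11/14/1999 08:41:10|GSMYTHE|P|ELIZABETH|||ANN|WARREN||||MRS"
    "|554-44-55555|08/18/1977|Y|F|M|W|13|BAP|ENG|STEVE|ANN|MARCH|GEORGE|CAHILL|SHEFFIELD"
    "|CT|USA|E|Y||C4411444|CA|07/21/2018||ENG|USA#$BETH||CAHILL$LIZ|ANN|CAHILL#$H|1519 "
    "BOARDWALK||||||Unit 5|SHEFFIELD|CT|09876|1075|CAPE BURR|USA$W|12500 EAST RIVER ST."
    "||||||Suite 1310|CAPE BURR|CT|09877||CAPE BURR|USA#$CH|9895557848|$CB|9895551500|19"
)


def parse_record(line):
    """Split one record into (header dict, parent field list, child groups)."""
    # Assumption: '#' separates the parent object from each child object group.
    groups = line.rstrip("\n").split("#")

    parent_parts = groups[0].split("|")
    if len(parent_parts) < len(HEADER_FIELDS):
        raise ValueError("record has fewer than five leading fields")

    # The first five fields are GID, SystemCode, LocalID, UpdateDate, UserID.
    header = dict(zip(HEADER_FIELDS, parent_parts[:len(HEADER_FIELDS)]))
    parent_fields = parent_parts[len(HEADER_FIELDS):]

    children = []
    for group in groups[1:]:
        # Assumption: '$' introduces each child object instance in a group.
        # The empty string before the first '$' is discarded.
        instances = [inst.split("|") for inst in group.split("$") if inst]
        children.append(instances)

    return header, parent_fields, children


if __name__ == "__main__":
    header, parent_fields, children = parse_record(SAMPLE)
    print(header)
    print(len(parent_fields), "parent fields")
    for group in children:
        for instance in group:
            print(instance)
```

Empty strings in the resulting lists correspond to the empty fields in the record, such as the positions reserved for phonetic, standardized, and parsed address data.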