The following table lists and describes the attributes for the cleansingVariable element in the configuration file. These attributes define the data source and path names for the Data Cleanser as well as global validation rules. Below is a sample of the cleansing attributes.
`<cleansingVariable objectdefFilePath="../../src/Configuration" validateType="true" validateNull="false" validateLength="true" DBconnection="../StagingDB" goodFilePath="./Output/good.txt" badFilePath="./Output/bad.txt" startCount="1" standardizer="true"/>`
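As a quick sanity check, an element like the sample above can be parsed with any XML library to confirm that it is well-formed and to read the attribute values back. The sketch below uses Python's standard `xml.etree.ElementTree`. It assumes the sample is written as a self-closing XML element with every attribute value quoted; the attribute names are copied from the sample, and the helper name `read_cleansing_attributes` is hypothetical, not part of the Data Cleanser.

```python
import xml.etree.ElementTree as ET

# Assumption: the sample is a self-closing XML element with every attribute
# value quoted. Attribute names are copied from the sample above.
SAMPLE = (
    '<cleansingVariable objectdefFilePath="../../src/Configuration" '
    'validateType="true" validateNull="false" validateLength="true" '
    'DBconnection="../StagingDB" goodFilePath="./Output/good.txt" '
    'badFilePath="./Output/bad.txt" startCount="1" standardizer="true"/>'
)

# Attributes documented as true/false indicators.
BOOLEAN_ATTRIBUTES = ("validateType", "validateNull", "validateLength", "standardizer")


def read_cleansing_attributes(xml_text):
    """Parse the element and return its attributes, converting true/false flags to bool."""
    element = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    if element.tag != "cleansingVariable":
        raise ValueError(f"unexpected element: {element.tag}")
    attributes = dict(element.attrib)
    for name in BOOLEAN_ATTRIBUTES:
        if name in attributes:
            value = attributes[name].strip().lower()
            if value not in ("true", "false"):
                raise ValueError(f"{name} must be true or false, got {attributes[name]!r}")
            attributes[name] = value == "true"
    return attributes


if __name__ == "__main__":
    for name, value in read_cleansing_attributes(SAMPLE).items():
        print(f"{name} = {value!r}")
```

An unquoted attribute value, such as `badFilePath=./Output/bad.txt`, makes the element ill-formed, so the parser rejects it with `ET.ParseError` rather than returning a partial result.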
| Attribute | Description |
|---|---|
| objectdefFilePath | The path and filename for the object.xml file to use to cleanse the data. |
| validateType | An indicator of whether the cleanser should validate each field's data type against the type defined in object.xml. Specify true to validate field type; otherwise specify false. If you validate against type and the validation fails for any field in a record, the record is written to the bad file. |
| validateNull | An indicator of whether the cleanser should check for null values in each field that is configured to be required in object.xml. Specify true to check for null values; otherwise specify false. If you check for null values and any required field in a record is null, the record is written to the bad file. |
| validateLength | An indicator of whether the cleanser should validate each field's length against the length defined in object.xml. Specify true to validate field length; otherwise specify false. If you validate against length and the validation fails for any field in a record, the record is written to the bad file. |
| DBconnection | The path to the staging database or the path and name of the flat file containing the data to be profiled. Use forward slashes in this path rather than back slashes. |
| badDataFilePath | The path and name of the file that lists the records that are found to contain bad data during the cleansing process. This file includes an error message for each record describing the reason it was rejected. If you specify a path that does not exist, you need to create the path. |
| goodDataFilePath | The path and name of the file that lists the records that do not contain any bad data. These records can be processed through the Initial Bulk Match and Load tool into the master index database. If you specify a path that does not exist, you need to create the path. |
| startCounter | The starting number for the GID generator for the cleansed records. The GID is a unique value used by the Initial Bulk Match and Load tool, which takes the good data file created by the cleansing process as its input. Enter a non-negative long value. For the initial cleansing, set this to 1. |
| standardizer | An indicator of whether the Data Cleanser should standardize the input data according to the standardization rules defined in the mefa.xml file in the master index project. Specify true to standardize the data. This populates the standardized values into the output file. Specify false to bypass standardization. If no value is specified or this property is missing, the default is true. |
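
To make the interaction of the three validation flags and the start counter concrete, here is a minimal Python sketch of the routing the table describes: a record that fails any enabled check goes to the bad list with an error message, and each passing record receives the next GID. Everything in it is hypothetical and only illustrates the documented behavior: the `FIELD_DEFS` dictionary stands in for object.xml, and the `cleanse` function, record layout, and error wording are invented for the example. It also assumes that only good records consume GIDs, which the table does not state. It is not the Data Cleanser's implementation.

```python
# Hypothetical stand-in for the field definitions in object.xml.
FIELD_DEFS = {
    "FirstName": {"type": str, "length": 10, "required": True},
    "Age": {"type": int, "length": 3, "required": False},
}


def check_record(record, validate_type, validate_null, validate_length):
    """Return an error message for the first failed check, or None if the record passes."""
    for name, rule in FIELD_DEFS.items():
        value = record.get(name)
        if value is None or value == "":
            if validate_null and rule["required"]:
                return f"{name}: required field is null"
            continue  # an empty field has no type or length to check
        if validate_type and not isinstance(value, rule["type"]):
            return f"{name}: expected {rule['type'].__name__}"
        if validate_length and len(str(value)) > rule["length"]:
            return f"{name}: longer than {rule['length']}"
    return None


def cleanse(records, validate_type=True, validate_null=False,
            validate_length=True, start_counter=1):
    """Split records into (good, bad) lists; good records get sequential GIDs."""
    good, bad = [], []
    gid = start_counter  # assumption: only good records consume a GID
    for record in records:
        error = check_record(record, validate_type, validate_null, validate_length)
        if error is None:
            good.append((gid, record))
            gid += 1
        else:
            bad.append((record, error))
    return good, bad


if __name__ == "__main__":
    sample_records = [
        {"FirstName": "Ann", "Age": 41},
        {"FirstName": "", "Age": 30},
        {"FirstName": "Bartholomew-James", "Age": 52},
    ]
    good, bad = cleanse(sample_records)
    print("good:", good)
    print("bad:", bad)
```

With the flag values from the sample element (type and length checks on, null check off), the record with the empty `FirstName` passes, while the over-length name is rejected. Turning `validate_null` on would move the empty-name record to the bad list as well.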