Data Type Check
The Data Type Check processor checks that the values in String or String Array attributes conform to a consistent data type, and categorizes as invalid any records with values that are not of the expected data type.
Note that Number and Date attributes are by definition 100% consistent with regard to their data type, and so cannot be checked.
The Data Type Check is a useful way of quickly finding values that have been entered into the wrong fields in a user application - typically numbers or dates that have been entered into fields that expect text values only.
Note that it is possible to 'expect' dates or numbers in a String attribute, and categorize as invalid any values that are not of the expected type. This is provided because dates and numbers are not always held in attributes with a controlled data type that can be read from the schema of the data source.
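As an illustration of expecting dates in a String attribute, the following is a minimal sketch in plain Java, not the processor's actual implementation. It assumes a single hypothetical pattern, `dd-MMM-yyyy`, that strict (non-lenient) parsing is wanted, and that null values count as invalid. The class and method names are invented for the example.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Locale;

final class ExpectedDateSketch {
    // Returns true only when the whole value parses strictly under the pattern.
    static boolean isDate(String value, String pattern) {
        if (value == null) {
            return false; // assumption: nulls are treated as invalid in this sketch
        }
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        format.setLenient(false); // reject impossible dates such as 31-Feb-2012
        ParsePosition position = new ParsePosition(0);
        // parse returns null on failure; the index check rejects trailing characters
        return format.parse(value, position) != null
                && position.getIndex() == value.length();
    }

    public static void main(String[] args) {
        String pattern = "dd-MMM-yyyy"; // hypothetical entry in a list of date formats
        for (String value : new String[] {"19-Aug-2012", "31-Feb-2012", "John Smith"}) {
            System.out.println(value + " -> " + (isDate(value, pattern) ? "Valid" : "Invalid"));
        }
    }
}
```

Under these assumptions, only the first of the three sample values is categorized as valid. The second matches the pattern but is not a real date, and the third is text.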
The following table describes the configuration options:
Configuration | Description |
---|---|
Inputs | Specify one or more String or String Array attributes that you want to check for data type consistency. |
Options | Specify the following options: |
Outputs | Describes any data attribute or flag attribute outputs. |
Data Attributes | None. |
Flags | For each attribute input, a new attribute is created in the following format: |
The Date Formats Reference Data used by the Data Type Check must conform to the standard Java 1.6.0 or later SimpleDateFormat API.
The following table describes the statistics produced by the processor:
Statistic | Description |
---|---|
Valid | Records with data of the expected data type in the input attribute. |
Invalid | Records with data not of the expected data type in the input attribute. |
Clicking on the Additional Information button will show the above statistics as percentages of the total number of records analyzed.
Output Filters
The Data Type Check produces the following output filters:
- Valid records
- Invalid records
Example
In this example, the Data Type Check is used to check whether all values of a NAME attribute are text. In this case, null values are treated as invalid.
Input Attribute | Valid/Invalid |
---|---|
Michael | Valid |
John Smith | Valid |
<Null> | Invalid |
19-Aug-2012 | Invalid |
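The categorization in this example can be approximated with the following Java sketch. It is an illustration under stated assumptions, not the processor's own logic: a value is treated as text when it is not null and does not parse as a number or as a date. The date patterns, class name, and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class TextCheckSketch {
    // Hypothetical list of recognized date patterns; not taken from the product.
    static final List<String> DATE_PATTERNS = Arrays.asList("dd-MMM-yyyy", "yyyy-MM-dd");

    static boolean isNumber(String value) {
        try {
            new BigDecimal(value.trim()); // throws if the value is not numeric
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isDate(String value) {
        for (String pattern : DATE_PATTERNS) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
            format.setLenient(false); // strict parsing
            ParsePosition position = new ParsePosition(0);
            if (format.parse(value, position) != null
                    && position.getIndex() == value.length()) {
                return true;
            }
        }
        return false;
    }

    // Text is expected, so nulls, numbers, and dates are categorized as invalid.
    static String check(String value) {
        if (value == null || isNumber(value) || isDate(value)) {
            return "Invalid";
        }
        return "Valid";
    }

    public static void main(String[] args) {
        for (String value : new String[] {"Michael", "John Smith", null, "19-Aug-2012"}) {
            System.out.println((value == null ? "<Null>" : value) + " | " + check(value));
        }
    }
}
```

With the four input values from the table, this sketch yields the same Valid/Invalid results as shown above.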