9.3.4.2 Creating Data Survival Rules
This topic provides step-by-step instructions for creating data survival rules for a match ruleset.
- On the Data Survival Rules page, click Add. The page displays the attributes to create the Rule.
Note:
You should not create multiple data survival rulesets for a particular ER Type.
- Enter the field values as described in the following table.
Table 9-11 Fields and Description - Data Survival Rules
| Field | Description |
| --- | --- |
| Rule Name | The name of the Rule. |
| Description | The description of the Rule. |
| ER Type | The type of data that is to be matched. The values are CSA_8128 and CSA_8129. Note: It is recommended to use the CSA_8129 pipeline with Compliance Studio 8129 and BD 8129; the other pipeline is deprecated. The lower version pipeline is supported only if you are upgrading to CS 8.1.2.9.0. |
| Dataset | The name of the output data structure to which this data is saved. The values are Customer8128 and Customer8129. |

Once you select the Dataset, the input tables are displayed. The list of tables will change based on the Dataset selection. Each table displayed can be expanded to allow you to select the survival function for each attribute if needed.
- On the Create Data Survival Rule Match Ruleset page, perform the following:
- Filter the records using the search filter (Type the filter)
- Sort the records
To sort the records, right-click any column and select Sort > Sort Ascending or Sort > Sort Descending. The records are displayed in the selected sort order.
- Select the method types. See the following step for method type details.
- Select one of the following method types, which determine how the final record is stored.
- Generate: This is applicable for primary keys of the tables, where the system will generate the unique identification number for the record.
- Generate Sequence: This is applicable for child records, where the system will generate the identification number in sequential order of the particular parent key.
- Distinct: This applies to child tables only. It removes duplicate values and allows the user to save only the unique record. This applies to columns that are marked as primary keys.
Note:
The user should not use the Distinct and All types together with other columns that return unique values in child tables.
- Latest: This uses a date column, such as the Data Dump date recorded against each row in the input tables, to pick the value from the most recent row. For example, if the customer's Occupation is Teacher in one record and Businessman in another, and all the records have a Data Dump date against them, the value from the record with the latest date is taken.
You can select the latest date attribute from the Latest Values drop-down list.
- Longest: This counts the number of characters in each value and picks the longest value.
- Most Common: This picks the most frequently repeated value for data survival. For example, if the Occupation of three customers is 1. Business, 2. Business, and 3. Teacher, then Business occurs twice, so that value is picked.
- All: This case can only apply to child records, where you can store multiple values against customer id. For example, you have three customers to be merged, and all three have different email IDs, so all the email IDs will be stored against the merged Global entity.
- Default: This is used for inserting a user-defined value for columns. For example, if the user wants to provide the default value of the Branch_Code column as ER, then ER is inserted for all the rows.
- Maximum: This applies to number columns. Users can choose this to store the maximum number value. For example, if Annual Income is C1 - 2M, C2 - 3M, and C3 - 5M, then 5M is saved for the global entity.
- Minimum: This applies to number columns. Users can choose this to store the minimum number value. For example, if Annual Income is C1 - 2M, C2 - 3M, and C3 - 5M, then 2M is saved for the global entity.
- Null: The user can choose not to insert any value while persisting the record. So null will signify a null value for that particular column.
- Metadata: This applies to attributes that should be selected based on metadata, for example, to select the occupation with the highest risk score.
To enable this, select the Type as Metadata. You can select the metadata type in the Default Value/Type drop-down list.
The types of Metadata are user-defined and can be set using an API call (see the Populate the Metadata for Data Survival in Studio Schema section in the OFS Compliance Studio Administration and Configuration Guide). Each attribute value that appears in the metadata for the given type is given a numerical value. In the UI, select the precedence value in the Latest Values/Precedence drop-down list as either Minimum or Maximum. This selects the attribute value with the lowest or highest numerical score, respectively, as part of the Data Survival logic.
Note:
- This is applicable only for String attributes.
- Numerical scores have to be provided against each attribute value in the source data. Where a value does not exist in the metadata, then no numerical value will be assigned, and it will be excluded from selection if other attributes have values.
- Where no strings are in the metadata, the data survival algorithm will select any one of the attributes for the output.
- User Defined Method: It displays the custom data survival method created by the user. To create a custom data survival method, see the FCC_DATASURV_TYPE table in the OFS Compliance Studio Administration and Configuration Guide.
Note:
If the data survival logic defined in the custom method returns an invalid value, the job still completes successfully. However, the record does not survive into the Staging Output tables.
- Click Save to save the Rule or click Cancel to revert the changes.
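
The behavior of several method types described in the steps above (Latest, Longest, Most Common, Maximum, Minimum, and Metadata) can be illustrated with a minimal Python sketch. This is not the product's implementation. The record layout, the function names, and the `risk_scores` mapping are all hypothetical and exist only to show how each method type would choose a surviving value from a set of matched records.

```python
from collections import Counter
from datetime import date

# Hypothetical matched records for one customer (illustrative data only).
records = [
    {"name": "J. Smith", "occupation": "Business",
     "annual_income": 2_000_000, "data_dump_date": date(2022, 11, 30)},
    {"name": "John Smith", "occupation": "Business",
     "annual_income": 3_000_000, "data_dump_date": date(2023, 1, 15)},
    {"name": "John A. Smith", "occupation": "Teacher",
     "annual_income": 5_000_000, "data_dump_date": date(2023, 6, 1)},
]

# Hypothetical metadata: a numerical score for each Occupation value.
risk_scores = {"Teacher": 1, "Business": 3}


def latest(rows, attr, date_attr):
    """Latest: take the value from the row with the most recent date."""
    return max(rows, key=lambda r: r[date_attr])[attr]


def longest(rows, attr):
    """Longest: take the value with the most characters."""
    return max((r[attr] for r in rows), key=len)


def most_common(rows, attr):
    """Most Common: take the value that is repeated most often."""
    return Counter(r[attr] for r in rows).most_common(1)[0][0]


def maximum(rows, attr):
    """Maximum: take the largest numeric value."""
    return max(r[attr] for r in rows)


def minimum(rows, attr):
    """Minimum: take the smallest numeric value."""
    return min(r[attr] for r in rows)


def by_metadata(rows, attr, scores, precedence="Maximum"):
    """Metadata: take the value with the highest or lowest score.

    Values without a score are excluded when any scored value exists.
    If no value has a score, any one value is returned (here, the first).
    """
    values = [r[attr] for r in rows]
    scored = [v for v in values if v in scores]
    if not scored:
        return values[0]
    pick = max if precedence == "Maximum" else min
    return pick(scored, key=lambda v: scores[v])


# Assemble a surviving ("global") record, one method type per attribute.
global_entity = {
    "name": longest(records, "name"),
    "occupation_latest": latest(records, "occupation", "data_dump_date"),
    "occupation_most_common": most_common(records, "occupation"),
    "occupation_highest_risk": by_metadata(records, "occupation", risk_scores),
    "income_max": maximum(records, "annual_income"),
    "income_min": minimum(records, "annual_income"),
}
print(global_entity)
```

With this sample data, Latest and Most Common pick different occupations for the same set of records, which shows why the method type is chosen per attribute.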