5.5.5.2 Unsupervised Scoring

  1. This batch is pre-seeded and is available in all workspaces (production and sandboxes).
  2. Execute this batch in the Production workspace.

The scoring data batch fetches one month or more of transactional data for previously segmented customers, and 12 months or more of transactional data for new entities that are now eligible for segmentation.

This batch populates the following tables:
  • AIF_BEHAVIORAL_DATA_UNSUP_PROD
  • AIF_NON_BEHAVIORAL_DATA_PROD

    Note:

    1. This batch has 2 tasks defined under it:
      • Scoring_Data_Load
      • ML_Scoring
    2. In Sandbox, Cluster Information will be stored in the AIF_ENTITY_CLUSTER table.

Figure 5-81 Define Task for Unsupervised Scoring



Data for new entities is populated into these tables:
  • AIF_BEHAVIORAL_DATA_UNSUP
  • AIF_NON_BEHAVIORAL_DATA
Scoring_Data_Load
  • Objective folder for this task:
    Home/Modeling/Pipelines/AIF Batch Framework/Unsupervised ML/Scoring Data
  • Model: Retain the default settings.

    Note:

  • The values in Optional Parameters can be edited:
    • from_date: Start of the date range, in DD-MON-YYYY format. Example: 01-Jul-2021
    • to_date: End of the date range, in DD-MON-YYYY format. Example: 31-Jul-2021
  • Example: from_date=01-Jan-2021,to_date=31-Jan-2021
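
The Optional Parameters string for Scoring_Data_Load can also be assembled programmatically. The following is a minimal sketch, assuming the expected format is a two-digit day, a three-letter English month abbreviation, and a four-digit year, as in the examples above. The helper names format_batch_date and scoring_data_load_params are hypothetical and are not part of the product.

```python
from datetime import date

# English month abbreviations, hard-coded so the output does not
# depend on the locale of the machine that builds the string.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_batch_date(d: date) -> str:
    """Format a date as DD-Mon-YYYY, for example 01-Jul-2021."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year:04d}"


def scoring_data_load_params(from_date: date, to_date: date) -> str:
    """Build the from_date/to_date string for the Optional Parameters field.

    Hypothetical helper: the ordering check is an assumption, not a
    documented product rule.
    """
    if from_date > to_date:
        raise ValueError("from_date must not be later than to_date")
    return (f"from_date={format_batch_date(from_date)},"
            f"to_date={format_batch_date(to_date)}")


print(scoring_data_load_params(date(2021, 1, 1), date(2021, 1, 31)))
# prints: from_date=01-Jan-2021,to_date=31-Jan-2021
```

The resulting string can be pasted into the Optional Parameters field of the task.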
ML_Scoring
The values in Optional Parameters can be edited:
  • osot_end_month_anomaly_scoring: Scoring data month in YYYYMM format. If not specified, the latest month of data available in the table is used for anomaly scoring.
  • debug: Assign True if debug mode is to be switched on. Default is False.
  • data_start_date: Start date for Scoring Data lookup in YYYYMM format.
  • data_end_date: End Date for Scoring/New Data lookup in YYYYMM format.
  • method_anomaly_scoring: String indicating which anomaly scoring method to use. Currently "NNLOF", "PCAREC" and "ISOFOR" are supported and the default is "NNLOF".
  • cutoff_pctl_anomaly_scoring: Cutoff percentile for anomaly flags. Ranges from 0 to 100. Defaults to 99.
  • osot_end_month_deviation_scoring: Scoring data month in YYYYMM format. If not specified, the latest month of data available in the table is used for deviation scoring.
  • cutoff_pctl_deviation_scoring: Cutoff percentile for deviation scoring. Ranges from 0 to 100. Defaults to 99.
  • method_deviation_scoring: String indicating which deviation scoring method to use. Currently "LDCOF" and "CBLOF" are supported and the default is "CBLOF".
  • Choose Link Types as Scoring.

    Example:

    osot_end_month_anomaly_scoring=None,debug=False,data_start_date=202207,data_end_date=202207,method_anomaly_scoring=NNLOF,cutoff_pctl_anomaly_scoring=99,osot_end_month_deviation_scoring=None,cutoff_pctl_deviation_scoring=99,method_deviation_scoring=LDCOF
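
Because the ML_Scoring parameter string is long and easy to mistype, it may help to generate and check it before pasting it into the task. The following is a minimal sketch that validates values against the constraints listed above (YYYYMM months, supported method names, percentiles from 0 to 100). The function ml_scoring_params is hypothetical and not part of the product. Treating data_start_date and data_end_date as required, and requiring the start month not to be later than the end month, are assumptions made for this sketch.

```python
import re

ANOMALY_METHODS = {"NNLOF", "PCAREC", "ISOFOR"}
DEVIATION_METHODS = {"LDCOF", "CBLOF"}
_YYYYMM = re.compile(r"^\d{4}(0[1-9]|1[0-2])$")


def _check_month(name, value, allow_none=False):
    """Accept a YYYYMM string, or None where the parameter is optional."""
    if value is None and allow_none:
        return
    if value is None or not _YYYYMM.match(str(value)):
        raise ValueError(f"{name} must be in YYYYMM format, got {value!r}")


def _check_pctl(name, value):
    """Percentile cutoffs range from 0 to 100."""
    if not 0 <= value <= 100:
        raise ValueError(f"{name} must be between 0 and 100, got {value!r}")


def ml_scoring_params(data_start_date, data_end_date, *,
                      osot_end_month_anomaly_scoring=None,
                      debug=False,
                      method_anomaly_scoring="NNLOF",
                      cutoff_pctl_anomaly_scoring=99,
                      osot_end_month_deviation_scoring=None,
                      cutoff_pctl_deviation_scoring=99,
                      method_deviation_scoring="CBLOF"):
    """Return a comma-separated key=value string for Optional Parameters."""
    _check_month("data_start_date", data_start_date)
    _check_month("data_end_date", data_end_date)
    if str(data_start_date) > str(data_end_date):  # assumption, see lead-in
        raise ValueError("data_start_date must not be later than data_end_date")
    _check_month("osot_end_month_anomaly_scoring",
                 osot_end_month_anomaly_scoring, allow_none=True)
    _check_month("osot_end_month_deviation_scoring",
                 osot_end_month_deviation_scoring, allow_none=True)
    _check_pctl("cutoff_pctl_anomaly_scoring", cutoff_pctl_anomaly_scoring)
    _check_pctl("cutoff_pctl_deviation_scoring", cutoff_pctl_deviation_scoring)
    if method_anomaly_scoring not in ANOMALY_METHODS:
        raise ValueError(f"unsupported anomaly method: {method_anomaly_scoring!r}")
    if method_deviation_scoring not in DEVIATION_METHODS:
        raise ValueError(f"unsupported deviation method: {method_deviation_scoring!r}")

    pairs = [
        ("osot_end_month_anomaly_scoring", osot_end_month_anomaly_scoring),
        ("debug", debug),
        ("data_start_date", data_start_date),
        ("data_end_date", data_end_date),
        ("method_anomaly_scoring", method_anomaly_scoring),
        ("cutoff_pctl_anomaly_scoring", cutoff_pctl_anomaly_scoring),
        ("osot_end_month_deviation_scoring", osot_end_month_deviation_scoring),
        ("cutoff_pctl_deviation_scoring", cutoff_pctl_deviation_scoring),
        ("method_deviation_scoring", method_deviation_scoring),
    ]
    # No spaces anywhere: a stray space inside a key breaks the string.
    return ",".join(f"{key}={value}" for key, value in pairs)


print(ml_scoring_params("202207", "202207", method_deviation_scoring="LDCOF"))
```

The call shown mirrors the parameter values used in the example above; all other arguments fall back to the documented defaults.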

    Figure 5-82 Edit Task for Unsupervised ML_Scoring
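
To give an intuition for the cutoff_pctl_anomaly_scoring and cutoff_pctl_deviation_scoring parameters, the sketch below shows one common way a percentile cutoff turns raw scores into flags. It illustrates the general idea only and does not describe the product's internal implementation. It assumes that higher scores are more anomalous, that the nearest-rank percentile definition is used, and that only scores strictly above the cutoff are flagged. The function names and sample scores are hypothetical.

```python
import math


def percentile_cutoff(scores, pctl=99):
    """Return the nearest-rank percentile of scores (pctl from 0 to 100)."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 <= pctl <= 100:
        raise ValueError("pctl must be between 0 and 100")
    ordered = sorted(scores)
    # Nearest-rank: the smallest value with at least pctl% of the data at or below it.
    rank = max(1, math.ceil(pctl / 100 * len(ordered)))
    return ordered[rank - 1]


def flag_anomalies(scores_by_entity, pctl=99):
    """Flag entities whose score is strictly above the percentile cutoff."""
    cutoff = percentile_cutoff(list(scores_by_entity.values()), pctl)
    return {entity: score > cutoff
            for entity, score in scores_by_entity.items()}


# Made-up scores for four entities; a low cutoff is used so the tiny sample flags something.
sample = {"E1": 0.9, "E2": 1.1, "E3": 1.0, "E4": 7.5}
print(flag_anomalies(sample, pctl=75))
# prints: {'E1': False, 'E2': False, 'E3': False, 'E4': True}
```

Under these assumptions, raising the cutoff percentile flags fewer entities and lowering it flags more.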



After unsupervised scoring completes, the results are stored in the following tables:
  • AIF_ANOMALY_SCORE
  • AIF_ANOMALY_SCORE_DETAILS
  • AIF_ANOMALY_SCORE_ECM_DETAILS
  • AIF_ENTITY_CLUSTER_DEVIATION

The application can consume anomaly scores from the above tables for downstream integrations. For more information on these tables, see the OFS Compliance Studio Data Model Reference Guide.