ofs_aif.sanctions package¶

Submodules¶

ofs_aif.sanctions.birth_year_similarity module¶

class birthyearSimilarity(year1, year2=1970)¶

Bases: object

ages()¶

digitSimilarity(digit_sim=<10 × 10 identity matrix indexed by the digits 0 to 9: 1.0 on the diagonal, 0.0 elsewhere>)¶
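The default digit_sim is an identity matrix over the digits 0 to 9, so only identical digits count as similar. The sketch below shows one way such a matrix could be used to compare two birth years position by position. It is an illustration under assumptions, not the ofs_aif implementation: the helper names identity_digit_sim and digit_similarity, the nested-list representation, and the averaging rule are all hypothetical.

```python
# Hypothetical sketch; the real digitSimilarity logic is not documented here.

def identity_digit_sim():
    """Build a 10 x 10 matrix: 1.0 where row digit == column digit, else 0.0."""
    return [[1.0 if r == c else 0.0 for c in range(10)] for r in range(10)]


def digit_similarity(year1, year2, digit_sim=None):
    """Average the per-position digit similarity of two four-digit years."""
    if digit_sim is None:
        digit_sim = identity_digit_sim()
    a, b = f"{int(year1):04d}", f"{int(year2):04d}"
    # Look up each aligned digit pair in the similarity matrix.
    scores = [digit_sim[int(x)][int(y)] for x, y in zip(a, b)]
    return sum(scores) / len(scores)


print(digit_similarity(1985, 1985))  # all four digits agree
print(digit_similarity(1985, 1986))  # three of four digits agree
```

A non-identity matrix could give partial credit to digits that are easily confused, which may be why the matrix is exposed as a parameter.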

exactMatch(flip_year=False)¶

timeDistance()¶

year_length = 365.2422¶
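The value 365.2422 is the approximate length of a mean tropical year in days. A minimal sketch of how a day count could be converted into years with this constant follows. The function years_between is hypothetical, and whether timeDistance() works this way is an assumption.

```python
from datetime import date

YEAR_LENGTH = 365.2422  # days per year; same value as the class attribute


def years_between(d1, d2):
    """Return the absolute distance between two dates, in years."""
    return abs((d2 - d1).days) / YEAR_LENGTH


print(years_between(date(1970, 1, 1), date(1980, 1, 1)))
```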

ofs_aif.sanctions.edq module¶

class edq¶

Bases: object

base64_string(input_string=None)¶
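As a rough illustration of what a Base64 helper of this kind typically does, the sketch below encodes a string with the standard library. It is a stand-alone function, not the edq method itself, and the "user:pass" credential format is an assumption made for the example.

```python
import base64


def base64_string(input_string):
    """Return the Base64 text encoding of a UTF-8 string."""
    return base64.b64encode(input_string.encode("utf-8")).decode("ascii")


token = base64_string("user:pass")
print(token)
```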

get_update_case_url()¶

get_user_password()¶

set_update_case_url(url)¶

set_user_password(edq_user_password)¶

update_case(alerts_pdf=None)¶

update_case_parallel(alerts_pdf=None)¶

ofs_aif.sanctions.event_scoring module¶

class eventScoring(connect_with_default_workspace=True)¶

Bases: supervised

The eventScoring class is a special use case of supervised learning, applied to anti-money laundering sanctions event scoring (AMLSES).

create_definition(save_with_new_version=False, cleanup_results=False, version=None)¶

Creates a unique definition for a given notebook using the Model Group and, optionally, the Model Group Scenario.

Parameters:

model_group_scenario_name – Name of the Model Group, as created in the AIF-Admin notebook.
save_with_new_version – Boolean flag (True/False). Helps build a history of models and outputs for a given definition; any version can be selected later.
cleanup_results – Boolean flag (True/False). When set to True, deletes all outputs from previous executions.
version – When multiple versions of the definition exist, selects the version to use. The default, None, selects the MAX version of the definition.

Returns:

Returns a success message on completion and an error message on failure.

Examples:

>>> aif.create_definition( model_group_name = "CORPORATE AND INSTITUTIONAL" ,
...                        model_group_scenario_name = "SHELL",
...                        save_with_new_version = False,
...                        cleanup_results = False,
...                        version = None )
Definition creation successful...
True

create_evented_data(date_range=None, osot_date_range=None)¶

This API prepares sanctions evented data from the table ml4aml_sanctions_events and stores it in the AIF class members self.B_DF (in-time) and self.B_DF_OSOT (out-time, OSOT). DATA_SOURCE and BUSINESS_CENTRE together form a unique filter for reading data from the table.

Parameters:

date_range – From and To dates for the OSIT (model build) data set, in YYYYMMDD format, optionally as a numeric data type. Example: date_range = [20150101, 20151231]
osot_date_range – From and To dates for the OSOT validation data set, in YYYYMMDD format as a numeric data type. Example: osot_date_range = [20160101, 20160331]

Returns:

The OSIT and OSOT data are stored in class members for later reference.

Example:

>>> aif.create_evented_data( date_range=[20150101,20151231], osot_date_range=[20160101,20160331])
Data preparation ( Sanctions events ) successful...
True
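Because both ranges are passed as YYYYMMDD values, it can help to validate them before calling the API. The sketch below is one way to do that with the standard library; parse_date_range is a hypothetical helper, not part of ofs_aif, and the strictness of the checks is an assumption.

```python
from datetime import datetime


def parse_date_range(date_range):
    """Convert [from, to] in YYYYMMDD form (int or str) to a pair of dates."""
    if len(date_range) != 2:
        raise ValueError("date_range must hold exactly two values: [from, to]")
    parsed = []
    for value in date_range:
        text = str(value)
        if len(text) != 8 or not text.isdigit():
            raise ValueError(f"not a YYYYMMDD value: {value!r}")
        # strptime raises ValueError for impossible dates such as 20151301.
        parsed.append(datetime.strptime(text, "%Y%m%d").date())
    start, end = parsed
    if start > end:
        raise ValueError("From date must not be later than To date")
    return start, end


start, end = parse_date_range([20150101, 20151231])
print(start.isoformat(), end.isoformat())
```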

create_modeling_dataset(X=None, osot=False)¶

This API converts any new Sanctions data into modelling data by applying all the transformations recorded during the training process for unsupervised learning.

Parameters:

X – Sanctions input data as pandas data frame.
osot – Boolean flag indicating the data set type (in-time or out-time (OSOT)). Always set to False during prediction.

Returns:

The created Sanctions stage 2 data is saved inside the class object.

Example:

>>> aif.create_modeling_dataset( X )

get_event_score_summary(jurisdiction=None, business_domain=None, fic_mis_date=None)¶

get_evented_data(osot=False)¶

Get sanctions-based events as in-time (OSIT) or out-time (OSOT) data in a pandas data frame.

Parameters:

osot – Boolean flag indicating the data set type (in-time or out-time (OSOT)). False (default): in-time data set (model build data set). True: OSOT data set.

Returns:

OSIT/OSOT data as a pandas data frame.

Example:

>>> B_OSIT_PDF  = self.get_evented_data();
Data dimension  : 41544 x 8
>>> B_OSOT_PDF  = self.get_evented_data(osot = True);
OSOT dataset is None

import_model_template(jurisdiction=None, business_domain=None, overwrite=False)¶

This API creates the objectives in Compliance Studio, taking jurisdiction and business domain as input, and imports model drafts into the respective objectives.

Parameters:

jurisdiction – Jurisdiction for an event segment
business_domain – Business domain for an event segment
overwrite – If True, Model Templates will be overwritten.

Returns:

On successful execution, imports model template notebooks into the respective objectives/folders.

Examples:

>>> aif.import_model_template(jurisdiction = 'North America', business_domain = "United States of America", overwrite = False)

predict(X=None, date_range=None, key_column='EVENT_ID', fic_mis_date=None, batch_run_id=None, threshold=0.7, return_score=False, debug=False)¶

Tests scoring interactively by connecting to a production-like schema before scheduling it as a batch process in real production. The same sandbox can also be used for scoring; in that case the sandbox schema should contain the scoring-related input and output tables. All runtime parameters expected during the scoring batch should be set in a Studio paragraph for testing purposes.

Parameters:

X – Stage 2 transformed new data as a pandas data frame. Default is None.
key_column – Identity column.
date_range – Scoring date range as a Python list.
fic_mis_date – AAI FIC MIS Date used in the batch execution.
batch_run_id – AAI Batch Run ID for the execution.
threshold – Threshold for generating events for ECM. Default is 0.7.
return_score – Boolean flag. If set to True, the scoring result is returned to the caller as a pandas data frame. The default, False, is the real production use case.
debug – Boolean (True/False). If set to True, debug mode is on.

Returns:

Returns output scores as pandas data frame.

Examples:

>>> from datetime import date
>>> score_pdf_list = self.predict(X = Stage_2_OSOT_pdf,
...                 key_column = 'ENTITY_ID',
...                 date_range = ['',''],
...                 fic_mis_date = date.today(),
...                 batch_run_id = 'RRF_ICC_BATCH_123',
...                 threshold = 0.5,
...                 return_score = True,
...                 debug = True )
Returns output scores as pandas data frame
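The threshold parameter decides which scored records become events for ECM. The sketch below illustrates that idea on plain Python records. It is not the predict() implementation: the flag_events helper, the SCORE field name, and the use of a greater-than-or-equal comparison are assumptions.

```python
def flag_events(scored_rows, threshold=0.7, key_column="EVENT_ID"):
    """Return the keys of rows whose score meets the threshold.

    Assumes each row is a dict holding the key column and a 'SCORE' field;
    both the field name and the >= comparison are illustrative choices.
    """
    return [row[key_column] for row in scored_rows if row["SCORE"] >= threshold]


rows = [
    {"EVENT_ID": "E1", "SCORE": 0.91},
    {"EVENT_ID": "E2", "SCORE": 0.55},
    {"EVENT_ID": "E3", "SCORE": 0.70},
]
print(flag_events(rows))                 # default threshold of 0.7
print(flag_events(rows, threshold=0.5))  # lower threshold flags more events
```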

set_edq_url(edq_url)¶

set_edq_user_password(edq_user_password)¶

set_event_segments(jurisdiction=None, business_domain=None)¶

show_event_segments()¶

View the available segments for sanctions-based events. Displays all unique combinations of DATA_SOURCE and BUSINESS_CENTRE in ML4AML_SANCTIONS_EVENTS.

Returns:

A pandas data frame showing the possible event segments.

Example:

>>> aif.show_event_segments()

update_edq_events(event_score_df=None, parallel=True)¶

update_event_score(jurisdiction=None, business_domain=None, fic_mis_date=None)¶

ofs_aif.sanctions.string_similarity module¶

class stringSimilarity(string1, string2, r_map={})¶

Bases: object

editDistance(swap=False, rm_vowels=False, rm_repeated=False)¶

histogramSimilarity(rm_vowels=False, rm_repeated=False)¶

longestCommonSubstr(swap=False, rm_vowels=False, rm_repeated=False)¶

phoneticEditDistance(swap=False, rm_vowels=False, rm_repeated=False)¶

vowels = ['a', 'e', 'i', 'o', 'u']¶
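The rm_vowels and rm_repeated flags suggest that the strings are normalised before they are compared. A minimal sketch of that pattern follows, using a standard Levenshtein distance. The helper names, the lower-casing step, the order of the two clean-up steps, and the choice of Levenshtein are all assumptions rather than the documented behaviour of stringSimilarity.

```python
VOWELS = ["a", "e", "i", "o", "u"]  # same list as the class attribute


def preprocess(text, rm_vowels=False, rm_repeated=False):
    """Lower-case the text, then optionally drop vowels and collapse repeats."""
    text = text.lower()
    if rm_vowels:
        text = "".join(ch for ch in text if ch not in VOWELS)
    if rm_repeated:
        collapsed = []
        for ch in text:
            # Keep a character only if it differs from the previous one kept.
            if not collapsed or collapsed[-1] != ch:
                collapsed.append(ch)
        text = "".join(collapsed)
    return text


def edit_distance(string1, string2, rm_vowels=False, rm_repeated=False):
    """Levenshtein distance between the two preprocessed strings."""
    a = preprocess(string1, rm_vowels, rm_repeated)
    b = preprocess(string2, rm_vowels, rm_repeated)
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


print(edit_distance("kitten", "sitting"))
print(edit_distance("Mohammed", "Mohamed", rm_repeated=True))
```

Collapsing repeated letters makes common spelling variants of a name compare as identical, which is useful when matching against watch lists.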

ofs_aif.sanctions.transformation module¶

class matchSimilarity¶

Bases: BaseEstimator, TransformerMixin

For each type of time-series variable, calculates 3 jump values:

NORM: (current_month - avg(prev_12_months)) / avg(prev_12_months)
LM: (current_month - last_month) / last_month
SMLY: (current_month - same_month_last_yr) / same_month_last_yr

Applies the user-configured "%over" thresholds (default is 200%). Denotes a violation as '1', no violation as '0', and not enough data for calculation as '_', then concatenates the results into a 3-digit bitmap.

Args:

threshold_percentage ([list]): Cut-off percentage for denoting a violation. Multiple jump variables for the same type of base variable can be created by using different thresholds passed as a list.
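The three jump values and the resulting bitmap described above can be sketched for a single monthly series as follows. This is an illustration only: the function jump_bitmap is hypothetical, a single threshold is used instead of a list, a jump counts as a violation when it is strictly greater than the threshold, and a zero base value is treated as not enough data.

```python
def jump_bitmap(monthly_values, threshold_percentage=200):
    """Return a 3-character bitmap (NORM, LM, SMLY) for the latest month.

    monthly_values is ordered oldest to newest, one value per month.
    """
    if not monthly_values:
        return "___"
    current = monthly_values[-1]

    def flag(base):
        # '_' when the base is missing or zero (assumed: cannot compute a jump).
        if base is None or base == 0:
            return "_"
        jump_pct = (current - base) / base * 100
        return "1" if jump_pct > threshold_percentage else "0"

    has_year = len(monthly_values) >= 13
    norm_base = sum(monthly_values[-13:-1]) / 12 if has_year else None
    lm_base = monthly_values[-2] if len(monthly_values) >= 2 else None
    smly_base = monthly_values[-13] if has_year else None
    return flag(norm_base) + flag(lm_base) + flag(smly_base)


print(jump_bitmap([100] * 12 + [400]))  # large jump against all three bases
print(jump_bitmap([100, 400]))          # only last month is available
```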

aggregateInfo(x, old_field, new_field)¶

editDistance(x, rm_vowels, rm_repeated)¶

exactMatch(x, flip_year)¶

fit(X, key_var='ENTITY_ID', target_var='SAR_FLG', feature_include=None, feature_exclude=None)¶

Fit Jump bitmap transformer

Args:

X ([DataFrame]): Input time-series dataframe.
key_var (str, optional): Key variable for grouping time-series data. Defaults to “ENTITY_ID”.
target_var (str, optional): Variable indicating the target.
feature_include ([list], optional): List of features to be included in bitmap computation. Defaults to None.
feature_exclude ([list], optional): List of features to be excluded from bitmap computation. Defaults to None.

Returns:

Self: Fitted transformer.

genderMatch(x)¶

histogramSimilarity(x, rm_vowels, rm_repeated)¶

longestCommonSubstr(x, rm_vowels, rm_repeated)¶

name_mapping()¶

occupationAge(x)¶

timeDistance(x)¶

transform(X)¶

Transform time-series features within X to 3-char jump bitmaps

Args:

X ([DataFrame]): Input time-series dataframe.

Returns:

[DataFrame]: Contains jump bitmap features for all the time-series features.
