ofs_aif.ofs_asc package

Submodules

ofs_aif.ofs_asc.asc module

class asc(connect_with_default_workspace=True)

Bases: ofs_aif.aif.aif, ofs_aif.aif_utility.aif_utility

This class, ofs_asc, contains the Python methods for the automatic scenario calibration use case.

add_model_groups(model_group_name=None)

Create segmentation (model group) for AMLES

Parameters:

model_group_name – Unique name for the model group. Only alphanumeric characters, underscores, hyphens, and spaces are allowed.

Returns:

Success message once the model groups are created in the AIF system.

Examples:
>>> import pandas as pd
>>> input_pdf = pd.DataFrame({'MODEL_GROUP_NAME'    : ["Sig Cash 1"],
>>>                           'ENTITY_NAME'         : ["ASC"],
>>>                           'ATTRIBUTE_NAME'      : ["ASC"],
>>>                           'ATTRIBUTE_VALUE'     : ["ASC"],
>>>                           'LABEL_FILTER'        : ["ASC"],
>>>                           'FEATURE_TYPE_FILTER' : ["ASC"]
>>>                           })
>>>
>>> ofs_asc.add_model_groups(input_pdf)
calculate_sample_size(sample_size_method='hyper_geometric', hyper_params=None)

It calculates the sample size for each of the strata, using the hypergeometric distribution as the default method.

Parameters:
  • sample_size_method

    Method to get the sample size. The default method is hyper_geometric. Valid options are:

    • sample_size_method=’hyper_geometric’
      • It takes a sample from hypergeometric distribution

    • sample_size_method=function
      • A user-defined method, passed as sample_size_method=proportionate_sampling

      • The first parameter of the user-defined method should always be the strata population.

      • The user-defined method should always return sample number as an output.

  • hyper_params

    dict of tunable parameters for the hypergeometric distribution.

    • It is only applicable when sample_size_method = “hyper_geometric”

    • Keys of dict should be a strata number and values should be tunable parameters.

    • Example: hyper_params={1: {'Pt': 0.005, 'Pe': 0}, 2: {'Pt': 0.005, 'Pe': 0}, 3: {'Pt': 0.005, 'Pe': 0}}
      • Pe: Expected interesting event rate.

      • Pt: Tolerable suspicious event rate

      • Pw: Power of the test. Default is 95%

Returns:

dataframe

Examples:
>>> ofs_asc.perform_stratification()
>>>
>>> # User-defined method for computing the sample size
>>> stratified_summary = ofs_asc.show_stratification_summary()
>>> def proportionate_sampling(strata_population, stratify_proportions=[0.45, 0.30, 0.10]):
>>>     # Proportions apply to strata 1, strata 2, and all remaining strata, in that order
>>>     for idx, row in stratified_summary.iterrows():
>>>         if row['STRATA'] == 1:
>>>             sample_size = strata_population * stratify_proportions[0]
>>>         elif row['STRATA'] == 2:
>>>             sample_size = strata_population * stratify_proportions[1]
>>>         else:
>>>             sample_size = strata_population * stratify_proportions[2]
>>>     return sample_size
>>>
>>> ofs_asc.calculate_sample_size(sample_size_method=proportionate_sampling)
create_definition(save_with_new_version=False, cleanup_results=False, version=None)

This API creates a unique definition using the Model Group for a given Notebook. Internally, it calls the AIF create_definition method.

Parameters:
  • save_with_new_version – Boolean flag with options True/False. It helps create a history of models/outputs for a given definition. Any version can be chosen at a later point in time.

  • cleanup_results – Boolean flag with options True/False. When set to True, deletes all outputs from previous executions.

  • version – When multiple versions of the definition have been created, version selects the required one. The default value, None, selects the MAX version of the definition.

Returns:

A success message on completion, or an appropriate error message on failure.

Examples:
>>> ofs_asc.create_definition( save_with_new_version = False,
>>>                        cleanup_results = False,
>>>                        version = None )
Definition creation successful...
True
execute_scenario()

This API executes multiple instances of the scenario notebook in parallel for different fic_mis_date values. The scenario data is stored in the table fcc_am_event_details.

Param:

None

Returns:

Displays a success message.
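
The parallel, per-date execution described above can be pictured with a minimal sketch. This is not the library's implementation: run_scenario_for_date and run_all are hypothetical names, the thread pool is an assumed mechanism, and the date strings are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor


def run_scenario_for_date(fic_mis_date):
    # Hypothetical stand-in for one scenario notebook run on one date.
    return "COMPLETED"


def run_all(run_dates, max_workers=4):
    # Submit one run per fic_mis_date and collect the statuses in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        statuses = list(pool.map(run_scenario_for_date, run_dates))
    return dict(zip(run_dates, statuses))


statuses = run_all(["2021-01-31", "2021-02-28"])
```

Each date gets its own status, which is the kind of per-run information show_execution_status() reports.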

generate_expression(tunable_parameters=None)

It converts the tunable parameters passed by a user into a format accepted by the operator_overloading class.

Parameters:

tunable_parameters (str) – tunable parameters passed by a user. Logical & and | operators should be used to create an expression. Ex: ‘Large_Cash_Transaction_Amt & (Large_Cash_Transaction_Cnt | (Large_Cash_Transaction_Amt & Large_Cash_Transaction_Cnt))’

Returns:

expression

Examples:
>>> ofs_asc.generate_expression(tunable_parameters = 'Large_Cash_Transaction_Amt & (Large_Cash_Transaction_Cnt | (Large_Cash_Transaction_Amt & Large_Cash_Transaction_Cnt))')
'LARGE_CASH_TRANSACTION_AMT & (LARGE_CASH_TRANSACTION_CNT | (LARGE_CASH_TRANSACTION_AMT & LARGE_CASH_TRANSACTION_CNT))'
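
A minimal sketch of this kind of normalization, assuming it amounts to validating the tokens and upper-casing the parameter names as the example output suggests. The function normalize_expression is hypothetical and is not the library's code.

```python
import re

NAME = r"[A-Za-z_][A-Za-z0-9_]*"


def normalize_expression(expression):
    # Accept only parameter names, '&', '|', parentheses and whitespace.
    tokens = re.findall(NAME + r"|[&|()]", expression)
    if "".join(tokens) != re.sub(r"\s+", "", expression):
        raise ValueError("unsupported character in expression")
    # Upper-case every parameter name and leave operators untouched.
    return re.sub(NAME, lambda match: match.group(0).upper(), expression)


normalized = normalize_expression("Amt & (Cnt | Amt)")
```
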
get_cross_segment_parameter_analysis(segments_include=None, segments_exclude=None, select_features=None, figsize=(14, 6), title=None)

Function to compare distribution of alerts and effective alerts for specified parameters across specified segments.

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

  • select_features (list) – list of features to be plotted

  • figsize (tuple) – size of plot

  • title (str) – title of chart

Returns:

plot

get_cross_tab(segments_include=None, segments_exclude=None, feature_name=None, bins=None, round_digits=0)

Utility function to get a cross tab of parameter and effective flag

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

  • feature_name (str) – name of the feature being analyzed

  • bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals

  • round_digits (int) – number specifying how intervals should be rounded. To be used when deciles are used to bin the parameter values

Returns:

dataframe
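
To illustrate what a cross tab of a binned parameter against the effective flag contains, here is a small standard-library sketch. The function cross_tab_sketch, the half-open interval convention, and the sample amounts are assumptions made for illustration; the actual method returns a dataframe and may bin differently.

```python
from bisect import bisect_right
from collections import Counter


def cross_tab_sketch(values, effective_flags, bins):
    """Count observations per (interval, effective flag) pair.

    Intervals are half-open: [bins[i], bins[i + 1]).
    Values outside the outermost edges are ignored.
    """
    counts = Counter()
    for value, flag in zip(values, effective_flags):
        i = bisect_right(bins, value) - 1
        if 0 <= i < len(bins) - 1:
            counts[((bins[i], bins[i + 1]), flag)] += 1
    return counts


amounts = [500, 1500, 2500, 2600, 9000]
flags = [0, 0, 1, 1, 1]
table = cross_tab_sketch(amounts, flags, bins=[0, 1000, 3000, 10000])
```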

get_data(tag='BTL', tunable_parameters=None, segments_include=None, segments_exclude=None)

It retrieves the actual data for the tunable parameters passed by the user. Depending on the tag, either BTL or ATL data will be retrieved for analysis.

Parameters:
  • tag (str) – ‘BTL’ tag for BTL analysis and ‘ATL’ tag for ATL analysis. Default is ‘BTL’

  • tunable_parameters (str) – (mandatory) Parameters to be tuned, passed using the logical & and | operators

  • segments_include (list) – segments to be included

  • segments_exclude (list) – segments to be excluded

Returns:

dataframe

Examples:
>>> tunable_parameters='Large_Cash_Transaction_Amt & (Large_Cash_Transaction_Cnt | (Large_Cash_Transaction_Amt & Large_Cash_Transaction_Cnt))'
>>> ofs_asc.get_data(tunable_parameters=tunable_parameters, segments_include=['AMEA_HR','AMEA_MR'])
get_density_plots(segments_include=None, segments_exclude=None, select_features=None, figsize=(14, 6), title=None)

Function to get density plots for multiple parameters

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

  • select_features (list) – List of feature names(strings) to be plotted

  • figsize (tuple) – size of plot

  • title (str) – title of chart

Returns:

plot

get_effectiveness_trend(segments_include=None, segments_exclude=None, feature_name=None, bins=None, figsize=(14, 6), title=None, **kwargs)

Function to get density plots for multiple parameters

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

  • feature_name (str) – feature to be analyzed

  • bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals

  • figsize (tuple) – size of plot

  • title (str) – title of chart

  • **kwargs

    Keyword arguments:

    • round_digits (int) –

      number of digits to which bin limits are to be rounded

    • ax (int) –

      axis on which the plot is to be placed. Used only when creating multi grid plots

Returns:

Plot

Function to get density plots for multiple parameters

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

  • select_features (list) – List of feature names(strings) to be plotted

  • bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals

  • figsize (tuple) – size of plot

  • title (str) – title of chart

  • **kwargs

    Keyword arguments:

    • round_digits (int) –

      number of digits to which bin limits are to be rounded

Returns:

Plot

get_frequency_table_1D(segments_include=None, segments_exclude=None, feature_name=None, bins=None, plot=False, figsize=(8, 6), **kwargs)

Function to return a frequency table and optionally convert to a heat map

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

  • feature_name (str) – feature to be analyzed

  • bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals

  • plot (bool) – specifying whether to plot a heat map

  • figsize (tuple) – size of plot

  • **kwargs

    Keyword arguments:

    • cmap (str) –

      any accepted python color map

    • round_digits (int) –

      number of digits to which bin limits are to be rounded

    • ax (int) –

      axis on which to place the plot. Used only for multigrid plots

Returns:

dataframe and Plot(optional)

get_frequency_table_2D(segments_include=None, segments_exclude=None, select_features=None, bins=None, plot=False, figsize=(14, 8), **kwargs)

Function to return a frequency table and optionally convert to a heat map

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

  • select_features (list) – List of feature names(strings) to be plotted

  • bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals

  • plot (bool) – specifying whether to plot a heat map

  • figsize (tuple) – size of plot

  • **kwargs

    Keyword arguments:

    • cmap (str) –

      any accepted python color map

    • round_digits (int) –

      number of digits to which bin limits are to be rounded

Returns:

dataframe and Plot(optional)

get_frequency_tables_1D(segments_include=None, segments_exclude=None, select_features=None, bins=None, figsize=(8, 6), plot=False, title=None, **kwargs)

Function to get frequency tables and optionally heatmaps for multiple parameters

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

  • select_features (list) – features to be analyzed

  • bins (list of lists or arrays) – list of arrays specifying the bins into which values should be bucketed; if None, deciles are used as intervals

  • figsize (tuple) – size of plot

  • plot (bool) – specifying whether to plot a heat map

  • title (str) – title for plot

  • **kwargs

    Keyword arguments:

    • cmap (str) –

      any accepted python color map

    • round_digits (int) –

      number of digits to which bin limits are to be rounded

Returns:

list of dataframes and Plot(optional)

get_overall_summary(segments_include=None, segments_exclude=None)

Function to get segment wise summary of ATL Data

Parameters:
  • segments_include (list) – list of segments to include

  • segments_exclude (list) – list of segments to exclude

Returns:

Dataframe with summary of alerts

get_samples(strata_include=None, strata_exclude=None)

It takes random samples from each strata; the number of samples drawn equals the calculated sample size.

Parameters:
  • strata_include (list) – strata to be included

  • strata_exclude (list) – strata to be excluded

Returns:

Returns successful message

Examples:
>>> ofs_asc.get_samples(strata_include = ['AMEA_HR_1','AMEA_MR_1'], strata_exclude = None)
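
The sampling step can be sketched as follows, assuming simple random sampling without replacement within each strata. The function draw_samples_sketch, the dictionary inputs, and the fixed seed are hypothetical and only illustrate the idea.

```python
import random


def draw_samples_sketch(events_by_strata, sample_sizes, seed=42):
    # Draw, without replacement, the requested number of events per strata.
    rng = random.Random(seed)
    samples = {}
    for strata, events in events_by_strata.items():
        # Never ask for more events than the strata actually holds.
        size = min(sample_sizes.get(strata, 0), len(events))
        samples[strata] = rng.sample(events, size)
    return samples


events = {"AMEA_HR_1": list(range(100)), "AMEA_MR_1": list(range(100, 105))}
picked = draw_samples_sketch(events, {"AMEA_HR_1": 10, "AMEA_MR_1": 8})
```

In this sketch a strata with fewer events than its requested sample size is returned in full.
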
static hyper_calc(N, Pt=0.005, Pe=0, Pw=0.95)

Calculates the sample size from the hypergeometric distribution.

Parameters:
  • N – Strata size

  • Pt – Tolerable suspicious event rate

  • Pe – Expected interesting event rate

  • Pw – Power or reliability

Returns:

sample_size
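
One common way to size a sample with the hypergeometric distribution is sketched below. It is an assumption about how the parameters interact, not the documented behaviour of hyper_calc: the sketch looks for the smallest sample in which seeing no more than the expected number of events (from Pe) would be unlikely, at confidence Pw, if the strata really contained events at the tolerable rate Pt. The names hypergeom_cdf and sample_size_sketch are hypothetical.

```python
from math import ceil, comb, floor


def hypergeom_cdf(k, N, K, n):
    """P(X <= k) for n draws without replacement from N items, K of them events."""
    total = comb(N, n)
    upper = min(k, K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(upper + 1)) / total


def sample_size_sketch(N, Pt=0.005, Pe=0.0, Pw=0.95):
    # Number of events in the strata if the true rate equals the tolerable rate.
    # Rounding first guards against floating-point noise in N * Pt.
    K = max(1, ceil(round(N * Pt, 9)))
    for n in range(1, N + 1):
        k = floor(n * Pe)  # events expected in a sample of size n
        if hypergeom_cdf(k, N, K, n) <= 1 - Pw:
            return n
    return N  # fall back to reviewing the whole strata


small = sample_size_sketch(10, Pt=0.5)
```

With N=10 and Pt=0.5, the probability of drawing no events falls from 10/120 at n=3 to 5/210 at n=4, so the sketch returns 4.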

identify_atl_btl_population()

This API calls a SQL procedure p_asc_update_event_master4atl which updates the event_tag from BTL to ATL for each fic_mis_date in ASC_EVENT_MASTER table

Param:

None

Returns:

None

import_analysis_templates(analysis_id=None)

This API creates the objectives in MMG, taking the scenario folder name as input, and imports the following analysis drafts into the respective objectives:

  • BTL Analysis

  • ATL Analysis

  • ASC Scenario Execution

  • Impact Analysis

Parameters:

analysis_id – folder name which will be created under Home/ASC/Analysis. Ex: ‘Sig Cash 2’

Returns:

API response in json format.

Examples:
>>> aif.import_analysis_templates(analysis_id='Sig Cash 2')
investigate_samples(strata_include=None, strata_exclude=None)

It pushes the selected samples to DB table ASC_EVENT_SAMPLE

Parameters:
  • strata_include (list) – strata to be included

  • strata_exclude (list) – strata to be excluded

Returns:

Returns successful message

Examples:
>>> ofs_asc.investigate_samples(strata_include = None, strata_exclude = ['AMEA_HR_0','AMEA_MR_0','AMEA_RR_0'])
load_asc_event_master()

It calls a SQL procedure p_asc_load_event_master which loads data into ASC_EVENT_MASTER table for a list of fic_mis_date given by a class variable self.run_dates

Param:

None

Returns:

status message for each fic_mis_date

load_object()

Loads the object saved using self.save_object()

Param:

None

Returns:

valid python object on successful execution.

Example:
>>> data_pdf = ofs_asc.load_object()
perform_stratification(perc_list=[0.7, 0.3, 0.2], startification_method='percentile')

It stratifies the input population (grouped by JURISDICTION and RISK_LEVEL) using stratified sampling and assigns a strata number to each group within the chosen segment. The default method is percentile: each tunable parameter is converted to a percentile feature and then, based on the cut-off percentiles passed in perc_list, each segment is split into as many strata as there are values in perc_list.

Parameters:
  • perc_list

    list of cut-off percentiles used to create strata.

    • Strata numbers follow the order of the list from left to right.

    • The first element in the list represents strata 1, the second element represents strata 2, and so on.

    • Strata 0 always contains the entities not included in any other strata.

  • startification_method – method used to create the strata. Default value is “percentile”

Returns:

dataframe

Examples:
>>> ofs_asc.perform_stratification(perc_list=[0.8,0.6,0.4,0.2])
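
The percentile method can be pictured with a small standard-library sketch. It rests on several assumptions: a single tunable parameter, cut-offs listed in descending order, and a value belonging to the first strata whose cut-off its percentile rank reaches. The functions percentile_ranks and assign_strata are hypothetical and are not the library's code.

```python
from bisect import bisect_right


def percentile_ranks(values):
    # Fraction of observations less than or equal to each value.
    ordered = sorted(values)
    return [bisect_right(ordered, value) / len(values) for value in values]


def assign_strata(values, perc_list):
    # perc_list is assumed to hold descending cut-offs; strata 0 is the remainder.
    strata = []
    for rank in percentile_ranks(values):
        label = 0
        for number, cutoff in enumerate(perc_list, start=1):
            if rank >= cutoff:
                label = number
                break
        strata.append(label)
    return strata


amounts = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
labels = assign_strata(amounts, perc_list=[0.8, 0.6, 0.4, 0.2])
```

Under these assumptions the three largest amounts fall in strata 1 and the smallest amount falls in strata 0.
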
save_object(description=None)

Saves Python objects permanently so that they can be loaded whenever needed.

Parameters:

description – description for the object to be saved

Returns:

Paragraph execution will show success, else failure.

Example:
>>> import pandas as pd
>>> data_pdf = pd.DataFrame({'COL1':[1,2,3], 'COL2':[4,5,6]})
>>> ofs_asc.save_object( value = data_pdf)
scenario_post_processing()

It calls the load_asc_event_master method.

Param:

None

Returns:

None

show_atl_btl_population()

This API shows the count of ATL and BTL population for each fic_mis_date in ASC_EVENT_MASTER table

Param:

None

Returns:

dataframe

show_event_segments()

This API calls a SQL procedure p_asc_show_event_segments and returns the event’s volume for each unique combination of JURISDICTION and RISK_LEVEL along with the unique SEGMENT_ID for each row.

Param:

None

Returns:

dataframe

Examples:

>>> ofs_asc.show_event_segments()
SEGMENT_ID JURISDICTION RISK_LEVEL  POPULATION
AMEA_HR         AMEA         HR        1778
AMEA_MR         AMEA         MR           1
AMEA_RR         AMEA         RR           3
show_event_volume()

This API displays the event volume for each fic_mis_date in ASC_EVENT_MASTER table

Param:

None

Returns:

dataframe

show_execution_status()

It shows the execution status of the scenarios. The status can be one of these:

  • RUNNING

  • FAILED

  • COMPLETED

Param:

None

Returns:

dataframe

show_sample_size(strata_include=None, strata_exclude=None)

It shows the sample size calculated for each of the segments.

Parameters:
  • strata_include (list) – strata to be included

  • strata_exclude (list) – strata to be excluded

Returns:

dataframe showing sample size for each strata

Examples:
>>> ofs_asc.show_sample_size(strata_include = ['AMEA_HR_1','AMEA_MR_1'], strata_exclude = None)
show_samples(strata_include=None, strata_exclude=None)

It shows the selected samples for each strata.

Parameters:
  • strata_include (list) – strata to be included

  • strata_exclude (list) – strata to be excluded

Returns:

Returns dataframe showing selected samples

Examples:
>>> ofs_asc.show_samples(strata_include = ['AMEA_HR_1','AMEA_MR_1'], strata_exclude = None)
show_scenario_and_focus()

Display available scenarios and focal entities for analysis

Param:

None

Returns:

dataframe

Examples:
>>> ofs_asc.show_scenario_and_focus()
show_scenario_bindings()

Returns the binding names for the currently selected scenario, which the user can choose from when setting up the tunable parameters.

Param:

None

Returns:

list of scenario binding names

show_scenario_parameters()

This API finds all possible run dates for the date ranges provided by the user and displays all parameters set by the user.

Param:

None

Returns:

Displays the parameters set by the user.

show_stratification_summary()

It displays the strata number assigned to each group, along with the event population within each strata.

Param:

None

Returns:

dataframe showing strata number assigned to each segment

Examples:
>>> ofs_asc.show_stratification_summary()
class operator_overloading(operator_value)

Bases: object

Module contents