ofs_aif.ofs_asc package
Submodules
ofs_aif.ofs_asc.asc module
- class asc(connect_with_default_workspace=True)
Bases: ofs_aif.aif.aif, ofs_aif.aif_utility.aif_utility
This class, ofs_asc, contains the Python methods for the automatic scenario calibration use case.
- add_model_groups(model_group_name=None)
Create segmentation (model group) for AMLES
- Parameters:
model_group_name – Unique name for the model group. Only alphanumeric characters, underscores, hyphens and spaces are allowed.
- Returns:
A success message once the model groups have been created in the AIF system.
- Examples:
>>> input_pdf = pd.DataFrame({'MODEL_GROUP_NAME' : ["Sig Cash 1"],
>>>                           'ENTITY_NAME' : ["ASC"],
>>>                           'ATTRIBUTE_NAME' : ["ASC"],
>>>                           'ATTRIBUTE_VALUE' : ["ASC"],
>>>                           'LABEL_FILTER' : ["ASC"],
>>>                           'FEATURE_TYPE_FILTER' : ["ASC"]
>>>                          })
>>>
>>> ofs_asc.add_model_groups(input_pdf)
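The naming rule for model_group_name (alphanumeric characters plus underscore, hyphen and space) can be checked before calling the API. The following is a minimal sketch; is_valid_model_group_name is a hypothetical helper and not part of ofs_aif, and it cannot check uniqueness, which only the AIF system knows.

```python
import re

# Hypothetical helper, not part of ofs_aif: mirrors the documented naming rule.
_MODEL_GROUP_NAME = re.compile(r"[A-Za-z0-9_\- ]+")


def is_valid_model_group_name(name):
    """Return True if name uses only letters, digits, underscore, hyphen or space."""
    return isinstance(name, str) and _MODEL_GROUP_NAME.fullmatch(name) is not None


print(is_valid_model_group_name("Sig Cash 1"))   # name taken from the example above
print(is_valid_model_group_name("Sig/Cash"))     # contains a disallowed character
```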
- calculate_sample_size(sample_size_method='hyper_geometric', hyper_params=None)
Calculates the sample size for each stratum, using the hypergeometric distribution as the default method.
- Parameters:
sample_size_method –
Method used to get the sample size. The default method is hyper_geometric. Valid options are:
- sample_size_method='hyper_geometric'
Takes a sample from the hypergeometric distribution.
- sample_size_method=function
A user-defined method, passed as, for example, sample_size_method = proportionate_sampling.
The first parameter of the user-defined method should always be the strata population.
The user-defined method should always return the sample number as its output.
hyper_params –
dict of tunable parameters for the hypergeometric distribution.
It is only applicable when sample_size_method = 'hyper_geometric'.
Keys of the dict should be strata numbers and values should be the tunable parameters.
- Example: hyper_params={1: {'Pt': 0.005, 'Pe': 0}, 2: {'Pt': 0.005, 'Pe': 0}, 3: {'Pt': 0.005, 'Pe': 0}}
Pe: Expected interesting event rate.
Pt: Tolerable suspicious event rate.
Pw: Power of the test. Default is 95%.
- Returns:
dataframe
- Examples:
>>> ofs_asc.perform_stratification()
>>>
>>> #User defined method for computing sample size
>>> stratified_summary = ofs_asc.show_stratification_summary()
>>> def proportionate_sampling(strata_population, stratify_proportions = [0.45, 0.30, 0.10]):
>>>     for idx, row in stratified_summary.iterrows():
>>>         if row['STRATA'] == 1:
>>>             sample_size = strata_population*stratify_proportions[0]
>>>         elif row['STRATA'] == 2:
>>>             sample_size = strata_population*stratify_proportions[1]
>>>         else:
>>>             sample_size = strata_population*stratify_proportions[2]
>>>     return sample_size
>>>
>>> ofs_asc.calculate_sample_size( sample_size_method = proportionate_sampling)
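The contract for a user-defined sample_size_method is small: the strata population is the first parameter and the return value is a sample number. The standalone sketch below shows one way to satisfy it. The proportion and minimum parameters, and the rounding and capping behaviour, are assumptions made for illustration and are not prescribed by ofs_aif.

```python
import math


def proportionate_sampling(strata_population, proportion=0.10, minimum=1):
    """Return a whole-number sample size for one stratum.

    strata_population comes first, as the documented contract requires.
    proportion and minimum are illustrative assumptions.
    """
    if strata_population <= 0:
        return 0
    size = math.ceil(strata_population * proportion)
    # Never sample fewer than `minimum` or more than the stratum holds.
    return min(strata_population, max(minimum, size))


print(proportionate_sampling(1778))
```

Returning an integer that never exceeds the population keeps the result usable as a count of events to draw, whatever proportion is chosen.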
- create_definition(save_with_new_version=False, cleanup_results=False, version=None)
Creates a unique definition using the model group for a given notebook. Internally calls the AIF create_definition method.
- Parameters:
save_with_new_version – Boolean flag (True/False). It helps create a history of models/outputs for a given definition. Any version can be chosen at a later point in time.
cleanup_results – Boolean flag (True/False). When set to True, deletes all outputs from previous executions.
version – when multiple versions of the definition have been created, version is supplied to pick the required one. The default value is None, which means the MAX version of the definition.
- Returns:
A success message on completion, or an error message on failure.
- Examples:
>>> ofs_asc.create_definition( save_with_new_version = False,
>>>                            cleanup_results = False,
>>>                            version = None )
Definition creation successful...
True
- execute_scenario()
This API executes multiple instances of the scenario notebook in parallel for different fic_mis_date values. The scenario data is stored in the table fcc_am_event_details.
- Param:
None
- Returns:
Display successful message.
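The idea of running one scenario instance per fic_mis_date in parallel can be illustrated with the standard library. This is only a sketch of the pattern, not how execute_scenario is implemented: run_scenario_for_date is a hypothetical stand-in for a single notebook run, and the thread pool and max_workers value are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_scenario_for_date(fic_mis_date):
    """Hypothetical stand-in for one scenario-notebook run on a single date."""
    return f"scenario executed for {fic_mis_date}"


def execute_in_parallel(run_dates, max_workers=4):
    """Run the stand-in once per date and collect one result per date."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of run_dates in its results.
        results = list(pool.map(run_scenario_for_date, run_dates))
    return dict(zip(run_dates, results))


for date, message in execute_in_parallel(["2023-01-31", "2023-02-28"]).items():
    print(date, "->", message)
```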
- generate_expression(tunable_parameters=None)
Converts the tunable parameters passed by the user into a format accepted by the operator_overloading class.
- Parameters:
tunable_parameters (str) – tunable parameters passed by the user. The logical & and | operators should be used to create an expression. Ex: 'Large_Cash_Transaction_Amt & (Large_Cash_Transaction_Cnt | (Large_Cash_Transaction_Amt & Large_Cash_Transaction_Cnt))'
- Returns:
expression
- Examples:
>>> ofs_asc.generate_expression(tunable_parameters = 'Large_Cash_Transaction_Amt & (Large_Cash_Transaction_Cnt | (Large_Cash_Transaction_Amt & Large_Cash_Transaction_Cnt))')
'LARGE_CASH_TRANSACTION_AMT & (LARGE_CASH_TRANSACTION_CNT | (LARGE_CASH_TRANSACTION_AMT & LARGE_CASH_TRANSACTION_CNT))'
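The example output suggests that parameter names are upper-cased while the & and | structure is preserved. A minimal sketch of that transformation follows. normalize_expression is a hypothetical function, and its validation rules (allowed characters, balanced parentheses) are assumptions rather than documented behaviour of generate_expression.

```python
import re


def normalize_expression(tunable_parameters):
    """Hypothetical sketch: validate the expression and upper-case parameter names."""
    # Assumed rule: only names, whitespace, &, | and parentheses may appear.
    if not re.fullmatch(r"[\w\s&|()]+", tunable_parameters):
        raise ValueError("only names, &, | and parentheses are allowed")
    if tunable_parameters.count("(") != tunable_parameters.count(")"):
        raise ValueError("unbalanced parentheses")
    # Upper-case every identifier; operators and spacing are left untouched.
    return re.sub(r"[A-Za-z_]\w*", lambda m: m.group(0).upper(), tunable_parameters)


print(normalize_expression("Large_Cash_Transaction_Amt & Large_Cash_Transaction_Cnt"))
```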
- get_cross_segment_parameter_analysis(segments_include=None, segments_exclude=None, select_features=None, figsize=(14, 6), title=None)
Function to compare the distribution of alerts and effective alerts for specified parameters across specified segments.
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
select_features (list) – list of features to be plotted
figsize (tuple) – size of plot
title (str) – title of chart
- Returns:
plot
- get_cross_tab(segments_include=None, segments_exclude=None, feature_name=None, bins=None, round_digits=0)
Utility function to get a cross tab of a parameter and the effective flag.
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
feature_name (str) – name of the feature being analyzed
bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals
round_digits (int) – number specifying how intervals should be rounded. To be used when deciles are used to bin the parameter values
- Returns:
dataframe
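To make the idea of a cross tab of a binned parameter against the effective flag concrete, here is a small standard-library sketch. It is not the implementation of get_cross_tab: the cross_tab function, the half-open interval convention [low, high) and the 0/1 encoding of the flag are all assumptions.

```python
from bisect import bisect_right
from collections import Counter


def cross_tab(values, effective_flags, bins):
    """Count non-effective (0) and effective (1) events per interval [bins[i], bins[i+1])."""
    counts = Counter()
    for value, flag in zip(values, effective_flags):
        idx = bisect_right(bins, value) - 1
        if 0 <= idx < len(bins) - 1:  # ignore values outside the bin range
            counts[(idx, flag)] += 1
    table = {}
    for i in range(len(bins) - 1):
        table[(bins[i], bins[i + 1])] = {0: counts[(i, 0)], 1: counts[(i, 1)]}
    return table


for interval, row in cross_tab([5, 15, 25, 15], [0, 1, 1, 0], [0, 10, 20, 30]).items():
    print(interval, row)
```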
- get_data(tag='BTL', tunable_parameters=None, segments_include=None, segments_exclude=None)
It retrieves the actual data for the tunable parameters passed by the user. Depending on the tag, either BTL or ATL data will be retrieved for analysis.
- Parameters:
tag (str) – 'BTL' tag for BTL analysis and 'ATL' tag for ATL analysis. Default is 'BTL'
tunable_parameters (str) – (mandatory) Parameters to be tuned, passed using the logical & and | operators
segments_include (list) – segments to be included
segments_exclude (list) – segments to be excluded
- Returns:
dataframe
- Examples:
>>> tunable_parameters='Large_Cash_Transaction_Amt & (Large_Cash_Transaction_Cnt | (Large_Cash_Transaction_Amt & Large_Cash_Transaction_Cnt))'
>>> ofs_asc.get_data(tunable_parameters=tunable_parameters, segments_include=['AMEA_HR','AMEA_MR'])
- get_density_plots(segments_include=None, segments_exclude=None, select_features=None, figsize=(14, 6), title=None)
Function to get density plots for multiple parameters
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
select_features (list) – list of feature names (strings) to be plotted
figsize (tuple) – size of plot
title (str) – title of chart
- Returns:
plot
- get_effectiveness_trend(segments_include=None, segments_exclude=None, feature_name=None, bins=None, figsize=(14, 6), title=None, **kwargs)
Function to get the effectiveness trend for a single parameter
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
feature_name (str) – feature to be analyzed
bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals
figsize (tuple) – size of plot
title (str) – title of chart
**kwargs –
Keyword arguments:
- round_digits (int) – number of digits to which bin limits are to be rounded
- ax (int) – axis on which the plot is to be placed. Used only when creating multi-grid plots
- Returns:
Plot
- get_effectiveness_trends(segments_include=None, segments_exclude=None, select_features=None, bins=None, figsize=(8, 6), title=None, **kwargs)
Function to get effectiveness trends for multiple parameters
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
select_features (list) – list of feature names (strings) to be plotted
bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals
figsize (tuple) – size of plot
title (str) – title of chart
**kwargs –
Keyword arguments:
- round_digits (int) – number of digits to which bin limits are to be rounded
- Returns:
Plot
- get_frequency_table_1D(segments_include=None, segments_exclude=None, feature_name=None, bins=None, plot=False, figsize=(8, 6), **kwargs)
Function to return a frequency table and optionally convert it to a heat map
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
feature_name (str) – feature to be analyzed
bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals
plot (bool) – whether to plot a heat map
figsize (tuple) – size of plot
**kwargs –
Keyword arguments:
- cmap (str) – any accepted Python color map
- round_digits (int) – number of digits to which bin limits are to be rounded
- ax (int) – axis on which to place the plot. Used only for multi-grid plots
- Returns:
dataframe and Plot(optional)
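When bins is None, the frequency functions fall back to deciles. The sketch below shows one way decile bin edges and per-bin counts could be derived with the standard library. It is illustrative only: decile_frequency_table is a hypothetical function, and the quantile method (the default of statistics.quantiles) and the edge-inclusion rule are assumptions that may differ from the library's behaviour.

```python
from bisect import bisect_right
from statistics import quantiles


def decile_frequency_table(values, round_digits=0):
    """Bucket values into decile intervals and count how many fall in each."""
    # Nine cut points split the data into ten intervals (deciles).
    edges = [round(q, round_digits) for q in quantiles(values, n=10)]
    counts = [0] * (len(edges) + 1)
    for value in values:
        # A value equal to an edge is counted in the interval above that edge.
        counts[bisect_right(edges, value)] += 1
    return edges, counts


edges, counts = decile_frequency_table(list(range(1, 100)))
print(edges)
print(counts)
```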
- get_frequency_table_2D(segments_include=None, segments_exclude=None, select_features=None, bins=None, plot=False, figsize=(14, 8), **kwargs)
Function to return a frequency table and optionally convert it to a heat map
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
select_features (list) – list of feature names (strings) to be plotted
bins (numpy array or list) – array specifying the bins into which values should be bucketed; if None, deciles are used as intervals
plot (bool) – whether to plot a heat map
figsize (tuple) – size of plot
**kwargs –
Keyword arguments:
- cmap (str) – any accepted Python color map
- round_digits (int) – number of digits to which bin limits are to be rounded
- Returns:
dataframe and Plot(optional)
- get_frequency_tables_1D(segments_include=None, segments_exclude=None, select_features=None, bins=None, figsize=(8, 6), plot=False, title=None, **kwargs)
Function to get frequency tables and optionally heat maps for multiple parameters
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
select_features (list) – features to be analyzed
bins (list of lists or arrays) – list of arrays specifying the bins into which values should be bucketed; if None, deciles are used as intervals
figsize (tuple) – size of plot
plot (bool) – whether to plot a heat map
title (str) – title for plot
**kwargs –
Keyword arguments:
- cmap (str) – any accepted Python color map
- round_digits (int) – number of digits to which bin limits are to be rounded
- Returns:
list of dataframes and Plot(optional)
- get_overall_summary(segments_include=None, segments_exclude=None)
Function to get a segment-wise summary of ATL data
- Parameters:
segments_include (list) – list of segments to include
segments_exclude (list) – list of segments to exclude
- Returns:
Dataframe with summary of alerts
- get_samples(strata_include=None, strata_exclude=None)
Takes random samples from each stratum, equal in number to the calculated sample_size.
- Parameters:
strata_include (list) – strata to be included
strata_exclude (list) – strata to be excluded
- Returns:
Returns a success message
- Examples:
>>> ofs_asc.get_samples(strata_include = ['AMEA_HR_1','AMEA_MR_1'], strata_exclude = None)
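Drawing a random sample of a given size from each stratum can be sketched as follows. This is a standalone illustration rather than the get_samples implementation: draw_samples, its dictionary inputs and the seed parameter are assumptions, as is sampling without replacement.

```python
import random


def draw_samples(strata_members, sample_sizes, seed=None):
    """Randomly pick sample_sizes[s] members, without replacement, from each stratum s."""
    rng = random.Random(seed)
    samples = {}
    for stratum, members in strata_members.items():
        # A stratum cannot yield more samples than it has members.
        size = min(sample_sizes.get(stratum, 0), len(members))
        samples[stratum] = rng.sample(list(members), size)
    return samples


strata = {"AMEA_HR_1": list(range(100)), "AMEA_MR_1": [1, 2, 3]}
sizes = {"AMEA_HR_1": 10, "AMEA_MR_1": 5}
picked = draw_samples(strata, sizes, seed=0)
print({name: len(rows) for name, rows in picked.items()})
```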
- static hyper_calc(N, Pt=0.005, Pe=0, Pw=0.95)
Sample size calculation from the hypergeometric distribution
- Parameters:
N – Strata size
Pt – Tolerable suspicious event rate
Pe – Expected interesting event rate
Pw – Power or reliability
- Returns:
sample_size
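The source does not give the formula behind hyper_calc. One common formulation, shown here purely as an assumption, picks the smallest sample size n for which a stratum containing suspicious events at the tolerable rate Pt would, with probability at least Pw, reveal more of them than the expected rate Pe allows. The function names and the rounding of Pt*N and Pe*n are illustrative choices.

```python
from math import ceil, comb, floor


def miss_probability(N, n, Pt=0.005, Pe=0.0):
    """P(a sample of n shows at most floor(Pe*n) suspicious events),
    assuming the stratum of size N holds ceil(Pt*N) of them."""
    K = min(N, ceil(Pt * N))   # suspicious events at the tolerable rate
    allowed = floor(Pe * n)    # events expected in the sample anyway
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(allowed + 1))
    return favourable / comb(N, n)


def hypergeometric_sample_size(N, Pt=0.005, Pe=0.0, Pw=0.95):
    """Smallest n whose miss probability is at most 1 - Pw (N if none qualifies)."""
    for n in range(1, N + 1):
        if miss_probability(N, n, Pt, Pe) <= 1 - Pw:
            return n
    return N


print(hypergeometric_sample_size(1000))
```

With Pe = 0 the miss probability reduces to the chance of drawing no suspicious event at all, which is why smaller tolerable rates lead to larger samples.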
- identify_atl_btl_population()
This API calls a SQL procedure p_asc_update_event_master4atl, which updates the event_tag from BTL to ATL for each fic_mis_date in the ASC_EVENT_MASTER table.
- Param:
None
- Returns:
None
- import_analysis_templates(analysis_id=None)
This API creates the objectives in MMG, taking the scenario folder name as input, and imports the following analysis drafts into their respective objectives:
BTL Analysis
ATL Analysis
ASC Scenario Execution
Impact Analysis
- Parameters:
analysis_id – folder name which will be created under Home/ASC/Analysis. Ex: ‘Sig Cash 2’
- Returns:
API response in json format.
- Examples:
>>> aif.import_analysis_templates(analysis_id='Sig Cash 2')
- investigate_samples(strata_include=None, strata_exclude=None)
It pushes the selected samples to the DB table ASC_EVENT_SAMPLE.
- Parameters:
strata_include (list) – strata to be included
strata_exclude (list) – strata to be excluded
- Returns:
Returns successful message
- Examples:
>>> ofs_asc.investigate_samples(strata_include = None, strata_exclude = ['AMEA_HR_0','AMEA_MR_0','AMEA_RR_0'])
- load_asc_event_master()
It calls a SQL procedure p_asc_load_event_master, which loads data into the ASC_EVENT_MASTER table for the list of fic_mis_date values given by the class variable self.run_dates.
- Param:
None
- Returns:
status message for each fic_mis_date
- load_object()
Loads the object saved using self.save_object()
- Param:
None
- Returns:
valid python object on successful execution.
- Example:
>>> data_pdf = ofs_asc.load_object()
- perform_stratification(perc_list=[0.7, 0.3, 0.2], startification_method='percentile')
Stratifies the input population (grouped by JURISDICTION and RISK_LEVEL) using stratified sampling and assigns a strata number to each group within the chosen segment. The default method is percentile: each tunable parameter is converted to a percentile feature, and then, based on the cut-off percentiles passed in perc_list, each segment is split into as many strata as there are values in perc_list.
- Parameters:
perc_list –
list of cut-off percentiles used to create the strata.
Strata numbers follow the order of the list from left to right.
The first element in the list represents strata 1, the second element represents strata 2, and so on.
Strata 0 always holds the entities not included in any other strata.
startification_method – method used to create the strata. Default value is "percentile"
- Returns:
dataframe
- Examples:
>>> ofs_asc.perform_stratification(perc_list=[0.8,0.6,0.4,0.2])
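How cut-off percentiles map to strata numbers can be illustrated with a single parameter. The sketch below rests on assumptions that the source does not confirm: perc_list is in descending order, a value belongs to the first strata whose cut-off its percentile rank reaches, and the rank is the share of values less than or equal to it. assign_strata is a hypothetical function, not part of ofs_aif.

```python
from bisect import bisect_right


def assign_strata(values, perc_list=(0.7, 0.3, 0.2)):
    """Assign each value a strata number from its percentile rank.

    Strata 1 gets ranks >= perc_list[0], strata 2 the remaining ranks
    >= perc_list[1], and so on; anything below the last cut-off is strata 0.
    """
    n = len(values)
    ordered = sorted(values)
    strata = []
    for value in values:
        rank = bisect_right(ordered, value) / n  # share of values <= value
        for number, cutoff in enumerate(perc_list, start=1):
            if rank >= cutoff:
                strata.append(number)
                break
        else:
            strata.append(0)  # below every cut-off
    return strata


print(assign_strata(list(range(1, 11)), perc_list=[0.8, 0.6, 0.4, 0.2]))
```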
- save_object(description=None)
Saves Python objects permanently so that they can be loaded whenever needed.
- Parameters:
description – description for the object to be saved
- Returns:
Paragraph execution shows success on completion, otherwise failure.
- Example:
>>> data_pdf = pd.DataFrame({'COL1':[1,2,3], 'COL2':[4,5,6]})
>>> ofs_asc.save_object( value = data_pdf)
- scenario_post_processing()
It calls the method load_asc_event_master.
- Param:
None
- Returns:
None
- show_atl_btl_population()
This API shows the count of the ATL and BTL population for each fic_mis_date in the ASC_EVENT_MASTER table.
- Param:
None
- Returns:
dataframe
- show_event_segments()
This API calls a SQL procedure p_asc_show_event_segments and returns the event volume for each unique combination of JURISDICTION and RISK_LEVEL, along with the unique SEGMENT_ID for each row.
- Param:
None
- Returns:
dataframe
- Examples:
>>> ofs_asc.show_event_segments()
SEGMENT_ID  JURISDICTION  RISK_LEVEL  POPULATION
AMEA_HR     AMEA          HR          1778
AMEA_MR     AMEA          MR          1
AMEA_RR     AMEA          RR          3
- show_event_volume()
This API displays the event volume for each fic_mis_date in the ASC_EVENT_MASTER table.
- Param:
None
- Returns:
dataframe
- show_execution_status()
It shows the execution status of the scenarios. The status can be one of these:
RUNNING
FAILED
COMPLETED
- Param:
None
- Returns:
dataframe
- show_sample_size(strata_include=None, strata_exclude=None)
It shows the sample size calculated for each of the segments.
- Parameters:
strata_include (list) – strata to be included
strata_exclude (list) – strata to be excluded
- Returns:
dataframe showing sample size for each strata
- Examples:
>>> ofs_asc.show_sample_size(strata_include = ['AMEA_HR_1','AMEA_MR_1'], strata_exclude = None)
- show_samples(strata_include=None, strata_exclude=None)
It shows the selected samples for each stratum.
- Parameters:
strata_include (list) – strata to be included
strata_exclude (list) – strata to be excluded
- Returns:
Returns dataframe showing selected samples
- Examples:
>>> ofs_asc.show_samples(strata_include = ['AMEA_HR_1','AMEA_MR_1'], strata_exclude = None)
- show_scenario_and_focus()
Display available scenarios and focal entities for analysis
- Param:
None
- Returns:
dataframe
- Examples:
>>> ofs_asc.show_scenario_and_focus()
- show_scenario_bindings()
Returns the binding names for the currently selected scenario, from which the user can choose when setting up the tunable parameters.
- Param:
None
- Returns:
list of scenario binding names
- show_scenario_parameters()
This API finds all possible run dates for the date ranges provided by the user and displays all parameters set by the user.
- Param:
None
- Returns:
Display parameters set by a user.
- show_stratification_summary()
It displays the strata number assigned to each group along with the event population within each stratum.
- Param:
None
- Returns:
dataframe showing strata number assigned to each segment
- Examples:
>>> ofs_asc.show_stratification_summary()
- class operator_overloading(operator_value)
Bases: object
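The source documents only the name and constructor of this class. As a general illustration of the technique its name refers to, the sketch below shows how Python's & and | operators can be overloaded so that combining objects builds up an expression. Term is a hypothetical class written for this example and says nothing about how operator_overloading itself behaves.

```python
class Term:
    """Hypothetical stand-in showing how & and | can be overloaded to build expressions."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        # Called for `self & other`.
        return Term(f"({self.text} & {other.text})")

    def __or__(self, other):
        # Called for `self | other`.
        return Term(f"({self.text} | {other.text})")

    def __str__(self):
        return self.text


amt = Term("LARGE_CASH_TRANSACTION_AMT")
cnt = Term("LARGE_CASH_TRANSACTION_CNT")
print(amt & (cnt | amt))
```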