Auto ML Classifiers supported in AIF

These classifiers define what can be used for cross-validation. They are configured through the Classifier-Grid, which is a list of dictionaries. Each dictionary is a classifier-group that contains a list of classifiers, hyper-parameters, and control parameters for that group. To run the same classifier or technique multiple times with different sets of hyper-parameters and control parameters, add a separate dictionary for each set.

Hyper-parameters are classifier-specific. For a standard sklearn classifier, the hyper-parameters must match that classifier's own hyper-parameter list. Control parameters are as follows:

·        feature_include

·        feature_exclude

·        feature_must_include

·        woe_transform

 

Each classifier dictionary consists of three optional key-value pairs, and each dictionary forms a separate model group. The key-value pairs within each model group are as follows:

·        models: Specifies a single model name or a list of model names. Default is None.

·        params: Specifies the hyper-parameters for the model as a dictionary. Applicable only if a single model is specified. Default is None.

·        ctrls: Specifies the input data controls as a dictionary, which includes:

§       feature_include {List}: List of features to be included in the model build process

§       feature_exclude {List}: List of features to be excluded in the model build process. Either feature_include or feature_exclude should be None. If both are provided, only feature_include is considered.

§       feature_must_include {List}: List of features that must be included and should be part of the final model

§       woe_transform {bool}: When True, performs woe transformations on input data before model fit

§       woe_bin_method {String}: Takes the value quantile, interval, or auto, which specifies the type of binning used for numeric variables before the woe transformation. Binning converts numeric values to discrete values.

§       woe_n_tile {int}: Number of bins to be created when converting numeric to discrete data

§       require_test {bool}: When True, test data is used during the model fit. Applicable to xgboost only.
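
The precedence rule between feature_include and feature_exclude can be sketched as follows. This is a minimal illustration and not the AIF implementation: the helper name resolve_features and the column names are hypothetical, and the sketch covers only these two controls.

```python
def resolve_features(all_features, feature_include=None, feature_exclude=None):
    """Return the features used for the model build (hypothetical helper)."""
    if feature_include is not None:
        # feature_include takes precedence; feature_exclude is ignored
        # when both are provided.
        return [f for f in all_features if f in feature_include]
    if feature_exclude is not None:
        return [f for f in all_features if f not in feature_exclude]
    # Neither control is set: keep every feature.
    return list(all_features)


columns = ['f1', 'f2', 'f3', 'f4']  # illustrative column names

# Both controls given: only feature_include is considered.
print(resolve_features(columns, feature_include=['f1', 'f2'], feature_exclude=['f1']))

# Only feature_exclude given: the listed features are dropped.
print(resolve_features(columns, feature_exclude=['f4']))
```

Setting only one of the two controls keeps the intent of a classifier-group unambiguous.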


Each dictionary is a classifier-group, and the dictionaries must be separated by commas. For example, [{classifier-group-1},{classifier-group-2},{classifier-group-3}]. The following combinations are possible:

·        Specify models without any parameters or controls.
Any number of models that are defined in the ofs_classifiers module can be listed.
[{
'models': ['WOELR', 'LR', 'ADB', 'MLP', 'XGB', 'GB', 'RF', 'NB']
}]

·        Specify models and controls without any parameters.
Any number of models that are defined in the ofs_classifiers module can be listed.
Controls can take the values defined above and are applied to all the models within this model group.
[{
'models': ['LR', 'ADB', 'XGB', 'GB'],
'ctrls': {'feature_include': ['f1','f2','f3'], 'woe_transform': True}
}]

·        Specify models and parameters without any controls.
Only one model can be specified in this case as each model will have its own set of parameters.
Each parameter can take multiple values for constructing a grid search. In the example below, 6 (3*2) models would be created for 6 different combinations of the two parameters specified.
[{
'models': ['xgb'],
'params': {'n_estimators':(10,15,20), 'max_depth':[5,6]}
}]
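
The count of 6 follows from taking every combination of the listed values. The sketch below enumerates them with the Python standard library. It illustrates the arithmetic only and makes no assumption about how AIF runs the search internally.

```python
from itertools import product

# Same params dictionary as in the example above.
params = {'n_estimators': (10, 15, 20), 'max_depth': [5, 6]}

names = list(params)
# Cartesian product of the value lists, one dictionary per combination.
combinations = [
    dict(zip(names, values))
    for values in product(*(params[name] for name in names))
]

print(len(combinations))  # 3 values * 2 values = 6 combinations
for combo in combinations:
    print(combo)
```

Each additional parameter multiplies the number of models to fit, so the grid grows quickly as parameters and values are added.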

·        Specify models, parameters, and controls.
Only one model can be specified in this case as each model will have its own set of parameters. Each parameter can take multiple values for constructing a grid search. In the example below, 4 (2*2) models would be created for 4 different combinations of the two parameters specified.
Controls can take the values defined above and are applied to all the models within this model group.
[{
'models': ['xgb'],
'params': {'n_estimators':(10,20), 'max_depth':[5,6]},
'ctrls': {'feature_include': ['f1','f2','f3'], 'woe_transform': True}
}]
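
The structural rules described in this section can be checked before a grid is submitted. The sketch below is one possible way to do that. The helper validate_grid is hypothetical and not part of AIF, and it checks only the rules stated here: the three allowed keys, params limited to a single model, and the feature_include / feature_exclude precedence.

```python
ALLOWED_KEYS = {'models', 'params', 'ctrls'}


def validate_grid(grid):
    """Return a list of problems found in a classifier-grid (hypothetical helper)."""
    problems = []
    for i, group in enumerate(grid):
        unknown = set(group) - ALLOWED_KEYS
        if unknown:
            problems.append(f"group {i}: unknown keys {sorted(unknown)}")

        models = group.get('models')
        if isinstance(models, str):
            models = [models]  # a single model name is also allowed
        if group.get('params') is not None and (models is None or len(models) != 1):
            problems.append(f"group {i}: 'params' requires exactly one model")

        ctrls = group.get('ctrls') or {}
        if (ctrls.get('feature_include') is not None
                and ctrls.get('feature_exclude') is not None):
            problems.append(
                f"group {i}: both feature_include and feature_exclude are set; "
                "only feature_include will be considered"
            )
    return problems


grid = [
    {'models': ['LR', 'RF']},
    {'models': ['xgb'],
     'params': {'n_estimators': (10, 20), 'max_depth': [5, 6]},
     'ctrls': {'feature_include': ['f1', 'f2', 'f3'], 'woe_transform': True}},
    # Invalid on purpose: params combined with two models.
    {'models': ['LR', 'GB'], 'params': {'max_depth': [3]}},
]

for problem in validate_grid(grid):
    print(problem)
```

A grid with three classifier-groups, as above, also shows how the same technique can appear in several dictionaries with different parameters or controls.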