The User-Defined Transformation Function (UDTF) lets you define and implement your own calculations in addition to the ready-to-use transformations available in the application.
The following are the mandatory input parameters:
· X: The object of the class ofsaif.
· data: The behavioral frame.
· data.nb: The non-behavioral frame.
· Osot: The boolean flag that distinguishes whether the data is Model Build Data, OSOT Data, or New Production Data.
· addToStage2: The boolean flag that tells the prediction process whether to add the current output to Stage 2.
All ready-to-use transformations and User-Defined Transformation (UDT) functions are implemented in AIF using the following standard mechanism:
· The scikit-learn library provides a library of transformers that perform the following functions:
§ clean feature representations
§ reduce feature representations
§ expand feature representations
§ generate feature representations
· Like other estimators, these are represented by classes with a fit method, which learns model parameters (for example, mean and standard deviation for normalization) from a training set, and a transform method, which applies this transformation model to unseen data. The fit_transform method is a convenient and efficient way to model and transform the training data in a single step.
· Combining such transformers, either in parallel or in series, is covered in the scikit-learn section "Pipelines and composite estimators". The section "Pairwise metrics, Affinities and Kernels" covers transforming feature spaces into affinity matrices, while "Transforming the prediction target (y)" considers transformations of the target space (for example, categorical labels) for use in scikit-learn.
· All ready-to-use transformations and User-Defined Transformation (UDT) functions are implemented in AIF using this same standard mechanism.
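The fit, transform, and fit_transform pattern described above can be sketched with a standard scikit-learn transformer. This is a minimal illustration only: the column name balance and the sample values are hypothetical and do not come from the application, and StandardScaler is used simply as a familiar example of a transformer.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical training data and unseen data (illustrative values only).
train = pd.DataFrame({"balance": [100.0, 200.0, 300.0]})
unseen = pd.DataFrame({"balance": [400.0]})

scaler = StandardScaler()

# fit learns the model parameters (mean and standard deviation)
# from the training set.
scaler.fit(train)

# transform applies the learned parameters, both to the training
# data and to data that was not seen during fit.
train_scaled = scaler.transform(train)
unseen_scaled = scaler.transform(unseen)

# fit_transform learns and applies the parameters to the training
# data in a single call.
train_scaled_once = StandardScaler().fit_transform(train)
```

The unseen row is scaled with the mean and standard deviation learned from the training set, not with statistics of its own, which is why fit and transform are kept as separate steps.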
Some of the key guidelines that must be followed for User-Defined Transformation (UDT) functions are as follows:
· The User-Defined Transformation (UDT) source code must be a valid Python package. Runtime classes and functions are not supported.
· The implementation of the class methods (fit, transform, and/or predict) must always take a pandas data frame as the input for the independent features 'x'.
· The transform and/or predict method takes only the independent features 'x' as an input argument (that is, as a pandas data frame).
· The returned object from the transform and predict class methods must be a pandas data frame.
· The key_var variable must be a part of the return or resultant frame.
· Any other parameters or arguments required for the implementation can be passed as arguments to the class constructor or to the fit method.
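As a rough sketch of a class that follows the guidelines above, consider the example below. The class name MeanCenterTransformer, the columns argument, the column names, and the choice to pass key_var through the constructor are all assumptions made for illustration; this document does not specify how AIF supplies key_var or registers the class. In practice the class would also need to live in a module inside a valid Python package.

```python
import pandas as pd


class MeanCenterTransformer:
    """Hypothetical UDT that subtracts the training mean from chosen columns."""

    def __init__(self, key_var, columns):
        # Extra parameters are passed through the constructor (assumed names).
        self.key_var = key_var
        self.columns = list(columns)
        self.means_ = None

    def fit(self, x, y=None):
        # x is a pandas data frame holding the independent features.
        self.means_ = x[self.columns].mean()
        return self

    def transform(self, x):
        # transform takes only the independent features x.
        if self.means_ is None:
            raise RuntimeError("fit must be called before transform")
        # Keep key_var in the resultant frame.
        out = x[[self.key_var]].copy()
        for col in self.columns:
            out[col] = x[col] - self.means_[col]
        # The returned object is a pandas data frame.
        return out


# Illustrative usage with made-up data.
frame = pd.DataFrame({"account_id": [1, 2, 3], "balance": [10.0, 20.0, 30.0]})
udt = MeanCenterTransformer(key_var="account_id", columns=["balance"])
result = udt.fit(frame).transform(frame)
```

Here the resultant frame contains the account_id key column alongside the transformed balance column, so the output can still be joined back to the source records.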
For the detailed guidelines on dataset transformations, see https://scikit-learn.org/stable/data_transforms.html. Right-click to open this link in a new tab or window.