Post Processor Component

Post processor components run actions after all the data has been processed and all the metrics have been calculated. They can serve different purposes, such as writing the metric results to storage, calling an external service, or integrating with other tools. The post processor interface is intentionally kept open-ended for this reason.

Post processor components only have access to the final output of the framework, for example the Profile and the Test Results, and do not have access to any of the input data. You can add multiple post processor components to a monitoring run through the builder, and all of them are run.

An is_critical flag can be passed to a post processor. When it is set to True, the Insights run is marked as failed if the post processor execution fails. By default, the flag is set to False.
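The effect of is_critical can be pictured with a small stand-alone sketch. It does not use the mlm_insights package: StubPostProcessor and run_post_processors are hypothetical names invented for illustration, and the choice to keep running the remaining post processors after a failure is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StubPostProcessor:
    """Hypothetical stand-in for a post processor; not an mlm_insights class."""
    name: str
    action: Callable[[], None]
    is_critical: bool = False  # mirrors the documented default


def run_post_processors(processors: List[StubPostProcessor]) -> dict:
    """Run every post processor and report whether the run should be marked failed."""
    run_failed = False
    errors = {}
    for processor in processors:
        try:
            processor.action()
        except Exception as exc:
            errors[processor.name] = str(exc)
            # Only a failing post processor with is_critical=True fails the run.
            if processor.is_critical:
                run_failed = True
    return {"run_failed": run_failed, "errors": errors}


def _fail() -> None:
    """Simulate a post processor whose execution fails."""
    raise RuntimeError("write failed")


non_critical = run_post_processors([StubPostProcessor("writer", _fail)])
critical = run_post_processors([StubPostProcessor("writer", _fail, is_critical=True)])
print(non_critical["run_failed"], critical["run_failed"])
```

In this sketch the same failure is recorded in both cases, but only the critical post processor changes the run status.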

Post Processor Type | Description
LocalWriterPostProcessor | Persists the Insights Profile and, optionally, the Insights Test Results to a user-provided local file system.
ObjectStorageWriterPostProcessor | Persists the Insights Profile and, optionally, the Insights Test Results to a user-provided OCI Object Storage bucket.
OCIMonitoringPostProcessor | Sends the Insights Test Results to the OCI Monitoring Service. This forms the basis for the “Monitoring Notifications” feature, which allows data scientists to continually monitor data and model health.

How to use

This section shows how to use the different post processors in ML Insights. Each post processor accepts the optional is_critical parameter, which is False by default. Setting it to True marks the Insights run as failed if the post processor fails.

LocalWriterPostProcessor

The Local Writer Post Processor stores the Insights Profile and the Insights Test Results (if available) in a user-provided local file location.

The Profile object is written in serialized format and can be deserialized using the profile unmarshall method. The Insights Test Results JSON object (if generated during the Insights run) is persisted as a JSON file.

The user needs to pass a valid file_name and file_location, and optionally test_results_file_name and is_critical, for this post processor.

Test Results are not persisted if the optional test_results_file_name parameter is not configured.
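The relationship between the required and optional parameters can be sketched with a small helper that builds one 'post_processors' configuration entry. The helper local_writer_config and the file names are hypothetical and are not part of ML Insights. The 'type' and 'params' keys follow the configuration format used in this document.

```python
from typing import Optional


def local_writer_config(file_name: str,
                        file_location: str,
                        test_results_file_name: Optional[str] = None,
                        is_critical: bool = False) -> dict:
    """Hypothetical helper: build one 'post_processors' entry for LocalWriterPostProcessor."""
    params = {
        "file_name": file_name,
        "file_location": file_location,
        "is_critical": is_critical,
    }
    # Omitting this key means the Test Results are not persisted.
    if test_results_file_name is not None:
        params["test_results_file_name"] = test_results_file_name
    return {"type": "LocalWriterPostProcessor", "params": params}


# Illustrative file names and location (assumptions, not defaults of the library).
profile_only = local_writer_config("profile.bin", "/tmp/insights")
with_results = local_writer_config("profile.bin", "/tmp/insights",
                                   test_results_file_name="test_results.json")
config = {"post_processors": [with_results]}
```

Here profile_only would persist only the Profile, while with_results also names a file for the Test Results.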

  1. How to configure the Local Writer Post Processor in ML Insights
    'post_processors': [
        {
            'type': 'LocalWriterPostProcessor',
            'params': {
                'file_name': '<FILE_NAME>',
                'file_location': '<LOCAL_FILE_LOCATION>',
                'is_critical': False,  # optional; set to True to fail the run if this post processor fails
                'test_results_file_name': '<LOCAL_TEST_RESULTS_FILE_NAME>.json'  # optional
            }
        }
    ]
    
  2. Import the relevant post processor class
    from mlm_insights.core.post_processors.local_writer_post_processor import LocalWriterPostProcessor

  3. Construct the post processor with the required parameters and add it to a list
    post_processor_list = [LocalWriterPostProcessor(file_name='<FILE_NAME>',
                                                    file_location='<LOCAL_FILE_LOCATION>',
                                                    test_results_file_name='<LOCAL_TEST_RESULTS_FILE_NAME>.json',
                                                    is_critical=False)]

  4. Pass the list to the builder object
    InsightsBuilder().with_post_processors(post_processors=post_processor_list)
    

ObjectStorageWriterPostProcessor

The Object Storage Writer Post Processor stores the Insights Profile and the Insights Test Results (if available) in a user-provided Object Storage location.

The Profile object is written in serialized format and can be deserialized using the profile unmarshall method. The Insights Test Results JSON object (if generated during the Insights run) is persisted as a JSON file.

The user needs to pass a valid namespace, bucket_name, prefix, and object_name, and optionally test_results_object_name, storage_options, and is_critical, for this post processor.

Test Results are not persisted if the optional test_results_object_name parameter is not configured.

  • storage_options: A dictionary containing optional configuration options for OCI Object Storage. Typically, this dictionary should contain OCI region and signer details.

    storage_options = {"region": region, "signer": signer}
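As a rough sketch of where the region and signer values might come from, the snippet below assembles the dictionary in the shape shown above. The helper names are hypothetical. The signer function assumes the OCI Python SDK (the oci package) is installed and that the code runs where resource principal authentication is available. Whether a given signer type is accepted by the post processor should be checked against the ML Insights documentation.

```python
from typing import Any, Dict


def build_storage_options(region: str, signer: Any) -> Dict[str, Any]:
    """Hypothetical helper: assemble storage_options in the shape shown above."""
    return {"region": region, "signer": signer}


def resource_principal_signer() -> Any:
    """Create a signer with the OCI Python SDK.

    Assumes the `oci` package is installed and resource principal
    authentication is available in the runtime environment.
    """
    import oci  # imported lazily so the sketch loads without the SDK

    return oci.auth.signers.get_resource_principals_signer()


# A placeholder object stands in for a real signer so the sketch runs anywhere.
# The region identifier is illustrative.
placeholder_signer = object()
storage_options = build_storage_options("us-ashburn-1", placeholder_signer)
print(sorted(storage_options))
```

In a real run, the placeholder would be replaced by the result of resource_principal_signer() or by another signer suited to the environment.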
    
  1. How to configure the Object Storage Writer Post Processor in ML Insights
    'post_processors': [
        {
            'type': 'ObjectStorageWriterPostProcessor',
            'params': {
                'namespace': '<NAMESPACE>',
                'bucket_name': '<BUCKET_NAME>',
                'prefix': '<PREFIX>',
                'object_name': '<OBJECT_NAME>',
                'test_results_object_name': '<TEST_RESULTS_OBJECT_NAME>',  # optional
                'storage_options': {  # optional
                    'key1': 'value1'
                },
                'is_critical': False  # optional; set to True to fail the run if this post processor fails
            }
        }
    ]
    
  2. Import the relevant post processor class
    from mlm_insights.core.post_processors.object_storage_writer_post_processor import ObjectStorageWriterPostProcessor

  3. Construct the post processor with the required parameters and add it to a list
    post_processor_list = [ObjectStorageWriterPostProcessor(namespace='<NAMESPACE>',
                                                            bucket_name='<BUCKET_NAME>',
                                                            prefix='<PREFIX>',
                                                            object_name='<OBJECT_NAME>',
                                                            test_results_object_name='<TEST_RESULTS_OBJECT_NAME>',
                                                            storage_options={
                                                                'key1': 'value1'
                                                            },
                                                            is_critical=False)]  # or True

  4. Pass the list to the builder object
    InsightsBuilder().with_post_processors(post_processors=post_processor_list)
    

OCIMonitoringPostProcessor

OCI Monitoring Post Processor is used for pushing the test suite results to OCI Monitoring Service.

The user needs to pass a valid compartment_id, and optionally a namespace (default: 'ml_monitoring'), custom dimensions as key-value pairs, and is_critical.

The following default dimensions are pushed to OCI Monitoring in addition to any custom dimensions that users specify:

Dimension Name | Example | Description
metric | Count | Name of the metric on which a test is configured
feature_name | Age | Name of the feature on which a test is configured (not specified for metrics computed on the entire dataset)
alias | TestIsComplete-completion-percentage-expected-90.0 | Parameter that specifies the test name and threshold value (keeps entries unique when the user specifies multiple tests with the same test name and different thresholds)
user-defined tags | severity-level | User-defined tags specified while configuring a test
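To illustrate how the default dimensions and custom dimensions could combine for a single test result, here is a minimal sketch. It is not the OCIMonitoringPostProcessor implementation. The function build_dimensions is hypothetical, the tag value 'high' is invented for the example, and the rule that custom dimensions win on a key collision is an assumption.

```python
from typing import Dict, Optional


def build_dimensions(metric: str,
                     alias: str,
                     feature_name: Optional[str] = None,
                     tags: Optional[Dict[str, str]] = None,
                     custom_dimensions: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Hypothetical helper: merge default and custom dimensions for one test result."""
    dimensions = {"metric": metric, "alias": alias}
    # feature_name is left out for metrics computed on the entire dataset.
    if feature_name is not None:
        dimensions["feature_name"] = feature_name
    # User-defined tags configured on the test become dimensions as well.
    dimensions.update(tags or {})
    # Assumption: custom dimensions are applied last and override on key collisions.
    dimensions.update(custom_dimensions or {})
    return dimensions


feature_level = build_dimensions(
    metric="Count",
    alias="TestIsComplete-completion-percentage-expected-90.0",
    feature_name="Age",
    tags={"severity-level": "high"},  # illustrative tag value
    custom_dimensions={"key1": "value1"},
)
dataset_level = build_dimensions(metric="Count", alias="example-alias")
print(feature_level)
```

The dataset_level result carries no feature_name key, matching the table's note about metrics computed on the entire dataset.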

The application using this post processor must have the appropriate IAM policies to push custom metrics to the OCI Monitoring Service. For more information about IAM policies, refer to OCI_Monitoring_Service_IAM_Policies.

  1. How to configure the OCI Monitoring Post Processor in ML Insights
    'post_processors': [
        {
            'type': 'OCIMonitoringPostProcessor',
            'params': {
                'compartment_id': '<COMPARTMENT_ID>',
                'namespace': '<NAMESPACE>',  # optional
                'is_critical': False,  # optional; set to True to fail the run if this post processor fails
                'dimensions': {  # optional
                    'key1': 'value1',
                    'key2': 'value2'
                }
            }
        }
    ]
    
  2. Import the relevant post processor class
    from mlm_insights.core.post_processors.oci_monitoring_post_processor import OCIMonitoringPostProcessor

  3. Construct the post processor with the required parameters and add it to a list
    post_processor_list = [OCIMonitoringPostProcessor(compartment_id='<COMPARTMENT_ID>',
                                                      namespace='<NAMESPACE>',
                                                      dimensions={'key1': 'value1'},
                                                      is_critical=False)]  # or True

  4. Pass the list to the builder object
    InsightsBuilder().with_post_processors(post_processors=post_processor_list)