mlm_insights.core.metrics.conflict_metrics package

Submodules

mlm_insights.core.metrics.conflict_metrics.conflict_label module

class mlm_insights.core.metrics.conflict_metrics.conflict_label.ConflictLabel(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, lg_k: int = 11, input_list: ~typing.List[str] | None = None, target_column: str = 'y_true')

Bases: DatasetMetricBase

Computes the Conflict Label metric based on the input features and the target feature.
This metric calculates and returns the number of times the dataset has different label values for the same set of input features.
This is an approximate metric.
Internally, it uses a sketch data structure with a default K value of 2048 (K = 2^lg_k). It is a dataset-level, multivariate metric.
Supports all data types.

Configuration

lg_k: int, default=11
  • log2 of the maximum sketch size K (so the default K is 2^11 = 2048). The value must be between 4 and 26, inclusive.

Returns

  • Conflict Label count: int
    • Count of different label values for the same set of input features

Exceptions

  • MissingRequiredParameterException
    • Raised when features_metadata is missing while defining the metric metadata

    • Raised when the target column is missing from the metric metadata

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
from mlm_insights.core.metrics.conflict_metrics.conflict_label import ConflictLabel
import pandas as pd

input_schema = {
    'square_feet': FeatureType(
        data_type=DataType.FLOAT,
        variable_type=VariableType.CONTINUOUS,
        column_type=ColumnType.INPUT),
    'square_area': FeatureType(
        data_type=DataType.FLOAT,
        variable_type=VariableType.CONTINUOUS,
        column_type=ColumnType.INPUT),
    'house_price_prediction': FeatureType(
        data_type=DataType.FLOAT,
        variable_type=VariableType.CONTINUOUS,
        column_type=ColumnType.PREDICTION),
    'house_price_target': FeatureType(
        data_type=DataType.FLOAT,
        variable_type=VariableType.CONTINUOUS,
        column_type=ColumnType.TARGET)
}
data_frame = pd.DataFrame({'square_feet': [1, 2, 3, 1, 2, 1, 4, 2],
                           'square_area': [11, 21, 31, 41, 51, 11, 12, 21],
                           'house_price_target': [1, 9, 3, 5, 4, 11, 17, 18],
                           'house_price_prediction': [1, 2, 3, 1, 2, 6, 3, 5]})
metric_details = MetricDetail(univariate_metric={},
                              dataset_metrics=[MetricMetadata(klass=ConflictLabel)])

runner = InsightsBuilder(). \
    with_input_schema(input_schema). \
    with_data_frame(data_frame=data_frame). \
    with_metrics(metrics=metric_details). \
    with_engine(engine=EngineDetail(engine_name="native")). \
    build()

profile_json = runner.run().profile.to_json()
dataset_metrics = profile_json['dataset_metrics']
print(dataset_metrics["ConflictLabel"])

Returns the standard metric result as:
{
    'metric_name': 'ConflictLabel',
    'metric_description': 'Computes Conflict Label metric based on the input features and target feature. This metric calculates and returns the number of times dataset has different label values for the same set of input features. This is an approximate metric',
    'variable_count': 1,
    'variable_names': ['conflict_label_count'],
    'variable_types': [DISCRETE],
    'variable_dtypes': [INTEGER],
    'variable_dimensions': [0],
    'metric_data': [2],
    'metadata': {},
    'error': None
}
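
For the example data above, the reported count can be sanity-checked with an exact pandas computation. This is a hypothetical cross-check, assuming the count is the number of distinct input-feature combinations associated with more than one target value (the metric itself is sketch-based and therefore approximate):

# Hypothetical exact cross-check of the ConflictLabel count (not part of the library API)
exact_conflicts = (
    data_frame
    .groupby(['square_feet', 'square_area'])['house_price_target']
    .nunique()
    .gt(1)
    .sum()
)
print(exact_conflicts)  # 2: (1, 11) maps to targets {1, 11} and (2, 21) maps to {9, 18}
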
classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) ConflictLabel

Create Conflict Label metric using the configuration and kwargs

Parameters

config : Metric configuration
kwargs : Key value pairs for dynamic arguments. The current kwargs contain:

  • features_metadata: Contains input schema for each feature
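
A minimal hypothetical sketch of constructing the metric directly via create(); in normal usage the InsightsBuilder instantiates the metric, as in the Examples section above. It assumes features_metadata accepts the same feature-name-to-FeatureType mapping used as the input schema; the exact expected type may differ.

# Hypothetical direct construction (assumes features_metadata takes the input schema mapping)
conflict_label_metric = ConflictLabel.create(features_metadata=input_schema)
print(conflict_label_metric.lg_k)           # 11 by default
print(conflict_label_metric.target_column)  # 'y_true' by default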

get_required_shareable_dataset_components(**kwargs: Any) List[SDCMetaData]

Returns the Shareable Dataset Components (SDCs) that a Metric requires to compute its state and values. Metrics which do not require an SDC need not override this method.

Returns

List of SDCMetaData. Each SDCMetaData must contain the klass attribute which points to the SDC class

get_result(**kwargs: Any) Dict[str, Any]

Returns the number of times the dataset has different label values for the same set of input features.

Returns

Dict: Conflict Label metric of the dataset.

get_standard_metric_result(**kwargs: Any) StandardMetricResult

This method returns metric output in standard format.

Returns

StandardMetricResult

input_list: List[str] = None
lg_k: int = 11
merge(other_metric: ConflictLabel, **kwargs: Any) ConflictLabel

Merges two Conflict Label metrics into one, without mutating either.

Parameters

other_metric : ConflictLabel

The other Conflict Label metric to be merged.

Returns

ConflictLabel

A new instance of the Conflict Label metric
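
A conceptual, hypothetical sketch of the merge contract; in practice the engine computes partial metric states on separate data partitions and merges them. It reuses the assumed create() call from above.

# Hypothetical sketch of merging two partial metrics (inputs are not mutated)
metric_partition_1 = ConflictLabel.create(features_metadata=input_schema)
metric_partition_2 = ConflictLabel.create(features_metadata=input_schema)
merged_metric = metric_partition_1.merge(metric_partition_2)  # returns a new ConflictLabel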

target_column: str = 'y_true'

mlm_insights.core.metrics.conflict_metrics.conflict_prediction module

class mlm_insights.core.metrics.conflict_metrics.conflict_prediction.ConflictPrediction(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, lg_k: int = 11, input_list: ~typing.List[str] | None = None, prediction_column: str = 'y_predict')

Bases: DatasetMetricBase

Computes the Conflict Prediction metric based on the input features and the prediction feature.
This metric calculates the number of times the model gave conflicting output results for the same set of input features.
This is an approximate metric.
Internally, it uses a sketch data structure with a default K value of 2048 (K = 2^lg_k). It is a dataset-level, multivariate metric.
Supports all data types.

Configuration

lg_k: int, default=11
  • log2 of the maximum sketch size K (so the default K is 2^11 = 2048). The value must be between 4 and 26, inclusive.

Returns

  • Conflict Prediction count: int
    • Count of conflicting prediction values for the same set of input features

Exceptions

  • MissingRequiredParameterException
    • Raised when features_metadata is missing while defining the metric metadata

    • Raised when the prediction column is missing from the metric metadata

Examples

from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
from mlm_insights.core.metrics.conflict_metrics.conflict_prediction import ConflictPrediction
import pandas as pd

input_schema = {
    'square_feet': FeatureType(
        data_type=DataType.FLOAT,
        variable_type=VariableType.CONTINUOUS,
        column_type=ColumnType.INPUT),
    'square_area': FeatureType(
        data_type=DataType.FLOAT,
        variable_type=VariableType.CONTINUOUS,
        column_type=ColumnType.INPUT),
    'house_price_prediction': FeatureType(
        data_type=DataType.FLOAT,
        variable_type=VariableType.CONTINUOUS,
        column_type=ColumnType.PREDICTION),
    'house_price_target': FeatureType(
        data_type=DataType.FLOAT,
        variable_type=VariableType.CONTINUOUS,
        column_type=ColumnType.TARGET)
}
data_frame = pd.DataFrame({'square_feet': [1, 2, 3, 1, 2, 1, 4, 2],
                           'square_area': [11, 21, 31, 41, 51, 11, 12, 21],
                           'house_price_target': [1, 9, 3, 5, 4, 11, 17, 18],
                           'house_price_prediction': [1, 2, 3, 1, 2, 6, 3, 5]})
metric_details = MetricDetail(univariate_metric={},
                              dataset_metrics=[MetricMetadata(klass=ConflictPrediction)])

runner = InsightsBuilder(). \
    with_input_schema(input_schema). \
    with_data_frame(data_frame=data_frame). \
    with_metrics(metrics=metric_details). \
    with_engine(engine=EngineDetail(engine_name="native")). \
    build()

profile_json = runner.run().profile.to_json()
dataset_metrics = profile_json['dataset_metrics']
print(dataset_metrics["ConflictPrediction"])

Returns the standard metric result as:
{
    'metric_name': 'ConflictPrediction',
    'metric_description': 'Computes Conflict Prediction metric based on the input features and predict feature. This metric calculates the number of times a model gave conflicting output results for the same set of input features. This is an approximate metric.',
    'variable_count': 1,
    'variable_names': ['conflict_prediction_count'],
    'variable_types': [DISCRETE],
    'variable_dtypes': [INTEGER],
    'variable_dimensions': [0],
    'metric_data': [2],
    'metadata': {},
    'error': None
}
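
As with ConflictLabel, the reported count can be sanity-checked against an exact pandas computation for the example data. This is a hypothetical cross-check, assuming the count is the number of distinct input-feature combinations associated with more than one prediction value (the metric itself is sketch-based and therefore approximate):

# Hypothetical exact cross-check of the ConflictPrediction count (not part of the library API)
exact_conflicts = (
    data_frame
    .groupby(['square_feet', 'square_area'])['house_price_prediction']
    .nunique()
    .gt(1)
    .sum()
)
print(exact_conflicts)  # 2: (1, 11) maps to predictions {1, 6} and (2, 21) maps to {2, 5}
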
classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) ConflictPrediction

Create Conflict Prediction metric using the configuration and kwargs

Parameters

config : Metric configuration
kwargs : Key value pairs for dynamic arguments. The current kwargs contain:

  • features_metadata: Contains input schema for each feature
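
As with ConflictLabel, a minimal hypothetical sketch of direct construction via create(); the InsightsBuilder normally does this, and the expected features_metadata type is assumed to be the input schema mapping.

# Hypothetical direct construction (assumes features_metadata takes the input schema mapping)
conflict_prediction_metric = ConflictPrediction.create(features_metadata=input_schema)
print(conflict_prediction_metric.prediction_column)  # 'y_predict' by default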

get_required_shareable_dataset_components(**kwargs: Any) List[SDCMetaData]

Returns the Shareable Dataset Components (SDCs) that a Metric requires to compute its state and values. Metrics which do not require an SDC need not override this method.

Returns

List of SDCMetaData. Each SDCMetaData must contain the klass attribute which points to the SDC class

get_result(**kwargs: Any) Dict[str, Any]

Returns the number of times the model gave conflicting output results for the same set of input features.

Returns

Dict: Conflict Prediction metric of the dataset.

get_standard_metric_result(**kwargs: Any) StandardMetricResult

This method returns metric output in standard format.

Returns

StandardMetricResult

input_list: List[str] = None
lg_k: int = 11
merge(other_metric: ConflictPrediction, **kwargs: Any) ConflictPrediction

Merges two Conflict Prediction metrics into one, without mutating either.

Parameters

other_metric : ConflictPrediction

The other Conflict Prediction metric to be merged.

Returns

ConflictPrediction

A new instance of the Conflict Prediction metric

prediction_column: str = 'y_predict'

Module contents