Fairness with AutoMLx

by the Oracle AutoMLx Team


Fairness Demo notebook.

Copyright © 2025, Oracle and/or its affiliates.

Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/

Overview of this Notebook¶

In this notebook, we explore the fairness features of the AutoMLx package. We start by training an AutoML model on the Census Income dataset. Later, we provide examples of how to evaluate the fairness of the model and the dataset. We also explore how the provided explanation techniques may help us gain more insight into the fairness of our model.

Prerequisites¶

  • Experience level: Novice (Python and Machine Learning)
  • Professional experience: Some industry experience

Table of Contents¶

  • Preliminaries
  • Quick Start
    • Load the Data
    • Train and evaluate an AutoML model
    • Evaluate the Fairness of the Model by Computing Statistical Parity
    • Compute Fairness Feature Importance
    • Mitigating a Model's Unintended Bias
  • In Depth: The Census Income Dataset
  • Unintended Bias and Fairness
    • Overview of the Fairness Metrics
    • Unintended Bias Detection
      • Measure the Compliance of a Model with a Fairness Metric
        • Train a Model Using Scikit-learn
        • Train a Model Using AutoML
      • Measure the Compliance of the True Labels of a Dataset with a Fairness Metric
      • Other Fairness Metrics
  • Revealing Bias with Explainability
    • Initializing an MLExplainer
    • Model Explanations (Global Feature Importance)
    • Model Fairness Explanations (Fairness Feature Importance)
    • Global vs Fairness Feature Importance
  • Model Bias Mitigation
  • References

Preliminaries¶

Here we import some required libraries.

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2

Load the required modules.

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
import sklearn
from sklearn.metrics import roc_auc_score
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

# Settings for plots
plt.rcParams['figure.figsize'] = [10, 7]
plt.rcParams['font.size'] = 15

import automlx

Quick Start¶

Here, we give an overview of the key features. We first load the Census Income dataset, train a model on it, evaluate its accuracy and fairness, and compute its fairness feature importance. All of these steps will be revisited again, but with more detail through the rest of the notebook.

Load the Data¶

In [3]:
dataset = fetch_openml(name='adult', as_frame=True)
df, y = dataset.data, dataset.target

# Several of the columns are incorrectly labeled as category type in the original dataset
numeric_columns = ['age', 'capitalgain', 'capitalloss', 'hoursperweek']
for col in df.columns:
    if col in numeric_columns:
        df[col] = df[col].astype(int)


X_train, X_test, y_train, y_test = train_test_split(
    df, y.map({">50K": 1, "<=50K": 0}).astype(int), train_size=0.8, random_state=12345
)

X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, train_size=0.75, random_state=12345
)

X_train.shape, X_val.shape, X_test.shape
Out[3]:
((29304, 14), (9769, 14), (9769, 14))

Train and evaluate an AutoML model¶

The AutoML API is quite simple to work with. We create an instance of the Pipeline, and then pass the training data to its fit() method.

In [4]:
model = automlx.Pipeline(task='classification')
model.fit(X_train, y_train)
[2025-04-25 03:08:03,813] [automlx.backend] Overwriting ray session directory to /tmp/ct1h5q1c/ray, which will be deleted at engine shutdown. If you wish to retain ray logs, provide _temp_dir in ray_setup dict of engine_opts when initializing the AutoMLx engine.
[2025-04-25 03:08:07,973] [automlx.interface] Dataset shape: (29304,14)
[2025-04-25 03:08:12,868] [sanerec.autotuning.parameter] Hyperparameter epsilon autotune range is set to its validation range. This could lead to long training times
[2025-04-25 03:08:13,548] [sanerec.autotuning.parameter] Hyperparameter repeat_quality_threshold autotune range is set to its validation range. This could lead to long training times
[2025-04-25 03:08:13,561] [sanerec.autotuning.parameter] Hyperparameter scope autotune range is set to its validation range. This could lead to long training times
[2025-04-25 03:08:13,642] [automlx.data_transform] Running preprocessing. Number of features: 15
[2025-04-25 03:08:14,283] [automlx.data_transform] Preprocessing completed. Took 0.641 secs
[2025-04-25 03:08:14,310] [automlx.process] Running Model Generation
[2025-04-25 03:08:14,361] [automlx.process] KNeighborsClassifier is disabled. The KNeighborsClassifier model is only recommended for datasets with less than 10000 samples and 1000 features.
[2025-04-25 03:08:14,361] [automlx.process] SVC is disabled. The SVC model is only recommended for datasets with less than 10000 samples and 1000 features.
[2025-04-25 03:08:14,363] [automlx.process] Model Generation completed.
[2025-04-25 03:08:14,432] [automlx.model_selection] Running Model Selection
(run pid=2668571) [LightGBM] [Info] Number of positive: 2000, number of negative: 2000
(run pid=2668571) [LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000439 seconds.
(run pid=2668571) You can set `force_row_wise=true` to remove the overhead.
(run pid=2668571) And if memory is not enough, you can set `force_col_wise=true`.
(run pid=2668571) [LightGBM] [Info] Total Bins 391
(run pid=2668571) [LightGBM] [Info] Number of data points in the train set: 4000, number of used features: 15
(run pid=2668571) [LightGBM] [Info] [binary:BoostFromScore]: pavg=0.500000 -> initscore=0.000000
[2025-04-25 03:08:33,792] [automlx.model_selection] Model Selection completed - Took 19.360 sec - Selected models: [['XGBClassifier']]
[2025-04-25 03:08:33,820] [automlx.adaptive_sampling] Running Adaptive Sampling. Dataset shape: (29304,16).
[2025-04-25 03:08:35,976] [automlx.trials] Adaptive Sampling completed - Took 2.1557 sec.
[2025-04-25 03:08:36,067] [automlx.feature_selection] Starting feature ranking for XGBClassifier
[2025-04-25 03:08:43,777] [automlx.feature_selection] Feature Selection completed. Took 7.726 secs.
[2025-04-25 03:08:43,832] [automlx.trials] Running Model Tuning for ['XGBClassifier']
[2025-04-25 03:09:28,601] [automlx.trials] Best parameters for XGBClassifier: {'learning_rate': 0.10242113515453982, 'min_child_weight': 12, 'max_depth': 4, 'reg_alpha': 0, 'booster': 'gbtree', 'reg_lambda': 0.01878279410038923, 'n_estimators': 143, 'use_label_encoder': False}
[2025-04-25 03:09:28,602] [automlx.trials] Model Tuning completed. Took: 44.770 secs
[2025-04-25 03:09:35,259] [automlx.interface] Re-fitting pipeline
[2025-04-25 03:09:35,275] [automlx.final_fit] Skipping updating parameter seed, already fixed by FinalFit_4c481463-9
[2025-04-25 03:09:37,613] [automlx.interface] AutoMLx completed.
Out[4]:
<automlx._interface.classifier.AutoClassifier at 0x150a49d219a0>
In [5]:
y_proba = model.predict_proba(X_test)
score_original = roc_auc_score(y_test, y_proba[:, 1])

print(f'Score on test data: {score_original:.2f}')
Score on test data: 0.91

Evaluate the Fairness of the Model by Computing Statistical Parity¶

Among the several fairness metrics available in the AutoMLx package, we compute the statistical parity of the model on test data.

In [6]:
from automlx.fairness.metrics import ModelStatisticalParityScorer

fairness_score = ModelStatisticalParityScorer(protected_attributes='sex')
parity_test_model = fairness_score(model, X_test)
print(f'Statistical parity of the model on test data (lower is better): {parity_test_model:.2f}')
Statistical parity of the model on test data (lower is better): 0.18

Compute Fairness Feature Importance¶

Using the fairness feature importance, we can gain insight into which features contribute the most to the model's unfairness.

In [7]:
explainer = automlx.MLExplainer(model,
                               X_train,
                               y_train,
                               target_names=["<=50K", ">50K"],
                               task="classification")
In [8]:
fairness_exp = explainer.explain_model_fairness(protected_attributes='sex',
                                                scoring_metric='statistical_parity')
fairness_exp.show_in_notebook()

Mitigating a Model's Unintended Bias¶

AutoMLx provides a bias mitigation tool as well. We first need to initialize a ModelBiasMitigator. It requires a fitted model (the base estimator), the name of the protected attributes to use, a fairness metric, and an accuracy metric.

In [9]:
from automlx.fairness.bias_mitigation import ModelBiasMitigator

bias_mitigated_model = ModelBiasMitigator(
    model,
    protected_attribute_names="sex",
    fairness_metric="equalized_odds",
    accuracy_metric="balanced_accuracy",
    random_seed=12345,
)

The ModelBiasMitigator follows the usual scikit-learn interface; notably, it is trained with a single call to fit.

In [10]:
bias_mitigated_model.fit(X_val, y_val)
Out[10]:
<automlx.fairness.bias_mitigation._sklearn.ModelBiasMitigator at 0x150a8dbe6370>

The model can easily be used for inference.

In [11]:
bias_mitigated_model.predict(X_test)
Out[11]:
array([0, 1, 0, ..., 1, 0, 0])

We can also visualize all of the best models that were found by our approach using a single show_tradeoff call.

In [12]:
bias_mitigated_model.show_tradeoff(hide_inadmissible=False)

A summary of these models can be accessed as shown below.

In [13]:
bias_mitigated_model.tradeoff_summary_
Out[13]:
equalized_odds balanced_accuracy multiplier_sex=Female multiplier_sex=Male
0 0.006916 0.608927 0.111534 0.199131
1 0.023170 0.624593 0.193148 0.233259
2 0.028703 0.628152 0.256520 0.233259
3 0.036227 0.642396 0.329184 0.274094
4 0.046819 0.759193 1.441927 0.840396
5 0.052286 0.793150 3.661628 1.334403
6 0.095597 0.795917 1.767152 1.365998
7 0.097830 0.796030 1.700147 1.364635
8 0.129705 0.817235 8.367842 3.129173
9 0.151287 0.819858 5.442261 2.822292
10 0.184541 0.822028 4.626349 3.146395
11 0.216564 0.822988 3.661628 3.447235
12 0.237449 0.824125 2.028335 3.146395

Each of these models can be selected as the final bias-mitigated model and inference can be performed like before. For example, the model with index=1 can be selected as shown below.

In [14]:
bias_mitigated_model.select_model(1)

In Depth: The Census Income Dataset¶

We start by reading in the dataset from OpenML.

In [15]:
dataset = fetch_openml(name='adult', as_frame=True)
df, y = dataset.data, dataset.target

Let's look at a few of the values in the data.

In [16]:
df.head()
Out[16]:
age workclass fnlwgt education education-num marital-status occupation relationship race sex capitalgain capitalloss hoursperweek native-country
0 2 State-gov 77516.0 Bachelors 13.0 Never-married Adm-clerical Not-in-family White Male 1 0 2 United-States
1 3 Self-emp-not-inc 83311.0 Bachelors 13.0 Married-civ-spouse Exec-managerial Husband White Male 0 0 0 United-States
2 2 Private 215646.0 HS-grad 9.0 Divorced Handlers-cleaners Not-in-family White Male 0 0 2 United-States
3 3 Private 234721.0 11th 7.0 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 2 United-States
4 1 Private 338409.0 Bachelors 13.0 Married-civ-spouse Prof-specialty Wife Black Female 0 0 2 Cuba

The Adult dataset contains a mix of numerical and string data, making it a challenging problem to train machine learning models on.

We visualize the distribution of the target variable in the training data.

In [17]:
y_df = pd.DataFrame(y)
y_df.columns = ['income']

fig = px.histogram(y_df, x="income")
fig.show()

We now visualize the distribution of the target variable conditioned on values of sex. We can already see some biases in this dataset.

In [18]:
df1 = pd.concat([df, y_df], axis=1)

df1 = df1.groupby('sex')['income'].value_counts(normalize=True)
df1 = df1.mul(100).reset_index()
df1.columns = ['sex', 'income', 'percent']

fig = px.bar(df1, x="sex", y="percent", color="income", barmode="group")
fig.show()

We now separate the target (y) from the features (X) and split the data into training (60%), validation (20%), and test (20%) sets. The training set will be used to create a Machine Learning model using AutoML, the validation set will later be used to fit the bias mitigator, and the test set will be used to evaluate the model's performance on unseen data.

In [19]:
# Several of the columns are incorrectly labeled as category type in the original dataset
numeric_columns = ['age', 'capitalgain', 'capitalloss', 'hoursperweek']
for col in df.columns:
    if col in numeric_columns:
        df[col] = df[col].astype(int)


X_train, X_test, y_train, y_test = train_test_split(
    df, y.map({">50K": 1, "<=50K": 0}).astype(int), train_size=0.8, random_state=12345
)

X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, train_size=0.75, random_state=12345
)

X_train.shape, X_val.shape, X_test.shape
Out[19]:
((29304, 14), (9769, 14), (9769, 14))

Unintended Bias and Fairness¶

Protected attributes are features that may not be used as the basis for decisions (for example, race, gender, etc.). When machine learning is applied to decision-making processes involving humans, one should not only look for models with good performance, but also for models that do not discriminate against protected population subgroups.

Overview of the Fairness Metrics¶

We provide a table summarizing the fairness metrics in the AutoMLx package. Choosing the right fairness metric for a particular application is critical; it requires domain knowledge of the complete sociotechnical system. Moreover, different metrics bring in different perspectives and sometimes the data/model might need to be analyzed for multiple fairness metrics. Therefore, this choice is based on a combination of the domain, task at hand, societal impact of model predictions, policies and regulations, legal considerations, etc. and cannot be fully automated. However, we hope that the table below will help give some insights into which fairness metric is best for your application.

Machine learning models that decide outcomes affecting individuals can either be assistive or punitive. For example, a model that classifies whether or not a job applicant should be interviewed is assistive, because the model is screening for individuals that should receive a positive outcome. In contrast, a model that classifies loan applicants as high risk is punitive, because the model is screening for individuals that should receive a negative outcome. For models used in assistive applications, it is typically important to minimize false negatives (for example, to ensure individuals who deserve to be interviewed are interviewed), whereas in punitive applications, it is usually important to minimize false positives (for example, to avoid denying loans to individuals that have low credit risk). In the spirit of fairness, one should therefore aim to minimize the disparity in false negative rates across protected groups in assistive applications whilst minimizing the disparity in false positive rates for punitive applications. In the following table, we have classified each metric based on whether or not it is most appropriate for models used in assistive or punitive applications (or both). For further explanations, please refer to this book.

Metric Dataset Model Punitive Assistive Perfect score means
Consistency ✓ NA NA Neighbors (k-means) have the same labels
Smoothed EDF ✓ NA NA Sub-populations have equal probability of positive label (with log scaling of deviation)
Statistical Parity ✓ ✓ ✓ Sub-populations have equal probability of positive prediction
True Positive Rates ✓ ✓ Sub-populations have equal probability of positive prediction when their true label is positive
False Positive Rates ✓ ✓ Sub-populations have equal probability of positive prediction when their true label is negative
False Negative Rates ✓ ✓ Sub-populations have equal probability of negative prediction when their true label is positive
False Omission Rates ✓ ✓ Sub-populations have equal probability of a positive true label when their prediction is negative
False Discovery Rates ✓ ✓ Sub-populations have equal probability of a negative true label when their prediction is positive
Equalized Odds ✓ ✓ ✓ Sub-populations have equal true positive rate and equal false positive rate
Error Rates ✓ ✓ Sub-populations have equal probability of a false prediction
Theil Index ✓ ✓ Error rates are the same for sub-populations and whole population (deviations are measured using entropy).
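
To make the punitive/assistive distinction above concrete, the short sketch below computes each group's false positive and false negative rates by hand with scikit-learn, using the AutoML model trained in the Quick Start and the test split created earlier. It does not use the AutoMLx fairness API, which provides dedicated scorers for these rates.

import numpy as np
from sklearn.metrics import confusion_matrix

# Per-group error rates, computed by hand for illustration. Large gaps in FPR
# across groups matter most in punitive settings; large gaps in FNR matter
# most in assistive settings.
y_pred = np.asarray(model.predict(X_test))
for group in X_test['sex'].astype(str).unique():
    mask = (X_test['sex'].astype(str) == group).to_numpy()
    tn, fp, fn, tp = confusion_matrix(y_test[mask], y_pred[mask]).ravel()
    print(f"sex={group}: FPR={fp / (fp + tn):.3f}, FNR={fn / (fn + tp):.3f}")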

Unintended Bias Detection¶

The automlx.fairness.metrics module provides metrics dedicated to assessing whether the model's predictions and/or the true labels of a dataset comply with a particular fairness criterion. For this example, we will take a look at the statistical parity metric. This metric, also known as demographic parity, measures how much a protected group's rate of positive outcomes differs from that of the rest of the population. Such fairness metrics quantify disparities in outcomes or error rates across the demographic groups defined by the protected attributes. They should therefore be minimized to decrease discrepancies in model predictions with respect to specific groups of people, whereas traditional classification metrics such as accuracy are to be maximized.
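
As a quick illustration of this definition before using the dedicated scorers, the snippet below computes statistical parity for the sex attribute by hand (a minimal sketch using the AutoML model from the Quick Start, not the AutoMLx API):

import numpy as np

# Positive-prediction rate per group; the gap between the two is the
# statistical parity. The AutoMLx scorers below generalize this to multiple
# groups, distance measures and reductions.
y_pred = np.asarray(model.predict(X_test))
groups = X_test['sex'].astype(str).to_numpy()
rates = {g: y_pred[groups == g].mean() for g in np.unique(groups)}
print(rates)
print(f"Statistical parity (difference): {abs(rates['Male'] - rates['Female']):.2f}")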

Measure the Compliance of a Model with a Fairness Metric¶

In the context of the Adult Census Income dataset, if we want to measure fairness with respect to the sex attribute, statistical parity corresponds to the disparity between the model's rate of predicting a >50k income between men and women. Model fairness metrics are available as scikit-learn compatible scorers, taking in a list of protected_attributes at creation and then being called with a model, X, and y on which to measure fairness - note that the fairness features are not limited to AutoML models. By default, the fairness metric will measure the difference between a subgroup's outcome and that of the rest of the population, returning the mean disparity over all subgroups. These two options can be changed at the creation of the metric, using the distance_measure and reduction arguments, respectively.
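
For instance, here is a sketch of how these two options could be set ('diff' is the distance measure used later in this notebook; 'max' is an assumed valid reduction value, so please consult the automlx.fairness.metrics API reference for the exact set of supported values):

# Hypothetical configuration: compare each subgroup to the rest of the
# population with an absolute difference, and report the worst (maximum)
# disparity instead of the mean.
fairness_score_worst_case = ModelStatisticalParityScorer(
    protected_attributes='sex',
    distance_measure='diff',  # how subgroup outcomes are compared
    reduction='max',          # assumed value: aggregate with the max instead of the mean
)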

Train a Model Using Scikit-learn¶

We first train a simple sklearn random forest and then evaluate its performance and fairness.

In [20]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder

sklearn_model = sklearn.pipeline.Pipeline(
    steps=[("preprocessor", OneHotEncoder(handle_unknown="ignore")), ("classifier", RandomForestClassifier())]
)
sklearn_model.fit(X_train, y_train)
Out[20]:
Pipeline(steps=[('preprocessor', OneHotEncoder(handle_unknown='ignore')),
                ('classifier', RandomForestClassifier())])

We use the roc_auc_score scoring metric to evaluate the performance of this model on unseen data (X_test).

In [21]:
y_proba = sklearn_model.predict_proba(X_test)
score = roc_auc_score(y_test, y_proba[:, 1])

print(f'Score on test data: {score:.2f}')
Score on test data: 0.90

Now, we can also compute the statistical parity of the model on test data.

In [22]:
fairness_score = ModelStatisticalParityScorer(protected_attributes='sex')
parity_test_sklearn_model = fairness_score(sklearn_model, X_test)
print(f'Statistical parity of the sklearn model on test data (lower is better): {parity_test_sklearn_model:.2f}')
Statistical parity of the sklearn model on test data (lower is better): 0.19

Train a Model Using AutoML¶

The fairness metrics can also be used to score AutoML models, which is even easier, because AutoML handles all of the dataset pre-processing and selects the best learning algorithm.

In [23]:
model = automlx.Pipeline(task='classification')
model.fit(X_train, y_train)
[2025-04-25 03:11:35,769] [automlx.interface] Dataset shape: (29304,14)
[2025-04-25 03:11:35,874] [automlx.data_transform] Running preprocessing. Number of features: 15
[2025-04-25 03:11:36,280] [automlx.data_transform] Preprocessing completed. Took 0.406 secs
[2025-04-25 03:11:36,306] [automlx.process] Running Model Generation
[2025-04-25 03:11:36,357] [automlx.process] KNeighborsClassifier is disabled. The KNeighborsClassifier model is only recommended for datasets with less than 10000 samples and 1000 features.
[2025-04-25 03:11:36,357] [automlx.process] SVC is disabled. The SVC model is only recommended for datasets with less than 10000 samples and 1000 features.
[2025-04-25 03:11:36,358] [automlx.process] Model Generation completed.
[2025-04-25 03:11:36,430] [automlx.model_selection] Running Model Selection
[2025-04-25 03:11:53,842] [automlx.model_selection] Model Selection completed - Took 17.412 sec - Selected models: [['XGBClassifier']]
[2025-04-25 03:11:53,871] [automlx.adaptive_sampling] Running Adaptive Sampling. Dataset shape: (29304,16).
[2025-04-25 03:11:55,656] [automlx.trials] Adaptive Sampling completed - Took 1.7846 sec.
[2025-04-25 03:11:55,771] [automlx.feature_selection] Starting feature ranking for XGBClassifier
[2025-04-25 03:12:03,593] [automlx.feature_selection] Feature Selection completed. Took 7.839 secs.
[2025-04-25 03:12:03,647] [automlx.trials] Running Model Tuning for ['XGBClassifier']
[2025-04-25 03:12:49,388] [automlx.trials] Best parameters for XGBClassifier: {'learning_rate': 0.10242113515453982, 'min_child_weight': 12, 'max_depth': 4, 'reg_alpha': 0, 'booster': 'gbtree', 'reg_lambda': 0.01878279410038923, 'n_estimators': 143, 'use_label_encoder': False}
[2025-04-25 03:12:49,389] [automlx.trials] Model Tuning completed. Took: 45.742 secs
[2025-04-25 03:12:56,342] [automlx.interface] Re-fitting pipeline
[2025-04-25 03:12:56,357] [automlx.final_fit] Skipping updating parameter seed, already fixed by FinalFit_2070da66-2
[2025-04-25 03:12:58,079] [automlx.interface] AutoMLx completed.
Out[23]:
<automlx._interface.classifier.AutoClassifier at 0x1501d42f5f40>

Again, we use the roc_auc_score scoring metric to evaluate the performance of this model on unseen data (X_test).

In [24]:
y_proba = model.predict_proba(X_test)
score_original = roc_auc_score(y_test, y_proba[:, 1])

print(f'Score on test data: {score_original:.2f}')
Score on test data: 0.91

We now continue with this model to showcase the rest of the fairness features and metrics, though everything would work for a scikit-learn model as well.

In [25]:
fairness_score = ModelStatisticalParityScorer(protected_attributes='sex')
parity_test_model = fairness_score(model, X_test)
print(f'Statistical parity of the model on test data (lower is better): {parity_test_model:.2f}')
Statistical parity of the model on test data (lower is better): 0.18

Below is another way to visualize statistical parity: the difference between the bars corresponds to the statistical disparity.

In [26]:
y_pred = model.predict(X_train)

df_predict = X_train.copy()
df_predict['model prediction'] = y_pred

pred_per_sex = df_predict.groupby('sex')['model prediction'].mean().reset_index()
pred_per_sex = pred_per_sex.rename(columns={'model prediction': 'average prediction'})

fig = px.bar(pred_per_sex, x='sex', y='average prediction')
fig.show()

The gap between the bars is the model's statistical disparity on the training data. On the test data, we computed a statistical disparity of 0.18 above, meaning that among the two values of sex in the dataset, the model predicts the favorable outcome for one sex 18 percentage points more often than for the other.

Model fairness metrics are also available as functions taking as inputs y_true, y_pred and subgroups - though note that statistical parity, by definition, does not require the true labels.

In [27]:
from automlx.fairness.metrics import model_statistical_parity

y_pred = model.predict(X_test)
subgroups = X_test[['sex']]

parity_test_model = model_statistical_parity(y_pred=y_pred, subgroups=subgroups)
print(f'Statistical parity of the model on test data (lower is better): {parity_test_model:.2f}')
Statistical parity of the model on test data (lower is better): 0.18

Measure the Compliance of the True Labels of a Dataset with a Fairness Metric¶

Given a dataset with some ground truth labels, we can check whether those true labels satisfy a particular fairness metric of concern. In this context, statistical parity measures the disparity of positive label rates between subgroups and the rest of the population. Dataset fairness metrics are available as scikit-learn compatible scorers, taking in a list of protected_attributes at creation and then being called with a model, X and y on which to measure fairness, with model being an ignored and optional argument.

In [28]:
from automlx.fairness.metrics import DatasetStatisticalParityScorer

DSPS = DatasetStatisticalParityScorer(protected_attributes='sex')

parity_test_data = DSPS(X=X_test, y_true=y_test)

Dataset fairness metrics are also available as functions taking as inputs y_true and subgroups.

In [29]:
from automlx.fairness.metrics import dataset_statistical_parity

subgroups = X_test[['sex']]

parity_test_data = dataset_statistical_parity(y_test, subgroups)
print(f'Statistical parity of the test data (lower is better): {parity_test_data:.2f}')
Statistical parity of the test data (lower is better): 0.20

We can see here that the test set of the Adult Census Income dataset has a statistical parity with respect to sex of 0.20, meaning that the rate of >50K labels is 20 percentage points higher for men than for women. The model's statistical disparity on the same data (0.18) is of a comparable magnitude, showing that the model largely reproduces the unintended bias contained in the dataset; in general, a trained model can even amplify such bias.

In [30]:
fig = px.bar(
        pd.DataFrame({
            'Fairness  Type': ['Data Fairness', 'Model Fairness'],
            'Statistical Parity': [parity_test_data, parity_test_model],
        }),
        x='Fairness  Type',
        y='Statistical Parity',
)
fig.show()

Other Fairness Metrics¶

Statistical parity is only one of the many supported fairness metrics. As another example, we can compute Equalized Odds, which measures the disparity of a model’s true positive and false positive rates between different subgroups of the data based on demographic information/protected attributes.

In [31]:
from automlx.fairness.metrics import EqualizedOddsScorer

fairness_score = EqualizedOddsScorer(protected_attributes='sex', distance_measure='diff')
EO_original = fairness_score(model, X_test, y_test)
print(f'Equalized odds on test data (lower is better): {EO_original:.2f}')
Equalized odds on test data (lower is better): 0.08

We can also easily compute these fairness metrics on more than one protected attribute.

In [32]:
fairness_score = EqualizedOddsScorer(protected_attributes=['sex', 'race'], distance_measure='diff')
EO = fairness_score(model, X_test, y_test)
print(f'Equalized odds on test data (lower is better): {EO:.2f}')
Equalized odds on test data (lower is better): 0.29

Note that, unlike statistical parity, equalized odds cannot be computed on the dataset alone, since it depends on the model's output. However, we can compute other metrics on the dataset, such as Smoothed EDF, which is computed as the minimal exponential deviation of positive target ratios when comparing a subgroup to the rest of the population.

In [33]:
from automlx.fairness.metrics import smoothed_edf

subgroups = X_train[['race', 'sex']]
smoothed_edf_score = smoothed_edf(y_train, subgroups)
print(f'Smoothed EDF score on train data: {smoothed_edf_score:.2f}')
Smoothed EDF score on train data: 1.63
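
To make this definition more concrete, here is a rough hand computation of the unsmoothed deviation; the actual smoothed_edf metric applies additional smoothing, so the exact value differs.

import numpy as np

# For each (race, sex) subgroup, compare its positive-label rate to that of
# the rest of the population on a log scale, and take the largest deviation.
groups = X_train['race'].astype(str) + ' / ' + X_train['sex'].astype(str)
deviations = {}
for g in groups.unique():
    in_g = (groups == g).to_numpy()
    p_in, p_out = y_train[in_g].mean(), y_train[~in_g].mean()
    if p_in > 0 and p_out > 0:
        deviations[g] = abs(np.log(p_in) - np.log(p_out))
print(f"Largest unsmoothed deviation: {max(deviations.values()):.2f}")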

Revealing Bias with Explainability¶

For a variety of decision-making tasks, getting only a prediction as model output is not sufficient. A user may wish to know why the model outputs that prediction, or which data features are relevant to it. For that purpose, the Oracle AutoMLx solution defines the MLExplainer factory function, which can compute a variety of model explanations.

Initializing an MLExplainer¶

The MLExplainer object takes as arguments the trained model, the training data and ground-truth labels, as well as the task.

In [34]:
explainer = automlx.MLExplainer(model,
                               X_train,
                               y_train,
                               target_names=["<=50K", ">50K"],
                               task="classification")

Model Explanations (Global Feature Importance)¶

The notion of global feature importance intuitively measures how much the model's performance (relative to the provided train labels) would change if a given feature were dropped from the dataset, without retraining the model. This notion of feature importance considers each feature independently from all other features.
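
To build intuition for this notion, the following sketch approximates it by hand with scikit-learn's permutation importance, which measures the drop in score when a feature's values are shuffled rather than dropped. For simplicity it uses the scikit-learn pipeline trained earlier and the held-out test data; it is not the AutoMLx implementation.

from sklearn.inspection import permutation_importance

# Shuffle each feature in turn and measure the drop in ROC AUC, without retraining.
result = permutation_importance(
    sklearn_model, X_test, y_test, scoring="roc_auc", n_repeats=5, random_state=0
)
perm_importances = pd.Series(result.importances_mean, index=X_test.columns)
print(perm_importances.sort_values(ascending=False).head())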

In [35]:
global_exp = explainer.explain_model()

There are two options to show the explanation's results:

  • to_dataframe() will return a dataframe of the results.
  • show_in_notebook() will show the results as a bar plot.

The features are returned in decreasing order of importance.

In [36]:
global_exp.show_in_notebook()
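
The same explanation can also be retrieved as a dataframe, for example:

# Feature importance attributions as a dataframe, in decreasing order of importance.
global_exp.to_dataframe().head()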

Model Fairness Explanations (Fairness Feature Importance)¶

The Global Feature Importance attributions can be computed for fairness metrics using the explain_model_fairness() method, which provides 95% confidence intervals for each fairness feature importance attribution.

In [37]:
fairness_exp = explainer.explain_model_fairness(protected_attributes='sex',
                                                scoring_metric='statistical_parity')
fairness_exp.show_in_notebook()

Here, we see that marital-status is considered the feature that contributes most to the model's unfairness. Note that fairness feature importance has to be interpreted slightly differently from global feature importance: the most important features are the ones that contribute the most to making the model unfair. To highlight this difference, we compare the two types of explanations below.

Global vs Fairness Feature Importance¶

Let's plot a feature's fairness importance according to its global importance to highlight the difference between the two.

In [38]:
def compare(global_exp, fairness_exp):
    # Align the global and fairness feature importance tables on the feature name.
    dfg = global_exp.to_dataframe()
    dff = fairness_exp.to_dataframe()

    dfg = dfg.set_index('Feature')
    dff = dff.set_index('Feature')

    dfg.columns = [f'{col}_score' for col in dfg.columns]
    dff.columns = [f'{col}_fairness' for col in dff.columns]

    df = pd.concat([dfg, dff], axis=1)

    df = df.reset_index()

    # Rename the attributions and their confidence-interval bounds for both explanations.
    df.columns = ['Feature', 'Increases Accuracy', 'Upper-bound Accuracy', 'Lower-bound Accuracy',
                  'Decreases Fairness', 'Upper-bound Fairness', 'Lower-bound Fairness',]

    # One point per feature: global importance on the x-axis, fairness importance on the y-axis.
    fig = px.scatter(df, x="Increases Accuracy", y="Decreases Fairness", text="Feature", log_x=False, size_max=60)

    fig.update_traces(textposition='middle left')

    fig.update_layout(
        height=800,
        title_text='Global vs Fairness Feature Importance'
    )

    fig.show()
In [39]:
compare(global_exp, fairness_exp)

AutoMLx provides other explainers that can sometimes also reveal unintended biases that the model has learned. For more explainers, please refer to the OracleAutoMLx_Classification notebook.

Model Bias Mitigation¶

The AutoMLx package provides a bias mitigation algorithm that fine-tunes the decision thresholds of a trained model across demographic groups to compensate for the bias present in the original model.
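
To give a rough intuition of what such threshold fine-tuning means, here is a toy sketch (not the actual ModelBiasMitigator algorithm, which searches for good multipliers automatically): applying a per-group multiplier to the positive-class probability effectively shifts the decision threshold separately for each group.

import numpy as np

# Toy illustration only: rescale the positive-class probability with a
# per-group multiplier, then apply the usual 0.5 decision threshold.
def predict_with_group_multipliers(clf, X, protected_col, multipliers):
    proba = np.asarray(clf.predict_proba(X))[:, 1]
    scale = X[protected_col].astype(str).map(multipliers).to_numpy(dtype=float)
    return (proba * scale >= 0.5).astype(int)

# Multipliers chosen by hand for illustration; ModelBiasMitigator finds good
# values automatically and reports them in tradeoff_summary_.
y_pred_adjusted = predict_with_group_multipliers(model, X_test, 'sex',
                                                 {'Female': 1.2, 'Male': 0.9})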

First, we need to initialize a ModelBiasMitigator. It requires a base estimator, the name of the protected attributes to use, a fairness metric, and an accuracy metric. Many more options can be configured. For example, suppose you would like the bias-mitigated model to be constrained to not exceed an absolute value of the fairness metric; this can be configured as follows.

In [40]:
from automlx.fairness.bias_mitigation import ModelBiasMitigator

bias_mitigated_model = ModelBiasMitigator(
    model,
    protected_attribute_names="sex",
    fairness_metric="equalized_odds",
    accuracy_metric="balanced_accuracy",
    constraint_type="absolute",    # indicates a hard constraint
    constraint_value=0.1,          # The maximum allowed equalized odds score
    constraint_target="fairness",  # as opposed to accuracy
)

Similarly, suppose you would like the bias-mitigated model to be constrained to lose no more than 5% accuracy relative to the most accurate (but potentially unfair) model. The following shows how this would be configured; these are also the default options. Other common options, such as a time limit or a random seed, can also be specified.

In [41]:
bias_mitigated_model = ModelBiasMitigator(
    model,
    protected_attribute_names="sex",
    fairness_metric="equalized_odds",
    accuracy_metric="balanced_accuracy",
    constraint_type="relative",          # default
    constraint_value=0.05,               # default
    constraint_target="accuracy",        # default
    time_limit=50,
    n_trials_per_group=30,               # Number of different multiplying scalars to consider
    random_seed=12345,
)

The ModelBiasMitigator follows the usual scikit-learn interface; notably, it is trained with a single call to fit.

In [42]:
bias_mitigated_model.fit(X_val, y_val)
Out[42]:
<automlx.fairness.bias_mitigation._sklearn.ModelBiasMitigator at 0x1501cf1e32b0>

The fitted model can then be used to collect probabilities and labels like any usual model.

In [43]:
bias_mitigated_model.predict_proba(X_test)
Out[43]:
array([[0.9480151 , 0.05198494],
       [0.00739041, 0.99260956],
       [0.89531946, 0.1046806 ],
       ...,
       [0.5493545 , 0.45064548],
       [0.6450227 , 0.3549773 ],
       [0.98485196, 0.01514801]], dtype=float32)
In [44]:
bias_mitigated_model.predict(X_test)
Out[44]:
array([0, 1, 0, ..., 0, 0, 0])

A summary of the best models found can be seen in the tradeoff_summary_ dataframe.

In [45]:
bias_mitigated_model.tradeoff_summary_
Out[45]:
equalized_odds balanced_accuracy multiplier_sex=Female multiplier_sex=Male
0 0.023170 0.624593 0.193148 0.233259
1 0.028703 0.628152 0.256520 0.233259
2 0.036227 0.642396 0.329184 0.274094
3 0.054804 0.779418 2.391568 1.075606
4 0.100502 0.794733 1.552707 1.365998
5 0.129705 0.817235 8.367842 3.129173
6 0.157349 0.819466 7.183583 3.448481
7 0.216564 0.822988 3.661628 3.447235
8 0.237449 0.824125 2.028335 3.146395

We can also visualize all of the best models that were found by our approach using a single show_tradeoff call.

In [46]:
bias_mitigated_model.show_tradeoff(hide_inadmissible=False)

By default, the model retained and used for inference is the fairest one within a 5% accuracy drop relative to the most accurate model found by our approach. It is highlighted in red in the figure above. Note how the base estimator without bias mitigation is dominated by a number of the models found with bias mitigation: with little to no loss in accuracy, we obtain a model that is more than twice as fair!
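
As a rough hand-check of this default rule (assuming "within a 5% accuracy drop" is measured relative to the best balanced accuracy in the summary), we can filter the tradeoff summary ourselves:

# Keep models whose balanced accuracy is within 5% of the best, then take the
# fairest (lowest equalized odds) among them.
summary = bias_mitigated_model.tradeoff_summary_
admissible = summary[summary['balanced_accuracy'] >= 0.95 * summary['balanced_accuracy'].max()]
admissible.sort_values('equalized_odds').head(1)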

If we prefer a model with a different fairness and accuracy tradeoff, we can instead pick another model from the tradeoff plot above. The index needed to select a model can be obtained by hovering over individual points in the plot. We can also look up a model's index in the tradeoff_summary_ DataFrame. We can then select the model using the select_model method.

In [47]:
bias_mitigated_model.select_model(3)

We can run inference with this model, just like before.

In [48]:
bias_mitigated_model.predict(X_test)
Out[48]:
array([0, 1, 0, ..., 1, 0, 0])

References¶

  • More examples and details: http://automl.oraclecorp.com/
  • Oracle AutoML http://www.vldb.org/pvldb/vol13/p3166-yakovlev.pdf
  • scikit-learn https://scikit-learn.org/stable/
  • UCI https://archive.ics.uci.edu/ml/datasets/Adult
  • Big Data and Social Science https://textbook.coleridgeinitiative.org/