5 Tools to Evaluate Machine Learning Model Fairness and Bias

Source: Deep Learning on Medium

Introducing some tools to easily evaluate and audit machine learning models for fairness and bias

Evaluating machine learning models for bias is becoming an increasingly common focus across industries and among data researchers. Model fairness is a relatively new subfield of machine learning. Historically, the study of discrimination grew out of analyzing human-driven decisions and the rationale behind them. Now that we rely on predictive ML models to make decisions in industries such as insurance and banking, we need strategies to ensure those models are fair and to detect any discriminatory behaviour in their predictions.

As ML models get more complex, they become much harder to interpret. A predictive model is usually a black-box function that takes a certain input (x) and outputs a prediction (y). Let’s say an insurance company wants to use a predictive model to measure the risk of taking a client on board. The input (x) can consist of features or attributes such as race, age, gender, ethnicity, education level and income. The company can also use predictions based on those same attributes to decide whether a person should be asked to pay a higher premium. In the banking and financial industry in the United States, this may have legal implications: denying credit to qualified applicants on the basis of such attributes can violate the Equal Credit Opportunity Act (fair lending).

As predictive models are increasingly deployed to make decisions about access to services such as bank loans, credit or employment, it is now important to audit and interpret the output decisions of those models and to design for fairness from the early stages. In this article, I will discuss 5 tools that can be used to explore and audit the fairness of predictive models.

1- FairML

FairML is a toolbox written in Python for auditing machine learning models for fairness and bias. It offers an easy way to quantify the significance of a model’s inputs, using four input ranking algorithms to measure the model’s relative predictive dependence on each of them.
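To make the idea of "predictive dependence on an input" more concrete, here is a minimal sketch that shuffles one input column at a time and counts how often the prediction flips. It is not FairML's API and not one of its ranking algorithms, only a simplified stand-in for the general idea; the `predict` model, the feature names and the `dependence` helper are all hypothetical.

```python
import random

def predict(row):
    # Hypothetical black-box model: leans heavily on income, slightly on age,
    # and ignores the third feature entirely.
    income, age, noise = row
    return 1 if (0.8 * income + 0.2 * age) > 0.5 else 0

def dependence(predict_fn, rows, column, seed=0):
    """Fraction of predictions that flip when one column is shuffled."""
    rng = random.Random(seed)
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)
    flips = 0
    for row, new_value in zip(rows, shuffled):
        perturbed = list(row)
        perturbed[column] = new_value
        if predict_fn(perturbed) != predict_fn(row):
            flips += 1
    return flips / len(rows)

# Synthetic data: 500 rows of three features drawn uniformly from [0, 1).
rng = random.Random(42)
data = [[rng.random(), rng.random(), rng.random()] for _ in range(500)]
names = ["income", "age", "noise"]

scores = {name: dependence(predict, data, i) for i, name in enumerate(names)}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

An input the model never looks at gets a score of zero, while the inputs it relies on most rank highest. If a protected attribute such as race or gender ranks high in an audit like this, that is a signal to investigate further.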

For installation and demo code, you can refer to the main GitHub repo for the library.

2- Lime

Lime is an open source project that aims at explaining and interpreting how machine learning models work. The great thing about Lime is the broad range of machine learning and deep learning models it supports. It can interpret text classification, multi-class classification, image classification and regression models.

Here is a quick demonstration of using Lime to understand and explore the criteria behind a model's predictions. In this example, I trained a text classifier on the 20 Newsgroups dataset. The text below is classified as belonging to the comp.windows.x category.

Now, let’s use Lime to explore which specific words in this text have the most weight in this decision.

As you can see, the words client and application have the most weight. Let’s try a little experiment. I am going to remove those two words from the above text, put another word in their place and predict again.

You can see that the word space introduced bias and completely changed the prediction although the context of the text is still the same.
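The experiment above can be imitated with a much simpler technique than the one Lime uses. The sketch below removes one word at a time and records how much the predicted probability drops, whereas Lime itself samples many perturbed versions of the text and fits a local surrogate model to them. The keyword weights, the `predict_proba` function and the sample sentences are hypothetical stand-ins for my trained classifier and the original text.

```python
import math

# Hypothetical keyword weights standing in for a trained text classifier.
WEIGHTS = {"client": 1.5, "application": 1.2, "server": 0.8, "space": -2.0}

def predict_proba(text):
    """Toy probability that the text belongs to the comp.windows.x class."""
    score = sum(WEIGHTS.get(word, 0.0) for word in text.lower().split())
    return 1.0 / (1.0 + math.exp(-score))

def word_importance(text):
    """Drop in predicted probability when each distinct word is removed."""
    words = text.split()
    base = predict_proba(text)
    importance = {}
    for word in set(words):
        reduced = " ".join(w for w in words if w != word)
        importance[word] = base - predict_proba(reduced)
    # Largest absolute change first.
    return sorted(importance.items(), key=lambda kv: abs(kv[1]), reverse=True)

text = "the client application connects to the server"
ranking = word_importance(text)
for word, delta in ranking[:3]:
    print(f"{word}: {delta:+.3f}")

# Swap the two most influential words for another word and predict again.
swapped = "the space connects to the server"
print(predict_proba(text) > 0.5, predict_proba(swapped) > 0.5)
```

In this toy setup, client and application come out on top, and replacing them with space flips the prediction, which mirrors the behaviour described above.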

To understand how the code works, you can refer to the code repository and the example code in the link below:

3- IBM AI Fairness 360

AI Fairness 360 is an open-source library that detects and mitigates bias in machine learning models using a range of bias mitigation algorithms. The library is very comprehensive and offers a large set of metrics for evaluating bias.
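To give a feel for what such metrics measure, here is a minimal sketch that computes two common group fairness metrics by hand: statistical parity difference and disparate impact. It does not call the AI Fairness 360 library; the loan decisions and group labels are made-up data, and the helper names are my own.

```python
def selection_rate(decisions):
    """Share of favourable outcomes (1 = approved) in a group."""
    return sum(decisions) / len(decisions)

# Hypothetical loan decisions for two groups (1 = approved, 0 = denied).
privileged = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
unprivileged = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

rate_priv = selection_rate(privileged)
rate_unpriv = selection_rate(unprivileged)

# Statistical parity difference: 0 means both groups are approved equally often.
statistical_parity_difference = rate_unpriv - rate_priv

# Disparate impact: ratio of approval rates, 1 means parity.
disparate_impact = rate_unpriv / rate_priv

print(f"approval rate (privileged):   {rate_priv:.2f}")
print(f"approval rate (unprivileged): {rate_unpriv:.2f}")
print(f"statistical parity difference: {statistical_parity_difference:.2f}")
print(f"disparate impact: {disparate_impact:.2f}")
```

A negative statistical parity difference, or a disparate impact well below 1, means the unprivileged group receives the favourable outcome less often. A ratio below 0.8 is a commonly used warning threshold (the "four-fifths rule").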

They created an Interactive Experience in which you can see the metrics and test the capabilities.

They have also created Guidance Material that explains which metrics are appropriate for which use case.

This library relies on a bunch of Bias mitigation algorithms such as:

and many more.


4- SHAP

SHAP can explain the output of any machine learning model by connecting game theory with local explanations. It uses some beautiful JS visualizations to present the explanations.
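The game theory concept behind SHAP is the Shapley value, which splits a prediction fairly among the features that contributed to it. Here is a minimal sketch that computes exact Shapley values for a tiny hypothetical model by enumerating every feature subset. It does not use the SHAP library, and it assumes an absent feature simply contributes nothing, which is a simplification I chose for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating all subsets of the other features."""
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                present = set(subset)
                # Weight of this subset in the Shapley average.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                marginal = value_fn(present | {feature}) - value_fn(present)
                total += weight * marginal
        result[feature] = total
    return result

def model_output(present):
    # Hypothetical risk score: additive effects plus one interaction term.
    score = 0.0
    if "income" in present:
        score += 30
    if "age" in present:
        score += 10
    if "income" in present and "education" in present:
        score += 20
    return score

features = ["income", "age", "education"]
values = shapley_values(model_output, features)
for name, value in values.items():
    print(f"{name}: {value:.2f}")
```

The interaction bonus between income and education is split evenly between the two, and the values add up to the difference between the full prediction and the empty baseline. Enumerating all subsets grows exponentially with the number of features, so this brute-force approach only works for very small examples.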

For detailed explanation and guidance on how it works, refer to the link below

5- Google What-If Tool

Smile Detection Example

Google What-If Tool (WIT) is a TensorBoard plugin for understanding black-box classification or regression ML models. It has multiple demos, an interactive experience and comprehensive documentation.

In conclusion, bias in ML models comes either from bias in the people annotating the data or from the data itself, whether through skewness, missing features or some other cause that needs to be picked up and investigated. Failing to capture those features and to make the training data representative can result in model bias. Biased machine learning models may make unfair decisions, which in turn affect the end users. It is therefore really important that all stakeholders focus on detecting any bias present in the models they develop.