Source: Deep Learning on Medium
Google Open Sources Fairness Indicators to Help Build Fair Machine Learning Systems
The new toolset makes it possible to quantify fairness metrics for machine learning models
Ethics is one of the disciplines that must accompany the evolution of artificial intelligence (AI) systems. Building AI agents that produce ethical outcomes is one of the foundational challenges for the next decade of machine learning systems. Among the different aspects of ethical systems, fairness is one that deserves particular attention. A fair machine learning model is one whose outcomes don't favor any particular group based on a specific bias. Conceptually, the idea of fair machine learning systems seems incredibly intuitive, but how can we materialize it technically? After all, fairness is an ethical concept that regularly involves subjective opinions. Measuring fairness in machine learning models requires a quantitative definition of it. A few days ago, Google took some initial steps to address this challenge with the release of Fairness Indicators for TensorFlow.
The idea of quantifying fairness in a machine learning model is far from trivial. Bias can manifest itself across all stages of the machine learning lifecycle, compounding its impact on the final outcome of the model. To detect this unequal impact, evaluation over individual slices, or groups of users, is crucial, as overall metrics can obscure poor performance for certain groups.
Evaluating a model's performance across different slices of users is an interesting approach to assessing fairness. It is important to note that fairness cannot be achieved solely through metrics and measurement; high performance, even across slices, does not necessarily prove that a system is fair. However, this approach is a good starting point for identifying gaps in the model's performance across relevant groups of users.
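The point about aggregate metrics hiding per-group failures can be made concrete with a toy sketch in plain Python (this is an illustration of the slicing idea, not the Fairness Indicators API; the groups and predictions are made up):

```python
# Hypothetical labeled predictions, each tagged with a user group.
examples = [
    # (group, label, prediction)
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 1),
]

def accuracy(rows):
    """Fraction of rows where the prediction matches the label."""
    return sum(label == pred for _, label, pred in rows) / len(rows)

# Overall metric: 8 of 10 correct, which looks acceptable.
overall = accuracy(examples)

# Sliced metric: group the rows by user group and score each slice.
by_group = {}
for row in examples:
    by_group.setdefault(row[0], []).append(row)
sliced = {group: accuracy(rows) for group, rows in by_group.items()}
# group_a scores 1.0 while group_b scores 0.0 -- the aggregate hid a failing slice.
```

The 80% overall accuracy comes entirely from the majority slice; the minority slice is misclassified on every example, which is exactly the kind of disparity slice-level evaluation is meant to surface.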
Fairness Indicators is a suite of tools that enables computation and visualization of commonly-identified fairness metrics for classification models. It is relevant to highlight that this is not the first attempt to provide a fairness evaluation metrics for machine learning models but previous attempts have had trouble scaling when applied to large datasets. The current architecture of the Fairness Indicator tool suite allows it to evaluate models and datasets of any size.
From a functional standpoint, Fairness Indicators enables a core set of capabilities for evaluating fairness in machine learning models:
- Evaluate the distribution of datasets
- Evaluate model performance, sliced across defined groups of users
- Feel confident about your results with confidence intervals and evals at multiple thresholds
- Dive deep into individual slices to explore root causes and opportunities for improvement
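The first capability, checking the distribution of a dataset, can be sketched without any special tooling; a skewed representation of a sensitive group is often the earliest warning sign (the attribute values and counts below are hypothetical):

```python
from collections import Counter

# Hypothetical training set, reduced to a single sensitive attribute per record.
records = ["group_a"] * 900 + ["group_b"] * 100

counts = Counter(records)
proportions = {group: n / len(records) for group, n in counts.items()}
# group_a makes up 90% of the data and group_b only 10% --
# a skew worth flagging before training, since the model will see
# far fewer examples of the underrepresented group.
```

In the released toolchain this kind of check is what the dataset-analysis component automates at scale, across many features at once.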
To enable the aforementioned capabilities, Fairness Indicators computes confidence intervals, which can surface statistically significant disparities, and performs evaluation over multiple thresholds. In the UI, it is possible to toggle the baseline slice and investigate the performance of various other metrics. The user can also add their own metrics for visualization, specific to their use case.
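The two ideas in this paragraph, evaluating at multiple thresholds and attaching confidence intervals, can be illustrated with a small plain-Python sketch (again, not the Fairness Indicators API: the scores are synthetic and the interval is a simple percentile bootstrap):

```python
import random

random.seed(0)

def false_positive_rate(rows, threshold):
    """Share of true negatives scored at or above the decision threshold."""
    negative_scores = [score for label, score in rows if label == 0]
    if not negative_scores:
        return 0.0
    return sum(score >= threshold for score in negative_scores) / len(negative_scores)

# Synthetic (label, model score) pairs for one slice of users.
slice_rows = [(0, random.random() * 0.8) for _ in range(200)] + \
             [(1, 0.2 + random.random() * 0.8) for _ in range(200)]

# Evaluate the same metric at several thresholds rather than a single cutoff.
fpr_by_threshold = {t: false_positive_rate(slice_rows, t) for t in (0.3, 0.5, 0.7)}

# Percentile bootstrap interval for the FPR at threshold 0.5:
# resample the slice with replacement and take the middle 95% of estimates.
samples = sorted(
    false_positive_rate([random.choice(slice_rows) for _ in slice_rows], 0.5)
    for _ in range(1000)
)
low, high = samples[25], samples[974]
```

Comparing intervals like `(low, high)` between a slice and the baseline is what lets a disparity be called statistically significant rather than noise, and sweeping thresholds shows whether a gap persists across operating points or only at one cutoff.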
The current version of Fairness Indicators is optimized for the TensorFlow stack. Specifically, the release includes the following components:
- TensorFlow Data Validation (TFDV) [analyze the distribution of your dataset]
- TensorFlow Model Analysis (TFMA) [analyze model performance]
- Fairness Indicators [an addition to TFMA that adds fairness metrics and the ability to easily compare performance across slices]
- The What-If Tool (WIT) [an interactive visual interface designed to probe your models better]
One of the great capabilities of the Fairness Indicators release is its integration with the What-If Tool (WIT). Clicking on a bar in the Fairness Indicators graph loads those specific data points into the WIT widget for further inspection, comparison, and counterfactual analysis. This is particularly useful for large datasets, where Fairness Indicators can first identify problematic slices before the WIT is used for a deeper analysis.