Source: Deep Learning on Medium

# DeepMind is Using This Old Technique to Evaluate Fairness in Machine Learning Models

One of the arguments regularly used in favor of machine learning systems is that they can arrive at decisions without being vulnerable to human subjectivity. However, that argument is only partially true. While machine learning systems don’t make decisions based on feelings or emotions, they do inherit a lot of human biases via their training datasets. Bias is relevant because it leads to unfairness. In the last few years, there has been a lot of progress in developing techniques that can mitigate the impact of bias and improve the fairness of machine learning systems. Recently, DeepMind published a research paper that proposes using an old statistical technique known as Causal Bayesian Networks (CBNs) to build fairer machine learning systems.

How can we define fairness in the context of machine learning systems? Humans often define fairness in terms of subjective criteria. In the context of machine learning models, fairness can be represented as the relationship between a sensitive attribute (race, gender…) and the output of the model. While directionally correct, that definition is incomplete, as it is impossible to evaluate fairness without considering how the data used by the model was generated. Most fairness definitions express properties of the model output with respect to sensitive information, without considering the relations among the relevant variables underlying the data-generation mechanism. As different relations would require a model to satisfy different properties in order to be fair, this could lead to erroneously classifying models that exhibit undesirable biases as fair, or models that exhibit legitimate biases as unfair. From that perspective, identifying unfair paths in the data-generation mechanism is as important as understanding the models themselves.

The other relevant point to understand about analyzing fairness in machine learning models is that its characteristics extend beyond technological constructs and typically involve sociological concepts. In that sense, visualizing the datasets is an essential step in identifying potential sources of bias and unfairness. Among the available frameworks, DeepMind relied on a method called Causal Bayesian Networks (CBNs) to represent and estimate unfairness in a dataset.

# Causal Bayesian Networks as a Visual Representation of Unfairness

Causal Bayesian Networks (CBNs) are a statistical technique used to represent causal relationships using a graph structure. Conceptually, a CBN is a graph formed by nodes representing random variables, connected by links denoting causal influence. The novelty of DeepMind’s approach was to use CBNs to model the influence of sensitive attributes in a dataset. By defining unfairness as the presence of a harmful influence from the sensitive attribute in the graph, CBNs provide a simple and intuitive visual representation for describing different possible unfairness scenarios underlying a dataset. In addition, CBNs provide us with a powerful quantitative tool to measure unfairness in a dataset and to help researchers develop techniques for addressing it.

A more formal mathematical definition of a CBN is a graph composed of nodes that represent individual variables linked by causal relationships. In a CBN structure, a path from node X to node Z is defined as a sequence of linked nodes starting at X and ending at Z. X is a cause of (has an influence on) Z if there exists a causal path from X to Z, namely a path whose links are pointing from the preceding nodes toward the following nodes in the sequence.

Let’s illustrate CBNs in the context of a well-known statistical case study. One of the most famous studies of bias and unfairness in statistics was published in 1975 by a group of researchers at the University of California, Berkeley. The study is based on a college admission scenario in which applicants are admitted based on qualifications Q, choice of department D, and gender G, and in which female applicants apply more often to certain departments (for simplicity’s sake, we consider gender as binary, but this is not a necessary restriction imposed by the framework). Modeling that scenario as a CBN with the admission decision A as the outcome, we obtain a graph containing the links G→A, G→D, D→A, and Q→A. In that graph, the path G→D→A is causal, whilst the path G→D→A←Q is non-causal, because its last link points against the direction of the sequence.
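To make the notion of a causal path concrete, here is a minimal sketch in plain Python. The edge set is an assumption reconstructed from the paths mentioned in this article (G→D→A, the direct path G→A, and A←Q), and the helper names `causal_paths` and `is_cause` are hypothetical, not taken from the DeepMind paper.

```python
# Assumed structure of the admission CBN: each key lists the nodes it points to.
# G = gender, D = department choice, Q = qualifications, A = admission decision.
EDGES = {
    "G": ["D", "A"],
    "D": ["A"],
    "Q": ["A"],
    "A": [],
}


def causal_paths(graph, source, target, prefix=None):
    """Return every directed path from source to target as a list of nodes.

    Only links pointing from the preceding node to the following node are
    followed, which is the definition of a causal path. The graph is assumed
    to be acyclic, as in a CBN.
    """
    prefix = (prefix or []) + [source]
    if source == target:
        return [prefix]
    paths = []
    for child in graph[source]:
        paths.extend(causal_paths(graph, child, target, prefix))
    return paths


def is_cause(graph, x, z):
    """X is a cause of Z if at least one causal path leads from X to Z."""
    return x != z and bool(causal_paths(graph, x, z))


print(causal_paths(EDGES, "G", "A"))  # [['G', 'D', 'A'], ['G', 'A']]
print(is_cause(EDGES, "G", "Q"))      # False
```

Under this assumed graph, G reaches A through two causal paths, one direct and one through D. The sequence G→D→A←Q never shows up as a path from G to Q, because the last link would have to be traversed against its direction.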

# CBNs and Unfairness

How can CBNs help to determine causal representations of unfairness in a dataset? Our college admission example showed how unfair relationships can be modeled as paths in a CBN. However, while a CBN can clearly capture unfairness along direct paths, the fairness of indirect causal relationships is highly dependent on contextual factors. For instance, consider the following three variations of our college admission example in which we can evaluate unfairness. In these examples, total or partial red paths are used to indicate unfair and partially unfair links, respectively.

The first example illustrates a scenario in which female applicants voluntarily apply to departments with low acceptance rates, and therefore the path G→D is considered fair.

Now, consider a variation of the previous example in which female applicants apply to departments with low acceptance rates due to systemic historical or cultural pressures, and therefore the path G→D is considered unfair (as a consequence, the path D→A becomes partially unfair).

Continuing with the contextual game, what would happen if our college lowered the admission rates for departments voluntarily chosen more often by women? In that case, the path G→D is considered fair, but the path D→A is partially unfair.

In all three examples, CBNs provided a visual framework for describing possible unfairness scenarios. However, the interpretation of the influence of unfair relationships is often dependent on contextual factors outside the CBN.

Until now, we have used CBNs to identify unfair relationships in a dataset, but what if we could measure them? It turns out that a small variation of our technique can be used to quantify unfairness in a dataset and to explore methods to alleviate it. The main idea for quantifying unfairness relies on introducing counterfactual scenarios that allow us to ask whether a specific input to the model was treated unfairly. In our scenario, a counterfactual model would allow us to ask whether a rejected female applicant (G=1, Q=q, D=d, A=0) would have obtained the same decision in a counterfactual world in which her gender were male along the direct path G→A. In this simple example, assuming that the admission decision is obtained as a deterministic function f of G, Q, and D, i.e., A = f(G, Q, D), this corresponds to asking whether f(G=0, Q=q, D=d) = 0, namely whether a male applicant with the same department choice and qualifications would also have been rejected.
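The counterfactual question above can be sketched in a few lines of Python. The admission rule `admit`, its thresholds, the department names, and the gender penalty are all invented for illustration; they are assumptions, not the rule studied in the Berkeley data or in the DeepMind paper. Only the comparison of f(G=1, Q=q, D=d) with f(G=0, Q=q, D=d) comes from the text.

```python
def admit(g, q, d):
    """Hypothetical deterministic admission rule A = f(G, Q, D).

    g: gender (1 = female, 0 = male), q: integer qualification score,
    d: department name. All numbers are made up for illustration.
    """
    threshold = {"dept_a": 60, "dept_b": 80}[d]
    if g == 1:
        # Assumed direct influence of G on A, added so the check has something to find.
        threshold += 5
    return int(q >= threshold)


def direct_path_counterfactual(f, g, q, d):
    """Flip G along the direct path G->A while holding Q and D fixed.

    Returns (factual decision, counterfactual decision).
    """
    return f(g, q, d), f(1 - g, q, d)


# A rejected female applicant: G=1, Q=82, D=dept_b, A=0.
factual, counterfactual = direct_path_counterfactual(admit, 1, 82, "dept_b")
print(factual, counterfactual)  # 0 1
```

Here the decision changes when only the gender is flipped, so under this made-up rule the applicant was affected by the direct path G→A. With a score of 70 the same check returns `(0, 0)`: a male applicant with the same qualifications and department choice would also have been rejected. Because D is held fixed, this sketch says nothing about influence flowing through the indirect path G→D→A.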

As machine learning becomes a more integral part of software applications, creating fair models will become increasingly important. The DeepMind paper shows that CBNs can offer a visual framework for detecting unfairness in a machine learning model as well as a tool for quantifying its influence. This type of technique could help us design machine learning models that represent the best of human values and that mitigate some of our biases.