Understanding and Reducing Bias in Machine Learning

Source: Deep Learning on Medium


‘.. even after the observation of the frequent or constant conjunction of objects, we have no reason to draw any inference concerning any object beyond those of which we have had experience’ — Hume, A Treatise of Human Nature

Bias in machine learning is defined as the phenomenon of observing results that are systematically prejudiced due to faulty assumptions. However, without assumptions, an algorithm would perform no better on a task than if the result were chosen at random, a principle formalized by Wolpert in 1996 into what we call the No Free Lunch theorem.

Mireille Hildebrandt, a lawyer and philosopher working at the intersection of law and technology, has written and spoken extensively on the issue of bias and fairness in machine learning algorithms. In an upcoming paper (Hildebrandt, 2019), she argues that bias-free machine learning doesn’t exist and that a productive bias is necessary for an algorithm to be able to model the data and make relevant predictions. The three major types of bias that can occur in a predictive system can be laid out as:

· Bias inherent in any action perception system (productive bias)

· Bias that some would qualify as unfair

· Bias that discriminates on the basis of prohibited legal grounds

Performance in machine learning is achieved via minimization of a cost function. Choosing a cost function, and therefore the search space and the possible values of the minimum, introduces what we refer to as productive bias into the system. Other sources of productive bias are the context, the purpose, the availability of adequate training and test data, and the optimization method used, as well as the trade-offs between speed, accuracy, overfitting and overgeneralizing, each choice carrying a corresponding cost. The assumption that machine learning is free of bias is therefore a false one: bias is a fundamental property of inductive learning systems. In addition, the training data is necessarily biased, and it is the function of research design to separate the bias that approximates the pattern in the data we set out to discover from the bias that is discriminatory or merely a computational artefact.

According to the No Free Lunch theorem (Wolpert & Macready, 1997), all classifiers have the same error rate when averaged over all possible data-generating distributions. A classifier must therefore have a certain bias towards certain distributions and functions in order to be better at modelling those distributions, which at the same time renders it worse at modelling other types of distributions.

Figure 1: Depiction of the No Free Lunch theorem, where better performance at a certain type of problem comes with the loss of generality (Fedden, 2017)
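A toy experiment makes the theorem concrete. The sketch below (an illustrative setup of my own, not from the talk) enumerates every possible boolean labelling of the eight 3-bit inputs, trains two very different learners on four of them, and tests on the other four. Averaged over all target functions, both learners score exactly 50%:

```python
from itertools import product

def hamming(a, b):
    # number of differing bits between two 3-bit inputs
    return bin(a ^ b).count("1")

TRAIN, TEST = range(4), range(4, 8)
acc_nn, acc_maj = [], []

# enumerate all 2^8 = 256 possible target functions f: {0..7} -> {0, 1}
for labels in product([0, 1], repeat=8):
    train = {i: labels[i] for i in TRAIN}
    # learner 1: copy the label of the nearest training input (Hamming distance)
    nn = lambda q: train[min(train, key=lambda i: hamming(i, q))]
    # learner 2: always predict the majority training label
    maj = 1 if sum(train.values()) >= 2 else 0
    acc_nn.append(sum(nn(q) == labels[q] for q in TEST) / 4)
    acc_maj.append(sum(maj == labels[q] for q in TEST) / 4)

print(sum(acc_nn) / 256, sum(acc_maj) / 256)  # both average exactly 0.5
```

Any inductive bias that helps on some of these 256 functions hurts on others by exactly as much; the nearest-neighbour learner only beats the majority learner if we restrict attention to "smooth" target functions.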

It should also be kept in mind that the data used to train the algorithm is finite, and therefore does not fully reflect reality. This too results in bias, arising from the choice of training and test data and how well they represent the true population. A further assumption we make is that the limited training data is sufficient for the model to accurately classify the test data.

“[…] most current theory of machine learning rests on the crucial assumption that the distribution of training examples is identical to the distribution of test examples. Despite our need to make this assumption in order to obtain theoretical results, it is important to keep in mind that this assumption must often be violated in practice.”
 — Tom Mitchell

Even if we do manage to rid our systems of the above-mentioned biases, there is still potential for bias to creep in over time: once algorithms are trained, perform well and are put into production, how is the algorithm updated? Who determines whether the system is still doing well two years later, and who gets to decide what it means for a system to do well?

Being aware of the different biases present in these systems requires the ability to explain and interpret how they work, echoing the transparency theme of Adrian Weller’s talk: if you cannot test it, you cannot contest it. We need to be clear about the productive bias that ensures the functionality of these systems, and identify the remaining unfairness in either the training set or the algorithms. That would allow us to infer the third and most serious kind of bias, the kind that results in discrimination on prohibited legal grounds.

Krishna Gummadi, head of the Networked Systems research group at the Max Planck Institute for Software Systems, has worked extensively on reducing discriminatory bias in classifiers, and his research in this area deals with discrimination from a computational perspective and develops algorithmic techniques to minimise it.

With machine learning systems becoming more ubiquitous in automated decision making, it is crucial that we make these systems sensitive to the type of bias that results in discrimination, especially discrimination on illegal grounds. Machine learning is already being used to make or assist decisions in domains such as recruiting (screening job applicants), banking (credit ratings and loan approvals), the judiciary (recidivism risk assessments), welfare (benefit eligibility) and journalism (news recommender systems). Given the scale and impact of these industries, it is crucial that we take measures to prevent unfair discrimination in them via legal as well as technical means.

To illustrate his techniques, Krishna Gummadi focussed on COMPAS, a controversial recidivism prediction tool developed by Northpointe Inc., the output of which is used all across the United States by judges in pretrial and sentencing decisions. The algorithm is based on a defendant’s responses to a 137-item questionnaire featuring questions on family history, residential neighbourhood, school performance and so on, and predicts their risk of committing crimes in the future.

The result of the algorithm was used by judges to decide the length and type of sentence while treating the input-output relationship as a black box. While some studies have shown that COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is no better at predicting recidivism risk than randomly recruited internet strangers, other studies have focussed on testing the algorithm’s treatment of different salient social groups.

Figure 2: Study by ProPublica finding that the algorithm misclassifies non-reoffending defendants as high risk at different rates for blacks and whites

A study by ProPublica (Larson et al., 2016) showed that the algorithm was nearly twice as likely to label black defendants who eventually did not reoffend as high risk compared to white defendants. This is given by the false positive rate (FP rate) for black defendants, which is 44.85 (i.e. 44.85 percent of black defendants who did not reoffend had been classified as high risk), compared to 23.45 for white defendants. However, Northpointe responded that, according to the measures they used, black and white defendants had the same misclassification rate.

Figure 3: Northpointe’s rebuttal that their scoring on a 10-point scale classified blacks and whites correctly at similar rates

It turns out that both parties are correct, because they use different measures of fairness. And no algorithm can perform equally well on both fairness measures if the base recidivism rates differ for blacks and whites; the two measures represent an inherent trade-off.
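A toy calculation (with invented confusion-matrix counts, not ProPublica’s actual figures) shows how both claims can hold at once when base rates differ:

```python
# hypothetical counts per group: true/false positives, false/true negatives
groups = {
    "A": {"tp": 600, "fp": 300, "fn": 200, "tn": 900},   # base rate 0.40
    "B": {"tp": 200, "fp": 100, "fn": 200, "tn": 1500},  # base rate 0.20
}

def ppv(g):
    # Northpointe-style calibration: of those labelled high risk, how many reoffend
    return g["tp"] / (g["tp"] + g["fp"])

def fpr(g):
    # ProPublica-style error rate: of those who did not reoffend, how many were labelled high risk
    return g["fp"] / (g["fp"] + g["tn"])

for name, g in groups.items():
    print(name, round(ppv(g), 3), round(fpr(g), 3))
# both groups have identical PPV (0.667), so the scores look equally "correct",
# yet the false positive rate is 0.25 for group A and 0.0625 for group B
```

With equal calibration (PPV) but different base rates, the false positive rates are forced apart by a factor of four; equalising one measure unavoidably breaks the other.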

We need to be able to understand the trade-offs of these fairness measures in order to make informed decisions about their discriminative capability. This is why we need a computational perspective. So let us first understand what constitutes discrimination from this perspective.

The normative definition used is:

to wrongfully impose a relative disadvantage on persons based on their membership in some salient social group, e.g. race or gender

How do we operationalise this definition and implement it in an algorithm? The above expression contains several fuzzy notions that require formalisation: ‘relative disadvantage’, ‘wrongful imposition’, ‘salient social group’ and so on. If we focus just on the ‘relative disadvantage’ component, we recover the first type of discrimination that can occur in an algorithm: disparate treatment.

Figure 4: Example of a binary classification problem where the algorithm learns whether a loan will be repaid based on m+1 features, of which z is a sensitive feature

Krishna used the above binary classification problem to illustrate the different types of discrimination. The task is to predict whether a loan will be repaid based on m+1 features, of which one is a sensitive feature, for example the client’s race.

Disparate Treatment:

This type of relative discrimination can be detected if a user’s predicted outcome changes when the sensitive feature is changed. In the above example, this would mean the algorithm predicting a positive label for returning a loan for a white person and a negative one for a black person, even if all other features are exactly the same. To prevent any dependence on race, we would need to remove the sensitive feature from the dataset.

We can formalise this as P(ŷ | x, z) = P(ŷ | x), i.e. the probability of the predicted output ŷ should not depend on or change with the sensitive feature z.
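As a sketch (with a made-up linear scorer, not any real credit model), disparate treatment can be detected by flipping the sensitive feature and checking whether any prediction changes:

```python
# hypothetical linear classifier: weights for two ordinary features plus the sensitive feature z
W, W_Z, B = [1.5, -0.8], 0.9, -0.5

def predict(x, z):
    score = sum(w * xi for w, xi in zip(W, x)) + W_Z * z + B
    return 1 if score > 0 else 0  # 1 = loan predicted to be repaid

applicants = [[0.2, 0.1], [0.9, 0.4], [0.1, 0.9]]

# disparate treatment exists if identical features get different outcomes under z=0 vs z=1
treats_differently = any(predict(x, 0) != predict(x, 1) for x in applicants)
print(treats_differently)  # True here, because the score depends directly on z
```

Dropping the sensitive feature (setting W_Z to zero) makes this test pass trivially, which is exactly why disparate treatment is the easiest of the three notions to eliminate.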

Disparate Impact:

This type of relative discrimination can be detected if there is a difference in the fraction of positive (or negative) outcomes for the different sensitive groups. In the above case this would happen if a larger percentage of black people than white people were classified as defaulters.

We formalise this requirement as P(ŷ = 1 | z = 1) = P(ŷ = 1 | z = 0), i.e. the probability of a positive label (returning the loan) for z = 1 (white clients) is the same as for z = 0 (black clients).

Disparate impact measures the level of indirect discrimination towards a group and is often seen in human decision making as well. Even if we remove disparate treatment by removing the sensitive feature, discrimination can still happen through other correlated features such as zip code. Measuring and correcting disparate impact makes sure this is corrected, and the requirement should be used when the training dataset is biased. It is, however, considered a controversial measure by many, notably by critics who hold that some scenarios cannot be freed from disproportionate outcomes.
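With invented predictions, disparate impact can be measured by comparing the fraction of positive outcomes per group; the “four-fifths rule” from US employment law (a ratio below 0.8 between the groups’ positive rates) is a commonly used threshold for flagging it:

```python
# hypothetical (prediction, sensitive feature z) pairs; z=1 is the advantaged group
predictions = [(1, 1), (1, 1), (0, 1), (1, 1), (1, 0), (0, 0), (0, 0), (1, 0), (0, 0), (0, 0)]

def positive_rate(preds, group):
    outcomes = [y for y, z in preds if z == group]
    return sum(outcomes) / len(outcomes)

p1 = positive_rate(predictions, 1)  # P(yhat=1 | z=1)
p0 = positive_rate(predictions, 0)  # P(yhat=1 | z=0)
ratio = min(p0, p1) / max(p0, p1)

print(p1, p0, ratio)
# group z=1 gets 75% positive outcomes, group z=0 only ~33%;
# the ratio (~0.44) falls well below the 0.8 four-fifths threshold
```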

Disparate Mistreatment:

This type of relative discrimination is detected by measuring the difference in the fraction of accurate outcomes for different sensitive groups. This is the discrimination ProPublica found in the Northpointe algorithm, which misclassified non-reoffending black defendants as reoffending at twice the rate of white defendants. We can correct this mistreatment by requiring the same proportion of accurate outcomes for all sensitive groups involved.

We formalise this requirement as P(ŷ ≠ y | z = 0) = P(ŷ ≠ y | z = 1), where z = 0, 1 represent the different sensitive groups.
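A minimal sketch of measuring disparate mistreatment on made-up records, comparing the group-wise false positive and false negative rates:

```python
# hypothetical (true label, predicted label, group z) triples
records = [
    (0, 1, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0),
    (0, 0, 1), (0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 1, 1),
]

def rate(records, group, true, pred):
    # among members of `group` whose true label is `true`,
    # the fraction predicted as `pred`
    relevant = [(y, yhat) for y, yhat, z in records if z == group and y == true]
    return sum(yhat == pred for _, yhat in relevant) / len(relevant)

for z in (0, 1):
    fpr = rate(records, z, true=0, pred=1)  # non-reoffender labelled reoffending
    fnr = rate(records, z, true=1, pred=0)  # reoffender labelled non-reoffending
    print(f"z={z}: FPR={fpr:.2f}  FNR={fnr:.2f}")
# group z=0 has FPR 0.67 vs 0.33 for z=1: disparate mistreatment
```

Note that this check needs ground-truth labels, which is why disparate mistreatment, unlike disparate impact, can only be audited once actual outcomes are known.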

Mechanisms for non-discriminatory Machine Learning:

Before formalising mechanisms to correct discrimination in algorithms, we need to take into account that, contrary to the running narrative, algorithms are not unbiased. Algorithms are objective compared to humans, but that does not make them fair; it just makes them objectively discriminatory. The job of an algorithm is to optimise a cost function so as to reach the best approximation of the actual function generating the output we want to predict. If the best function is one which classifies all members of a disadvantaged group as reoffending or unable to repay a loan, that is what the algorithm will select. Objective decisions can therefore be unfair and discriminatory.

To put it simply, algorithms learn a model for the output by approximating a function that takes features as input and produces a prediction. They choose the best parameters for this function as the ones that minimise the difference between the function’s outputs and the actual results. This is called an optimisation problem. We can add further constraints to this problem by requiring that the approximated function must also obey one or all of the above requirements to avoid discrimination. That is:

minimise the loss subject to P(ŷ ≠ y | z = 0) = P(ŷ ≠ y | z = 1), which requires the model to be approximated in such a way that the accuracy rates for all sensitive groups are the same. However, adding constraints results in a trade-off: a relatively fair algorithm at the cost of some accuracy. Krishna applied this constraint to the recidivism prediction algorithm and, at the cost of a little accuracy, obtained an algorithm with a similar rate of error for black and white defendants. He used constraints on the false positive rate (FPR), the probability that a non-reoffending defendant is classified as a recidivist, and the false negative rate (FNR), the probability that a future reoffending defendant is classified as non-reoffending. As we see in Figure 5, the difference in the FPR and FNR of black and white defendants approaches 0 as the constraints are tightened, with only a small loss in accuracy (~66.7% to ~65.8%).
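The rate constraints above are non-convex, so Gummadi’s group (Zafar et al.) relax them to a tractable proxy: bounding the covariance between the sensitive feature and the signed distance to the decision boundary. The sketch below is a minimal illustration of that idea on synthetic data, using a soft penalty term instead of a hard constraint; the data, hyperparameters and penalty form are my own assumptions, not the paper’s exact formulation:

```python
import math
import random

random.seed(0)

# toy data: the label depends on x1, and the sensitive feature z is correlated with x1
data = []
for _ in range(400):
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    y = 1 if x1 + 0.5 * random.gauss(0, 1) > 0 else 0
    z = 1 if x1 + random.gauss(0, 1) > 0 else 0
    data.append(([x1, x2], y, z))

n = len(data)
zbar = sum(z for _, _, z in data) / n

def boundary_cov(w, b):
    # covariance between z and the signed distance to the decision boundary
    return sum((z - zbar) * (w[0] * x[0] + w[1] * x[1] + b)
               for x, _, z in data) / n

def train(lam, steps=300, lr=0.5):
    # logistic regression minimising log-loss + lam * cov^2 by gradient descent
    w, b = [0.0, 0.0], 0.0
    for _ in range(steps):
        cov = boundary_cov(w, b)
        gw, gb = [0.0, 0.0], 0.0
        for x, y, z in data:
            err = 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b))) - y
            pen = 2 * lam * cov * (z - zbar)  # gradient of lam * cov^2
            gw[0] += (err + pen) * x[0] / n
            gw[1] += (err + pen) * x[1] / n
            gb += err / n  # the penalty's gradient w.r.t. b is zero
        w = [w[0] - lr * gw[0], w[1] - lr * gw[1]]
        b -= lr * gb
    return w, b

w0, b0 = train(lam=0.0)    # unconstrained classifier
wf, bf = train(lam=10.0)   # fairness-penalised classifier
print(abs(boundary_cov(w0, b0)), abs(boundary_cov(wf, bf)))
# the penalty shrinks the boundary covariance, trading some accuracy for fairness
```

Raising `lam` tightens the fairness requirement, reproducing in miniature the accuracy-fairness trade-off shown in the figure below.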

Figure 5: Correcting Disparate mistreatment for Recidivism prediction dataset.

In short, despite fears that algorithms will only serve to further entrench and propagate human biases, there is a significant effort by the AI community to avoid and correct discriminatory bias in algorithms while also making them more transparent. Furthermore, the GDPR formalises and structures these efforts, giving industries an incentive to follow best practices when storing and processing massive amounts of citizens’ personal data, and to prioritise fairness at the cost of some accuracy. Ultimately, algorithmic systems will be a reflection of the society they attempt to model, and it will take active efforts from government as well as the private sector to ensure that they do not merely entrench and exacerbate the inequalities inherent in our structures, but correct them through strict measures and constraints that penalise discrimination. This allows us to envision a society in which decision making is rid of the subjectivity of human bias, replaced by algorithmic decisions that are aware of their biases if not entirely free of them.

Works Cited

Ansaro. (2017, Oct 12). Interpreting Machine Learning Models. Retrieved from https://medium.com/ansaro-blog/interpreting-machine-learning-models-1234d735d6c9

Fedden, L. (2017). The No Free Lunch Theorem. Retrieved from Medium: https://medium.com/@LeonFedden/the-no-free-lunch-theorem-62ae2c3ed10c

Hildebrandt, M. (2019). Privacy as Protection of the Incomputable Self: From Agnostic to Agonistic Machine Learning. Forthcoming in Theoretical Inquiries in Law, 19(1).

Larson, J., Mattu, S., Kirchner, L., & Angwin, J. (2016, May). How We Analyzed the COMPAS Recidivism Algorithm. ProPublica. Retrieved from https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

Eykholt, K., Evtimov, I., et al. (2018). Robust Physical-World Attacks on Deep Learning Visual Classification. CVPR.

Langer, E. J., Blank, A., & Chanowitz, B. (1978). The Mindlessness of Ostensibly Thoughtful Action: The Role of “Placebic” Information in Interpersonal Interaction. Journal of Personality and Social Psychology, 36(6), 635–642.

Wolpert, D. H., & Macready, W. G. (1997). No Free Lunch Theorems for Optimization. IEEE Transactions on Evolutionary Computation, 1(1).

Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135–1144). ACM.

MIT News. (2018, Feb 11). Retrieved from http://news.mit.edu/2018/study-finds-gender-skin-type-bias-artificial-intelligence-systems-0212

Open AI Blog. (2017, Feb 24). Retrieved from https://blog.openai.com/adversarial-example-research/

Caruana, R., et al. (2015). Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-Day Readmission. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1721–1730). New York, USA.

Wired Inc. (2018, March 29). How coders are fighting bias in facial recognition software. Retrieved from https://www.wired.com/story/how-coders-are-fighting-bias-in-facial-recognition-software/