Original article was published by Luca Rossi on Artificial Intelligence on Medium
3. How Should We Address Training Bias and Discrimination?
Some people may think that machines are immune to human flaws like stereotypes, biases, and discrimination.
But the problem is that not only is AI prone to the same flaws, it can also exacerbate them.
What distinguishes AI from non-intelligent computer programs is the ability to learn. There are various ways for AI to learn, and the most common is by example. This is called Supervised Learning, and it basically works like this:
- A human operator trains the AI model on a training set.
This set contains labeled samples, for example N images of dogs associated with the label “dog” and M images of cats associated with the label “cat”. Those labels are manually provided by humans doing the most boring job ever.
- The model learns rules to associate each sample with the right label.
In that way, it tries to generalize so that the learned rules can be applied to new samples.
- The human operator tests the model on a test set.
This set contains samples whose labels are hidden from the model. The model uses the learned rules to guess the label of each sample, and the guesses are then compared with the true labels.
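The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not how a real image classifier works: the features (weight and ear length), the numbers, and the nearest-centroid "rule" are all assumptions made up for the example.

```python
from collections import defaultdict


def train(samples):
    """Learn one centroid (mean feature vector) per label from labeled samples."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(model, features):
    """Guess the label whose centroid is closest to the sample."""
    def sq_dist(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: sq_dist(model[label]))


def accuracy(model, samples):
    """Fraction of samples whose guessed label matches the true label."""
    correct = sum(predict(model, features) == label for features, label in samples)
    return correct / len(samples)


# Hypothetical features: (weight in kg, ear length in cm).
training_set = [
    ((30.0, 12.0), "dog"), ((25.0, 10.0), "dog"), ((35.0, 13.0), "dog"),
    ((4.0, 6.0), "cat"), ((5.0, 7.0), "cat"), ((3.5, 5.5), "cat"),
]
# The test labels are never shown to the model; they are only used for scoring.
test_set = [((28.0, 11.0), "dog"), ((4.5, 6.5), "cat")]

model = train(training_set)
test_accuracy = accuracy(model, test_set)
print(test_accuracy)
```

Note that the test labels only appear inside `accuracy`, where the guesses are scored. The model itself never sees them.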
This method works very well, especially with Deep Learning approaches.
The problem is that a seemingly good model can fail spectacularly on real-world tasks. This happens because the training set is usually not representative of the real world.
The training set is built by humans and contains all their human biases. When the AI model learns from this training set, it learns those biases as well.
There are some metrics to evaluate an AI model, but none of them can tell you if the model is unbiased, unless you test it in the real world.
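To see why a single metric can hide a problem, here is a small sketch that computes accuracy once over all samples and once per group. The groups "A" and "B", the labels, and the counts are invented for illustration, and the sketch assumes you actually have a group annotation for every sample, which is often not the case.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        correct[group] += int(truth == predicted)
    return {group: correct[group] / total[group] for group in total}


# Made-up evaluation results: 95 samples from group "A", only 5 from group "B".
records = (
    [("A", "benign", "benign")] * 94
    + [("A", "benign", "malignant")] * 1
    + [("B", "benign", "benign")] * 2
    + [("B", "benign", "malignant")] * 3
)

overall = sum(truth == predicted for _, truth, predicted in records) / len(records)
per_group = accuracy_by_group(records)

print(overall)    # aggregate accuracy over all 100 samples
print(per_group)  # accuracy broken down by group
```

In this made-up data the aggregate accuracy looks excellent, while the model is wrong more often than right on group "B". The breakdown only helps if the evaluation data contains the group in the first place.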
If you are a data scientist, this probably made you think about overfitting. But what I’m talking about is a bit different (especially if you consider that, in the statistical sense, overfitting is associated with low bias and high variance).
Overfitting happens when the model fails to generalize. The model learns “too well” from the training set and evaluation metrics on training are really good.
But these results are misleading: for example, the model can learn that all cats have whiskers, but if you show it a cat without whiskers, it will not recognize it as a cat.
Overfitting is usually easy to spot: just test the model on new data. If the results are much worse than on the training set, the model has probably overfitted.
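Here is a deliberately extreme sketch of that check. The "model" simply memorizes the training set, so it is perfect on data it has seen and poor on anything new. The features, the numbers, and the fallback guess are assumptions chosen for the example.

```python
def train_memorizer(samples):
    """An extreme overfitter: it stores every training sample verbatim."""
    return dict(samples)


def predict(model, features, default="dog"):
    # Anything not seen during training gets a fixed fallback guess.
    return model.get(features, default)


def accuracy(model, samples):
    """Fraction of samples whose guessed label matches the true label."""
    return sum(predict(model, f) == label for f, label in samples) / len(samples)


# Hypothetical features: (weight in kg, ear length in cm).
training_set = [
    ((30.0, 12.0), "dog"), ((25.0, 10.0), "dog"),
    ((4.0, 6.0), "cat"), ((5.0, 7.0), "cat"),
]
new_data = [
    ((28.0, 11.0), "dog"), ((4.5, 6.5), "cat"),
    ((3.0, 5.0), "cat"), ((6.0, 7.5), "cat"),
]

model = train_memorizer(training_set)
train_acc = accuracy(model, training_set)
test_acc = accuracy(model, new_data)
gap = train_acc - test_acc  # a large gap is the usual warning sign

print(train_acc, test_acc, gap)
```

A real model is rarely this blatant, but the signal is the same: very good numbers on the training set and clearly worse numbers on data the model has never seen.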
The best way to minimize overfitting is to make the training set as diverse as possible. Samples with the same label should not be too similar. For example, the training set should include different breeds of dogs (or cats) in different poses, with different colors, from different viewpoints.
When you test the model again on new data and results are good, it means that the model is good and can be used in real-world applications.
Let’s say that you trained a model to detect skin cancer. You made the training set diverse enough to minimize overfitting, and you got an accuracy of 98% on the test set.
You got excited and deployed the model into the real world. Some months later, your model has been taken down and you have been accused of making it racist.
You are confused. Obviously you are not racist. What are they talking about?
It turns out that both the training set and the test set contained only white people. You didn’t do it on purpose: you work in a mostly white community and only had access to images of white people. You didn’t even realize that you had ignored black people. It was an unconscious error.
Of course, when the model was shown a black person for the first time in the real world, it wrongly concluded that they were covered by a gigantic tumor.
This is not overfitting in the technical sense and it’s difficult to detect. It happens because the training set and test set are often taken in a similar context.
The test set is usually taken from a controlled environment that is different from the real world.
It’s easy to recognize when the training set doesn’t represent the real world: just use a different test set. But how do you recognize when the test set doesn’t represent the real world? There is no standard solution to this problem.
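There is no standard solution, but one partial sanity check is to compare the makeup of the test set with what you expect in the real world. The sketch below assumes two things that are often hard to get: a group annotation for every test sample, and a trustworthy reference distribution. The group names, the shares, and the `tolerance` threshold are all hypothetical.

```python
from collections import Counter


def group_shares(groups):
    """Share of each group among the annotated samples."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(dataset_groups, reference_shares, tolerance=0.5):
    """Groups whose share in the dataset is below tolerance * their reference share."""
    shares = group_shares(dataset_groups)
    return sorted(
        group
        for group, reference in reference_shares.items()
        if shares.get(group, 0.0) < tolerance * reference
    )


# Hypothetical reference distribution of skin tones among real-world users.
reference_shares = {"light": 0.60, "medium": 0.25, "dark": 0.15}
# Hypothetical annotations for a test set of 100 images.
test_set_groups = ["light"] * 97 + ["medium"] * 3

flagged = underrepresented(test_set_groups, reference_shares)
print(flagged)
```

A check like this only catches the biases you already thought to annotate. It says nothing about the dimensions nobody thought of, which is exactly why the problem is hard.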
The example I gave earlier isn’t too far-fetched. In 2015, Google Photos had to remove the “gorilla” label because two black people had been labeled as gorillas.
Most datasets are biased towards WEIRD (Western, Educated, Industrialized, Rich, Democratic) white men. There are countless examples of AI discrimination, like a camera biased against Asians (it thinks they are constantly blinking), a crime predictor biased against black people, or Amazon’s HR AI biased against women.
AI itself isn’t biased. Humans are, often subconsciously. We transmit our own biases to AI not because we are racist or sexist, but because we don’t realize that we are used to seeing the world from a partial perspective.
If I ask you to picture a person and you immediately think of a white guy, it doesn’t make you guilty. But the consequences can be disastrous, like in the examples we have just seen.
It gets worse.
Remember, in point #1, when we talked about inference? It is one thing to detect a bias against black people in the real world, but what happens when AI makes complex inferences that result in more subtle forms of discrimination?
Imagine that you apply for a job. There is another candidate that is just like you: same degree, same work experience, same soft skills, same personality. Oh, and you are both straight white good-looking WEIRD guys.
The HR AI has to decide whether you are good candidates. It doesn’t have to choose between the two of you, it can select one, both, or none.
The HR AI selects the other candidate and not you. It doesn’t explain why, since it’s a “black box” deep neural network. It just thinks that you are a bad candidate, and the other is a good one.
How is that possible? You two are the same. If he is really good, you should be good too. Or, if you are really bad, he should be bad as well.
Maybe the HR AI just doesn’t like you. Maybe it’s something in the way you draw butterflies, the color of your eyes, what you ate that morning, the way you smile, where you parked your car, that it just doesn’t like.
And there is no way to understand it easily.
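One crude way to probe a black box from the outside is to change a single feature that should not matter and see whether the decision flips. The sketch below uses a made-up stand-in for the HR model. The scoring rule, the feature names, and the threshold are purely hypothetical, and a real model would hide its quirks far better than this one does.

```python
def black_box_score(candidate):
    """Stand-in for an opaque model. Purely hypothetical: it secretly
    penalizes one parking spot, which should be irrelevant to hiring."""
    score = 0.1 * candidate["years_experience"]
    if candidate["degree"] == "MSc":
        score += 0.5
    if candidate["parking_spot"] == "B":
        score -= 0.6
    return score


def decide(candidate, threshold=0.8):
    """True means the candidate is judged a good one."""
    return black_box_score(candidate) >= threshold


def sensitive_features(candidate, irrelevant_values):
    """Change one supposedly irrelevant feature at a time and report
    which changes flip the decision."""
    baseline = decide(candidate)
    flipped = []
    for feature, alternatives in irrelevant_values.items():
        for value in alternatives:
            variant = {**candidate, feature: value}
            if decide(variant) != baseline:
                flipped.append((feature, value))
    return flipped


you = {"degree": "MSc", "years_experience": 5, "eye_color": "brown", "parking_spot": "B"}
other = {**you, "parking_spot": "A"}  # identical except for the parking spot

irrelevant_values = {"eye_color": ["blue", "green"], "parking_spot": ["A", "C"]}
flips = sensitive_features(you, irrelevant_values)

print(decide(you), decide(other))
print(flips)
```

This kind of probing only tests the features you think of trying, one at a time. It can reveal that something irrelevant matters, but it does not explain why.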
Can we solve the AI bias problem? Maybe it can’t be eliminated, but we could try to mitigate it.
One solution could be a more standardized process for dataset collection and documentation. This process should examine different types of bias, like those identified by Friedman and Nissenbaum: preexisting bias, technical bias, and emergent bias.
This could be aided by a new profession that may emerge in the near future: algorithm bias auditor.
That person would find biases in datasets, using their real-world experience, and in AI models, by “dissecting” black boxes.
Of all the points described here, this could be the one with the highest pervasiveness-to-negligence ratio: AI is everywhere, but we are doing very little to address the discrimination it can cause.