Source: Deep Learning on Medium
DECO: Debiasing through Compositional Optimization of Machine Learning Models
In the last decade or so, there has been a growing number of applications of machine learning in domains that have a significant and direct impact on human life. In these domains, it is common to find that machine learning models end up learning societal biases and prejudices against certain classes or groups of humans.
The models end up biased because the training datasets in many of these applications are themselves biased. Bias in these datasets can be attributed to two different sources:
(1) Historical biases: Certain groups of people have historically been discriminated against, and the dataset can reflect that discrimination;
(2) Sampling biases: The data could be imbalanced across classes or groups of people (e.g., policing and arrest records across different neighborhoods).
Machine learning researchers have established that bias exists in many different forms. Accordingly, there are many different measures for quantifying bias and many different mitigation algorithms for decreasing one or more bias measures [1, 2, 3]. Some major challenges remain, however:
(1) Bias measures are application-dependent and context-dependent.
(2) Most debiasing algorithms are measure-specific.
(3) Debiasing may require retraining.
Retraining can be quite expensive in many real-world scenarios, where the same model might be used in different jurisdictions that have different fairness requirements.
We present a new approach for reducing bias that addresses some of these issues. In our approach, we treat debiasing as an optimization problem and apply optimization procedures to parts of an already trained machine learning model. For instance, given a trained neural network, we apply optimization to set the weights on one or more layers of the network such that the weights reduce some combination of bias measures without sacrificing performance. Through experimentation, we have found that optimizing just the last layer works well.
This approach has an advantage over other debiasing algorithms: it works with already trained models. We can also use it with any combination of bias and performance measures, and we can tune our objective to achieve varying amounts of bias reduction.
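To make the idea concrete, here is a minimal sketch of post-hoc optimization of a last layer. It is an illustration under stated assumptions, not the implementation behind the results in this post: the optimizer (simple random-search hill climbing), the objective (error rate plus `lam` times the absolute statistical parity difference), the synthetic data, and all function and parameter names are hypothetical choices made for this example. The earlier layers are treated as frozen, so `features` stands in for their output.

```python
import numpy as np


def predict(features, w, b):
    """Hard 0/1 predictions from a linear last layer on frozen features."""
    return (features @ w + b > 0).astype(int)


def objective(features, y, protected, w, b, lam):
    """Error rate plus lam * |statistical parity difference| (lower is better).

    Assumption: protected == 1 marks the privileged group, 0 the unprivileged one.
    """
    y_hat = predict(features, w, b)
    error = np.mean(y_hat != y)
    spd = y_hat[protected == 0].mean() - y_hat[protected == 1].mean()
    return error + lam * abs(spd)


def debias_last_layer(features, y, protected, w, b,
                      lam=1.0, steps=500, scale=0.1, seed=0):
    """Hill-climb the last-layer weights; all other layers stay untouched."""
    rng = np.random.default_rng(seed)
    best_w, best_b = w.copy(), float(b)
    best = objective(features, y, protected, best_w, best_b, lam)
    for _ in range(steps):
        # Propose a small random perturbation of the current best weights.
        cand_w = best_w + scale * rng.standard_normal(best_w.shape)
        cand_b = best_b + scale * rng.standard_normal()
        score = objective(features, y, protected, cand_w, cand_b, lam)
        if score < best:  # keep the candidate only if the objective improves
            best_w, best_b, best = cand_w, cand_b, score
    return best_w, best_b, best


# Synthetic stand-in for the frozen network's penultimate-layer output.
data_rng = np.random.default_rng(0)
n = 400
protected = data_rng.integers(0, 2, n)
features = data_rng.standard_normal((n, 3))
features[:, 0] += protected  # one feature leaks the protected attribute
y = (features[:, 0] + features[:, 1] > 0.5).astype(int)

w0, b0 = np.array([1.0, 1.0, 0.0]), -0.5  # the "trained" last layer
start = objective(features, y, protected, w0, b0, lam=1.0)
w1, b1, end = debias_last_layer(features, y, protected, w0, b0, lam=1.0)
print(f"objective before: {start:.3f}, after: {end:.3f}")
```

Because the search is gradient-free, the objective can combine any performance and bias measures, including non-differentiable ones. Raising or lowering `lam` shifts the trade-off between accuracy and bias reduction.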
We used the UCI census income dataset to test our approach. The following figures show how our method performs when different bias measures are targeted. First, we looked at reducing two different bias measures, Equal Opportunity Difference and Statistical Parity Difference, for two different protected attributes: race and sex. As the graphs show, our approach was able to successfully reduce each measure individually across the two attributes. For Equal Opportunity Difference, our method even uncovered models that had higher performance than the starting model. The accuracies shown here are test accuracies.
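For readers unfamiliar with the two measures, the sketch below shows one common way to compute them from binary predictions. The sign convention (unprivileged minus privileged) and the encoding of the protected attribute (1 for the privileged group, 0 for the unprivileged group) are assumptions for this example; the toy arrays are made up and are not drawn from the census dataset.

```python
import numpy as np


def statistical_parity_difference(y_pred, protected):
    """P(y_hat = 1 | unprivileged) - P(y_hat = 1 | privileged)."""
    y_pred, protected = np.asarray(y_pred), np.asarray(protected)
    return y_pred[protected == 0].mean() - y_pred[protected == 1].mean()


def equal_opportunity_difference(y_true, y_pred, protected):
    """True positive rate of the unprivileged group minus that of the privileged group."""
    y_true, y_pred, protected = map(np.asarray, (y_true, y_pred, protected))
    positives = y_true == 1  # only truly positive examples count toward TPR
    tpr_unpriv = y_pred[positives & (protected == 0)].mean()
    tpr_priv = y_pred[positives & (protected == 1)].mean()
    return tpr_unpriv - tpr_priv


# Made-up toy labels, predictions, and group membership.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
protected = [0, 0, 0, 0, 1, 1, 1, 1]

spd = statistical_parity_difference(y_pred, protected)
eod = equal_opportunity_difference(y_true, y_pred, protected)
print(spd, eod)
```

Under this convention a value of 0 means the two groups are treated identically by the measure, and a negative value means the unprivileged group receives favorable predictions less often.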