Don’t Judge an Object by Its Context: Learning to Overcome Contextual Bias
Bias in machine learning → this paper is from Facebook → it targets contextual bias in feature representations → the goal is to decorrelate an object category from its co-occurring context.
And they found a way to reduce contextual bias.
A CNN is designed to capture context from a large-scale dataset → the above image shows a case where the model relies on the PERSON to classify the skateboard → those cases are contextual bias.
It is because a human was riding the skateboard → the model captured that co-occurrence as context, and this is not what we want.
Though it is challenging, it would be good to have a method that tells whether the model has learned those biases (but actually identifying them is hard). (A couple of other biased categories exist → microwave and more.)
Deploying these models in the real world is dangerous, not a good idea. (When an object usually appears with another one, we want the model to classify it correctly even when it is the only object in the image.)
And they make this happen by making the model look at the object itself in the image, rather than relying on the context (minimize the overlap between one class's activation map and another's).
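A minimal pure-Python sketch of that overlap idea. The function name `cam_overlap` and the toy maps are my own illustration; in the actual method the CAMs are differentiable tensors inside the network, so this only shows the arithmetic:

```python
# Sketch of an overlap penalty between two class activation maps.
# Maps are plain 2D lists of non-negative activations here; the real
# method computes this on differentiable CAM tensors during training.

def cam_overlap(cam_a, cam_b):
    """Spatial overlap between two CAMs: sum of element-wise minima,
    normalized by the total activation of cam_a."""
    overlap = 0.0
    total = 0.0
    for row_a, row_b in zip(cam_a, cam_b):
        for a, b in zip(row_a, row_b):
            overlap += min(a, b)
            total += a
    return overlap / total if total > 0 else 0.0

# Toy 2x2 maps: the "skateboard" CAM partly fires where the "person" CAM fires.
person_cam     = [[0.9, 0.1], [0.8, 0.0]]
skateboard_cam = [[0.9, 0.1], [0.0, 0.7]]

penalty = cam_overlap(skateboard_cam, person_cam)
```

Driving `penalty` toward zero would push the skateboard CAM off the person region, which is the intuition behind the training objective.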
And they were able to improve performance on those images. (A model that has a bias → hard to deploy in the real world and dangerous to use.)
In vision, context is a good thing → it gives us more features to use. (But there wasn't a way to fight those biases → to fix a model that relies on context too much.)
Minimizing the overlap between the Grad-CAM map and the object? → First, correctly identify when there is a problematic overlap. (How can we identify biased categories? → they made a mathematical argument about this.)
But basically, if one class's performance is lower when its co-occurring object is not present → there is a bias. (They were also able to capture that the bias is directional → it goes one way, from object A to B.)
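A rough sketch of that argument in plain Python. The exact metric in the paper may differ; here I use the ratio of the model's average confidence with vs. without the co-occurring object, and `bias_score` plus the threshold are my own illustration:

```python
def bias_score(scores_with_context, scores_alone):
    """Ratio of the model's average confidence for a category when its
    co-occurring object is present vs. absent. A ratio well above 1
    suggests recognition of the category depends on the context object.
    Note the direction: bias(A given B) != bias(B given A)."""
    avg_with = sum(scores_with_context) / len(scores_with_context)
    avg_alone = sum(scores_alone) / len(scores_alone)
    return avg_with / avg_alone

# Toy example: "skateboard" scores when a person is in the image vs. not.
with_person = [0.9, 0.85, 0.95]
alone       = [0.3, 0.25, 0.35]

ratio = bias_score(with_person, alone)
is_biased = ratio > 1.5   # threshold is an arbitrary illustration
```

Running the same score in both directions (skateboard given person vs. person given skateboard) is what makes the detected bias directional.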
There are two methods that they developed → and they are built on the Grad-CAM map.
Wow, this is actually very sweet → the written descriptions were hard to follow, but the figure tells another story → the method minimizes the overlap between classes' maps → in this work CAMs are used both as a visualization method and as a new variable to optimize.
Since the Class Activation Map is a good idea → and it is fully differentiable → we are able to inject it into the full training process.
This is a bit complicated → but in general, this makes the model learn which categories are biased and fix that. (The weighted loss → since we assume that images of the biased object appearing without its context are rare) → interesting.
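A sketch of what such a weighted loss could look like, in plain Python. The weights and the `weighted_bce` helper are illustrative assumptions, not the paper's exact values; the idea is just to upweight the rare "exclusive" images where the object appears without its usual context:

```python
import math

def weighted_bce(pred, label, weight):
    """Binary cross-entropy for one prediction, scaled by a per-example weight."""
    eps = 1e-7
    p = min(max(pred, eps), 1 - eps)
    loss = -(label * math.log(p) + (1 - label) * math.log(1 - p))
    return weight * loss

# Upweight the rare "exclusive" images (object without its usual context).
W_EXCLUSIVE = 5.0   # illustrative value, not from the paper
W_COOCCUR   = 1.0

batch = [
    # (prediction, label, appears_without_context)
    (0.2, 1.0, True),    # skateboard alone, model unsure -> large, upweighted loss
    (0.9, 1.0, False),   # skateboard with person, model confident
]

total = sum(
    weighted_bce(p, y, W_EXCLUSIVE if exclusive else W_COOCCUR)
    for p, y, exclusive in batch
)
```

Because exclusive images are rare, without the weighting the model could keep leaning on the context and still score well on average.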
They used a pre-trained model as a backbone → and there is a two-stage training procedure.