Computer vision differs from human sight. It can be deceived and it can make mistakes, so who is responsible for them?
Imagine an intelligent camera capable of detecting a dangerous terrorist in a crowd: an artificial intelligence algorithm identifies a series of clues (patterns) that individually would hardly be informative but that, taken together, clearly signal that an individual is dangerous. The algorithm assesses how the man is dressed (military trousers, a heavy jacket with lots of pockets) and assigns scores to the circumspect way he walks and to his aggressive facial expression. The police operator does not know how the Artificial Intelligence (AI) arrived at its judgment. He only knows that the algorithm has produced a danger score above the alert threshold, and he intervenes by placing the suspect under close observation.
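To make the idea of "individually weak clues that together cross a threshold" concrete, here is a deliberately simplified sketch. The cue names, weights and threshold are all hypothetical, invented for illustration; a real system would not use a hand-written table like this.

```python
# Hypothetical cue weights: illustrative values, not taken from any real system.
CUE_WEIGHTS = {
    "military_trousers": 0.15,
    "heavy_jacket_many_pockets": 0.20,
    "circumspect_walk": 0.30,
    "aggressive_expression": 0.25,
}

# Assumed alert threshold above which the operator is notified.
ALERT_THRESHOLD = 0.6


def danger_score(cues):
    """Combine cue strengths (each between 0 and 1) into one weighted score."""
    return sum(CUE_WEIGHTS[name] * strength for name, strength in cues.items())


def should_alert(cues):
    """Return True when the combined score exceeds the alert threshold."""
    return danger_score(cues) > ALERT_THRESHOLD


# A single cue, even at full strength, stays below the threshold.
single_cue = {"circumspect_walk": 1.0}

# All cues together push the score past it.
all_cues = {name: 1.0 for name in CUE_WEIGHTS}

print(should_alert(single_cue), should_alert(all_cues))
```

The operator in the example sees only the final yes/no answer, not the table of weights, which is exactly why the question of responsibility arises.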
Let’s take a step back and consider the patterns identified by the AI. How was the machine ‘trained’ to recognize the terrorist? The description above contains a fundamental simplification: I talked about a terrorist and listed a number of clues that each of us would consider obvious. What happens in reality is different. In the training phase the AI receives a large number of images of crowds in which terrorists have already been identified. It learns to recognize them by trial and error, and to do so it breaks the images down to the level of individual pixels. On the one hand, the algorithms rely on observations that can be traced back to human experience; on the other, they consider elements that the human eye does not normally take into account.
The former are directly tied to human judgment: in this case, the images of terrorists used during the training phase were selected by people on the basis of their experience. The latter are often difficult to describe because they arise from a way of processing images that has no direct counterpart in our explicit mental processes. The neurons of an artificial neural network respond to elements that are initially quite simple (lines, curves, colors) and that become more complex (shapes and objects) in the deeper layers of the network.
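As a rough illustration of what a "simple element" looks like at the pixel level, the sketch below slides a small hand-written filter over a tiny grayscale image and responds where a vertical edge appears. This is an assumption-laden toy: in a real network the filter values are learned during training rather than written by hand, and the function name `convolve2d` is ours.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products.

    Strictly speaking this is cross-correlation, which is what most deep
    learning libraries compute under the name "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hand-written vertical edge filter: dark on the left, bright on the right.
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

# Tiny 4x6 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

response = convolve2d(image, VERTICAL_EDGE)
for row in response:
    print(row)  # non-zero values mark the position of the edge
```

Deeper layers apply the same operation to the outputs of earlier ones, which is how simple edges can combine into shapes and objects.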
The process described carries a series of implications that must be carefully weighed to avoid dangerous short circuits. Anyone who works with AI and computer vision knows the most critical steps of this process: the difficulty of collecting the large number of images needed for the training phase; the careful analysis of datasets to avoid hidden biases; the delicate engineering of the features that best suit the specific task. But even that is not enough. Two other steps, less evident but decisive, must be taken into account. The first is how to manage the risk that AI algorithms can be circumvented, if not outright deceived. The second is to ask where human responsibility enters the chain.
Although very sophisticated, image recognition systems have limitations compared with human vision. They work differently: if properly developed, they can see and recognize objects better than an untrained human eye, and they do not suffer from lapses of attention when performing repetitive tasks. At the same time, disturbing elements that we would consider irrelevant can confuse them.
A recent study from York University in Toronto showed that modifying a small portion of an image processed by the AI can confuse the algorithm and have a non-local impact on object detection. Starting from a photograph of a living room, the researchers found that overlaying a disturbing element (a picture of an elephant) on a small area was enough to make the algorithm suddenly stop recognizing a chair and a cup located in another part of the image.
How can we prevent the navigation system of our self-driving car from suddenly going haywire when a stork or a drone passes by? Or keep our terrorist-recognition algorithm from being tricked by a bizarrely shaped balloon?
To tell the truth, these are not new problems, and engineers and policy makers have been working on them for a long time. An autonomous vehicle, for example, has numerous different sensors, and it can be ruled out that it relies solely on its artificial vision. But even so, who is responsible for an accident caused by a decision taken by the autopilot system?
Returning to our starting example, the human role is once again fundamental. The police operator who reviews the suspected terrorists has to decide how to handle the images without ever falling into the trap of believing that the selection made by the AI is in any way conclusive proof. He will be able to make decisions based on a mix of assessments made together with the AI, but in essence he will always be responsible. He must never become a mere executor. For better or for worse, in the case of a false positive (a presumed terrorist who turns out to be an ordinary person) or a false negative (a terrorist who goes unreported), he will have to answer for the intervention or for the lack of action.
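The trade-off between false positives and false negatives can be sketched with a few lines of code. The scores and ground-truth labels below are made up for illustration, and the function `confusion_counts` is a hypothetical helper, not part of any real surveillance system. The point is only that moving the alert threshold exchanges one kind of error for the other; it does not remove the need for a human decision.

```python
def confusion_counts(scores, is_threat, threshold):
    """Count the four possible outcomes when flagging scores above a threshold."""
    counts = {
        "true_positive": 0,
        "false_positive": 0,
        "true_negative": 0,
        "false_negative": 0,
    }
    for score, actual in zip(scores, is_threat):
        flagged = score > threshold
        if flagged and actual:
            counts["true_positive"] += 1
        elif flagged and not actual:
            counts["false_positive"] += 1  # ordinary person flagged
        elif not flagged and actual:
            counts["false_negative"] += 1  # real threat missed
        else:
            counts["true_negative"] += 1
    return counts


# Made-up danger scores and ground truth for five people in a crowd.
scores = [0.92, 0.71, 0.40, 0.65, 0.15]
is_threat = [True, False, False, True, False]

strict = confusion_counts(scores, is_threat, threshold=0.8)
lenient = confusion_counts(scores, is_threat, threshold=0.6)

print("strict :", strict)
print("lenient:", lenient)
```

With the strict threshold one real threat is missed; with the lenient one nobody is missed, but an ordinary person is flagged. Choosing between the two is a human responsibility, not a property of the algorithm.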
#ComputerVision, #ArtificialIntelligence, #AI, #DeepLearning
Source: Deep Learning on Medium