Not a week goes by in which we don’t hear about another impressive milestone achieved by artificial intelligence (AI) systems. As AI research and technology advance, AI agents constantly show impressive learning performance, matching and surpassing the cognitive skills of humans across different domains. However, most AI programs still rely on computationally expensive training, and even reinforcement learning (RL) models that try to build knowledge organically require thousands of hours of training to match human performance. Humans, in contrast, are able to rapidly learn the fundamentals of new skills after only brief exposure to them. The ability of the human brain to learn so efficiently has puzzled neuroscientists for decades.
One of the key differentiators between the human brain and AI structures such as deep neural networks is that the former is more than just a combination of interconnected neurons. In addition to the electric signals exchanged between neurons, the brain is constantly releasing different chemicals known as neurotransmitters to accomplish different functions. Recent research from Alphabet’s subsidiary DeepMind suggests that one of those neurotransmitters plays a key role in the brain’s ability to rapidly learn new subjects. We are talking about dopamine.
Commonly known as the brain’s pleasure signal (or the Kim Kardashian of molecules 😊), dopamine acts as a reward system in the brain. The neurotransmitter is often associated with behaviors such as pleasure, lust, motivation, addiction or even extreme feminism (blame the neuroscientists for that one 😉). However, beyond strengthening connections between neurons, dopamine was never seen as a key enabler of learning. The DeepMind team used different meta-learning techniques whose results seem to indicate the opposite.
Before we deep dive into the DeepMind research, let’s take a second to appreciate the importance of what we are discussing here. Typically, AI systems draw inspiration from neuroscience and try to emulate the known structures of the brain in order to reach new levels of intelligence. Here we have the opposite, an AI system that is teaching us something we didn’t know about the brain. I find that remarkable.
The DeepMind team leveraged meta-reinforcement learning techniques that simulate the role of dopamine in the learning process. How did they do that exactly? Well, the meta-learning setup trained a recurrent neural network (representing the prefrontal cortex) using standard deep reinforcement learning techniques (representing the role of dopamine) and then compared the activity dynamics of the recurrent network with real data taken from previous findings in neuroscience experiments. Recurrent networks are a good proxy for meta-learning because they are able to internalize past actions and observations and then draw on those experiences while training on a variety of tasks.
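To make the setup a bit more concrete, here is a minimal sketch of the key ingredient (my own illustration, not DeepMind’s code — they used an LSTM trained with actor-critic RL, and all names and dimensions below are hypothetical): at every step the recurrent cell receives not just the current observation but also the previous action and previous reward, which is what allows its hidden state to internalize the action-reward history of the current task.

```python
import numpy as np

def meta_rl_step(obs, prev_action, prev_reward, h, params, n_actions=2):
    """One step of a (hypothetical) meta-RL recurrent cell.

    The input concatenates the observation with the previous action
    (one-hot) and previous reward, so the recurrent state h can carry
    the action-reward history of the current task across timesteps.
    """
    a_onehot = np.zeros(n_actions)
    a_onehot[prev_action] = 1.0
    x = np.concatenate([obs, a_onehot, [prev_reward]])
    # Vanilla recurrent update; the real model used an LSTM.
    h = np.tanh(params["W_x"] @ x + params["W_h"] @ h + params["b"])
    # Softmax policy head over the hidden state.
    logits = params["W_out"] @ h
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()
    return policy, h
```

Once such a network is trained across many related tasks, feeding reward back as an input is what lets it adapt to a new task within an episode, without further weight updates.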
The initial results indicate that dopamine’s role goes beyond just using reward to learn the value of past actions, and that it plays an integral role, specifically within the prefrontal cortex, in allowing humans to learn new tasks efficiently, rapidly and flexibly.
DeepMind’s research consisted of six meta-learning experiments from the field of neuroscience — each requiring an agent to perform tasks that use the same underlying principles (or set of skills) but that vary in some dimension. For instance, one experiment that was recreated was the famous Harlow Experiment, a psychology test from the 1940s used to explore the concept of meta-learning. In the original test, a group of monkeys were shown two unfamiliar objects to select from, only one of which gave them a food reward. They were shown these two objects six times, and each time the left-right placement was randomized, so the monkey had to learn which object gave a food reward. They were then shown two brand new objects, again only one of which would result in a food reward. Over the course of this training, the monkeys developed a strategy to select the reward-associated object: they learned to select randomly the first time, and then, based on the reward feedback, to choose that particular object, rather than the left or right position, from then on. The experiment shows that monkeys could internalize the underlying principles of the task and learn an abstract rule structure — in effect, learning to learn.
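The structure of the Harlow task is simple enough to sketch in a few lines. The toy environment below is my own illustration, not DeepMind’s code: each episode draws two fresh objects, one of which is rewarded, and for six trials their left/right placement is shuffled, so a successful strategy must track the object, not the side.

```python
import random

class HarlowTask:
    """A toy version of the Harlow task (illustrative, not DeepMind's code)."""

    def __init__(self, n_trials=6, seed=None):
        self.n_trials = n_trials
        self.rng = random.Random(seed)

    def new_episode(self):
        # Two brand-new "objects" per episode; one is rewarded.
        self.objects = self.rng.sample(range(10_000), 2)
        self.rewarded = self.rng.choice(self.objects)
        self.trial = 0

    def present(self):
        """Return the two objects in a random left/right order."""
        pair = list(self.objects)
        self.rng.shuffle(pair)
        return pair

    def choose(self, obj):
        self.trial += 1
        return 1.0 if obj == self.rewarded else 0.0
```

The optimal strategy mirrors what the monkeys learned: pick randomly on the first trial, then stick with the rewarded object (or switch if the first pick paid nothing), earning a reward on every trial after the first.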
The meta-learning model recreated the Harlow experiment using a virtual computer screen and randomly selected images. The experiment showed that the ‘meta-RL agent’ appeared to learn in a manner analogous to the animals in the Harlow Experiment, even when presented with entirely new images it had never seen before. The meta-learning agent was also able to adapt quickly to tasks with different rules and structures.
The fascinating part of the DeepMind results is that most of the learning seemed to take place in the activity of the recurrent neural network itself, which supports the thesis that dopamine plays a key role in the learning process. It is a well-known fact that dopamine strengthens the connections between neurons, and that has typically been seen as its influence on the learning process. However, DeepMind’s meta-learning model maintained fixed weights in the neural network, which means the connections couldn’t be adjusted, yet the agent was still able to learn and solve new tasks. This result indicates that dopamine is not only a mechanism to adjust weights between neurons but also carries important information about the rules of the target tasks, which influences the learning process.
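The fixed-weights point can be illustrated with a deliberately simple analogy (my own sketch, not DeepMind’s actual model): in the snippet below, no parameter is ever updated by training. The only thing that changes during the episode is a state variable tallying rewards per arm — the analogue of learning carried by the recurrent network’s activations rather than its weights — and that alone is enough to discover which of two deterministic arms pays off.

```python
def run_bandit_episode(arm_reward, n_steps=20):
    """Within-episode learning with fixed 'weights' (illustrative sketch).

    `arm_reward` gives a deterministic payoff per arm. Nothing here is
    updated by gradient descent; only the running state (`counts`,
    `pulls`) changes, yet the choices improve within the episode.
    """
    counts = [0.0, 0.0]   # per-arm reward tallies, held "in activations"
    pulls = [0, 0]
    choices = []
    for t in range(n_steps):
        if t < 2:
            arm = t       # try each arm once
        else:
            # Greedy on the reward estimates carried in the state.
            arm = 0 if counts[0] / pulls[0] >= counts[1] / pulls[1] else 1
        r = arm_reward[arm]
        counts[arm] += r
        pulls[arm] += 1
        choices.append(arm)
    return choices
```

After one exploratory pull of each arm, the agent commits to the paying arm for the rest of the episode — learning without a single weight change.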
The DeepMind research represents not only a breakthrough in the meta-reinforcement learning field but is equally relevant in the neuroscience space. AI is teaching us how we learn, instead of the other way around 😉
Source: Deep Learning on Medium