Meta-learning | Learn How to Learn Fast

Original article was published by Jarvis+ on Artificial Intelligence on Medium

Meta-learning is one of the most active research areas in deep learning. Some schools of thought in artificial intelligence hold that meta-learning is a stepping stone toward artificial general intelligence (AGI).

Research and development in meta-learning has exploded in recent years, but the idea itself can be traced back to 1979, when Donald B. Maudsley described a new cognitive paradigm as the process by which learners become aware of, and increasingly in control of, the habits of perception, inquiry, learning, and growth that they have internalized.

In 1985, John Biggs defined meta-learning more simply as being aware of and taking control of one's own learning. Although these definitions are accurate from the perspective of cognitive science, they are difficult to apply to concrete work in artificial intelligence.

In artificial intelligence systems, meta-learning can be defined simply as the ability to acquire knowledge versatility. Humans can learn multiple tasks at the same time from minimal information. We can recognize a new kind of object after seeing a single picture, and we can learn complex multitask activities such as driving a car or flying an airplane.

Although AI agents can master very complex tasks, they require a great deal of training on each atomic subtask, and they are still very poor at multitasking. The path to knowledge versatility therefore requires agents to “learn how to learn”, or, to use the technical term, to meta-learn.

Meta-learning models

When humans learn, they use different methods depending on the situation. Similarly, not all meta-learning models use the same techniques. Some meta-learning models focus on optimizing the structure of the neural network, while others (such as Reptile) focus on finding a parameter initialization that can be adapted quickly to a new task.
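
As a rough illustration of what "learning to learn" can look like in code, here is a minimal sketch of a Reptile-style loop, read here as an algorithm that learns a parameter initialization that adapts quickly to new tasks. The toy task family (lines y = a * x with the slope a drawn from [2, 4]), the single-weight model, and all function names and step sizes are hypothetical choices made for this example; none of them come from the original article or from a library.

```python
import random

XS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # fixed inputs shared by every toy task


def sample_task(rng):
    """Hypothetical task family: fit y = a * x, with the slope a drawn from [2, 4]."""
    a = rng.uniform(2.0, 4.0)
    return XS, [a * x for x in XS]


def mse(w, xs, ys):
    """Mean squared error of the one-weight model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def adapt(w, xs, ys, steps=5, lr=0.1):
    """Inner loop: a few plain gradient-descent steps on one task."""
    for _ in range(steps):
        w -= lr * mse_grad(w, xs, ys)
    return w


def reptile(meta_steps=1000, meta_lr=0.1, seed=0):
    """Outer loop: nudge the shared initialization toward each task's adapted weight."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(meta_steps):
        xs, ys = sample_task(rng)
        w_task = adapt(w, xs, ys)
        w += meta_lr * (w_task - w)
    return w


meta_w = reptile()

# Compare adaptation on an unseen task (slope 3.5) from both starting points.
new_xs, new_ys = XS, [3.5 * x for x in XS]
loss_from_meta = mse(adapt(meta_w, new_xs, new_ys), new_xs, new_ys)
loss_from_zero = mse(adapt(0.0, new_xs, new_ys), new_xs, new_ys)
print(f"meta-learned init: {meta_w:.2f}")
print(f"loss after 5 steps from meta init: {loss_from_meta:.4f}")
print(f"loss after 5 steps from zero init: {loss_from_zero:.4f}")
```

In this toy setting the learned initialization should settle near the middle of the task family, so the same five gradient steps leave a smaller error on the new task than they do when starting from zero.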

The Artificial Intelligence Laboratory at the University of California, Berkeley recently published a research paper that comprehensively enumerates the different types of meta-learning:
  • Few-shot (small-sample) meta-learning: The idea of few-shot meta-learning is to create deep neural networks that can learn from minimal data sets, imitating, for instance, how babies learn to recognize objects after seeing only one or two pictures. Few-shot meta-learning has inspired techniques such as memory-augmented neural networks and one-shot generative models.
  • Optimizer meta-learning: Optimizer meta-learning models focus on learning how to optimize a neural network so that it performs a task better. These models usually include a neural network that applies different optimizations to the hyperparameters of another neural network in order to improve the target task. Models that aim to improve gradient descent itself, such as those published in the study, are a good example of optimizer meta-learning.
  • Metric learning: The goal of metric meta-learning is to find a metric space in which learning is particularly efficient. This approach can be seen as a subset of few-shot (small-sample) meta-learning in which a learned metric space is used to evaluate the quality of learning from a few examples. This research paper shows how to apply metric learning to classification problems.
  • Recurrent model meta-learning: This type of meta-learning is tailored to recurrent neural networks (RNNs), such as long short-term memory (LSTM) networks. In this architecture, the meta-learner trains an RNN that processes a data set sequentially and then processes new inputs from the task. In an image classification setting, this might involve passing in the (image, label) pairs of a data set one after another, followed by new examples that must be classified. Meta-reinforcement learning is an example of this approach.
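
The few-shot and metric-learning ideas in the list above can be sketched together as follows. This is an illustrative example only: it samples an N-way K-shot episode and labels each query by its nearest class prototype (the mean of that class's support examples), in the spirit of prototypical networks. The identity `embed` function, the toy 2-D clusters, and all names are assumptions made for this sketch; a real model would learn the embedding with a neural network.

```python
import math
import random


def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Draw an N-way K-shot episode: a small support set plus held-out queries."""
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = {}, []
    for label in classes:
        items = rng.sample(data_by_class[label], k_shot + n_query)
        support[label] = items[:k_shot]
        query.extend((x, label) for x in items[k_shot:])
    return support, query


def embed(x):
    """Placeholder embedding (identity). A real model would use a trained network."""
    return tuple(x)


def prototype(points):
    """Class prototype: the mean of the embedded support points."""
    vectors = [embed(p) for p in points]
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def classify(x, prototypes):
    """Assign x to the class whose prototype is nearest in the metric space."""
    return min(prototypes, key=lambda label: math.dist(embed(x), prototypes[label]))


# Toy data: four well-separated 2-D clusters standing in for image features.
rng = random.Random(0)
centers = {"cat": (0.0, 0.0), "dog": (10.0, 0.0), "bird": (0.0, 10.0), "fish": (10.0, 10.0)}
data = {
    label: [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)) for _ in range(10)]
    for label, (cx, cy) in centers.items()
}

support, query = sample_episode(data, n_way=3, k_shot=2, n_query=3, rng=rng)
prototypes = {label: prototype(points) for label, points in support.items()}
correct = sum(classify(x, prototypes) == label for x, label in query)
accuracy = correct / len(query)
print(f"{len(prototypes)}-way 2-shot accuracy: {accuracy:.2f}")
```

Only two labelled examples per class are used to build each prototype; the quality of the metric space (here trivially good, because the clusters are far apart) is what makes such small support sets sufficient.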

Classic meta-learning papers

✅ Siamese Neural Networks for One-shot Image Recognition (2015), Gregory Koch, Richard Zemel, Ruslan Salakhutdinov.
✅ Prototypical Networks for Few-shot Learning (2017), Jake Snell, Kevin Swersky, Richard S. Zemel.
✅ Gaussian Prototypical Networks for Few-Shot Learning on Omniglot (2017), Stanislav Fort.
✅ Matching Networks for One Shot Learning (2016), Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, Daan Wierstra.
✅ Learning to Compare: Relation Network for Few-Shot Learning (2017), Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip H. S. Torr, Timothy M. Hospedales.
✅ An Embarrassingly Simple Approach to Zero-shot Learning (2015), Bernardino Romera-Paredes, Philip H. S. Torr.
✅ Low-shot Visual Recognition by Shrinking and Hallucinating Features (2017), Bharath Hariharan, Ross Girshick.
✅ Low-shot Learning with Large-scale Diffusion (2018), Matthijs Douze, Arthur Szlam, Bharath Hariharan, Hervé Jégou.
✅ Low-Shot Learning with Imprinted Weights (2018), Hang Qi, Matthew Brown, David G. Lowe.
✅ One-Shot Video Object Segmentation (2017), S. Caelles, K.-K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers, L. Van Gool.
✅ One-Shot Learning for Semantic Segmentation (2017), Amirreza Shaban, Shray Bansal, Zhen Liu, Irfan Essa, Byron Boots.