Khipu 2019 and the State of AI in Latin America

Source: Deep Learning on Medium

Khipu Program

Below is a snapshot of my Google Calendar during the Khipu week. It was a very intense event, with a large number of talks you simply could not miss.

Monday (Nov. 11):

From the first day we had some amazing talks, starting with Guillermo Sapiro (Uruguayan 🇺🇾) talking about his and his colleagues' work on using machine learning to speed up the diagnosis of autism in infants. René Vidal (Chilean 🇨🇱) presented a very interesting talk about the mathematics of deep learning optimization.

We also had parallel practical sessions on Convolutional Neural Networks and Optimisation for Deep Learning. The practicals were delivered as Colab Notebooks, all of which can be accessed here. I enjoyed the optimisation session very much.

Monday ended on a very satisfying note with a show featuring Uruguayan tango!

Tuesday (Nov. 12):

The second day included a highly anticipated talk by Ian Goodfellow. Previously a researcher at Google Brain, now working at Apple, and considered by many the father of Generative Adversarial Networks (GANs), Ian spoke about some very useful applications of GANs, including Apple's effort to train an eye-gaze predictor in the absence of a sufficiently large amount of labelled examples. The idea is to train a GAN to map from a synthetic, CG simulation of an eye (gazing in a particular direction) to a photo-realistic image. This can be done by training the GAN to generate samples from a distribution of real eye pictures while coupling it with a self-regularization loss whose purpose is to minimize the difference between the synthetic and generated images.
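The shape of that objective can be sketched in a few lines. This is a toy illustration of the idea, not Apple's actual implementation: the function names, the L1 choice of distance and the weight `lam` are all assumptions made for the example.

```python
import numpy as np

def self_regularization_loss(synthetic, refined, lam=0.1):
    """L1 penalty keeping the refined image close to its synthetic input,
    so the gaze direction annotated in the synthetic eye is preserved."""
    return lam * np.abs(refined - synthetic).mean()

def refiner_loss(adversarial_loss, synthetic, refined, lam=0.1):
    # Total refiner objective: fool the discriminator (adversarial term)
    # while not drifting away from the annotated synthetic image.
    return adversarial_loss + self_regularization_loss(synthetic, refined, lam)

# Toy usage: a refiner that changes nothing pays no regularization cost,
# while shifting every pixel by 1.0 pays the full L1 penalty.
img = np.ones((8, 8))
unchanged_penalty = self_regularization_loss(img, img)
shifted_penalty = self_regularization_loss(img, img + 1.0)
```

The regularizer is what lets the synthetic image's label (the gaze direction) carry over to the photo-realistic output for free.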

Tuesday ended with an amazing panel on “How to Write a Great Research Paper” featuring Nando de Freitas, Claire Monteleoni, David Lopez-Paz and Martin Arjovsky.

Wednesday (Nov. 13):

Luciana Benotti (Argentinian 🇦🇷) spoke about challenges faced by Latin American researchers, which include expensive conferences, decreasing funds for research and low access to cutting-edge equipment. She chose to devote her talk to tell the success stories of her students, summarizing the coping mechanisms they’ve successfully implemented to produce impactful research — such as collaborating with researchers from rich countries to obtain bilateral funding or addressing hard, but low-resource tasks.

Wednesday also featured one of the highlights of Khipu 2019: a talk by Yoshua Bengio, one of the 3 recipients of the 2019 Turing Award along with Geoffrey Hinton and Yann LeCun, and considered by many one of the fathers of modern deep learning. Bengio spoke about his perspectives on the future of AI research. According to Bengio, we have made a lot of progress in recent years with insights such as attention and GANs, but we are still far from human-level intelligence. Among other things, that's because modern DL models are trained to solve very specific tasks, while human intelligence is general. Bengio suggested that a way forward is to focus on learning agents capable of interacting with the world as humans do. He pinpointed “meta-learning” (learning to learn) as a key direction for improving the state-of-the-art in machine learning, and devoted a lot of attention to the subject of causality (humans learn causal relationships about the world, but current ML focuses on correlations). On a final note, Bengio spoke about the importance of discussing the ethics of AI and making sure that AI is used for social good.

Following a brief discussion about the purpose, history and cultural heritage of the original Khipus, Nando de Freitas presented a stellar lecture on Reinforcement Learning, building up from the basics. Nando briefly commented on the gaps in Latin American mathematical education, and paused more than once to cover concepts missing from our formal education.

Thursday (Nov. 14):

The fourth day included my favorite talk of Khipu 2019, presented by David Lopez-Paz. Titled “Causality and Generalization”, it explored what the field of ML can gain by learning to train models which capture causal relationships instead of (possibly spurious) correlations. David showed how modern image classifiers fail to identify causal relationships even in very simple tasks. For example, the “cattle” class is so heavily correlated with grasslands that modern tools will fail to recognize a cow on a beach.

David spoke at length about how the biggest lie in ML is assuming that the test distribution will be similar to the train distribution. To circumvent this issue, causal models are desperately needed.
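The cattle-on-grass failure can be reproduced with a tiny synthetic experiment. The numbers below are made up for illustration: a linear "classifier" sees a slightly noisy causal feature (the animal) and a background feature that is perfectly correlated with the label at training time; when the correlation flips at test time, the model collapses.

```python
import numpy as np

# Each row: [causal feature (animal), spurious feature (grass background)].
# At train time the background matches the label perfectly, while the
# causal feature is right only 80% of the time.
X_train = np.array([[1, 1]] * 8 + [[0, 1]] * 2 + [[0, 0]] * 8 + [[1, 0]] * 2, float)
y_train = np.array([1] * 10 + [0] * 10, float)

# Least-squares linear model with a bias column, thresholded at 0.5.
A = np.hstack([X_train, np.ones((len(X_train), 1))])
w, *_ = np.linalg.lstsq(A, y_train, rcond=None)

def predict(X):
    return (np.hstack([X, np.ones((len(X), 1))]) @ w > 0.5).astype(float)

train_acc = (predict(X_train) == y_train).mean()

# At test time the correlation flips: cows on the beach, grass without cows.
X_test = np.array([[1, 0], [0, 1]], float)
y_test = np.array([1, 0], float)
test_acc = (predict(X_test) == y_test).mean()
```

The fit puts all its weight on the background feature (it is a perfect predictor in-sample), so the model is flawless on the training distribution and worse than chance once the shortcut disappears.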

One of the highlights of the event was the Women in AI panel, hosted by Google. It featured Brazilian 🇧🇷 researcher Sandra Avila, Google scientist Chelsea Finn, Uruguay's 🇺🇾 ex-Minister of Education and current dean of the Facultad de Ingeniería Maria Simon, Apple's Giulia Pagallo (Venezuelan 🇻🇪), Uruguay's 🇺🇾 current Minister of Industry, Energy and Mining Guillermo Moncecchi, and DeepMind's Nando de Freitas. Maria Simon spoke at length about the need to involve non-minority groups when discussing diversity in research: “It's no use talking only to women about this issue! They're already convinced”. I must admit that when I registered for this event I was afraid I'd be the only man in the room, but I wasn't. It was very nice to see so many men interested in hearing more about gender diversity in AI.

Friday (Nov. 15):

Google AI lead Jeff Dean

Friday was the last day of the event, but it included some very interesting talks. The current lead of Google AI, Jeff Dean, who is responsible for many of Google's pivotal projects such as MapReduce and TensorFlow, presented a talk about the company's efforts to use AI for social good in healthcare, environmental and education applications.

Oriol Vinyals spoke about DeepMind's efforts to achieve superhuman performance on the game of StarCraft, a far more difficult benchmark than the Chinese boardgame of Go (where DeepMind reached superhuman status after defeating top professional Lee Sedol), detailing the model's architecture (which combines a ResNet, a Transformer and pointer networks, among other components) and the company's strategy for training and evaluating it against human players.

Finally, Jeff Dean returned to the stage to give a historical perspective on the current decade (the 2010s) with respect to machine learning. It was a beautiful thing to see just how much the ML community has achieved in so little time. Jeff brought graphs showing the exponential growth in the number of ML papers produced each year. Moore's law may have ended, but it is being replaced by something else.

From 2011 to 2016, the state-of-the-art error rate in image classification plummeted from 26% to 3% (almost a 9x drop over five years). Along the way it surpassed human performance, as the human error rate is typically around 5%.

Dean also talked about the Grand Engineering Challenges for the 21st Century, and how Google is working to solve a large number of them using AI.

Google has collected a series of achievements in the healthcare field, such as managing to diagnose diabetic retinopathy from retinal images better than trained ophthalmologists. The researchers also realized that the same technique could be used to predict a set of health indicators, such as cardiovascular risk.

Jeff also spoke about something I'm particularly excited about (as my research focuses on Graph Neural Networks for symbolic problems). Traditionally, we use machine learning to tackle problems for which a solution is hard or virtually impossible to articulate in a procedural manner. We cannot describe how to classify an animal as a cat or a dog given a matrix of pixels, so we use convolutional neural networks to do that. A very interesting research direction is training models to solve problems for which we already have an explicit solution, but one which consumes too many resources. Jeff exemplified this with work that managed to predict the chemical properties of molecules with similar accuracy, ten thousand times faster than traditional techniques.
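The surrogate-model recipe behind that result can be illustrated with a toy stand-in. Here the "expensive simulation" is just an arbitrary smooth function and the "learned" surrogate is a polynomial fit rather than a neural network; everything about the example is an assumption for illustration, not the actual molecular-property pipeline.

```python
import numpy as np

def expensive_simulation(x):
    """Stand-in for a costly physics computation (think: quantum
    chemistry for a molecule). Imagine each call taking minutes."""
    return np.sin(3 * x) + 0.5 * x ** 2

# Step 1: run the expensive routine a limited number of times
# to build a training set.
xs = np.linspace(-1.0, 1.0, 200)
ys = expensive_simulation(xs)

# Step 2: fit a cheap surrogate (a polynomial here, playing the role
# of the neural network) and evaluate it instead of the simulation.
surrogate = np.poly1d(np.polyfit(xs, ys, deg=8))

max_err = np.max(np.abs(surrogate(xs) - ys))
```

The payoff is that once trained, the surrogate answers in microseconds, so it can be queried orders of magnitude more often than the original routine at a small cost in accuracy.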

He also talked about Google’s efforts on advancing the state-of-the-art in connectomics (that is, reconstructing the network structure of an animal’s brain given brain imaging slices).

We’ve Put a Worm’s Mind in a Lego Robot’s Body

Today we can already port a C. elegans brain into a Lego Mindstorms robot, but using state-of-the-art DL techniques we can expect to be able to do the same for animals with much larger brains in the near future.

Jeff also talked a little bit about the attention revolution in sequence-to-sequence models and natural language processing, with a focus on the transformer architecture.

Another topic in the talk was AutoML, and I have to admit that at that moment the audience of ML engineers felt like we were living in a science-fiction scenario. AutoML consists of automating the part of ML research devoted to finding efficient DL architectures. This is done by training a reinforcement learning agent to navigate the space of model architectures in search of the ones best suited for the task.

To me, the surprising thing is how AutoML manages to find architectures yielding models which are not only more accurate but also faster than those produced by ML experts. AutoML produces a totally different Pareto curve from the one we get through human experimentation: the trade-off between speed and accuracy is much better.

Jeff spoke about how ML is reinventing how we build computers, from TPUs to small chips used for efficient inference (e.g. Google’s Coral).

Finally, Jeff talked about Google's vision for the future of ML: not necessarily bigger and more accurate models requiring a lot of memory and computation, nor smaller and less accurate models easily deployable on a small device, but a hybrid solution. Gigantic models, but very sparsely activated and relying heavily on intelligent routing (also learned). These models would be composed of many sub-modules, each specialized for a given task, and one would be able to activate them on demand depending on the input.
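That routing idea can be sketched as a tiny mixture-of-experts layer. All the sizes, the softmax router and the top-k selection below are illustrative assumptions, not a description of Google's system; the point is only that most expert sub-modules never run for a given input.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class SparseMoE:
    """Toy sparsely-activated layer: many expert sub-modules, but only
    the top-k experts chosen by a (learned) router run on each input."""

    def __init__(self, n_experts=8, dim=4, k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.router = rng.standard_normal((dim, n_experts))   # routing weights
        self.experts = [rng.standard_normal((dim, dim)) for _ in range(n_experts)]
        self.k = k

    def __call__(self, x):
        gates = softmax(x @ self.router)          # score every expert
        active = np.argsort(gates)[-self.k:]      # but activate only k of them
        out = sum(gates[i] * (x @ self.experts[i]) for i in active)
        return out, active

moe = SparseMoE()
y, active = moe(np.ones(4))
```

With 8 experts and k=2, three quarters of the parameters stay cold on every call, which is what lets the total model be gigantic while each prediction stays cheap.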