A Decade in Deep Learning

Source: Deep Learning on Medium

Natural Language Processing

Image Courtesy: Ontotext

Although not as flashy as other domains in deep learning, NLP algorithms are arguably the most sophisticated and successful so far. This success may be attributed in part to the relative ease of training on a text corpus compared to images and videos. Not only has NLP seen immense progress this decade, it has also become a staple of many commercial applications. As you will notice in the following subsections, researchers at Google have played the most significant role in improving NLP algorithms, yet research labs around the world have also contributed to its progress.

Word Embeddings

Image Courtesy: Sebastian Ruder

Word embeddings, which belong to the family of distributional semantic models, are used in almost all NLP algorithms today. While embedding words as vectors had been proposed as far back as 2003 by Bengio et al., it was still computationally expensive, and more efficient algorithms were required to make it practical. In 2013, Google pushed the technique into widespread use by proposing the Continuous Bag-of-Words Model (CBOW) and the Continuous Skip-gram Model (original paper and extensions proposed by the same authors) to compute them, and by providing an open-source implementation in the form of word2vec. In 2014, Stanford researchers introduced the GloVe model, which further improved upon word embedding algorithms.
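To make the difference between the two word2vec training setups concrete, here is a minimal sketch of how training examples could be cut out of a sentence. The function names and the `window` parameter are illustrative assumptions, not the word2vec API, and the sketch covers only example generation, not the training of the vectors themselves.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram: predict each context word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW: predict the center word from the words around it."""
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, center))
    return examples


sentence = "the quick brown fox jumps".split()
print(skipgram_pairs(sentence, window=1))
print(cbow_examples(sentence, window=1))
```

Both functions slide the same window over the text. The only difference is the direction of the prediction: Skip-gram goes from one word to its neighbours, CBOW from the neighbours to the word.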


LSTM

Image Courtesy: Christopher Olah

While LSTMs were originally proposed in 1997 by Hochreiter and Schmidhuber, it was during this decade that they really rose to prominence. Although they have since been outperformed by newer algorithms, they played a massive role in several commercially successful products such as Google Translate and Apple’s Siri. Kyunghyun Cho et al. later proposed the Gated Recurrent Unit (GRU), a simpler gated variant of the LSTM architecture. The GRU naturally leads us to the next innovation…
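For readers who have not seen the inside of an LSTM, the following is a minimal sketch of one time step of a single-unit cell, using the standard forget, input and output gates. The scalar sizes and the hand-picked weights in `params` are assumptions made for readability; a real model uses vectors and learns its weights.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell (scalar input, hidden and cell state)."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much new information to write
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # proposed new content
    c = f * c_prev + i * g            # updated long-term memory
    h = o * math.tanh(c)              # updated hidden state / output
    return h, c


# Hypothetical weights (w_x, w_h, b) for each gate; a trained model learns these.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.6, -0.2, 0.0),
    "output": (0.3, 0.4, 0.0),
    "candidate": (1.0, 0.5, 0.0),
}

h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, params)
    print(f"x={x:+.2f}  h={h:+.4f}  c={c:+.4f}")
```

The separate cell state `c`, updated additively through the forget and input gates, is what lets the network carry information across many time steps.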


Seq2Seq

Sutskever et al., working at Google, introduced the now popular Seq2Seq model in 2014. Building on previous work on RNNs, LSTMs and GRUs, this encoder-decoder technique is what currently powers Google Translate and many other NLP tasks.
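The control flow of a sequence-to-sequence model can be sketched in a few lines: an encoder reads the input into one fixed-size state, and a decoder emits output tokens one at a time from that state. Everything below is a toy under stated assumptions: the vocabulary, the plain tanh RNN step (standing in for an LSTM or GRU) and the random, untrained weights are all hypothetical, so the printed output is meaningless. Only the structure is the point.

```python
import math
import random

VOCAB = ["<sos>", "<eos>", "a", "b", "c"]
HIDDEN = 4
rng = random.Random(0)  # fixed seed so the untrained toy model is reproducible


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


# Hypothetical, untrained parameters; a real model learns these from paired data.
enc_embed = rand_matrix(len(VOCAB), HIDDEN)
enc_rec = rand_matrix(HIDDEN, HIDDEN)
dec_embed = rand_matrix(len(VOCAB), HIDDEN)
dec_rec = rand_matrix(HIDDEN, HIDDEN)
dec_out = rand_matrix(len(VOCAB), HIDDEN)


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(embed, rec, token_id, h):
    """Plain tanh RNN update, standing in for an LSTM or GRU cell."""
    return [math.tanh(e + r) for e, r in zip(embed[token_id], matvec(rec, h))]


def encode(tokens):
    """Read the whole input and compress it into one fixed-size state vector."""
    h = [0.0] * HIDDEN
    for tok in tokens:
        h = rnn_step(enc_embed, enc_rec, VOCAB.index(tok), h)
    return h


def decode(h, max_len=10):
    """Greedily emit one token at a time, starting from the encoder's final state."""
    output, prev = [], VOCAB.index("<sos>")
    for _ in range(max_len):
        h = rnn_step(dec_embed, dec_rec, prev, h)
        scores = matvec(dec_out, h)
        prev = scores.index(max(scores))  # greedy choice of the next token
        if VOCAB[prev] == "<eos>":
            break
        output.append(VOCAB[prev])
    return output


print(decode(encode(["a", "b", "c"])))
```

Note that the input and output lengths are independent: the decoder stops when it emits `<eos>` or reaches `max_len`, not when the input runs out.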


Transformers

Wait, not these Transformers!

Google Brain caused quite a stir in 2017 when they introduced the Transformer in a paper interestingly titled “Attention Is All You Need”. By letting the model focus on the most relevant parts of the input, attention offered improved performance over LSTMs while also requiring less computation to train.
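The core operation of that paper, scaled dot-product attention, computes softmax(QKᵀ / √d_k) V. Below is a minimal pure-Python sketch of it for small lists of vectors; the function names and the toy inputs are assumptions for illustration, and a real Transformer adds learned projections, multiple heads and much more around this step.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) where each output is a weighted average of the values."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy example: two queries attending over three key/value pairs.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
outputs, weights = scaled_dot_product_attention(Q, K, V)
for w, o in zip(weights, outputs):
    print([round(x, 3) for x in w], [round(x, 3) for x in o])
```

Each query is handled independently of the others, so all positions can be computed in parallel. This is a large part of why the approach is cheaper to train than a recurrent network that must process tokens in order.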


BERT

Image Courtesy: YNG Media

Google yet again managed to provide a breakthrough in NLP before the end of the decade by introducing BERT in 2018. It is a language representation model that conditions on both the left and right context of each token simultaneously. While one can say that Google researchers achieved state-of-the-art results on several NLP tasks thanks to the unparalleled processing power at their disposal, the model is quite adaptable: the pre-trained BERT model can be fine-tuned for a wide range of tasks by simply adding one additional output layer.
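What "adding an additional output layer" amounts to can be sketched as a single linear layer plus softmax on top of the sentence representation. This is only an illustration under loud assumptions: `pooled` is a hand-written placeholder for the fixed-size vector a pre-trained encoder would produce (real BERT vectors have hundreds of dimensions), no BERT model is loaded, and `weights` and `bias` are hypothetical values that fine-tuning would learn.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def classification_head(pooled, weights, bias):
    """Task-specific output layer: one linear map per label, then softmax."""
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Placeholder for the encoder's sentence vector (NOT produced by a real model here).
pooled = [0.2, -0.7, 0.5, 0.1]

# Hypothetical head parameters for a two-label task, e.g. negative vs. positive.
weights = [
    [0.4, -0.3, 0.2, 0.0],
    [-0.1, 0.6, -0.5, 0.3],
]
bias = [0.0, 0.1]

probs = classification_head(pooled, weights, bias)
print(probs)
```

The appeal of this design is that the large pre-trained encoder is shared across tasks, while only this small head is specific to the task at hand.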

Voice Assistants

Image Courtesy: Business Insider/Yu Han

In October 2011, the iPhone 4S became the first Apple product to ship with Siri integrated. This was revolutionary in the smartphone segment, as for the first time a technology closely associated with futuristic artificial intelligence was available to the mass commercial market. Today, the market is filled with voice assistants that perform exceedingly well at a range of tasks such as voice recognition, speech-to-text, and text translation. To see just how far we have come with this technology, step into the house of anyone who is tech-savvy and call out “Alexa!”. More likely than not, you will be welcomed by a familiar response.

An interesting read if you question the progress of current NLP: https://www.theatlantic.com/technology/archive/2018/01/the-shallowness-of-google-translate/551570/