A simple Word_Predictor using RNN
I don’t know how many of you would agree with me when I say, it really annoys me when an auto-correct feature in my phone automatically fills in for me sometimes. But the idea is fascinating, like how is it doing that? And the answer is the T9 Algorithm.There is an almost similar feature in Gmail called Smart Compose . In here the work is done by Deep Learning.Here we are trying to build a text generator using the same deep learning methods.
So basically, Word Prediction or Text Generation is a Language modelling problem, A classic ideation in Machine Learning and Deep Learning. In here, we are trying to do the word prediction using Recurrent Neural Network or RNN. Neural Network is actually said to be a design, inspired by how human brains work. They recognize patterns and clusters and use them to give us powerful insights and terrific applications of different kind of levels.
RNN or Recurrent Neural Networks, as the name suggests, is a repeating neural network. They are the kind whose output from the previous step is fed as input to the current step. Deep Learning already had Convolutional Neural Networks but why did the idea of RNN come along? It is because CNN was unable to deal with sequential data.
Sequential data are an interdependent type of data. It always depends on the previous stage for itself to get propagated. For example Stock Prediction, this prediction is only possible by a thorough study in the stock market for a considerable amount of data. Likewise, a conversation can only be made by relating all previously used words. The problem with CNN is that it doesn’t depend on its previous data or each step is self-dependent. So for our mentioned task, we are gonna have to choose RNNs.
Well, RNN sure did a major part in giving deep learning a lot more possibilities.RNN comes in a lot of different variants on the basis of architecture, what we are gonna use for now is LSTM RNN. LSTM stands for Long Short term Memory. So speaking in our problem context, say we have to predict this sentence, “She is cooking in the kitchen” so to generate the word kitchen, RNN needs to relate the word cooking within its model, which is possible since it remembers it’s few previous states. We are good to go, but what happens if the sentence was this, “I’m an Indian by birth, but I moved to London when I was little…I would love to learn my mother tongue Hindi.” The problem here is that there are many sentences between predicted word Hindi and the word India it uses to relate and predict.RNN lacks the property of keeping info for a long time and this is typically called Longterm Dependency. LSTM is a solution to this problem. This is an architecture specially built to address this problem.
Here for our project to work on the above-specified RNNs and LSTM, we need a couple of libraries. For those you who haven’t been in this area, let me introduce keras an open-source neural network written in python that can run on top of Tensorflow. This library is the most convenient and easiest to get started with deep learning.