Original article was published by Akash Deep on Deep Learning on Medium
In the last part of this deep learning series, we learned how to prepare input for neural networks in natural language processing using tokenization and word embeddings. In this part we will look at the core concepts of deep neural networks, why neural networks rose to popularity, and what tasks we can solve by applying them in industry. Let's get started.
I want to make it very clear that neural networks are not a recent invention. The underlying ideas go back to the 1940s and 1950s, and several architectures we still use today are decades old: LSTM (Long Short-Term Memory) was published in 1997 and the LeNet-5 CNN (Convolutional Neural Network) in 1998. So your question will be: why were these things not popular at that time? There are three main reasons why deep neural networks became popular in the 2010s. We will go through them one by one.
Reason 1: Availability of large datasets. This is one of the main reasons for the rise of deep learning. Earlier, we simply did not have large amounts of data. As the world moved from paper to digital around 2003–04, the amount of data generated started growing exponentially, and it keeps growing faster every year. To give you an idea, by some estimates the data produced in 2019 exceeds all the data produced between 2000 and 2018, and the data produced by the end of 2020 is expected to exceed everything produced from 2000 to 2019. So just imagine how rapidly we are entering the world of big data. Deep learning, meaning the concepts of neural networks, really started becoming popular after 2012, when AlexNet, built by Alex Krizhevsky, Ilya Sutskever and Geoffrey Hinton at the University of Toronto, won the ImageNet challenge by classifying images into 1000 labels far more accurately than earlier approaches. ImageNet itself is a huge repository of labeled images; the subset used in the challenge has 1000 categories and more than 1 million images. I recommend going through the ImageNet website and exploring it.
Reason 2: Evolution of compute power. I would say this is the most important reason behind the rise of deep neural networks. Training a neural network requires a huge number of computations per second, and for that we need a lot of compute power. The evolution of GPUs and TPUs turned our dreams into reality, and there is still a lot to come: quantum computing, for example, is an active area of research.
Reason 3: Ability to run matrix multiplication on GPUs. This is related to the second reason above. The input to a neural network is a matrix, so training is dominated by matrix calculations, and these can be done in parallel. The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. It provides highly tuned implementations of standard neural network operations such as forward and backward convolution, pooling, normalization and activation layers. I will explain each of these deep learning terms in my next article.
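To make the link between neural networks and matrix multiplication concrete, here is a minimal sketch in plain Python. The `matmul` helper, the input values and the weights are all made up for illustration; real frameworks hand this work to optimized GPU libraries instead of Python loops.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as nested lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]

# A batch of 2 examples with 3 input features each (made-up numbers).
inputs = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]

# Hypothetical weights connecting the 3 inputs to 2 neurons.
weights = [[0.5, -1.0],
           [0.25, 0.0],
           [1.0, 2.0]]

# Each output cell is one neuron's weighted sum for one example.
outputs = matmul(inputs, weights)
print(outputs)  # [[4.0, 5.0], [9.25, 8.0]]
```

Every cell of the output depends only on one row of the inputs and one column of the weights, so all cells can be computed independently. That independence is what lets a GPU work on thousands of them at the same time.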
With these three reasons, we have seen why deep learning became popular only recently. Now we will look at where deep learning is mostly used nowadays, going through its applications one by one.
Object Detection: This means localizing and classifying each object in an image. It is one of the most widely used applications of deep learning nowadays, with many use cases. We can apply object detection to traffic monitoring in a metropolitan city, to virtual attendance systems, and in hospitals.
Image generation: This means the neural network generates new images of the same kind as the images it was trained on. In other words, the network learns to mimic its training images and can then produce new images of the same type. This is widely used in photo editor apps on Android and iOS devices. The neural networks typically used for this are GANs (Generative Adversarial Networks). I will walk you through the architecture of GANs in a later article. A more advanced model in this family is CycleGAN, which is generally used for image-to-image translation.
Text to image synthesis: Here we give text as input to the model and it generates an image based on that text. This is an example of the encoder-decoder architecture of deep neural networks. For example, if we give the sentence “Parrot is sitting on tree”, the model will output an image of a parrot sitting on a tree. We will discuss encoder-decoder architectures in more detail in a later article.
Pixel to image: This means generating a realistic picture from a drawn sketch. The picture below illustrates this. It is also one of the most important use cases, and we will discuss it later.
Image captioning: This is one of the most important use cases of deep learning. We give an image to the network, the network understands that image, and it adds a caption to it. For example, if we give an image of a boy using a laptop, the model will output the text “boy using laptop”. This is again an encoder-decoder architecture: the image is encoded by a CNN, and the encoded output is then given to an RNN, which decodes it into text.
Question Answering: This is one of the most important use cases of NLP. We train the model on sequences of questions and answers so that it learns to produce an answer for a new question. Chatbots are the best-known example and are widely used in industry nowadays. Traditionally an RNN is used as both the encoder and the decoder here. There are also more modern architectures for this use case, such as Transformers, which we will discuss later.
These are the most important use cases of neural networks. There are many more, such as image colorization, image inpainting and machine translation. We will try to understand each use case in detail in further articles.
Now we will try to understand the basic architecture of neural networks. Before that, let us see at a high level what a neural network does and what a weight is. To keep it very simple, suppose my exam is tomorrow and we have to predict whether I am going to pass it or not. The desired output y is either 0 (fail the exam) or 1 (pass the exam). What inputs can we think of? The inputs could be “how much I studied”, “how smart I am”, “my previous knowledge” and “my name”. We feed these inputs together with the known outputs to the network, and the network assigns weights to the inputs by itself, based on their importance. As I understand it, the weight for “how much I studied” will be large, because this is an important factor in whether I pass, while the weight for “my name” will be small, because a name does not decide whether a person passes an exam. By training the network on lots of examples of this type, the model arrives at the same judgment a human would make: it gives less importance to the name and more importance to “how much I studied”. This is the basic idea behind weights in neural networks.
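The exam example can be sketched as a single neuron in plain Python. Everything here is an assumption made for illustration: the feature names, the hand-picked weights and bias, and the idea of encoding “my name” as its length. In a real network these weights would be learned from many examples rather than written by hand.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1) so it reads as a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights: large for study time, close to zero for the name.
weights = {
    "hours_studied": 0.8,
    "smartness": 0.5,
    "previous_knowledge": 0.6,
    "name_length": 0.01,
}
bias = -6.0  # hypothetical offset, also chosen by hand

def pass_probability(student):
    """Weighted sum of the inputs plus the bias, passed through a sigmoid."""
    z = bias + sum(weights[name] * student[name] for name in weights)
    return sigmoid(z)

well_prepared = {"hours_studied": 8, "smartness": 5, "previous_knowledge": 4, "name_length": 5}
unprepared = {"hours_studied": 1, "smartness": 5, "previous_knowledge": 1, "name_length": 12}

print(round(pass_probability(well_prepared), 3))
print(round(pass_probability(unprepared), 3))
```

Because the weight on the name is almost zero, changing the name barely moves the prediction, while changing the hours studied moves it a lot.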
With that example in mind, let us look at the basic architecture of neural networks. We can loosely compare a neural network with the human brain: whatever we see is the input, and depending on the context we judge which parts of it are important, what to remember and what to ignore. In a neural network, that judgment is expressed through weights and activation functions. A basic neural network consists of an input layer, weights, biases, activation functions, hidden layers and an output layer. The first layer is the input layer, through which we pass all the inputs to the model. The data then flows through the hidden layers, and after all the calculations there it is passed to the output layer, which makes the prediction. During training, both the inputs and the expected outputs are fed to the network, and the prediction error is used to adjust the weights. This is a very high-level view. We will go deeper into the architecture when we cover supervised, unsupervised and semi-supervised learning in a later article.
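As a rough sketch of how data flows from the input layer through one hidden layer to the output layer, here is a tiny forward pass in plain Python. The layer sizes, the weights and biases, and the choice of ReLU and sigmoid as activation functions are all assumptions made for illustration, not values from a trained model.

```python
import math

def relu(z):
    """Pass positive values through unchanged and clip negative values to zero."""
    return max(0.0, z)

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: weighted sum plus bias, then activation, per neuron."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Hypothetical parameters: 3 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_weights = [[0.5, -0.2, 0.1],
                  [-0.3, 0.8, 0.4]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.5]]
output_biases = [0.2]

def forward(inputs):
    """Input layer -> hidden layer (ReLU) -> output layer (sigmoid)."""
    hidden = dense(inputs, hidden_weights, hidden_biases, relu)
    return dense(hidden, output_weights, output_biases, sigmoid)[0]

prediction = forward([1.0, 2.0, 3.0])
print(prediction)  # a value between 0 and 1
```

Training would compare this prediction with the expected output and nudge the weights and biases to reduce the error; that step is left out of this sketch.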
Input layer: This is the first layer of any neural network. The prepared input is fed into the model through this layer. During training, the corresponding labels are supplied alongside the input.
Hidden layers: These are the middle layers of the neural network, often described as the black box. All the nodes of the input layer are connected to the nodes of the first hidden layer. A network can have multiple hidden layers, and every hidden layer has an activation function associated with it.
Output layer: This is the last layer of the neural network and is responsible for the prediction. Each node of the last hidden layer is connected to the output layer, and the output generated by the hidden layers is passed to the output layer for evaluation. The output layer also has an activation function, which gives the probability of each label.
Activation Function: We can think of this as a kind of threshold that is responsible for activating a neuron. It is applied to the weighted sum of a neuron's inputs and decides whether, and how strongly, that neuron's value is passed on to the next layer. We will look at each activation function in detail, along with its mathematical form and graph, in a later article.
Weight: This is something the model learns during training. When an input is passed to the neural network, the model assigns it a value based on its importance. At a very high level, that value is the weight.
Bias: This is also something the model learns. For example, if we provide temperatures in Celsius as the input and the matching temperatures in Fahrenheit as the output, the model learns the conversion formula from Celsius to Fahrenheit: (x degrees Celsius × 9/5) + 32. In this learned formula, 9/5 is the weight and 32 is the bias. The bias is given an initial value when the model is created and is then adjusted during training. In my next tutorial I will use exactly this use case and explain every step of implementing the conversion using Keras and a fully connected layer, i.e. the Dense layer in Keras.
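To show a weight and a bias actually being learned, here is a minimal sketch of the Celsius to Fahrenheit example using plain Python and gradient descent on a single neuron. This is not the Keras implementation promised for the next tutorial; the sample temperatures, the learning rate, the number of epochs and the input scaling by 100 are all assumptions chosen so this small example converges.

```python
# Training data: Celsius inputs and their Fahrenheit targets.
celsius = [-40.0, -10.0, 0.0, 8.0, 15.0, 22.0, 38.0, 100.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

SCALE = 100.0                        # assumed scaling to keep gradient steps stable
xs = [c / SCALE for c in celsius]

w, b = 0.0, 0.0                      # weight and bias both start at zero
learning_rate = 0.1                  # assumed value
n = len(xs)

for epoch in range(2000):
    # Prediction error of the single neuron y = w * x + b on every example.
    errors = [(w * x + b) - y for x, y in zip(xs, fahrenheit)]
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2 * sum(errors) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

weight_per_degree = w / SCALE        # undo the input scaling
print(round(weight_per_degree, 3), round(b, 3))  # 1.8 32.0

def to_fahrenheit(c):
    """Convert using the learned weight and bias."""
    return weight_per_degree * c + b
```

The model is never told the formula. It only sees pairs of temperatures, and the values it settles on are the 9/5 and 32 from the conversion formula.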
We have now seen when neural networks evolved, what their applications in industry are, what the basic architecture of a neural network looks like at a very high level, and the different terms associated with neural networks. In the next tutorial I will explain step by step how a neural network works and what backpropagation is, along with a programmatic implementation of a neural network using Python and Keras. If you want to understand more about tokenization and word embeddings, you can go through the link below for a step-by-step explanation.
Also, if you are interested in cloud computing, you can go through my blog below for a step-by-step introduction.