Original article was published on Deep Learning on Medium
What is a Neural Network?
An Conceptual look with an applied approach
From Face app to Self driving cars neural networks played a very vital role in AI ( Artificial intelligence ) . But the question is what exactly is a neural network ? Speaking on high level: it’s a complex mathematical function which takes some input and give desired output. Let’s dive into the detail with an example.
Whenever a real estate broker wants to sell any house he will look for the best price with profit at which the house could be sold. The broker itself figures out the price of the house on the basis of some features like: Size, No. of bedrooms and locality, his experience will help in figuring out the optimal price. How neural network will do the same task? Here in human what we called ‘experience’ is analogous to data for the neural network through which neural network will learn to predict the house prices. Let’s assume for now we have only one feature for predicting the price of the house : Size, and generally the increase in the Size of the house is directly proportional to the increase in Price. i.e. As the size of the house increases the price also increases.
Here we can see that the line does not start from origin because ‘Size’ of the house will not start from 0 similarly line will not go below the x axis because prices cannot be negative. Now we can try predicting the house price on the basis of Size.
So now what exactly the neuron is doing ?
For every given input of ‘Size’ the neuron is trying to predict the ‘Price’ for the house , So neuron is just computing the function in which ‘Price’ of the house is related to the ‘Size’ and in this case it is linear (from Fig 1). Now we are left with one question How the neural network figures out the function and it’s values? ( like in case of linear we have to find the slope and intercept of the line). Let’s see this in detail:
Now we are quite sure that only a single feature will not give the optimal prediction of the price i.e. seeing only the size of the house we do not estimate the price for it.
Therefore, Let’s assume the data consist of 3 input features: No. of bedrooms, Size, Locality and a single output i.e. House price. So in training the neural network (for learning to predict the prices: it’s just like you are preparing for the exams using previous year question paper for which the solutions are known ) we will have input-output pairs i.e. for every single House with input of the features (Size, locality, No. of bedrooms) there will be a single output ‘House Price’ . i.e. Given the Size, locality, No. of bedrooms for a house as input and the known output(which is already present in the data) ‘price’ for that house. What is the Mathematical Function best suited for predicting the price as close as the observed price for that house i.e. ‘predicted prices must be as close as the observed prices of the house’. This is exactly what we have to do, we have to minimize the difference between observed price and Predicted Price, and the steps involved is called the ‘Back Propagation steps’. And finding the value of the price for input features of a house using the mathematical function the steps involved is known as ‘Forward Propagation Steps’.
What is the function of the neuron in the hidden layer?
Every neuron in the hidden layer takes all the input features (Size, locality, No. of bedrooms). But it’s not necessary they compute the same thing. So if we consider very 1st neuron, and let input from locality is 0 i.e. we will have only Size(It’s the size of the house) and No. of . Bedrooms for that house. It’s not hard to see that family size can be predicted from those two features, so even though family size is not given in the input features(Data) , it is predicted by the neuron in the hidden layer, similarly, other features can also be extracted.
Similarly, you can easily see output neuron(neuron after the hidden layer) will predict the House price.
So, it’s so far clear on high level what exactly neurons do, now what about hidden layers? So hidden layer helps in extracting features which are not explicitly provided like: ‘Family Size’, therefore, more the number of hidden layer more complex features to be extracted( like: Nose on the face in an image). And which will indeed help in more accurate predictions(extracting more complex features).
The complete process describes as follows:
In training , the network starts with some assumption for the function and it’s values, to predict the house price. For each House It tries to predict the price(forward propagation) and sees how much error it makes (difference between original price and predicted) and try to minimize the error (backward Propagation) So, while predicting the next house price the function would’ve been slightly better Since it is learning from each example , so more the data more it will learn, and will be accurate at predicting, And the function after training is used for test were output(house prices) is not shown to the network ( Just like after the preparation(training) you give the exams(testing) ) . In testing , trained network will be given inputs(size, locality and no. of bedrooms) from the test data and network predicts the House prices which are then compared with the original prices from the test data to see how well the network has performed ( Just like results of the exams ).Then the decision been made either to retrain the network with modifications(like if student failed in the exams) or to deploy in the real world .
Note: Here the test data is the unseen data which neural network have not seen during training. References :
3Blue1Brown Neural networks playlist Youtube
Neural networks and deep learning Course by Andrew Ng