The agent receives rewards for performing correctly and penalties for performing incorrectly. It learns without human intervention by maximizing its reward and minimizing its penalty. For example, an agent playing a game learns from every move whether it was correct or not.

Deep learning also learns from data like other machine learning algorithms, but it does so using neural networks, called Artificial Neural Networks (ANNs). We can use deep learning libraries such as TensorFlow (written by Google), Keras (which runs on top of TensorFlow, the Microsoft Cognitive Toolkit, etc.), and PyTorch (written by Facebook).

Artificial Neural Network:
An artificial neural network is a computing system that is a collection of connected units called neurons, organized into what we call layers. There are three types of layers: the input layer, the hidden layers, and the output layer.

Input layer — One node for each component of the input data.
Hidden layers — Arbitrarily chosen number of nodes for each hidden layer.
Output layer — One node for each of the possible desired outputs.
(Image source: wikipedia.com)
Each small circle in the image above is called a neuron. Each neuron transmits signals to the neurons in the next layer.

Since this network has three nodes in the input layer, each input to this network must have three dimensions, such as favorite color, favorite music, and favorite drink in the example above.

Since this network has two nodes in the output layer, there are two possible outputs for every input that is passed forward (left to right) through the network. For example, Male and Female could be the two output (prediction) classes.

Why have different types of layers?
Neurons in an ANN are organized into layers. Depending on the problem statement, we can use the following types of layers in an ANN.

Dense Layer

Convolutional layers

Pooling layers

Normalization layers

Recurrent layers

Each type of layer is used for a different kind of task. Convolutional layers are commonly used for image classification, while in a dense layer every input is connected to every output of that layer.

Each layer passes information to the next layer, from left to right, through the ANN. Each connection between nodes (neurons) has an associated weight, which is just a number. When an input is passed to a given node, it is multiplied by the weight on that connection and passed on. The node in the next layer then computes its output from the weighted inputs using an activation function. This process continues until the output layer of the ANN is reached.
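The forward pass described above can be sketched in a few lines of plain Python, with no framework required. The weights here are arbitrary illustrative values, not trained ones, and the layer sizes match the example network in this article (three inputs, two outputs):

```python
import math

def sigmoid(x):
    # squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, weights):
    # one layer: each node takes a weighted sum of all inputs,
    # then applies the activation function to that sum
    return [sigmoid(sum(w * x for w, x in zip(node_weights, inputs)))
            for node_weights in weights]

# arbitrary example weights: 3 inputs -> 2 hidden nodes -> 2 output nodes
hidden_weights = [[0.2, -0.5, 0.1],
                  [0.4, 0.3, -0.2]]
output_weights = [[0.7, -0.1],
                  [-0.3, 0.8]]

x = [1.0, 0.5, -1.0]            # e.g. encoded color, music, and drink
hidden = forward(x, hidden_weights)
output = forward(hidden, output_weights)
print(output)                    # two numbers, one per output class
```

Each value in `output` lies between 0 and 1, one per prediction class (Male/Female in the running example).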

next node output = activation(weighted sum)

Activation function: An activation function is a function that maps a node's inputs to its corresponding output.

e.g. Sigmoid, ReLU, etc. These functions are built using different mathematical techniques.
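The two activation functions named above are simple enough to write out directly. Sigmoid maps any input into (0, 1), while ReLU clamps negatives to zero and passes positives through unchanged:

```python
import math

def sigmoid(x):
    # maps any real input to a value strictly between 0 and 1
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # "rectified linear unit": 0 for negative inputs, identity for positive
    return max(0.0, x)

print(sigmoid(0))    # 0.5
print(relu(-3.0))    # 0.0
print(relu(2.5))     # 2.5
```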

Loss function: During the training process, the loss is calculated from the network's output predictions and the true labels for the respective inputs.

Say we assign our labels (Male/Female) the values 0 and 1. If we pass training data to the model whose correct label is Male but the ANN predicts Female, a loss is calculated. Since everything in an ANN is just numbers, the predicted output will also be a number. Say in this pass the ANN gives an output of 0.25, meaning it scored Female as 0.75 and Male as 0.25.

So the error will be:

error = 0.25 − 0.00 = 0.25

A loss function that is commonly used in practice is the mean squared error (MSE).

Mean squared error: We first calculate the difference (the error) between the provided output prediction and the label. We then square this error.

If we pass multiple samples to the model at once (a batch of samples), we take the mean of the squared errors over all of these samples.
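Putting those two steps together, MSE is just a few lines of Python. The first call reproduces the single-sample case from the text (prediction 0.25, true label 0):

```python
def mse(predictions, labels):
    # square each per-sample error, then average over the batch
    errors = [(p - y) ** 2 for p, y in zip(predictions, labels)]
    return sum(errors) / len(errors)

# the single-sample case from the text: predicted 0.25, true label 0 (Male)
print(mse([0.25], [0.0]))                      # 0.0625

# a batch of three samples
print(mse([0.25, 0.9, 0.1], [0.0, 1.0, 0.0]))  # 0.0275
```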

Training an artificial neural network:
In ANN training, our task is to optimize the network weights assigned at model initialization and find accurate weights that map each input to the correct output. This mapping is the most important part of the training process. In each training pass, the ANN tries to map each input to the correct output; when the output does not match, the ANN calculates the loss. The loss is the error between the provided labeled output and the predicted output. During training we need to pass the same data again and again so that the model can calculate the loss and optimize the network weights.

How neural networks learn:
When the model is initialized, the network weights are set to arbitrary values, and after each pass the ANN calculates the loss. After the loss is calculated, the gradient of this loss function is computed with respect to each of the weights in the network. Once we have the value of the gradient of the loss function, we use it to update the model's weights: we multiply the gradient by something called a learning rate. A learning rate is a small number, usually ranging between 0.01 and 0.0001, but the actual value can vary.

new weight = old weight − (learning rate × gradient)
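The update rule above is one line of code. The weight and gradient values here are made-up numbers purely for illustration:

```python
def update_weight(old_weight, gradient, learning_rate=0.01):
    # step the weight against the gradient, scaled by the learning rate
    return old_weight - learning_rate * gradient

# illustrative values: weight 0.5, gradient 2.0, learning rate 0.01
print(update_weight(0.5, 2.0))   # approximately 0.48
```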

After the network gets its new weights, all the old weights are replaced, and the same process repeats again and again until the network achieves the correct labeled output.
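The whole learning loop — predict, compute the loss gradient, update the weight, repeat — can be shown with a deliberately tiny model: a single weight and a squared-error loss. This is a sketch of the idea, not a real network; the input, label, starting weight, and learning rate are all arbitrary:

```python
# one input, one true label; the "model" is just prediction = w * x
x, y = 2.0, 1.0
w = 5.0                    # arbitrary initial weight
learning_rate = 0.05

for step in range(50):
    prediction = w * x
    # analytic gradient of the squared-error loss (w*x - y)**2 w.r.t. w
    gradient = 2 * x * (prediction - y)
    # replace the old weight with the new one
    w = w - learning_rate * gradient

print(w, w * x)            # w converges toward 0.5, prediction toward 1.0
```

After 50 updates the weight has converged to 0.5, so the prediction `w * x` matches the label 1.0, which is exactly the repeat-until-correct process described above.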

Overfitting and underfitting in ANN:
Overfitting: Overfitting occurs when our model becomes very good at classifying or predicting on data that was included in the training set, but not as good at classifying data it wasn't trained on. Essentially, the model has overfit the data in the training set.

Reducing overfitting: Add more training data and use more varied data to train your model. For example, if you classify images, train your model on the same image taken from different angles.

We can also reduce model complexity, for example by removing layers from the network.

Underfitting: A model is said to be underfitting when it is not even able to classify the data it was trained on.

Reducing underfitting: Increase model complexity, for example by adding new layers to the ANN.

Add more features to the feature vectors in the training data so that the model can learn from more information.

Thank you so much 🙂