Deep Learning Fundamentals: Important Concepts

Source: Deep Learning on Medium

Deep learning is a subfield of machine learning that uses algorithms inspired by the structure and function of the brain's neural networks.

Before talking about deep learning, we need to understand the different types of ML algorithms available for different use cases.

We can categorize machine learning into four parts:

1- Supervised learning — Supervised algorithms work with data that is labeled. Supervised learning has two parts: one is called classification and the other is called regression.

Regression — This is used to predict continuous values, such as stock prices or home prices in a specific city. A common algorithm is linear regression. (Despite its name, logistic regression is actually a classification algorithm.)

Classification — This is used to predict discrete classes, such as True/False or Male/Female. Common algorithms are Naive Bayes, Decision Trees, Support Vector Machines (SVM), Random Forest, etc.

e.g. — Suppose you need to do gender classification on the basis of certain feature vectors like favorite color, favorite music, favorite drink, and favorite soft drink, and your data is labeled as M or F. The last column of the data holds the label. Depending on which algorithm you use, the model will learn from the training data and then predict a result for newly supplied data.
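As a sketch of this workflow, here is a minimal supervised-classification example using scikit-learn; the tiny dataset and its integer feature encodings are made up for illustration:

```python
# Supervised classification sketch: learn from labeled rows, predict for new data.
from sklearn.tree import DecisionTreeClassifier

# Each row: [favorite_color, favorite_music, favorite_drink] encoded as integers.
X = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 0],
     [1, 1, 1]]
y = ["M", "F", "M", "F"]  # the labels (the last column of the labeled data)

model = DecisionTreeClassifier()
model.fit(X, y)                    # learn from the labeled training data
print(model.predict([[1, 0, 1]]))  # predict a label for supplied data
```

Any classifier from the list above (Naive Bayes, SVM, Random Forest) could be swapped in with the same `fit`/`predict` pattern.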

2- Unsupervised learning — Unsupervised ML algorithms are used when you don't have labeled data as above. Common algorithms are k-means clustering and association rules.
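A minimal k-means sketch with scikit-learn, using a made-up unlabeled 2-D dataset with two obvious groups:

```python
# Unsupervised learning sketch: k-means groups unlabeled points into clusters.
from sklearn.cluster import KMeans

X = [[1.0, 1.0], [1.2, 0.8],   # two points near (1, 1)
     [8.0, 8.0], [8.2, 7.9]]   # two points near (8, 8)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # cluster index assigned to each point (no labels were given)
```

Note that the algorithm is told only how many clusters to find, never what the "right answer" is.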

3- Semi-supervised learning — In the above two types, the data was either fully labeled or fully unlabeled. Semi-supervised algorithms are used for hybrid data sets where some of the data is labeled and some is not. Typically, this combination contains a very small amount of labeled data and a very large amount of unlabeled data. The basic procedure is that the programmer first clusters similar data using an unsupervised learning algorithm and then uses the existing labeled data to label the rest of the unlabeled data.

4- Reinforcement learning — Reinforcement learning, in the context of artificial intelligence, is a type of dynamic programming that trains algorithms using a system of reward and punishment. A reinforcement learning algorithm, or agent, learns by interacting with its environment.

The agent receives rewards for performing correctly and penalties for performing incorrectly. The agent learns without intervention from a human by maximizing its reward and minimizing its penalty, e.g. playing a game and learning from every move whether it was correct or not.

Deep Learning:

Deep learning also learns from data like other machine learning algorithms, but it does so using neural networks, called Artificial Neural Networks (ANNs). We can use deep learning libraries such as TensorFlow (written by Google), Keras (running on top of TensorFlow, Microsoft Cognitive Toolkit, etc.), and PyTorch (written by Facebook).

Artificial Neural Network:

An artificial neural network is a computing system made up of connected units called neurons that are organized into what we call layers. There are three types of layers: the input layer, the hidden layers, and the output layer.

  1. Input layer — One node for each component of the input data.
  2. Hidden layers — Arbitrarily chosen number of nodes for each hidden layer.
  3. Output layer — One node for each of the possible desired outputs.
[Image: a simple neural network diagram, source: wikipedia.com]

Each little circle in the image above is called a neuron. Each neuron transmits signals to the neurons in the next layer.

Since this network has three nodes in the input layer, each input to this network must have three dimensions, like the earlier example: favorite color, favorite music, and favorite drink.

Since this network has two nodes in the output layer, there are two possible outputs for every input that is passed forward (left to right) through the network. For example, Male and Female could be the two output classes (prediction classes).
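The network just described (3 input nodes, one hidden layer, 2 output nodes) could be sketched in Keras like this; the hidden-layer size of 4 and the activation choices are arbitrary assumptions, not values from the article:

```python
# A minimal Keras sketch of a 3-input, 2-output network.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(3,)),               # input layer: 3 features
    keras.layers.Dense(4, activation="relu"),     # hidden layer (size chosen arbitrarily)
    keras.layers.Dense(2, activation="softmax"),  # output layer: 2 classes, e.g. Male/Female
])
model.summary()
```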

Why have different types of layers?

Neurons in an ANN are organized into layers. The following types of layers can be used in an ANN, depending on the problem statement:

Dense layers

Convolutional layers

Pooling layers

Normalization layers

Recurrent layers

Each type of layer is used for a different kind of task. Convolutional layers are commonly used for image classification, while a dense layer connects each of its inputs to each of its outputs.

Each layer passes information to the next layer, from left to right in the ANN. Each connection between nodes (neurons) has an associated weight, which is just a number. When an input is passed to a given node, the input is multiplied by the weight and passed on. The node in the next layer then computes the passed input using an activation function. This process continues until the output layer of the ANN is reached.

next node output = activation(weighted sum)
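A sketch of what one node computes, with made-up numbers and sigmoid as the activation:

```python
# One node's computation: weighted sum of inputs, then an activation function.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

inputs  = np.array([0.5, 0.1, 0.9])   # outputs from the previous layer
weights = np.array([0.4, -0.2, 0.7])  # one weight per connection

weighted_sum = np.dot(inputs, weights)  # 0.5*0.4 + 0.1*(-0.2) + 0.9*0.7 = 0.81
output = sigmoid(weighted_sum)          # next node output = activation(weighted sum)
print(output)
```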

Activation function: An activation function is a function that maps a node's inputs to its corresponding output.

e.g. — sigmoid, ReLU, etc. These functions are built using different mathematical techniques.
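Minimal sketches of the two activation functions named above:

```python
# Two common activation functions.
import numpy as np

def sigmoid(z):
    # squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # keeps positive values, clamps negatives to 0
    return np.maximum(0.0, z)

print(sigmoid(0.0))           # 0.5
print(relu(-2.0), relu(3.0))  # 0.0 3.0
```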

Loss function: During the training process, the loss is calculated using the network's output predictions and the true labels for the respective inputs.

Let's say we encode our labels (Male/Female) as numbers: Male = 0 and Female = 1. If we pass a training sample whose true label is Male (0), but the ANN does not output exactly 0, a loss is calculated. Since in an ANN everything is just numbers, the predicted output is also a number. Let's say in this pass the ANN gives an output of 0.25, meaning it is 75% confident the sample is Male and 25% confident it is Female.

so the error will be:

error = 0.25 - 0.00 = 0.25

A loss function that is commonly used in practice is the mean squared error (MSE).

Mean Squared Error: We first calculate the difference (the error) between the provided output prediction and the label. We then square this error.

If we pass multiple samples to the model at once (a batch of samples), then we take the mean of the squared errors over all of these samples.
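MSE over a batch can be sketched with NumPy (the predictions and labels here are made up):

```python
# Mean squared error: difference, square, then mean over the batch.
import numpy as np

predictions = np.array([0.25, 0.80, 0.10])
labels      = np.array([0.00, 1.00, 0.00])

errors = predictions - labels  # per-sample error
mse = np.mean(errors ** 2)     # square each error, then average over the batch
print(round(float(mse), 4))    # (0.0625 + 0.04 + 0.01) / 3 = 0.0375
```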

Training an artificial neural network:

When training an ANN, our task is to optimize the network weights assigned at model initialization and find accurate weights that map each input to the correct output. This mapping is the most important part of the training process. In each training pass, the ANN tries to map each input to the correct output, and if the output does not match, the ANN calculates the loss. The loss is the error between the provided labeled output and the predicted output. During training we need to pass the same data again and again so that the model can calculate the loss and optimize the network weights.

How neural networks learn:

When the model is initialized, the network weights are set to arbitrary values, and after each pass the ANN calculates the loss. After the loss is calculated, the gradient of this loss function is computed with respect to each of the weights within the network. Once we have the value of the gradient, we can use it to update the model's weights: we multiply the gradient value by something called a learning rate. A learning rate is a small number, usually ranging between 0.01 and 0.0001, but the actual value can vary.

new weight = old weight - (learning rate * gradient)
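A one-step sketch of this update rule, with made-up numbers:

```python
# Gradient-descent weight update: new weight = old weight - (learning rate * gradient).
learning_rate = 0.01
old_weight = 0.40
gradient = 2.5  # gradient of the loss with respect to this weight

new_weight = old_weight - learning_rate * gradient
print(new_weight)
```

In a real network this update is applied to every weight, once per pass, using that weight's own gradient.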

After the network gets its new weights, all old weights are replaced by the new ones, and the same process repeats again and again until the network achieves the correct labeled output.

Overfitting and underfitting in ANN:

Overfitting: Overfitting occurs when our model becomes really good at being able to classify or predict on data that was included in the training set, but is not as good at classifying data that it wasn’t trained on. So essentially, the model has overfit the data in the training set.

Reducing overfitting: Add more training data and use different types of data to train your model, e.g. if you are classifying images, train your model with different angles of the same image.

We can also reduce model complexity, perhaps by removing layers from the network.

Underfitting: A model is said to be underfitting when it is not even able to classify the data it was trained on.

Reducing underfitting: Increase model complexity, e.g. add new layers to the ANN.

Add more feature vectors to the training data so that the model can learn from more features.

Thank you so much 🙂