Source: Deep Learning on Medium
The “deep” in deep learning
Deep learning is a specific subfield of machine learning: a new take on learning representations from data that puts an emphasis on learning successive layers of increasingly meaningful representations. The "deep" in deep learning isn't a reference to any kind of deeper understanding achieved by the approach; rather, it stands for this idea of successive layers of representations. How many layers contribute to a model of the data is called the depth of the model. Other appropriate names for the field could have been layered representations learning or hierarchical representations learning. For our purposes, deep learning is a mathematical framework for learning representations from data.
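To make the idea of depth concrete, here is a minimal sketch (not taken from any particular library): each "layer" is just a learned transformation applied to the output of the previous one, and the depth of the model is simply how many such layers are stacked.

```python
import numpy as np

def relu(x):
    # A common nonlinearity applied after each layer's transformation
    return np.maximum(0, x)

rng = np.random.default_rng(0)
# Three weight matrices, one per layer; the values here are placeholders
layers = [rng.standard_normal((4, 4)) for _ in range(3)]

def forward(x, layers):
    # Each pass through a layer produces a new representation of the input
    for w in layers:
        x = relu(x @ w)
    return x

depth = len(layers)  # this toy model has a depth of 3
```

The specific sizes and the ReLU nonlinearity are illustrative choices; the point is only that "depth" counts the successive representation layers.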
As you can see in the figure below, the network transforms the digit image into representations that are increasingly different from the original image and increasingly informative about the final result. You can think of a deep network as a multistage information-distillation operation, where information goes through successive filters and comes out increasingly purified (that is, useful with regard to some task).
How does deep learning work?
The specification of what a layer does to its input data is stored in the layer's weights, which in essence are a bunch of numbers (weights are also sometimes called the parameters of a layer). In this context, learning means finding a set of values for the weights of all layers in a network.
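As a rough illustration (a hypothetical dense layer, not any framework's API), the layer's behavior is entirely determined by its weight values, and "learning" amounts to searching for good values for those numbers:

```python
import numpy as np

rng = np.random.default_rng(42)
W = rng.standard_normal((3, 2))  # weight matrix: maps 3 inputs to 2 outputs
b = np.zeros(2)                  # bias vector, also part of the layer's parameters

def dense(x):
    # What this layer "does" to its input is fixed entirely by W and b
    return x @ W + b

n_parameters = W.size + b.size   # learning = finding values for these 8 numbers
```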
To control the output of a neural network, you need to be able to measure how far this output is from what you expected. This is the job of the loss function of the network, also called the objective function.
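A loss function can be as simple as the mean squared error between predictions and targets; the sketch below uses that as one common example (the actual choice of loss depends on the task):

```python
import numpy as np

def mse(y_pred, y_true):
    # Mean squared error: averages the squared distance between
    # the network's output and what we expected
    return np.mean((y_pred - y_true) ** 2)

loss = mse(np.array([0.5, 0.8]), np.array([1.0, 0.0]))
# A larger loss value means the output is further from the target
```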
The fundamental trick in deep learning is to use this score as a feedback signal to adjust the values of the weights a little. This adjustment is the job of the optimizer, which implements what's called the backpropagation algorithm.
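In miniature, here is what one such adjustment looks like for a single weight (a toy one-weight model with a squared-error loss, purely for illustration): the gradient of the loss with respect to the weight, obtained via the chain rule, tells the optimizer which direction to nudge the weight.

```python
# Toy model: y_pred = w * x, loss = (y_pred - y_true) ** 2
w, x, y_true = 0.0, 2.0, 4.0
learning_rate = 0.1

y_pred = w * x
grad = 2 * (y_pred - y_true) * x  # d(loss)/dw, via the chain rule
w = w - learning_rate * grad      # nudge the weight against the gradient
```

After this single step, `w` has moved from 0.0 toward the value (2.0) that would make the prediction match the target.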
Initially, the weights of the network are assigned random values, so the network merely implements a series of random transformations. Naturally, its output is far from what it should ideally be, and the loss score is accordingly very high. But with every example the network processes, the weights are adjusted a little in the correct direction, and the loss score decreases. This is the training loop. A network with a minimal loss is one for which the outputs are as close as they can be to the targets: a trained network.
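The whole training loop can be sketched end to end with a toy one-weight model (an illustrative example, not a real network): the weight starts random, each example nudges it a little in the right direction, and after enough passes the loss is minimal and the model is "trained".

```python
import random

random.seed(0)
w = random.uniform(-1, 1)                     # weight starts at a random value
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # target relationship: y = 3 * x
lr = 0.02

for epoch in range(200):
    for x, y_true in data:
        y_pred = w * x                   # forward pass
        grad = 2 * (y_pred - y_true) * x # gradient of squared error w.r.t. w
        w -= lr * grad                   # small adjustment in the correct direction

# After training, w is very close to 3: the model has learned the relationship
```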