Constructing a simple neural network from a single perceptron

Source: Deep Learning on Medium

By Futoshi

In recent years, deep learning has been showing up everywhere. What is deep learning? It is a family of algorithms that lets machines recognize objects or sounds and reason somewhat like humans.

Deep learning works with a mathematical model called a neural network, built from a very large number of perceptrons. A perceptron is a unit of signal transmission that mimics the mechanism of a neuron in the human brain. In this article, I will show how to construct a simple neural network from a single perceptron and use it to solve a linear regression problem.

This time, the linear regression problem is to fit a straight line to a noisy data set. In general, the least squares method is used for such a fit. Here the model is a neural network, but its principle is almost the same as least squares.

Design a neural network

The neural network for this problem is very simple. One output (Y) corresponds to one input (X). To compute the output, two parameters are needed: a weight (W) and a bias (B). The input is multiplied by the weight and the bias is added, so Y = W·X + B.
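This forward pass can be sketched in a few lines. The numbers below are only illustrative, not values from the article:

```python
import numpy as np

# Forward pass of the single perceptron: multiply the input by the
# weight and add the bias. The values here are only illustrative.
X = np.array([0.0, 1.0, 2.0])   # input data
W, B = 3.0, 5.0                 # weight and bias
Y = W * X + B                   # output of the perceptron
```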

A second quantity is the error (E) between the computed output (Y) and the correct value (T). This error is calculated with the sum-of-squares error function.

The function that measures the error is called the loss function. We must adjust the two parameters, W and B, so that the error approaches zero; updating the parameters in this way is called learning. To update a parameter, we need to know how much the error changes when that parameter changes slightly. In other words, we need the derivative of the error with respect to each parameter.

However, this derivative cannot be computed directly in that form; we need an idea. The chain rule makes the problem easy to solve: it factors dE/dW into dE/dY times dY/dW, and dE/dB into dE/dY times dY/dB, and each factor is simple to compute.
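The chain rule can be sketched numerically. Assuming the sum-of-squares loss E = Σ(Y − T)², the factors are dE/dY = 2(Y − T), dY/dW = X, and dY/dB = 1 (whether the article's loss carries an extra 1/2 factor is an assumption):

```python
import numpy as np

# Gradients of the sum-of-squares error via the chain rule.
X = np.array([0.0, 1.0, 2.0])   # inputs
T = 3.0 * X + 5.0               # correct values
W, B = 1.0, 0.0                 # current parameters (illustrative)

Y = W * X + B                   # forward pass
E = np.sum((Y - T) ** 2)        # sum-of-squares error

dE_dY = 2.0 * (Y - T)           # outer factor of the chain rule
dE_dW = np.sum(dE_dY * X)       # inner factor dY/dW = X
dE_dB = np.sum(dE_dY)           # inner factor dY/dB = 1
```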

With these derivatives, we can update the parameters: each one takes a small step against its own gradient.

“μ” is called the learning rate, and it must be tuned carefully. The parameters are updated repeatedly according to this rule, and the number of updates needed depends on the learning rate: the lower it is, the slower the learning; the higher it is, the greater the risk that the updates overshoot and fail to converge.
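The update rule itself is one line per parameter. A minimal sketch with illustrative numbers (not values from the article):

```python
# Gradient-descent update: each parameter steps against its gradient,
# scaled by the learning rate mu. All values here are illustrative.
mu = 0.01                       # learning rate
W, B = 1.0, 0.0                 # current parameters
dE_dW, dE_dB = -50.0, -42.0     # example gradient values

W = W - mu * dE_dW              # new weight
B = B - mu * dE_dB              # new bias
```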

Write a code

Now we know how the neural network learns, so let’s start writing the program in Python.

First, import the libraries the program needs.

The calculations required to build the neural network are written using only NumPy. I also import the matplotlib library and its pyplot module to show the results as graphs.
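A minimal sketch of those imports:

```python
# numpy handles all the numerical work; matplotlib.pyplot draws the graphs.
import numpy as np
import matplotlib.pyplot as plt
```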

Then, create a data set. As input data (X), 20 samples drawn at random between 0 and 2 are prepared, and Gaussian noise is added. The data set is made to follow “y = 3x + 5”, and these noisy values serve as the correct labels.
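This step might look as follows; the random seed and the noise scale of 0.5 are assumptions, since the article does not state them:

```python
import numpy as np

np.random.seed(0)                       # fixed seed so the sketch is reproducible

n = 20                                  # number of samples
X = np.random.uniform(0, 2, n)          # inputs chosen at random in [0, 2]
noise = np.random.normal(0, 0.5, n)     # Gaussian noise (scale 0.5 is an assumption)
T = 3 * X + 5 + noise                   # correct labels around y = 3x + 5
```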

The goal we want to reach is for the neural network to recover “y = 3x + 5” through learning. Once the data set is created, we set the initial parameters.

W and B are initialized to random values. The learning rate is set to a value below 1; if it gets too close to 1, the updates become unstable, so choose a smaller value. Also prepare a place to save the calculated errors (E_storage).
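A sketch of the initialization (the seed and the exact learning rate are assumptions):

```python
import numpy as np

np.random.seed(1)               # seed only so the sketch is reproducible

W = np.random.rand()            # random initial weight in [0, 1)
B = np.random.rand()            # random initial bias in [0, 1)
mu = 0.01                       # learning rate, kept well below 1
E_storage = []                  # the error is appended here after each update
```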

Everything is ready. The flow of calculation is as follows.

– Calculate the output(Y)

– Calculate the error(E)

– Save the value of E

– Update parameters(W, B)

The learning is repeated 80 times.
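The four steps above, repeated 80 times, can be sketched as a single loop. The seed, the noise scale, and the exact loss convention are assumptions:

```python
import numpy as np

np.random.seed(0)
X = np.random.uniform(0, 2, 20)                 # input data
T = 3 * X + 5 + np.random.normal(0, 0.5, 20)    # correct labels

W, B = np.random.rand(), np.random.rand()       # random initial parameters
mu = 0.01                                       # learning rate
E_storage = []                                  # saved errors

for _ in range(80):                 # learning repeated 80 times
    Y = W * X + B                   # 1. calculate the output
    E = np.sum((Y - T) ** 2)        # 2. calculate the error
    E_storage.append(E)             # 3. save the value of E
    dE_dY = 2 * (Y - T)             # 4. update parameters via the chain rule
    W -= mu * np.sum(dE_dY * X)
    B -= mu * np.sum(dE_dY)
```

After the loop, W and B should sit close to 3 and 5, the slope and intercept used to generate the data.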

When learning is complete, the results are plotted on a graph. Using the adjusted parameters W and B, calculate the predicted straight line.

And then write the code to draw a graph. Draw the data set and the predicted line on the same graph.

The error values are also drawn on the graph. The value is close to zero if learning is successful.
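Put together, the plotting step might look like this; the figure layout, labels, and the illustrative stand-in values are assumptions:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")           # render off-screen; remove this line to open a window
import matplotlib.pyplot as plt

# Illustrative stand-ins for the trained results.
np.random.seed(0)
X = np.random.uniform(0, 2, 20)
T = 3 * X + 5 + np.random.normal(0, 0.5, 20)
W, B = 3.0, 5.0                              # assumed near-final parameters
E_storage = list(np.geomspace(900, 5, 80))   # assumed decaying error curve

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Left: the data set and the predicted straight line on the same axes.
ax1.scatter(X, T, label="data set")
line_x = np.array([0.0, 2.0])
ax1.plot(line_x, W * line_x + B, color="red", label="predicted line")
ax1.legend()

# Right: the saved errors, which approach zero if learning succeeded.
ax2.plot(E_storage)
ax2.set_title("loss")

fig.savefig("result.png")
```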

We finished writing the code. Run this program!

Two graphs were output. The left one shows the prediction, and the other shows the loss function. The straight line is not exactly “y = 3x + 5”; it may need more learning. Still, the program roughly succeeded: the algorithm worked as the theory predicts.

The graph produced by the program is a static image, but here it is animated to make the learning process easier to follow.

In this article, we predicted a straight line using a simple neural network. Out in the world, there are many deep learning systems that can recognize and reason about images and language. The neural network models behind deep learning are built from countless combinations of this simple perceptron. We have taken the first step toward understanding how a machine thinks and arrives at an answer.