Multi-Layer Perceptron (MLP) Lightly Explained

Source: Deep Learning on Medium


Before we get to the MLP, let’s review what a perceptron is. The perceptron is an interesting computational unit to talk about. It is loosely modeled on the neurons in our brain and is supposed to function like one. Below is a simplified perceptron with one binary input (X can be 0 or 1).

A simple perceptron with one binary input that outputs the same binary bit.

This perceptron computes the identity function. Notice the threshold 1 inside the circle. We can also throw in a weight and turn it into a NOT gate.

Another simple perceptron with one binary input and it is a NOT gate.
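To make the pictures concrete, here is a minimal sketch of a perceptron in Python, assuming the convention in the figures that the unit fires (outputs 1) when the weighted sum reaches the threshold. The specific weights and thresholds for the identity and NOT gates are illustrative choices consistent with the diagrams, not code from the original article:

```python
def perceptron(inputs, weights, threshold):
    """Fire (output 1) when the weighted sum of inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Identity: weight 1, threshold 1 -> the output equals the binary input.
identity = [perceptron([x], [1], 1) for x in (0, 1)]

# NOT gate: weight -1, threshold 0 -> the output flips the binary input.
not_gate = [perceptron([x], [-1], 0) for x in (0, 1)]

print(identity, not_gate)  # [0, 1] [1, 0]
```

With input 1, the identity unit's weighted sum is 1, which meets the threshold of 1, so it fires; the NOT unit's sum is -1, below its threshold of 0, so it stays silent.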

If we have more than one input, the perceptron gets a lot more interesting. Below is an example of the perceptron modeling the AND gate. With a few tweaks, it can also model the OR gate.

A not so simple perceptron with two binary inputs and it happens to be an AND gate.

(Assume the weight is 1 unless there’s a number specifically drawn on the line.)
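Following the same firing convention as before, the AND and OR gates differ only in their thresholds: AND requires both inputs to contribute, OR just one. The weight and threshold values below are illustrative, matching the figure's convention of unit weights:

```python
def perceptron(inputs, weights, threshold):
    """Fire (output 1) when the weighted sum of inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# AND: both weights 1, threshold 2 -> fires only when both inputs are 1.
and_outputs = [perceptron([x, y], [1, 1], 2) for x, y in pairs]

# OR: same weights, threshold 1 -> fires when at least one input is 1.
or_outputs = [perceptron([x, y], [1, 1], 1) for x, y in pairs]

print(and_outputs, or_outputs)  # [0, 0, 0, 1] [0, 1, 1, 1]
```

Lowering the threshold from 2 to 1 is the "few tweaks" that turn the AND gate into the OR gate.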

A perceptron can model Boolean operators, though not all of them: there will always be the beloved XOR gate.

Why boolean gates?

For simplicity of demonstration, we used binary inputs and outputs in the examples above. We made ourselves some nice classifiers, linear classifiers specifically, with the perceptron unit. That also explains why a single perceptron cannot model the XOR gate. As shown below, if we lay out the combinations of X and Y in 2D space, we can draw a line that perfectly separates the True and False values for the AND gate, but not for the XOR gate.

AND gate and XOR gate.

When working with booleans, perceptrons are linear classifiers.
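We can sanity-check the XOR claim empirically. The sketch below brute-forces a grid of candidate weights and thresholds (an arbitrary grid chosen for illustration) and confirms that no single perceptron in it reproduces the XOR truth table, while the same search does find the AND gate:

```python
import itertools

def perceptron(inputs, weights, threshold):
    """Fire (output 1) when the weighted sum of inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def solvable(table, candidates):
    """Return True if some (w1, w2, threshold) triple matches the truth table."""
    return any(
        all(perceptron([x, y], [w1, w2], t) == out for (x, y), out in table.items())
        for w1, w2, t in itertools.product(candidates, repeat=3)
    )

candidates = [-2, -1, -0.5, 0, 0.5, 1, 2]
and_table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

and_found = solvable(and_table, candidates)  # True: AND is linearly separable
xor_found = solvable(xor_table, candidates)  # False: XOR is not
print(and_found, xor_found)
```

The grid search is only suggestive, of course; the real argument is geometric: no single line can put (0,1) and (1,0) on one side and (0,0) and (1,1) on the other.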

Real-Valued Inputs

Fortunately and unfortunately, in the machine learning realm we are most likely working with real-valued inputs. Perceptrons are still linear classifiers, and they still model lines (and, in higher dimensions, hyperplanes).

A perceptron with two real-valued inputs, and it is still a linear classifier.
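Nothing in the perceptron changes when the inputs become real numbers; the unit still asks which side of a line a point falls on. The sketch below uses an illustrative boundary, the line x + y = 1 (weights 1 and 1, threshold 1), and some arbitrary sample points:

```python
def perceptron(inputs, weights, threshold):
    """Fire (output 1) when the weighted sum of inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Decision boundary: 1*x + 1*y = 1, i.e. the line y = 1 - x.
points = [(0.2, 0.3), (0.9, 0.8), (0.5, 0.5), (0.1, 0.95)]
sides = [perceptron([x, y], [1, 1], 1) for x, y in points]
print(sides)  # [0, 1, 1, 1]
```

Points whose coordinates sum to at least 1 land on the firing side of the boundary; the rest land on the silent side.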

We are ready for the MLP.