Neural Network: the XOR Challenge

Original article was published on Artificial Intelligence on Medium


Before getting into what this challenge is about, let me tell you that this problem stalled the entire field of artificial intelligence research for about 15 years, and it is where the phrase ‘the cold and dark winter of artificial intelligence research’ came from. Let’s clear it up…

Consider the two graphs given below, where the pink and blue spheres represent two different classes of data.

Now, to solve the classification problem, we need a way to separate the pink spheres from the blue ones. Before moving on, think about how you would do it.

Yes, a single line separates the first graph’s data quite effectively, but what about the second one? Sure, we can draw two lines and separate the classes, but is that boundary linear? No, right!

Using a binary classifier like a single artificial neuron, it is impossible to separate the data in the second graph with a linear function. This is where the XOR challenge came up.

XOR Challenge:

1. A single artificial neuron can only learn linearly separable functions.

2. XOR: Can’t be separated by a single line.
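To see both points in action, here is a minimal pure-Python sketch (the function names are my own, not from the original article): a classic perceptron learning rule trained on AND, which is linearly separable, and on XOR, which is not. The perceptron converges to perfect accuracy on AND but can never get all four XOR cases right, since a single line classifies at most three of them correctly.

```python
def perceptron_train(data, epochs=25, lr=0.1):
    """Train a single artificial neuron (linear threshold unit)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred                # perceptron update rule
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def accuracy(data, w, b):
    hits = sum(
        (1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == t
        for (x1, x2), t in data
    )
    return hits / len(data)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print(accuracy(AND, *perceptron_train(AND)))  # reaches 1.0 — separable
print(accuracy(XOR, *perceptron_train(XOR)))  # stays below 1.0 — no line works
```

However long you train, the XOR accuracy never reaches 100%; the AND case converges within a few epochs.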

When AI researchers found this out, they realized that all their work on artificial intelligence would be wasted if they couldn’t solve this simple classification problem. The solution they came up with was to introduce non-linearity into the network, or model, being trained.

The entire concept of artificial intelligence is to mimic our brain and work the way it does. So researchers again took inspiration from the way neurons pass information in our bodies: the output of one neuron acts as the input of the next. Following the same idea, artificial neurons were connected into a network of neurons, i.e. a neural network, as shown below.

All the layers between the input and output layers are hidden layers. Each perceptron (neuron) receives as input the sum of the outputs of the previous layer, each multiplied by a weight.

This is how the system became non-linear, so a neural network can now learn non-linearly separable functions too. This was another milestone for the researchers, and it is what led to the high demand for neural networks today.
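As a concrete illustration of the idea above, here is a tiny hand-wired 2–2–1 network (the weights are my own choice, not from the article) where each neuron computes a weighted sum of its inputs and fires if the sum crosses a threshold. One hidden neuron acts like OR, the other like AND, and the output neuron combines them into XOR — something no single neuron could do alone.

```python
def step(z):
    """Threshold activation: the neuron fires (1) or it doesn't (0)."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)   # hidden neuron ~ OR(x1, x2)
    h2 = step(x1 + x2 - 1.5)   # hidden neuron ~ AND(x1, x2)
    return step(h1 - h2 - 0.5)  # output ~ OR and not AND = XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))  # prints 0, 1, 1, 0
```

The hidden layer carves the plane into regions with two lines, and the output neuron picks the region in between — exactly the trick a single neuron cannot perform.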

How to make your model non-linear?

We can make the model non-linear by ensuring that the activation function is non-linear. The activation function determines whether a neuron fires or not.
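Why does the activation function have to be non-linear? Because stacking purely linear layers buys you nothing: they collapse into one linear layer. Here is a small sketch (the matrices and helper names are illustrative, not from the article) showing that two linear layers equal a single linear layer, while inserting a ReLU between them breaks that equivalence.

```python
def linear(W, b, x):
    """One linear layer: y_i = sum_j W[i][j] * x[j] + b[i]."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def relu(v):
    """A simple non-linear activation."""
    return [max(0.0, vi) for vi in v]

W1, b1 = [[1.0, 2.0], [3.0, 4.0]], [0.5, -0.5]
W2, b2 = [[2.0, -1.0], [0.0, 1.0]], [1.0, 0.0]

# Collapse the two layers algebraically: W = W2·W1, b = W2·b1 + b2
W = [[sum(W2[i][k] * W1[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
b = [sum(W2[i][k] * b1[k] for k in range(2)) + b2[i] for i in range(2)]

x = [1.0, -1.0]
stacked = linear(W2, b2, linear(W1, b1, x))            # two linear layers
collapsed = linear(W, b, x)                            # one equivalent layer
with_relu = linear(W2, b2, relu(linear(W1, b1, x)))    # non-linearity added

print(stacked == collapsed)   # True: linear ∘ linear is still linear
print(with_relu == collapsed) # False: ReLU made the model genuinely non-linear
```

Without the non-linear activation, a deep network is just one big linear function, and we are back to the XOR problem.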