Understanding the Universal Approximation Theorem


Many Input Variables

Let’s extend this approach to the case of many input variables. This may sound complicated, but all the ideas we need can be understood in the case of just two inputs. So let’s address the two-input case.

Here, we have inputs x and y, with corresponding weights w1 and w2, and a bias b on the neuron. Let’s set the weight w2 to 0, and then play around with the first weight, w1, and the bias, b, to see how they affect the output from the neuron:

- w1 = 0, w2 = 0, b = 0
- w1 = 3, w2 = 0, b = 0
- w1 = -7, w2 = 0, b = 0
- w1 = 1, w2 = 0, b = 1

As you can see, with w2=0 the input y makes no difference to the output from the neuron. It’s as though x is the only input.
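If you'd like to experiment without the interactive graphs, here is a minimal sketch of the two-input sigmoid neuron described above (plain NumPy; the function names `sigmoid` and `neuron` are my own choices, not from the original figures):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, y, w1, w2, b):
    """Output of a single sigmoid neuron with inputs x and y."""
    return sigmoid(w1 * x + w2 * y + b)

# With w2 = 0, the value of y makes no difference to the output:
xs = np.linspace(-1.0, 1.0, 5)
print(neuron(xs, 0.0, w1=3.0, w2=0.0, b=0.0))
print(neuron(xs, 0.5, w1=3.0, w2=0.0, b=0.0))   # identical to the line above
```

Plugging in the four settings listed above reproduces the behaviour of the interactive plots.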

Let’s change the bias as well and see how the graph responds.

As the input weight w1 gets larger, the output approaches a step function. The difference is that now the step function lives in three dimensions. As before, we can move the location of the step around by modifying the bias; the step occurs at sx ≡ −b/w1.
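As a quick numerical check of that formula (again a sketch, with illustrative values for w1 and b), the transition sharpens as w1 grows but stays centred at x = −b/w1:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

b = 3.0
for w1 in (5.0, 50.0, 500.0):
    s_x = -b / w1                               # predicted location of the step
    xs = np.array([s_x - 0.01, s_x, s_x + 0.01])
    print(w1, s_x, sigmoid(w1 * xs + b))        # exactly 0.5 at s_x, sharper on either side as w1 grows
```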

We can use the step functions we’ve just constructed to compute a three-dimensional bump function. To do this, we use two neurons, each computing a step function in the x-direction. Then we combine those step functions with weights h and −h, respectively, where h is the desired height of the bump. It’s all illustrated in the diagram below.
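If you prefer code to pictures, here is a rough sketch of the same construction (the large weight w, the step points s1 and s2, and the function name are illustrative choices, not taken from the article):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bump_x(x, s1, s2, h, w=1000.0):
    """Approximate bump of height h for s1 <= x <= s2, built from two
    near-step sigmoid neurons combined with output weights h and -h."""
    step_up   = sigmoid(w * (x - s1))   # neuron 1: weight w, bias -w*s1
    step_down = sigmoid(w * (x - s2))   # neuron 2: weight w, bias -w*s2
    return h * step_up - h * step_down

print(bump_x(0.2, s1=0.4, s2=0.6, h=0.3))   # outside the bump: ~0.0
print(bump_x(0.5, s1=0.4, s2=0.6, h=0.3))   # inside the bump:  ~0.3
```

Because the weights on the y input are 0, the same value comes out for every y, which gives the ridge-shaped surface described above.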

We’ve figured out how to make a bump function in the x-direction. Of course, we can just as easily make a bump function in the y-direction, by using two step functions in the y-direction. Recall that we do this by making the weight on the y input large, and the weight on the x input 0. Here’s the result:

This looks nearly identical to the earlier network!

Let’s consider what happens when we add up two bump functions, one in the x-direction, the other in the y-direction, both of height h:
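Here is a rough numerical version of that sum (reusing the illustrative bump sketch from above); notice the value 2h where the two bumps overlap and h along each ridge:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bump(t, s1, s2, h, w=1000.0):
    """Height-h bump between s1 and s2 along a single coordinate t."""
    return h * sigmoid(w * (t - s1)) - h * sigmoid(w * (t - s2))

def two_bumps(x, y, h=0.3):
    """Sum of an x-direction bump and a y-direction bump, both of height h."""
    return bump(x, 0.4, 0.6, h) + bump(y, 0.4, 0.6, h)

# Sample the surface on a coarse grid to see its structure:
xs = ys = np.linspace(0.0, 1.0, 5)
X, Y = np.meshgrid(xs, ys)
print(np.round(two_bumps(X, Y), 2))
```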

We can continue like this to build up an approximation to any continuous function. But we will end the discussion here; I’ll point you to some links at the bottom so you can play with these neural networks yourself.