An Intro to Neural Networks: HOW does a computer know what a handbag or number is?

Source: Artificial Intelligence on Medium

How do computers convert human-written numbers into computer-readable numbers?

This week, I went to an event run by IBM called the Role of AI in Retail. Again, we classified different brands of handbags. I wrote a blog post about this previously, since I had gone to a similar event a few weeks ago. After that event, I went on YouTube and rewatched some introductory videos on neural networks that I had seen before.

I found this video from the YouTuber 3Blue1Brown to be particularly helpful.

He explains in the video that the usual starting point for learning to build a neural network is to handwrite the digits 0–9 and have a computer recognize them. He used an analogy that works well for me as a programmer: this is the “Hello World” of neural networks. Printing “Hello World” is usually the first thing a software engineer does when learning a new language, to get a sense of its syntax.

Rather than talk about recognizing different types of handbags, which is a lot more complex, I’m going to talk about recognizing handwritten numbers, just because it’s easier for me to wrap my head around.

A neural network is loosely modeled on the neurons in our brains and is designed to work in a similar way. You have some neurons, and activating those neurons fires off the next set of neurons.
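Here is a minimal sketch of one artificial neuron being “fired” by the neurons before it. The weights, the bias, and the choice of a sigmoid squashing function are all assumptions I picked for illustration. A real network learns its weights from examples.

```python
import math

def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_activation(inputs, weights, bias):
    """Weighted sum of the incoming activations, squashed by sigmoid."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

# Three upstream neurons: two firing strongly, one barely firing.
upstream = [0.9, 0.8, 0.1]
# Hypothetical weights: this neuron listens mostly to the first two inputs.
weights = [2.0, 2.0, -1.0]
bias = -2.0

activation = neuron_activation(upstream, weights, bias)
print(f"activation: {activation:.3f}")
```

The closer the result is to 1, the more strongly this neuron fires and passes its signal on to the next set of neurons.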

Between the input neurons and the final result, there are some layers of neurons. These neurons recognize the patterns in written numbers and narrow down the possibilities until the computer comes to a conclusion as to what it thinks the written number is. It does this by breaking the written number down into small pieces.
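To make the idea of layers a bit more concrete, here is a rough sketch of an image passing through a stack of layers. The layer sizes (784 input pixels for a 28x28 image, two middle layers of 16, and 10 outputs for the digits 0–9) are an assumption for illustration, and the weights are random and untrained, so the output means nothing yet. It only shows how the numbers flow.

```python
import math
import random

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_inputs, n_outputs, rng):
    """Random, untrained weights (scaled down to stay small) and zero biases."""
    scale = 1.0 / math.sqrt(n_inputs)
    weights = [[rng.uniform(-1, 1) * scale for _ in range(n_inputs)]
               for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases

def forward(pixels, layers):
    """Feed the activations through each layer in turn."""
    activations = pixels
    for weights, biases in layers:
        activations = [
            sigmoid(sum(a * w for a, w in zip(activations, row)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

rng = random.Random(0)
LAYER_SIZES = [784, 16, 16, 10]  # assumed sizes, not from any specific model
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]

blank_image = [0.0] * 784  # every pixel off
outputs = forward(blank_image, layers)
print(len(outputs))  # one value per digit 0-9
```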

I’ll start off with what I think may be the easiest for a computer to recognize, the number 1. The number 1 is just one straight line. Okay, so that’s easy for a computer to recognize. Just look for straight vertical lines and assume those are 1s. Now, there are a lot of numbers that also have a straight vertical line, like 4, 7 and 9.
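As a toy version of “look for straight vertical lines”, here is a hand-written check on a tiny 5x5 grid of pixels. This is my own simplification: a real neural network learns its detectors from examples instead of having them coded by hand. The `min_fraction` threshold is a made-up parameter.

```python
def longest_vertical_run(grid):
    """Length of the longest unbroken run of inked pixels in any column."""
    best = 0
    for col in range(len(grid[0])):
        run = 0
        for row in grid:
            run = run + 1 if row[col] else 0
            best = max(best, run)
    return best

def has_vertical_line(grid, min_fraction=0.6):
    """True if some column has a run covering at least min_fraction of the height."""
    return longest_vertical_run(grid) >= min_fraction * len(grid)

ONE = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

DASH = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

print(has_vertical_line(ONE))
print(has_vertical_line(DASH))
```

A check like this would also fire on any other shape with a long vertical stroke, which is exactly the problem with relying on the line alone.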

Let’s look at what exactly makes a 9 a 9. Sure, a 9 has a line on the right side, but there is one big difference between a 9 and a 1, of course: a 9 has a little circle on top. So how can a computer recognize a circle? ⭕️

I would break that circle down into 8 different parts, sort of like a clock. Starting from the top, it has a small curve at the top. On the top right, it has a small curve, angled to the top right. On the right, it has a small curve on the right. Rinse and repeat, and now you have a bunch of little curves that the computer can sort of recognize. The computer can put all of these little curves together and assume that hey, this looks like a circle.
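Here is one crude way to sketch that clock idea in code: split the space around the middle of the drawing into 8 slices and check that every slice has some ink in it. This is a hand-rolled heuristic I made up for illustration, not how a trained network actually does it, and it has obvious holes (a filled-in blob would pass too).

```python
import math

def sectors_with_ink(grid, n_sectors=8):
    """Return which angular slices around the ink's center contain ink."""
    points = [(r, c) for r, row in enumerate(grid)
              for c, value in enumerate(row) if value]
    if not points:
        return set()
    center_row = sum(r for r, _ in points) / len(points)
    center_col = sum(c for _, c in points) / len(points)
    slice_width = 2 * math.pi / n_sectors
    found = set()
    for r, c in points:
        if r == center_row and c == center_col:
            continue  # a pixel exactly at the center has no direction
        angle = math.atan2(r - center_row, c - center_col) % (2 * math.pi)
        found.add(int(angle / slice_width) % n_sectors)
    return found

def looks_like_circle(grid, n_sectors=8):
    """True if every slice around the center has at least one inked pixel."""
    return len(sectors_with_ink(grid, n_sectors)) == n_sectors

RING = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
]

LINE = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

print(looks_like_circle(RING))
print(looks_like_circle(LINE))
```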

Now let’s go back to the difference between a 1 and a 9. Suppose for the moment that only 1s and 9s exist. If a computer sees a straight line, all it can assume so far is that the number is either a 1 or a 9. This uncertainty about whether the written number is a 1 or a 9 could be what one of the middle layers of the neural network represents. But say you wrote a 9 on your trackpad. The computer sees a circle, and it also sees a straight line connected to it. The computer puts the two together and voila, now you have a 9.


Now let’s move on further. There are a few numbers that have circles in them: 0, 6, 8, 9.

If you only write/draw a circle, then the computer can probably deduce that it’s a 0. If you draw a circle on top of another circle, then that’s probably an 8.
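Putting the pieces together, the reasoning so far can be written as a few plain rules. The function below is a hypothetical sketch that assumes some earlier step has already counted the circles and checked for a line. A real network would learn these combinations on its own instead of following hard-coded rules.

```python
def guess_digit(circles: int, has_line: bool) -> str:
    """Combine simple features into a guess, following the rules described above."""
    if circles == 2:
        return "8"  # a circle on top of another circle
    if circles == 1 and has_line:
        # Treated as a 9 here; telling a 9 from a 6 would also need
        # to know where the circle sits relative to the line.
        return "9"
    if circles == 1:
        return "0"  # just a circle
    if has_line:
        return "1"  # just a straight line
    return "unknown"

print(guess_digit(circles=0, has_line=True))
print(guess_digit(circles=1, has_line=True))
print(guess_digit(circles=1, has_line=False))
print(guess_digit(circles=2, has_line=False))
```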

When it comes down to it, it really is just breaking the number down into little details so that the computer can digest what you write and then put all the Lego pieces together.

There’s one more thing that I should mention, though. Humans don’t exactly have perfect handwriting. Heck, even I, as a human, can’t read my doctor’s prescription. It’s gibberish. To account for this, there is something called a confidence score at the end, or output, of the neural network. Essentially, it’s a percentage or a number from 0.00–1.00, which is the computer’s way of saying “I think this is a 9, but I’m not too sure. It could also be a 4, depending on how you look at it.” So the computer gives you a number for each candidate, which is its way of saying how much it thinks this is a 9 or a 4. The confidence score just says how strongly the computer believes that what it’s returning is true.
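One common way to turn a network’s raw output numbers into confidence scores between 0 and 1 is a function called softmax. I’m assuming softmax here for illustration. Other networks produce their scores differently, and the raw scores below are made up to mimic a drawing that looks like a 9 but a bit like a 4.

```python
import math

def softmax(scores):
    """Turn raw scores into positive values that add up to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs, one per digit 0-9.
raw_scores = [0.1, 0.3, 0.2, 0.0, 2.1, 0.4, 0.2, 0.5, 0.3, 2.6]

confidences = softmax(raw_scores)
best_digit = max(range(10), key=lambda d: confidences[d])

for digit, confidence in enumerate(confidences):
    print(f"{digit}: {confidence:.2f}")
print(f"best guess: {best_digit}")
```

The digit with the highest score is the computer’s answer, and the runner-up tells you what it almost picked instead.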

I hope that demystifies things a bit, and that it all seems more “artificial” than “intelligent.”

Harry Potter is one of my favorite movies so have fun coding and learning about neural networks!