Nowadays, the buzz is around these magical tools called “machine learning” and “deep learning”. But to the average non-compsci human, these magical tools are just that, magic.
Well, in this post (and possibly subsequent posts), I will ATTEMPT to explain what these tools are and reveal the trickery behind all this. I will also discuss the theory behind popular machine learning algorithms in ways which hopefully a non-compsci person may understand.
What is machine learning?
Pattern recognition. That’s it. Ignore all the fancy math and Greek alphabet that computer scientists use to make themselves seem smart. Fundamentally, (most) machine learning is pattern recognition.
Consider this example. You’re a baby. Your task is to fit wooden blocks into the correctly shaped slots. You start off by picking up the square block and sticking it into the circular hole. Doesn’t work. You try a few more times and you find that the square block fits into the square hole. Congratulations, you just learned. Now, every time you see a square block, you know to fit it into a square hole.
In a machine (i.e. software) context, say a user receives several emails all titled something like this: “Viagra 50% off only today!!!” Because this user, Scott, is a man of potent youth, he has no need for these emails and therefore, deletes these emails without opening them. Now, Gmail knows that Scott does not need Viagra and automatically sends these emails to his Spam folder.
These are very simple examples of machine learning in use: recognizing a pattern such as which block shape fits into a hole and which emails are spam. There are obviously other more complicated applications, such as image recognition and Russian bots influencing the 2016 US Election, but let’s stick with the basics for now.
Types of machine learning
There are 3 main types of machine learning:
- Supervised learning
- Unsupervised learning
- Reinforcement learning
The examples I described earlier are supervised learning. In supervised learning, the machine learns from existing examples that come with the right answer attached. So, Gmail learnt what words (e.g. “Viagra”) to flag as spam when Scott provided some examples of what he considered to be spam (i.e. emails containing the word “Viagra”). The baby learnt from her own examples that square blocks only fit into square holes, and not circular holes.
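To make the spam example a bit more concrete, here is a toy sketch in Python. It is definitely not how Gmail actually works; the four example emails, the "spam"/"ham" labels, and the function names `train` and `predict` are all made up for illustration. The idea is only this: count which words showed up in emails Scott marked as spam, then use those counts to guess the label of a new email.

```python
from collections import Counter

def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Guess a label by checking which class has seen these words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"

# Hypothetical labeled examples ("ham" = not spam). Scott's deletions are the labels.
examples = [
    ("viagra 50% off only today", "spam"),
    ("cheap viagra discount today", "spam"),
    ("lunch meeting moved to noon", "ham"),
    ("notes from the team meeting", "ham"),
]

model = train(examples)
print(predict(model, "viagra discount inside"))  # spam
print(predict(model, "team lunch at noon"))      # ham
```

The "supervised" part is the second item in each pair: someone had to supply the right answer for every example before the machine could learn anything.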
Unsupervised learning, on the other hand, doesn’t need examples with the answers attached. Consider giving the baby a bunch of square and circular blocks (and not telling the baby the difference between them). Now, you ask the baby to separate the blocks into 2 categories. A dumb baby would not understand what you’re saying. A smart baby, on the contrary, would understand what you’re saying, but refuse to do so and just throw the blocks around. Luckily, machine learning (generally) is smarter than this baby and would be able to identify that these blocks have different shapes. By sorting these blocks by their shapes, our machine would be able to identify 2 separate categories (i.e. square and circle) for these blocks. That is unsupervised learning: identifying patterns without you having to explain what the pattern is.
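Here is a rough sketch of the block-sorting idea, using a stripped-down version of a clustering method called k-means with 2 groups. I'm assuming each block has been measured with a single made-up "roundness" score (about 0.79 for a square, 1.0 for a perfect circle); the measurements and the function name `two_means` are hypothetical. Notice that nobody tells the code which block is which.

```python
def two_means(values, steps=10):
    """Split 1-D values into two groups with a tiny k-means (k = 2)."""
    centers = [min(values), max(values)]  # simple deterministic starting guesses
    groups = [[], []]
    for _ in range(steps):
        groups = [[], []]
        for v in values:
            # Put each value in the group whose center is closest.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the average of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

# Hypothetical roundness measurements for six unlabeled blocks.
roundness = [0.79, 0.78, 1.0, 0.99, 0.80, 0.98]
low_group, high_group = two_means(roundness)
print(low_group)
print(high_group)
```

The code ends up with two piles, but it has no idea that one pile is called "square" and the other "circle". Naming the piles is still your job.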
Now, reinforcement learning is kind of different from the other two. Reinforcement learning learns from trials to optimize a reward. Example?
It’s a Friday night and you’re lonely because you have no friends, so you decide to download Tinder. Lucky for you, you have devilish good looks, so after swiping on 300 girls, you finally match with 5. Now most guys would consider getting these matches the “reward” already, but no, we’re ambitious. We want a phone number.
Because you’re a socially awkward pickle, you decide to start with a simple “Hi” for your first girl, even though Tinder told you to say something funny. Guess what? She doesn’t respond. Reward = 0.
Now that you know you’re not handsome enough to simply start with a “Hi”, you google a pick-up line and send it to the second girl. “You want to play shark attack?” She replies, “What’s that?” Reward +1. “I eat and you scream.” “Bye. Filing a restraining order on you.” Well, at least she replied. Your total reward = 1.
So you’re making some progress, trying out different things to say and slowly increasing the reward you get. Eventually, you arrive at the optimized technique:
- Start off with a funny, yet respectful joke
- She replies → Reward +1
- Ask about her bio (so she knows you care about her personality)
- She likes getting attention → Reward +1
- Invite her to a public and safe spot
- She agrees because she’s bored and lonely too → Reward +3
- Ask for her phone number to stay in touch
- She gives you her real phone number → Reward +6941
Now that you’ve figured out the secret to optimizing your reward (i.e. getting closer to her phone number), you decide to follow this procedure every time. THAT is reinforcement learning. Learning from your mistakes, adjusting, and ultimately getting that maximized reward of your dreams.
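If you want to see the trial-and-error loop as code, here is a minimal sketch using a simple strategy called epsilon-greedy: most of the time you use the opener that has worked best so far, and a small fraction of the time (10% here) you try a random one just in case. The reward numbers come straight from the story above (0 for “Hi”, 1 for the shark line, 1 + 1 + 3 + 6941 for the full routine); the function name `learn_best_opener` and the settings are my own assumptions. Real rewards would be noisy, which is why the code tracks a running average.

```python
import random

def learn_best_opener(rewards, rounds=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy: mostly reuse the best-known opener, sometimes explore."""
    rng = random.Random(seed)
    openers = list(rewards)
    tries = {o: 0 for o in openers}
    average = {o: 0.0 for o in openers}
    for _ in range(rounds):
        if rng.random() < epsilon:
            choice = rng.choice(openers)            # explore: try something random
        else:
            choice = max(openers, key=average.get)  # exploit: best average so far
        reward = rewards[choice]
        tries[choice] += 1
        # Update the running average reward for the opener we just tried.
        average[choice] += (reward - average[choice]) / tries[choice]
    return max(openers, key=average.get), tries

# Rewards taken from the story; in real life these would vary from match to match.
story_rewards = {
    "hi": 0,
    "shark attack": 1,
    "funny respectful joke": 1 + 1 + 3 + 6941,
}

best, tries = learn_best_opener(story_rewards)
print(best)
print(tries)
```

The 10% of random tries is what saves you from sending “Hi” forever: without it, you would never discover that something better exists.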
Source: Deep Learning on Medium