How Machines Learn

Source: Deep Learning on Medium

“Words and symbols, that’s all they are…”

This quote is something I’ve repeated to myself almost daily over the past 6+ months. “Why?” you ask. Let me explain.

In the past whenever I would meet someone working on something truly inspiring, such as Space Colonization, Artificial General Intelligence, or Nuclear Fusion, I found myself defaulting to the same thought…

“That’s amazing, but I’m not smart enough to do the same…”

I’ve realized this is a myth, and there’s a simple secret to bridging the gap between today’s Dylan and the future space/AGI/nuclear Dylan. That “secret” is understanding “words” and “symbols”. Each of the areas I’ve mentioned, and many others I haven’t, has its own separate lingo, but once you start to understand that lingo (i.e. its “words” and “symbols”), the gap begins to disappear.

My goal with this post is to do two things.

  1. Educate — I want you to walk away understanding machine learning (ML) at the 5,000-foot view, without all the jargon (e.g. “words”) and math (e.g. “symbols”)
  2. Inspire – I’m hoping you realize what took me years to understand: no matter what your background, no matter how “complex” a topic seems, you’re able to learn anything.

A quick note to you the reader on my approach for this post.

While researching for this post, I came across an endless list of documents, lectures, and books filled with TONS of jargon and mathematical formulas. This can get overwhelming for most people, so I’m going to do everything I can to avoid this intimidating wall that most curious minds run into when Googling “Machine Learning”.

Models vs. Algorithms

Before we dive headfirst into the world of machine learning, let’s take a step back to understand what people mean when they say “model” or “algorithm”. I’ve found understanding the difference between these two has helped smooth out my machine learning journey.

  • Algorithm — A simple set of rules to follow
  • Model — This is what you build when mixing an algorithm with some data

Model = Training (Algorithm + Data)

I’ve found the best way to explain this is through the analogy of a vending machine.

Imagine a “model” is like a vending machine: when given some input (e.g. money), it gives you some output (e.g. a tasty Twix bar that you’ll regret eating later. Ha!).

The “algorithm” is used to train our model (e.g. vending machine), so all the decisions our model can make are based on the given input (e.g. amount of money, the product you choose, how much change you get back, etc.).

Credit: Kayk Waalk

Remember — When running an algorithm through a specific set of data a model will pop out at the other end.
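To make “Model = Training (Algorithm + Data)” a bit more concrete, here’s a minimal Python sketch. The function names (`fit_line`, `train`) and all the numbers are made up for illustration. The “algorithm” here is a plain least-squares line fit, and the “model” is whatever function pops out once that algorithm has seen some data.

```python
def fit_line(xs, ys):
    """Algorithm: least-squares fit of a straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def train(algorithm, data):
    """Model = Training(Algorithm + Data): run the algorithm on the data, return a model."""
    xs, ys = data
    slope, intercept = algorithm(xs, ys)

    def model(x):
        return slope * x + intercept

    return model


# Hypothetical data: coins inserted -> snacks dispensed at two different machines.
cheap_machine = ([1, 2, 3, 4], [1, 2, 3, 4])   # one snack per coin
pricey_machine = ([2, 4, 6, 8], [1, 2, 3, 4])  # one snack per two coins

cheap_model = train(fit_line, cheap_machine)
pricey_model = train(fit_line, pricey_machine)

print(cheap_model(6))   # same algorithm, different data, different model
print(pricey_model(6))
```

The same algorithm was used both times. Only the data changed, and that was enough to produce two models that give different answers for the same input.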

If you’re interested in knowing a bit more check here, here, and here

Two big goals of machine learning

Machine learning has two major goals…

  1. Predict: This is all about predicting a missing or unknown number in a dataset. For example, predicting the weight of someone based on their height.
  2. Classify: This is all about assigning the items in a dataset to categories (or groups, if that’s easier to understand). For example, classifying (i.e. grouping) Facebook users based on the posts they share/like.

Now there are many ways of achieving these two goals, but at the end of the day, this is what every machine learning algorithm is aiming to achieve.
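If it helps to see the two goals side by side, here’s a small Python sketch. The data, the T-shirt sizes, and the function names are all hypothetical. It uses a “nearest neighbours” approach (look at the most similar known examples), which is just one of many ways to do this. The thing to notice is the output: predicting gives you a number, classifying gives you a category.

```python
from collections import Counter

# Hypothetical labeled examples: (height in cm, weight in kg, T-shirt size).
people = [
    (150, 50, "S"), (155, 54, "S"), (160, 58, "S"),
    (170, 68, "M"), (175, 72, "M"), (180, 77, "M"),
    (185, 84, "L"), (190, 90, "L"), (195, 95, "L"),
]


def nearest(height, k=3):
    """Return the k people whose height is closest to the given height."""
    return sorted(people, key=lambda p: abs(p[0] - height))[:k]


def predict_weight(height):
    """Predict: the output is a number (average weight of the nearest neighbours)."""
    neighbours = nearest(height)
    return sum(p[1] for p in neighbours) / len(neighbours)


def classify_size(height):
    """Classify: the output is a category (most common size among the neighbours)."""
    votes = Counter(p[2] for p in nearest(height))
    return votes.most_common(1)[0][0]


print(predict_weight(174))  # a number
print(classify_size(174))   # a label
```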

Think about some of the more obvious examples you encounter daily, for example…

Netflix — Have you ever wondered why the cover art for films on your Netflix account is different from your mom’s? Well, that’s because the machine learning Netflix uses has “classified” (i.e. grouped) you as a “romantic comedy fan”, while your mom is an “action fan”. Go peek at the different film cover photos between accounts. 😉

Creepy Amazon Ads — We’ve all been there… You know when you make a purchase on Amazon and right below that confirmed purchase is a recommendation on what else you should buy based on that purchase. These recommendations are sometimes a bit too accurate and can get into the realm of creepy quick. Ha! That’s the magic of Amazon’s machine learning.

These two examples are minor, but later I’ll touch on some very meaningful applications of this interesting magic we call “machine learning”.

If you’re interested in knowing a bit more about the two goals, check out here, here, and here.

Three ways to achieve our ML goals

Now that we have our goals, how do we achieve them?

There are three main ways the machine learning world achieves this, and within these three there are many specific algorithms. As I mentioned before, my goal is not to confuse you, so for simplicity I’ll bucket the specific algorithms beneath their macro buckets, accepting the fact I may upset a few fellow ML nerds… But right now, I care more about you, the general audience, so those ML nerds will have to swallow their obsession with detail for the time being. HA! 😉

  1. Supervised Learning (labeled data): Basically, you have some god-like supervisor watching over your every move and guiding you along. This means your algorithm is learning from labeled data, which acts like an answer key the algorithm can check its accuracy against.
  2. Unsupervised Learning (no labeled data): There’s no supervisor, so you get complete freedom, but with freedom comes responsibility (thanks, Spider-Man) and the opportunity to f*** up. Here the ML algorithm is learning from unlabeled data (no answer key) and tries to make sense of it by pulling out features and patterns on its own. Often, the results of this approach are then funneled into a supervised (or deep neural net) algorithm to classify or predict something.
  3. Reinforcement Learning (action + reward/punishment): Like unsupervised learning, there is no answer key in this approach. It’s completely centered around the algorithm learning from experience… Kind of like us going through life. Imagine our reinforcement algorithm is a human who’s been thrown onto another planet with the goal of colonization. This unlucky human will need to learn through trial and error, so their environment and the rewards/punishments that come with it will be what trains our human (i.e. the algorithm).

Side note: Many of the splashiest AI headlines, such as AI systems beating humans at games, are related to reinforcement learning.

Credit: Guru99

Looking under the cover

I’m going to list out some of the more specific algorithms used within each of these three macro buckets, but if it gets too wordy, skip over this section. I promise it won’t hurt my feelings. Haha!

Supervised (labeled data)

  • Linear Regression (predicting stuff): This algorithm aims to find the relationship between the goal (i.e. the dependent variable) and one or more inputs (i.e. the independent variables). For example, based on the amount of rain, amount of sunshine, and the average temperature per day (multiple independent variables), we could predict how much marijuana you could grow (the dependent variable) in a set time period. The algorithm finds the line that best fits the relationship between the independent variables (rain, sun, average temp.) and the dependent variable (marijuana growth). — (simple video explanation)
  • Logistic Regression (classifying stuff): This algorithm gives a binary value (1/0, yes/no, true/false, etc.) based on the data fed into it (i.e. the independent variables). Let’s stick with our marijuana example, but with the goal of classifying instead of predicting. For example, based on the number of rainy days, sunny days, and avg. temperature, we could classify you as a “big fish” or “small fish” for growing. This algorithm will not only give you a yes/no response, but also a probability showing how confident it is (e.g. there’s a 70% chance you’re a “big fish” based on our data). — (simple video explanation)
  • Decision Trees + Random Forests (classifying and predicting stuff): A decision tree is something we’ve all seen, plus they’re intuitively easy to understand, so I won’t spend too much time here. A decision tree is useful when you want to make a decision that depends on multiple factors. For example, deciding on a job offer (see below). This decision includes considering location, salary, culture, perks, family, etc. What’s stronger than a tree? A FOREST of trees! Haha. I won’t explain it in detail here, but a Random Forest is a collection of decision trees that vote on the answer, which usually makes it more accurate than a single tree. — (simple video explanation)

Credit: Rahul Saxena

  • Some other algorithms you’ll come across that I’m not going to bore you with here could be… Support Vector Machines (SVMs), Naïve Bayes, K Nearest Neighbours, & Gradient Boosting.
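For the curious, here’s a rough Python sketch of the logistic regression idea from the list above: turn an input into a probability, then turn that probability into a yes/no label. Everything in it is an assumption made for illustration (the made-up sunny-day data, the single input, the learning rate, the number of training steps). Real libraries do this far more robustly.

```python
import math

# Hypothetical training data: sunny days in the season -> 1 = "big fish", 0 = "small fish".
sunny_days = [2, 3, 4, 5, 6, 7, 8, 9]
is_big_fish = [0, 0, 0, 0, 1, 1, 1, 1]

mean_days = sum(sunny_days) / len(sunny_days)  # centre the input so training behaves well


def sigmoid(z):
    """Squash any number into the range 0..1 so it can be read as a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, steps=2000):
    """Fit a weight and a bias by gradient descent on the log-loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(weight * (x - mean_days) + bias) - y
            grad_w += error * (x - mean_days) / n
            grad_b += error / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


weight, bias = train_logistic(sunny_days, is_big_fish)


def probability_big_fish(days):
    """The trained model: sunny days in, probability of being a 'big fish' out."""
    return sigmoid(weight * (days - mean_days) + bias)


for days in (3, 8):
    p = probability_big_fish(days)
    label = "big fish" if p >= 0.5 else "small fish"
    print(f"{days} sunny days: {p:.0%} chance of big fish, so we say '{label}'")
```

The 0.5 cut-off is a choice, not a law. If wrongly calling someone a “big fish” were costly, you could demand a higher probability before saying yes.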

Unsupervised (no labeled data)

  • K-Means (classifying stuff): Remember, this algorithm doesn’t have an “answer key” like the others, so it’s going at this blind. “K-Means” is all about grouping (i.e. clustering) similar things together. The steps here are simple: the algorithm is given a bunch of unlabeled data and a number of groups (the “K”), and its goal is to put similar data points into the same group. For example, we could give our algorithm a bunch of Facebook user data and “ask” it to split the users into four groups, which we might then recognize as Baby Boomers, Gen X, Millennials, and Gen Z.

Credit: Guru99

  • There are other types of unsupervised learning algorithms I’m not covering here, so if you’re interested in diving a bit deeper go here.
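Here’s what the K-Means steps can look like in a few lines of Python. This is a simplified sketch under my own assumptions: the ages are invented, the data has only one dimension, and the starting “centers” are picked in a fixed way (real implementations usually pick them randomly). Notice the algorithm never sees a label. It only sees numbers and decides which ones belong together.

```python
def k_means(points, k, iterations=10):
    """Group 1-D points into k clusters (k >= 2); returns (centroids, clusters)."""
    ordered = sorted(points)
    # Simple deterministic start: pick k evenly spaced points as the first centroids.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the average of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # nothing moved, so the grouping is stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical user ages, with no labels attached.
ages = [19, 72, 45, 21, 70, 50, 23, 75, 47]
centroids, clusters = k_means(ages, k=3)
for centre, group in zip(centroids, clusters):
    print(f"group around age {centre:.1f}: {sorted(group)}")
```

The algorithm hands back three unnamed groups. Calling them “Gen Z” or “Boomers” is something we humans do afterwards.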

Reinforcement (trial and error)

There’s a lot to say here and I’m still learning along with you, so I’ll leave most of my rambling for a separate post. Here are a few helpful words to get you started for diving into reinforcement learning.

Methods (based on…?)

  1. Value-based: The agent is trying to maximize a specific “value function”, which basically measures how good the agent’s current situation is in terms of the reward it can expect from here on.
  2. Policy-based: The agent focuses on learning its strategy (i.e. policy) for acting in the environment directly, and less on how much value it can get from its next action.
  3. Model-based: The agent uses a model of its environment, either one that you (its creator) built or one it learns itself, to plan its actions.

Types (carrot or stick)

  1. Positive (carrot): Give the agent a reward for doing good and they look to maximize that reward.
  2. Negative (stick): Give the agent a punishment for doing bad and they look to minimize that punishment.

Two Popular Models

  1. Markov Decision Process (more here)
  2. Q-learning (more here)
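To give a feel for Q-learning, here’s a tiny Python sketch. The setup is entirely hypothetical: an agent in a five-square corridor that gets a reward (the carrot) only when it reaches the last square. The learning rate, discount, and episode counts are arbitrary picks of mine. The agent stumbles around at random, and the Q-table slowly records how good each action is in each square.

```python
import random

N_STATES = 5          # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
ALPHA = 0.5           # learning rate: how strongly each experience updates the table
GAMMA = 0.9           # discount: how much future reward counts compared to reward now

rng = random.Random(0)  # fixed seed so the run is repeatable
q_table = [[0.0, 0.0] for _ in range(N_STATES)]  # q_table[state][action index]

for _episode in range(500):
    state = 0
    for _step in range(100):  # cap the episode length so it always ends
        action = rng.randrange(2)  # explore purely at random (trial and error)
        next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
        reached_goal = next_state == N_STATES - 1
        reward = 1.0 if reached_goal else 0.0  # the "carrot"
        # Q-learning update: nudge the estimate toward
        # (reward now) + (discounted best value available from the next square).
        future = 0.0 if reached_goal else max(q_table[next_state])
        target = reward + GAMMA * future
        q_table[state][action] += ALPHA * (target - q_table[state][action])
        if reached_goal:
            break
        state = next_state

# Read the learned strategy (policy) off the table for every non-goal square.
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

Nobody told the agent that “right” is the way to go. It worked that out from the rewards alone, which is the whole point of reinforcement learning.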

If you’re interested in learning more go here and here.

Machine Learning in action (applications)

Now it’s time we dig into how this magical technology is being used today. I’m only covering three interesting use cases here, so I recommend you do some digging of your own because there’s an endless list of amazing applications.

  • DeepMind & Climate (data center article & video) — DeepMind (a group within Google) was able to cut the energy used to cool Google’s data centers by up to 40% and their overall energy overhead by 15%. That’s a whole lot of power. At first, it seems Google is just saving tons of money that can be reallocated to other areas, continuing their world domination, but try to see past that. Think about how this could be leveraged for other data centers around the world and how we could make a big impact on climate change. Most people don’t realize that the electricity provided by not-so-environmentally-friendly power plants to cool these data centers has a massive impact on our planet (more here).
  • Apple iPhone X Unlock (explanation article & video) — This is deep learning (a branch of machine learning, but “deeper”) at its finest. When you scan your face to unlock your iPhone X, Apple is using a learning algorithm to compare tiny pieces of your face to each other, find important features, and then identify you as you.
  • Finding diabetics quickly: There have been all kinds of applications of machine learning within medical diagnosis thanks to its ability to spot patterns where we humans can’t. A popular application has been used in finding out if a patient is diabetic or not by looking at their eyes (e.g. classifying stuff). The big plus here is that people in the medical field will be able to use these algorithms to identify more diabetics, more accurately. There have been tons published on this, so here’s a simple video showing how it works, plus a more complex article going into the weeds.

Choose your own adventure (resources)

Alrighty!

You’ve made it through this high-level overview of machine learning, congrats! Give yourself a pat on the back. 😊

You’re now part of a small club within society that understands the fundamentals behind one of the most impactful technologies humanity will ever see.

Depending on your interest and how deep you want to go down this machine learning rabbit hole, I’ve pulled together my favorite recommendations.

Blogs

  • Elements of AI — This is a free and beautifully made introduction to AI
  • The AI Revolution: The Road to Superintelligence (Wait But Why) — In this blog series Tim Urban does an amazing job mapping out the impact AI could have on us humans and our future alongside it.
  • Machine Learning for Humans — Honestly, this blog inspired me to write this post. Vishal touches on the different areas of machine learning without scaring away those non-technical folks, so if you want to build a stronger intuitive understanding of machine learning, start here.

Podcasts

  • Lex Fridman — Longer form interviews with experts in the space
  • Data Skeptic — Shorter episodes, but very helpful to understand different topics in machine learning
  • TWIML AI — Mainly industry updates and how machine learning is being used within the world today
  • Towards Data Science — Longer form interviews of people working in machine learning and their stories
  • The Future of Life Institute — Longer episodes focused on educating the public on the existential risk that comes with such a powerful technology… Something we should think about a bit more, so it’s highly recommended.

YouTube

  • Crash Course AI Series — I’m a fan of anything Crash Course and AI is no different… 😊
  • Two Minute Papers — Very accessible and short videos that will help you stay up to date with what’s happening at the cutting edge of AI.
  • StatQuest — One of my new favorites! Josh simplifies the math behind machine learning like no one else… This should be the first place you go if you’re interested in lifting up the mathematical covers of machine learning.
  • Computerphile — Computerphile has a variety of topics and the animation they use is the best I’ve seen. It makes understanding the bits/bytes of machine learning much easier.
  • Springboard — If you’re interested in getting into the world of machine learning this is the place to go

Book