Machine Learning 101

Source: Deep Learning on Medium

In broad terms, machine learning is a subset of, or a way of achieving, Artificial Intelligence (AI). AI involves machines that can perform tasks characteristic of human intelligence, such as logical decision making, learning and improving from experience.

AI systems can be broadly divided into two categories, Narrow and General. Narrow AI refers to a system built to handle one particular task, such as spam filtering, whereas a Generalised AI system refers to one which can, in theory, solve any problem. The pursuit of such systems is where most of the buzzwords we hear these days, like Machine Learning and Deep Learning, emerged.

Even though the idea has been around for ages, it started giving significant and groundbreaking results only around a decade ago, with the emergence and popularisation of the internet leading to the generation of tons of digital data, and with the creation of storage and processing equipment more efficient and faster than ever.


Arthur Samuel in 1959 defined machine learning as a field of study that gives computers the ability to learn without being explicitly programmed.

What is the meaning of “learn”?

A computer program is said to learn from experience E, with respect to some task T, and some performance measure P, if its performance on T as measured by P improves with experience E.

Let’s break it down. Imagine you go out in the woods to practise archery with your friends. If you have never used a bow and arrow before, how will you learn? What are some of the tasks and calculations made by your brain?

You will first take a shot hoping it goes somewhere near the target (fingers crossed and all). Then you will see where it hits, and for the next shot your brain does some calculations based on the last attempt and adjusts posture, direction, force and other features accordingly to improve the result. Will you hit the bull’s eye on the second shot? Highly unlikely. So your brain repeats the process again and again, improving slowly with each iteration, and if you practise enough shots you will be able to hit the target consistently and accurately. This is what we call developing muscle memory for a task.

Let’s compare this with the above definition of how a computer program learns. Let taking a shot be the task T, P be the probability that the next shot hits the target, and E be the experience of having the program take tens or thousands of shots. The program takes a shot (T), evaluates P based on the result, makes changes to its features based on E, and then takes a shot again, repeating the process with the aim of maximising the probability of hitting the bull’s eye.
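The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the archer is reduced to a single number (`aim`), the performance measure P is taken to be the distance between arrow and target instead of a hit probability, and the names `practice`, `target` and `learning_rate` are made up for this example.

```python
import random

def practice(shots, target=50.0, learning_rate=0.3, seed=0):
    """Simulate an archer who corrects the aim after every shot.

    Task T: take a shot. Experience E: the shots taken so far.
    Performance P (assumed here): how far the arrow lands from the target.
    """
    rng = random.Random(seed)
    aim = 0.0                              # where the archer currently aims
    misses = []
    for _ in range(shots):
        landed = aim + rng.gauss(0, 1.0)   # the arrow lands near the aim, with some wobble
        error = target - landed            # feedback: how far off the shot was
        misses.append(abs(error))
        aim += learning_rate * error       # adjust the aim using that feedback
    return misses

misses = practice(shots=200)
early = sum(misses[:10]) / 10
late = sum(misses[-10:]) / 10
print(f"average miss over the first 10 shots: {early:.2f}")
print(f"average miss over the last 10 shots:  {late:.2f}")
```

The average miss over the last shots should be much smaller than over the first ones, which is exactly the sense in which performance on T, as measured by P, improves with E.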

Now, suppose you liked playing archery with your friends and, aspiring to get better at it, you practise every day of the week in the woods under a clear sky.

But when the weekend comes and you head out with your friends, the weather turns windy and cloudy. So, even after all that rigorous practice, you still can’t hit many shots on target. One reason is that you got used to your environment and thus couldn’t perform in a different one. The technical word for this scenario is over-fitting. If you are wondering whether there is something called under-fitting too, and if so what it is: in a nutshell, under-fitting is simply the case of performing badly, even in familiar conditions, because of a lack of practice.
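To make over-fitting and under-fitting concrete, here is a small sketch on made-up data. Everything in it is an assumption for illustration: the "true" rule y = 2x, the noise level, and the three toy models (`fit_constant`, `fit_line`, `fit_memorizer`). The training set plays the role of the weekday practice, and the test set plays the role of the windy weekend.

```python
import random

def make_data(n, rng):
    """Noisy samples of the (assumed) true rule y = 2x."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + rng.gauss(0, 1.0)) for x in xs]

def mse(predict, data):
    """Mean squared error of a prediction function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

def fit_constant(train):
    """Under-fit: ignore x and always answer with the average y."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Reasonable fit: least-squares straight line."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
             / sum((x - mean_x) ** 2 for x, _ in train))
    return lambda x: mean_y + slope * (x - mean_x)

def fit_memorizer(train):
    """Over-fit: memorise every training point, noise included,
    and answer with the y of the closest memorised x."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

rng = random.Random(0)
train, test = make_data(200, rng), make_data(200, rng)

results = {}
for name, fit in [("constant", fit_constant),
                  ("line", fit_line),
                  ("memorizer", fit_memorizer)]:
    model = fit(train)
    results[name] = (mse(model, train), mse(model, test))
    print(f"{name:10s} train MSE = {results[name][0]:6.2f}   "
          f"test MSE = {results[name][1]:6.2f}")
```

The memoriser reproduces its practice data exactly, so its training error is zero, yet it still makes errors on new data: that gap is over-fitting. The constant model does poorly on both sets, which corresponds to under-fitting.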

Now suppose you train in a windy environment before the next weekend, but this time it starts pouring, or your friends decide to change the game a little and shoot at moving targets, or you have to hit targets on elevated platforms, or any of the countless other variations possible.

So how can we master a skill then?

Because of all these ever-changing environment variables, it is said that truly mastering a skill, so that one can perform well no matter the circumstances, takes 10,000 hours of practice.

But who has that much time, energy or patience, right?

This is where an intelligent system comes into play. A machine learning agent can quickly run thousands of simulations, adapt to various environmental conditions and learn to hit the target consistently most of the time.

Three basic learning models:

1> Supervised Learning

When, Why and How:

When the system is provided with inputs and their corresponding labels, and is then expected to predict the labels for a previously unseen set of inputs.


When we need the output in a continuous range, it is called a regression problem. To solve regression problems we aim to find the best-fit curve, such that the root mean squared error on the given data is minimised, and then use that curve to predict values for new data. Examples: predicting the future price of a product based on past records, weather prediction etc.
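As a minimal sketch of the regression idea, the snippet below fits a straight line by ordinary least squares, which is the line with the smallest root mean squared error on the given points (minimising the squared error and minimising its root are equivalent). The price figures and the names `fit_line` and `rmse` are invented for the example.

```python
import math

def fit_line(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the line on the given data."""
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical past records: price of a product in months 1 to 5.
months = [1, 2, 3, 4, 5]
prices = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(months, prices)
next_price = slope * 6 + intercept   # prediction for month 6

print(f"fitted line: price = {slope:.2f} * month + {intercept:.2f}")
print(f"RMSE on past records: {rmse(months, prices, slope, intercept):.3f}")
print(f"predicted price for month 6: {next_price:.2f}")
```

A straight line is the simplest possible "curve"; the same recipe extends to more flexible curves at the cost of a higher risk of over-fitting.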

Classification vs Regression

When we need to categorise data into a finite number of discrete classes (like 0, 1, 2…), it becomes a classification problem. To make a binary classifier, i.e. one which has just two possible outcomes (0 or 1, yes or no), we can use the perceptron algorithm, where we aim to find the equation of a line which separates the given training data such that data points on one side of the line belong to one class and data points on the other side belong to the other. Examples: classifying mail as spam or not spam, detecting the presence of a tumour from X-rays etc.
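The perceptron idea can be sketched as follows. The data points are made up and chosen so that a separating line exists (the perceptron is only guaranteed to converge in that case), and the function names `train_perceptron` and `predict` are hypothetical helpers for this example.

```python
def predict(weights, bias, point):
    """Return 1 if the point lies on the positive side of the line, else 0."""
    score = weights[0] * point[0] + weights[1] * point[1] + bias
    return 1 if score > 0 else 0

def train_perceptron(points, labels, epochs=1000, learning_rate=1.0):
    """Learn a line w1*x1 + w2*x2 + b = 0 that separates the two classes."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for point, label in zip(points, labels):
            error = label - predict(weights, bias, point)   # -1, 0 or +1
            if error != 0:
                # Nudge the line towards classifying this point correctly.
                weights[0] += learning_rate * error * point[0]
                weights[1] += learning_rate * error * point[1]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:       # every training point is on the right side
            break
    return weights, bias

# Toy, linearly separable data: class 0 near the origin, class 1 further out.
points = [(1, 1), (2, 1), (1, 2), (2, 2), (5, 5), (6, 5), (5, 6), (6, 6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_perceptron(points, labels)
print("learned line:", weights, bias)
print("prediction for (0, 0):", predict(weights, bias, (0, 0)))
print("prediction for (7, 7):", predict(weights, bias, (7, 7)))
```

If the two classes cannot be separated by a straight line, this loop never reaches zero mistakes and simply stops after `epochs` passes, which is one reason more flexible classifiers exist.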

2> Unsupervised Learning

When, Why and How:

When we only have data features but no labels, and we have to find patterns in the data and group data points with similar features together. Unsupervised learning is adopted for problems where the answer is not known.


Clustering: There are various algorithms to perform clustering, like K-means clustering, DBSCAN and hierarchical clustering. In the K-means clustering algorithm we need to specify the value of K beforehand, that is, how many clusters we want to find, whereas there is no such requirement in the hierarchical model.

Used in: recommendation systems, social network analysis, image segmentation, market trend analysis etc.
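For the K-means clustering mentioned above, a bare-bones sketch might look like this. It assumes two-dimensional points, uses the first K points as starting centroids (a simplification; real implementations usually pick starting points more carefully), and the names `k_means` and `sq_dist` are made up for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, max_iters=100):
    """Group points into k clusters; k has to be chosen beforehand."""
    centroids = [points[i] for i in range(k)]   # simplistic init: first k points
    labels = None
    for _ in range(max_iters):
        # Step 1: assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:                # nothing changed: converged
            break
        labels = new_labels
        # Step 2: move every centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:                         # keep the old centroid if empty
                centroids[c] = tuple(sum(coord) / len(members)
                                     for coord in zip(*members))
    return labels, centroids

# Unlabelled toy data with two visually obvious groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = k_means(points, k=2)
print("cluster of each point:", labels)
print("cluster centres:", centroids)
```

Note that no labels were given: the grouping comes purely from which points lie close together.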

3> Reinforcement Learning

When, Why and How:

This is quite different from the above two methods. Here we do not provide any data to train on; we just give the agent the equipment to perform a task and set up a trial-and-error based reward system where, for each desired interaction with the environment, the agent gets rewarded and builds upon that feedback. It is the same as how a dog is trained: in the beginning he does not understand any command, but when he happens to follow our command once and gets rewarded for it, he repeats that action on the same command for the reward, and eventually learns to perform the task without rewards.
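The dog-training analogy can be turned into a tiny trial-and-error sketch. It is deliberately simplified: each command is a one-step episode, so the value update below is a bandit-style version of what full reinforcement learning algorithms such as Q-learning do over many steps. The commands, the reward of 1 for a correct action, and the parameters `epsilon` and `alpha` are all assumptions for illustration.

```python
import random

COMMANDS = ["sit", "fetch", "roll"]   # what the trainer says (the state)
ACTIONS = ["sit", "fetch", "roll"]    # what the dog can do

def train_dog(episodes=2000, epsilon=0.2, alpha=0.5, seed=0):
    """Learn, by trial and error, how valuable each action is for each command."""
    rng = random.Random(seed)
    value = {c: {a: 0.0 for a in ACTIONS} for c in COMMANDS}
    for _ in range(episodes):
        command = rng.choice(COMMANDS)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                       # explore: try something random
        else:
            action = max(ACTIONS, key=lambda a: value[command][a])  # exploit what worked before
        reward = 1.0 if action == command else 0.0             # treat only for the right action
        # Move the estimate for (command, action) towards the reward received.
        value[command][action] += alpha * (reward - value[command][action])
    return value

value = train_dog()
policy = {c: max(ACTIONS, key=lambda a: value[c][a]) for c in COMMANDS}
print(policy)
```

No labelled examples are ever shown to the agent; it only ever sees the reward for the action it happened to try.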

Reinforcement Learning process

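Example: Q-learning.
Used in: game playing, such as Q-learning for Atari games. Recurrent Neural Networks (RNN) and Long Short-Term Memory networks (LSTM), by contrast, are neural network architectures rather than reinforcement learning methods; they are used in language modelling and translation, handwriting recognition/generation, keyboard predictions etc.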

Now, what is deep learning and what’s so special about it?

“Deep learning is a particular kind of machine learning that achieves great power and flexibility by learning to represent the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones.”

While classical machine learning algorithms depend on the programmer to provide hand-picked features on which they can base their results, deep learning is capable of finding the features necessary for the task all on its own.

Machine Learning vs Deep Learning

As shown in the above graph, with less data, generic machine learning techniques may outperform deep learning, since deep learning takes more time and resources to train. But given enough time and data, deep learning tends to outperform the other techniques.

One downside of deep learning is that as the model gets deeper and the layers denser, the interpretability of the result is lost, i.e. we can no longer easily understand the process by which we are getting the results.


We have come a long way in this field, and applications of machine learning, deep learning and AI can be found everywhere now, from classification of search results on Google to Facebook friend recommendations, image recognition systems, personal voice assistants and self-driving cars. But it’s just the beginning, and there are still immense possibilities out there for tasks that can be automated using these techniques.

There is hope that we can one day build an ultimate AI-based system which can solve any problem X whose answer is still unknown to humanity.