The Basis of Machine Learning

Source: Deep Learning on Medium

Before looking at the basis of machine learning, we first need to know what Expert Systems are and how they differ from Machine Learning.

Expert Systems:

In artificial intelligence, an expert system is a computer system that emulates the decision-making ability of a human expert. Expert systems are designed to solve complex problems by reasoning through bodies of knowledge, represented mainly as if-then rules rather than through conventional procedural code. The first expert systems were created in the 1970s and then proliferated in the 1980s. Expert systems were among the first truly successful forms of artificial intelligence (AI) software.

How are Expert Systems different from Machine Learning?

In an Expert System, the knowledge acquired from the expert is digitized and used in decision making. The expert specifies every step taken to reach a decision, the basis for each step, and how to handle exceptions.

In a Machine Learning solution, the expert is only asked for a decision on each training example. A “Supervised Learning” algorithm then determines, based on all the data available, how to mimic the end behavior of the expert. This works well in many situations, since algorithms that are efficient for machines may not be the most efficient for humans, and machines are better suited to handling many dimensions at once.
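
To make the contrast concrete, here is a minimal sketch on a made-up loan-approval task. The task, the numbers, and the function names are all assumptions for illustration and do not come from any real system. In the first function the expert writes the rule by hand; in the second, a threshold is picked so that it reproduces the expert's decisions on the examples.

```python
# Hypothetical toy task: approve a loan based on income (in thousands).

def expert_rule(income):
    """Expert system style: the expert writes the if-then rule explicitly."""
    return "approve" if income >= 50 else "reject"


def learn_threshold(incomes, decisions):
    """Supervised style: pick the threshold that best reproduces the decisions."""
    best_t, best_correct = None, -1
    for t in sorted(incomes):
        # Count how many expert decisions the candidate threshold gets right.
        correct = sum(
            ("approve" if x >= t else "reject") == d
            for x, d in zip(incomes, decisions)
        )
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


# The expert only supplies decisions, not the reasoning behind them.
incomes = [20, 35, 48, 52, 60, 90]
decisions = ["reject", "reject", "reject", "approve", "approve", "approve"]
threshold = learn_threshold(incomes, decisions)


def learned_rule(income):
    """Apply the threshold that was learned from the examples."""
    return "approve" if income >= threshold else "reject"
```

The learned threshold need not equal the expert's own cut-off. It only has to agree with the decisions that were actually observed.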

Machine Learning:

Machine learning is an application of artificial intelligence (AI) that provides systems with the ability to automatically learn and improve from experience without being explicitly programmed.

Six jars of Machine Learning:

(Figure: Six Jars of Machine Learning)


Data:

Machine learning, a form of AI that uses large data sets to teach computers how to respond to and act like humans, allows businesses to optimize operations, deliver better customer experiences, enhance security and more.

The data contains Features and a Target. The features are represented by X and the target by Y. Using X, we build a model whose predicted outcomes approximately match Y.
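
As a small illustration, the sketch below splits a made-up table into features X and target Y. The column meanings (hours studied, hours slept, exam score) and the values are invented for the example.

```python
# Hypothetical dataset: each row is (hours_studied, hours_slept, exam_score).
rows = [
    (2.0, 8.0, 55.0),
    (4.0, 7.0, 65.0),
    (6.0, 6.0, 78.0),
]

# Features X: every column except the last one.
X = [list(row[:-1]) for row in rows]

# Target Y: the last column, the value we want the model to predict.
Y = [row[-1] for row in rows]
```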


Task:

The task in Machine Learning is divided into Supervised and Unsupervised Learning.

Supervised Learning:

Supervised learning covers classification and regression.


Classification is the task of assigning things (objects, images, sound, text, and so on) to one of a set of discrete classes using machine learning or statistical learning techniques.
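
One of the simplest classifiers is the nearest-neighbour rule: give a new point the label of the closest training point. The sketch below assumes two-dimensional points and invented "cat" and "dog" labels. It is only one of many possible classification methods.

```python
import math


def predict_1nn(train_X, train_y, x):
    """Return the label of the training point closest to x (Euclidean distance)."""
    distances = [math.dist(p, x) for p in train_X]
    return train_y[distances.index(min(distances))]


# Hypothetical training data: two features per example, one label each.
train_X = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
train_y = ["cat", "cat", "dog", "dog"]

label_near_cats = predict_1nn(train_X, train_y, (2.0, 1.0))
label_near_dogs = predict_1nn(train_X, train_y, (8.5, 8.0))
```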

(Figure: Classification and Regression)


Regression is a technique from statistics that is used to predict the value of the desired target quantity when the target quantity is continuous.
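
A minimal sketch of regression is fitting a straight line to one feature by ordinary least squares. The data below is made up so that it lies exactly on y = 2x + 1, and `fit_line` is a name chosen for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Predict a continuous value for an unseen input.
prediction = slope * 5.0 + intercept
```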

Unsupervised Learning:

Unsupervised learning includes clustering.


Clustering is the task of dividing the population or data points into a number of groups such that data points in the same group are more similar to each other than to data points in other groups. In simple words, the aim is to segregate groups with similar traits and assign them into clusters.
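
One common clustering method is k-means. The sketch below runs it on one-dimensional points with two hand-picked starting centers. The data and starting centers are assumptions made for the example, and real implementations add random initialisation and a convergence check.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd's algorithm in one dimension, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical data with two obvious groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
```

No labels are given here. The groups come only from how close the points are to each other.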



Model:

In the machine learning paradigm, a model is a mathematical expression made of learnable parameters and placeholders for the inputs. Its output is a prediction for regression, a class for classification, and an action for reinforcement learning.
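
As a sketch, a linear model is about the simplest such expression: the inputs x are the placeholders, and the weights w and bias b are the parameters. The parameter values below are arbitrary, chosen only to show the same expression used for a regression-style prediction and for a two-class decision.

```python
def linear_model(x, w, b):
    """y_hat = w . x + b, where w and b are the learnable parameters."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(x, w, b):
    """The same expression turned into a two-class decision (class 1 or 0)."""
    return 1 if linear_model(x, w, b) >= 0 else 0


# Arbitrary parameter values, for illustration only.
w, b = [2.0, -1.0], 0.5

score = linear_model([1.0, 3.0], w, b)   # 2*1 - 1*3 + 0.5
label = classify([2.0, 1.0], w, b)       # sign of 2*2 - 1*1 + 0.5
```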

There are many models that researchers and data scientists have created over the years. Some are very well suited for image data, others for sequences (like text, or music), some for numerical data, others for text-based data.


Loss Function:

A loss function is a method of evaluating how well a specific algorithm models the given data. If predictions deviate too much from actual results, the loss function returns a very large number. Gradually, with the help of an optimization algorithm, the model learns to reduce the error in its predictions. Broadly, loss functions can be classified into two major categories depending on the type of learning task we are dealing with: Regression losses and Classification losses.

Regression Losses

Mean Square Error/Quadratic Loss/L2 Loss

Mathematical formulation:

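$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of examples.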
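
A minimal sketch of Mean Squared Error in plain Python, using made-up actual and predicted values:

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values: errors are 1, 0 and -2, so the squares are 1, 0 and 4.
mse = mean_squared_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Because the errors are squared, one large mistake costs more than several small ones.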

Classification Losses

Hinge Loss/Multi-class SVM Loss

Mathematical formulation:

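$$L_i = \sum_{j \neq y_i} \max\left(0,\; s_j - s_{y_i} + 1\right)$$

where $s_j$ is the score the model assigns to class $j$ and $y_i$ is the correct class for example $i$.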
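
The multi-class hinge loss for a single example can be sketched as follows, assuming the usual margin of 1. The class scores are invented for illustration.

```python
def multiclass_hinge_loss(scores, correct_class, margin=1.0):
    """Sum, over the wrong classes, of how far each score intrudes on the margin."""
    correct_score = scores[correct_class]
    return sum(
        max(0.0, s - correct_score + margin)
        for j, s in enumerate(scores)
        if j != correct_class
    )


# Hypothetical scores for three classes; class 0 is the correct one.
loss_wrong = multiclass_hinge_loss([3.2, 5.1, -1.7], correct_class=0)

# Here the correct class (1) beats every other class by more than the margin.
loss_right = multiclass_hinge_loss([1.0, 5.0, 0.5], correct_class=1)
```

The loss is zero only when the correct class wins by at least the margin, not merely when it wins.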

Cross-Entropy Loss/Negative Log Likelihood

Mathematical formulation:

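$$L = -\sum_{c} y_c \log\left(p_c\right)$$

where $y_c$ is 1 for the correct class and 0 otherwise, and $p_c$ is the predicted probability of class $c$.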
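
A minimal sketch of cross-entropy for one example, assuming a one-hot target and predicted probabilities that already sum to 1. The probabilities are made up.

```python
import math


def cross_entropy(y_true, p_pred):
    """Negative log likelihood of the true class under the predicted probabilities."""
    # Terms with y == 0 contribute nothing, so they are skipped.
    return -sum(y * math.log(p) for y, p in zip(y_true, p_pred) if y > 0)


target = [0, 1, 0]  # one-hot: the second class is correct

# Hypothetical predictions: one fairly confident and right, one confident and wrong.
loss_good = cross_entropy(target, [0.1, 0.7, 0.2])
loss_bad = cross_entropy(target, [0.8, 0.1, 0.1])
```

A confident wrong prediction is punished much more heavily than a hesitant one.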

Learning:

Optimization algorithms help us minimize (or maximize) an objective function (another name for the error function) E(x), which is simply a mathematical function of the model’s internal learnable parameters. These parameters are used to compute the target values (Y) from the set of predictors (X) used in the model.

The most common optimization algorithm is Gradient Descent. In neural networks, the gradients it needs are computed with Back Propagation for feedforward networks and Back Propagation Through Time (BPTT) for Recurrent Neural Networks.
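
Gradient descent can be sketched for a one-feature linear model trained with mean squared error. The learning rate, step count, and data are assumptions chosen so that this toy example converges. They are not recommended defaults.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by repeatedly stepping against the gradient of the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step in the direction that lowers the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
```

After enough steps, w and b end up close to the slope and intercept that generated the data.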


Evaluation:

The performance measure is the way you want to evaluate a solution to the problem. It is the measurement you will make of the predictions made by a trained model on the test dataset.

Performance measures are typically specialized to the class of problem you are working with, for example, classification, regression, and clustering. Many standard performance measures will give you a score that is meaningful to your problem domain.

Common ways to evaluate a model include splitting the data into train and test datasets, cross-validation, and testing several algorithms against the same data.
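
The two data-splitting ideas can be sketched in plain Python, together with accuracy as one example of a performance measure for classification. The helper names `holdout_split`, `k_fold_indices`, and `accuracy` are chosen for this example, and the split is left unshuffled only to keep the output predictable.

```python
def holdout_split(rows, test_fraction=0.25):
    """Hold out the last part of the rows as a test set (no shuffling here)."""
    cut = len(rows) - int(len(rows) * test_fraction)
    return rows[:cut], rows[cut:]


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k roughly equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


train_rows, test_rows = holdout_split(list(range(8)))
folds = list(k_fold_indices(5, 2))
score = accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])
```

In cross-validation every example is used for testing exactly once, which gives a steadier estimate than a single split when data is scarce.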


In this post, you learned the basis of Machine Learning, divided into six jars, along with some of the methods used while building a model.