Original article was published by Anuradha Kaurav on Artificial Intelligence on Medium
Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.
The process of learning begins with observations or data, such as examples, direct experience, or instruction, to look for patterns in data and make better decisions in the future based on the examples that we provide. The primary aim is to allow the computers to learn automatically without human intervention or assistance and adjust actions accordingly.
That’s a short introduction to Machine Learning. We won’t go into more detail here, as this article focuses on the Machine Learning process. So let’s get started!
The Machine Learning process is all about building a predictive model that can be used to find a solution for a problem statement.
Here is the pictorial representation of the steps or the phases involved in the process of Machine Learning. Let’s have a look at each phase.
- Define Objective
– In the first step, we work out what the problem statement is actually asking for.
– Here we need to answer questions such as:
* What kind of problem are we solving?
* How are we going to solve it?
* What are the target features or input data to be used?
* What kind of output will be produced as a solution? and so on.
- Data Gathering
– The role of this phase is to gather the data required to solve the problem.
– Here we check the availability of the resources to get the data.
– Many online resources provide data sets, which are very helpful for beginners in Machine Learning.
– This is one of the most time-consuming stages in the process of Machine Learning.
- Data Preparation
– Here, all the data gathered is transformed into the required format by data cleaning.
– Data cleaning is done to remove inconsistencies in data like missing values, duplicate values, corrupted data, and unnecessary data.
– This is done so that flawed data does not lead the model to false predictions.
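As a rough illustration of the cleaning step described above, here is a minimal sketch in plain Python. The records, the field names (`id`, `age`, `income`) and the helpers `is_valid` and `clean` are all hypothetical, made up for this example. In practice a data-frame library is often used instead.

```python
import math

# Hypothetical raw records; field names and values are illustrative only.
raw_rows = [
    {"id": 1, "age": 34, "income": 52000.0},
    {"id": 2, "age": None, "income": 61000.0},      # missing value
    {"id": 1, "age": 34, "income": 52000.0},        # duplicate of the first row
    {"id": 3, "age": 29, "income": float("nan")},   # corrupted value
    {"id": 4, "age": 41, "income": 73000.0},
]


def is_valid(value):
    """Return False for missing (None) or corrupted (NaN) values."""
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    return True


def clean(rows):
    """Drop rows with invalid fields, then drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        if not all(is_valid(v) for v in row.values()):
            continue
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


clean_rows = clean(raw_rows)
print(clean_rows)
```

Dropping incomplete rows is only one possible policy. Depending on the problem, filling in missing values may be a better choice.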
- Data Exploration
– This phase is commonly known as Exploratory Data Analysis (EDA).
– Data exploration involves understanding the patterns and trends in the data provided.
– At this stage, all the useful insights are drawn and correlations between the variables are understood.
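To make the exploration step a little more concrete, the sketch below computes summary statistics and the Pearson correlation between two variables. The data (hours studied and exam scores) and the helper names `summarize` and `pearson` are assumptions chosen purely for illustration.

```python
import statistics

# Hypothetical cleaned data: hours studied and exam score for six students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [52.0, 55.0, 61.0, 64.0, 70.0, 73.0]


def summarize(values):
    """Return basic descriptive statistics for one variable."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


print(summarize(hours))
print(summarize(scores))
r = pearson(hours, scores)
print(f"correlation(hours, scores) = {r:.3f}")
```

A coefficient close to 1 or -1 suggests a strong linear relationship between the two variables, while a value near 0 suggests little linear relationship.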
- Building a Model
– Machine Learning models are usually built with algorithms for classification and regression, such as Linear Regression and Decision Trees.
– The algorithms are chosen based on the objective that is defined.
– This stage typically begins by splitting the data set into two parts: training data to build the model and testing data to test the accuracy of the outcome.
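The split-then-build idea from this phase can be sketched as follows. This is a simplified example under stated assumptions: the data set is synthetic, the 30% test ratio and the seed are arbitrary choices, and `train_test_split` and `fit_linear_regression` are small helpers written for this illustration, not calls from any particular library.

```python
import random

# Hypothetical data set of (feature, target) pairs, roughly target = 2 * x + 1.
noise = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1]
data = [(float(x), 2.0 * x + 1.0 + n) for x, n in zip(range(10), noise)]


def train_test_split(rows, test_ratio=0.3, seed=42):
    """Shuffle a copy of the rows and split it into training and testing parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def fit_linear_regression(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


train, test = train_test_split(data)
slope, intercept = fit_linear_regression(train)
print(f"train size={len(train)}, test size={len(test)}")
print(f"model: y = {slope:.2f} * x + {intercept:.2f}")
```

The model only ever sees the training rows. The testing rows are held back so that the evaluation in the next phase reflects data the model has not seen.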
- Model Evaluation
– This stage is all about the evaluation and optimization of the model.
– The testing data set is used to measure how well the model performs and how accurately it can predict the outcome.
– After the accuracy is calculated, further improvements can be made by using techniques like Parameter Tuning.
– The final outcome is predicted after performing Parameter Tuning and improving the accuracy of the model.
– The outcome can be a Categorical variable or Continuous variable.
– The categorical variables contain a finite number of categories or distinct groups. Categorical data might not have a logical order. For example, categorical predictors include gender, material type, and payment method.
– The Continuous variables are numeric variables that have an infinite number of values between any two values. A continuous variable can be numeric or date/time. For example, the length of a part or the date and time a payment is received.
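The evaluation ideas above can be illustrated with a small sketch: accuracy for a categorical outcome, mean absolute error for a continuous outcome, and a very simple form of parameter tuning. Everything here is hypothetical: the toy threshold classifier `predict_label`, the candidate thresholds, and the payment and part-length data are assumptions made for the example, and mean absolute error is just one of several metrics that could be used for continuous outcomes.

```python
def accuracy(y_true, y_pred):
    """Fraction of categorical predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference for continuous predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def predict_label(amount, threshold):
    """Toy classifier: label a payment 'card' if the amount exceeds the threshold."""
    return "card" if amount > threshold else "cash"


# Hypothetical validation data: payment amounts and the payment method used.
amounts = [5.0, 12.0, 18.0, 25.0, 40.0, 60.0]
methods = ["cash", "cash", "cash", "card", "card", "card"]

# Parameter tuning: try several thresholds and keep the most accurate one.
best_threshold, best_accuracy = None, -1.0
for threshold in [10.0, 20.0, 30.0, 50.0]:
    predictions = [predict_label(a, threshold) for a in amounts]
    score = accuracy(methods, predictions)
    if score > best_accuracy:
        best_threshold, best_accuracy = threshold, score

print(f"best threshold={best_threshold}, accuracy={best_accuracy:.2f}")

# Continuous outcome: compare predicted and actual part lengths (hypothetical, in mm).
actual_lengths = [10.0, 12.5, 9.8]
predicted_lengths = [10.2, 12.0, 10.0]
mae = mean_absolute_error(actual_lengths, predicted_lengths)
print(f"MAE={mae:.2f}")
```

In a real project the parameter would normally be tuned on data kept separate from the final testing data, so that the reported accuracy is not overly optimistic.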