This is Tree! Decision Tree!!


Today we will be focusing on Decision Tree Regression rather than classification.

Decision Tree Regression

Decision Tree Regression is a supervised machine learning technique that predicts a target feature (a target is the true output, or label, in a given dataset) by optimally and recursively binary-splitting the predictor and target data into incrementally smaller nodes.
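To make this concrete, here is a minimal sketch of fitting a regression tree with scikit-learn's DecisionTreeRegressor. The toy data and the max_depth setting below are my own illustrative assumptions, not part of the original article.

```python
# A minimal sketch (assumes scikit-learn and NumPy are installed).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-D regression problem: y = sin(x) plus a little noise.
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.randn(80)

# Fit a regression tree; max_depth=3 is just an illustrative choice.
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X, y)

# Predict for new inputs.
print(tree.predict([[1.5], [4.0]]))
```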

The Top node is known as the Root Node, internal nodes are known as Decision Nodes, and finally, the terminal nodes are known as Leaf Nodes.

An Optimal Recursive Binary Splitting, or simply Recursive Binary Splitting, is a numerical procedure in which all the values are lined up and different split points are tried and evaluated using a cost function. The split with the best cost (the lowest cost, because we minimize cost) is selected.

A cost function is a measure of how wrong the model is in terms of its ability to estimate the relationship between x and y. This is typically expressed as a difference or distance between the predicted value and the actual value.
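For a regression tree, the cost of a candidate split is commonly the sum of squared errors of the two child nodes (each child predicts the mean of its targets). The helper below is a hypothetical sketch of that idea; the function name and toy data are my own, not from the article.

```python
import numpy as np

def split_cost(x, y, threshold):
    """Sum of squared errors of the two child nodes produced by
    splitting on `x <= threshold` (a common regression-tree cost)."""
    left, right = y[x <= threshold], y[x > threshold]
    cost = 0.0
    for child in (left, right):
        if child.size:  # an empty child contributes no error
            cost += np.sum((child - child.mean()) ** 2)
    return cost

# Example: cost of splitting a tiny dataset at x = 2.5.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 0.9, 3.2, 2.8])
print(split_cost(x, y, 2.5))  # small, because each side is fairly homogeneous
```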

Basic Terminology

(1) Entropy:- It is a measure of the randomness/unpredictability in a dataset.

Entropy can be calculated by

Entropy(S) = −Σᵢ pᵢ · log₂(pᵢ)

where pᵢ is the proportion of samples in the set S that belong to class i.

(2) Information Gain:- It is the measure of the decrease in entropy after the dataset is split.
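The two definitions above translate directly into code. The sketch below computes entropy over class labels and the information gain of a split as the parent's entropy minus the weighted entropy of the children; the function names and the tiny example are my own illustration.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of its children."""
    n = len(parent)
    weighted = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted

# Example: a perfect split removes all randomness.
parent = ["yes", "yes", "no", "no"]
print(entropy(parent))                                           # 1.0
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))  # 1.0
```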

Creating a Regressive Decision Tree

(Basic Terminologies & Mathematics needed)

Step (1):-

Divide the Predictor Space (a predictor space is simply the space of all possible values of the attributes used for prediction) into distinct and non-overlapping regions, so that no data point falls into more than one region.

How do we split the data?

We frame the conditions that split the data in such a way that the information gain is the highest.

Gain is the measure of the decrease in entropy after splitting

We calculate the entropy of the dataset after every split in order to compute the gain.

We then choose the condition that gives us the highest gain. We do this by splitting the data with each candidate condition and checking the gain that each split produces.

Hence, the condition that gives the highest gain is used to make our first split.
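Putting the pieces together, here is a sketch of that exhaustive search over candidate split points. For a regression tree we pick the split with the lowest sum-of-squared-errors cost, which plays the same role as picking the highest gain; the function name, midpoint heuristic, and toy data are my own assumptions for illustration.

```python
import numpy as np

def best_split(x, y):
    """Try a threshold between every pair of adjacent sorted x values and
    return the (threshold, cost) with the lowest sum-of-squared-errors cost."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_t, best_cost = None, np.inf
    for i in range(1, len(xs)):
        t = (xs[i - 1] + xs[i]) / 2.0  # midpoint between adjacent values
        left, right = ys[:i], ys[i:]
        cost = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 1.1, 0.9, 5.0, 5.2])
print(best_split(x, y))  # the split lands between 3 and 4, where y jumps
```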