Polynomial Regression in R — when to use and how to use

Source: Deep Learning on Medium


(Sometimes adding a few extra terms to a linear model makes it more accurate.)

Let’s say we want to perform regression analysis on a non-linear dataset. As an example, we will use the employee level vs. salary offered dataset available here. You are free to use your own dataset; the only requirement is that the relationship in the data is non-linear. If it is linear, simple linear regression will work just fine. If you are interested in linear regression as well, check out my linear regression for busy data scientists tutorial here. The data looks like the following:

data for poly regression — salary given by level of employee
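If you do not have the file at hand, a small stand-in data frame is enough to follow along. The sketch below assumes two numeric columns named Level and Salary, as used in the code later in this article; the ten salary values are illustrative placeholders, not taken from the article's file.

```r
# Stand-in for the level vs. salary data.
# The values are assumed for illustration only.
data <- data.frame(
  Level  = 1:10,
  Salary = c(45000, 50000, 60000, 80000, 110000,
             150000, 200000, 300000, 500000, 1000000)
)

str(data)    # structure: column names and types
head(data)   # first few rows

# Increase in salary from one level to the next.
# If these steps keep growing, the trend is not a straight line.
salary_steps <- diff(data$Salary)
salary_steps
```

A quick look at the step sizes like this is a cheap way to decide whether a straight line is even worth trying.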

Let’s first fit a simple linear regression to our data and plot the fitted line. The tutorial for performing simple linear regression is here.

linear regression plot of non-linear data

Now it is clear that a straight line does not fit our data well, so we need some kind of non-linear model.

non-linear model
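The poor fit can also be checked numerically instead of by eye. Here is a minimal sketch, assuming the illustrative Level and Salary values below (placeholders, not the article's file): it fits the straight line and inspects the residuals. A systematic pattern in the residuals, rather than random scatter, suggests the line is missing curvature.

```r
# Assumed stand-in data (illustrative values only).
data <- data.frame(
  Level  = 1:10,
  Salary = c(45000, 50000, 60000, 80000, 110000,
             150000, 200000, 300000, 500000, 1000000)
)

# Simple linear regression: Salary as a straight-line function of Level.
linear_model <- lm(Salary ~ Level, data = data)

# Residual = observed salary minus fitted salary, one per row.
res <- unname(residuals(linear_model))
round(res)

# For a curve that bends upward, the line undershoots at both ends
# (positive residuals) and overshoots in the middle (negative residuals).
sign(res)
```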

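So now let’s create additional features for polynomial regression: the square, cube, and fourth power of Level. The model stays linear in its coefficients; only the set of predictors changes.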

> data$Level1 = data$Level^2
> data$Level2 = data$Level^3
> data$Level3 = data$Level^4
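As an aside, the extra columns are not strictly required. Base R's poly() with raw = TRUE builds the same powers inside the model formula. The sketch below, again on assumed placeholder data, fits the model both ways and checks that the fitted values agree; the names manual_model and poly_model are made up for this example.

```r
# Assumed stand-in data (illustrative values only).
data <- data.frame(
  Level  = 1:10,
  Salary = c(45000, 50000, 60000, 80000, 110000,
             150000, 200000, 300000, 500000, 1000000)
)

# Manual feature columns, as in the article.
data$Level1 <- data$Level^2
data$Level2 <- data$Level^3
data$Level3 <- data$Level^4
manual_model <- lm(Salary ~ Level + Level1 + Level2 + Level3, data = data)

# Same degree-4 polynomial, with the powers generated inside the formula.
# raw = TRUE asks for plain powers instead of orthogonal polynomials.
poly_model <- lm(Salary ~ poly(Level, 4, raw = TRUE), data = data)

# Compare the fitted values of the two models.
same_fit <- all.equal(unname(fitted(manual_model)),
                      unname(fitted(poly_model)))
same_fit
```

The formula version has one practical advantage: when predicting, you only need to supply Level, not every power of it.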

Now let’s fit the model with the new features and plot the result.

> my_model = lm(Salary ~ Level + Level1 + Level2 + Level3, data = data)
> library(ggplot2)
> ggplot() +
+   geom_point(aes(x = data$Level, y = data$Salary),
+              colour = 'red') +
+   geom_line(aes(x = data$Level, y = predict(my_model, newdata = data)),
+             colour = 'blue') +
+   ggtitle('Truth or Bluff (Polynomial Regression)') +
+   xlab('Level') +
+   ylab('Salary')

And it gives us the following plot.

You can see the fit is much better now. So let’s make some predictions.
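To put a number on "much better", one option is to compare the two models directly. This is a rough sketch on assumed placeholder data, not the article's file: it compares R-squared and adjusted R-squared for the straight line and the degree-4 polynomial, and runs an F-test between the nested models with anova().

```r
# Assumed stand-in data (illustrative values only).
data <- data.frame(
  Level  = 1:10,
  Salary = c(45000, 50000, 60000, 80000, 110000,
             150000, 200000, 300000, 500000, 1000000)
)

linear_model <- lm(Salary ~ Level, data = data)
poly_model   <- lm(Salary ~ poly(Level, 4, raw = TRUE), data = data)

# R-squared never decreases when terms are added to a nested model,
# so adjusted R-squared (which penalises extra terms) is shown as well.
fit_summary <- data.frame(
  model  = c("linear", "degree 4"),
  r2     = c(summary(linear_model)$r.squared,
             summary(poly_model)$r.squared),
  adj_r2 = c(summary(linear_model)$adj.r.squared,
             summary(poly_model)$adj.r.squared)
)
fit_summary

# F-test: do the extra polynomial terms reduce the residual error
# by more than chance alone would explain?
anova(linear_model, poly_model)
```

With only ten rows, a high-degree polynomial can also overfit, so these numbers are best read alongside the plot rather than instead of it.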

> predict(my_model, data.frame(Level = 6.5,
+                              Level1 = 6.5^2,
+                              Level2 = 6.5^3,
+                              Level3 = 6.5^4))
Result: 158862.5
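A single predicted number hides how uncertain it is. As a hedged sketch on assumed placeholder data (not the article's file), predict() can also return a prediction interval. The model here uses the poly() formula form, so the new data frame only needs a Level column; poly_model and new_point are names invented for this example.

```r
# Assumed stand-in data (illustrative values only).
data <- data.frame(
  Level  = 1:10,
  Salary = c(45000, 50000, 60000, 80000, 110000,
             150000, 200000, 300000, 500000, 1000000)
)

# Degree-4 polynomial fitted directly from Level.
poly_model <- lm(Salary ~ poly(Level, 4, raw = TRUE), data = data)

# The point we want a salary estimate for.
new_point <- data.frame(Level = 6.5)

# interval = "prediction" returns a matrix with columns
# fit (point estimate), lwr and upr (interval bounds).
pred <- predict(poly_model, newdata = new_point,
                interval = "prediction", level = 0.95)
pred
```

If the interval is very wide compared with the estimate, that is a hint to treat the prediction with some caution.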
