Regression and Matrix Plots in Seaborn | Python

Original article was published on Artificial Intelligence on Medium

Seaborn is a Python visualization library built on top of Matplotlib. It offers many kinds of plots, including count plots, scatter plots, pair plots, regression plots and matrix plots. This article covers the regression plots and matrix plots in Seaborn.

What are Regression Plots?

The regression plots in Seaborn are primarily intended to add a visual guide that helps emphasize patterns in a dataset during exploratory data analysis. As the name suggests, a regression plot draws a regression line between two variables and helps visualize their linear relationship.

Getting started with Regression Plots –

1. Importing the required libraries
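A typical set of imports for the steps in this article might look like the sketch below. The aliases are only the common conventions, and the original notebook may have imported more than is shown here.

```python
import pandas as pd              # reading and inspecting the dataset
import seaborn as sns            # regression and matrix plots
import matplotlib.pyplot as plt  # figures and subplots
```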

2. Importing the dataset

The dataset is read with the pandas library.

You can download the same dataset from here.

Reading dataset using pd.read_csv() function of pandas
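As a rough sketch of this step, the snippet below reads CSV data with pd.read_csv(). The article's dataset is not reproduced here, so a few made-up rows held in a string stand in for the downloaded file; the column names hours, score and group are hypothetical.

```python
import io

import pandas as pd

# Stand-in for the downloaded file: made-up rows, NOT the article's dataset.
csv_text = """hours,score,group
1.0,52,A
2.5,61,B
4.0,,A
5.5,80,B
"""

# With a real file on disk this would be pd.read_csv("path/to/your_file.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(df.dtypes)
```

Empty fields, such as the missing score in the third row, are read as NaN.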

3. Exploratory data analysis

  • df.head() returns the first 5 rows of the dataset
df.head()
  • Checking the total number of NaN values in each column
Number of missing values in each column of the dataset
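The two checks above can be sketched as follows. The DataFrame is a small made-up example with hypothetical column names, and isnull().sum() is one common way to count missing values per column; the original notebook may have done it differently.

```python
import numpy as np
import pandas as pd

# Made-up data with a few missing values (hypothetical columns).
df = pd.DataFrame({
    "hours": [1.0, 2.5, 4.0, 5.5, 7.0, 8.5],
    "score": [52, 61, np.nan, 80, 85, np.nan],
    "group": ["A", "B", "A", None, "B", "A"],
})

first_rows = df.head()                   # first 5 rows by default
missing_per_column = df.isnull().sum()   # number of missing values per column

print(first_rows)
print(missing_per_column)
```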

4. Visualizations part

  • Plotting the regression with regplot() and evaluating it with residplot(). A residual plot shows the difference between each observation and the fitted line, which makes it useful for judging how well the model fits.
Subplots of Basic Regression and Residual plots
  • Polynomial regression: regplot() fits a polynomial when the “order” parameter is greater than 1, and residplot() accepts the same parameter.
Subplots of Polynomial Regression plots
  • Customizing regression plots: binning the data. The “x_bins” parameter groups the x variable into discrete bins and plots the mean of each bin; the regression is still fit to the original data.
Binning the data into discrete bins
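The three kinds of plot listed above can be sketched in one grid of subplots. The data here is randomly generated with a mildly curved relationship, and the column names hours and score are hypothetical; the article's own dataset and layout are not reproduced.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data with a slightly curved relationship (hypothetical columns).
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, 200)
score = 40 + 6 * hours - 0.3 * hours**2 + rng.normal(0, 3, 200)
df = pd.DataFrame({"hours": hours, "score": score})

fig, axes = plt.subplots(2, 3, figsize=(15, 8))

# Row 0 uses a linear fit (order=1), row 1 a second-order polynomial (order=2).
for row, order in enumerate([1, 2]):
    # Scatter plot with the fitted regression line
    sns.regplot(x="hours", y="score", data=df, order=order, ax=axes[row, 0])
    axes[row, 0].set_title(f"regplot, order={order}")

    # Residuals of the same fit; a visible curve suggests a poor fit
    sns.residplot(x="hours", y="score", data=df, order=order, ax=axes[row, 1])
    axes[row, 1].set_title(f"residplot, order={order}")

    # Same fit, with the points summarized as the mean of 5 bins of x
    sns.regplot(x="hours", y="score", data=df, order=order, x_bins=5,
                ax=axes[row, 2])
    axes[row, 2].set_title(f"regplot, order={order}, x_bins=5")

fig.tight_layout()
```

In a script, call plt.show() afterwards to display the figure; in a Jupyter notebook it is usually rendered automatically. With this made-up data the linear residual plot shows a curved pattern that mostly disappears with order=2.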

Getting started with Matrix Plots –

Seaborn’s heatmap() function requires data to be in a grid format.

The pandas corr() function is a common way to get the data into that shape: it returns a square table with one row and one column per variable.

1. Correlation function of pandas library

Pandas corr() function calculates correlations between columns in a dataframe.

corr() function of pandas library
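A minimal sketch of this step on made-up data (hypothetical column names) is shown below. Selecting the numeric columns first is an assumption added here so that the text column does not cause an error in recent pandas versions.

```python
import numpy as np
import pandas as pd

# Made-up data: score depends on hours, absences is unrelated noise.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, 100)
df = pd.DataFrame({
    "hours": hours,
    "score": 40 + 5 * hours + rng.normal(0, 4, 100),
    "absences": rng.integers(0, 10, 100),
    "group": rng.choice(["A", "B"], 100),
})

# Keep numeric columns only, then compute pairwise (Pearson) correlations.
corr_matrix = df.select_dtypes(include="number").corr()

print(corr_matrix.round(2))
```

The result is a square DataFrame whose diagonal is 1, since every column is perfectly correlated with itself.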

2. Building a heatmap of correlation matrix

The correlation matrix returned by corr() can be passed directly to Seaborn’s built-in heatmap() function, which draws each value as a coloured cell.

correlation matrix using seaborn’s heatmap function
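As a sketch, a correlation matrix can be drawn as a heatmap like this. The data is made up and the column names are hypothetical; only the heatmap() call itself reflects what the article describes.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up numeric data (hypothetical columns).
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, 100)
df = pd.DataFrame({
    "hours": hours,
    "score": 40 + 5 * hours + rng.normal(0, 4, 100),
    "absences": rng.integers(0, 10, 100),
})

corr_matrix = df.corr()

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(corr_matrix, ax=ax)  # default colours, with a colour bar
fig.tight_layout()
```

In a script, call plt.show() afterwards to display the figure.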

3. Customizing the heatmap

  • “annot” writes the actual value of each cell inside it when set to True.
  • “cmap” sets the colour map, for example coolwarm, plasma or magma.
  • “linewidths” sets the width of the lines separating the cells.
  • “linecolor” sets the colour of the lines separating the cells.
Customized heatmap
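The options listed above can be combined in a single call, as in the sketch below. The data is made up, the column names are hypothetical, and the specific values (coolwarm, a width of 1, black lines) are arbitrary choices for illustration. Note that Seaborn spells the line-width keyword linewidths.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up numeric data (hypothetical columns).
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, 100)
df = pd.DataFrame({
    "hours": hours,
    "score": 40 + 5 * hours + rng.normal(0, 4, 100),
    "absences": rng.integers(0, 10, 100),
})

corr_matrix = df.corr()

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(
    corr_matrix,
    annot=True,         # write each correlation value in its cell
    cmap="coolwarm",    # diverging colour map
    linewidths=1,       # width of the lines between cells
    linecolor="black",  # colour of the lines between cells
    ax=ax,
)
fig.tight_layout()
```

A diverging colour map such as coolwarm tends to suit correlation matrices, because positive and negative values get visually distinct colours.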

This brings us to the end of this article. I hope you have understood all the visualizations clearly. Make sure you practice as much as possible.

If you wish to check out more resources related to Data Science and Machine Learning, you can refer to my GitHub account.

Do look out for other Jupyter notebooks in the series which will explain the various other aspects of Data Visualizations with Seaborn in Python.

You can also check out my Data Science Portfolio on my GitHub account.

Hope you like the post.