Recognizing Handwritten Digits with Scikit-Learn

Original article was published by Aditya Bhandari on Artificial Intelligence on Medium

Recognizing handwritten text is a problem that dates back to the first automatic machines that needed to recognize individual characters in handwritten documents. Classifying handwritten text or numbers matters in many real-world scenarios. For example, a postal service can scan the postal codes on envelopes to automatically group the envelopes that have to be sent to the same place. This article shows how to recognize handwritten digits (0 to 9) from the well-known digits dataset that ships with Scikit-Learn, using a classifier called Logistic Regression.

Scikit-Learn is a Python library that provides many useful algorithms that can easily be applied and tuned for classification and other machine learning tasks.


If you already have Jupyter Notebook and all the necessary Python libraries and packages installed, you are ready to get started.

If not, you can use Google Colab too!

Let us start by importing our libraries
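The original import cell is not reproduced here, so the following is a minimal sketch of what this step could look like. The exact set of imports is an assumption based on the steps that follow (plotting, a classifier, a train/test split, and metrics). It also loads the digits dataset so we can check its shape.

```python
# Assumed imports for the rest of the walkthrough (not the author's exact cell)
import matplotlib.pyplot as plt
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

# Load the digits dataset bundled with Scikit-Learn
digits = datasets.load_digits()

# Each sample is an 8x8 grayscale image with an integer label from 0 to 9
print(digits.images.shape)  # (1797, 8, 8)
print(digits.target.shape)  # (1797,)
```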

Visualizing the images and Training
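As a rough illustration of the visualization step, the sketch below plots the first four images with their labels. The number of images shown, the figure size, and the colormap are assumptions chosen for readability, not values taken from the original article.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

digits = datasets.load_digits()

# Show the first four digit images side by side, each titled with its label
fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, label in zip(axes, digits.images, digits.target):
    ax.set_axis_off()
    ax.imshow(image, cmap="gray_r", interpolation="nearest")
    ax.set_title(f"Training: {label}")

plt.show()
```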

To use a classifier, we have to flatten each image into a one-dimensional feature vector
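Flattening here means turning each 8x8 image into a vector of 64 pixel values, so the whole dataset becomes a two-dimensional array of shape (samples, features). A minimal sketch, where the variable names `n_samples` and `data` are only illustrative:

```python
from sklearn import datasets

digits = datasets.load_digits()

# Reshape (n_samples, 8, 8) into (n_samples, 64); -1 lets NumPy infer the 64
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

print(data.shape)  # (1797, 64)
```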

Create a classifier: a support vector classifier
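This step names a support vector classifier, while the introduction mentions Logistic Regression, so the sketch below creates the support vector classifier and shows the other option as a comment. The `gamma=0.001` value and the `max_iter` setting are assumptions, not settings confirmed by the original article.

```python
from sklearn import svm

# Support vector classifier; gamma=0.001 is an assumed hyperparameter value
clf = svm.SVC(gamma=0.001)

# A Logistic Regression classifier could be swapped in instead, for example:
# from sklearn.linear_model import LogisticRegression
# clf = LogisticRegression(max_iter=5000)

print(clf)
```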

Split data into train and test subsets
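One way to do the split is with `train_test_split`. The sketch below assumes a 50/50 split without shuffling, which is a common choice for this dataset but is not confirmed by the original article. It then trains the classifier on the training half.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))

# Assumed split: first half for training, second half for testing
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)
print(X_train.shape, X_test.shape)  # (898, 64) (899, 64)

# Learn the digits on the training subset
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)
```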

Now predict the value of the digit
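Putting the previous steps together, prediction is a single call to `predict` on the test subset. The sketch below repeats the assumed setup (flattening, 50/50 unshuffled split, `SVC` with `gamma=0.001`) so it runs on its own, then prints the accuracy. The exact figure depends on the classifier and split you choose.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)

clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)

# Predict the digit for every image in the test subset
predicted = clf.predict(X_test)

# Fraction of test images whose predicted digit matches the true label
accuracy = metrics.accuracy_score(y_test, predicted)
print(f"Accuracy: {accuracy:.2%}")
```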

Confusion matrix

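A confusion matrix is a table that is often used to evaluate the accuracy of a classification model. Each row corresponds to a true digit and each column to a predicted digit, so correct predictions fall on the diagonal. We can use Seaborn or Matplotlib to plot the confusion matrix. We will be using Matplotlib for our confusion matrix.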
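A possible way to compute and draw the matrix with plain Matplotlib is sketched below. The setup repeats the assumed classifier and split from the earlier steps, and the colormap, figure size, and axis labels are illustrative choices.

```python
import matplotlib.pyplot as plt
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)

# Rows are true digits, columns are predicted digits
cm = metrics.confusion_matrix(y_test, predicted)

fig, ax = plt.subplots(figsize=(7, 7))
im = ax.imshow(cm, cmap="Blues")
fig.colorbar(im, ax=ax)
ax.set_xticks(range(10))
ax.set_yticks(range(10))
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
ax.set_title("Confusion matrix")

# Write the count inside each cell
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        ax.text(j, i, str(cm[i, j]), ha="center", va="center")

plt.show()
```

Large values on the diagonal and small values elsewhere indicate that most digits are classified correctly.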


From this article, we can see how easy it is to import a dataset, build a model using Scikit-Learn, train the model, make predictions with it, and find the accuracy of our predictions (which in our case is 97.11%). I hope this article helps you with your future endeavors!

Thank you for reading my article!

For the source code, click here.