IBM Watson: Visual Recognition 101

Original article was published on Deep Learning on Medium

Hi Folks,

In the early 2000s, computer researchers believed that identifying the objects in an image was almost impossible, even with highly advanced algorithms and computing power. Twenty years on, almost every product that works with images and visual media uses machine learning to some extent in its business processes to achieve high accuracy and effectiveness. Retail, manufacturing, education, visual auditing, and insurance are among the top consumers of ML-based image classification applications in today’s market.

The most important point is that we don’t need to reinvent the wheel, because most tech giants, such as Google and IBM, now ship modular SaaS machine learning solutions that include deep learning algorithms, pre-built trained models, and all the required hardware. We can do a fair amount of customization without worrying about most of the major concerns we faced in traditional development.

The IBM Watson Visual Recognition service provides a complete image classification solution. It uses deep learning algorithms to tag, classify, and train on visual content, identifying items such as scenes, objects, places, and people in images. The classification response includes keywords that describe the given content.

In this post, I’ll explain how we can implement a complete image classification solution using IBM’s Visual Recognition features. Please note that this demonstration uses the IBM Cloud Lite plan, which is completely free and includes 1,000 images per month across custom and pre-trained models.

Demo

Prerequisite

Before we begin, you need an IBM Cloud account. Go to Create a Free Account and sign up. It’s completely free and no credit card is required.

Step 1

Once you complete the account creation, log in and you will be routed to your IBM Cloud dashboard.

IBM Cloud Dashboard

Step 2

Go to ‘Create Resource’ in the dashboard and search for the ‘Visual Recognition’ service in the IBM Cloud products.

Search on IBM Cloud products

Step 3

Create the Visual Recognition service by providing basic details such as Region, Service Name, Resource Group, and Tag. For now, just give the Service Name field a meaningful name and leave all other fields as they are.

Create Service on IBM Cloud

Step 4

Once you create the service, you will be redirected to the service dashboard, which lists all the details related to this particular service. The authentication details required to access the service remotely are listed in the Manage tab.

IBM Service Dashboard

Step 5

Classify Using a Pre-Built Model

The IBM Watson Visual Recognition service provides a set of built-in models. These models are highly accurate, and we don’t need to train them ourselves. According to the documentation, there are three basic models.

  • General model: Default classification from thousands of classes.
  • Explicit model: Whether an image is inappropriate for general use.
  • Food model: Specifically for images of food items.

If these pre-built models fulfill your requirements, you can easily use them via RESTful web services.

CURL Request — General Model
CURL Request — Food-Model
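
As a rough Python equivalent of such a curl request, the sketch below builds a classify call for a pre-built model using only the standard library. It assumes the v3 `classify` endpoint, the `version=2018-03-19` date, the `classifier_ids` values `default` and `food`, and Basic authentication with the literal user name `apikey`. The service URL, API key, and image URL are placeholders, and `build_classify_request` and `classify` are names I made up, so check the Manage tab and the API reference for your instance before relying on any of it.

```python
import base64
import json
import urllib.parse
import urllib.request

# Placeholders: copy the real values from the Manage tab of your service.
SERVICE_URL = "https://example.invalid/visual-recognition/api"  # hypothetical
API_KEY = "your-api-key"  # hypothetical


def build_classify_request(service_url, api_key, image_url, classifier_id="default"):
    """Build (without sending) a GET request that classifies a public image URL."""
    query = urllib.parse.urlencode({
        "version": "2018-03-19",          # assumed API version date
        "url": image_url,                 # publicly reachable image to classify
        "classifier_ids": classifier_id,  # assumed IDs: "default" (General) or "food"
    })
    request = urllib.request.Request(f"{service_url}/v3/classify?{query}")
    # Basic auth: the user name is the literal string "apikey", the password is the key.
    token = base64.b64encode(f"apikey:{api_key}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    return request


def classify(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds and prints the URL; call classify(req) once real credentials are set.
    req = build_classify_request(SERVICE_URL, API_KEY,
                                 "https://example.com/cake.jpg", "food")
    print(req.full_url)
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and headers before spending any of the free monthly quota.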

Create a Custom Model

Most of the time a pre-built model is more than enough to get an accurate result, but in some cases we have to train our own model to get more accurate results for a specific scenario. To create a custom model, we need to launch Watson Studio.

IBM Watson Studio

Click ‘Create Model’ in the Classify Images box; the next window asks for the basic details of the custom model project. You can also restrict the collaborators and select the storage for the images used to train your model. For this demo, I don’t have any storage options other than the default one, so I’ll keep it as it is.

Create a New Project

Watson Studio requires a minimum of 10 images per class and two classes per model. Put the training images for each class into a zip file, name it after the intended class, and upload it to the project. In this case, I created three classes called Birthday Cake, Cup Cakes, and Wedding Cake. Once the training data is uploaded, you can train your model by clicking the ‘Train Model’ button. Training takes some time, and a notification pops up once it is done.
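
Preparing the archives by hand gets tedious as the number of classes grows. The sketch below is one way to automate it, assuming a hypothetical layout with one sub-folder of images per class. It checks the minimums mentioned above (10 images per class, two classes per model) and writes one zip per class, named after the folder. The folder layout, the function name `build_class_zips`, and the accepted file extensions are my assumptions, not something Watson Studio prescribes.

```python
import zipfile
from pathlib import Path

MIN_IMAGES_PER_CLASS = 10  # minimums quoted in this article
MIN_CLASSES = 2
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image types


def build_class_zips(dataset_dir, output_dir):
    """Zip each class sub-folder of dataset_dir into output_dir/<class name>.zip."""
    dataset_dir, output_dir = Path(dataset_dir), Path(output_dir)

    # Collect the image files of every class folder, in a stable order.
    images_by_class = {}
    for class_dir in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
        images_by_class[class_dir.name] = sorted(
            f for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )

    # Validate everything before writing any archive.
    if len(images_by_class) < MIN_CLASSES:
        raise ValueError(
            f"need at least {MIN_CLASSES} classes, found {len(images_by_class)}")
    for name, images in images_by_class.items():
        if len(images) < MIN_IMAGES_PER_CLASS:
            raise ValueError(
                f"class '{name}' has {len(images)} images, "
                f"need at least {MIN_IMAGES_PER_CLASS}")

    output_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for name, images in images_by_class.items():
        archive = output_dir / f"{name}.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for image in images:
                zf.write(image, arcname=image.name)  # flat archive, no sub-folders
        archives.append(archive)
    return archives
```

For the three classes in this demo, a call such as `build_class_zips("cakes", "zips")` would be expected to produce `Birthday Cake.zip`, `Cup Cakes.zip`, and `Wedding Cake.zip`, ready to upload.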

Train Model Window

We can test our model using the custom model test option to get some idea of the accuracy of the trained model. You can see how good the results are even with the minimum required amount of training data.

Test Custom Model

As with the pre-built models, we can use our custom model through a RESTful web service.

CURL Request — Custom Model
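
Whichever model you call, the result comes back as JSON. The sketch below pulls the best-scoring class for each image out of such a response. The sample payload is hand-written to mirror the shape I'd expect from the v3 classify endpoint (`images`, then `classifiers`, then `classes` with `class` and `score` fields). The classifier ID and the scores are made up for illustration, and `top_classes` is a hypothetical helper, so verify the field names against a real response.

```python
import json

# Hand-written sample, NOT real service output; the ID and scores are invented.
SAMPLE_RESPONSE = json.loads("""
{
  "images": [
    {
      "image": "cake.jpg",
      "classifiers": [
        {
          "classifier_id": "CakeModel_0000000000",
          "name": "CakeModel",
          "classes": [
            {"class": "Birthday Cake", "score": 0.12},
            {"class": "Wedding Cake", "score": 0.91},
            {"class": "Cup Cakes", "score": 0.05}
          ]
        }
      ]
    }
  ]
}
""")


def top_classes(response, threshold=0.5):
    """Return {image name: (class, score)} for the best class at or above threshold."""
    results = {}
    for image in response.get("images", []):
        # Flatten the classes of every classifier that scored this image.
        candidates = [
            (cls["class"], cls["score"])
            for classifier in image.get("classifiers", [])
            for cls in classifier.get("classes", [])
            if cls["score"] >= threshold
        ]
        if candidates:
            best = max(candidates, key=lambda pair: pair[1])
            results[image.get("image", "<unnamed>")] = best
    return results


print(top_classes(SAMPLE_RESPONSE))  # {'cake.jpg': ('Wedding Cake', 0.91)}
```

With the v3 API, I'd expect a custom model to be selected by passing its classifier ID in the `classifier_ids` parameter of the classify request; the exact ID is shown for your model in Watson Studio.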

I hope this gave you a good idea of IBM’s Visual Recognition service. Try creating your own model and see how accurate the results are.

Happy Coding.