Own Image Set Classifier Using TensorFlow API

What is TensorFlow?

TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library and is also used for machine learning applications such as neural networks. (Source: Wikipedia)

What are we gonna do today?

We are gonna use the TensorFlow API to create our own image classifier.

Traditionally, to build our own image classifier we would train a convolutional neural network (CNN) from scratch. Once the network is trained, it can predict the class of the test images we give it.

Training a network this way needs a lot of time and processing power.

What if there were a way to do this with just a few commands in the command prompt or terminal, and what if it took less time to train and still gave a really good result?

In this post, we will use the TensorFlow API to create our own image classifier. We will be using transfer learning: we start with a model that has already been trained on another problem, and then retrain it on a similar problem. Deep learning from scratch can take days, but transfer learning can be done in short order.

We are going to use a model trained on the ImageNet Large Visual Recognition Challenge dataset. These models can differentiate between 1,000 different classes, like Dalmatian or dishwasher. You will have a choice of model architectures, so you can determine the right tradeoff between speed, size, and accuracy for your problem.

We will use this same model, but retrain it to tell apart a small number of classes based on our own examples.
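
To make the idea of "retrain only the last part" concrete, here is a toy sketch in plain Python with no TensorFlow involved. The function frozen_features is a made-up stand-in for the pretrained network and is never updated; only a small logistic final layer is trained on top of it. All names and the four sample points are illustrative assumptions, not part of the codelab.

```python
import math


def frozen_features(x):
    """Hypothetical stand-in for a pretrained network: fixed, never trained here."""
    return [x[0] + x[1], x[0] - x[1]]


def train_final_layer(samples, labels, steps=500, lr=0.5):
    """Train only a logistic 'final layer' on top of the frozen features."""
    # Features are computed once and reused on every pass over the data.
    feats = [frozen_features(x) for x in samples]
    w, b = [0.0, 0.0], 0.0
    for _ in range(steps):
        for f, y in zip(feats, labels):
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            g = p - y                       # gradient of the log loss w.r.t. z
            w[0] -= lr * g * f[0]
            w[1] -= lr * g * f[1]
            b -= lr * g
    return w, b


def predict(x, w, b):
    """Return 0 or 1 for a raw input, using frozen features plus the trained layer."""
    f = frozen_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0


if __name__ == "__main__":
    toy_samples = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
    toy_labels = [0, 0, 1, 1]
    weights, bias = train_final_layer(toy_samples, toy_labels)
    print([predict(x, weights, bias) for x in toy_samples])
```

The real retraining works on image features from a deep network rather than two hand-written numbers, but the division of labour is the same: the expensive part stays fixed and only a small classifier on top is fitted.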

Setup

I will be using Python 3, and Python has been added to my PATH environment variable.

  1. Install TensorFlow.
  2. Clone the repository (on Linux) or download it (on Windows):

git clone https://github.com/googlecodelabs/tensorflow-for-poets-2

  3. Once that's done, navigate into the folder named ‘tensorflow-for-poets-2’:

cd tensorflow-for-poets-2

Training images

We will need a set of images to teach the model about the new classes we want to recognize. We will use dog and cat photos for the initial training. Download the photos using Google Image Search and a Chrome extension called Fatkun Batch Download Image, which lets you download all the images in a particular tab.
Save the images in two separate folders named cat and dog. It is recommended to use at least 1000 images in each folder for training.

Place these two image folders inside the folder tf_files/Data.
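
Before training, it can help to check that each class folder really holds the number of images you expect. The following is a minimal sketch under a few assumptions: the folders live in tf_files/Data as described above, and only .jpg/.jpeg files are counted, on the assumption that JPEG is what the retraining script reads. The function name is my own.

```python
from pathlib import Path

# Assumption: only JPEG files count as training images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg"}


def count_images_per_class(image_dir):
    """Return {class_folder_name: number_of_jpeg_files} for each subfolder."""
    counts = {}
    for class_dir in sorted(Path(image_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


if __name__ == "__main__":
    data_dir = Path("tf_files/Data")
    if data_dir.is_dir():
        for name, n in count_images_per_class(data_dir).items():
            print(f"{name}: {n} images")
    else:
        print(f"{data_dir} not found; run this from the repository folder")
```

If one folder turns out much smaller than the other, it is probably worth downloading more images for that class before retraining.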

Re-Training the network

The retrain script can retrain either an Inception V3 model or a MobileNet. In this exercise, we will use a MobileNet. The principal difference is that Inception V3 is optimized for accuracy, while the MobileNets are optimized to be small and efficient, at the cost of some accuracy.

Inception V3 has a first-choice accuracy of 78% on ImageNet, but the model is 85MB and requires many times more processing than even the largest MobileNet configuration, which achieves 70.5% accuracy with just a 19MB download.

Pick the following configuration options:

  • Input image resolution: 128, 160, 192, or 224px. Unsurprisingly, feeding in a higher resolution image takes more processing time, but results in better classification accuracy. We recommend 224 as an initial setting.
  • The relative size of the model as a fraction of the largest MobileNet: 1.0, 0.75, 0.50, or 0.25. We recommend 0.5 as an initial setting. The smaller models run significantly faster, at a cost of accuracy.
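
The two options above are combined into a single architecture name. As far as I can tell from the codelab, that name has the form mobilenet_<fraction>_<size>, for example mobilenet_0.50_224 for the recommended settings. The helper below is a small sketch of that naming scheme. The function and constant names are my own, and the format is an assumption worth checking against the retrain script.

```python
# Options listed above; the fraction is kept as a string so "0.50" keeps its zero.
VALID_SIZES = (128, 160, 192, 224)
VALID_FRACTIONS = ("1.0", "0.75", "0.50", "0.25")


def mobilenet_architecture(fraction="0.50", image_size=224):
    """Build an architecture name such as 'mobilenet_0.50_224' (assumed format)."""
    if fraction not in VALID_FRACTIONS:
        raise ValueError(f"fraction must be one of {VALID_FRACTIONS}")
    if image_size not in VALID_SIZES:
        raise ValueError(f"image_size must be one of {VALID_SIZES}")
    return f"mobilenet_{fraction}_{image_size}"


if __name__ == "__main__":
    print(mobilenet_architecture())
```

The resulting string is the kind of value passed to the --architecture flag of the retraining script.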

With the recommended settings, it typically takes only a couple of minutes to retrain on a laptop.

Start TensorBoard

TensorBoard is a monitoring and inspection tool included with TensorFlow. You will use it to monitor the training progress.

tensorboard --logdir tf_files/training_summaries &

This command will fail with the following error if you already have a tensorboard process running:

ERROR:tensorflow:TensorBoard attempted to bind to port 6006, but it was already in use

You can kill all existing TensorBoard instances with:

pkill -f "tensorboard"
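
If you would rather check whether something is already listening on port 6006 before starting TensorBoard, a rough check can be written with Python's standard socket module. This is only a sketch: the function name is my own, and it only tells you that some process accepts connections on the port, not that the process is TensorBoard.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(6006):
        print("Port 6006 is busy; an earlier TensorBoard may still be running.")
    else:
        print("Port 6006 looks free.")
```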

Run the training

ImageNet models are networks with millions of parameters that can differentiate a large number of classes. We’re only training the final layer of that network, so training will end in a reasonable amount of time.

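Start your retraining with one big command. It reads the model name from the ARCHITECTURE shell variable, so set that first (for example, ARCHITECTURE=mobilenet_0.50_224 for the recommended settings):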

python -m scripts.retrain \
--bottleneck_dir=tf_files/bottlenecks \
--how_many_training_steps=500 \
--model_dir=tf_files/models/ \
--summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \
--output_graph=tf_files/retrained_graph.pb \
--output_labels=tf_files/retrained_labels.txt \
--architecture="${ARCHITECTURE}" \
--image_dir=tf_files/Data

Optional: I’m NOT in a hurry!

The first retraining command iterates only 500 times. You can very likely get improved results (i.e. higher accuracy) by training for longer. To get this improvement, remove the parameter --how_many_training_steps to use the default 4,000 iterations.

python -m scripts.retrain \
--bottleneck_dir=tf_files/bottlenecks \
--model_dir=tf_files/models/"${ARCHITECTURE}" \
--summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \
--output_graph=tf_files/retrained_graph.pb \
--output_labels=tf_files/retrained_labels.txt \
--architecture="${ARCHITECTURE}" \
--image_dir=tf_files/Data

Using the Retrained Model

The retraining script writes data to the following two files:

  • tf_files/retrained_graph.pb, which contains a version of the selected network with a final layer retrained on your categories.
  • tf_files/retrained_labels.txt, which is a text file containing labels.
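
The labels file is plain text, so it is easy to inspect from Python. The sketch below assumes the file holds one label per line, which is how I understand the retraining script writes it. The function name is illustrative.

```python
def load_labels(label_file):
    """Read a labels file (assumed: one label per line) into a list of strings."""
    with open(label_file, encoding="utf-8") as f:
        # Skip blank lines and strip trailing newlines.
        return [line.strip() for line in f if line.strip()]


if __name__ == "__main__":
    import os

    path = "tf_files/retrained_labels.txt"
    if os.path.exists(path):
        print(load_labels(path))
    else:
        print(f"{path} not found; run the retraining step first")
```

With the cat and dog folders used in this post, one would expect the list to contain those two names.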

Classifying an image

The codelab repo also contains a copy of TensorFlow’s label_image.py example, which you can use to test your network.

Now, let’s run the script on a test image:

python -m scripts.label_image \
--graph=tf_files/retrained_graph.pb \
--image=tf_files/<test_image_folder>/<testImage>.jpg
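
The script reports a score for each label, with the most likely class first. Conceptually, this last step is just a sort of label and score pairs. Here is a small sketch of that ranking in plain Python. It is my own illustration with made-up scores, not the code of label_image.py.

```python
def top_predictions(scores, labels, k=5):
    """Return up to k (label, score) pairs, highest score first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical scores for a two-class cat/dog model.
    for label, score in top_predictions([0.03, 0.97], ["cat", "dog"]):
        print(f"{label}: {score:.2f}")
```

The order of the scores has to match the order of the labels in retrained_labels.txt for the pairing to make sense.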

Next steps

Congratulations, you’ve taken your first steps into a larger world of deep learning!

You can see more about using TensorFlow at the TensorFlow website or the TensorFlow GitHub project. There are lots of other resources available for TensorFlow, including a discussion group and whitepaper.

If you make a trained model that you want to run in production, you should also check out TensorFlow Serving, an open source project that makes it easier to manage TensorFlow projects.

Source: Deep Learning on Medium