Source: Deep Learning on Medium
Recently, deep learning (DL), and especially convolutional neural networks (CNNs), has made huge advances in image classification. On several well-known data sets (MNIST, CIFAR-10, …), DL models easily reach an accuracy above 90%. But most of these techniques deal with a fixed number of categories in the training set. For a system that regularly needs new categories added, it is hard to maintain the same accuracy without retraining on the whole data set from the beginning.
I came across the following two links while looking for a way to add new categories to a DL model: What techniques to use for image matching, how-to-add-a-new-category-to-a-deep-learning-model
They propose a method called content-based image retrieval to deal with new categories. In the following part, I describe one such method, Deep Learning of Binary Hash Codes for Fast Image Retrieval, which is introduced in the paper of the same name.
Deep Learning of Binary Hash Codes for Fast Image Retrieval
An image retrieval system takes an uploaded query image and returns images of the same type.
The idea of this technique is to return the images whose last feature layer is closest to that of the query image (the last layers represent high-level features, whereas the first layers represent low-level features such as edges, lines, …).
In the image retrieval problem, the number of images to compare with the query image can be huge. The paper proposes a hierarchical deep search that uses a coarse-to-fine strategy. First, the coarse search: binarize the activations of the last layer (the latent layer in the image above) with a threshold, then identify the m candidates with the lowest Hamming distance to the query image (the number of bits that differ between two binary vectors). Second, the fine search: compute the Euclidean distance between the previous layer (F7) of the query image and that of each image in the candidate pool, then sort by that value.
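To make the two stages concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the tuple layout of `pool`, and the parameters `m` and `k` are all hypothetical, and the 0.5 threshold assumes the latent layer outputs sigmoid activations in [0, 1]. Plain lists stand in for the feature arrays a real model would produce.

```python
import math

def binarize(activations, threshold=0.5):
    """Turn latent-layer activations into a list of 0/1 bits.
    The 0.5 threshold is an assumption (sigmoid outputs in [0, 1])."""
    return [1 if a >= threshold else 0 for a in activations]

def hamming(code_a, code_b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(code_a, code_b))

def coarse_to_fine(query_latent, query_f7, pool, m=3, k=2):
    """Return the ids of the k best matches for the query.

    pool is a list of (image_id, latent_activations, f7_features) tuples;
    this layout is hypothetical and only serves the illustration.
    """
    q_code = binarize(query_latent)
    # Coarse search: keep the m images whose binary codes are closest
    # to the query code in Hamming distance.
    candidates = sorted(
        pool, key=lambda item: hamming(q_code, binarize(item[1]))
    )[:m]
    # Fine search: re-rank only those candidates by the Euclidean
    # distance between their F7 features and the query's F7 features.
    ranked = sorted(candidates, key=lambda item: math.dist(query_f7, item[2]))
    return [item[0] for item in ranked[:k]]

# Toy data: 4-bit latent codes and 2-dimensional "F7" features.
pool = [
    ("a", [0.9, 0.1, 0.8, 0.2], [1.0, 0.0]),
    ("b", [0.8, 0.2, 0.7, 0.6], [0.2, 0.1]),
    ("c", [0.1, 0.9, 0.2, 0.8], [0.0, 0.0]),
    ("d", [0.9, 0.2, 0.9, 0.1], [0.1, 0.1]),
]
result = coarse_to_fine([0.95, 0.05, 0.9, 0.1], [0.0, 0.0], pool, m=3, k=2)
```

In this toy pool, image "c" has the closest F7 features to the query, but its binary code differs in all four bits, so the coarse stage discards it before the Euclidean distance is ever computed. That is the trade-off of the strategy: the cheap Hamming filter keeps the expensive comparison small, at the risk of dropping a candidate too early.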
Object Verification with Binary Hash Codes
The problem I want to address in this post is not image retrieval but object verification: checking whether an image matches its claimed identity, in a setting where new categories can be added regularly.
The data set used is MNIST with the digits 0 to 7. The new categories 8 and 9 will be added later using the approach above.
- First, the model is fine-tuned for a classification problem with the digits 0 to 9 (>99% accuracy)
(all the code uses the fast.ai v1 library)
- Then, I run the model on the new data for the digit 9 and binarize the last layer (for now, I only use the coarse search step). The result is saved as ref9 (it has 48 elements, a size the paper reports as giving good results)
- Finally, I run the model on all the other digits from 0 to 8 (10 pictures for each digit) and compare their feature vectors with ref9. The result is the average percentage of similarity between the feature vectors:
  - Number 9:
First, we test with number 9 to check that the similarity is high. The average (84%) is acceptable, but its variance is quite high.
  - Number 0:
  - Number 1:
  - Number 2:
All the other numbers have a low similarity (<60%), except number 7:
Its similarity to ref9 is even higher than that of number 9 itself.
- So this approach of binarizing the last layer does not work very well. In the next step, I will try computing the Euclidean distance.
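The comparison against ref9 can be sketched as follows. This is an illustration under stated assumptions, not the code from this post: the post does not say how the reference images for 9 are combined into one code, so the per-bit majority vote in `reference_code` is a guess, and all function names are hypothetical. The toy codes have 4 bits instead of 48 to keep the example readable.

```python
from statistics import mean, pvariance

def binarize(activations, threshold=0.5):
    """Turn latent-layer activations into 0/1 bits (threshold is assumed)."""
    return [1 if a >= threshold else 0 for a in activations]

def reference_code(codes):
    """Combine several binary codes into one reference code.
    Assumption: per-bit majority vote; the post does not specify this."""
    n = len(codes)
    return [1 if 2 * sum(bits) >= n else 0 for bits in zip(*codes)]

def similarity(code, ref):
    """Percentage of bit positions where code and ref agree."""
    matches = sum(x == y for x, y in zip(code, ref))
    return 100.0 * matches / len(ref)

def score_digit(codes, ref):
    """Mean and population variance of the similarity over a digit's images."""
    scores = [similarity(code, ref) for code in codes]
    return mean(scores), pvariance(scores)

# Toy stand-ins for the binarized latent layer of a few images of the new digit.
new_digit_codes = [[1, 0, 1, 1], [1, 0, 0, 1], [1, 1, 1, 1]]
ref = reference_code(new_digit_codes)

# Toy stand-ins for test images of one digit, starting from raw activations.
test_codes = [binarize([0.9, 0.2, 0.7, 0.8]), binarize([0.8, 0.1, 0.9, 0.3])]
avg, var = score_digit(test_codes, ref)
```

Reporting the variance next to the average matters here: an average similarity can look acceptable while individual images of the same digit score very differently, which is the behaviour noted for number 9 above.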
You can find the source code for this post here: