Xception Neural Network Transfer learning and Data Processing using AI

Source: Deep Learning on Medium


Hi everyone. I have been working on an AI wedding card generator using a GAN. To do that I need a lot of data, so I scraped a bunch of images from websites and Google. Many unwanted images are mixed in, so I want to prune them. But there are 100,000 images for borders alone, and I also downloaded other wedding card elements such as monograms, patterns, corner patterns, backgrounds, fonts, and components. Deleting them manually is a headache, so I want the AI to do the pruning. I trained an Xception neural network to classify wanted and unwanted images, and I will explain how I did it. First I will explain how to prune the patterns, which is a binary classification. Then I will explain the monograms, which are a little more complex.

The first step is to prepare the image dataset as an HDF5 file. I manually selected 500 pattern images, plus non-pattern images and pattern images with watermarks, and saved them in two separate folders, pattern and notpattern. I didn't select the dataset at random. I chose the images carefully so that they vary a lot, because I am going to train the neural network on a small number of images, and it needs a wide variety of them to predict properly.

Step 1: import all the required modules.

import numpy as np
import cv2
import glob
import os
from random import shuffle
from copy import copy
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import h5py

Next, collect the paths of the images and store them in variables.

notapattern = list(glob.iglob(u'D:\\aidelete\\testing\\notpattern\\*.*'))
pattern = list(glob.iglob(u'D:\\aidelete\\testing\\pattern\\*.*'))

Then assign the labels: 0 for non-pattern images and 1 for pattern images.

labels = [0 for path in notapattern]
for path in pattern:
    labels.append(1)
# one combined list of paths, in the same order as the labels
addrs = copy(notapattern)
addrs.extend(pattern)

Check that every image file is readable. Because we downloaded the images from the internet, we can't be sure that all of them are readable. If an image can't be opened while we are dumping the images into the HDF5 file, an error will pop up and we have to start everything again from the beginning.

for i in addrs:
    # the \\?\ prefix lets Windows open very long paths
    check = u'\\\\?\\' + i
    try:
        with open(check, 'rb') as f:
            pass
    except OSError:
        # report the file instead of removing it here, so addrs and labels stay aligned
        print('Cannot open:', i)
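
If you would rather drop the unreadable files automatically, the paths and labels have to be filtered together, otherwise the labels shift against the images. Here is a minimal sketch of that idea. The function name `filter_readable` is my own, not part of the original code, and it only checks that a file can be opened, not that it decodes as a valid image.

```python
def filter_readable(paths, labels):
    """Return (paths, labels) restricted to files that can be opened for reading."""
    kept_paths, kept_labels = [], []
    for path, label in zip(paths, labels):
        try:
            with open(path, "rb") as f:
                f.read(1)  # touch the file to make sure it is really readable
        except OSError:
            print("Skipping unreadable file:", path)
            continue
        # keep the path and its label together so they stay aligned
        kept_paths.append(path)
        kept_labels.append(label)
    return kept_paths, kept_labels
```

It could be called as `addrs, labels = filter_readable(addrs, labels)` before the shuffle step.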

Combine the image paths and labels into one list and shuffle it, then separate them again and split them into training, validation, and test sets. We also fix the image size at 299 x 299 x 3, because that is the default input shape of Xception; it can also handle other input shapes.

data = list(zip(addrs, labels))
shuffle(data)
addrs, labels = zip(*data)
# Divide the data into 60% train, 20% validation, and 20% test
train_addrs = addrs[0:int(0.6*len(addrs))]
train_labels = labels[0:int(0.6*len(labels))]
# validation is the slice from 60% to 80% (a 0.6-to-0.2 slice would be empty)
val_addrs = addrs[int(0.6*len(addrs)):int(0.8*len(addrs))]
val_labels = labels[int(0.6*len(addrs)):int(0.8*len(addrs))]
test_addrs = addrs[int(0.8*len(addrs)):]
test_labels = labels[int(0.8*len(labels)):]
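
The slice boundaries are easy to get wrong, so here is a small self-contained sketch of a 60/20/20 split that can be checked on toy data. The helper name `split_dataset` and its percent parameters are assumptions of mine for illustration; it uses integer arithmetic for the boundaries rather than `int(0.6*len(...))`.

```python
import random

def split_dataset(items, train_pct=60, val_pct=20, seed=None):
    """Shuffle items and split them into (train, validation, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    # the three slices are contiguous, so no item is lost or duplicated
    return items[:train_end], items[train_end:val_end], items[val_end:]

# toy usage: (path, label) pairs stay together through the shuffle
pairs = [("img%d.png" % i, i % 2) for i in range(10)]
train, val, test = split_dataset(pairs, seed=0)
print(len(train), len(val), len(test))
```

Passing the zipped (path, label) pairs keeps each label attached to its image.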

Then generate the HDF5 file that will store the images.

hdf5_path = u'D:/aidelete/pattern/patterndataset.hdf5'
train_shape = (len(train_addrs), 299, 299, 3)
val_shape = (len(val_addrs), 299, 299, 3)
test_shape = (len(test_addrs), 299, 299, 3)
# open an hdf5 file and create the datasets
hdf5_file = h5py.File(hdf5_path, mode='w')
# pixel values are 0-255, so the images need an unsigned 8-bit type
hdf5_file.create_dataset("train_img", train_shape, np.uint8)
hdf5_file.create_dataset("val_img", val_shape, np.uint8)
hdf5_file.create_dataset("test_img", test_shape, np.uint8)
# the labels are only 0 and 1, so int8 is enough for them
hdf5_file.create_dataset("train_labels", (len(train_addrs),), np.int8)
hdf5_file["train_labels"][...] = train_labels
hdf5_file.create_dataset("val_labels", (len(val_addrs),), np.int8)
hdf5_file["val_labels"][...] = val_labels
hdf5_file.create_dataset("test_labels", (len(test_addrs),), np.int8)
hdf5_file["test_labels"][...] = test_labels
# keep the file open: the images are written in the next step
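
It helps to know roughly how big these image datasets get before creating them. The sketch below is plain arithmetic and assumes uncompressed storage with one byte per value, which is what `create_dataset` gives by default for an 8-bit type. The helper names are my own, and the image counts are only examples.

```python
def dataset_bytes(n_images, height=299, width=299, channels=3, bytes_per_value=1):
    """Uncompressed size in bytes of an image array of the given shape."""
    return n_images * height * width * channels * bytes_per_value

def human(n_bytes):
    """Format a byte count in gigabytes (1 GB = 1e9 bytes)."""
    return "{:.2f} GB".format(n_bytes / 1e9)

for n in (1000, 100000):
    print(n, "images as 8-bit:", human(dataset_bytes(n)))
    print(n, "images as float32:", human(dataset_bytes(n, bytes_per_value=4)))
```

The float32 figure matters later, because converting the loaded arrays with `astype('float32')` makes them four times larger in memory.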

Now dump the images into the HDF5 file that we created.

def load_image(addr):
    # cv2.imread returns None instead of raising when it cannot read a file
    img = cv2.imread(addr)
    if img is not None:
        # cv2 loads images as BGR, convert to RGB
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    else:
        # fall back to matplotlib, using the Windows long-path prefix
        print(addr)
        img = mpimg.imread(u'\\\\?\\' + addr)
        if img.dtype != np.uint8:
            # matplotlib returns PNG data as floats in [0, 1]
            img = (img * 255).astype(np.uint8)
        if img.ndim == 2:
            # grayscale image: repeat the single channel three times
            img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
        elif img.shape[2] == 4:
            # drop the alpha channel
            img = cv2.cvtColor(img, cv2.COLOR_RGBA2RGB)
    # add any image pre-processing here
    # resize to (299, 299)
    return cv2.resize(img, (299, 299), interpolation=cv2.INTER_CUBIC)

splits = [("train_img", train_addrs, "Train"),
          ("val_img", val_addrs, "Validation"),
          ("test_img", test_addrs, "Test")]
for name, split_addrs, title in splits:
    for i, addr in enumerate(split_addrs):
        # print how many images are saved every 1000 images
        if i % 1000 == 0 and i > 1:
            print('{} data: {}/{}'.format(title, i, len(split_addrs)))
        # save the image
        hdf5_file[name][i, ...] = load_image(addr)
# close the hdf5 file
hdf5_file.close()

That's it: all the images are now stored in an HDF5 file, and we can use this file to train the Xception network.

If an error occurs in a Jupyter notebook, restart the kernel, delete the HDF5 file, check the code, the shapes, and whether the images are readable, and then start the process again.

Data processing is now done, so we can start the training process. Before that, I recommend reading up on the Xception neural network.

I chose the Xception network because it has few parameters and high accuracy. Some other networks are more accurate than Xception, but they need more computation power, so Xception was the better trade-off for me.

Xception is built from depthwise separable convolutions.

Let’s start the code

Import the modules.

from keras.applications.xception import Xception, preprocess_input
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
from keras.models import model_from_json
from keras.optimizers import Nadam
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import MaxPooling2D

Load the dataset and extract the training, validation, and test sets separately.

hdf5_path = '/path/name_of_the_dataset.hdf5'
hdf5_file = h5py.File(hdf5_path, "r")
# read each dataset fully into memory as a NumPy array
# (dataset[...] replaces the old .value attribute, which newer h5py releases no longer have)
X_train = hdf5_file['train_img'][...]
y_train = hdf5_file['train_labels'][...]
X_val = hdf5_file['val_img'][...]
y_val = hdf5_file['val_labels'][...]
X_test = hdf5_file['test_img'][...]
y_test = hdf5_file['test_labels'][...]
hdf5_file.close()
# make sure the arrays are [samples][height][width][channels] and float32
X_train = X_train.reshape(X_train.shape[0], 299, 299, 3).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 299, 299, 3).astype('float32')
X_val = X_val.reshape(X_val.shape[0], 299, 299, 3).astype('float32')

Now load the Xception network. We are not going to modify anything inside Xception; this is plain transfer learning, so we only add fully connected layers on top of its last output layer. That output first has to be reduced to a vector. The last Xception layer has 2048 channels on a small spatial grid. Even a 3 x 3 grid flattens to 18,432 values, and for a 299 x 299 input the grid is 10 x 10, which flattens to 204,800. So instead of Flatten we reduce it with GlobalAveragePooling2D, which keeps one value per channel. Then we train the model and save it.

base_model = Xception(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# fully connected layers on top of the pooled features
x = Dense(200, activation='elu')(x)
x = Dropout(0.4)(x)
x = Dense(170, activation='elu')(x)
# single output: 1 for pattern, 0 for non-pattern
predictions = Dense(1, activation='hard_sigmoid')(x)
model = Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer=Nadam(lr=0.0001),
              loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          shuffle="batch", epochs=10, batch_size=10, verbose=1)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0, batch_size=10)
print("Baseline Error: %.2f%%" % (100-scores[1]*100))
# serialize model to JSON
try:
    model_json = model.to_json()
    with open("/content/drive/My Drive/patternmodelxceptioneludrop299.json", "w") as json_file:
        json_file.write(model_json)
except Exception as err:
    # report the problem instead of hiding it; the weights are still saved below
    print("Could not save the model JSON:", err)
# serialize weights to HDF5
model.save_weights("/content/drive/My Drive/patternmodelxceptioneludrop299.h5")
print("Saved model to disk")
Xception Network
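
To make the Flatten versus global average pooling trade-off concrete, here is a small sketch in plain Python, without Keras. It counts the weights of the first dense layer for both options and shows what global average pooling does to a tiny feature map. The helper names are my own. The 10 x 10 grid is what I would expect Xception to produce for a 299 x 299 input, so treat that number as an assumption and check it against `model.summary()`.

```python
def flatten_size(height, width, channels):
    """Number of values after flattening a height x width x channels feature map."""
    return height * width * channels

def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

def global_average_pool(feature_map):
    """Average each channel over all spatial positions of feature_map[h][w][c]."""
    h, w, c = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    return [sum(feature_map[i][j][k] for i in range(h) for j in range(w)) / (h * w)
            for k in range(c)]

# first Dense(200) layer after Flatten, for two example grid sizes
print(dense_params(flatten_size(3, 3, 2048), 200))
print(dense_params(flatten_size(10, 10, 2048), 200))
# first Dense(200) layer after global average pooling (one value per channel)
print(dense_params(2048, 200))

# a 2 x 2 feature map with 2 channels collapses to 2 values
tiny = [[[1, 10], [2, 20]],
        [[3, 30], [4, 40]]]
print(global_average_pool(tiny))
```

With pooling, the first dense layer is the same size whatever the spatial grid is, which is why the model can be built without fixing the input shape.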

Now we convert the 100,000 original images into a dataset, the same way we did for the training dataset. After that, load the model and the dataset, predict on the dataset, and store the predictions as a NumPy array file.

import h5py
import numpy as np
from keras.models import model_from_json
from keras.optimizers import Nadam

hdf5_path = u'/content/drive/My Drive/originaldataset.hdf5'
hdf5_file = h5py.File(hdf5_path, "r")
# read the whole image dataset into memory
X_pattern = hdf5_file['pattern_img'][...]
hdf5_file.close()
# rebuild the model from its JSON description, then load the trained weights
with open('/content/drive/My Drive/patternmodelxceptioneludrop299.json') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)
loaded_model.load_weights("/content/drive/My Drive/patternmodelxceptioneludrop299.h5")
loaded_model.compile(optimizer=Nadam(lr=0.00008), loss='binary_crossentropy', metrics=['accuracy'])
prediction = loaded_model.predict(X_pattern, batch_size=1, verbose=1)
np.save('/content/drive/My Drive/prediction_pattern.npy', prediction)

Now we can delete the non-pattern images using the prediction_pattern.npy file.

Don't modify or delete any image in the scraped image folder before this step. If you do, the predictions will no longer line up with the files, because the order of the image paths changes. For example, if you delete the first image, the second image is then matched with the first image's prediction, and so on. To be safe, you can create a separate array in the HDF5 file to store the name and path of each file, so that you can verify that the two image names match before deleting.
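
Here is a minimal sketch of that safety idea. Instead of an extra array inside the HDF5 file, it writes a small CSV manifest that ties every path to its score, which is a simpler stand-in that I chose for illustration. The function names and the manifest layout are my own assumptions, not part of the original code.

```python
import csv

def write_manifest(paths, scores, manifest_path):
    """Save one (path, score) row per image so predictions never depend on folder order."""
    if len(paths) != len(scores):
        raise ValueError("paths and scores must have the same length")
    with open(manifest_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "score"])
        for path, score in zip(paths, scores):
            writer.writerow([path, float(score)])

def read_manifest(manifest_path):
    """Load the manifest back as a list of (path, score) tuples."""
    with open(manifest_path, newline="", encoding="utf-8") as f:
        return [(row["path"], float(row["score"])) for row in csv.DictReader(f)]
```

The manifest should be written at the moment the dataset is built, from the same path list that was used to fill the HDF5 file. Later the deletion step can loop over `read_manifest(...)` instead of listing the folder again.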

Load the prediction_pattern.npy file.

prediction_pattern = np.load('D:/aidelete/pattern/prediction_pattern.npy')

Then load the paths of the mixed folder (pattern and non-pattern images).

path = list(glob.iglob(u'D:\\weddingcardsai\\pattern_scraped\\all\\*.*'))

Now delete the non-pattern images using the prediction file.

from tqdm import tqdm_notebook as tqdm
count = 0
for i in tqdm(range(len(path))):
    # a score below the threshold means the model thinks it is not a pattern
    if prediction_pattern[i][-1] < 0.6:  # threshold
        try:
            os.remove(u'\\\\?\\' + path[i])
            count += 1
        except OSError:
            # skip files that are already gone or cannot be removed
            pass

I recommend taking a backup first. Then pick a threshold, run the code, and check the folder to see whether any pattern images were deleted. After that, change the threshold step by step, running the code and taking a backup each time, and check whether all the non-pattern images are gone. If a few of them are not deleted by the model, I recommend deleting them manually. If the model also deletes pattern images, adjust the threshold (the code deletes images whose score is below the threshold, so a lower threshold deletes fewer images) or build a better training dataset and retrain the model. The training dataset is very important for avoiding wrong predictions. Also check for overfitting using the validation dataset, and if overfitting occurs, use dropout to reduce it.
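
One way to make this trial-and-error safer is to count what each threshold would remove before touching anything, and to move files into a separate folder instead of deleting them. The sketch below shows both. The function names and the quarantine folder are my own assumptions, and the move is keyed on file names, so it assumes all images come from a single folder.

```python
import os
import shutil

def count_below(scores, thresholds):
    """For each threshold, count how many scores fall below it (a dry run)."""
    return {t: sum(1 for s in scores if s < t) for t in thresholds}

def quarantine(paths, scores, threshold, quarantine_dir):
    """Move images scoring below the threshold into quarantine_dir instead of deleting."""
    os.makedirs(quarantine_dir, exist_ok=True)
    moved = []
    for path, score in zip(paths, scores):
        if score < threshold and os.path.isfile(path):
            target = os.path.join(quarantine_dir, os.path.basename(path))
            shutil.move(path, target)
            moved.append(target)
    return moved

# dry run on made-up scores
print(count_below([0.1, 0.5, 0.9], [0.3, 0.6]))
```

If the quarantine folder turns out to contain good pattern images, they can simply be moved back, and the folder can be deleted once you are happy with the result.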

Due to company policy I can't show you the dataset, so I can't show how I selected the training and test datasets.

For monograms the whole process is the same, but I added extra predictions. For patterns, the model only predicts whether the image is a pattern or not (0 or 1). For monograms, the model predicts the letters as well as whether the image is a monogram or not. For example:

Monogram
Not a Monogram

A monogram is a combination of two letters: the initial letters of two people's names.

If an image contains more than two letters, it is not a monogram.

Why am I forcing the model to predict the letters? Because the model should learn the pattern of each letter. That way it can handle complex monograms, such as heavily stylized letters, and it also knows how many letters are present in the image. If more than two letters are present, the model predicts that the image is not a monogram.

In my case I trained the model to treat one to three letters as a monogram and more than three letters as not a monogram, even though strictly a monogram is a two-letter design.
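
The actual label format is in the linked monogram code, so the following is only a guess at how such a target could be encoded: 26 letter flags followed by one final monogram flag. Everything here is hypothetical, including the function name, the 27-value layout, and the `max_letters` parameter. Putting the monogram flag last would at least match the `prediction[i][-1]` lookup used in the deletion code.

```python
import string

LETTERS = string.ascii_uppercase  # 'A' to 'Z'

def encode_monogram_target(letters, max_letters=3):
    """Build a 27-value target: one flag per letter, plus a final monogram flag."""
    # keep only the characters that are plain A-Z letters
    found = [ch.upper() for ch in letters if ch.upper() in LETTERS]
    target = [0] * (len(LETTERS) + 1)
    for ch in found:
        target[LETTERS.index(ch)] = 1
    # hypothetical rule from the text: one to max_letters letters counts as a monogram
    target[-1] = 1 if 1 <= len(found) <= max_letters else 0
    return target

print(encode_monogram_target("AB"))    # flags for A and B, monogram flag set
print(encode_monogram_target("ABCD"))  # four letter flags, monogram flag not set
```

A target like this needs a multi-label output layer, one sigmoid unit per value, instead of the single output unit used for patterns.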

Here is the code link for monogram

Steps:

  • First, create the training and validation datasets.
  • Then create the HDF5 file for the original dataset.
  • Train the model with the training and validation datasets.
  • Test the trained model. If it is not predicting well, change the training dataset and retrain.
  • Load the trained model and the original dataset, predict on the original dataset, and store the predicted values as a NumPy file.
  • Delete the images using the NumPy file.

Thanks for reading my blog. I know it is not explained very clearly, but I will try to improve it.