Predict Artist From Art Using Deep Learning

Source: Deep Learning on Medium

A reliable way to identify the artist of a painting is useful not only for labeling art pieces but also for detecting forgeries, another long-standing art-historical problem. People have tackled this before with different neural network techniques in research papers and Kaggle competitions, so I decided to apply a state-of-the-art architecture, the ResNet50 network, to this image classification task. Let's start with the actual implementation for this case study.

The Task for Deep Learning

I came across this problem while reading a research paper, and it motivated me to work on it because I sketch and paint a lot. The task for deep learning is to predict the artist from a given artwork. Computers do not understand text or images directly; we have to convert them into numeric form, ultimately vectors or matrices of numbers that can be reduced to binary 0s and 1s. So how can an image be converted into a vector form that a computer can interpret? The figure below shows an image converted into a matrix of pixel values.

Figure: an image represented as a matrix of pixel values.
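
As a quick, hypothetical illustration (the file name here is just a placeholder), Pillow and NumPy make that conversion explicit:

from PIL import Image
import numpy as np

# Load any image (placeholder path), convert to grayscale,
# and turn it into a 2-D array of pixel intensities (0-255).
img = Image.open("example_painting.jpg").convert("L")
pixels = np.array(img)

print(pixels.shape)    # (height, width)
print(pixels[:3, :3])  # top-left 3x3 corner of the pixel matrix

# Normalizing to [0, 1] is what rescale=1./255 does later in ImageDataGenerator.
normalized = pixels / 255.0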

Import Libraries

By now we know what the problem of this case study is, so it is time to import the Python libraries needed to start the analysis. A big advantage of Python is that it offers tons of libraries for data analysis.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import json
import os
from tqdm import tqdm, tqdm_notebook
import random

import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import *
from tensorflow.keras.applications import *
from tensorflow.keras.callbacks import *
from tensorflow.keras.initializers import *
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from numpy.random import seed
seed(1)  # fix NumPy's seed so the random operations are reproducible
from tensorflow import set_random_seed  # TensorFlow 1.x; in TF 2.x use tf.random.set_seed(1)
set_random_seed(1)

EDA (Exploratory Data Analysis)

We are done importing libraries; the next task is exploratory data analysis (EDA), where we take the data and apply a variety of statistical summaries to it. The purpose of EDA is to find meaning or patterns in the data; it gives us a deeper view of the data so we can understand the problem. Frankly, EDA is something of an art in the AI community: the better you do it, the better you understand your data.

print(os.listdir("../input"))

input is the main folder and the rest of the files live inside it; this line simply lists its contents.

With the help of a pandas DataFrame we read artists.csv:

artists = pd.read_csv('../input/artists.csv')
artists.shape
# we have 50 rows and 8 columns in total
output: (50, 8)

Now that we have read the CSV file, let's do some simple analysis to understand it.

artists.info() #info() gives information about file
# from here we can see our features
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 8 columns):
id 50 non-null int64
name 50 non-null object
years 50 non-null object
genre 50 non-null object
nationality 50 non-null object
bio 50 non-null object
wikipedia 50 non-null object
paintings 50 non-null int64
dtypes: int64(2), object(6)
memory usage: 3.2+ KB

We can see the file contains 50 rows and 8 columns (id, name, years, etc.).

Data Preprocess

Data preprocessing is the task of manipulating the data according to our solution strategy, guiding it from raw data to meaningful data. Here I first sorted the artists by the paintings feature so the data is ordered by painting count, and then kept only artists with 200 or more paintings as the sample for this analysis.

#https://en.wikipedia.org/wiki/Digital_image_processing
# Data preprocessing: sort artists by number of paintings
artists = artists.sort_values(by=['paintings'], ascending=False)

# Keep only artists with 200 or more paintings
artists_top = artists[artists['paintings'] >= 200].reset_index()
artists_top = artists_top[['name', 'paintings']]

# Balanced class weight: total paintings / (number of classes * paintings per class)
#artists_top['class_weight'] = max(artists_top.paintings)/artists_top.paintings
artists_top['class_weight'] = artists_top.paintings.sum() / (artists_top.shape[0] * artists_top.paintings)
artists_top

Feature Engineering

I did one piece of feature engineering: I added a feature called 'class_weight', which puts a weight on each artist's paintings. Why do we want weights? The dataset is imbalanced, so some artists have many more paintings than others; a class weight tells the model how much importance to give each class so that artists with fewer paintings are not ignored.

artists_top['class_weight'] = artists_top.paintings.sum() / (artists_top.shape[0] * artists_top.paintings)

After adding the weights, here is the result.

We added the new feature class_weight, and the table shows the weight for each row.
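
One detail worth making explicit: the fit_generator calls later in this post pass a class_weights variable that is never defined in the snippets shown. Here is a minimal sketch of how it could be built from the class_weight column, assuming Keras expects a mapping from class index to weight; the indices line up because the generators below are given classes=artists_top_name.tolist(), which follows the same row order as artists_top.

# Assumed helper (not shown in the original post): map each class index
# (0 .. n_classes-1, in artists_top order) to its class_weight value.
class_weights = artists_top['class_weight'].to_dict()
print(class_weights)  # e.g. {0: 0.42, 1: 0.97, ...} -- illustrative values only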

Getting top artist paintings

In this code snippet we fetch the top artists' painting folders and check whether a directory exists for each artist name.

# Explore images of top artists
images_dir = '../input/images/images'  # directory holding the image folders; change it to match your setup
artists_dirs = os.listdir(images_dir)
artists_top_name = artists_top['name'].str.replace(' ', '_').values

# See if all directories exist
for name in artists_top_name:
    if os.path.exists(os.path.join(images_dir, name)):
        print("Found -->", os.path.join(images_dir, name))
    else:
        print("Did not find -->", os.path.join(images_dir, name))

Showing Random Paintings

So far we have preprocessed the data, engineered a feature and manipulated the dataset; now let's show some random paintings to get a clearer picture of what we are working with.

#https://www.analyticsvidhya.com/blog/2019/08/3-techniques-extract-features-from-image-data-machine-learning-python/
# Print a few random paintings
n = 5  # show 5 random pictures
fig, axes = plt.subplots(1, n, figsize=(20,10))

for i in range(n):
    random_artist = random.choice(artists_top_name)
    random_image = random.choice(os.listdir(os.path.join(images_dir, random_artist)))
    random_image_file = os.path.join(images_dir, random_artist, random_image)
    image = plt.imread(random_image_file)
    axes[i].imshow(image)
    axes[i].set_title("Artist: " + random_artist.replace('_', ' '))
    axes[i].axis('off')

plt.show()

With the help of the Python library matplotlib we can show images and plot graphs. Here we take at most 5 images (n = 5); inside subplots, figsize=(20,10) sets the width and height of the figure (you can choose your own), and plt.imread() reads an image file into an array.

Five random images with the artist name, as designed in the code.

Data Augmentation

Data augmentation is a strategy that lets practitioners significantly increase the diversity of the data available for training without actually collecting new data, using techniques like padding, cropping, shifting and flipping. Here I apply augmentation to the images to effectively enlarge the dataset, for example by flipping them horizontally and vertically.

A simple Augmentation example

Figure — Left: a sample of 250 data points that follow a normal distribution exactly. Right: the same distribution with a small amount of random "jitter" added. This type of data augmentation increases the generalizability of our networks.
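
To make that idea concrete, here is a minimal NumPy sketch of the jitter example (my own toy illustration, not the code behind the original figure):

import numpy as np
import matplotlib.pyplot as plt

# 250 points drawn from a standard normal distribution
np.random.seed(1)
clean = np.random.normal(loc=0.0, scale=1.0, size=250)

# "Augmented" copy: add a small amount of random jitter to every point
jittered = clean + np.random.normal(loc=0.0, scale=0.1, size=250)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].hist(clean, bins=25)
axes[0].set_title("Original sample")
axes[1].hist(jittered, bins=25)
axes[1].set_title("With random jitter")
plt.show()
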
#https://www.kaggle.com/supratimhaldar/
# Augment data
batch_size = 16
train_input_shape = (224, 224, 3)
n_classes = artists_top.shape[0]

train_datagen = ImageDataGenerator(validation_split=0.2,
                                   rescale=1./255.,
                                   #rotation_range=45,
                                   #width_shift_range=0.5,
                                   #height_shift_range=0.5,
                                   shear_range=5,
                                   #zoom_range=0.7,
                                   horizontal_flip=True,
                                   vertical_flip=True,
                                   )

train_generator = train_datagen.flow_from_directory(directory=images_dir,
                                                    class_mode='categorical',
                                                    target_size=train_input_shape[0:2],
                                                    batch_size=batch_size,
                                                    subset="training",
                                                    shuffle=True,
                                                    classes=artists_top_name.tolist()
                                                    )

valid_generator = train_datagen.flow_from_directory(directory=images_dir,
                                                    class_mode='categorical',
                                                    target_size=train_input_shape[0:2],
                                                    batch_size=batch_size,
                                                    subset="validation",
                                                    shuffle=True,
                                                    classes=artists_top_name.tolist()
                                                    )

STEP_SIZE_TRAIN = train_generator.n//train_generator.batch_size
STEP_SIZE_VALID = valid_generator.n//valid_generator.batch_size
print("Total number of batches =", STEP_SIZE_TRAIN, "and", STEP_SIZE_VALID)

Here I used batch_size = 16; increasing the batch size gives a less noisy gradient estimate, but each step then has to process (and hold in memory) a larger set of images. ImageDataGenerator accepts the original data, randomly transforms it, and returns only the new, transformed data.
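
A quick way to sanity-check what the generator yields (a small sketch, assuming the generators above have already been created):

# Pull one batch from the training generator and inspect its shapes.
X_batch, y_batch = next(train_generator)
print(X_batch.shape)  # (16, 224, 224, 3): a batch of rescaled RGB images
print(y_batch.shape)  # (16, n_classes): one-hot labels, one column per artist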

Printing a random painting with its augmented version

#https://www.kaggle.com/supratimhaldar/
# Print a random painting and its randomly augmented version
fig, axes = plt.subplots(1, 2, figsize=(20,10))

random_artist = random.choice(artists_top_name)
random_image = random.choice(os.listdir(os.path.join(images_dir, random_artist)))
random_image_file = os.path.join(images_dir, random_artist, random_image)

# Original image
image = plt.imread(random_image_file)
axes[0].imshow(image)
axes[0].set_title("An original Image of " + random_artist.replace('_', ' '))
axes[0].axis('off')

# Transformed image
aug_image = train_datagen.random_transform(image)
axes[1].imshow(aug_image)
axes[1].set_title("A transformed Image of " + random_artist.replace('_', ' '))
axes[1].axis('off')

plt.show()
Here we see the two images: the original and its augmented version.

Build Model

Now it is time to build the model and train it on the data. I have ended my exploratory data analysis here, but there are always more techniques to try; if you have new ideas for EDA or feature engineering, by all means implement them. In this part we build the model that will train on our data. As mentioned earlier, I will use a state-of-the-art architecture, the ResNet50 model. I could use a plain CNN (convolutional neural network), but the research I read suggests ResNet50 does a tremendous job on image data, so let's begin this section.

Neural network architecture

Fig. (a) shows how a neural network works: the circle is a neuron that computes the weighted sum of its inputs, sum(x_i * w_i) for i = 1..n, where w_1, w_2, ..., w_n are the weights, and then applies an activation function f to produce the output y = f(sum(x_i * w_i)). A combination of many such neurons is what we call a neural network. In our case we use the ReLU activation.
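
As a tiny illustration of a single neuron with toy numbers (my own example, not from the original figure):

import numpy as np

# One artificial neuron: weighted sum of the inputs followed by a ReLU.
x = np.array([0.5, -1.2, 3.0])   # inputs x1..x3
w = np.array([0.8,  0.1, -0.4])  # weights w1..w3
b = 0.2                          # bias

z = np.dot(w, x) + b             # z = sum(w_i * x_i) + b
y = max(0.0, z)                  # ReLU activation: f(z) = max(0, z)
print(z, y)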

#https://www.quora.com/ResNet50-tutorial
# Load the pre-trained model
# ResNet50 performs well on image data
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=train_input_shape)

# Make all layers of the base model trainable
for layer in base_model.layers:
    layer.trainable = True

In this snippet I load a ResNet50 model with weights pre-trained on the ImageNet dataset. ImageNet is an image database organized according to the WordNet hierarchy, freely available to researchers and data scientists for research purposes.

ResNet Model Implementation

Let's first understand the architecture. ResNet50 is built from residual (identity) blocks whose whole purpose is the skip connection: the input x bypasses the weight layers of the block, which helps reduce the vanishing gradient problem. In the figure below, the weight layers compute f(x) with a ReLU activation in between, and the skip connection adds x back at the end, so the block outputs f(x) + x.

Figure: a residual block, f(x) + x with a skip connection (source: Stack Overflow).
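
To see what f(x) + x looks like in code, here is a minimal sketch of a residual block written with the Keras functional API (a simplified illustration, not the exact block ResNet50 uses internally):

from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, Add, Input
from tensorflow.keras.models import Model

def residual_block(x, filters=64):
    shortcut = x                                    # the skip connection keeps a copy of x
    y = Conv2D(filters, (3, 3), padding='same')(x)  # first weight layer
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv2D(filters, (3, 3), padding='same')(y)  # second weight layer: this is f(x)
    y = BatchNormalization()(y)
    y = Add()([y, shortcut])                        # f(x) + x
    return Activation('relu')(y)

inputs = Input(shape=(224, 224, 64))
outputs = residual_block(inputs)
Model(inputs, outputs).summary()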

The vanishing gradient problem tends to occur with saturating activation functions like sigmoid or tanh: their derivatives are small, so during backpropagation the gradient dL/dW reaching the early layers becomes close to zero. The weight update rule is W_new = W_old - eta * dL/dW, where eta is the learning rate and dL/dW is the derivative of the network's loss with respect to the weight; when the gradient vanishes, the update term goes to roughly zero, W_new stays essentially equal to W_old, and the early layers stop learning.

Figure: the weight-update equation (source: Stack Exchange).

The equation above tells us that during backpropagation, the gradients of the early layers (near the input) are obtained by multiplying together the gradients of the later layers (near the output); if those later-layer gradients are less than 1, their product shrinks toward zero very fast.
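
Written out in LaTeX notation (my own notation, consistent with the description above):

W_{\text{new}} = W_{\text{old}} - \eta \frac{\partial L}{\partial W}

\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial a_n} \cdot \prod_{k=2}^{n} \frac{\partial a_k}{\partial a_{k-1}} \cdot \frac{\partial a_1}{\partial w_1}

If each factor in the product is smaller than 1, the product, and hence the gradient reaching the early weight w_1, shrinks toward zero.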

#https://github.com/keras-team/keras-applications/blob/master/keras_applications/
# Add layers at the end
# creating layers for neural network
X = base_model.output
X = Flatten()(X)

X = Dense(512, kernel_initializer='he_uniform')(X) # he_uniform initializes the layer's weights
#X = Dropout(0.5)(X)
X = BatchNormalization()(X)
X = Activation('relu')(X) # activation function i am using is "relu".

X = Dense(16, kernel_initializer='he_uniform')(X)
#X = Dropout(0.5)(X)
X = BatchNormalization()(X)
X = Activation('relu')(X)

output = Dense(n_classes, activation='softmax')(X)

model = Model(inputs=base_model.input, outputs=output)

This adds a classification head on top of the ResNet50 base, which is itself a 50-layer network. Dense() is a fully connected layer, essentially a matrix-vector multiplication whose weights get updated during backpropagation; it is used here to change the dimensionality of the feature vector. Dropout() randomly drops nodes during training and is a cheap way to fight overfitting; since my model did not overfit, I left it commented out. BatchNormalization() normalizes the activations of a layer: without it, the distribution of each layer's inputs keeps shifting as training progresses, a problem called internal covariate shift, so I added it after each dense layer. Just before the output I added a softmax layer, which turns the multi-class scores into probabilities that sum to 1, so the network can pick a class. The activation function in the hidden layers is ReLU because it reduces the vanishing gradient problem. We have now set up the ResNet-based network; let's train the model.
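
A small sketch of what the softmax on the output layer does (toy scores, my own example):

import numpy as np

# Raw class scores (logits) for a single image, one per artist
scores = np.array([2.0, 1.0, 0.1])

# Softmax: exponentiate and normalize so the outputs sum to 1
probs = np.exp(scores) / np.exp(scores).sum()
print(probs)        # [0.659 0.242 0.099]
print(probs.sum())  # 1.0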

Train Model

#For training the neural network I am using the Adam optimizer.
#Plain SGD would be slow here, so Adam performs better.
optimizer = Adam(lr=0.0001)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

Various optimizers exist for training deep neural networks, such as Adam, Adadelta and SGD. I am using Adam because plain SGD is slow to train a deep network. The loss function is multi-class log loss, also called categorical cross-entropy, because this is a multi-class classification problem, and model performance is tracked with the accuracy metric.
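
For reference, the categorical cross-entropy for one example with one-hot label y and predicted probabilities p over C classes is (standard definition, written in LaTeX notation):

L(y, p) = -\sum_{c=1}^{C} y_c \log(p_c)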

n_epoch = 10  # n_epoch: number of complete passes over the training data

early_stop = EarlyStopping(monitor='val_loss', patience=20, verbose=1,
                           mode='auto', restore_best_weights=True)

# ReduceLROnPlateau lowers the learning rate when the monitored metric stops improving
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5,
                              verbose=1, mode='auto')

# Train the model - all layers
history1 = model.fit_generator(generator=train_generator, steps_per_epoch=STEP_SIZE_TRAIN,
                               validation_data=valid_generator, validation_steps=STEP_SIZE_VALID,
                               epochs=n_epoch,
                               shuffle=True,
                               verbose=1,
                               callbacks=[reduce_lr],
                               use_multiprocessing=True,
                               workers=16,
                               class_weight=class_weights
                               )

Here the epoch value is 10, so all layers are trained for 10 epochs; the output is shown below:

Output of the 10 training epochs with accuracy and loss.

Here is the result after training the model for the first time: the accuracy reached by my model is 0.93 (93%) and the loss dropped from 1.65 to 0.45, which means the model is doing well. Now we freeze layers and retrain.

# Freeze core ResNet layers and train again
for layer in model.layers:
    layer.trainable = False

for layer in model.layers[:50]:
    layer.trainable = True

optimizer = Adam(lr=0.0001)

model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

n_epoch = 5
history2 = model.fit_generator(generator=train_generator, steps_per_epoch=STEP_SIZE_TRAIN,
                               validation_data=valid_generator, validation_steps=STEP_SIZE_VALID,
                               epochs=n_epoch,
                               shuffle=True,
                               verbose=1,
                               callbacks=[reduce_lr, early_stop],
                               use_multiprocessing=True,
                               workers=16,
                               class_weight=class_weights
                               )

In this code snippet, as mentioned above, I train again after freezing layers of ResNet50. Freezing means keeping some layers fixed (not trained) while training the others; the change that controls which layers stay trainable is shown below:

for layer in model.layers[:50]:
    layer.trainable = True

Here is the result after freezing those layers and retraining: the model reached 96% accuracy, up from the previous run, and the loss dropped further to 0.39, so the model is indeed doing well. One practical note: ResNet50 is a deep 50-layer network and training it needs substantial resources. When I first ran this on my PC with an Intel Core i5, it stopped with an out-of-memory error, so I suggest a machine with at least 13+ GB of RAM and a GPU, or running it on Google Colab, which provides that much memory; I personally ran everything on Colab. We have now trained the model enough; time to test it by feeding it an image and seeing whether it correctly identifies the artist.

Training graph

# Plot the training graph
def plot_training(history):
    acc = history['acc']
    val_acc = history['val_acc']
    loss = history['loss']
    val_loss = history['val_loss']
    epochs = range(len(acc))

    fig, axes = plt.subplots(1, 2, figsize=(15,5))

    axes[0].plot(epochs, acc, 'r-', label='Training Accuracy')
    axes[0].plot(epochs, val_acc, 'b--', label='Validation Accuracy')
    axes[0].set_title('Training and Validation Accuracy')
    axes[0].legend(loc='best')

    axes[1].plot(epochs, loss, 'r-', label='Training Loss')
    axes[1].plot(epochs, val_loss, 'b--', label='Validation Loss')
    axes[1].set_title('Training and Validation Loss')
    axes[1].legend(loc='best')

    plt.show()

# Pass the history dict of a training run (here the first phase);
# the two phases' histories can also be concatenated key by key before plotting.
plot_training(history1.history)
Graphs of accuracy and loss: accuracy is climbing while the loss keeps falling.

Both graphs show what we want: the accuracy curve is increasing and the loss curve is decreasing.

Evaluate Model

This code snippet builds the confusion matrix and classification report of the model, showing how well its predictions on the validation set match the true artists.

# Classification report and confusion matrix
from sklearn.metrics import *
import seaborn as sns

tick_labels = artists_top_name.tolist()

def showClassficationReport_Generator(model, valid_generator, STEP_SIZE_VALID):
    # Loop on each generator batch and predict
    y_pred, y_true = [], []
    for i in range(STEP_SIZE_VALID):
        (X, y) = next(valid_generator)
        y_pred.append(model.predict(X))
        y_true.append(y)

    # Create a flat list for y_true and y_pred
    y_pred = [subresult for result in y_pred for subresult in result]
    y_true = [subresult for result in y_true for subresult in result]

    # Update Truth vector based on argmax
    y_true = np.argmax(y_true, axis=1)
    y_true = np.asarray(y_true).ravel()

    # Update Prediction vector based on argmax
    y_pred = np.argmax(y_pred, axis=1)
    y_pred = np.asarray(y_pred).ravel()

    # Confusion Matrix, normalized row by row so each actual class sums to 1
    fig, ax = plt.subplots(figsize=(10,10))
    conf_matrix = confusion_matrix(y_true, y_pred, labels=np.arange(n_classes))
    conf_matrix = conf_matrix / np.sum(conf_matrix, axis=1, keepdims=True)
    sns.heatmap(conf_matrix, annot=True, fmt=".2f", square=True, cbar=False,
                cmap=plt.cm.jet, xticklabels=tick_labels, yticklabels=tick_labels, ax=ax)
    ax.set_ylabel('Actual')
    ax.set_xlabel('Predicted')
    ax.set_title('Confusion Matrix')
    plt.show()

    print('Classification Report:')
    print(classification_report(y_true, y_pred, labels=np.arange(n_classes), target_names=artists_top_name.tolist()))

showClassficationReport_Generator(model, valid_generator, STEP_SIZE_VALID)

Above is the performance matrix, which tells us how well the model's predictions match the actual labels. Reading the diagonal class by class, for example the bottom class Marc Chagall, the cell where the actual and the predicted artist are both Marc Chagall shows 0.96 (96%), which means the model predicts that artist very well.

Test Model

Here we take 5 random images and feed them to the model to see whether it predicts the right artist for each one.

# Prediction
from tensorflow.keras.preprocessing import image

n = 5
fig, axes = plt.subplots(1, n, figsize=(25,10))

for i in range(n):
    random_artist = random.choice(artists_top_name)
    random_image = random.choice(os.listdir(os.path.join(images_dir, random_artist)))
    random_image_file = os.path.join(images_dir, random_artist, random_image)

    # Load the original image at the model's input size
    test_image = image.load_img(random_image_file, target_size=(train_input_shape[0:2]))

    # Predict artist
    test_image = image.img_to_array(test_image)
    test_image /= 255.
    test_image = np.expand_dims(test_image, axis=0)

    prediction = model.predict(test_image)
    prediction_probability = np.amax(prediction)
    prediction_idx = np.argmax(prediction)

    labels = train_generator.class_indices
    labels = dict((v, k) for k, v in labels.items())

    #print("Actual artist =", random_artist.replace('_', ' '))
    #print("Predicted artist =", labels[prediction_idx].replace('_', ' '))
    #print("Prediction probability =", prediction_probability*100, "%")

    title = "Actual artist = {}\nPredicted artist = {}\nPrediction probability = {:.2f} %" \
                .format(random_artist.replace('_', ' '), labels[prediction_idx].replace('_', ' '),
                        prediction_probability*100)

    # Show the image with the prediction as its title
    axes[i].imshow(plt.imread(random_image_file))
    axes[i].set_title(title)
    axes[i].axis('off')

plt.show()
Output images with the artist predicted by the model.

So, from the output above we can see that, given 5 random images, our model predicted the right artist with prediction probabilities averaging around 80% and above. This is the end of the blog; we got to know a deep learning model and the techniques used to train it, and I hope you enjoyed it. For more details on the code, please take a look at http://github.com/homejeet for this and other projects I am working on. I look forward to your feedback and suggestions.

References

  1. https://www.pyimagesearch.com/2019/07/08/keras-imagedatagenerator-and-data-augmentation/
  2. https://github.com/mk60991/ImageAI-image-Prediction
  3. https://www.appliedaicourse.com/
  4. http://cs231n.stanford.edu/reports/2017/pdfs/406.pdf
  5. https://www.researchgate.net/
  6. https://www.kaggle.com/

Connect with me on LinkedIn