A simple trick for multi-label image classification with ImageDataGenerator in Keras

Source: Deep Learning on Medium


ImageDataGenerator is a great tool for augmenting images and generating batches to feed into a network. However, as you might have noticed, ImageDataGenerator is limited to single-label classification: each image can belong to only one class. Many people have devised workarounds for this problem, but to me most of them seem a bit complicated, until I discovered this Stack Overflow reply. I modified the answer a bit to make it simpler.

TL;DR

import numpy as np
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator
from sklearn.preprocessing import MultiLabelBinarizer

def multilabel_flow_from_dataframe(data_generator, mlb):
    # Swap the row indices yielded by Keras for multi-hot label vectors.
    # Relies on the module-level DataFrame df defined below.
    for x, y in data_generator:
        indices = y.astype(int).tolist()  # plain int: np.int was removed in NumPy 1.24
        y_multi = mlb.transform(df.iloc[indices]['tags'].values.tolist())
        yield x, y_multi

df = pd.read_csv(...)
image_data_generator = ImageDataGenerator(...)
generator = image_data_generator.flow_from_dataframe(...)
mlb = MultiLabelBinarizer()
mlb.fit(df['tags'].values.tolist())
multilabel_generator = multilabel_flow_from_dataframe(generator, mlb)
model.fit_generator(
    multilabel_generator,
    ...
)

Step 1. Generate a DataFrame that describes your dataset

Basically, this DataFrame, called df, has 3 attributes: filename, tags and index. filename is the image path, tags holds the target labels for that sample, and index is simply the sample's row number, which acts as a helper for retrieving the corresponding row later on. Here is a sample dataset:

Sample result of CK+ dataset.
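
As a rough illustration of the layout described above, here is a minimal sketch that builds such a DataFrame from CSV text. The file names, the tag values and the ";" separator inside the tags column are all assumptions made up for this example; your own data may be stored differently.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would come from a file on disk.
csv_text = """filename,tags
img_000.png,happy
img_001.png,sad;surprised
img_002.png,happy;surprised
"""

df = pd.read_csv(io.StringIO(csv_text))

# Tags are read as plain strings, so split them into lists of labels.
# The ";" separator is an assumption of this example.
df["tags"] = df["tags"].str.split(";")

# Row position of each sample, later handed to Keras as y_col.
df["index"] = range(len(df))

print(df)
```

The tags column ends up with object dtype (each cell is a Python list), which is why it cannot be given to Keras directly and the numeric index column is used instead.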

Step 2. Create an ImageDataGenerator with flow_from_dataframe

Then we create an ordinary ImageDataGenerator, image_data_generator, and call flow_from_dataframe on it so that it reads the image information from the DataFrame. The trick is to pass filename as x_col and index as y_col, so that each batch tells us which rows of df it came from and we can retrieve the tags later on. The reason tags is not used directly as y_col is simply that Keras does not allow reading an object-typed column from a DataFrame.

from keras.preprocessing.image import ImageDataGenerator

image_data_generator = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=.1
)
generator = image_data_generator.flow_from_dataframe(
    dataframe=df,
    directory='/path/to/your/image/dir',
    x_col='filename',
    y_col='index',
    class_mode='other',  # return y_col values as-is (named 'raw' in newer Keras releases)
    color_mode="grayscale",
    target_size=(224, 224),
    batch_size=2  # to make this tutorial simple
)

If we run generator.next(), we get a batch like this:

(
    array([...], dtype=float32),  # X_batch
    array([282., 21.])            # y_batch: the DataFrame indices of the samples in X_batch
)

[282., 21.] are the indices of the two samples in the DataFrame. Because the index column matches each row's position, we can simply retrieve the tags with df.iloc[[282, 21]]['tags'].
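
The lookup can be tried on its own, without Keras. The sketch below assumes a tiny three-row DataFrame with made-up tags and a hand-written y_batch standing in for what the generator returns; note that the indices arrive as floats and have to be cast to integers first.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row DataFrame standing in for df.
df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png"],
    "tags": [["happy"], ["sad", "surprised"], ["happy", "surprised"]],
})
df["index"] = range(len(df))

# Stand-in for the y_batch produced by the generator (floats, not ints).
y_batch = np.array([2.0, 0.0])

# Cast back to integers before using them as row positions.
positions = y_batch.astype(int).tolist()
batch_tags = df.iloc[positions]["tags"].tolist()

print(batch_tags)  # [['happy', 'surprised'], ['happy']]
```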

Step 3. Multi-Label Binarizer

If we can successfully retrieve the tags, what is left is to apply a multi-label binarizer to it. Here is how I do so:

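import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

# Fit a MultiLabelBinarizer on every tag list in the dataset
mlb = MultiLabelBinarizer()
mlb.fit(df['tags'].values.tolist())

def multilabel_flow_from_dataframe(data_generator, mlb):
    # Wrap the Keras generator: replace each batch of row indices
    # with the multi-hot encoding of the matching tags in df.
    for x, y in data_generator:
        indices = y.astype(int).tolist()  # plain int: np.int was removed in NumPy 1.24
        y_multi = mlb.transform(df.iloc[indices]['tags'].values.tolist())
        yield x, y_multi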
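
To see what the wrapper actually yields, it can be exercised without Keras or any images. The sketch below is an illustration under assumptions, not the code from this post: multilabel_wrap is a hypothetical variant that takes the DataFrame as an argument instead of reading a global df, fake_index_generator is a made-up stand-in for the flow_from_dataframe generator, and the tags are invented.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer


def multilabel_wrap(data_generator, frame, mlb):
    """Hypothetical variant of the wrapper that receives the DataFrame explicitly."""
    for x, y in data_generator:
        positions = y.astype(int).tolist()
        yield x, mlb.transform(frame.iloc[positions]["tags"].tolist())


def fake_index_generator():
    """Stand-in for flow_from_dataframe: yields (images, row indices)."""
    yield np.zeros((2, 224, 224, 1), dtype="float32"), np.array([1.0, 0.0])


# Made-up tags for two samples.
frame = pd.DataFrame({"tags": [["happy"], ["sad", "surprised"]]})
mlb = MultiLabelBinarizer().fit(frame["tags"].tolist())

x_batch, y_batch = next(multilabel_wrap(fake_index_generator(), frame, mlb))

print(mlb.classes_)  # ['happy' 'sad' 'surprised']
print(y_batch)       # [[0 1 1]
                     #  [1 0 0]]
```

Each row of y_batch has one column per known tag, with a 1 wherever the sample carries that tag. Passing the DataFrame in as an argument is only a design preference; it makes the wrapper usable with separate training and validation frames.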

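When training the model, we wrap the generator from Step 2 and pass the result to fit_generator (in recent Keras releases, model.fit accepts generators directly and fit_generator is deprecated):

multilabel_generator = multilabel_flow_from_dataframe(generator, mlb)
model.fit_generator(
    multilabel_generator,
    ...  # remaining arguments; a plain Python generator has no length, so steps_per_epoch is needed
)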