Predicting Apparent Age and Gender from a Face Picture : Keras + TensorFlow

Predicting apparent age and gender from a picture is an interesting technical problem, and it is also useful in practice, for example to better understand consumer segments or a user base. The inferred age or gender of a user can then be used to personalize products and experiences.

In this post we will train a model to predict those attributes given a face picture.

Data :

We use a dataset of face photos in the wild, labeled into 8 age groups (0–2, 4–6, 8–13, 15–20, 25–32, 38–43, 48–53, 60+) and 2 gender classes.
It contains about 26,580 images (some with missing labels), pre-split into 5 folds.
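
Since the data comes pre-split into 5 folds, a natural evaluation protocol is to hold out one fold at a time and train on the other four. The snippet below is only a minimal sketch of that loop: the fold identifiers and the `cross_validation_splits` helper are hypothetical names used for illustration, not part of the dataset or of the original code.

```python
# Hypothetical fold identifiers; the real dataset ships its own fold files.
FOLDS = [0, 1, 2, 3, 4]


def cross_validation_splits(folds):
    """Yield (train_folds, test_fold) pairs, holding out one fold at a time."""
    for held_out in folds:
        train = [f for f in folds if f != held_out]
        yield train, held_out


for train_folds, test_fold in cross_validation_splits(FOLDS):
    print(f"train on folds {train_folds}, evaluate on fold {test_fold}")
```

Each image is then evaluated exactly once, by a model that never saw its fold during training.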

Existing results :

Since this dataset is commonly used as a benchmark for this type of task in research papers, I was able to find several prior accuracy results for apparent age and gender prediction :

[1]
Gender : 76.1±0.9
Age : 45.1±2.6

[2]
Gender : 86.8±1.4
Age : 50.7±5.1

[3]
Gender : 91
Age : 61.3±3.7

Preprocessing :

Faces are cropped and aligned using this tool :

Data augmentation :

We use random shifts, zoom, and horizontal flips as data augmentation, creating synthetic training examples that improve the generalization of the model.
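
To make the three transformations concrete, here is a minimal NumPy sketch of what each one does to an image. It is an illustration only, not the augmentation pipeline used for the results in this post: the function names, the 10% shift range and the 1.2x maximum zoom are all assumptions chosen for the example.

```python
import numpy as np


def random_horizontal_flip(img, rng):
    """Mirror the image left-right with probability 0.5."""
    return img[:, ::-1] if rng.random() < 0.5 else img


def random_zoom_in(img, max_zoom, rng):
    """Zoom in by cropping a central window and resizing it back (nearest neighbour)."""
    h, w = img.shape[:2]
    zoom = rng.uniform(1.0, max_zoom)
    ch, cw = max(1, int(round(h / zoom))), max(1, int(round(w / zoom)))
    top, left = (h - ch) // 2, (w - cw) // 2
    rows = top + (np.arange(h) * ch // h)  # source row for each output row
    cols = left + (np.arange(w) * cw // w)  # source column for each output column
    return img[rows][:, cols]


def random_shift(img, max_frac, rng):
    """Shift the image by up to max_frac of its size, filling the gap with zeros."""
    h, w = img.shape[:2]
    dy = int(rng.integers(-int(h * max_frac), int(h * max_frac) + 1))
    dx = int(rng.integers(-int(w * max_frac), int(w * max_frac) + 1))
    out = np.zeros_like(img)
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out


def augment(img, rng):
    """Apply flip, zoom and shift in sequence (assumed ranges, for illustration)."""
    img = random_horizontal_flip(img, rng)
    img = random_zoom_in(img, max_zoom=1.2, rng=rng)
    return random_shift(img, max_frac=0.1, rng=rng)


rng = np.random.default_rng(0)
face = rng.random((224, 224, 3))  # stand-in for a cropped, aligned face image
augmented = augment(face, rng)
print(augmented.shape)
```

The output keeps the input shape, so augmented images can be fed to the network exactly like the originals.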

Model :

We use a ResNet architecture pre-trained on ImageNet :

ResNet (Deep Residual Network) is an architecture that uses shortcut connections to reduce the under-fitting and optimization issues that occur in deep neural networks.

Residual Block :
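
The core idea of a residual block is that the layers learn a correction F(x) that is added back to the input, so the output is relu(F(x) + x). The NumPy sketch below illustrates this with two small linear layers; this is a simplification for readability, since real ResNet blocks use convolutions and batch normalization, and the names `residual_block`, `w1` and `w2` are made up for the example.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear layers with a ReLU in between."""
    fx = relu(x @ w1) @ w2  # residual branch F(x)
    return relu(fx + x)     # identity shortcut adds the input back


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))              # batch of 4 feature vectors
w1 = rng.normal(scale=0.1, size=(8, 8))
w2 = np.zeros((8, 8))                    # zeroed residual branch

out = residual_block(x, w1, w2)
# With F(x) = 0 the block reduces to relu(x): the input passes through the shortcut.
print(np.allclose(out, relu(x)))
```

Because the shortcut passes the input through unchanged, a block can fall back to (nearly) the identity when its weights are close to zero, which is what makes very deep stacks easier to optimize.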

Results :

We train the model 3 times for each fold, average the predictions, and get the following results :
Gender : 89.4±1.4
Age : 57.1±5.3
These results are better than [1] and [2], probably because of the bagging and the pre-trained weights, but worse than [3], probably because Sighthound Inc used a larger, internal faces dataset for pretraining.
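
The averaging step across the 3 training runs can be sketched as follows. This assumes that each run outputs class probabilities per image and that the probabilities are averaged before taking the most likely class; the helper name and the numbers are made up for illustration.

```python
import numpy as np


def average_predictions(prob_list):
    """Average class probabilities from several runs and pick the top class."""
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs, mean_probs.argmax(axis=1)


# Made-up probabilities from 3 runs, for 2 images and 2 classes.
run_a = np.array([[0.6, 0.4], [0.3, 0.7]])
run_b = np.array([[0.4, 0.6], [0.2, 0.8]])
run_c = np.array([[0.7, 0.3], [0.4, 0.6]])

mean_probs, labels = average_predictions([run_a, run_b, run_c])
print(mean_probs)
print(labels)
```

In this toy example the second run disagrees with the other two on the first image, and averaging lets the majority win, which is the variance-reduction effect expected from bagging.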

Code to reproduce the results can be found at :

