Predicting apparent age and gender from a picture is an interesting technical problem, and it can also be very useful in practice, for example to better understand consumer segments or a user base. The inferred age or gender of a user can then be used to build personalized products and experiences.
In this post we will train a model to predict both attributes from a face picture.
We use the Adience dataset (https://www.openu.ac.il/home/hassner/Adience/data.html), a collection of in-the-wild face photos labeled with 8 age groups (0–2, 4–6, 8–13, 15–20, 25–32, 38–43, 48–53, 60+) and 2 gender classes.
There are around 26,580 images (some with missing labels), pre-split into 5 folds.
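Since the age groups are discrete, the problem can be framed as flat classification over 8 age classes plus a binary gender class. A minimal sketch of the label encoding; the exact label strings and the dict names are illustrative assumptions, not taken from the repo:

```python
# Map each of the 8 Adience age groups to a class index (0-7).
# The string format of the labels is an assumption for illustration.
AGE_GROUPS = [
    "(0, 2)", "(4, 6)", "(8, 13)", "(15, 20)",
    "(25, 32)", "(38, 43)", "(48, 53)", "(60, 100)",
]
AGE_TO_INDEX = {group: i for i, group in enumerate(AGE_GROUPS)}

# Gender is a binary target.
GENDER_TO_INDEX = {"f": 0, "m": 1}
```

Images with missing labels for one attribute can still be used to train the head for the other attribute.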
Existing results:
Because this dataset is commonly used as a benchmark for these tasks in research papers, several prior accuracy results for apparent age and gender prediction are available:
Prior result 1 : Gender 76.1±0.9, Age 45.1±2.6
Prior result 2 : Gender 86.8±1.4, Age 50.7±5.1
Prior result 3 (Sighthound) : Gender 91, Age 61.3±3.7
Faces are cropped and aligned using this tool: https://www.openu.ac.il/home/hassner/Adience/code.html#inplanealign
Data augmentation:
We apply random shifts, zooms, and horizontal flips as data augmentation, creating synthetic examples during training to improve the generalization of the model.
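A minimal sketch of these three augmentations in plain NumPy; the actual pipeline likely uses a library implementation (e.g. Keras' `ImageDataGenerator`), and the function names and parameters here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_flip(img):
    # Horizontal flip with probability 0.5.
    return img[:, ::-1] if rng.random() < 0.5 else img

def random_shift(img, max_frac=0.1):
    # Wrap-around shift by up to max_frac of each dimension
    # (real pipelines usually pad with zeros or edge pixels instead).
    h, w = img.shape[:2]
    dy = int(rng.integers(-int(h * max_frac), int(h * max_frac) + 1))
    dx = int(rng.integers(-int(w * max_frac), int(w * max_frac) + 1))
    return np.roll(img, (dy, dx), axis=(0, 1))

def random_zoom(img, max_frac=0.1):
    # Zoom in: crop the centre, then resize back (nearest neighbour).
    h, w = img.shape[:2]
    f = 1.0 - rng.random() * max_frac
    ch, cw = max(1, int(h * f)), max(1, int(w * f))
    y0, x0 = (h - ch) // 2, (w - cw) // 2
    crop = img[y0:y0 + ch, x0:x0 + cw]
    rows = np.arange(h) * ch // h
    cols = np.arange(w) * cw // w
    return crop[rows][:, cols]

def augment(img):
    # Apply all three augmentations; each call gives a new synthetic sample.
    return random_zoom(random_shift(random_flip(img)))
```

All three transforms preserve the image shape, so augmented samples can be fed to the network unchanged.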
We use a ResNet architecture pre-trained on ImageNet:
ResNet (Deep Residual Networks) is an architecture that mitigates the optimization and degradation problems that appear when training very deep neural networks, by adding identity skip connections around blocks of layers.
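The key idea is the skip connection: each block learns a residual F(x) on top of an identity path, so the block can trivially represent the identity and gradients flow directly to earlier layers. A toy NumPy illustration of one block (not the actual architecture used):

```python
import numpy as np

def residual_block(x, weight):
    # The block learns a residual F(x), here simplified to linear + ReLU.
    fx = np.maximum(0.0, x @ weight)
    # The identity skip connection adds the input back; with weight = 0
    # the block is exactly the identity, so stacking more blocks does
    # not make the network harder to optimize.
    return x + fx
```

In the real network, F(x) is a stack of convolutions with batch normalization, but the add-the-input structure is the same.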
We train the model 3 times for each fold, average the predictions, and get the following results:
Gender : 89.4±1.4
Age : 57.1±5.3
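The per-fold ensembling above (three runs, averaged predictions) amounts to averaging the class probabilities of the runs before taking the argmax. A minimal sketch with made-up numbers:

```python
import numpy as np

# Hypothetical class probabilities from 3 independent training runs
# on the same test fold: shape (n_runs, n_samples, n_classes).
run_preds = np.array([
    [[0.6, 0.4], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.5, 0.5], [0.3, 0.7]],
])

avg_probs = run_preds.mean(axis=0)       # bagging: average the 3 runs
final_labels = avg_probs.argmax(axis=1)  # predicted class per sample
# final_labels -> array([0, 1])
```

Averaging reduces the variance of any single run, which is one reason the ensembled results beat most single-model baselines.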
Those results are better than the first two prior results, probably because of the bagging and the pretrained weights, but worse than the third, probably because Sighthound Inc. used a bigger, internal face dataset for pretraining.
Code to reproduce the results can be found at: https://github.com/CVxTz/face_age_gender
Source: Deep Learning on Medium