3DDFA — 3D Face Alignment in Full Pose Range

Original article was published by Mikhail Raevskiy on Artificial Intelligence on Medium


The model is implemented in PyTorch and is available in an open repository on GitHub. The repository contains the project code, pre-trained MobileNet-V1 networks, and a preprocessed dataset for training and testing. At inference time, 3DDFA processes an image in 0.27 milliseconds on a GeForce GTX TITAN X.

Figure: Three modes of 3D face alignment in the 3DDFA approach

3DDFA — Network architecture

3DDFA combines cascaded regression with convolutional networks: a CNN serves as the regressor at each stage of the cascade, refining the current parameter estimate step by step. The framework consists of four components: the regression objective, the image features, the convolutional network structure, and the loss function used to train the model.
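The cascaded-regression idea can be illustrated with a minimal sketch in plain Python. This is not the repository's code: `cascaded_regression`, `toy_features`, and `half_step` are hypothetical names, and the toy regressor stands in for a trained CNN. The only thing the sketch shows is the update rule, where each stage looks at features computed from the current estimate and predicts a correction to it.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cascaded_regression(
    image_features: Callable[[Vector], Vector],
    regressors: Sequence[Callable[[Vector], Vector]],
    p0: Vector,
) -> List[float]:
    """Apply p_{k+1} = p_k + regressor_k(features(p_k)) once per stage."""
    p = list(p0)
    for regress in regressors:
        feats = image_features(p)   # features depend on the current estimate
        delta = regress(feats)      # this stage's predicted parameter update
        p = [pi + di for pi, di in zip(p, delta)]
    return p


# Toy stand-ins (assumptions for illustration only): the "image" hides a
# target parameter vector, the feature is the residual to that target, and
# every stage recovers half of the remaining residual.
target = [1.0, -2.0, 0.5]


def toy_features(p: Vector) -> List[float]:
    return [t - pi for t, pi in zip(target, p)]


def half_step(feats: Vector) -> List[float]:
    return [0.5 * f for f in feats]


estimate = cascaded_regression(toy_features, [half_step] * 4, [0.0, 0.0, 0.0])
print(estimate)
```

With four stages the remaining error in this toy setup shrinks to one sixteenth of the initial error, which is the behaviour a cascade is meant to produce: no single stage has to solve the whole problem.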

The neural network works in two streams:

  • In the first stream, the current intermediate parameter estimate is used to construct the Projected Normalized Coordinate Code (PNCC), which is stacked with the input image and fed to the CNN;
  • In the second stream, the model takes feature anchors with consistent semantics and applies Pose Adaptive Convolution (PAC) to them.

The outputs of the two streams are merged by an additional fully connected layer, which predicts the update to the intermediate parameters.
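The fusion step can be sketched in a few lines of plain Python, assuming each stream has already been reduced to a flat feature vector. The function name `fuse_streams`, the feature sizes, and the weights are all hypothetical; in a PyTorch model the same step would typically be a `torch.cat` followed by an `nn.Linear` layer with learned weights.

```python
from typing import List, Sequence


def fuse_streams(
    pncc_feat: Sequence[float],
    pac_feat: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Concatenate the two stream outputs and apply one fully connected layer."""
    merged = list(pncc_feat) + list(pac_feat)
    if len(weights) != len(bias) or any(len(row) != len(merged) for row in weights):
        raise ValueError("weight/bias shapes do not match the merged feature size")
    # One output per row of the weight matrix: dot(row, merged) + bias.
    return [sum(w * x for w, x in zip(row, merged)) + b
            for row, b in zip(weights, bias)]


# Hypothetical toy sizes: 2 features per stream, 3 parameters to update.
pncc_feat = [1.0, 2.0]   # stand-in for the PNCC stream output
pac_feat = [3.0, 4.0]    # stand-in for the PAC stream output
weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
bias = [0.0, -1.0, 0.0]

delta_p = fuse_streams(pncc_feat, pac_feat, weights, bias)   # predicted update
p_current = [0.0, 0.0, 0.0]
p_next = [p + d for p, d in zip(p_current, delta_p)]         # refined estimate
print(delta_p, p_next)
```

Concatenating before the fully connected layer lets every output parameter draw on both streams at once, rather than averaging two independent predictions.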