Original article was published by Mikhail Raevskiy on Artificial Intelligence on Medium
The model implementation is written in PyTorch and is available in an open repository on GitHub. The repository contains the project code, pre-trained MobileNet-V1 networks, and a preprocessed dataset for training and testing. At inference time, 3DDFA processes an image in 0.27 milliseconds on a GeForce GTX TITAN X.
3DDFA — Network architecture
3DDFA combines cascaded regression with convolutional networks: a CNN serves as the regressor at each stage of the cascade. The framework consists of four components: the regression objective, the image features, the convolutional network structure, and the error function used to train the model.
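The cascaded-regression idea can be sketched as follows. This is a minimal toy illustration of a CNN-as-regressor cascade, not the real 3DDFA network: the class `ToyRegressor`, the feature and parameter dimensions, and the three-stage cascade are all hypothetical stand-ins chosen for brevity.

```python
import torch
import torch.nn as nn

PARAM_DIM = 12  # toy stand-in for the real 3DMM parameter vector


class ToyRegressor(nn.Module):
    """One cascade stage: predicts an update to the current parameters."""

    def __init__(self, feat_dim=64, param_dim=PARAM_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + param_dim, 128),
            nn.ReLU(),
            nn.Linear(128, param_dim),
        )

    def forward(self, features, params):
        # The regressor sees both the image features and the current estimate
        return self.net(torch.cat([features, params], dim=1))


def cascade_regression(features, init_params, regressors):
    """Iteratively refine parameters: p_{k+1} = p_k + Net_k(features, p_k)."""
    params = init_params
    for reg in regressors:
        params = params + reg(features, params)
    return params


features = torch.randn(4, 64)        # batch of image features
params0 = torch.zeros(4, PARAM_DIM)  # initial parameter guess
stages = [ToyRegressor() for _ in range(3)]
final_params = cascade_regression(features, params0, stages)
print(final_params.shape)  # torch.Size([4, 12])
```

Each stage only has to correct the residual error left by the previous one, which is what makes the cascade formulation easier to fit than a single monolithic regressor.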
The neural network works in two streams:
- In the first stream, the Projected Normalized Coordinate Code (PNCC) is constructed from the intermediate parameter estimate; the PNCC is stacked with the input image and fed to the CNN;
- In the second stream, the model takes feature anchors with consistent semantics and applies Pose Adaptive Convolution (PAC) to them.
Outputs from the two streams are combined using an additional fully connected layer that predicts the intermediate parameter update.
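The two-stream fusion described above can be sketched in PyTorch. This is an illustrative toy module, not the real 3DDFA architecture: the layer sizes, the pooled second stream standing in for pose-adaptive convolution, and the class name `TwoStreamFusion` are all assumptions made for the sake of a compact example.

```python
import torch
import torch.nn as nn


class TwoStreamFusion(nn.Module):
    """Toy two-stream network: PNCC stream + PAC-like stream, fused by an FC layer."""

    def __init__(self, param_dim=62):  # parameter size is illustrative
        super().__init__()
        # Stream 1: input image (3 ch) stacked channel-wise with PNCC (3 ch)
        self.pncc_stream = nn.Sequential(
            nn.Conv2d(6, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Stream 2: stand-in for features from pose-adaptive convolution
        self.pac_stream = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Fully connected fusion layer predicts the parameter update
        self.fuse = nn.Linear(16 + 16, param_dim)

    def forward(self, image, pncc):
        a = self.pncc_stream(torch.cat([image, pncc], dim=1))
        b = self.pac_stream(image)
        return self.fuse(torch.cat([a, b], dim=1))


model = TwoStreamFusion()
img = torch.randn(2, 3, 120, 120)
pncc = torch.randn(2, 3, 120, 120)
delta = model(img, pncc)  # predicted update to the intermediate parameters
print(delta.shape)  # torch.Size([2, 62])
```

Fusing at the fully connected layer lets each stream specialize (dense correspondence via PNCC, pose-invariant appearance via PAC) before a single head predicts one joint parameter update.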