Position Map Regression Network (PRN) is a method for jointly regressing dense alignment and 3D face shape in an end-to-end manner. In this article, I’ll provide a short explanation and discuss its applications in computer vision.
When I was a child, I imagined (due to movies, of course) that in the future we’d be able to have these crazy holograms, where you could see people talking to you as if they were there. Applications of computer vision like this one suggest we aren’t that far from achieving something similar.
In the last few decades, many important research groups in computer vision have made amazing advances in 3D face reconstruction and face alignment, primarily using CNNs as the de facto network architecture for the task. However, the performance of these methods is restricted by the low-dimensional solution space defined by the face model templates used for mapping.
Position Map Regression Networks (PRN)
In a recent paper, Yao Feng and others proposed an end-to-end method called Position Map Regression Networks (PRN) to jointly predict dense alignment and reconstruct 3D face shape. They claim their method surpasses all previous attempts at both 3D face alignment and reconstruction on multiple datasets.
Specifically, they designed a UV position map, a 2D image that records the 3D coordinates of a complete facial point cloud while maintaining the semantic meaning of each point in UV space. They then train a simple encoder-decoder network with a weighted loss, which focuses more on discriminative regions, to regress the UV position map from a single 2D facial image.
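To make the idea concrete, here is a toy numpy sketch of what a UV position map looks like as a data structure. The resolution, the synthetic surface, and the nose-tip index below are all illustrative stand-ins, not the paper’s actual values:

```python
import numpy as np

# A UV position map is an H x W x 3 image: each pixel (u, v) stores the
# (x, y, z) coordinates of one point of the full facial point cloud.
H = W = 256
pos_map = np.zeros((H, W, 3), dtype=np.float32)

# Fill it with a synthetic "face": x, y from the UV grid, z from a smooth bump.
u, v = np.meshgrid(np.arange(W), np.arange(H))
pos_map[..., 0] = u                  # x coordinate
pos_map[..., 1] = v                  # y coordinate
pos_map[..., 2] = 50 * np.exp(-((u - 128) ** 2 + (v - 128) ** 2) / 5000.0)

# Because the layout is fixed, semantic points live at known UV indices:
# the same (u, v) always corresponds to, say, the nose tip (index is made up).
NOSE_TIP_UV = (128, 128)
nose_xyz = pos_map[NOSE_TIP_UV[1], NOSE_TIP_UV[0]]
print(nose_xyz.shape)       # (3,) -- a single 3D point

# Flattening the map yields the dense point cloud of H*W 3D vertices:
point_cloud = pos_map.reshape(-1, 3)
print(point_cloud.shape)    # (65536, 3)
```

This is why regressing one image-shaped tensor gives you both dense alignment (known UV index → known semantic point) and the full 3D shape at once.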
Their contributions can be summarized here (from the same paper):
– For the first time, we solve the problems of face alignment and 3D face reconstruction together in an end-to-end fashion, without the restriction of low-dimensional solution space.
– To directly regress the 3D facial structure and dense alignment, we develop a novel representation called UV position map, which records the position information of a 3D face and provides dense correspondence to the semantic meaning of each point on the UV space.
– For training, we proposed a weight mask that assigns different weight to each point on the position map and computes a weighted loss. We show that this design helps improve the performance of our network.
– We finally provide a light-weighted framework that runs at over 100 FPS to directly obtain the 3D face reconstruction and alignment result from a single 2D facial image.
– Comparison on the AFLW2000-3D and Florence datasets shows that our method achieves more than 25% relative improvements over other state-of-the-art methods on both 3D face reconstruction and dense face alignment.
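The weight-mask idea from the contributions above can be sketched in a few lines of numpy. The ratios here (68 landmarks : eye/nose/mouth region : rest of face : neck/background = 16 : 4 : 3 : 0) follow my reading of the paper; the tiny mask layout itself is a toy stand-in for the real UV-space mask:

```python
import numpy as np

# Toy 8x8 weight mask: higher weight on more discriminative regions.
H = W = 8
weight_mask = np.full((H, W), 3.0)   # face region
weight_mask[2:6, 2:6] = 4.0          # eye/nose/mouth sub-region (toy placement)
weight_mask[3, 3] = 16.0             # a "landmark" pixel (toy placement)
weight_mask[0, :] = 0.0              # neck/background: contributes nothing

def weighted_loss(pred, gt, mask):
    # Squared error over the 3 coordinate channels, scaled per UV pixel.
    return np.mean(((pred - gt) ** 2).sum(axis=-1) * mask)

gt = np.random.rand(H, W, 3)
pred = gt + 0.1                      # uniformly off by 0.1 in every coordinate
print(weighted_loss(pred, gt, weight_mask))
```

The effect is that an error on a landmark pixel costs the network far more than the same error on the neck, so training capacity concentrates where alignment accuracy matters.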
Their code is implemented in Python using TensorFlow. You can take a look at the official repo here:
PRNet — the source code of ‘Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network’ (github.com).
If you want to run their examples, you’ll need the following:
- Python 2.7 (numpy, skimage, scipy)
- TensorFlow >= 1.4
- dlib (for face detection; not required if you provide the bounding box information yourself)
- OpenCV (cv2, for showing results)
Right now the code is in development, and they will be adding more functionality in the near future.
Basics (Evaluated in paper)
- Face Alignment: dense alignment of both visible and non-visible points (including the 68 key points).
To be added:
- 3D Pose Estimation: rather than using only the 68 key points to calculate the camera matrix (easily affected by expressions and poses), they use all vertices (more than 40K) to calculate a more accurate pose.
- Texture Editing: data augmentation/selfie editing; modify specific parts of the input face, the eyes for example.
- Face Swapping: replace the texture with another, then warp it to the original pose and use Poisson editing to blend the images.
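The dense pose-estimation idea above boils down to fitting a rigid transform to tens of thousands of corresponding 3D points instead of just 68, which averages out per-point noise. Here is a generic Kabsch/Procrustes sketch of that fit (my own illustration, not the repo’s implementation):

```python
import numpy as np

def estimate_rigid_pose(src, dst):
    """Find rotation R and translation t minimizing ||R @ src_i + t - dst_i||."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    # Kabsch algorithm: SVD of the cross-covariance matrix.
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Toy check: rotate and translate a random point cloud, then recover the pose.
rng = np.random.default_rng(0)
src = rng.normal(size=(1000, 3))                    # stand-in for face vertices
angle = np.pi / 6
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
dst = src @ R_true.T + np.array([1.0, 2.0, 3.0])
R, t = estimate_rigid_pose(src, dst)
print(np.allclose(R, R_true, atol=1e-6))            # True
```

With real landmark noise, the more correspondences you feed into this least-squares fit, the more stable the recovered pose — which is the motivation for using all 40K+ vertices.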
- Clone the repository
git clone https://github.com/YadiraF/PRNet
- Download the PRN trained model at BaiduDrive or GoogleDrive, and put it into
- Run the test code. (test AFLW2000 images)
python run_basics.py  # can run with only Python and TensorFlow
- Run with your own images
python demo.py -i <inputDir> -o <outputDir> --isDlib True
python demo.py --help for more details.
I’ll be using Deep Cognition’s Deep Learning Studio to test this and other frameworks in the near future, so start by creating an account :).
Thanks for reading this. I hope you found something interesting here :)
If you have questions just follow me on Twitter
See you there :)
Discuss the post on Hacker News.
Editor’s note: Want to know more about the power of computer vision? Check out Heartbeat’s helpful computer vision resources.
Source: Deep Learning on Medium