A Unified Framework for Human Motion Imitation, Appearance Transfer, and Novel View Synthesis

Source: Deep Learning on Medium

Human image synthesis, including human motion imitation, appearance transfer and novel view synthesis has huge potential applications in character animation, re-enactment, virtual clothes try-on, movie or game making and so on.

However, existing methods for these tasks face a common challenge: state-of-the-art approaches still generate unrealistic-looking images.

First, it is difficult for existing network architectures to capture and preserve source details, such as diverse clothes (in terms of texture, style, and color) and highly structured face identity. Second, articulated and deformable human bodies result in large spatial layout and geometric changes under arbitrary pose manipulations. Furthermore, current methods cannot handle multiple source inputs, as in appearance transfer, where different body parts might come from different source people.

A Liquid Warping GAN

In this paper, researchers propose a unified framework to handle human motion imitation, appearance transfer, and novel view synthesis.

The training pipeline of the Liquid Warping GAN

They propose a Liquid Warping Block (LWB) to address the loss of source information from three aspects:

  1. A denoising convolutional auto-encoder is used to extract useful features that preserve source information, including texture, color, style, and face identity.
  2. Source features of each local part are blended into a global feature stream by the proposed LWB to further preserve the source details.
  3. It supports multiple-source warping, such as in appearance transfer: warping the features of the head from one source and those of the body from another, then aggregating them into a global feature stream.

Illustration of human motion imitation, appearance transfer, and novel view synthesis. The first column is the source image and the second column is the reference condition, such as a reference image or a novel camera view. The third column shows the synthesized results.
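The blending step of the LWB can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the bilinear warp and the additive fusion of warped source features into the global stream are assumptions made for clarity.

```python
import numpy as np

def bilinear_warp(feat, flow):
    """Warp a feature map of shape (C, H, W) by a per-pixel flow (2, H, W)."""
    C, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Sample positions shifted by the flow, clamped to the feature bounds.
    sy = np.clip(ys + flow[1], 0, H - 1)
    sx = np.clip(xs + flow[0], 0, W - 1)
    y0, x0 = np.floor(sy).astype(int), np.floor(sx).astype(int)
    y1, x1 = np.clip(y0 + 1, 0, H - 1), np.clip(x0 + 1, 0, W - 1)
    wy, wx = sy - y0, sx - x0
    # Standard bilinear interpolation over the four neighboring samples.
    return ((1 - wy) * (1 - wx) * feat[:, y0, x0]
            + (1 - wy) * wx * feat[:, y0, x1]
            + wy * (1 - wx) * feat[:, y1, x0]
            + wy * wx * feat[:, y1, x1])

def liquid_warping_block(target_feat, source_feats, flows):
    """Blend multiple warped source feature maps into the global stream."""
    out = target_feat.copy()
    for feat, flow in zip(source_feats, flows):
        out += bilinear_warp(feat, flow)  # additive fusion of each source part
    return out
```

This captures the key idea behind multiple-source warping: each source (e.g., one person's head, another's body) contributes its own warped feature map, and all contributions are aggregated into one global feature stream.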

Why is it Important?

The proposed framework employs a body recovery module to estimate a 3D body mesh, which is more expressive than a 2D pose.

This is a notable step, considering that existing approaches mainly rely on 2D pose, dense pose, and body parsing to estimate human body structure.
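Body recovery modules of this kind typically predict the parameters of a parametric body model and project its mesh vertices into the image with a weak-perspective camera, yielding dense 2D-3D correspondences from which a transformation flow between source and target poses can be derived. A minimal sketch of such a projection (the `(scale, trans)` camera parameterization is an assumption modeled on common 3D body recovery pipelines, not taken from the paper):

```python
import numpy as np

def weak_perspective_project(vertices, scale, trans):
    """Project 3D mesh vertices of shape (N, 3) to image coordinates (N, 2).

    A weak-perspective camera drops the depth axis and applies a uniform
    scale plus a 2D translation: p = scale * v_xy + trans.
    """
    return scale * vertices[:, :2] + np.asarray(trans)
```

Because every mesh vertex lands at a known 2D location in both the source and the target image, the correspondences are dense over the whole body surface, unlike the sparse keypoints of a 2D pose.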

Additionally, the proposed method supports more flexible warping from multiple sources. Adding to the mix, the researchers also build a new dataset for evaluating human motion imitation, appearance transfer, and novel view synthesis.

The model demonstrates its effectiveness and robustness in handling occlusion and in preserving face identity, shape consistency, and clothing details. With this, researchers and developers can take these tasks to the next level.

Codes and Dataset

All code and datasets are available here.

Read the full paper: Liquid Warping GAN
