Making Bobblehead Animations using Deep Learning



Multi-pose estimation is currently a state-of-the-art deep learning approach in computer vision for detecting humans and their joints in an image. In this article, I briefly outline how you can make a funny little bobblehead GIF like the one I produced above, using LeBron James’ face on top of Drake’s Hotline Bling music video. There are essentially 5 main steps:

1. Download the video you are interested in overlaying. I chose Drake’s Hotline Bling video.

2. Download the isolated face image that you would like to overlay on top of your video. I chose LeBron James’ face. If your face image inconveniently has a background, use an image editing tool to remove the background until your image looks something like this:

Your favourite celebrity’s face without the background

3. On each frame of the video, detect the humans and their joints. The project I used was https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation, which is a Keras Python implementation of multi-pose estimation. In particular, we are interested in using the code to localize the head of each detected person.

Example of using the code from https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation to perform human multi-pose estimation on an image.

4. Once we have the pixel coordinates of the face in the video frame, we overlay the isolated face cut-out image on top of the video frame at those coordinates.

(Left) Original video frame. (Right) Face overlay video frame.

5. Repeat steps 3 and 4 for each frame in the video. Afterwards, I used ffmpeg to combine all the frames into a silent video. You will have to do a little more work to add and sync the audio from the original video clip. But for the most part, you are done!
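The overlay in step 4 can be sketched with plain NumPy. This is a minimal illustration, not the code I actually ran: the function name `overlay_face` and its arguments are hypothetical, the head position is assumed to come from the pose estimator as a single (x, y) pixel coordinate, and the face cut-out is assumed to be a 4-channel image whose last channel is the transparency (alpha) mask and whose colour channels are in the same order as the frame.

```python
import numpy as np


def overlay_face(frame, face_rgba, center_xy):
    """Alpha-blend a face cut-out onto a frame, centred at center_xy.

    frame:     H x W x 3 uint8 array (one video frame)
    face_rgba: h x w x 4 uint8 array; the last channel is the alpha mask
    center_xy: (x, y) pixel position of the head (assumed to come from
               the pose estimator; hypothetical input format)
    """
    out = frame.copy()
    fh, fw = face_rgba.shape[:2]
    H, W = frame.shape[:2]
    cx, cy = int(round(center_xy[0])), int(round(center_xy[1]))

    # Top-left corner of the face in frame coordinates (may be off-screen).
    x0, y0 = cx - fw // 2, cy - fh // 2

    # Part of the face image that actually falls inside the frame.
    fx0, fy0 = max(0, -x0), max(0, -y0)
    fx1, fy1 = min(fw, W - x0), min(fh, H - y0)
    if fx0 >= fx1 or fy0 >= fy1:
        return out  # the face lies entirely outside the frame

    face = face_rgba[fy0:fy1, fx0:fx1].astype(np.float32)
    alpha = face[..., 3:4] / 255.0  # 0 = transparent, 1 = opaque
    region = out[y0 + fy0:y0 + fy1, x0 + fx0:x0 + fx1].astype(np.float32)

    # Standard alpha compositing: face where opaque, frame where transparent.
    blended = alpha * face[..., :3] + (1.0 - alpha) * region
    out[y0 + fy0:y0 + fy1, x0 + fx0:x0 + fx1] = blended.round().astype(np.uint8)
    return out
```

Calling `overlay_face(frame, face, (x, y))` once per detected head returns a new frame and leaves the input untouched. In practice you would probably also resize the face to roughly match the detected head size before blending; that step is left out of this sketch.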

Final demo of a bobblehead animation using multi-pose estimation
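For the ffmpeg part of step 5, here is a rough sketch of how the two commands could be built from Python. The helper names, the file names and the frame-numbering pattern are all assumptions made for illustration, and these are not necessarily the exact commands I ran. The first command stitches numbered frames into a silent video; the second copies the audio track of the original clip onto it.

```python
import subprocess


def build_video_cmd(frame_pattern, fps, out_path):
    """ffmpeg command that turns numbered image frames into a silent video."""
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),   # should match the original video's frame rate
        "-i", frame_pattern,      # e.g. "frames/frame_%05d.png" (assumed naming)
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",    # widely playable pixel format
        out_path,
    ]


def build_audio_mux_cmd(silent_video, original_video, out_path):
    """ffmpeg command that adds the original clip's audio to the silent video."""
    return [
        "ffmpeg", "-y",
        "-i", silent_video,
        "-i", original_video,
        "-map", "0:v:0",          # video stream from the first input
        "-map", "1:a:0",          # audio stream from the second input
        "-c", "copy",             # copy both streams without re-encoding
        "-shortest",              # stop at the end of the shorter stream
        out_path,
    ]


def run(cmd):
    """Run a command and raise an error if ffmpeg exits with a failure code."""
    subprocess.run(cmd, check=True)
```

For example, `run(build_video_cmd("frames/frame_%05d.png", 30, "silent.mp4"))` followed by `run(build_audio_mux_cmd("silent.mp4", "original.mp4", "bobblehead.mp4"))`. Keeping the frame rate equal to that of the source video is what keeps the audio roughly in sync, assuming no frames were dropped along the way.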

Source: Deep Learning on Medium