NeurIPS 2020 Workshop | Indie GAN Interpolation Method Turns Selfies Into Cartoon Characters


Last month, the deep learning powered online tool Toonify Yourself! went viral on social media, attracting attention from the ML community and far beyond. Designed “for fun and amusement using deep learning and Generative Adversarial Networks,” the system was developed by a pair of independent researchers, Justin N. M. Pinkney and Doron Adler, and lets anyone turn selfies or portraits into impressive animation-style images. Demand for the high-performance homemade model caused the site to crash, but it quickly returned thanks to support from user donations.

In a paper submitted to the NeurIPS 2020 Machine Learning for Creativity and Design workshop, Pinkney and Adler present their research, which enables image generation in novel domains with a degree of creative control over the output. The team’s resolution-dependent GAN interpolation method combines the high-resolution layers of an FFHQ model with the low-resolution layers of a model transfer-learned on animated character faces, pairing realistic facial textures with the structural characteristics of a cartoon.

Generative Adversarial Networks (GANs) are the current SOTA approach for many image synthesis and translation tasks. GANs can generate photorealistic images by learning regularities and patterns from the domain of their training data, but they struggle on image generation tasks for creative purposes, which often involve truly novel domains.

Pinkney and Adler based their method for interpolating between generative models in a resolution-dependent manner on the StyleGAN architecture, introduced by Nvidia researchers in 2018.

Schematic of the method’s “layer swapping” interpolation scheme

Previous work has shown that a model produced via transfer learning retains a close relationship to the original “base model,” and that linearly interpolating between the weights of two such generative models produces outputs that are approximately an interpolation of the two learned domains. The researchers noted however that simply applying linear interpolation between all model parameters fails to exploit an important control element in StyleGAN: its layers operate at different resolutions, with low-resolution layers governing pose and face shape and high-resolution layers governing fine texture and color.
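To make that baseline concrete, below is a minimal sketch (not the authors’ released code) of such naive linear interpolation in PyTorch, assuming base_state and transfer_state are state dicts from a base generator and its transfer-learned copy with identical architectures:

import torch

def interpolate_weights(base_state, transfer_state, alpha):
    # Naive linear interpolation: every parameter is blended with the
    # same factor, so features at all scales shift toward the new
    # domain together.
    return {
        name: torch.lerp(base_param, transfer_state[name], alpha)
        for name, base_param in base_state.items()
    }

An alpha of 0 reproduces the base model and 1 the transferred model, while intermediate values mix the two domains uniformly across every layer.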

If the interpolation is instead performed between parameters from the different models according to the resolution of each layer, it becomes possible to select and blend features from the different generators as desired. Pinkney and Adler also experimented with a dataset of traditional Japanese ukiyo-e prints collected from online museum images and demonstrated that the proposed method can introduce the prints’ iconic poses and head shapes while preserving photorealistic rendering.
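As a rough illustration of this resolution-dependent blending (a sketch under assumptions, not the authors’ implementation), the helpers below assume a hypothetical naming convention in which each synthesis parameter is prefixed with its operating resolution, e.g. “b32.conv0.weight”; real StyleGAN ports label their layers differently:

import re
import torch

def parse_resolution(name):
    # Hypothetical convention: synthesis parameters carry a resolution
    # prefix such as "b32.". Returns None for resolution-agnostic
    # parameters, e.g. those of the mapping network.
    match = re.match(r"b(\d+)\.", name)
    return int(match.group(1)) if match else None

def blend_models(base_state, transfer_state, cutoff=32, alpha=1.0):
    # Resolution-dependent blending: coarse layers (resolution <= cutoff),
    # which govern pose and head shape, move toward the transferred
    # cartoon model, while finer layers keep the base FFHQ weights and
    # so preserve realistic texture. With alpha = 1.0 this reduces to a
    # hard "layer swap".
    blended = {}
    for name, base_param in base_state.items():
        res = parse_resolution(name)
        if res is not None and res <= cutoff:
            blended[name] = torch.lerp(base_param, transfer_state[name], alpha)
        else:
            blended[name] = base_param.clone()
    return blended

Raising the cutoff hands more layers to the transferred model and so imports more of the cartoon’s structure; lowering it keeps the output closer to the photorealistic base.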

With a ukiyo-e selfie transfer capability possibly joining the popular Toonification platform, there would seem to be endless opportunities for such easy and impressive online image transfer tools, with considerable user demand for avatar generation and much more.

The paper Resolution Dependent GAN Interpolation for Controllable Image Synthesis Between Domains is on arXiv.