Original article was published on Artificial Intelligence on Medium
Image Compression with GANs
Overview of the paper “High-Fidelity Generative Image Compression” by Mentzer et al.
Image compression is an essential part of the gaming experience, with multiple applications related to the storage and transmission of data. It is the key to making cloud game streaming possible: by reducing bandwidth requirements, it lets us reach framerates comparable to playing the game locally on our own computer. Image compression is also critical for game developers when packaging games with high-resolution textures, so that the distribution version has manageable disk-space requirements for end-users.
Hand-crafted algorithms for image compression have limited potential because they lack any real understanding of the content they are compressing. This is where deep neural networks come in, promising much higher compression rates thanks to their ability to learn the statistical structure of images.
HiFiC: High-Fidelity Compression
This brings me to our paper in focus for today’s episode. It is titled “High-Fidelity Generative Image Compression” and it provides a Generative Adversarial Network (GAN) based compression method that is quite simply mind-blowing.
As you can see, the JPEG compression shown on the right here is quite lossy compared to this paper’s method on the left, even though both use about 75 kilobytes to encode this image. For JPEG to reach comparable quality, it needs at least four times as many bits per pixel. In fact, this method’s compressed images are difficult to distinguish even from the original uncompressed RGB images.
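Bits per pixel (bpp) is the standard way to compare codecs at a glance: the compressed size in bits divided by the pixel count. A minimal sketch of the arithmetic, assuming a hypothetical 1920×1080 image (the article does not state the actual resolution):

```python
# Bits per pixel (bpp): compressed size in bits / number of pixels.
def bits_per_pixel(file_size_bytes: int, width: int, height: int) -> float:
    return file_size_bytes * 8 / (width * height)

# Hypothetical example: a ~75 kB encoding of a 1920x1080 image.
bpp = bits_per_pixel(75_000, 1920, 1080)
print(f"HiFiC-like budget: {bpp:.3f} bpp")   # ~0.289 bpp

# A codec needing 4x the bits for similar quality would sit near:
print(f"4x budget:         {4 * bpp:.3f} bpp")
```

The same arithmetic makes it easy to see why sub-0.3 bpp results are striking: an uncompressed 8-bit RGB image uses 24 bpp.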
You can play with more such examples on the authors’ project page. The difference on other images is just as enormous, and this level of compression is unprecedented. Truly remarkable what we can achieve with deep learning!
Scope of Future Work
Now, it is important to note that this work focuses only on compressing still images, but I think it could be extended to videos as well. One possible extension would be to adapt the GAN architecture to also process optical-flow information between consecutive frames.
This would make it far more suitable for compressing video, which would be ideal for game streaming. I’m excited to see what the future holds for this line of research, and whether it can give new life to struggling streaming platforms like Google Stadia.
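The intuition behind using inter-frame motion can be illustrated with a toy motion-compensation example. This is not the paper’s method, just a synthetic sketch: when the motion between two frames is known, the motion-compensated residual carries far less information (lower entropy) than the raw frame, which is exactly what a video codec exploits.

```python
import numpy as np

def entropy_bits(arr: np.ndarray) -> float:
    """Shannon entropy (bits/symbol) of an 8-bit array's value histogram."""
    counts = np.bincount(arr.ravel(), minlength=256)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Synthetic 8-bit "frame": a horizontal gradient plus random texture.
rng = np.random.default_rng(0)
h, w = 120, 160
frame1 = ((np.linspace(0, 255, w)[None, :] + rng.integers(0, 32, (h, w)))
          % 256).astype(np.uint8)

# Next frame: the camera pans 3 pixels to the right (a pure shift).
frame2 = np.roll(frame1, 3, axis=1)

# Motion-compensated residual (here the motion vector is known exactly,
# so the residual is trivial; real codecs estimate it, e.g. via optical flow).
residual = frame2.astype(np.int16) - np.roll(frame1, 3, axis=1).astype(np.int16)
residual_u8 = (residual + 128).astype(np.uint8)  # shift into 0..255 range

print(f"raw frame entropy: {entropy_bits(frame2):.2f} bits/px")
print(f"residual entropy:  {entropy_bits(residual_u8):.2f} bits/px")
```

In this idealized case the residual entropy drops to zero; with real footage and estimated flow it stays small but nonzero, and that residual is what the compressor would actually need to encode.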