Machine Learning in the field of Photography

Original article was published by Samarth Pant on Artificial Intelligence on Medium

When hearing the words ‘AI’, ‘Machine Learning’ or ‘bot’, most people picture a walking, talking android straight out of a sci-fi movie and assume it belongs to a time far away in the future.

AI is used everywhere, from taking photographs on your mobile phone to editing them. While taking a photograph, you may have seen boxes automatically drawn around the faces in the frame; this is achieved with face detection, a Machine Learning technique. You may also have noticed that your device’s camera automatically recognizes an object in the frame and displays its name on the screen; this is made possible by object detection.

As technology advances, it is becoming increasingly common to see visually appealing, ultra-high-resolution images. People no longer need to learn tools like Photoshop and CorelDRAW to enhance and alter their images. AI is already being used in nearly every aspect of image enhancement and manipulation in order to produce the best possible pictures.

Nearly every image you have seen was either captured as a photograph or created manually by a living, breathing person. There are possibly hundreds of tools for producing images manually, but they all require a human to preside over the process. Now imagine a computer program that draws from scratch whatever you tell it to. Microsoft’s Drawing Bot might be one of the first technologies to make this possible. Envision a time in the near future when you can just download an app on your smartphone and give it a few instructions such as “I want an image of me standing next to the Eiffel Tower.” (Make sure you word it correctly, though.)

AI is also used when processing or editing images in software such as Adobe Photoshop, Adobe Illustrator, and many others.

In Photoshop, machine learning can be used to automatically correct the perspective of an image for you. Machine learning-enabled features can also make content-aware suggestions. For example, if you are working on a UI mock-up, Adobe XD might automatically suggest certain buttons.

One of the most interesting Machine Learning ideas of the last few years is the GAN.

Generative Adversarial Networks (GANs)

A generative adversarial network (GAN) is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in 2014.

Generative Adversarial Networks belong to the set of generative models. This means that their job is to create or “generate” new data in a completely automated procedure.

A GAN is composed of two individual neural networks that compete against each other (in an adversarial manner). One neural network, called the generator, creates new data instances from random noise, while the other, the discriminator, evaluates them for authenticity. In other words, the discriminator decides whether each instance of data it reviews belongs to the actual training dataset or not.
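
This competition is commonly written as a single minimax objective, shown here for reference. The symbols are not used elsewhere in this article and are introduced only for this formula: D(x) is the discriminator’s estimated probability that x is a real image, G(z) is the generator’s output for a noise input z, p_data is the distribution of real images, and p_z is the noise distribution.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to make V large by scoring real images close to 1 and generated images close to 0, while the generator tries to make V small by producing images that the discriminator scores highly.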

  1. The generator initially takes in some random noise, turns it into an image, and passes that image to the discriminator.
  2. The discriminator, which has access to a dataset of real images, compares the image it received from the generator against them and evaluates its authenticity.
  3. Since the first generated image is little more than random noise, it is evaluated as fake.
  4. The generator keeps trying its luck by adjusting its parameters so that the images it produces gradually get better.
  5. Both networks keep getting smarter as the training progresses: the generator at generating fake images and the discriminator at detecting them.
  6. Eventually, the generator manages to create an image indistinguishable from those in the dataset of real images, and the discriminator can no longer tell whether a given image is real or fake.
  7. At this point, the training ends, and the generated image is our final result.
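
The steps above can be sketched in a few lines of Python. This is only a toy illustration, not how a real image GAN is built: the "images" are single numbers, the generator is the line G(z) = a*z + b, the discriminator is a one-input logistic classifier, and the gradients are written out by hand. The real-data distribution (mean 4.0, standard deviation 0.5), the learning rate, the fixed step count, and all function names are assumptions made for this example. The generator uses the common "non-saturating" loss, -log D(G(z)).

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def generate(g, noise):
    """Generator: map each noise value z to a fake sample a*z + b."""
    a, b = g
    return [a * z + b for z in noise]


def discriminate(d, xs):
    """Discriminator: probability that each sample is real."""
    w, c = d
    return [sigmoid(w * x + c) for x in xs]


def generator_loss(g, d, noise):
    """Non-saturating generator loss: -mean(log D(G(z)))."""
    scores = discriminate(d, generate(g, noise))
    return -sum(math.log(s) for s in scores) / len(scores)


def discriminator_step(d, real, fake, lr):
    """One gradient-descent step on the real-vs-fake classification loss."""
    w, c = d
    p_real = discriminate(d, real)
    p_fake = discriminate(d, fake)
    grad_w = (sum((p - 1.0) * x for p, x in zip(p_real, real)) / len(real)
              + sum(p * x for p, x in zip(p_fake, fake)) / len(fake))
    grad_c = (sum(p - 1.0 for p in p_real) / len(real)
              + sum(p_fake) / len(fake))
    return [w - lr * grad_w, c - lr * grad_c]


def generator_step(g, d, noise, lr):
    """One gradient-descent step that makes fakes look more real to D."""
    a, b = g
    w, _ = d
    p_fake = discriminate(d, generate(g, noise))
    n = len(noise)
    grad_a = sum((p - 1.0) * w * z for p, z in zip(p_fake, noise)) / n
    grad_b = sum((p - 1.0) * w for p in p_fake) / n
    return [a - lr * grad_a, b - lr * grad_b]


def train(steps=2000, batch=64, lr=0.05, seed=0):
    """Alternate discriminator and generator updates for a fixed step count."""
    rng = random.Random(seed)
    g = [1.0, 0.0]  # generator starts out producing samples around 0
    d = [0.1, 0.0]  # discriminator starts out nearly undecided
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]  # assumed real data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        d = discriminator_step(d, real, generate(g, noise), lr)
        g = generator_step(g, d, noise, lr)
    return g, d


if __name__ == "__main__":
    g, d = train()
    print("generator parameters (a, b):", g)
    print("discriminator parameters (w, c):", d)
```

Unlike step 7, this sketch simply stops after a fixed number of steps instead of checking whether the discriminator has been fooled. In a setup like this, the generator’s offset b would typically be expected to drift toward the real mean, but small GANs are known to oscillate, so the printed values should be read as illustrative only.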

Machine Learning is used in almost every phase of photography, from taking a photograph to enhancing and editing it, and even in generating new photographs of things that do not exist (using GANs).