Source: Deep Learning on Medium
Understand The Computer Vision Landscape before the end of 2019
Get a brief overview of how Computer Vision developed over the past 50 years and gain knowledge around buzzwords like ‘AI Winter’ and their meaning.
2019 is about to come to an end, but before it does, wouldn’t it be nice to understand one of the buzzwords of our decade in Machine Learning? This article will help you gain a brief understanding of Computer Vision, enough knowledge to make you look clever over Christmas dinner.
After reading this article, you will become familiar with terms such as Computer Vision, Deep learning, Machine learning, AI Winter, Skynet…you know, the good stuff.
So what is this Computer Vision thing I keep hearing about?
When someone asks you this question, you might answer along the lines of "Computer Vision is how a computer sees." Well, not precisely. Try the explanation below if you really want to turn some heads.
Computer Vision is the process by which a machine or a system generates an understanding of visual information by invoking one or more algorithms acting on the information provided. That understanding is translated into decisions, classifications, pattern observations, and more. And now you are turning heads.
Let’s have a quick history lesson and see how the field of Computer Vision has developed.
The need for Computer Vision arose when we set out to mimic the perception and vision system of the human body. So the journey began in the 1960s, when academics took on human perception and attempted to replicate its basic functioning on a computer system. Our pioneering academics aimed to give robots the ability to see and to describe what they observed. This was the first step towards Skynet (yes, like the movie, but this was before that).
It wasn't easy to make Skynet see like humans do, so researchers turned to digital image processing techniques to gain an understanding of the content of images presented to computer vision systems. By understanding, I mean the extraction of edge information, contours, lines, and shapes from an image. The '70s were all about algorithms that could extract this information from a digital image.
One thing to note is that the first AI Winter occurred in the 1970s. For those who are unfamiliar with the term, an 'AI Winter' is a period of reduced interest, funding, morale (hype), and ongoing research within AI-related domains such as Computer Vision, Machine Learning, and so on.
The ’80s and ’90s in Computer Vision focused on Maths and Statistics. Researchers and academics began to marry computer vision techniques with mathematical algorithms. A good example to portray the utilization of maths in computer vision and image processing techniques would be an edge detector algorithm.
Edge detection is one of the primary image processing techniques taught in most computer vision courses. In 1986, a particularly useful edge detector was developed by John F. Canny: the Canny Edge Detector. By leveraging mathematical concepts such as calculus, differentiation, and function optimization, Canny developed a very popular edge detector, and it is still taught in Master's-level courses.
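To make edge detection concrete, here is a minimal NumPy sketch of gradient-magnitude edge detection with Sobel filters. This is only the first stage of the full Canny pipeline (the smoothing, non-maximum suppression, and hysteresis steps are omitted), and the threshold value is an illustrative assumption:

```python
import numpy as np

def sobel_edges(image, threshold=1.0):
    """Gradient-magnitude edge map: a simplified sketch of the first
    stage of the Canny pipeline (smoothing, non-maximum suppression,
    and hysteresis thresholding are omitted)."""
    # Sobel kernels approximate horizontal and vertical derivatives
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = image.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = image[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)  # horizontal gradient
            gy[i, j] = np.sum(patch * ky)  # vertical gradient
    magnitude = np.hypot(gx, gy)  # strength of the intensity change
    return magnitude > threshold  # binary edge map

# A toy image: dark left half, bright right half, so one vertical edge
img = np.zeros((5, 6))
img[:, 3:] = 1.0
print(sobel_edges(img))  # True only along the middle columns
```

In practice you would use an optimized library routine (for example, OpenCV's `Canny` function) rather than this hand-rolled loop, but the idea is the same: edges are located where the image gradient is strong.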
Fast forward to the previous decade: the 2000s were a rather revolutionary time for Computer Vision. Deep Learning emerged, and Computer Vision once again became a hot topic for the media, researchers, and academics.
Another key definition is coming up.
Deep Learning is a sub-branch of machine learning in which algorithms leverage several layers of neural networks to extract richer features from input data. Examples of deep learning techniques are Deep Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs).
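The core operation inside a CNN layer can be sketched in a few lines of NumPy. This is a toy illustration, not a real framework implementation: one filter slides over the image to produce a feature map, followed by a ReLU activation. The example filter values are hypothetical:

```python
import numpy as np

def conv2d(image, kernel):
    """One convolutional layer pass (no padding, stride 1): slide a
    small filter over the image to produce a feature map, then apply
    a ReLU activation. This is the core operation inside a CNN."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Response of the filter at this position
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0)  # ReLU: keep only positive responses

# A hypothetical 3x3 filter that responds to left-to-right brightening
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)
image = np.random.rand(8, 8)
feature_map = conv2d(image, kernel)
print(feature_map.shape)  # (6, 6)
```

In a real CNN, many such filters are learned from data and stacked in layers, so early layers pick up edges and textures while deeper layers respond to richer patterns.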
So many terminologies! Before we go on, below is a link to some terminology relating to Machine Learning.
2012 was a pivotal year within the Computer Vision landscape. You might know what I am about to mention here (shhh, don't ruin it for others). There is a competition called the 'ImageNet Large Scale Visual Recognition Challenge.' It is held annually and is mostly a gathering of academics, researchers, and enthusiasts comparing software algorithms that classify and detect objects in images. The 2012 installment of this competition saw the introduction of a Deep Convolutional Neural Network (AlexNet) that achieved an error rate far lower than any other entry that year or in the years before it.
I won't delve into too many details on how AlexNet is designed; there are tons of resources online for that. But I will mention the two significant contributions that AlexNet brought to the landscape.
Firstly, GPUs. AlexNet's stunning performance was made possible by Graphics Processing Units (GPUs). Although GPUs had been used in the competition before, it was AlexNet's utilization of them that caught the eyes and attention of the computer vision community.
Secondly, CNNs became standard. By showcasing the effectiveness of CNNs, AlexNet popularised them. From that year onwards, CNNs have been found in most computer vision applications and research.
I will have to pause here, and perhaps continue this topic in another article in the future. There are so many topics and domains I have not touched on, but below are some medium articles that explain in detail the key terms mentioned in this article and more.
Now you can head into 2020 with an understanding of Computer Vision and its development since the 1960s.
If you enjoyed this article and would like more like it, simply give me a follow and allow me to expand your knowledge of Machine Learning as a whole.