What does it take to create a real-world computer vision project?

Original article was published on Artificial Intelligence on Medium

What is the general workflow of a computer vision project for a customer?

The most important aspect of a computer vision project is understanding the customer's business case. What is the exact return on investment? What is the general problem area, and what are the key performance indicators? The second most important aspect is camera selection. What is the most suitable camera for the business case? Is it a cell phone camera, a fixed camera, or an off-the-shelf camera? Determining this requires researching different camera models, their technical specifications, and their characteristics.
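As an illustration of the kind of specification check that camera research involves, here is a minimal sketch that estimates the horizontal field of view and the pixel density a camera delivers at a given distance. It assumes an idealized pinhole camera with no lens distortion. The function names and the example numbers (a 6.4 mm wide sensor, a 4 mm lens, a 1920-pixel-wide image, a subject 5 m away) are hypothetical and chosen only for illustration.

```python
import math


def horizontal_fov_deg(sensor_width_mm: float, focal_length_mm: float) -> float:
    """Horizontal field of view in degrees under the pinhole camera model."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))


def pixels_per_meter(image_width_px: int, sensor_width_mm: float,
                     focal_length_mm: float, distance_m: float) -> float:
    """Horizontal pixel density on a flat scene facing the camera at distance_m."""
    # By similar triangles, the visible scene width scales with distance.
    scene_width_m = distance_m * sensor_width_mm / focal_length_mm
    return image_width_px / scene_width_m


if __name__ == "__main__":
    # Hypothetical camera: 6.4 mm sensor width, 4 mm lens, 1920 px wide image.
    print(round(horizontal_fov_deg(6.4, 4.0), 2))
    print(pixels_per_meter(1920, 6.4, 4.0, distance_m=5.0))
```

A figure like this can then be compared against what the business case needs, for example whether an object of interest would span enough pixels to be recognized at the expected working distance.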

Next, decide where the computer vision algorithms will run. This again depends on the exact business case. The computation can run on the camera itself, in a cloud environment such as AWS or Azure, or on a private server.
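One factor that can inform this decision is how much data would have to leave the camera if processing happens elsewhere. This is an assumption about a typical constraint and is not something the business case always dictates. The sketch below estimates the bandwidth of an uncompressed video stream. The function name and the example resolution and frame rate are hypothetical, and real deployments normally compress video, so treat the result as an upper bound rather than a prediction.

```python
def raw_stream_mbps(width_px: int, height_px: int, fps: float,
                    bytes_per_pixel: int = 3) -> float:
    """Bandwidth of an uncompressed video stream in megabits per second.

    bytes_per_pixel=3 assumes 8-bit RGB frames (an illustrative default).
    """
    bits_per_second = width_px * height_px * bytes_per_pixel * 8 * fps
    return bits_per_second / 1_000_000


if __name__ == "__main__":
    # Hypothetical stream: 1920x1080 RGB frames at 30 frames per second.
    print(raw_stream_mbps(1920, 1080, 30))
```

If the estimate is far above the available uplink even after compression, that is one argument for running at least part of the pipeline on the camera or on a nearby private server.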

The choice of programming language then depends on where the computation runs. If the computer vision algorithms run in the cloud, there are more languages to choose from. In some cases, however, the algorithms must run on the camera itself, where computation speed is of utmost importance, which leaves fewer language options. Consequently, many important decisions need to be made before selecting the programming language, libraries, and frameworks.

Once a decision on the language is made, you can start to look at the libraries and frameworks that are available. A best practice is to look for ready-made implementations of computer vision algorithms. If no implementation of an algorithm is available, it is up to the developers to write their own. Notable libraries and frameworks to start with include the following:

OpenCV, CUDA, ARKit, ARCore, Unity AR Foundation, TensorRT, Keras, PyTorch, SimpleCV, Amazon Rekognition, Azure Cognitive Services, and Amazon SageMaker, to name a few.

Finally, once the aforementioned dependencies have been settled, the actual development work and training of the machine-learning or deep-learning algorithms can start.