Why every company will have machine learning engineers soon

Machine learning engineering will be commonplace in the immediate future

When we think of companies that use machine learning in production, we think of Google, Facebook, and Uber: large, technical companies with mature data science teams and heaps of user data.

Maintaining Cortex, an open source API platform for machine learning, I’ve noticed something new:

  • Small startups and solo engineers are building ML applications.
  • Companies we don’t think of as “technical” are doing the same.

There are two-person startups launching products built on state-of-the-art models, fast food chains using recommendation engines to generate menus, and solo hackers building projects like this DIY license plate reader (built using a Raspberry Pi and 5G).

In parallel to this shift, there has been another trend within production ML. Everyone is suddenly hiring machine learning engineers.

These events are not disconnected. The reason companies are hiring machine learning engineers, people who build ML applications, is the same reason that production machine learning is suddenly so much more accessible:

The costs and barriers attached to machine learning engineering have shrunk significantly, so much so that they no longer outweigh the benefits of production ML — even for non-FAANG companies.

1. Models are commodified now

Model development has historically been the biggest barrier to companies using machine learning. But over the last decade, particularly the last five years, state-of-the-art models have been open sourced, oftentimes in pre-trained form.

Now, you can implement a state-of-the-art text-generating model like GPT-2 using a library like Hugging Face’s Transformers:

Source: Transformers
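
A minimal sketch of that workflow, using the pretrained GPT-2 classes Transformers provides (the prompt and generation length here are illustrative):

```python
# A sketch of GPT-2 inference with pretrained weights from Transformers.
# The prompt and max_length are illustrative; the snippet in the original
# post may differ slightly.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # downloads pretrained weights
model = GPT2LMHeadModel.from_pretrained("gpt2")
inputs = tokenizer.encode("Machine learning is", return_tensors="pt")
outputs = model.generate(inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```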

The above is just an example, but its significance is hard to overstate. In six lines of Python, you can serve predictions from a model more powerful than anything you could have trained just a few years ago with a $100,000 budget and multiple data scientists.

This isn’t unique to language models. Wildlife Protection Solutions (WPS) is a small company that uses technology to help protect endangered species and ecosystems. Previously, they’d installed motion cameras to monitor nature preserves for poaching activity, but manually reviewing all of that video in real time was simply ineffective. Even with programmatic filtering, they were still only detecting 40% of the poachers captured on video.

Machine learning would previously have been out of the question for a team this size. However, using Highlighter (a platform that makes it easy to adapt computer vision models to your own data), WPS developed a model that detects over 80% of the poachers captured on video:

Source: Silverpond

WPS is a smaller company with limited funds and less data than a tech giant, but they were able to develop an effective model and catch twice as many poachers as before, all because models have been commodified.

2. Production tooling has matured

According to a recent study, a majority of data scientists report spending over 25% of their time on model deployment alone. It’s not uncommon for problems around deployment to bring a company’s ML efforts to a halt, at least temporarily.

This is why companies like Uber, Netflix, and Google all hire large ML infrastructure teams: specialized engineers dedicated specifically to building and maintaining the infrastructure for deploying models.

For years, this overhead stopped companies from doing any machine learning engineering. Now the ecosystem has matured, and there are tools that automate ML infrastructure.

For example, Cortex is an open source API platform that, given a trained model, will automatically configure and deploy a production API. For anyone with a software engineering background, the workflow is similar to Serverless or Elastic Beanstalk:

Source: Cortex repo
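
In practice, you write a small predictor class, point a YAML manifest at it, and run `cortex deploy`. A minimal sketch of such a predictor, following the `PythonPredictor` interface from the Cortex docs (exact details vary by version):

```python
# predictor.py: a sketch of a Cortex Python predictor. The interface
# (__init__ receives the API config, predict receives the request payload)
# follows the Cortex docs; exact details vary by version.
from transformers import GPT2LMHeadModel, GPT2Tokenizer


class PythonPredictor:
    def __init__(self, config):
        # Runs once when the API starts: load the model into memory.
        self.tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
        self.model = GPT2LMHeadModel.from_pretrained("gpt2")

    def predict(self, payload):
        # Runs on every request: generate text from the prompt in the payload.
        inputs = self.tokenizer.encode(payload["text"], return_tensors="pt")
        outputs = self.model.generate(inputs, max_length=50)
        return self.tokenizer.decode(outputs[0])
```

From there, `cortex deploy` handles packaging the predictor, provisioning instances, and exposing an autoscaling HTTP endpoint, which is what makes the workflow feel like Serverless or Elastic Beanstalk.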

There are a handful of other projects that take the same approach of applying software engineering workflows and principles to build production tools for ML. DVC (Data Version Control), for example, provides a Git-like interface for versioning data and models:

Source: DVC
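
Day to day, that means Git-style commands like `dvc add` and `dvc push` running alongside Git itself, but DVC also exposes a small Python API for reading a pinned version of a dataset. A sketch, with a hypothetical repo and file path:

```python
# A sketch of reading a versioned dataset via DVC's Python API.
# The repo URL, file path, and tag below are hypothetical.
import dvc.api

with dvc.api.open(
    "data/training.csv",
    repo="https://github.com/example/project",
    rev="v1.0",  # any Git ref: a tag, branch, or commit pinning the data version
) as f:
    training_data = f.read()
```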

As a result of these projects, companies can hire machine learning engineers to build ML applications without hiring an entire platform team to build the underlying infrastructure.

3. Hardware is getting better — and more accessible

Inference is computationally expensive. Running a large deep learning model like GPT-2, for example, requires GPUs, and oftentimes several of them.

Until recently, the cost of GPU instances was prohibitive, but prices have fallen, making GPU inference far more approachable. In addition, new ASICs built specifically for machine learning are being released (Inferentia on AWS, TPUs on GCP). These chips are often much more efficient than CPUs or GPUs and can reduce the cost of running inference at scale.

For example, we recently ran an internal benchmark of an Inferentia instance (inf1.2xlarge) against a GPU instance with a nearly identical spot price (g4dn.xlarge) and found that, when serving the same ResNet50 model on Cortex, the Inferentia instance offered a more than 4x speedup.
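
A comparison like this is straightforward to reproduce in spirit: serve the same model behind two endpoints and measure average request latency against each. A rough sketch (the URLs and payload are placeholders, not our actual setup):

```python
# A rough sketch of comparing request latency across two model-serving
# endpoints. The URLs and payload below are placeholders.
import time

import requests


def mean_latency(url, payload, n=100):
    requests.post(url, json=payload)  # warm-up request, not timed
    start = time.time()
    for _ in range(n):
        requests.post(url, json=payload)
    return (time.time() - start) / n

# e.g. mean_latency("https://inf1-api.example.com/predict", {"url": "..."})
#      mean_latency("https://g4dn-api.example.com/predict", {"url": "..."})
```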

These trends aren’t slowing down. ASICs are going to continue to get more powerful, GPUs will continue to get cheaper, and the big clouds will continue to compete on price.

As this happens, the number of companies for whom machine learning engineering is viable increases, and as a result, the number of ML applications in the world grows.

Machine learning is becoming just another part of the stack

15 years ago, most businesses couldn’t take orders online. Now, even your local pizzeria has a website and integrates with a delivery platform.

10 years ago, there were around 100,000 apps on Apple’s App Store. This year, there are nearly 2 million, with seemingly every company offering one.

In all of these cases, a technology that was originally accessible only to a small group became accessible to everyone through a combination of trends, almost always including a reduction in cost, improved tooling, and the commodification of some part of the process.

It’s hard to imagine this happening with machine learning, mostly because we’ve spent the last decade treating it as a science fiction research project that teaches computers to play StarCraft.

If we zoom out, however, and look at how it’s already being used in production by machine learning engineers, we see that ML is widely applicable, and that as the barriers to entry come down, it is becoming just another part of every company’s stack.