Reducing your deep learning costs by up to 75%

Source: Deep Learning on Medium


Every year at AWS re:Invent, we hear about new services from Amazon Web Services (AWS). They usually aim to solve issues facing businesses and developers, and one of the hottest topics is all things machine learning and AI.

One of these issues has been the cost of performing deep learning in the cloud. Back in September 2016, Amazon announced the addition of P2 instances to its Elastic Compute Cloud (EC2) service, aimed at accelerating deep learning and other applications [1]. TensorFlow models, for example, can become large and require a lot of processing power to train. This translates directly into high costs, especially if you use those P2 instance types, where pricing starts at $3.06 USD an hour [2].

In deep learning applications, making predictions with a trained model is called inference. Inference can account for the majority of your compute costs because it typically needs standalone GPU instances, even though it uses far less processing power than training the model on an initial dataset. Even at peak load, a GPU’s compute capacity may not be fully utilized when running inference.

At this year's re:Invent conference, AWS CEO Andy Jassy used his keynote speech to announce a new service, Amazon Elastic Inference, designed to reduce deep learning inference costs by up to 75% [3].
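To put the "up to 75%" figure in perspective, here is a rough back-of-the-envelope sketch in Python. The $3.06 hourly rate is the figure quoted earlier in this article. Everything else is an assumption for illustration: the 730 hours per month, the function names, and the idea that the full 75% saving applies. Actual savings depend on your instance and accelerator mix.

```python
# Rough cost arithmetic only; not based on real Elastic Inference pricing.
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def monthly_cost(hourly_rate_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running one instance continuously for the given number of hours."""
    return hourly_rate_usd * hours


def savings_fraction(baseline_hourly_usd: float, new_hourly_usd: float) -> float:
    """Fraction of the baseline cost saved by moving to the new hourly rate."""
    if baseline_hourly_usd <= 0:
        raise ValueError("baseline rate must be positive")
    return 1.0 - new_hourly_usd / baseline_hourly_usd


baseline_rate = 3.06                       # hourly GPU instance rate quoted in the article
best_case_rate = baseline_rate * (1 - 0.75)  # assumption: the full 75% reduction applies

print(f"Baseline:  ${monthly_cost(baseline_rate):,.2f} per month")
print(f"Best case: ${monthly_cost(best_case_rate):,.2f} per month")
print(f"Saving:    {savings_fraction(baseline_rate, best_case_rate):.0%}")
```

Swap in the on-demand rates from the pricing page [2] for the instance and accelerator you actually plan to use to get a more realistic estimate.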

Amazon Elastic Inference lets you attach low-cost, GPU-powered acceleration to existing EC2 and Amazon SageMaker instances with no code changes. At release, the service supports TensorFlow, MXNet, and ONNX models, with support for more frameworks to be added later.

With Amazon Elastic Inference, you choose the instance type that suits your application based on its CPU and memory requirements, and then separately configure the amount of inference acceleration you need.
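As a minimal sketch of that two-step choice, the Python snippet below builds launch parameters that pick the instance type and the accelerator size independently. The dictionary is shaped like the arguments to boto3's EC2 `run_instances` call, which accepts an `ElasticInferenceAccelerators` list. The helper name `build_launch_params`, the placeholder AMI ID, and the `c5.large` / `eia1.medium` pairing are assumptions for illustration. A real launch also needs networking and IAM setup that is not shown here.

```python
from typing import Any, Dict

# Accelerator sizes available when the service launched
# (assumption: check the current AWS documentation for the up-to-date list).
ACCELERATOR_TYPES = ("eia1.medium", "eia1.large", "eia1.xlarge")


def build_launch_params(image_id: str, instance_type: str,
                        accelerator_type: str) -> Dict[str, Any]:
    """Hypothetical helper: combine an instance type with an accelerator size."""
    if accelerator_type not in ACCELERATOR_TYPES:
        raise ValueError(f"unknown accelerator type: {accelerator_type}")
    return {
        "ImageId": image_id,
        # Sized for the application's CPU and memory needs.
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Sized separately for the amount of inference acceleration needed.
        "ElasticInferenceAccelerators": [{"Type": accelerator_type}],
    }


# "ami-EXAMPLE" is a placeholder, not a real image ID.
params = build_launch_params("ami-EXAMPLE", "c5.large", "eia1.medium")
print(params)
```

With credentials and the remaining launch settings in place, these parameters could be passed along as `boto3.client("ec2").run_instances(**params)`. Keeping the two choices as separate arguments mirrors the point above: you can resize the accelerator without touching the instance type.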

Here at REDspace, we’re an AWS Consulting Partner and have lots of experience developing all sorts of applications leveraging AWS services. Reach out to us to see how we can cut your deep learning costs with Amazon Elastic Inference. www.redspace.com

[1] https://aws.amazon.com/about-aws/whats-new/2016/09/introducing-amazon-ec2-p2-instances-the-largest-gpu-powered-virtual-machine-in-the-cloud/

[2] https://aws.amazon.com/ec2/pricing/on-demand/

[3] https://aws.amazon.com/about-aws/whats-new/2018/11/introducing-amazon-elastic-inference/