Machine Learning with AWS SageMaker

Source: Deep Learning on Medium

This article is a summary of a talk I gave at Webstep’s yearly “Kompetensbio” event. Every year Webstep invites developers to this free event, held in a nice local cinema, where they can enjoy interesting tech talks and watch an exciting movie. This year, for the first time, the event took place in three cities: Uppsala, Malmö and Stockholm.

Introduction

I have been a big science-fiction fan ever since I was a child. Growing up in a small town in a former Eastern Bloc country was not a lot of fun, especially if you were smart and curious. Questioning authority was not allowed, and neither was asking a lot of questions.

For a boy with a rich imagination and a lot of questions, the only way out was to find other worlds, worlds where anything was possible and everything was allowed. In the worlds of Arthur C. Clarke, Isaac Asimov and others, I felt at home.

Trying to imagine all those distant planets and advanced societies with superior technology, I could not help feeling a bit sad, because I thought all of that lay in a very distant future. And yet here we are today, talking about and experiencing some of those technologies: artificial intelligence, self-driving cars, real-time image recognition systems and more.

We are truly living in very exciting times, and these fantastic new technologies are opening up a lot of opportunities for everyone to get involved. As with anything, there is a lot of misinformation and noise. For that reason, in the following text I will try to debunk some myths around AI and answer the following questions:

  • What is Machine Learning?
  • Why is it relevant to you and why should you care?
  • How does it work?
  • When should it be used, and when not?
  • How can it be done in an effective way?
  • How can AWS SageMaker help with that?

The Hype

There is a lot of hype these days around machine learning, artificial intelligence and related technologies. As always with the media, “there is no better news than bad news”, and nothing gets our attention better than fear. That is why reporters keep writing about “killer robots” that are either going to destroy us all, or at least take all our jobs and leave us homeless on the street.

Honestly, nothing could be further from the truth. Those advanced war machines are far away in the future, if they ever come to exist at all. I am sure that a lot of regulation and treaties will prevent something like that from ever happening. When it comes to our jobs, things will change, but not as drastically as reporters want us to believe. We will come to that later.

The change

In the last decade we have seen companies like Airbnb, Netflix and Amazon completely transform the industries they are in. Apart from being incredibly successful, all these companies have something else in common: all of them use data to build their business models, and all of them have automated most of the business processes that run their day-to-day operations.

What we are seeing here is exactly what Marc Andreessen famously predicted in his article “Why Software Is Eating the World”. In essence, his point was that eventually every process that can be automated will be automated by putting it into some kind of software, and every company, regardless of its background, will become a software company.

The revolution

These changes are part of a much bigger story that has been going on for centuries. It is the story of the Industrial Revolution, and the one we are in right now is the fourth. Each revolution was started by a significant technological invention: the steam engine in 1760, electricity in the 1820s, computing in 1960 and now AI and ML (machine learning).

Thinking about this for a while, I realized that the very nature of industrial revolutions has changed. We have moved from transforming raw materials into energy that powers industries to transforming raw data into insights and knowledge that power intelligent services and devices. With this, the automation of jobs has changed too. So-called “blue collar” jobs were automated first; now “white collar” jobs are being automated as well.

In the past, every revolution had its winners and losers, and this time is no different. Some estimate that hundreds of millions of jobs will be lost, and that the same or an even higher number of new, different jobs will be created. Trillions of dollars of profit will be made each year by the companies that learn to use machine learning and AI to their advantage.

In my mind, I see this huge change as a big ocean wave. And we all must ask ourselves — where do we want to be when it comes? On the top, surfing and having the time of our lives? Or at the bottom, risking getting crushed by this big wave? Or on the shore watching safely from a distance, but missing all the fun?

Definitions

What is machine learning? There are many definitions, but one sentence from the famous hockey player Wayne Gretzky describes it best. He was not talking about machine learning when he said: “I skate where the puck is going to be, not where it has been.” He was simply answering a reporter who asked how he managed to play so well. In my view, the same principle applies to the very nature of machine learning: trying to predict the next best move in order to win the game.

If we want more formal definitions of AI, Machine Learning (ML) and Deep Learning (DL), we could describe them in this way:

  • AI: techniques which enable machines to mimic human behavior: image recognition, speech recognition, etc.
  • ML: subset of AI using statistical methods to enable machines to learn from experience
  • DL: subset of ML that learns from data using multi-layer neural networks.

There are many more similar definitions, but these are simple enough and they make sense to me.

Algorithms

The best-known types of machine learning algorithms are supervised, unsupervised and reinforcement learning. With supervised learning, the “right answer” is known in advance. The program guesses the answer, compares it to the “right answer” and learns from experience. With unsupervised learning, there is no “right answer”. The program searches for patterns, clusters and anomalies in the data. And with reinforcement learning, the program tries to find the most efficient path to the goal. Success gives a reward; failure produces a restart.
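
As a minimal sketch of the supervised case described above, the loop below guesses an answer, compares it with the known “right answer” and adjusts itself a little after every mistake. The data, the learning rate and the function name train_linear are made up for illustration; a real project would normally use a library instead of hand-written code.

```python
# Supervised learning in miniature: guess, compare with the known answer, adjust.
# All numbers below are illustrative assumptions, not data from the talk.

def train_linear(examples, learning_rate=0.01, epochs=200):
    """Learn a single weight w so that w * x is close to the right answer."""
    w = 0.0  # start with a deliberately bad guess
    for _ in range(epochs):
        for x, right_answer in examples:
            guess = w * x
            error = guess - right_answer      # how far off was the guess?
            w -= learning_rate * error * x    # nudge w to reduce the error
    return w

# Each pair is (input, right answer); the hidden rule here is "answer = 2 * input".
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

learned_weight = train_linear(examples)
print(learned_weight)  # ends up very close to 2.0
```

The program was never told the rule “multiply by two”; it recovered it from the examples alone.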

The code

If we want to understand how these algorithms work, we need to put them into the right context by comparing them to traditional programming. With traditional programming, the inputs are the data and the rules, written by a developer. These inputs give us the desired output, or the “right answer”.

With machine learning, the inputs are the data and the outputs, that is, the “right answers” collected from past experience. These inputs, combined and put through a machine learning algorithm, give us a model: a new algorithm that has learned the rules from the data. This way we get the desired rules without explicit programming.
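
The contrast can be sketched in a few lines of Python. In the first function a developer writes the rule by hand; in the second, the rule (here just a temperature threshold) is derived from data plus the “right answers”. The scenario, the numbers and all function names are hypothetical and only meant to illustrate the idea.

```python
# Traditional programming: data + hand-written rule -> answer.
def rule_by_developer(temperature_c):
    return temperature_c >= 25.0  # the developer decided what "hot" means


# Machine learning: data + right answers -> rule.
def learn_threshold(temperatures, labels):
    """Pick the threshold that classifies the most examples correctly."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(set(temperatures)):
        correct = sum((t >= candidate) == label
                      for t, label in zip(temperatures, labels))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Illustrative data: temperatures and whether people called the day "hot".
temperatures = [12.0, 18.0, 21.0, 26.0, 30.0, 35.0]
labels = [False, False, False, True, True, True]

threshold = learn_threshold(temperatures, labels)


def learned_rule(temperature_c):
    return temperature_c >= threshold  # the rule came from the data


print(threshold, learned_rule(28.0), learned_rule(20.0))
```

The learned threshold plays the role of the model: it is a rule nobody typed in, obtained from examples.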

Prerequisites

Machine learning is not magic or a silver bullet that can solve all problems. It is a tool that can be helpful and effective in some cases, but not in all. So how can we know when to use machine learning and when not? There are some simple rules of thumb that can help us with that.

Use Machine Learning when:

  • Data exists in high quantity and quality. Data is the most important component. Without enough of the right data, it is not possible to build a good machine learning model.
  • Real answers are available and the “right answer” is known.
  • Problem has very complex logic. Simple problems should not be overcomplicated. Use simple rules to solve them rather than applying machine learning.
  • Problem adapts in real time. Sometimes a problem would force us to change the rules very often and in unexpected ways. In such cases, look for a more general solution with the help of machine learning.
  • 100% accuracy is NOT required. Machine learning algorithms are never 100% correct. If that is good enough for your problem, then consider using them. And if your machine learning model is 100% correct, you are doing something wrong!
  • Privacy is NOT an issue. We should consider applying machine learning only when we have permission to use the data for the problem at hand. Otherwise it can be illegal, or at least immoral and unethical, to use such data.

Use cases

Now that we know when to use machine learning, we can name some of the most common use cases. These are examples where machine learning works well and has been proven in practice.

Ranking is about finding the most relevant thing. Think of Google search, for example.

Anomaly detection is about spotting unusual things. Good examples of this use case are credit card fraud detection and preventive maintenance.

Clustering is about grouping similar things together, for example customer segmentation.

Regression is about predicting a numerical value of something. House price prediction is a typical example.

Recommendation is about suggesting the most interesting thing. We have seen this many times, for example in movie recommendations at Netflix.

Classification is about finding what something is. Think of email spam filtering, for example.
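
To make one of these use cases concrete, here is a rough sketch of clustering applied to customer segmentation. It is a very small one-dimensional version of the k-means algorithm; the yearly spend figures and the two starting centers are invented for illustration, and real segmentation would use many more features and a tested library.

```python
# Unsupervised learning: no "right answers", only grouping by similarity.

def k_means_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center
    to the average of its group, and repeat."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # An empty group keeps its old center.
        centers = [sum(group) / len(group) if group else centers[i]
                   for i, group in enumerate(clusters)]
    return centers, clusters

# Hypothetical yearly spend per customer.
yearly_spend = [120, 150, 130, 900, 950, 1000]

centers, segments = k_means_1d(yearly_spend, centers=[0, 500])
print(segments)  # two groups: low spenders and high spenders
```

Nobody labelled the customers in advance; the two segments emerge from the data.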

Machine learning life cycle

How is Machine learning usually done? It always starts with the raw, unrefined data. We have to get the data, clean it, reshape it and transform it. Then we have to perform statistical analysis and exploration to find hidden relations inside the data. These insights help us to find relevant features and filter out the other ones, that do not offer any value for the problem we are solving. This is what we call the build step.

This nicely shaped data is what we call “training data”. It is then used to experiment with different machine learning models until we find the one that gives us the best result. This is achieved through a long process of model evaluation and hyperparameter optimization. This is the train step of the machine learning life cycle.
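
A rough sketch of what model evaluation and hyperparameter optimization mean in practice: try several values of a setting, score each on data the model has not seen, and keep the best one. The example uses a tiny hand-written k-nearest-neighbours classifier where k is the hyperparameter. The data points, including two deliberately noisy labels, are made up for illustration.

```python
# Hyperparameter search in miniature: try each k, keep the best on validation data.

def predict_knn(train, x, k):
    """Majority vote among the k training points closest to x (labels are 0 or 1)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if votes_for_one * 2 > k else 0

def accuracy(train, validation, k):
    correct = sum(predict_knn(train, x, k) == label for x, label in validation)
    return correct / len(validation)

# (feature, label) pairs; the points at 3.0 and 8.0 carry noisy labels on purpose.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 0), (9.0, 1)]
validation = [(1.4, 0), (2.9, 0), (3.2, 0), (5.2, 0),
              (6.6, 1), (7.9, 1), (8.2, 1)]

best_k, best_accuracy = None, -1.0
for k in (1, 3, 5):                      # candidate hyperparameter values
    score = accuracy(train, validation, k)
    if score > best_accuracy:            # keep the first k with the highest score
        best_k, best_accuracy = k, score

print(best_k, best_accuracy)
```

With k=1 the model copies the noisy labels and scores poorly; a larger k smooths them out. Even the best setting does not reach 100%, which is what we should expect.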

Finally, when our favourite model is ready, it has to be deployed so that it can power a service or an application. After that, its performance is evaluated to see if it works according to expectations. The data we collect from monitoring it, and as a result of its functioning, can be used to improve the model even further. This is the deploy step of the process.

For each of the steps (build, train, deploy) we need a server or a cluster, depending on the requirements and the problem. Purchasing hardware, installing an operating system and the required software, and tying everything together can be a big challenge.

The extent of this challenge depends on important factors such as company organisation, culture, technological level and skill level. All these factors can slow down the machine learning life cycle significantly.

The most common problems are that data scientists do not have access to the data, or do not have sufficient computing resources. Developers do not have machine learning knowledge, and data scientists do not have good software engineering skills. Because of the gap between these teams, hand-over and deployment of machine learning models to production is very slow and painful.

By now you must be wondering: is there a better way? Luckily, there is! And that is why we now turn to AWS SageMaker.

AWS SageMaker

This Amazon cloud service was created with one idea in mind: to put machine learning into the hands of every developer, regardless of their knowledge of that area. It provides an easy, fast and scalable way to perform all three usual steps: build, train and deploy.

The build step is done by connecting to other AWS services such as S3 and transforming data in Amazon SageMaker notebooks. The train step is about using SageMaker’s built-in algorithms and frameworks, or bringing our own, for distributed training. Once training is completed, models can be deployed to Amazon SageMaker endpoints for real-time or batch predictions.

Build

We can create a Jupyter notebook instance with the desired server size and capacity in just a few clicks. Once Jupyter is running, data cleaning and exploration can begin. The best feature here is the ability to choose the server size for our notebook instance. We can also automate shutdown of the instance after a period of inactivity and avoid unnecessary costs.
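
The same thing can be done from code instead of the console. The sketch below uses the boto3 SageMaker client and its create_notebook_instance call. The notebook name, instance type, volume size and role ARN are placeholder assumptions; you would substitute values from your own AWS account. The actual AWS call is left commented out so that nothing is created by accident.

```python
# Creating a notebook instance programmatically (a sketch, not a complete setup).

def notebook_request(name, instance_type, role_arn, volume_gb=5):
    """Collect the settings for the notebook server: its name, size and permissions."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,   # this is where the server size is chosen
        "RoleArn": role_arn,             # IAM role the notebook runs under
        "VolumeSizeInGB": volume_gb,
    }

def create_notebook(request):
    # boto3 is imported here so the rest of the sketch runs without AWS access.
    import boto3
    client = boto3.client("sagemaker")
    return client.create_notebook_instance(**request)

request = notebook_request(
    name="exploration-notebook",                                     # hypothetical name
    instance_type="ml.t3.medium",                                    # example size
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # placeholder ARN
)
# create_notebook(request)  # uncomment to actually create the instance
```

The idle shutdown mentioned above is not part of this call; it is commonly configured separately, for example through a notebook lifecycle configuration.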

Train

We can train our models on the right server capacity with the ability to choose the size and number of servers to train on. Starting a server is just one line of code and the server will shut down automatically after the training of a model is complete.
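
Roughly, this is how the train step could look with the SageMaker Python SDK, assuming version 2 of the SDK and a training container image you have chosen yourself. The image URI, role ARN, bucket names and instance type below are placeholders, and the helper names training_settings and train are invented for this sketch. The call that starts the servers is commented out.

```python
# Training on managed servers (a sketch under the assumptions named above).

def training_settings(image_uri, role_arn, output_s3_uri,
                      instance_type="ml.m5.xlarge", instance_count=1):
    """Collect the choices the text mentions: server size and number of servers."""
    return {
        "image_uri": image_uri,            # container with the training algorithm
        "role": role_arn,
        "instance_type": instance_type,    # size of each training server
        "instance_count": instance_count,  # how many servers to train on
        "output_path": output_s3_uri,      # where the trained model is stored
    }

def train(settings, train_s3_uri):
    # Imported here so the sketch can be read and tested without the SDK installed.
    from sagemaker.estimator import Estimator
    estimator = Estimator(**settings)
    # fit() starts the servers, runs the training job and shuts the servers down.
    estimator.fit({"train": train_s3_uri})
    return estimator

settings = training_settings(
    image_uri="<training-image-uri>",                                # placeholder
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # placeholder
    output_s3_uri="s3://example-bucket/models/",                     # placeholder
    instance_count=2,
)
# estimator = train(settings, "s3://example-bucket/train/")
```

The “one line of code” from the text corresponds to the fit call; everything else is configuration.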

Deploy

Here again we can deploy our machine learning model with just one line of code, by specifying the desired server capacity. The resulting endpoint can then be called from an application service or a serverless function.
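
A sketch of both halves: the deploy call from the SageMaker Python SDK (shown as a comment, since it needs a trained estimator) and a serverless function that calls the endpoint through the boto3 sagemaker-runtime client. The endpoint name, the instance type and the assumption that the model accepts comma-separated values are all illustrative; the right content type depends on the model you deploy.

```python
import json

# Deploying a trained estimator is a single call in the SageMaker Python SDK:
# predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")

ENDPOINT_NAME = "example-endpoint"  # hypothetical endpoint name

def build_payload(features):
    """Turn a list of feature values into one CSV line (assumes a CSV-accepting model)."""
    return ",".join(str(value) for value in features)

def lambda_handler(event, context):
    """Serverless function that forwards features to the endpoint and returns its answer."""
    import boto3  # imported here so the sketch can be tested without AWS access
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="text/csv",
        Body=build_payload(event["features"]),
    )
    prediction = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```

The application never sees the model itself; it only knows the endpoint name and the input format.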

Machine learning the AWS SageMaker way

SageMaker gives us the ability to develop and deploy our machine learning models in a matter of days or weeks instead of months or years! It achieves that through easy provisioning of resources for data clean-up and exploration, and a seamless way to provision and scale server endpoints for the deployment of machine learning models.

In this digital age, the only constant is change. We all live in an interconnected world where our feedback and opinions matter. Good or bad customer reviews can make or break a company. The ability to act quickly and to adapt products, features and services to the ever-changing tastes of customers is the way to success for any company.

Machine learning is a very powerful tool that can create a lot of value when used the right way: to experiment and find the best business ideas.

But having the right idea and the right data is not enough. We also need the ability to scale quickly when we find a winning case.

AWS and SageMaker provide the ability to experiment and find the right idea quickly and cheaply, and then to deploy it and scale, all in a single package.