Cloud Professional ML Engineer

Original article was published by Viren Luke Radhakrishnan on Artificial Intelligence on Medium


My Journey With ML — Part III/IV

ML Engineering, and the Google Cloud Professional Machine Learning Engineer examination

Google Cloud Platform (GCP) is a suite of cloud services offered by Google that runs on the same underlying infrastructure that powers Google’s own apps. These services give clients (independent users and organizations) access to robust infrastructure, serverless tools, and enterprise-grade software, with minimal installation and management overhead. GCP competes with Microsoft’s Azure and Amazon’s AWS platforms.

While GCP has a ton of functionality that may power many different industrial applications, the use-case we are concerned with is Artificial Intelligence. As such, we will look at the AI-related tools offered by the platform in this article.

Furthermore, Google Cloud provides various certification paths for developers, engineers, and architects to showcase their prowess with building on the cloud. In August, Google released the Cloud Professional Machine Learning Engineer (PMLE) exam in beta, which I took and passed. As of October 15th, 2020, the exam is no longer in beta; i.e., it is available to everyone. This article will cover preparing for the exam, the costs associated with training and certification, and the pros and cons of getting certified.

While one may showcase their skills with TensorFlow in the form of projects, it is nearly impossible to demonstrate the same competence when working with the cloud; after all, how does one demonstrate their ability to use a tool? Therefore, I strongly urge you to consider taking this exam, even if you may not have wanted to take the TensorFlow Developer Certificate exam.

Developer Or Engineer?

According to Google, the ML code often makes up only about 5% or less of the code in a production ML system. Building and training a model is essential, but many other processes make up an ML pipeline. These include, but are not limited to, Data Collection, Data Verification, Feature Extraction, Model Validation, Model Deployment, and Model Automation.

Source: https://developers.google.com/machine-learning/crash-course/production-ml-systems
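To make the stages above concrete, here is a minimal sketch of a pipeline in plain Python. Everything in it is a made-up illustration: the function names, the toy study-hours data, and the error threshold are my assumptions, not anything from Google's material. Notice how small `train_model` is compared to everything around it.

```python
from statistics import mean

def collect_data():
    """Data collection: return raw (hours_studied, exam_score) records (toy data)."""
    return [(1.0, 52.0), (2.0, 61.0), (3.0, 70.0), (4.0, None), (5.0, 88.0)]

def verify_data(records):
    """Data verification: drop records with missing values."""
    return [r for r in records if None not in r]

def extract_features(records):
    """Feature extraction: split records into inputs and targets."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    return xs, ys

def train_model(xs, ys):
    """The 'ML code': fit y = slope * x + intercept by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

def validate_model(model, xs, ys, max_mae=5.0):
    """Model validation: accept the model only if its mean absolute error is small.
    For brevity this checks the training data; a real pipeline would use held-out data."""
    slope, intercept = model
    mae = mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))
    return mae <= max_mae

def deploy_model(model):
    """Model deployment: expose the model as a callable prediction function."""
    slope, intercept = model
    return lambda x: slope * x + intercept

def run_pipeline():
    """Model automation: run every stage in order, refusing to deploy a bad model."""
    records = verify_data(collect_data())
    xs, ys = extract_features(records)
    model = train_model(xs, ys)
    if not validate_model(model, xs, ys):
        raise ValueError("model failed validation; not deploying")
    return deploy_model(model)

predict = run_pipeline()
print(round(predict(6.0), 1))
```

In a real system each of these functions would be its own service or job, which is exactly where the cloud tooling discussed below comes in.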

If you’ve followed along with the series so far, you will now realize that there is a lot more to ML than building a Keras model and training it on your laptop.

This is the difference between being an ML Developer and an ML Engineer.

You cannot call yourself an ML Engineer until you are thoroughly familiar with every stage in the ML pipeline, not just the model building. These skills matter because there’s no use in having a trained Keras model on your laptop if nobody can do anything with it.

What Google Cloud Platform Provides

So what tools does GCP offer to alleviate these concerns? Let’s take a look at some of them below, particularly from the perspective of an ML Engineer:

  • Google Cloud Storage lets you store structured and unstructured data as objects in buckets.
  • Pub/Sub allows you to have clients (say IoT devices) publish data to topics that can be subscribed to and streamed from other GCP tools.
  • Dataflow, a managed service for running Apache Beam pipelines, lets you preprocess batch and streaming data before model training and may also be used to post-process the model prediction results.
  • Dataprep lets you explore your data visually and performs any specified transformations with Dataflow under the hood.
  • BigQuery gives you the power to write SQL queries against large volumes of structured data and quickly learn patterns from it.
  • AI Platform Notebooks, Jobs, and Models give you the ability to experiment with, train, and deploy models rapidly (this is the part you must be quite familiar with by now).
  • Kubeflow Pipelines, implemented as a layer over Kubernetes, allows you to write portable, platform-agnostic ML pipelines that can be run and deployed anywhere, including your laptop or a different cloud provider.
  • Cloud Composer, a managed Apache Airflow service, gives you a way to schedule and orchestrate all the other GCP components in harmony.
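If the publish/subscribe idea in the list above is new to you, the toy model below may help. It is a plain-Python, in-memory sketch of the pattern only; `ToyBroker`, the topic and subscription names, and the `preprocess` step are all hypothetical and are not the Pub/Sub or Dataflow client APIs. A device publishes readings to a topic, each subscription gets its own copy, and a preprocessing step cleans a pulled batch before it would go to training.

```python
from collections import defaultdict, deque

class ToyBroker:
    """In-memory stand-in for a message broker; NOT the real Pub/Sub client API."""

    def __init__(self):
        # topic name -> {subscription name: queue of undelivered messages}
        self._subscriptions = defaultdict(dict)

    def subscribe(self, topic, name):
        """Attach a named subscription to a topic."""
        self._subscriptions[topic][name] = deque()

    def publish(self, topic, message):
        """Deliver a copy of the message to every subscription on the topic."""
        for queue in self._subscriptions[topic].values():
            queue.append(message)

    def pull(self, topic, name, max_messages=10):
        """Remove and return up to max_messages from one subscription."""
        queue = self._subscriptions[topic][name]
        return [queue.popleft() for _ in range(min(max_messages, len(queue)))]

def preprocess(readings):
    """Preprocessing step: drop missing readings, convert Celsius to Kelvin."""
    return [round(r["celsius"] + 273.15, 2) for r in readings if r["celsius"] is not None]

broker = ToyBroker()
broker.subscribe("sensor-readings", "preprocessing")
broker.subscribe("sensor-readings", "archive")

# A hypothetical IoT device publishes three readings, one of them invalid.
for value in (21.5, None, 23.0):
    broker.publish("sensor-readings", {"device": "thermostat-1", "celsius": value})

batch = broker.pull("sensor-readings", "preprocessing")
features = preprocess(batch)
print(features)
```

The point of the design is decoupling: the device never needs to know who consumes its data, so you can add an archive or a second model later without touching the publisher.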

GCP also provides solutions that let you work with AI without requiring you to write any code. These allow you to effortlessly create ML systems, which are quite powerful but may not have the custom functionality you need.

One of the best things about GCP is how well unified the entire ecosystem is. Every service works beautifully with the others, thereby allowing you to easily string a chain of services together when building a pipeline.

“Where Do I Begin?”

The details of the exam are entirely confidential. The content below is not meant to hint at what’s coming in the exam but to guide you towards learning the material you need to succeed as an ML Engineer, which just so happens to be what the exam tests you on.

Please note that I am in no way sponsored by or affiliated with anyone; everything I recommend here is from my research and experience. Also, I am in no way responsible for your performance in the exam, and following the steps below is at your own risk.

According to the PMLE page, the exam’s recommended training is the GCP Big Data and Machine Learning Fundamentals course, the Machine Learning with TensorFlow on GCP Specialization, and the Advanced Machine Learning with TensorFlow on GCP Specialization, all on Coursera.

Save this to your computer. It may help you organize your preparation!

At the time of this writing, portions of the above courses are outdated. These Specializations were initially released in 2018, and a lot has changed since then. If my observations are correct, they will be revamped in the coming months/years. Until then, here’s what you need to know:

  • If you followed the learning path in the previous part of this series, you could effectively skip most of the TensorFlow portions within these Specializations. They use TensorFlow 1.x, and we’re currently at 2.x, which has changed much of how we do things in TensorFlow. However, you must still familiarize yourself thoroughly with Estimators, because as of today, Keras models do not support Distributed Training and TFX (if you didn’t understand a word I said, don’t worry, you will once you start).
  • In Advanced Machine Learning with TensorFlow on GCP, you can entirely avoid the first course; it’s just a recap of the courses covered in Machine Learning with TensorFlow on GCP.
  • Provided you followed the learning path in the previous part of this series, you can also skip most of the third, fourth, and fifth courses in Advanced Machine Learning with TensorFlow on GCP. In the third course, only complete week 2, from Going Deeper Faster till the end of the week. In the fourth course, only complete the sections on AutoML and Dialogflow. In the fifth course, only complete week 1 up to Factorization Approaches and week 2’s Building an End to End Recommendation System section.
  • Do note that many of the Labs in these Specializations are outdated and can thus be skipped.

If you want the Course and Specialization Certificates from Coursera, however, you cannot skip anything.

“What’s It Gonna Take?”

  1. Time: I would suggest that you budget about a month per Specialization for your preparation and put in about 2–3 hours of study per day.
  2. Course fees: You can choose to either audit the courses, whereby you will have access to the lecture material for free, but not get a Coursera certificate at the end; or pay for and take up the Specializations. The costs are, in my opinion, reasonable for what you get.
  3. Exam fee: The retail cost of the exam is $200. Is the price justified for what you receive? I believe so, but you will have to decide for yourself (more on this in the next section).
  4. Dedication: A lot of it. Not everything is well documented and explained. You may find considerable gaps in your understanding that will take many a night of reading to grasp.
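For a rough sense of the total, the figures above can be turned into a quick back-of-the-envelope estimate. This is only a sketch under my own assumptions: "a month" is taken as 30 study days, only the two Specializations are counted (not the standalone Fundamentals course), and course fees are left out because they vary.

```python
SPECIALIZATIONS = 2      # the two Coursera Specializations named earlier
DAYS_PER_MONTH = 30      # assumption: "a month" means 30 study days
HOURS_PER_DAY = (2, 3)   # the suggested 2-3 hours of study per day
EXAM_FEE_USD = 200       # retail exam fee quoted above

# One month per Specialization, at the low and high end of daily study time.
low, high = (SPECIALIZATIONS * DAYS_PER_MONTH * h for h in HOURS_PER_DAY)

print(f"Study time: {low}-{high} hours over {SPECIALIZATIONS} months")
print(f"Exam fee: ${EXAM_FEE_USD}")
```

Adjust the constants to match your own schedule; skipping the sections mentioned earlier should bring the hours down.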

Why Go Through This?

You get a certificate that proves your competence at using Google Cloud products to build effective ML solutions.

You also get added to the Google Cloud Credential Holder Directory.

You can read about all the other benefits here and here.

Please note, however, that the certificate by itself may not be enough to help you stand out; projects, internships, and work experience are for that. Furthermore, the credential expires two years after you receive it, and you will have to pay and recertify to retain the title. Is it worth it? You now have what you need to decide. Do note that Microsoft’s Azure and Amazon’s AWS platforms also offer similar certification exams.

In the next and final part of this series, we will take a look at why AI is so important, how to make a difference with it, and how it can get frustrating at times.