From 0 to Data Scientist



Intro to Machine Learning by Andrew Ng

This short guide is intended to help complete beginners land their first job in data science or machine learning, in four clearly defined steps.

The following resources took me from not knowing what machine learning is to landing my first job as a machine learning researcher, working on a project funded by the European Space Agency (ESA).

The order in which I present the material is not necessarily the order in which I went through each resource, but the order I would follow if I had to start all over again.

1 — Machine Learning by Andrew Ng — Coursera | Stanford University

This course is a MUST!

This is the number one way to get started with machine learning. Since Prof. Andrew Ng released it, over 1.8 million students have enrolled on Coursera.

I put off taking this course for quite some time, thinking that studying from books and online resources would be more effective, but boy was I wrong! 
Andrew Ng starts from the complete basics, although having a good grasp of linear algebra fundamentals does help.

No Python or R knowledge is required!

What makes this course special, apart from Andrew Ng’s excellent teaching skills, are the quizzes and tests, which really helped me understand what was going on!

Topics covered include supervised learning algorithms such as linear regression, logistic regression, SVMs and neural networks. Unsupervised learning algorithms are also included.

2 — Two Excellent Books!

Once you are armed with the fundamentals from Andrew Ng’s course and some knowledge of Python, it is time to start writing some code and reading further. I recommend learning Python in parallel with this step.

The two best books I used were:

(1) Deep Learning with Python

(2) Hands-On Machine Learning with Scikit-Learn and TensorFlow

Both of these books are written by experts in the field of AI and ML, and provide further details, exercises and code implementations of many machine learning algorithms.

It is important to go through the exercises, and write code yourself!

3 — Python

Python for Data Science

Python is not necessary for learning machine learning concepts; however, it has a huge ecosystem of third-party libraries, many of which are aimed at machine learning and data science, such as Scikit-Learn, NumPy and Pandas.

There are also many books on ML and AI with examples written in Python, as well as many online courses that teach with Python.

These are some of the reasons why I recommend learning Python as one of the places to start when it comes to Machine Learning.
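To give a flavour of how the libraries mentioned above fit together, here is a minimal sketch. It assumes NumPy, Pandas and Scikit-Learn are installed, and the tiny "hours studied vs. exam score" dataset and its column names are made up purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up toy data (illustrative only): score = 50 + 2 * hours.
df = pd.DataFrame({
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0],
    "score": [52.0, 54.0, 56.0, 58.0, 60.0],
})

# Pandas holds the table; NumPy arrays are what the model is trained on.
X = df[["hours"]].to_numpy()   # shape (5, 1): one feature per row
y = df["score"].to_numpy()     # shape (5,): one target per row

# Scikit-Learn fits an ordinary least-squares line to the data.
model = LinearRegression().fit(X, y)
slope = float(model.coef_[0])
intercept = float(model.intercept_)

# Predict the score for a hypothetical student who studies 6 hours.
prediction = float(model.predict(np.array([[6.0]]))[0])
print(slope, intercept, prediction)
```

Because the toy data lie exactly on a straight line, the fitted slope and intercept should come out very close to 2 and 50.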

The two sources I recommend for learning Python are Python Crash Course and the official docs.

Most of the modern deep learning frameworks work with Python. This excellent blog covers the topic in greater detail.

Courtesy of towardsdatascience.com

4 — Mathematics


This is an essential step to truly start understanding what is going on under the hood of most machine learning algorithms. Naturally, mathematics is a vast field; however, we are mainly concerned with linear algebra, probability and statistics, and calculus.

This excellent blog covers further details on this point.

Two top books cover most of what you need to know with regard to linear algebra, calculus and statistics:

(1) An Introduction to Statistical Learning with Applications in R
(2) Thomas’ Calculus

It is important to know the mathematical constructs behind common machine learning and data science algorithms such as linear regression, logistic regression, SVMs, tree-based methods and K-means clustering. This also gives you a better intuition when it comes to implementing these algorithms in Python, R, Matlab or any other platform.
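As one example of how the mathematics maps onto code, here is a rough sketch of linear regression fitted by batch gradient descent using only NumPy. The function name, learning rate, iteration count and toy data are my own assumptions for illustration, not something taken from the books above. The calculus shows up in the gradients of the mean squared error, and the linear algebra in the matrix products.

```python
import numpy as np

def fit_linear_regression(X, y, lr=0.1, n_iters=2000):
    """Fit y ~ X @ w + b by batch gradient descent on mean squared error."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        residual = X @ w + b - y                         # prediction error per sample
        grad_w = (2.0 / n_samples) * (X.T @ residual)    # d(MSE)/dw
        grad_b = (2.0 / n_samples) * residual.sum()      # d(MSE)/db
        w -= lr * grad_w                                 # step against the gradient
        b -= lr * grad_b
    return w, b

# Made-up toy data generated from y = 2x + 1 (illustrative only).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

w, b = fit_linear_regression(X, y)
print(w, b)
```

On this noise-free data the learned weight and bias should approach 2 and 1. With real data the learning rate usually needs tuning, and features often need scaling before gradient descent behaves well.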

Hope this helps.

Thanks for reading,

Ryan

Source: Deep Learning on Medium