Source: Deep Learning on Medium
How to become a data scientist?
Since the discussion is about becoming a data scientist, let me break it into sub-questions.
Sub-question 1: Should I go into the data science field?
Sub-question 2: How should I approach my data science journey?
Sub-question 1: Should I go into the data science field?
Answer: The very first thing to check is whether there is any overlap between what you currently do and what happens in the data science/analytics/machine learning space. Assuming you are starting from zero, let me put it simply: machine learning is a way to make machines learn from data.
To better understand machine learning, you can refer to this YouTube video from Andrew Ng (keep this name in mind; he will help you get better at any stage of your data science journey).
Lecture 1.1 — Introduction What Is Machine Learning — [ Machine Learning | Andrew Ng ]
Having said that, your life will revolve around data and the code used to make machines learn.
If you aspire to be a data scientist, your affinity for data and coding should be high.
You should be able to answer sub-question 1 by now.
Sub-question 2: How should I approach my data science journey?
The very first thing to ensure here is that you cover the breadth of the few things listed below.
- SQL: This is one of the most important skills to have if you want to become a data scientist. How do you improve it, and where can you learn? There are a lot of websites where you can run SQL queries and practice. w3schools is one of my favorites, though there are many more. Link for w3schools: https://www.w3schools.com/sql/tr…
- Coding/Algorithm: You can install RStudio (one of the most sought-after tools in the industry) and start practicing the R language. How to install R and RStudio: links
If you do not want to install R and RStudio for now, you can practice online as well. Below are the links for it.
Below are a few books/links available on the web for free that will help you start getting your hands dirty in R.
- Statistics: This is a skill you must not ignore before jumping into a data science use case. To make your journey smoother, assuming you are a beginner, I advise you to read this book: https://www-bcf.usc.edu/~gareth/…. Make sure you finish this book at least once before moving to the next step. The book has R practice materials as well, which will help you understand the concepts and get better at R.
- Model building: Once you have got your hands dirty with R and SQL and learned a bit of statistics, start building some simple machine learning models in parallel, such as linear regression, logistic regression, and decision trees. You will find packages in R that will run these models for you, but try to understand what is happening internally. If you need data for this, refer to the links below, which give you data for free.
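To make "understand what is happening internally" a bit more concrete, here is a minimal sketch that pulls data with a SQL query and then fits a straight line by hand using ordinary least squares. It is written in Python rather than R, only because the standard library ships with SQLite; the same arithmetic is what a linear regression package does for the one-variable case. The table name `study`, its columns, and the toy values are all made up for illustration.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory SQLite database with a small, made-up table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE study (hours REAL, score REAL)")
    conn.executemany(
        "INSERT INTO study VALUES (?, ?)",
        # Toy values only; the last row has a missing score on purpose.
        [(1, 3), (2, 5), (3, 7), (4, 9), (5, None)],
    )
    return conn


def fetch_points(conn):
    """Use plain SQL to select the rows we want and drop missing values."""
    query = (
        "SELECT hours, score FROM study "
        "WHERE score IS NOT NULL ORDER BY hours"
    )
    return conn.execute(query).fetchall()


def fit_line(points):
    """Ordinary least squares for score = intercept + slope * hours."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    conn = build_toy_db()
    points = fetch_points(conn)
    slope, intercept = fit_line(points)
    print(f"slope={slope}, intercept={intercept}")
```

On this toy table the script prints `slope=2.0, intercept=1.0`, since the four complete rows lie exactly on the line score = 1 + 2 * hours. Once the by-hand version makes sense, running the same data through a package and comparing the coefficients is a good check on your understanding.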
Once you start getting a grasp of how to run a machine learning model, go to forums where multiple people are working on the same data set. Kaggle (https://www.kaggle.com/) is one of the most important platforms. Create a free account and start practicing on their data. The most important things to learn on this platform are what others are doing with the same data and how they are approaching the same problem statement.
If you follow the above steps properly and work through a couple of use cases on Kaggle with in-depth understanding, you can start listing data science as a skill on your resume. Keep practicing, keep learning. There are always new challenges and concepts coming up in the data science world.
Wish you all the best!