My Journey as a Data Scientist

Original article was published on Artificial Intelligence on Medium


Hello and welcome, ‘eager young minds’!

These days I get a lot of questions from people with different backgrounds about pursuing an MS in Data Science (DS). The right path to a master’s depends on several factors. Here I share my own experience and perspective on that journey, in the hope that it helps anyone considering it.

First of all, a little bit of an introduction about myself. I’m Nisarg Dave. I’m currently working as a Data Scientist at Group K Diagnostics, a healthcare firm based in Philadelphia, PA, United States. I completed my master’s in Data Science at Michigan Tech. Back in India, I earned my Bachelor of Technology in Information Technology from Nirma University, Ahmedabad, Gujarat.

Through this platform, I am trying to help those who have doubts about pursuing an MS abroad. First of all, I want to make one thing very clear:

  • Do not pursue an MS in the ‘XYZ’ field just because it is in high demand or pays a very good salary. These things matter, but in my view what matters most is your interest and passion.

Generic guidelines for MS

  • Identify the really good programs. The definition of good is different for everyone. The major things to consider are education quality, research, and job prospects.
  • The main intention of most people pursuing an MS is to gain more knowledge for their personal growth. So give that priority and shortlist programs that can add more value to your career.
  • Another fact: the university doesn’t matter. All that matters is your dedication and passion. Getting into a prestigious university doesn’t guarantee anything. You need to work hard and put in all your effort to succeed. Always remember, ‘No Free Lunch’!
  • You need to think thoroughly about your plan for the next 5 years before making a decision. Ask yourself: Why do you want to do this? What do you want to become? Finally, work on how you will become the person you want to be.
  • Building your profile is most important. Your GRE, TOEFL, or GPA scores don’t guarantee anything. You need to make your profile stronger by adding research papers, pet projects, industry experience, or internship stints.
  • Perform thorough university-related research. You need to browse the course structure, types of courses, types of research projects, and professors’ profiles.
  • Contact alumni to learn more about job fairs, opportunities, and other job search-related aspects.

There are other important factors apart from these:

  • Choose the right university by weighing all the above factors for your own case.
  • Select universities as per your profile. You can group them into easy, moderate, and ambitious!
  • Always consider the immigration aspect of the country. In the USA, the F-1 student visa has certain limitations. For USA-based universities, I would recommend enrolling in a STEM MS program.
  • Always remember that the university’s ranking and reputation aren’t everything! There are a lot of other factors that are much more important.
  • LOCATION matters the most! For job searching, location plays a major role. Always do a lot of research about the job scenario in nearby areas.
  • Finance matters! Do the calculations and choose the economical option. In the end, what matters is an easy return on your investment.

Now I’ll explain my journey in a little more detail. The intention here is to help you learn something from my story.

My Story

Ahmedabad & Nirma University,

  • Nirma is one of the prestigious universities in Ahmedabad, and I was privileged to get in! As a Gujarat state board student, I excelled in 11th and 12th grade. I was the so-called ranker student in my school days. My deep interest in science and technology led me to this path. I took the science stream and decided to build on my technological knowledge as a technologist. I never wanted to be second in anything. This OCD helped me always stay on top 😛

(I studied at Nirma almost for free because of the NITAA merit scholarship)

  • During my Nirma tenure, I started exploring various areas of computer science. I followed the “broadening the knowledge” approach rather than “deepening the knowledge.” I believe that you need to try everything to select what is best for you. You will not know until you try it out. It is extremely important to learn bits and pieces of many areas of your field. After experiencing everything, you can stick to your favorite one!
  • I started trying different things by taking different electives. I was learning in my free time on my own, and I started developing some skills outside academics, for example learning to code iOS apps or learning to code for the Linux kernel. I learned mostly through hands-on practice and books, by self-learning and all by myself. I became an active researcher and an IEEE-published author with a couple of conference papers. One of my research projects was in the area of “System on Chip” and one was in the area of “Digital Systems”. During the Nirma years, I also learned some concepts of game development and blueprint-based coding in Unreal Engine and Unity. (Thanks to my dear friend Ketul Majmudar)
  • I discovered my interest in machine learning when I read the book “Artificial Intelligence: A Modern Approach” by Stuart Russell and Peter Norvig. For me, it was very interesting. Most of the concepts felt very intuitive and I could grasp them easily, and I realized how deep my interest in the world of AI had become. In my 3rd year at Nirma, I took the Machine Learning course and learned it thoroughly. The course that Nirma offered was basic, but I dove deep into the subject in my free time. I went above and beyond by starting to learn deep learning. There were a few other courses that I took in the 3rd and 4th years at Nirma, including data mining, information retrieval systems, big data analytics, and design and analysis of algorithms. Since I was interested in the field of AI and Data Science, I did all of my mini and major projects in that domain. Again, self-learning was the tool here! One of my independent technical research pet projects was on “Differential privacy in machine learning”. I thank Dr. Sapan Mankad and Dr. Priyank Thakkar for their support and help. We collaborated with Sapan Sir to establish the high-performance computing lab at Nirma, mainly for deep learning and GPU computing. I also engaged in AR & VR research under the guidance of Dr. Priyanka Sharma.
  • The result of not limiting myself to a single thing wasn’t bad at all. Data Science is a multi-disciplinary field. It has a wide range of applications across different fields. The experience I gained during my academics helped me a lot in thriving as a data scientist.

Proud Husky at Michigan Tech,

  • I decided to pursue an MS in data science because data science is my passion. One of the reasons was to get myself ready for the industry. I chose Michigan Tech mostly because of its reputation as a good research-oriented US public university. I also loved the work of some professors at Michigan Tech, and that attracted me to become part of MTU.
  • The coursework at MTU is highly flexible and you can customize it as per your interests. I took courses in the domains of AI, data science, business, and psychology. The courses that I enjoyed the most include Adv. AI, Scientific Computing, Computational Intelligence, Management of Innovation & Technology, and Data Mining. I was selected as a graduate teaching assistant for the master’s course Adv. Stat Analysis & Design II, where I had an essential and fruitful experience of teaching, mentoring, and leading students in their research projects.
  • During my Michigan Tech days, I solved some real-world problems using data science and its tools. The Rozsa Performing Arts Center at MTU wasn’t collecting and using data for their business, even though the center had huge potential to add business value that way. I helped them set up a strong data collection mechanism after each show they host or organize. The survey and other data collection mechanisms helped them get useful data. I took the initiative to build the initial machine learning models to provide more useful information and analytics based on their consumers’ behavior. The project was basically about modeling consumer data to understand their preferences, behavior, and interests. Using that, the center can organize or host the shows that interest consumers the most. In a way, I improved their overall operating efficiency. Using the models, they could understand their consumers better.
  • During my Michigan Tech days, I earned a DARPA research fellowship for their Explainable AI program. I worked as a Research Scientist on that project to make “AI models more explainable and less fuzzy”. Thanks to Dr. Shane Mueller, I had the privilege of working with him in this domain to unleash the hidden power of psychology in data science and machine learning. The good part is that, due to the fellowship, I got a whole year of tuition fees waived! In the end, we published our research findings at the Human Factors and Ergonomics Society conference 2020. Over the course of this project, I handled several tasks in different areas. My tendency to learn anything and everything helped me a lot.
  • I enjoyed the Scientific Computing class, as it dealt with solving real-world science problems using high-performance computing or supercomputing. During the course, we helped one of the professor’s friends at Oak Ridge National Laboratory establish an automated workflow for their PDF form generation. Computing can solve many problems if you come up with innovative and creative solutions for each one. Scientific solutions come with the challenge of optimization, consistency, and efficiency to accommodate real-world settings and data. Another beautiful course that I enjoyed a lot was Computational Intelligence. The course deals with fuzzy logic-based algorithms for cases where probability-based algorithms cannot be applied. Another cool aspect of the course was learning neural networks, optimization, genetic algorithms, nature-inspired algorithms (swarm optimization, etc.), and what not! The learning curve was good only because I focused on hands-on work rather than just learning concepts and theory.
  • Michigan Tech was one of the great experiences of my life. I wanted to pursue an MS for the knowledge, for my professional growth, and to become an expert in my domain. Here I followed the “deepening of the knowledge” rule, because a master’s is all about that! Not only this, but I also wanted to come out of my comfort zone and challenge myself with new things. Coming from Ahmedabad to the USA was a big deal, but I became completely independent and capable of handling many things at the same time. There are always pros and cons to each process.

First Industry experience as a Data Scientist at Group K Diagnostics

I have been working for Group K for a year. I joined their Product Development, R&D team as the only Data Scientist on the team. There were a lot of challenges for me. I had never worked in an industrial environment as a sole data scientist. This means I need to get things done without seeking any help. I am responsible for my work, deadlines, and research. Sure, the responsibility was huge, but I have handled it fairly well. Sometimes I got stuck and felt like I had hit a roadblock. But I have always remembered to keep trying till you die trying. Believe me, it works, and there is always a way around.

  • Group K is my first ever experience in the US healthcare industry. Being a data scientist at a healthcare firm is a huge deal. You need to consider data security, data versioning, and efficient, consistent machine learning modeling. You simply cannot afford false positives and false negatives in a diagnosis result.
  • Here I have learned to use AWS for machine learning and DevOps workflows. Plus, I have started getting better at computer vision. We mainly use computer vision and machine learning tools at Group K. As an innovative healthcare startup, we are proudly building Point-of-Care diagnostic devices that can give results within minutes. It is a very advanced micro-fluidic paper-based device solution that combines the power of computer vision and machine learning!
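To make the point about false positives and false negatives concrete, here is a minimal sketch of how they are usually counted and turned into sensitivity and specificity. The labels below are made up purely for illustration (they are not Group K data), and the function names are hypothetical.

```python
# Hypothetical illustration: counting false positives / false negatives
# for a binary diagnostic test. 1 = condition present, 0 = condition absent.

def confusion_counts(actual, predicted):
    """Return (true_pos, false_pos, true_neg, false_neg)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1  # healthy sample flagged as positive
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1  # real case that the test missed
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """Share of real cases that the test catches."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn, fp):
    """Share of healthy samples that the test correctly clears."""
    return tn / (tn + fp) if (tn + fp) else 0.0

# Made-up labels, for illustration only.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In a diagnostic setting both numbers matter: a missed case (false negative) and a false alarm (false positive) each carry a real cost for the patient.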

The one thing to always remember is that, as a data scientist, you are responsible for many things. You need expertise in various areas such as business, computer algorithms, machine learning theory, automation, object-oriented programming, data analysis, and DevOps. And what I have mentioned here are just a few of them. Data Science has a lot to offer, and it is up to each individual to learn as per their interests.

Data Science from my perspective

What do data scientists do?

You have all probably heard somewhere that data scientist is the sexiest job of the 21st century. I would say it is good to be a data scientist, but some things are very hyped in the industry. I believe one has a lot to learn and contribute as a data scientist. It is not just about processing data for information. It is more about adding business value and establishing a fruitful mechanism for strategic advantage. The role of a data scientist varies across industries. Data scientists in fin-tech will tackle different things than data scientists in healthcare. It is more of an application-oriented field where you keep your mind open and creative to solve problems in different areas. Data scientists wear multiple hats. They should be able to perform data cleaning, data munging, data engineering, and data modeling. Ideally, their skill set spans various domains including mathematics, computer science, data analysis, and business optimization.

How to become a data scientist? Can I become a data scientist if I am not from a computer science & engineering background?

Well, there is no special or ideal path to becoming a data scientist. It depends on many things. The first thing you need to be good at is mathematics! Data Science involves a lot of math. I have seen many data scientists in industry who are not from a computer science background. This door is open to anyone and everyone. Ideally, people who are not from a CS background need to learn some core CS concepts such as data structures, design & analysis of algorithms, databases, object-oriented programming in Python, and some level of operating system / Bash usage. It will benefit them in cracking interviews. You can always skip this, but I do NOT recommend that AT ALL. You cannot jump directly to machine learning, AI, or deep learning without learning all these subjects.

  • Another notable way that I figured out is to learn through hands-on practice. Try to implement everything you learn by doing some kind of capstone project.
  • The most important thing I RECOMMEND is to try implementing machine learning algorithms yourself, from scratch, using C++, C, or Java. This approach will give you a full internal understanding of the math and logic behind them. You will TRULY learn the concept or algorithm this way. Using Python packages and calling built-in functions is fine, but you will not understand the algorithm in depth that way. Escaping the coding part will not help here. You need to brainstorm and code in core languages to understand how it is coded internally. The key here is to understand data storage, structure, and flow in relation to all the mathematical logic that’s happening in the background.
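As a small taste of what “from scratch” can look like, here is a minimal sketch of linear regression trained with gradient descent. It is written in plain Python with no packages only to keep it short; that choice, the function names, the toy data, and the hyperparameters are all assumptions for illustration. The same loops carry over almost line by line to C++, C, or Java.

```python
# Hypothetical sketch: simple linear regression y = w * x + b,
# trained with batch gradient descent using only built-in Python.

def predict(w, b, x):
    return w * x + b

def mean_squared_error(w, b, xs, ys):
    n = len(xs)
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / n

def fit(xs, ys, learning_rate=0.05, epochs=2000):
    """Return (w, b) that approximately minimize the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict(w, b, x) - y
            # Partial derivatives of the mean squared error w.r.t. w and b.
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit(xs, ys)
print(f"w={w:.4f} b={b:.4f} mse={mean_squared_error(w, b, xs, ys):.2e}")
```

Writing the gradient by hand like this forces you to see where each number is stored and how it flows through the update, which is exactly what a one-line library call hides.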

I have this guide with some useful resources and recommendations to get you started.

(Please pardon me if some links are broken, as I last edited it years ago!)

Job search perspective for data scientists

As a fresher, it is tricky to get into the industry as a data scientist. Even after pursuing an MS, I would say it is challenging. Usually, the data scientists working in the industry right now have 2–3 years of experience as a software developer or similar. The industry requires a minimum of 1–2 years of experience. Some companies hire freshers, but those positions are competitive to get into.

  • If you are a fresher, do not EVER think that you cannot get into the industry as a data scientist. Certainly, you CAN. You need to put in some extra effort to make your profile better. You can work in academics during your master’s. You can take part in interesting data science projects at your graduate school under different professors. Working as a research scientist or assistant helps in developing a good profile. Try to find some internships in the domain of data science and machine learning. You can also work on research projects and publish your work in a well-known conference or journal. This will surely increase your chances of getting a call from good companies for data science-related positions.

Conclusion

Just as passionate as it looks 🙂

I am a ‘pakka Amdavadi’ who comes from a normal middle-class family. It was a tough decision for me to leave my city, my people, and everything behind. I left everything for some uncertain things that I wanted to achieve. Knowledge is what matters the most to me. I didn’t gain any work experience in India. But I did a lot of projects, research work, and practice to achieve my milestone. Now, here I am working as a data scientist. I am still growing, I am still learning.

I have invested this time to write my story so that you can get a sense of the challenges that you might face. Passion and dedication are the key, no matter how uncertain this world can be. Keep hanging in there and put in all your effort. I wish you all the best!

I hope this helps! I have tried to cover many things, but if you have doubts, feel free to text me or email me. You can always reach out to me on LinkedIn or Twitter.