Original article was published by Suraj Regmi on Deep Learning on Medium
Why Do You Need Machine Learning?
Where does machine learning excel?
The next time I went to him for professional advice (he was already a software engineering professional), he again recommended that I learn Python. Having graduated and being unemployed, I didn’t have much room to wander, so I took his advice seriously and followed it just as he said. Thanks to his advice and guidance over the years, I was able to switch completely to Python: web development in Django, machine learning in scikit-learn, data preprocessing in pandas, deep learning in TensorFlow, and GUI programming in tkinter. He was an invaluable mentor who made a difference in my life.
I was doing the tasks he assigned me, such as resizing images with PIL, working with file operations, understanding Django’s MVT concept, and scraping websites with Scrapy. It was not an easy ride: everything was new, and it took me months to understand what Python is, how its libraries work, and the other nitty-gritty details.
One day, he was telling me about the work he was doing. He told me how he had used a naive Bayes classifier to classify names by gender. I found it interesting and asked more about it. That was when I was first exposed to machine learning and realized my interest in it. I asked why machine learning was necessary for gender classification, and he answered that computers are a lot better than humans at finding patterns. That was a eureka moment for me.
Why Machine Learning?
Machine learning is not a panacea. It cannot solve every problem. For example, a machine learning model may not generalize well if the training data distribution differs from the testing data distribution. Let’s see where machine learning excels.
- Finding patterns
I find Nepali names quite distinguishable as male or female. Most of the names I have seen ending in “i” or “a” are female, while male names tend to end in a consonant. This is a pattern I have found from my own experience with Nepali names. First, it is a pain to find all such patterns myself. Second, the data may capture far more experience than I have. So, rather than investigating every pattern and spending too much time hand-coding rules, it is far better to have a machine learn the patterns. It saves time, the code is easier to maintain, and there is a good chance it finds more patterns than a human would.
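To make the idea concrete, here is a minimal sketch of a naive Bayes classifier that learns name-ending patterns from labeled examples instead of having them hand-coded. It is not my mentor's implementation. The feature choice (last one and two letters), the function names, and the tiny training list are all assumptions made for illustration, and the list is a hand-picked toy sample rather than a real dataset.

```python
import math
from collections import Counter, defaultdict

# Toy, hand-picked examples for illustration only; not a real dataset.
TRAIN = [
    ("sita", "F"), ("gita", "F"), ("laxmi", "F"),
    ("parbati", "F"), ("sarita", "F"), ("maya", "F"),
    ("ram", "M"), ("shyam", "M"), ("bikash", "M"),
    ("suraj", "M"), ("prakash", "M"), ("rajesh", "M"),
]


def features(name):
    """Hypothetical feature set: the last letter and the last two letters."""
    name = name.lower()
    return [f"last1={name[-1:]}", f"last2={name[-2:]}"]


def train(examples):
    """Count how often each label and each (label, feature) pair occurs."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for name, label in examples:
        label_counts[label] += 1
        for feat in features(name):
            feature_counts[label][feat] += 1
            vocab.add(feat)
    return label_counts, feature_counts, vocab


def predict(model, name):
    """Return the label with the highest log-probability (add-one smoothing)."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(feature_counts[label].values()) + len(vocab)
        for feat in features(name):
            score += math.log((feature_counts[label][feat] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "anita"))   # F
print(predict(model, "dinesh"))  # M
```

Nothing in the code says "names ending in a are female"; that regularity is picked up from the counts. With more data and richer features, the same approach can pick up patterns I would never think to write down.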
- Adjusting to new patterns
The distribution of data is subject to change. The names my parents and grandparents gave their children might differ from the names my contemporaries and I would give. I cannot keep checking for new patterns continuously, so I use machine learning here. The model is retrained at regular intervals, which lets it pick up new patterns.
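One simple way to picture periodic retraining is to refit the model on only the most recent records. The sketch below assumes each record carries a year and uses a deliberately crude model (the majority label per final letter). The function names, the window size, and the records themselves are invented for this sketch: the names are placeholder strings, and the shift shown does not reflect any real naming trend.

```python
from collections import Counter, defaultdict

# Synthetic records (placeholder name, label, year), invented for this sketch.
RECORDS = [
    ("zzu", "M", 1960), ("qqu", "M", 1965), ("vvu", "M", 1970), ("wwu", "M", 1975),
    ("zzu", "F", 2015), ("qqu", "F", 2018), ("vvu", "F", 2020),
]


def fit_last_letter_model(records):
    """Map each final letter to the most common label seen for it."""
    counts = defaultdict(Counter)
    for name, label, _year in records:
        counts[name[-1].lower()][label] += 1
    return {letter: c.most_common(1)[0][0] for letter, c in counts.items()}


def retrain_on_window(records, current_year, window_years):
    """Refit using only records from the last `window_years` years."""
    recent = [r for r in records if current_year - r[2] < window_years]
    return fit_last_letter_model(recent)


stale_model = fit_last_letter_model(RECORDS)
fresh_model = retrain_on_window(RECORDS, current_year=2022, window_years=10)
print(stale_model["u"])  # M: older records dominate the full history
print(fresh_model["u"])  # F: the recent window reflects the newer pattern
```

Running the retraining step on a schedule means nobody has to notice the shift by hand; the trade-off is that a shorter window adapts faster but sees less data.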
- A new paradigm of solving problems
I am Nepali and familiar with Nepali names, so I know where to look for patterns. My domain knowledge of Nepali names makes pattern-finding easier. But names in other languages and cultures might be different, and I might not have a single clue about where to look. Pattern-finding in such names, which would otherwise seem impossible to me, can be done with machine learning. This is even more true when the data is complex, for example, image data. Image classification, where images can vary widely, is practically impossible without machine learning techniques.
- Discovering hidden, new patterns
We humans can learn from what a machine learning model has learned. This way, we discover patterns that were not known to us. For example, in spam filtering, a machine learning model can show us which words and phrases make an email spam or non-spam. Similarly, clustering algorithms help us see which features set clusters apart. Another example, from computer vision, is the class activation map, which shows which regions of an image the model considers important for classification. Such insights can help humans discover new hidden patterns.
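For the spam example, one rough way to read patterns back out of the data is to score each word by how much more likely it is in spam than in non-spam. The sketch below computes smoothed log-odds from word counts; the function name and the six short messages are made up for illustration, and a real filter would use a much larger corpus and a proper model.

```python
import math
from collections import Counter

# Tiny made-up corpus for illustration only.
SPAM = ["win free prize now", "free money now", "claim free prize"]
HAM = ["meeting notes attached", "lunch at noon", "project meeting now"]


def word_log_odds(spam_docs, ham_docs):
    """Score words: positive leans spam, negative leans non-spam."""
    spam = Counter(w for doc in spam_docs for w in doc.lower().split())
    ham = Counter(w for doc in ham_docs for w in doc.lower().split())
    vocab = set(spam) | set(ham)
    # Add-one smoothing so unseen words do not produce log(0).
    spam_total = sum(spam.values()) + len(vocab)
    ham_total = sum(ham.values()) + len(vocab)
    return {
        w: math.log((spam[w] + 1) / spam_total) - math.log((ham[w] + 1) / ham_total)
        for w in vocab
    }


scores = word_log_odds(SPAM, HAM)
ranked = sorted(scores, key=scores.get, reverse=True)
print("most spam-like:", ranked[0])       # free
print("most non-spam-like:", ranked[-1])  # meeting
```

Reading the ranked list is the "learning from the model" step: the words at either end are the ones the data treats as most telling, whether or not a human would have guessed them.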
These are the capabilities of machine learning when it is given enough high-quality data. Machine learning is one of the most powerful techniques for finding patterns, and each of the strengths above matters. Many problems that were once thought unsolvable are starting to find solutions, and on many tasks traditional techniques are being outperformed, thanks to machine learning. My mentor shed some light on why machine learning is the right technique for classifying gender from names. Hopefully, this article gave you some ideas about the capabilities of machine learning.