Building an AutoML Tool That Anyone Can Use

Original article was published by Zarif Azher on Artificial Intelligence on Medium


Preface: The GitHub repo for the project can be found here, and the app itself is live here.

Artificial intelligence and its subset, machine learning, are becoming ubiquitous, and they are being applied to new challenges in countless domains, from self-driving cars to medical imaging. To lower the barrier to entry, tools are being developed that let people create machine learning models with little to no programming experience. I decided to put my skills to the test and build such an interface myself. Let’s talk about it!

The What

I set out to use Python to build a basic platform that lets a user upload data in CSV or Excel format and then apply basic machine learning algorithms such as Linear/Logistic Regression, Support Vector Machines, etc. It should be simple and easy to use, with no coding required. The user should also be able to download the trained model for later use.
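To make that goal concrete, here is a minimal sketch of the core logic, assuming Pandas and Scikit-learn. The names `MODELS`, `load_table` and `fit_model` are hypothetical and chosen for illustration; they are not taken from the project's repo.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Hypothetical registry: menu labels mapped to factories for scikit-learn estimators.
MODELS = {
    "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
    "Support Vector Machine": lambda: SVC(),
}


def load_table(file_obj, filename):
    """Read an uploaded CSV or Excel file into a DataFrame, based on its extension."""
    name = filename.lower()
    if name.endswith(".csv"):
        return pd.read_csv(file_obj)
    if name.endswith((".xls", ".xlsx")):
        # Reading Excel requires an engine such as openpyxl to be installed.
        return pd.read_excel(file_obj)
    raise ValueError(f"Unsupported file type: {filename}")


def fit_model(df, feature_cols, target_col, model_name):
    """Fit the chosen estimator on the selected feature and target columns."""
    model = MODELS[model_name]()
    model.fit(df[feature_cols], df[target_col])
    return model
```

Keeping the model choices in a dictionary means that supporting another algorithm only takes one more entry.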

The How

There are many popular Python frameworks for creating web apps, such as Flask and Django, but I decided to go with a newer, lesser-known one: Streamlit. Streamlit is built specifically for data scientists and machine learning practitioners to quickly deploy their models and build visualizations. It is highly intuitive, and ‘magically’ takes care of the menial design aspects of developing a web app.

I used widgets such as the file uploader to take in user data and let the user select the columns to be used as features and targets. Then, the user can select the type of model they would like to test and see its results in a classification report and confusion matrix. Finally, they can choose to have their model emailed to them as a Python pickle. As you can imagine, Scikit-learn and Pandas were instrumental here. The email part, however, was quite challenging, as I hadn’t done anything like it before. Additionally, Gmail is pretty finicky about allowing ‘unknown’ sources to send emails, and sending them through SMTP from Python certainly falls into that category.
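Since the email step was the tricky part, here is one way it could be approached using only the Python standard library. This is a sketch under assumptions, not the code from the repo: the function names, subject line and attachment name are made up, and the login step assumes a Gmail account set up for SMTP access, which typically means using an app password.

```python
import pickle
import smtplib
from email.message import EmailMessage


def build_model_email(model, sender, recipient):
    """Pickle `model` and wrap it in an email with a binary attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Your trained model"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("Your model is attached as a Python pickle (model.pkl).")
    msg.add_attachment(
        pickle.dumps(model),  # serialize the fitted model to bytes
        maintype="application",
        subtype="octet-stream",
        filename="model.pkl",
    )
    return msg


def send_via_gmail(msg, username, app_password):
    """Send the message over Gmail's SMTP-over-SSL endpoint."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(username, app_password)  # assumed: an app password
        server.send_message(msg)
```

The recipient can restore the model with `pickle.load`, and should only unpickle files that come from a source they trust.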

Finally, I deployed the project via Heroku, which was a relatively painless process once I grasped the basics.

Conclusion

While there are certainly bugs in the project (you can see a list on the GitHub repo), I think it is a great proof of concept that shows how we are progressing towards an era where anybody can make use of machine learning. This will enable us to solve countless problems around the world more efficiently, and to use tech to empower people. Although there are valid concerns about the field, with the proper regulation and care, the future is looking bright! That’s it for this post, and keep creating!

Find the project deployed at this link: https://stark-waters-72314.herokuapp.com/