Original article was published on Deep Learning on Medium
Things I Wish I Had Known:
How to Build a Machine Learning Web App Without REST API
A few weeks back, a friend of mine was working on an object detection project using OpenCV and needed to give it a client-side interface. She struggled with GET/POST requests and eventually gave up on the project.
We’ve all been there at some point, struggling to integrate our ML models into our apps. That was when I realised that many Data Science/Machine Learning enthusiasts might not be aware of a real boon to the industry called Streamlit!
So…What is it?
Streamlit is an open-source app framework that helps data scientists and machine learning engineers to create beautiful, performant apps in only a few hours! All in pure Python. All for free.
And I should use it because?
✓ It is a minimal framework and one of the fastest ways to build powerful data/machine learning apps!
✓ Enables a Machine Learning enthusiast to deploy without using Flask, Django, or other tools.
✓ I don’t have to worry about those callbacks anymore!
✓ I don’t have to worry about those HTML Tags anymore!
✓ Its simplified data caching speeds up computation pipelines!
Let’s Get started instantly!
Streamlit watches for changes on each save and updates the app instantly while you are coding. Code runs from top to bottom, always from a fresh state, with no need for callbacks. It’s an effortless and robust app model that lets you build rich UIs incredibly quickly.
All you need to do is install Streamlit with pip install streamlit and then run streamlit hello in your terminal.
And Hurray! In the next few seconds, the sample app will open in a new tab in your default browser.
So without any further ado, let’s get started and make a fully functional Multiple Classifier Machine Learning Web App.
Loading and Splitting our Dataset
In this step, we load our mushrooms dataset from the mushrooms.csv file and split it in the usual 7:3 ratio (70% for training and 30% for testing). It is a binary classification dataset on whether a mushroom is edible or poisonous. You can find the dataset here.
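A minimal sketch of this step might look like the following. It assumes the target column is named "class" and that every column is categorical (so each one is label-encoded); the function names load_data and split, and the fixed random_state, are illustrative choices rather than the article's exact code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder


def load_data(csv_path):
    """Read the CSV and label-encode every column (assumes all are categorical)."""
    data = pd.read_csv(csv_path)
    for column in data.columns:
        # A fresh encoder per column maps each category to an integer code.
        data[column] = LabelEncoder().fit_transform(data[column])
    return data


def split(df, target="class"):
    """Return x_train, x_test, y_train, y_test using a 70/30 split.

    The target column name "class" is an assumption about the CSV layout.
    """
    y = df[target]
    x = df.drop(columns=[target])
    return train_test_split(x, y, test_size=0.3, random_state=0)
```

In the app itself, you would call load_data("mushrooms.csv") once and pass the result to split; both functions are natural candidates for the @st.cache decorator.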
@st.cache() is a caching mechanism that allows our web app to stay responsive even when loading data from the web, manipulating large datasets, or performing expensive computations.
Every time you mark a function with the @st.cache decorator, it tells Streamlit, “Hey, you need to check a few things!” like:
- The input parameters of the function
- The value of any external variable used in the function
- The body of the function and
- The body of any function used inside the cached function.
Streamlit keeps a record of changes in these components through hashing. Think of the cache as a simple in-memory key-value store, where the key is a hash of all of the above and the value is the actual output object passed by reference.
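To make the key-value idea concrete, here is a simplified sketch in plain Python. It is not Streamlit's implementation: the hypothetical simple_cache decorator only hashes the function's bytecode and its arguments, and ignores external variables and nested function bodies.

```python
import functools
import hashlib
import pickle

_cache = {}  # in-memory key-value store: hash -> output object


def simple_cache(func):
    """Memoise func, keyed on a hash of its bytecode and its arguments."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        hasher = hashlib.sha256()
        hasher.update(func.__code__.co_code)  # stands in for "the body of the function"
        hasher.update(pickle.dumps((args, sorted(kwargs.items()))))  # input parameters
        key = hasher.hexdigest()
        if key not in _cache:
            _cache[key] = func(*args, **kwargs)  # cache miss: run the function
        return _cache[key]  # the stored object itself is returned, not a copy
    return wrapper


calls = []  # records every time the function body actually runs


@simple_cache
def slow_square(n):
    calls.append(n)
    return n * n


first = slow_square(4)   # computed
second = slow_square(4)  # served from the cache, the body is not re-run
```

Note that newer Streamlit releases deprecate st.cache in favour of st.cache_data and st.cache_resource, so check which decorator your installed version expects.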
We imported two curve plots and a confusion matrix from the sklearn library. We pass plot_metrics one or more of the options selected by the client. The role of each evaluation metric is as follows:
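As a rough sketch of how a helper like plot_metrics could dispatch on the selected options, the code below computes the numbers behind each plot. The text does not name the two curves, so the ROC and precision-recall curves are an assumption here, as are the option labels and the hypothetical function name compute_metrics; the plotting itself is left out.

```python
from sklearn.metrics import confusion_matrix, precision_recall_curve, roc_curve


def compute_metrics(selected, y_true, y_pred, y_score):
    """Compute the raw values behind each selected evaluation option.

    selected: list of option labels chosen by the client (labels are assumed).
    y_pred:   hard class predictions; y_score: scores for the positive class.
    """
    results = {}
    if "Confusion Matrix" in selected:
        # Rows are true classes, columns are predicted classes.
        results["Confusion Matrix"] = confusion_matrix(y_true, y_pred)
    if "ROC Curve" in selected:
        # True positive rate against false positive rate over all thresholds.
        fpr, tpr, _ = roc_curve(y_true, y_score)
        results["ROC Curve"] = (fpr, tpr)
    if "Precision-Recall Curve" in selected:
        # Precision against recall over all thresholds.
        precision, recall, _ = precision_recall_curve(y_true, y_score)
        results["Precision-Recall Curve"] = (precision, recall)
    return results
```

Only the options the client ticked are computed, which keeps each rerun of the app cheap.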
Glad you made it this far! Now we’ve come to the most exciting part: we will give our client the option to choose between multiple classifiers!
For this application, we are going to employ a Support Vector Machine (SVM), Logistic Regression and a Random Forest Classifier.
Support Vector Machine (SVM)
In a Support Vector Machine, we aim to maximise the margin. Remember: the larger the value of the regularisation parameter (C), the smaller the margin.
If the client chooses the SVM, they can customize hyperparameters such as the regularisation parameter (C), the kernel and the kernel coefficient through built-in widgets like radio buttons, multi-select and number input.
After customizing the hyperparameters, we are ready to train our SVM by clicking the Classify button on the sidebar.
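The SVM flow could be sketched roughly as follows. The widget labels, value ranges and helper names (svm_sidebar_params, train_svm) are illustrative assumptions, and synthetic data from make_classification stands in for the mushrooms dataset so the training part runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def svm_sidebar_params():
    """Collect SVM hyperparameters from sidebar widgets (run via `streamlit run`)."""
    import streamlit as st  # imported lazily so train_svm works without Streamlit

    c = st.sidebar.number_input("C (Regularisation parameter)", 0.01, 10.0, value=1.0, step=0.01)
    kernel = st.sidebar.radio("Kernel", ("rbf", "linear"))
    gamma = st.sidebar.radio("Gamma (Kernel Coefficient)", ("scale", "auto"))
    return {"C": c, "kernel": kernel, "gamma": gamma}


def train_svm(params, x_train, y_train, x_test, y_test):
    """Fit an SVC with the chosen hyperparameters; return (model, test accuracy)."""
    model = SVC(**params)
    model.fit(x_train, y_train)
    return model, model.score(x_test, y_test)


# Synthetic stand-in data (an assumption, not the mushrooms dataset).
x, y = make_classification(n_samples=200, n_features=8, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)
model, accuracy = train_svm({"C": 1.0, "kernel": "rbf", "gamma": "scale"},
                            x_train, y_train, x_test, y_test)
```

Inside the app, you would typically wrap the train_svm call in a check such as if st.sidebar.button("Classify"): so that training only happens when the client clicks the button.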
Logistic Regression
Logistic Regression is a Machine Learning classification algorithm used to predict the probability of a categorical dependent variable.
If the client chooses Logistic Regression, they can customize hyperparameters such as the regularisation parameter (C) and the number of iterations through built-in widgets like slider and number input.
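A small sketch of the Logistic Regression case, again on synthetic stand-in data rather than the mushrooms dataset. The hyperparameter values are placeholders for whatever the client picks in the sidebar.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption, not the mushrooms dataset).
x, y = make_classification(n_samples=200, n_features=8, random_state=0)

# C and max_iter mirror the two hyperparameters exposed to the client.
model = LogisticRegression(C=1.0, max_iter=200)
model.fit(x, y)

# predict_proba returns one row per sample and one column per class.
probabilities = model.predict_proba(x[:5])
```

In scikit-learn, C is the inverse of the regularisation strength, so smaller values mean stronger regularisation.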
Random Forest Classifier
The Random Forest Classifier is considered a highly accurate and robust method because of the number of decision trees participating in the process. It builds decision trees on randomly selected data samples, gets a prediction from each tree and selects the best solution through voting.
If the client chooses the Random Forest Classifier, they can customize hyperparameters such as the number of trees in the forest, the maximum depth of a tree and whether to use bootstrap samples when building trees, through built-in widgets like radio buttons and number input.
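The three Random Forest hyperparameters map onto scikit-learn roughly as sketched below. The values are placeholders for the client's choices, and the data is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (an assumption, not the mushrooms dataset).
x, y = make_classification(n_samples=200, n_features=8, random_state=0)

model = RandomForestClassifier(
    n_estimators=100,  # number of trees in the forest
    max_depth=5,       # maximum depth of each tree
    bootstrap=True,    # draw bootstrap samples when building trees
    random_state=0,
)
model.fit(x, y)
predictions = model.predict(x[:5])
```

One detail worth knowing: scikit-learn's implementation combines the trees by averaging their predicted class probabilities rather than counting one hard vote per tree.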
You will have something like this in your browser:
Now you can see the real power of Streamlit: someone with no formal machine learning or coding background can use your web app, with simple touch and click controls, to train models and see how different classifiers perform.
There were only a few built-in functions that were put into use for this basic project. You can always explore more and extend the functionality of your Machine Learning Web App.
For the full code of this project, you can refer to my GitHub.
Stay Safe Until Next Time:3