Deep Learning- MLGit2Docker End to End Automation

Original article was published on Deep Learning on Medium

In recent years, many companies have decided to invest in machine learning for intelligence and insights. They are recruiting talented data scientists and ML engineers to solve business problems using AI. If you are hired as a data scientist or ML engineer, you will start analyzing the business problem, come up with a model by solving all the advanced x-science and y-math problems, tune it to reach the desired accuracy, and plan to deploy it in a production environment.

It looks very simple, right? But in real-world scenarios it is pretty hard. A recent report from Deeplearning.ai says that “only 22 percent of companies using AI/machine learning have successfully deployed a model”.

Here comes the importance of MLOps. It is a compound of machine learning and operations. Similar to DevOps approaches, MLOps looks to increase automation and improve the quality of production ML.

In this article, I will show you how to automate all the tedious tasks in the MLOps cycle using a step-by-step approach, from Git to deploying our final model in a Flask application.

Workflow of this task:

  1. Initially, we will create a Docker image, via a Dockerfile, with the preinstalled libraries needed for model creation, training, and hyperparameter tuning. Using that image, we will launch the container for ML model deployment.
  2. As soon as the developer pushes the code to GitHub, Jenkins pulls it through Poll SCM. It automatically copies the code from the Jenkins workspace to the mlops folder created in Red Hat Linux.
  3. Check the code: if it contains a deep learning model such as a CNN, launch a container with preinstalled Keras; otherwise launch the respective container (e.g. scikit-learn). Train the model and get the accuracy.
  4. Find the accuracy of the model. If the accuracy is less than the threshold, tweak the architecture by tuning hyperparameters and retrain the model until it achieves the desired accuracy.
  5. If the desired accuracy is achieved, send a mail to the developer indicating the status and accuracy of the model. Then deploy the model in a container with preinstalled Flask and take the Flask application live.
  6. Monitor the model deployment: if the container stops due to any issue, relaunch it from where it stopped previously.
Fig.1: Workflow Automation from Git to Flask

In the above architecture you can see a high-level overview of the complete task. Here we are using Git, Jenkins, Docker, and Flask to automate it.

Why mess with all this theory? Let's get our hands dirty and automate the process.

Step 1: Create Dockerfiles for the deep learning, machine learning, and Flask application images, with Python and the required libraries preinstalled

Fig.2: Dockerfile for krsconimg, sklconimg, flaskdocker images
  • After creating each Dockerfile, build the Docker image using the command below.
docker build -t krsconimg:v1 /root/myapp/docker/
  • The above command builds a container image named krsconimg:v1 (Keras container for deep learning) from the Dockerfile located at /root/myapp/docker/
docker build -t sklconimg:v1 /root/sklapp/docker/
  • The above command builds a container image named sklconimg:v1 (scikit-learn container for machine learning) from the Dockerfile located at /root/sklapp/docker/
docker build -t flaskdocker:v1 /root/mlops/docker/
  • The above command builds a container image named flaskdocker:v1 (Flask container for deploying the model) from the Dockerfile located at /root/mlops/docker/
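The article's actual Dockerfiles appear in Fig. 2. As a rough sketch, a Keras training image such as krsconimg might look like the following; the base image and the package list here are assumptions, not the author's file.

```dockerfile
# Hypothetical Dockerfile sketch for the krsconimg training image.
# Base image and packages are assumptions; adjust to your environment.
FROM centos:7
RUN yum install -y python3 python3-pip && \
    pip3 install --no-cache-dir numpy pandas tensorflow keras
WORKDIR /mlops
# The training script is copied in later by the Jenkins job,
# so the container simply runs it when started.
CMD ["python3", "model_build.py"]
```

The scikit-learn and Flask images would follow the same pattern, swapping the pip package list for scikit-learn or flask respectively.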

Step 2: Create Job 1 in Jenkins (pull the model_build code pushed to GitHub by the developer)

Fig.3 : Job 1 Git pull
Fig.4 : Job 1 build
  • The Poll SCM trigger keeps checking the GitHub repository; if there is any change in the code, it automatically pulls it and copies it to the /root/mlops directory.
  • Here I am using the MNIST handwritten digits dataset for model creation because of its small size and simplicity.
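The Job 1 build step boils down to copying whatever Jenkins pulled into the working folder. Here is a hedged sketch of that "Execute shell" step; the temp directories stand in for the Jenkins workspace and /root/mlops so the sketch runs anywhere, and the sample file is a placeholder.

```shell
# Sketch of the Job 1 build step (paths are stand-ins, not the real ones).
WORKSPACE=$(mktemp -d)      # Jenkins sets $WORKSPACE automatically
DEST=$(mktemp -d)           # stands in for /root/mlops
echo "print('training')" > "$WORKSPACE/model_build.py"   # placeholder for the pulled code

# Copy everything Jenkins pulled from GitHub into the working folder
cp -rf "$WORKSPACE"/. "$DEST"/
ls "$DEST"
```

In the real job the copy target is simply /root/mlops.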

You can access the GitHub repository by clicking the link

Step 3: Job 2 checks the copied code and launches the respective container for model training

Fig.5 : Job2 Trigger
  • As you can see, Job 2 will trigger only if Job 1 succeeds.
Fig.6 : Job2 build
  • Job 2 will check the code, launch the respective container, train the model, and copy the accuracy to the file accuracy.txt.
  • In the model_build.py code, we use a command line argument to tune the model.
Fig.7 : command line argument in model_build.py

Step 4: Job 3 compares the accuracy with the threshold and tweaks the architecture if the accuracy is below it

  • This job will trigger only if Job 2 is successful.
Fig.8 : Job3 build

Note: This is the most vital part of the automation: tweaking the model architecture by adding layers and changing the filters in the CNN.

Here I have set the threshold at 97%. If the accuracy is below the threshold, the job tunes the hyperparameters by adding a Conv2D layer and changing the filters via the command line argument (1, 2, 3, etc.). Refer to Fig. 7.
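The threshold logic above can be sketched as a shell step. The file location, the sample accuracy value, and the retrain command are assumptions; the sketch writes its own accuracy.txt so it runs anywhere, and awk handles the floating-point comparison that plain `[ ]` cannot.

```shell
# Sketch of the Job 3 threshold check (paths and values are stand-ins).
WORKDIR=$(mktemp -d)                     # stands in for /root/mlops
echo "95.8" > "$WORKDIR/accuracy.txt"    # hypothetical accuracy from Job 2

THRESHOLD=97
ACC=$(cat "$WORKDIR/accuracy.txt")
if awk -v a="$ACC" -v t="$THRESHOLD" 'BEGIN { exit !(a < t) }'; then
  STATUS=retrain
  echo "accuracy $ACC below $THRESHOLD: retraining with an extra Conv2D layer"
  # Real job would bump the tweak argument, e.g.: python3 model_build.py 2
else
  STATUS=pass
  echo "accuracy $ACC meets threshold: triggering Job 4"
fi
```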

Fig.9 : Post build of Job3
  • If Job 3 fails, it will trigger Job 6.

Step 5: Job 4 sends a mail to the developer stating the model accuracy and status, and copies model.h5 (the final model) to /root/mlops/git2docker

Fig.10 : Job 4 build

Fig.11: Copy of mail.py
  • When the model attains 97% accuracy, it sends a mail to the developer like this:
Fig.12 : Mail sent
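In the article a small mail.py script does the actual sending; as a hedged sketch, the Job 4 step only needs to compose a message like the one in Fig. 12. The accuracy value, wording, and recipient are placeholders here, and the send command is left as a comment.

```shell
# Sketch of the Job 4 notification (values and wording are placeholders).
ACC=97.4   # hypothetical final accuracy, read from accuracy.txt in the real job
MAIL=$(printf 'Subject: Model build succeeded\n\nModel trained with accuracy %s%%. model.h5 copied to /root/mlops/git2docker.\n' "$ACC")
echo "$MAIL"
# Real job would send it, e.g.: python3 mail.py  (or pipe $MAIL to sendmail)
```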

After copying model.h5 to git2docker (the Flask application folder), the folder tree structure will look similar to this.

Fig.13 : Git2docker folder

Step 6: Job 5 deploys the final model in the Flask application container and makes it live

Fig.14 : Job5 build
  • After deploying the Flask application in the container flaskv1, we can access the application at the URL http://127.0.0.1:24000/
Fig.16 : Git2docker app
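The Job 5 launch can be sketched as a single docker run. The host port 24000 comes from the article's URL; the container port, the bind-mount path, and the flags are assumptions, and the command is only echoed here so the sketch runs without Docker.

```shell
# Sketch of the Job 5 deploy command (ports and mount path are assumptions).
RUN_CMD="docker run -d --name flaskv1 -p 24000:5000 -v /root/mlops/git2docker:/app flaskdocker:v1"
echo "would run: $RUN_CMD"
```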

When you upload an image and press the "Analyze your image" button, our app predicts the result, and it will appear similar to this:

Fig.17 : Predicted results

Step 7: Job 6 monitors krsconv1 and relaunches it if it fails due to any issue

  • Job 6 keeps monitoring krsconv1 and relaunches the container if it stops.
Fig.18 : Job6 build
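The Job 6 watchdog can be sketched as a simple status check. In the real job the running state would come from `docker ps -q -f name=krsconv1`; here RUNNING_ID simulates that output so the sketch runs without Docker, and the restart command is left as a comment.

```shell
# Sketch of the Job 6 watchdog (RUNNING_ID simulates docker ps output).
CONTAINER=krsconv1
RUNNING_ID=""    # empty here simulates a stopped container
if [ -z "$RUNNING_ID" ]; then
  STATUS="relaunching $CONTAINER"
  # Real job: docker start "$CONTAINER"  (resumes the stopped container)
else
  STATUS="$CONTAINER is healthy"
fi
echo "$STATUS"
```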

The overall job flow will look like this, and can be visualized using the Build Pipeline view.

Fig.19 : MLgit2docker_flask build pipeline

Wrapping up:

In this article we automated an MLOps task using Git, Docker, Jenkins, and Flask. This flow can be further enhanced using Kubernetes, by tuning more hyperparameters, and by testing the code against more conditions in Jenkins.

I hope you enjoyed this article. Please share your valuable feedback and ideas.

Keep Learning …..

Keep Sharing …..