Full Stack Deep Learning Steps and Tools

Source: Deep Learning on Medium

A summary of what I’ve learned from a course about Full Stack Deep Learning


Hi everyone, how’s everything? Today I’m going to write about what I learned from watching the Full Stack Deep Learning (FSDL) March 2019 course. It is a great online course that teaches how to do a project with Full Stack Deep Learning. What I love most is how it walks through a project and teaches not only how to create the Deep Learning architecture, but also the Software Engineering concerns that come with a Deep Learning project.

When we do a Deep Learning project, we need to know the steps and the technology we should use. Knowing these enhances the quality of the project. This is useful especially when we do the project in a team: we do not want the project to become messy as the team collaborates.

This article focuses on the tools and on what to do at every step of a full stack Deep Learning project according to the FSDL course (plus a few additions about tools that I know). There will be a brief description of what to do at each step. This article only covers the tools that caught my eye in that course. The programming language this article focuses on is Python.


  1. Steps
  2. Planning and Project Setup
  3. Data Collection and Labeling
  4. Codebase Development
  5. Training and Debugging
  6. Deployment
  7. Conclusion
  8. Afterwords


Figure 1 : Steps of doing a Full Stack Deep Learning project

These are the steps that the FSDL course tells us to follow:

  1. Planning and Project Setup
  2. Data Collection and Labeling
  3. Training and Debugging
  4. Deploying and Testing

Each of these steps can move back to a previous step or forward (it is not a waterfall). The course also suggests that we work iteratively, meaning that we start with a small scope and increase it continuously. For example, we start with a simple model on a small dataset, then improve both as time goes by.

Planning and Project Setup

This is the first step you will do. We need to state what the project is going to build and the goal of the project. We also need to state the metrics and the baseline of the project. The substeps of this step are as follows:

Figure 2 : Substeps of planning and project setup

Define Project Goals

First, we need to define what the project is going to build. There are two considerations when picking what to build: Impact and Feasibility.

We need to make sure that the project is impactful. What is the value of the application we want to build? Two questions you need to answer are:

  1. Where can you take advantage of cheap prediction?
  2. Where can you automate a complicated manual software pipeline?

The idea is that the cheap predictions produced by the application we choose to build can create great value by reducing the cost of other tasks.

Feasibility is also something we need to watch out for. Since Deep Learning is centered on data, we need to make sure that the data is available and fits the project requirements and the cost budget.

Figure 3 : The relation of project cost and required accuracy

We need to consider the accuracy requirement and set a minimum target. Since project costs tend to grow superlinearly with the required accuracy, we need to weigh our requirements against the maximum cost we can tolerate. Also consider that there are cases where a failed prediction is not important, and cases where the model must have as low an error as possible.

Finally, we need to assess the problem difficulty: how hard is the project? To measure the difficulty, we can look at published work on similar problems, for example by searching arXiv or conference papers on problems similar to ours. With these, we can grasp the difficulty of the project.

See Figure 4 for more detail on assessing the feasibility of the project.

Figure 4 : Assessing feasibility of ML project

Choose Metrics

Figure 5 : example of metrics. src: https://towardsdatascience.com/precision-vs-recall-386cf9f89488

A metric is a measurement of a particular characteristic of the performance or efficiency of the system.

Since Machine Learning systems work best when optimizing a single number, we need to define a single-number metric that satisfies the requirements, even though there may be many metrics that should be calculated. When there are several metrics we need to use, we pick a formula for combining them. Options include:

  1. Simple average / weighted average
  2. Threshold n-1 metrics, evaluate the nth metric
  3. Domain specific formula (for example mAP)

Here are some examples of combining two metrics (Precision and Recall):

Figure 6 : Combining Precision and Recall
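The first two strategies above can be sketched in a few lines; the precision and recall values below are made up, and the recall floor of 0.7 is just an illustrative threshold:

```python
# Hypothetical measured metrics for one model.
precision, recall = 0.8, 0.6

# 1. Weighted average (here an even split between the two metrics).
weighted = 0.5 * precision + 0.5 * recall

# 2. Domain-specific formula: F1, the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# 3. Thresholding: require recall >= 0.7, then judge the model on precision alone.
meets_recall_floor = recall >= 0.7

print(round(weighted, 4), round(f1, 4), meets_recall_floor)  # → 0.7 0.6857 False
```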

Choose Baseline


After we choose the metrics, we need to choose our baseline. A baseline is an expected value or condition against which our work’s performance will be compared. It gives us a lower bound on expected model performance. The tighter the baseline, the more useful it is.

Figure 7 : Baseline

So why is the baseline important? Why not skip this step? By comparing our model to the baseline, we can measure how good it is, and knowing how good or bad the model is lets us choose what to tweak next.

To look for the baseline, there are several sources that you can use:

  1. External baselines, where you form the baseline from business or engineering requirements. You can also use published results as a baseline.
  2. Internal baselines: use a scripted baseline or create a simple Machine Learning (ML) model, such as a standard feature-based model.

The baseline is chosen according to your needs. For example, if you want a system that surpasses humans, you need to add a human baseline.
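The cheapest possible scripted internal baseline is a majority-class predictor; the sketch below (with made-up labels) shows the mechanics, and any model we train should at least beat this number:

```python
from collections import Counter

def majority_class_baseline(train_labels):
    """Scripted baseline: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _example: most_common

# Hypothetical training and validation labels, just to show the idea.
train = ["cat", "cat", "cat", "dog"]
val = ["cat", "dog", "cat", "cat"]

predict = majority_class_baseline(train)
baseline_acc = sum(predict(None) == y for y in val) / len(val)
print(baseline_acc)  # → 0.75
```

Any trained model scoring below this baseline accuracy is doing worse than a one-line script, which is a strong signal to debug before tuning.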

Setup Codebase

Create the codebase that will be the core of all further steps. The source code in the codebase is developed according to what the project currently needs. For example, if the current step is collecting data, we write the code used to collect the data (if needed). We will come back to this step again and again.

We should make sure that the source code in the codebase is reproducible and scalable, especially when doing the project in a group. To make that happen, you need to use the right tools. This article will cover them later.

Data Collection and Labeling

Figure 8 : Tweet about poll time spent as data scientist

After we define what we are going to create, the baseline, and the metrics, the most painful step begins: data collection and labeling.

Most Deep Learning applications require a lot of data, which needs to be labeled, and most of your time will be consumed by this process. Although you can also use public datasets, the labeled dataset our project needs is often not available publicly.

Here is the substeps:

Figure 9 : substeps of data collection and labeling


We need to plan how to obtain the complete dataset. There are multiple ways to obtain data. One thing you should keep in mind is that the data needs to align with what we want to create in the project.


If the strategy for obtaining data is scraping and crawling websites, we need tools to do it. Scrapy is one tool that can be helpful for the project.


Scrapy is a Python scraping and crawling library that can be used to scrape and crawl websites. It can be used to collect data such as images and text on websites; we can also scrape images from Bing, Google, or Instagram with it. To use this library, learn from the tutorial available on its website. Do not worry, it is not hard to learn.
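Before reaching for Scrapy, it helps to see what a scraper boils down to. The sketch below shows the fetch-and-extract idea with only the standard library (the HTML snippet is made up; a real crawler would fetch pages with `urllib.request` or, better, let Scrapy handle crawling, throttling, and pipelines):

```python
from html.parser import HTMLParser

class ImageSrcParser(HTMLParser):
    """Collects the src attribute of every <img> tag on a page."""
    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)

# In a real crawler this HTML would come from an HTTP response body.
sample_html = '<html><body><img src="/cat.jpg"><img src="/dog.png"></body></html>'
parser = ImageSrcParser()
parser.feed(sample_html)
print(parser.image_urls)  # → ['/cat.jpg', '/dog.png']
```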

After we collect the data, the next problem you need to think about is where to store it. Since you are not doing the project alone, you need to make sure the data can be accessed by everyone. We also need to choose the format in which the data will be saved. Below are solutions for saving data in the cloud.

Object Storage

For storing binary data such as images and videos, you can use cloud storage such as Amazon S3 or GCP to build object storage with an API over the file system. We can also build versioning into the service. See their websites for more detail. You need to pay to use them (there are also free tiers).


Database

A database is used for persistent, fast, scalable storage and retrieval of structured data. It is used to save non-binary data that will be accessed frequently; you will save the metadata (labels, user activity) here. There are several tools you can use; one that is recommended is PostgreSQL.

It can store structured SQL data and can also save unstructured JSON data. It is still actively maintained.
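The course points to PostgreSQL; to keep the sketch dependency-free, the same pattern of structured rows plus a JSON column for unstructured metadata is shown below with Python's built-in sqlite3 (the table and column names are made up):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # in practice: a real file or a Postgres connection
conn.execute("""CREATE TABLE labels (
    image_path TEXT PRIMARY KEY,
    label      TEXT NOT NULL,
    extra      TEXT  -- unstructured metadata, stored as a JSON string
)""")
conn.execute(
    "INSERT INTO labels VALUES (?, ?, ?)",
    ("imgs/0001.jpg", "cat", json.dumps({"annotator": "alice", "confidence": 0.9})),
)
row = conn.execute(
    "SELECT label, extra FROM labels WHERE image_path = ?", ("imgs/0001.jpg",)
).fetchone()
print(row[0], json.loads(row[1])["annotator"])  # → cat alice
```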

Data Lake

Figure 10 : Data lake pattern

When you have data that is an unstructured aggregation from multiple sources in multiple formats, with a high transformation cost, you can use a data lake. Basically, you dump all the data into it and transform it for specific needs later.

Figure 11 : Amazon redshift

Amazon Redshift is one of the canonical Data Lake solutions.

When we run the training process, we need to move the data that the model needs to the local file system.

The data should be versioned so that progress is revertible. Version control does not apply only to source code; it also applies to the data. We will dive into data version control after we talk about data labeling.

Data Labeling

Figure 12 : Data labeling solution

In this section, we look at how to label the data. Here are the sources of labor you can use:

  1. Hire annotators by yourself
  2. Crowdsource (Mechanical Turk)
  3. Use full-service data labeling companies such as FigureEight, Scale.ai, and LabelBox

If you want your team to annotate the data themselves, here are several tools you can use:


Source : dataturks.com

DataTurks is an online collaborative annotation tool. In the free plan, it is limited to 10,000 annotations and the data must be public. It offers annotation tools for several NLP tasks (sequence tagging, classification, etc.) and Computer Vision tasks (image segmentation, image bounding boxes, classification, etc.). The FSDL course uses this as its labeling tool.


Doccano is a free open source annotation tool for NLP tasks. It supports sequence tagging, classification, and machine translation tasks. It can also be set up as a collaborative annotation tool, but that requires a server.


CVAT is an offline annotation tool for Computer Vision tasks, released by Intel as open source. It can label bounding boxes and image segmentations.

Public datasets

If you want to find public datasets, see the article by Stacy Stanford listing public datasets.

It is still actively updated and maintained.

Data Versioning

There are several levels of data versioning:

  1. Level 0 : Unversioned. We should not do this. Deployments need to be versioned, and if the data is not versioned, the deployed models are not versioned either. The problem with unversioned data is the inability to get back to a previous result.
  2. Level 1 : Versioned via snapshot. We store a snapshot of all the data used for each version. It works, but it is a bit hacky; versioning data should be as easy as versioning code.
  3. Level 2 : Data is versioned as a mix of code and assets. The heavy files are stored on another server (such as Amazon S3), and a JSON file (or a similar format) references the relevant metadata, which can contain labels, user activity, etc. The JSON file is versioned. Since these JSON files can become big, we can use git-lfs to version them. This level is acceptable for most projects.
  4. Level 3 : Use a specialized software solution for versioning the data. If you think Level 2 is not enough for your project, do this. One tool for data versioning is DVC.
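A minimal sketch of the Level 2 pattern (the file names and the S3 bucket are hypothetical): the heavy asset lives in object storage, and only a small JSON manifest with its content hash and metadata is committed to git (optionally via git-lfs):

```python
import hashlib
import json

def make_manifest_entry(asset_bytes, s3_uri, labels):
    """Builds one manifest record: where the asset lives and its content hash."""
    digest = hashlib.sha256(asset_bytes).hexdigest()
    return {"uri": s3_uri, "sha256": digest, "labels": labels}

# Pretend this is the raw content of imgs/0001.jpg before uploading it to S3.
fake_image_bytes = b"\x89PNG...not a real image..."
entry = make_manifest_entry(
    fake_image_bytes,
    "s3://my-project-data/imgs/0001.jpg",  # hypothetical bucket
    {"label": "cat", "annotator": "alice"},
)

# data_manifest.json is the small, versionable file that actually gets committed.
manifest_json = json.dumps({"imgs/0001.jpg": entry}, indent=2)
print(sorted(entry.keys()))  # → ['labels', 'sha256', 'uri']
```

The hash lets anyone verify that the file they pull from S3 is the exact version the manifest (and therefore the git commit) refers to.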


Source : https://dvc.org/

DVC is built to make ML models shareable and reproducible. It is designed to handle large files, datasets, machine learning models, and metrics as well as code. It is a solution for versioning ML models together with their datasets. We can connect the version control to cloud storage such as Amazon S3 and GCP.

Codebase Development

When we do the project, expect to write code for every step. Reproducibility is one thing we must be concerned with when writing code; we need to make sure our codebase is reproducible.

Before we dive into tools, we need to choose the language and framework of our codebase.

Programming Language

For the programming language, I prefer Python over anything else. Python has the largest community for data science and is great to develop in. The popular Deep Learning software is also mostly Python-first. The language is also easy to learn.

Deep Learning Framework

Figure 13 : DL Framework production and development diagram

There are several choices you can make for the Deep Learning framework. The most popular frameworks in Python are TensorFlow, Keras, and PyTorch. Use the one that you like.

For easier debugging, you can use PyTorch as the Deep Learning framework. Keras is also easy to use and has a good UX; it is a wrapper over TensorFlow, Theano, and other Deep Learning frameworks that makes them easier to use. TensorFlow is also a choice if you like its environment, and it can be a wise decision because of its community support and its great deployment tools.

Do not worry about deployment: there is software that can convert a model from one format to another. For example, you can convert a model produced by PyTorch to TensorFlow. We will see this later.

I think the main factor in choosing a language and framework is how active the community behind it is, since an active community gives rise to a high number of custom packages that can be integrated with it.

Version Control

One of the important things in a project is version control. We don’t want to be unable to restore our codebase when someone accidentally wrecks it. We also need to track the code on each update to see what changes were made by someone else. This is not possible without tooling, and Git is one of the solutions.

Okay, we know that version control is important, especially for collaborative work. Currently, Git is one of the best solutions for version control. It is also used to share your code with other people on your team; without it, I don’t think you can collaborate well with others on the project.

Source : GitHub

There are several services built around Git that you can use, such as GitHub, BitBucket, and GitLab.

Code Review

Figure 14 : Code Review

Another important thing that should be done is Code Review. Code reviews are an early protection against incorrect or low-quality code that passes the unit or integration tests. When you collaborate, have someone check and review your code. Most version control services support this feature.

Project Structure

Figure 15 : Example of folder structure. Source : https://drivendata.github.io/cookiecutter-data-science/

When we first create the project folder, we wonder how to structure it. Then we give up and put all the code in the root project folder. That is a bad practice that produces low-quality code.

One solution that I found is cookiecutter-data-science. It gives a template for how to structure the project, how to name each file, and where to put it. Be sure to use it so your codebase does not become messy. Consider reading the website to learn how to use it.

Integrated Development Environment (IDE)

An IDE is a tool you can use to write code faster. It has integrated tools that are useful for development. There are several IDEs you can use:


PyCharm is an IDE released by JetBrains. It can be used not only for Deep Learning projects but also for other work such as web development. PyCharm has auto code completion, code cleaning, refactoring, and many integrations with other tools that are important for developing in Python (you need to install the plugins first). It has a nice debugging environment, and it can also run notebook (.ipynb) files.


Jupyter Lab is an easy-to-use, interactive data science environment that serves not only as an IDE but also as a presentation tool. Its User Interface (UI) makes it great for visualization or tutorials. We can write documentation in Markdown format and also insert pictures into the notebook.

Personally, I write the source code in PyCharm. When I create tutorials, test something, or do Exploratory Data Analysis (EDA), I use Jupyter Lab. Just do not put your reusable code into a notebook file; it has bad reproducibility.

Continuous Integration

“Hey, what the hell!? Why can’t I run the training process on this version?” — A

“Idk, I just pushed my code, and I think it works on my notebook.. wait a minute.. I got an error on this line.. I didn’t copy all of my code into the implementation” — B

“So why did you push !?!?” — A

Before we push our work to the repository, we need to make sure that the code really works and has no errors. To do that, we should test the code before the model and the code are pushed to the repository. Unit or integration tests must be run.

Unit tests check the functionality of a single module. Integration tests check the integration of modules. They check whether your logic is correct or not. Do this to find your mistakes before running the experiment.
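A sketch of what such a unit test can look like in pytest style (the `normalize` helper and its expected behavior are made up for illustration; pytest collects any function named `test_*`):

```python
import statistics

def normalize(values):
    """Scales a list of numbers to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # avoid division by zero on constant input
    return [(v - mean) / std for v in values]

def test_normalize_zero_mean_unit_std():
    out = normalize([1.0, 3.0, 5.0])
    assert abs(statistics.fmean(out)) < 1e-9
    assert abs(statistics.pstdev(out) - 1.0) < 1e-9

test_normalize_zero_mean_unit_std()  # pytest would run this automatically
```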

Source : circleci.com

CircleCI is one solution for Continuous Integration. It can run unit tests and integration tests, and it can use a Docker image (we will dive into that later) as a containerization of the environment (which we should use). Similar tools include Jenkins and TravisCI.

Here are several libraries you can use to test your code in Python:

pipenv check : scans our Python package dependency graph for known security vulnerabilities

pylint : does static analysis of Python files and reports both style and bug problems

mypy : does static type checking of Python files

bandit : performs static analysis to find common security vulnerabilities in Python code

shellcheck : finds bugs and potential bugs in shell scripts (if you use them)

pytest : a Python testing library for unit and integration tests

Write them into your CI and make sure these checks pass. If one fails, rewrite your code and find where the error is.

Here is one example of writing unit tests for a Deep Learning system.

Figure 16 : Example of doing unit test in ML experiments

Custom Environment and Containerization

“Hey, I’ve tested it on my computer and it works well”

“What ? No dude, it fails on my computer ? How the hell it works on your computer !?”

Ever experienced that? I have. One cause of that situation is a difference between your working environment and the others’. For example, you work on Windows while the others work on Linux. Differences between your libraries and theirs can also trigger the problem.

To solve that, you need to write your library dependencies explicitly in a text file called requirements.txt. Then use a Python virtual environment tool such as pipenv. This solves the library dependency problem; nevertheless, it still cannot remove the differences between the team’s environments and operating systems. To solve those, you can use Docker.

Source : docker.com

Docker is a container system that can be set up as a virtual environment. We can install library dependencies and set environment variables inside the Docker container. With this, you won’t have to fear errors caused by environment differences. Docker is also a vital tool for deployment: it forces the deployment target to use the desired environment.

To share the container, we first write all the steps for creating the environment into a Dockerfile and build a Docker image from it. The image can be pushed to DockerHub, and other people can then pull it from DockerHub and run it on their machines.

To learn more about Docker, there is a good beginner-friendly article written by Preethi Kasireddy.

Figure 17 is an example of how to create a Dockerfile.

Figure 17 : Example of Dockerfile

Training and Debugging

Now we are at the Training and Debugging step. This is where you run the experiments and produce the model. Here are the substeps:

Figure 18 : Substeps of Training and Debugging

Start Simple

With your chosen Deep Learning framework, code a neural network with a simple architecture (e.g., a network with one hidden layer). Then use default hyperparameters, such as no regularization and the default Adam optimizer. Do not forget to normalize the input if needed. Finally, use a simple version of the problem (e.g., a small dataset).

Implement and Debug

To implement the neural network, there are several tricks that you should follow sequentially.

  1. Get the model to run

The first thing to do is to get the model you built with your DL framework to run, meaning to make sure no exception occurs up to and including the weight-update step.

The exceptions that most often occur are:

  1. Shape Mismatch
  2. Casting Issue
  3. Out of Memory

2. Overfit a Single Batch

After that, we should overfit a single batch to see whether the model can learn or not. Overfitting here means that we do not care about validation at all and only check whether the model can learn on that one batch. We do this until the model fully overfits the batch (~100%). Here are common issues that occur in this process:

  1. Error goes up (possible causes : learning rate too high, wrong sign in the loss function, etc.)
  2. Error explodes / goes NaN (possible causes : numerical issues such as log or exp operations, a high learning rate, etc.)
  3. Error oscillates (possible causes : corrupted data labels, learning rate too high, etc.)
  4. Error plateaus (possible causes : learning rate too low, corrupted data labels, etc.)
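The overfit-a-single-batch check works even without a DL framework. The sketch below fits one linear neuron to a single made-up batch with plain gradient descent; if the training loop is wired correctly, the loss on that batch should collapse to ~0:

```python
# Single batch of (x, y) pairs generated by y = 2x + 1. A correctly wired
# training loop should drive the loss on this batch to essentially zero.
batch = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b, lr = 0.0, 0.0, 0.05

for _ in range(2000):
    # Gradients of the mean squared error over the whole batch.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
    w -= lr * grad_w
    b -= lr * grad_b

loss = sum((w * x + b - y) ** 2 for x, y in batch) / len(batch)
print(round(w, 3), round(b, 3), loss < 1e-6)  # → 2.0 1.0 True
```

If the loss instead goes up, explodes, oscillates, or plateaus, the issue list above tells you where to look first.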

3. Compare to a known result

After we make sure that our model trains well, we need to compare its results to other known results. Here is the hierarchy of known results:

Figure 19 : Hierarchy of known results

We do this to make sure that our model can really learn from the data and is on the right track for the task. We keep iterating until the model performs up to expectations.


Evaluate

We evaluate by calculating the bias-variance decomposition, using the error under our chosen metric for our current best model. The formula for the bias-variance decomposition is as follows:


irreducible error = the error of the baseline
bias = training error - irreducible error
variance = validation error - training error
validation overfitting = test error - validation error
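The formula translates directly to code; the error numbers below are made up for illustration:

```python
def error_breakdown(baseline_err, train_err, val_err, test_err):
    """Decomposes test error into the terms above."""
    irreducible = baseline_err
    bias = train_err - irreducible
    variance = val_err - train_err
    val_overfitting = test_err - val_err
    return {"irreducible": irreducible, "bias": bias,
            "variance": variance, "val_overfitting": val_overfitting}

parts = error_breakdown(baseline_err=0.01, train_err=0.10, val_err=0.19, test_err=0.21)
print(parts)
# High bias here would suggest a bigger model or longer training; high
# variance would suggest more data or regularization as the next move.
```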

Here is an example of breaking down the test error.

Figure 20 : Breakdown of test error

Knowing the values of bias, variance, and validation overfitting helps us choose what to improve in the next step.

If the model has met the requirements, deploy it. If not, use the results of the evaluation to address the issues, whether by improving the data or by tuning the hyperparameters. Consider looking at what the model gets wrong on particular groups of instances. Iterate until it satisfies the requirements (or give up).

Here are some tools that can be helpful on this step:

Version Control

Here we go again: version control. Yep, we have version control for code and data; now it is time to version the models. Here are tools you can use for that:


Source : Wandb.com

Weights & Biases (WANDB) provides version control for a model’s results. It has a nice User Interface and Experience. It can save the parameters used for a run, samples of the model’s outputs, and the weights and biases of the model, all versioned. Furthermore, it can visualize the results of the model in real time. Moreover, we can revert the model to a previous run (including restoring the weights from that run), which makes it easier to reproduce models. It also scales well, since it can integrate with Kubeflow (Kubernetes for ML, which manages resources and services for containerized applications).


Source : Losswise.com

Losswise is also a version control tool for models. It saves the results of the model and the hyperparameters used for an experiment in real time, and it can estimate when training will finish. It trains the model every time you push your code to the repository (on a designated branch), and it visualizes the results of the model in real time.

Hyperparameter Optimization

When optimizing or tuning hyperparameters such as the learning rate, there are several libraries and tools available:

For the Keras DL framework : Hyperas

For the PyTorch DL framework : Hypersearch

Others : Hyperopt

WANDB also offers a hyperparameter optimization solution, though you need to contact them first to enable it.
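Even without these libraries, the simplest form of hyperparameter optimization is a random search loop. In the sketch below, `train_and_score` is a stand-in for your real training run, and the search space is made up:

```python
import random

def train_and_score(lr, hidden_units):
    """Stand-in for a real training run; returns a fake validation score."""
    return 1.0 - abs(lr - 0.01) * 10 - abs(hidden_units - 64) / 1000

random.seed(0)  # make the search reproducible
best = None
for _ in range(20):
    trial = {
        "lr": 10 ** random.uniform(-4, -1),               # log-uniform learning rate
        "hidden_units": random.choice([16, 32, 64, 128]),  # discrete size choice
    }
    score = train_and_score(**trial)
    if best is None or score > best[0]:
        best = (score, trial)

print(best[1])  # the best hyperparameters found over 20 trials
```

Tools like Hyperopt add smarter search (e.g., Bayesian methods) on top of this same trial-evaluate-keep-best loop.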


Deploying and Testing

This is the final step. The substeps are as follows:

Figure 21 : Substeps of Deploying & Testing

Piloting in production means that you verify the system by testing it on a selected group of end users. By doing that, we hope to gain feedback on the system before fully deploying it. As for testing, there are several kinds of tests you can run on your system besides unit and integration tests, for example penetration testing, stress testing, etc.

After we are sure that the model and the system have met the requirements, it is time to deploy the model. There are several ways to deploy it:

  1. Web Server Deployment
  2. Embedded System and Mobile

Web Server Deployment

There are several strategies we can use to deploy to the web. Before that, we need to make sure that we create a RESTful API that serves predictions in response to HTTP requests (GET, POST, DELETE, etc.). The strategies are as follows:

  1. Deploy code to cloud instances, and scale by adding instances.
  2. Deploy code as containers (Docker), and scale via orchestration. The app code is packaged into Docker containers. Example : AWS Fargate.
  3. Deploy code as a “serverless function”. The app code is packaged into zip files, and the serverless platform manages everything (instant scaling, requests per second, load balancing, etc.). Unlike the two options above, serverless functions charge only for compute time rather than uptime. Examples : AWS Lambda, Google Cloud Functions, and Azure Functions.

Figure 22 : AWS Lambda
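A minimal sketch of such a prediction endpoint, using only Python’s standard library (the model here is a stub; a real service would load your trained weights and would more likely use a framework such as Flask or FastAPI):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Stub model: a real service would run the trained network here."""
    score = sum(features) / max(len(features), 1)
    return {"label": "positive" if score > 0.5 else "negative", "score": score}

class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST requests whose JSON body looks like {"features": [...]}."""
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        result = predict(json.loads(body)["features"])
        payload = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To serve: HTTPServer(("", 8080), PredictHandler).serve_forever()
print(predict([1.0, 1.0]))
```

The same handler code can be packaged into a Docker container (strategy 2) or, with small changes, into a serverless function (strategy 3).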

Embedded System and Mobile

To deploy to embedded systems or mobile, we can use TensorFlow Lite. It is smaller and faster and has fewer dependencies than TensorFlow, so it can be deployed to embedded systems or mobile. Unfortunately, it supports a limited set of operators.

There is also a tool called TensorRT. It optimizes the inference engine used for prediction, which speeds up the inference process. It is built on CUDA. Among embedded systems, the NVIDIA Jetson TX2 works well with it.

On Apple devices, there is a tool called CoreML to make it easier to integrate an ML system into an iPhone app. There is a similar tool called ML Kit that can help deploy ML systems to Android.


Source : https://onnx.ai/

ONNX (Open Neural Network Exchange) is an open source format for Deep Learning models that makes it easy to convert models between supported Deep Learning frameworks. ONNX supports TensorFlow, PyTorch, and Caffe2. It lets you mix frameworks, so that a framework that is good for development (PyTorch) does not also need to be good at deployment and inference (TensorFlow / Caffe2).


Monitoring

Figure 23 : Example of monitoring in AWS

If you deploy the application to a cloud server, there should be a monitoring system. We can set alarms for when things go wrong by writing records to the monitoring system. With this, we will know what can be improved in the model and can fix the problem.


Conclusion

In this article, we got to know the steps of doing Full Stack Deep Learning according to the FSDL March 2019 course. First, we set up and plan the project; we define the goals, metrics, and baseline in this step. Then we collect the data and label it with available tools. For building the codebase, there are tools, described above, that can maintain the quality of the project. Then we do the modeling, with testing and debugging. Finally, after the model meets the requirements, we know the steps and tools for deploying and monitoring the application on the desired interface.



Afterwords

That’s it: my article about the tools and steps introduced by the course I’ve taken. Why did I write this article? I found out that my brain remembers and understands content better when I write about it. Moreover, in the process of writing, I got a chance to review the content of the course. Furthermore, it lets me share my knowledge with everyone. I am happy to share something good with everyone :).

I gained a lot of new things from following the course, especially about the tools of the Deep Learning stack. I also got to know how to troubleshoot Deep Learning models, since they are not easy to debug. The course also taught me the tools, steps, and tricks of doing Full Stack Deep Learning. To sum up, it is a great course and free to access, so I recommend it to anyone who wants to learn about doing Deep Learning projects.

To be honest, I haven’t tried all the tools written in this article. The tools and descriptions this article presents are taken from the FSDL course and some sources that I’ve read. Let me know if there is any misinformation, especially about the tools.


I welcome any feedback that can improve me and this article. I am still in the process of learning to write and to become better, and I appreciate feedback. Make sure to give feedback in a proper manner 😄.

See ya in my next article.
