Building Deep Learning Solutions in the Real World: Debugging and Interpretability

The implementation of large-scale deep learning solutions in the real world is a road full of challenges. Many of those challenges stem from the fact that the tools and techniques we use during the lifecycle of a typical software application don’t apply in the deep learning space. As a result, data scientists and engineers are constantly reimagining solutions to problems that traditional software development solved decades ago. One of those areas that we often ignore, and that can become a nightmare for data science teams, is debugging.

What makes debugging so challenging in deep learning applications? The answer can be summarized in two main factors: unpredictability and the friction between interpretability and accuracy.


The unpredictable behavior of deep learning programs is a result of their dynamic nature. While most software programs exhibit regular runtime patterns based on static code written by a programmer, the runtime behavior of deep learning applications is changing all the time. Let’s take the example of a deep neural network that has been regularly achieving a 3.5% error rate. After retraining the model with a new dataset, the neural network exhibits an improved 3% error rate. For a data scientist, it is almost impossible to determine whether the new behavior is optimal and what caused the improvement.

The Interpretability vs. Accuracy Friction

The interpretability/accuracy friction is one of those unfortunate dynamics that rules the current generation of deep learning technologies. Do you care about obtaining the best results, or do you care about understanding how those results were produced? That’s a question that data scientists need to answer in every deep learning scenario. Many deep learning techniques are complex in nature and, although they produce very accurate results in many scenarios, they can become incredibly difficult to interpret.

In order to understand the behavior of deep learning programs, data scientists need to focus on two main tasks: improving interpretability and getting really good at debugging 😉.

Practical Tips for Improving Interpretability

Interpretability is one of those elements of deep learning applications that is both broadly defined and difficult to quantify. However, there are some very practical methods that we can apply to deep learning programs to improve their interpretability. In a recent paper, researchers from Google proposed three fundamental elements that can improve the interpretability of deep learning models:

· Understanding what Hidden Layers Do: The bulk of the knowledge in a deep learning model is formed in the hidden layers. Understanding the functionality of the different hidden layers at a macro level is essential to be able to interpret a deep learning model.

· Understanding How Nodes are Activated: The key to interpretability is not to understand the functionality of individual neurons in a network but rather groups of interconnected neurons that fire together in the same spatial location. Segmenting a network by groups of interconnected neurons will provide a simpler level of abstraction to understand its functionality.

· Understanding How Concepts are Formed: Understanding how a deep neural network forms individual concepts that can then be assembled into the final output is another key building block of interpretability.
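To make the first two elements concrete, here is a minimal NumPy sketch (the layer sizes and random weights are purely hypothetical) of a forward pass that records each hidden layer’s activations so they can be inspected layer by layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feed-forward network: 4 inputs -> 8 -> 8 -> 3 outputs
weights = [
    rng.standard_normal((4, 8)),
    rng.standard_normal((8, 8)),
    rng.standard_normal((8, 3)),
]

def forward_with_activations(x, weights):
    """Run a forward pass and keep every layer's activations for inspection."""
    activations = [x]
    for w in weights:
        x = np.maximum(0.0, x @ w)  # ReLU activation
        activations.append(x)
    return activations

x = rng.standard_normal((1, 4))
acts = forward_with_activations(x, weights)
for i, a in enumerate(acts[1:], start=1):
    # Fraction of units that fired in each layer: a coarse view of which
    # groups of neurons activate together on this input
    print(f"layer {i}: {np.mean(a > 0):.0%} of units active")
```

Inspecting which units fire together across many inputs is one simple way to start segmenting a network into functional groups, as the second bullet suggests.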

Practical Tips for Deep Learning Debugging

The complex structure of deep neural networks and the lack of sophisticated tools makes the debugging of deep learning applications nothing short of a nightmare. However, there are a few practical tips that can help you to be more efficient when debugging deep learning programs:

1 — Visualize the Network and its Results

A pretty obvious point: when building a deep learning application, it is imperative to leverage tools that can help visualize the computation graph and the results of the model for given inputs. This gives developers a visually intuitive way to reason through the model and understand the behavior of its algorithms.
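Even without a dedicated tool, a plain-text structural summary in the spirit of what graph-visualization tools provide can already help. A minimal sketch (the layer names and sizes below are hypothetical):

```python
# Hypothetical fully-connected network described as (layer name, width) pairs
layers = [
    ("input", 784),
    ("hidden_1", 256),
    ("hidden_2", 64),
    ("output", 10),
]

def summarize(layers):
    """Return one line per connection: layer names and parameter counts."""
    lines = []
    total = 0
    for (name_in, fan_in), (name_out, fan_out) in zip(layers, layers[1:]):
        params = fan_in * fan_out + fan_out  # weights + biases
        total += params
        lines.append(f"{name_in} -> {name_out}: {params:,} parameters")
    lines.append(f"total: {total:,} parameters")
    return lines

print("\n".join(summarize(layers)))
```

Seeing parameter counts per connection makes it obvious where the bulk of the model’s capacity sits, which is often the first question when reasoning about its behavior.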

2 — Analyze Training and Test Errors

The training and test errors in a deep learning model can offer helpful clues about potential problems before they become serious. For instance, if a model is overfitting (the test error is high) but the training error remains low, then it is likely that there are errors in the algorithm. However, if the training error is high, then the model is underfitting and we are likely to find an error in the training procedure.
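The triage above can be captured in a few lines. In this sketch the thresholds are arbitrary illustrations, not universal constants, and the messages simply echo the heuristics from the paragraph:

```python
def diagnose(train_error, test_error, high=0.15, gap=0.10):
    """Crude triage of train/test error; 'high' and 'gap' are illustrative thresholds."""
    if train_error > high:
        # High training error: the model is underfitting
        return "underfitting: inspect the training procedure"
    if test_error - train_error > gap:
        # Low training error but high test error: the model is overfitting
        return "overfitting: inspect the algorithm"
    return "errors look consistent"

print(diagnose(0.03, 0.25))  # low training error, high test error
print(diagnose(0.40, 0.42))  # high training error
```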

3 — Test with Small Datasets

Building on the previous point: if a model is underfitting, we need to determine whether the cause is a code defect or a data defect. One way to do that is to train on a very small number of examples. If the model fails to fit even that tiny dataset, the problem is most likely in the code.
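As a minimal sketch of this sanity check, the NumPy logistic-regression trainer below (the dataset and hyperparameters are hypothetical) should drive the training loss close to zero on four easily separable examples; if a model cannot do even that, suspect the code:

```python
import numpy as np

# Four linearly separable examples: a correct model should (over)fit them easily
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])  # label equals the first coordinate

def train_tiny(X, y, steps=2000, lr=1.0):
    """Logistic regression by gradient descent; returns the final training loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on weights
        b -= lr * np.mean(p - y)                # gradient step on bias
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    # Cross-entropy loss on the tiny training set
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

loss = train_tiny(X, y)
print(f"training loss on 4 examples: {loss:.4f}")  # should be near zero
```

If the loss refuses to drop on a dataset this small, no amount of extra data will help: the defect is in the implementation, not the data.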

4 — Monitor Activations and Gradient Values

Keeping an eye on the activations of hidden units and on the values of the gradients is essential when optimizing a deep learning model. The number of node activations is an important metric for understanding whether a neural network is saturated. Similarly, a histogram view of the gradient values is a very helpful technique for understanding the potential for further optimization of the model.
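Both measures are cheap to compute. The NumPy sketch below uses randomly generated activations and gradients purely as stand-ins for real training values; the saturation threshold and histogram buckets are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for one step of real training data: post-sigmoid hidden-unit
# activations and a flattened vector of gradient values
activations = 1.0 / (1.0 + np.exp(-rng.standard_normal((32, 64)) * 4))
gradients = rng.standard_normal(1000) * 1e-3

def saturation_fraction(a, eps=0.05):
    """Fraction of sigmoid units pinned near 0 or 1; saturated units learn slowly."""
    return float(np.mean((a < eps) | (a > 1 - eps)))

def gradient_histogram(g, bins=5):
    """Counts of gradient magnitudes per log-spaced bucket (a quick text histogram)."""
    edges = np.logspace(-6, 0, bins + 1)
    counts, _ = np.histogram(np.abs(g), bins=edges)
    return counts

print(f"saturated units: {saturation_fraction(activations):.0%}")
print("gradient-magnitude histogram:", gradient_histogram(gradients))
```

A rising saturation fraction, or a gradient histogram collapsing toward the smallest bucket, is an early warning that training is stalling.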

Debugging and understanding deep learning programs feels unnatural to many mainstream software engineers. Improving the interpretability of deep learning architectures and setting up the right debugging processes are two of the factors that data science teams should consider implementing very early in the development lifecycle. As deep learning research evolves, the architecture of deep neural networks should become more interpretable and, consequently, easier to debug.

Source: Deep Learning on Medium