Source: Deep Learning on Medium
So you are building a new Deep Learning workstation to run state-of-the-art computations and really deep, sophisticated models, but you are undecided about which GPU to go for; or you already have a set of GPUs you plan to use, but need to know just how efficient they are compared to what’s out there. In this blog post, I present an app that solves both of these problems for you, at no cost.
Deep Learning is a field that requires serious computational power: with a CPU you might spend weeks training your model, while a strong GPU would finish the job within a day. This is mainly due to design differences between the two pieces of hardware, as we shall see in a minute when we discuss the different types of hardware used for Deep Learning. For now, just bear in mind that more efficient hardware means not only faster training, but also more room for model tuning and algorithm testing, which will make your life as a Deep Learning developer a lot easier.
Types of Hardware
If we are going to discuss the best pieces of hardware for deep learning tasks, we should first look at the different types. The following diagram breaks the classification down into four classes.
As the diagram shows, the general-purpose hardware category splits into Central Processing Units (CPUs) and Graphics Processing Units (GPUs). The former is designed to be latency oriented: it should complete big, complicated tasks one after the other, like a big elephant. The GPU, in contrast, is throughput oriented: it specializes in performing many small, simple tasks simultaneously, like a colony of small ants.
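The latency-versus-throughput contrast can be mimicked in plain Python: a serial loop processes one element at a time (one big task after another), while a vectorized NumPy operation pushes the whole array through at once (many small identical operations in parallel). This is only an illustrative sketch of the idea, not a real CPU-vs-GPU benchmark:

```python
import time
import numpy as np

# Same workload computed two ways: one element at a time (serial,
# latency-bound) vs. the whole array at once (data-parallel, throughput-bound).
x = np.random.rand(1_000_000)

start = time.perf_counter()
serial = [v * 2.0 + 1.0 for v in x]   # one element after another
serial_time = time.perf_counter() - start

start = time.perf_counter()
vectorized = x * 2.0 + 1.0            # whole array in one operation
vectorized_time = time.perf_counter() - start

print(f"serial: {serial_time:.4f} s, vectorized: {vectorized_time:.4f} s")
```

On a typical machine the vectorized version is orders of magnitude faster, which is the same reason GPUs shine on the matrix operations at the heart of deep learning.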
A Field Programmable Gate Array (FPGA) is a special piece of hardware that allows for programmable logic: the developer can reconfigure the hardware structure of the device as many times as needed to implement a particular application. This really comes in handy for trying out new ideas and prototypes, and its performance can surpass general-purpose hardware as long as the design is efficient enough.
Application Specific Integrated Circuits (ASICs) are much rarer to come by: someone took on the job of carefully designing hardware that solves the problem at hand and printed the circuit, so the hardware only makes sense when used for that application. Google’s Tensor Processing Units (TPUs) are a state-of-the-art example. Although ASICs turn out to be faster than FPGAs, they are harder to obtain and assemble into a deep learning workstation.
The Deep Learning Bench Tools application focuses on general-purpose hardware, as it is by far the most commonly used.
Suppose you just bought your graphics card(s) and plugged them into your motherboard, expecting to run some next-level algorithms very fast. It would be very useful to have a tool that told you how fast the combination of your CPU and GPUs is, and that on top of that let you compare the results with other deep learning workstations around the globe, to see whether you are happy with where you stand. Well, look no further: DLBT is the answer.
This hardware bench tool automatically recognizes the machine-learning-capable hardware in your computer. This might be just the CPU, in case you have no GPU or haven’t installed the required drivers (if that’s the case, we walk you through the installation, line by line), or it may be multiple GPUs, in which case you can choose where to run the benchmark models.
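DLBT's actual detection logic isn't published, but a common approach to this kind of check is to query the NVIDIA driver tooling and fall back to the CPU when it's absent. A minimal sketch of that idea (the `detect_gpus` helper is hypothetical, not part of DLBT):

```python
import shutil
import subprocess

def detect_gpus():
    """Return a list of NVIDIA GPU names, or [] if none are usable.

    Hypothetical helper: one common detection strategy is to query
    nvidia-smi, which is only present when the driver is installed.
    """
    if shutil.which("nvidia-smi") is None:
        return []  # no driver installed: benchmark falls back to the CPU
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
        return [line.strip() for line in out.stdout.splitlines() if line.strip()]
    except subprocess.CalledProcessError:
        return []

gpus = detect_gpus()
print(gpus if gpus else "No GPU detected; benchmarking on CPU")
```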
In its current version, the DLBT app runs a Convolutional Neural Network with a standard structure in the background, recording how long each epoch lasts and, for more advanced users, splitting that time into prediction time and back-propagation time.
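The timing split can be sketched as follows. This is not DLBT's actual benchmark model (which is shown in the image below); it is a tiny two-layer network in NumPy, included only to show how one might time the prediction (forward) and back-propagation (backward) phases separately:

```python
import time
import numpy as np

# Illustrative only: a tiny fully connected network with manual gradients,
# timed so the epoch cost splits into forward and backward phases.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 784)).astype(np.float32)   # fake input batch
y = rng.standard_normal((256, 10)).astype(np.float32)    # fake targets
W1 = rng.standard_normal((784, 128)).astype(np.float32) * 0.01
W2 = rng.standard_normal((128, 10)).astype(np.float32) * 0.01

t0 = time.perf_counter()
h = np.maximum(X @ W1, 0.0)           # hidden layer with ReLU
pred = h @ W2                         # forward pass (prediction)
t_forward = time.perf_counter() - t0

t0 = time.perf_counter()
grad_pred = 2.0 * (pred - y) / len(X)   # dL/dpred for an MSE loss
grad_W2 = h.T @ grad_pred               # gradient w.r.t. W2
grad_h = grad_pred @ W2.T
grad_h[h <= 0.0] = 0.0                  # ReLU gradient mask
grad_W1 = X.T @ grad_h                  # gradient w.r.t. W1
t_backward = time.perf_counter() - t0

print(f"forward: {t_forward*1e3:.2f} ms, backward: {t_backward*1e3:.2f} ms")
```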
The structure of the model used can be seen in the following image.
As a future update, we are working on extending this feature to multiple well-known benchmarks involving Recurrent Neural Networks, Natural Language Processing, and more.
Obtaining the rating
How do we measure exactly how efficiently a device is running? We use the formula displayed below. Intuitively, the rating should increase as hardware efficiency rises. The scaling factor K serves to spread the results out, allowing for better comparison.
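The exact formula appears only as an image in the original post, so the function below is an assumption, not DLBT's published rating. It merely illustrates the two stated properties: the rating grows as the hardware gets faster (epoch time shrinks), and K spreads the scores out:

```python
def rating(epoch_seconds, k=1000.0):
    # Hypothetical rating, NOT DLBT's actual formula: inversely
    # proportional to epoch time, scaled by k to spread the results.
    return k / epoch_seconds

print(rating(10.0))  # faster card, 10 s/epoch -> 100.0
print(rating(40.0))  # slower card, 40 s/epoch -> 25.0
```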
This application has been run on many GPUs to measure their performance on the model explained above; the following table shows some of the results produced by the app. Here you will find many more results from other pieces of hardware.
There you have it: you have just discovered an easy way to measure your hardware’s performance without writing a single line of code. DLBT is a GUI application that automatically detects your GPUs, lets you monitor them, and runs deep learning benchmarks so you can compare their performance against the standards.
Anyone can download the app and test their hardware; check here.