Build your own deep learning box – Benchmarking

Source: Deep Learning on Medium

A trilogy in four parts

After installing the hardware and software, it is time to figure out how much power you really have. Was the blood and coin really worth it?

Short answer: A definite maybe.

Benchmark Scripts

The first benchmark script is a copy of a Fast-AI tutorial that trains a ResNet model to recognize handwritten digits from the MNIST image data set. I chose this example because it is well known, uses a small model (~41,000 trainable parameters), runs quickly, and has a good mix of CPU and GPU workloads.

The second benchmark script is a copy of a Fast-AI tutorial that trains a UNet model to segment images from the CAMVID image data set. I chose this example because it uses a larger data volume (60 MB) and a much bigger model (~20 million trainable parameters), runs for longer, and generates heavy CPU and GPU workloads.

The scripts have been modified to capture training logs in Weights & Biases, which records training metrics, parameters, progress, CPU, GPU and memory utilization, and even system temperature across runs, and makes it all available in easy-to-read charts.
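As a rough illustration of the kind of per-epoch timing such a modified script might capture, here is a minimal sketch using only the Python standard library. The names `benchmark`, `train_one_epoch` and `log` are hypothetical and do not come from the original scripts; `train_one_epoch` stands in for whatever training step is being measured.

```python
import time
from typing import Callable, Dict, List


def benchmark(
    train_one_epoch: Callable[[int], Dict[str, float]],
    epochs: int,
    log: Callable[[Dict[str, float]], None] = print,
) -> List[Dict[str, float]]:
    """Time each epoch and hand the resulting metrics to a logging sink.

    `train_one_epoch` is a hypothetical callable that runs one epoch and
    returns a dict of metrics (for example {"loss": 0.12}).
    `log` is any callable that accepts a dict; it defaults to print.
    """
    history: List[Dict[str, float]] = []
    run_start = time.perf_counter()
    for epoch in range(epochs):
        epoch_start = time.perf_counter()
        # Copy the returned dict so the caller's object is not modified.
        metrics = dict(train_one_epoch(epoch))
        now = time.perf_counter()
        metrics["epoch"] = epoch
        metrics["epoch_seconds"] = now - epoch_start    # wall time for this epoch
        metrics["elapsed_seconds"] = now - run_start    # wall time since the run began
        log(metrics)
        history.append(metrics)
    return history
```

Because `log` only needs to accept a dict, a function such as `wandb.log` could presumably be passed in after a run has been started with `wandb.init`. That wiring is an assumption for illustration, not a description of how the original scripts were instrumented.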

System Configurations

The benchmarking scripts were executed in each of the following environments. The one exception is the CAMVID script on the MacBook Pro, which was simply too slow and was aborted.

Comparison of system configurations for benchmarking


Charts comparing the run times on each environment are shown below.

The self-built GPU box clearly outperformed Colab and Kaggle for the small model and dataset
The self-built GPU box is a bit slower than Colab and Kaggle for the much larger model and dataset
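One simple way to read run-time comparisons like these is to normalize every environment against a chosen baseline. The sketch below assumes you have already measured total run times in seconds yourself. The function name `relative_run_times` and the environment labels you pass in are hypothetical, and no measured values from this article are included.

```python
from typing import Dict


def relative_run_times(seconds_by_env: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Return each environment's run time divided by the baseline's run time.

    A ratio above 1.0 means slower than the baseline; below 1.0 means faster.
    """
    if baseline not in seconds_by_env:
        raise KeyError(f"unknown baseline environment: {baseline!r}")
    base = seconds_by_env[baseline]
    if base <= 0:
        raise ValueError("baseline run time must be positive")
    return {env: secs / base for env, secs in seconds_by_env.items()}
```

Calling it with your own measurements and the self-built box as the baseline gives a ratio per environment, which makes "a bit slower" or "clearly faster" easier to quantify.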

Here is my take on the results:

  • For small models and datasets, the self-built GPU box appears to be the clear winner.
  • Larger models use more GPU memory, so the Kaggle and Colab kernels are at a significant advantage there. Even so, run times are comparable.
  • Not all problems are going to come neatly packaged like a fast-ai tutorial. Real life will involve blundering about with models and hyper-parameters, which means running many training sessions with long run times.
  • So on balance, given my use case, the convenience and configurability appear to outweigh the cost.

I do intend to keep this article updated with my ongoing experience using the custom build. Stay tuned.