Fast and Reproducible Deep Learning

ShopRunner’s open-source Creevey library is one tool for managing deep learning projects — it makes processing large datasets fast and easy.

Resources for learning to train a deep learning model are plentiful, but running a successful deep learning project means managing many additional moving parts that are discussed far less often. This talk aims to help fill that gap.

Thanks to the Chicago ML Meetup for hosting.

Video

Slides

Abstract

Deep learning projects require managing large datasets, heavy-duty dependencies, complex experiments, and large amounts of code. This talk provides best practices for accomplishing these tasks efficiently and reproducibly. Tools that are covered include the Creevey library for processing large collections of files; pip-tools and nvidia-docker for managing dependencies; and MLflow Tracking for tracking experiments.
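To make the file-processing idea concrete, here is a minimal sketch of running a per-file function over a large collection in parallel while recording failures instead of stopping on them. It uses only the Python standard library and is not Creevey's API; the names `process_file` and `run_pipeline` are hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def process_file(path: Path) -> int:
    """Hypothetical per-file step: return the file's size in bytes."""
    return len(path.read_bytes())


def run_pipeline(paths, func, max_workers=8):
    """Apply func to every path using a thread pool.

    Returns one report row per input path, in input order. A failure on
    one file is recorded in its row rather than aborting the whole run.
    """
    def safe_call(path):
        try:
            return {"path": str(path), "result": func(path), "error": None}
        except Exception as exc:  # keep going; record what went wrong
            return {"path": str(path), "result": None, "error": repr(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(safe_call, paths))
```

Threads suit this kind of work when each step is dominated by I/O such as reading or downloading files. Keeping a per-file report makes a long run easier to audit and to resume, since failed inputs can be retried on their own.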

Additional Resources

Autofocus is a deep learning project that labels animals in images taken by motion-activated “camera traps.” It illustrates many of the ideas discussed in the talk.