Organizing creative iterations of the machine learning process



Working on any machine learning problem is a process of creative iteration. Apart from software development, it involves research, experimentation, and result analysis, among other things.

Written in pseudo code it would read somewhat like this:

You build your model, calculate metrics on the validation data, dive into the results to identify where your model fails, and research ideas on how to improve it. Once those ideas are formed, you proceed to build the next model and repeat the process until you run out of time, budget, or ideas.
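The loop described above can be sketched in Python roughly as follows. This is only an illustration: creative_iteration and the build, evaluate, analyze, and research callables are hypothetical placeholders, and the toy lambdas exist just so the sketch runs end to end.

```python
def creative_iteration(first_idea, build, evaluate, analyze, research, budget):
    """Run build -> evaluate -> analyze -> research until budget or ideas run out."""
    log = []                                  # one record per iteration, so nothing gets lost
    ideas = [first_idea]
    while ideas and budget > 0:
        idea = ideas.pop(0)
        model = build(idea)                   # build the next model
        metrics = evaluate(model)             # calculate metrics on validation data
        failures = analyze(model, metrics)    # identify where the model fails
        ideas.extend(research(failures))      # turn failures into new ideas
        log.append({"idea": idea, "metrics": metrics})
        budget -= 1
    return log


# Toy stand-ins (all hypothetical) that make the loop runnable.
log = creative_iteration(
    first_idea="baseline",
    build=lambda idea: {"idea": idea},
    evaluate=lambda model: {"val_score": len(model["idea"])},
    analyze=lambda model, metrics: ["small objects"] if model["idea"] == "baseline" else [],
    research=lambda failures: [f"fix {failure}" for failure in failures],
    budget=5,
)
print([entry["idea"] for entry in log])
```

The point of the log list is the same as the point of this post: every iteration leaves a record behind.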

Typically there could be tens or hundreds of ideas that you try during the lifetime of a machine learning project. Oftentimes those ideas are lost in the abyss of irreproducible code, terminal logs and missing configuration files.

Let’s take a minute of silence for those ingenious thoughts that once shone but are now gone forever.

There is a world of value in recording each creative iteration so that you can keep track of what was done, what worked and what didn’t.

What if there was an easy way of doing that?

You guessed it, there is 😃

Neptune was designed to ensure reproducible machine learning, and some of the features that help with that are tags and filters. You can read about tracking your metrics and custom columns in another post, but let’s not talk about those here. If you somehow haven’t heard about Neptune yet, go here and read about it.

Ok, so let’s get to it, shall we?

Use tags viciously

Adding tags is stupid simple, so not using this feature is really … well, just use it, ok?

You can either add them in the neptune.yaml config file (more info about the config file here):

or add them in your terminal command (more info about the command line interface here):

or even do it in the browser:

Click on the dashboard entry, type your tag and press enter

Hard to find an excuse not to use it, huh?

Now that we have the “how” out of the way, let’s talk about the “what”. What should be added via tags?

The answer is simple: tag with everything that is not a parameter. If you experiment with cyclic learning rates, add a tag for it. If you explore squeeze and excitation blocks, add a tag for it. If you haven’t had your coffee yet, you guessed it, add a tag for it. It could, and likely will, be useful later when you filter and compare experiments. It is just a little work now and can save you a ton of time in the future. Your future self will thank you for it.

Why not tag with parameters? You could, but that information is readily available in columns, so there is simply no need to duplicate it. For more information about using parameters in columns read this post.
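To make the split between parameters and tags concrete, here is a minimal sketch in plain Python. It is not Neptune's API: the Experiment class, the redundant_tags helper, and all tag and parameter names are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    name: str
    params: dict                              # values that would show up as columns
    tags: set = field(default_factory=set)    # everything that is not a parameter


def redundant_tags(experiment):
    """Return tags that merely repeat a parameter name, e.g. 'lr=0.001'."""
    return {tag for tag in experiment.tags if tag.split("=")[0] in experiment.params}


exp = Experiment(
    name="exp-1",
    params={"lr": 0.001, "batch_size": 32},
    tags={"cyclic-lr", "squeeze-excitation", "no-coffee"},
)
exp.tags.add("lr=0.001")                      # duplicates a parameter, so it adds nothing

print(redundant_tags(exp))
```

Tags like cyclic-lr describe the idea behind the run, while lr=0.001 only repeats what the parameter already says.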

Okay, so we’ve got the “how” and the “what”. Let’s dive into the “why”, shall we?

Divide and … filter

Setting up tags and parameters is all well and good, but at the end of the day, you do that to enable easy filtering of your experiments.

Let’s see how to do those things in the Neptune dashboard.

Ad-hoc filtering can be done at the top panel, where you can filter by status, time, or tags. For example, let’s choose the experiments with the loss design tag that either succeeded or were aborted.

Use the top panel for ad-hoc filtering

Cool, we can now compare the results, or go even deeper and explore charts, code and hyperparameters of any given experiment in the group by simply clicking on it.
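If you prefer to think about it in code, the ad-hoc filter above boils down to something like the sketch below. It mimics the idea in plain Python and is not how the Neptune dashboard is implemented; the experiment records, the status names, and the loss-design tag spelling are assumptions for illustration.

```python
# Hypothetical experiment records; in practice these live in the dashboard.
experiments = [
    {"id": "E-1", "status": "succeeded", "tags": {"loss-design"}},
    {"id": "E-2", "status": "aborted", "tags": {"loss-design", "dilation"}},
    {"id": "E-3", "status": "failed", "tags": {"loss-design"}},
    {"id": "E-4", "status": "succeeded", "tags": {"dilation"}},
]


def filter_experiments(experiments, statuses, required_tags):
    """Keep experiments whose status is allowed and that carry all required tags."""
    return [
        exp for exp in experiments
        if exp["status"] in statuses and required_tags <= exp["tags"]
    ]


selected = filter_experiments(
    experiments,
    statuses={"succeeded", "aborted"},
    required_tags={"loss-design"},
)
print([exp["id"] for exp in selected])
```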

Sometimes, however, you may want to create a more advanced filter, something that you would like to save for later use. Those can be defined on the left side in Custom filters.

For example, while working on the Salt-Detection project (by the way, you can explore this public project here), during one of the creative iterations we realized that our model was having a lot of trouble detecting small objects that are close to the border of the image. We explored various ideas for dealing with this issue, like target dilation and a loss function tailored for small objects. Let’s create a filter that shows our efforts in this area:

And there you go. A new custom filter is added. We can always go back to this group of experiments and design new improvement ideas. It’s that simple.
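Conceptually, a custom filter is just a named query that you can rerun whenever you like. A minimal sketch of that idea, assuming made-up tag names (dilation, small-objects-loss) and hypothetical save_filter and apply_filter helpers that have nothing to do with Neptune's internals:

```python
saved_filters = {}   # filter name -> set of tags, kept around for later use


def save_filter(name, any_of_tags):
    """Store a filter that matches experiments carrying at least one of the tags."""
    saved_filters[name] = frozenset(any_of_tags)


def apply_filter(name, experiments):
    """Rerun a saved filter against the current list of experiments."""
    wanted = saved_filters[name]
    return [exp for exp in experiments if wanted & exp["tags"]]


# Hypothetical runs from the small-objects investigation.
runs = [
    {"id": "SAL-1", "tags": {"dilation"}},
    {"id": "SAL-2", "tags": {"small-objects-loss", "cyclic-lr"}},
    {"id": "SAL-3", "tags": {"cyclic-lr"}},
]

save_filter("small objects", {"dilation", "small-objects-loss"})
print([run["id"] for run in apply_filter("small objects", runs)])
```

Because the filter is stored by name, any run tagged later is picked up the next time the filter is applied.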

As you probably noticed, I have a lot of custom filters. That’s because I like to keep my work structured and easy to communicate. It’s like code documentation: if not for other people, do it for your future self. It will save you a lot of wtfs.

Conclusion

Using Neptune tags and filters to structure your experimentation process can help you keep your ideas alive and vibrant. Knowing what you tried and how it worked is crucial in designing model improvements, so tags and filters will probably be some of the Neptune features that you use the most.

P. S.

If you want to follow the project I’ve just shared with you, where we are building a semantic segmentation algorithm for salt detection, just go here.

Source: Deep Learning on Medium