On Structuring Your Machine Learning Code

Original article was published on Artificial Intelligence on Medium

Structure is essential in any project, and machine learning is no exception. Well-structured machine learning code saves time and has a number of other benefits:

  1. It is self-describing.
  2. It is easier to maintain.
  3. It is easier to communicate with peers.
  4. It is easier to reuse.

In this article, we will discuss how to use LabML Configs to structure our machine learning code, using a simple MNIST example in PyTorch to show how this works.

You can find the complete code at this link.

If you are new to LabML, find the LabML documentation here and check our previous articles on how LabML can help you to organize your machine learning projects.

In this tutorial, our focus is not on the model implementation. Instead, we will discuss how to use LabML Configs to structure our machine learning code. Here is the code for the model implementation.

Configs Class

Now let’s look at how we can use BaseConfigs to define our own configurations.

We have defined the Configs class with a few parameters and their respective types. This class should inherit from BaseConfigs.

If you are not familiar with type hints in Python, check here. Type hints help us write clean, readable code.

We have already defined the values for some parameters, such as epochs, seed, and learning_rate. For the rest, such as model and optimizer, we have only defined the type. The intuition is that we cannot define those with a single line of code, so it is much cleaner to define them separately.
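To make the distinction concrete, here is a minimal plain-Python sketch of the pattern. It does not use LabML, so it is an illustration of the idea rather than the library's API: the default values are made up for the example, and split_fields is a hypothetical helper that only reports which fields have a value and which have just a type.

```python
from typing import Any, Dict, List, get_type_hints


class Configs:
    # Parameters with a value: type and default are declared together.
    # (The values here are illustrative, not taken from the tutorial.)
    epochs: int = 10
    seed: int = 5
    learning_rate: float = 0.01

    # Parameters with only a type: their values are defined separately.
    model: Any
    optimizer: Any


def split_fields(cls: type) -> Dict[str, List[str]]:
    """Hypothetical helper: separate annotated fields by whether a value is set."""
    hints = get_type_hints(cls)
    return {
        "with_value": sorted(name for name in hints if hasattr(cls, name)),
        "type_only": sorted(name for name in hints if not hasattr(cls, name)),
    }


fields = split_fields(Configs)
print(fields)
```

A type-only annotation creates no class attribute, which is why hasattr can tell the two groups apart.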

Adding Separate Configs

We can add the device as a separate method. With the BaseConfigs.calc() decorator, LabML identifies the method and adds it to the configs at run time.

Isn’t that interesting? It is a much cleaner way to separate out lengthy pieces of code. Let’s go ahead and add the model and seed configs in the same way.
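The sketch below shows one way such a decorator could work. It is a simplified stand-in written in plain Python, not LabML's implementation: this BaseConfigs is a hypothetical class, its calc() takes the config name as a string (an assumption made to keep the example short), and the device is a plain string so the example runs without PyTorch.

```python
from typing import Any, Callable, Dict


class BaseConfigs:
    """Hypothetical stand-in for the LabML base class, not its real code."""

    # Maps a config name to the function that computes its value.
    _calculators: Dict[str, Callable[[Any], Any]] = {}

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Give every subclass its own registry, seeded from its parent.
        cls._calculators = dict(cls._calculators)

    @classmethod
    def calc(cls, name: str) -> Callable:
        """Register the decorated function as the way to compute `name`."""
        def decorator(func: Callable[[Any], Any]) -> Callable[[Any], Any]:
            cls._calculators[name] = func
            return func
        return decorator

    def __getattr__(self, name: str) -> Any:
        # Reached only when normal lookup fails, i.e. no value is set yet.
        calculators = type(self)._calculators
        if name not in calculators:
            raise AttributeError(name)
        value = calculators[name](self)
        setattr(self, name, value)  # cache, so the function runs once
        return value


class Configs(BaseConfigs):
    use_cuda: bool = False  # illustrative flag
    device: str             # type only; computed by the function below


@Configs.calc("device")
def device(c: Configs) -> str:
    # A PyTorch project would typically return a torch.device here.
    return "cuda" if c.use_cuda else "cpu"


conf = Configs()
print(conf.device)
```

The value is computed on first access and then cached on the instance, so lengthy setup code lives in its own function and runs only when it is needed.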

Reusable Configs

Reusable configs are great. With LabML Configs, we can easily reuse our existing configs to define new ones.

Here, we have defined LoaderConfigs by inheriting from BaseConfigs.

Next, we have defined the data_loaders method. Finally, we inherit the Configs class from LoaderConfigs instead of BaseConfigs. As a result, all the parameters and definitions of LoaderConfigs are now available inside the Configs class.

Note that we can define LoaderConfigs once and reuse it in any of our projects.
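The inheritance chain described above can be sketched in plain Python as follows. This is an illustration under assumptions, not the tutorial's code: BaseConfigs is an empty stand-in, the batch-size parameters and their values are hypothetical, and data_loaders returns a plain dictionary where a real project would build PyTorch data loaders.

```python
from typing import Dict, List, get_type_hints


class BaseConfigs:
    """Hypothetical stand-in for the LabML base class."""


class LoaderConfigs(BaseConfigs):
    # Data-loading parameters, defined once (names and values are illustrative).
    train_batch_size: int = 64
    test_batch_size: int = 1000

    def data_loaders(self) -> Dict[str, int]:
        # A real project would construct PyTorch DataLoader objects here.
        return {"train": self.train_batch_size, "test": self.test_batch_size}


class Configs(LoaderConfigs):
    # Project-specific parameters; the loader settings are inherited.
    epochs: int = 10
    learning_rate: float = 0.01


conf = Configs()
all_fields: List[str] = sorted(get_type_hints(Configs))
print(all_fields)
print(conf.data_loaders())
```

Because Configs inherits from LoaderConfigs, both the loader parameters and the data_loaders method are available on conf without being redefined.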

Multiple Configs

Consider the following scenarios:

  1. We have multiple configs defined for a parameter because we are not yet sure which one to go with.
  2. We reuse a previously defined Configs class, or we maintain a single Configs class for all our projects, with multiple configs defined for a parameter.

LabML Configs lets us define multiple configs for a parameter and select a specific one.

Let’s explore the idea by using the above example.

Let’s say we want to try both sgd_optimizer and adam_optimizer. We can define both optimizers and select only one of them in the experiment.

We just need to:

  1. Select the preferred optimizer.
  2. Set the optimizer parameter in the Configs class to the name of the selected optimizer method.
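The two steps above can be sketched as a small name-based registry. This is a plain-Python illustration of the selection idea, not LabML's mechanism: option and resolve are hypothetical helpers, and the optimizers are represented as dictionaries so the example runs without PyTorch.

```python
from typing import Any, Callable, Dict

# parameter name -> option name -> function that builds the value
OPTIONS: Dict[str, Dict[str, Callable[[Any], Any]]] = {}


def option(param: str) -> Callable:
    """Hypothetical decorator: register a function as one option for `param`."""
    def decorator(func: Callable[[Any], Any]) -> Callable[[Any], Any]:
        OPTIONS.setdefault(param, {})[func.__name__] = func
        return func
    return decorator


class Configs:
    learning_rate: float = 0.01          # illustrative value
    optimizer: str = "adam_optimizer"    # name of the selected option


@option("optimizer")
def sgd_optimizer(c: Configs) -> Dict[str, Any]:
    return {"type": "SGD", "lr": c.learning_rate}


@option("optimizer")
def adam_optimizer(c: Configs) -> Dict[str, Any]:
    return {"type": "Adam", "lr": c.learning_rate}


def resolve(conf: Configs, param: str) -> Any:
    """Look up the option named by the parameter's value and build it."""
    selected = getattr(conf, param)
    return OPTIONS[param][selected](conf)


conf = Configs()
optimizer = resolve(conf, "optimizer")
print(optimizer)
```

Both optimizers stay defined side by side, and switching between them means changing a single string.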

Model Training

Here is the rest of the code for training our model. We will not discuss this code in the tutorial.

Experiment

We can create an experiment with create().

In calculate_configs(), we need to pass the Configs() object. There are also two optional arguments that we can pass.

  • configs_override: a dictionary of configs to be overridden.

Here we override the optimizer from adam_optimizer to sgd_optimizer.

  • run_order: a list of the configs to be calculated, in the order in which calculate_configs() should calculate them. If not provided, all configs will be calculated.

In this example, set_seed will be calculated first.
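To illustrate how the two optional arguments interact, here is a rough plain-Python sketch. The calculate_configs function below is a hypothetical stand-in that only mimics the behaviour described above; it is not LabML's function and its signature is an assumption. The train config and the seed value are also made up for the example.

```python
from typing import Any, Callable, Dict, List, Optional

Calculator = Callable[[Dict[str, Any]], Any]


def calculate_configs(
    defaults: Dict[str, Any],
    calculators: Dict[str, Calculator],
    configs_override: Optional[Dict[str, Any]] = None,
    run_order: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in: apply overrides, then compute configs in order."""
    values = dict(defaults)
    values.update(configs_override or {})  # overrides win over defaults
    # Without run_order, every registered config is calculated.
    order = list(calculators) if run_order is None else run_order
    for name in order:
        values[name] = calculators[name](values)
    return values


calls: List[str] = []  # records the order in which configs are calculated


def set_seed(values: Dict[str, Any]) -> int:
    calls.append("set_seed")
    return values["seed"]


def train(values: Dict[str, Any]) -> str:
    calls.append("train")
    return f"training with {values['optimizer']}"


result = calculate_configs(
    defaults={"seed": 5, "optimizer": "adam_optimizer"},
    calculators={"train": train, "set_seed": set_seed},
    configs_override={"optimizer": "sgd_optimizer"},
    run_order=["set_seed", "train"],
)
print(calls)
print(result["train"])
```

Even though train is registered first, run_order makes set_seed run before it, and the override replaces adam_optimizer with sgd_optimizer before anything is calculated.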

The following is the output when we run the experiment. LabML Configs prints these logs for you without any additional lines of code.

print of configs on the screen

Here is the complete code for this tutorial.

LabML includes a few other useful modules that help organize our machine learning experiments.

  1. tracker: track model metrics.
  2. monitor: monitor experiments.
  3. experiment: create experiments, save and load model checkpoints.
  4. analytics: Python API to get experiment metrics and plot interactive charts.
  5. logger: nice screen logs.

You can check the full documentation here.