Using Custom PyTorch Modules in fastai

Source: Deep Learning on Medium


I’m several months deep into my deep learning journey, and I’ve been taking the excellent fastai course. This is the first in a series of posts aimed at beginners (like me!) that explore common stumbling blocks.

So let’s say you’re an aspiring young deep learning practitioner. You’ve been following along with the fastai course, trying your hand at some Kaggle competitions, and just generally having a blast. You’ve been mostly living (quite comfortably!) inside the fastai library so far, but you want to implement a few neural nets of your own, from scratch, to better understand how they work.

That’s exactly what we’re going to do in this post — move beyond using the default fastai modules, and see how we can easily swap in a custom model from PyTorch — while keeping all of the fastai data handling and training goodness.

1. Preparing the data

To keep things simple, we’re going to have our model predict a simple linear fit. We’ll feed our model x and let it predict y. Our training data will roughly fit y = mx + b, so in essence our model will be learning m and b.

Let’s generate that data, and stick it in a Pandas DataFrame:
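As a minimal sketch of this step, the snippet below draws noisy points around a line and puts them in a DataFrame. The slope, intercept, noise level, sample count and column names ("x", "y") are assumptions chosen for illustration; the post does not state the values it used.

```python
import numpy as np
import pandas as pd

# Hypothetical "true" parameters (not taken from the post).
M_TRUE, B_TRUE = 2.0, 1.0


def generate_linear_data(n=200, m=M_TRUE, b=B_TRUE, noise_std=0.5, seed=42):
    """Return a DataFrame with columns x and y, where y = m*x + b + Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, size=n)
    y = m * x + b + rng.normal(0.0, noise_std, size=n)
    return pd.DataFrame({"x": x, "y": y})


df = generate_linear_data()
print(df.head())
```

Fixing the seed keeps the data reproducible between runs, which makes it easier to compare models later on.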

Let’s take a look at what we’ve got:

Next, let’s create a fastai TabularDataBunch from that DataFrame:
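One way this could look, assuming fastai v1 and its TabularDataBunch.from_df factory. The helper names (make_dataframe, validation_indices, make_databunch), the 80/20 split and the data parameters are hypothetical. fastai is imported inside the function so that the data preparation part runs even where fastai v1 is not installed.

```python
import numpy as np
import pandas as pd


def make_dataframe(n=200, m=2.0, b=1.0, noise_std=0.5, seed=42):
    """Noisy samples of y = m*x + b (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, size=n)
    y = m * x + b + rng.normal(0.0, noise_std, size=n)
    return pd.DataFrame({"x": x, "y": y})


def validation_indices(n_rows, valid_frac=0.2, seed=0):
    """Pick a random subset of row indices to hold out for validation."""
    rng = np.random.default_rng(seed)
    n_valid = int(round(n_rows * valid_frac))
    return sorted(rng.choice(n_rows, size=n_valid, replace=False).tolist())


def make_databunch(df, valid_idx, bs=64):
    """Build a TabularDataBunch with one continuous input and no categoricals."""
    # Assumption: fastai v1 is installed; imported lazily on purpose.
    from fastai.tabular import TabularDataBunch

    return TabularDataBunch.from_df(
        path=".",          # working directory fastai uses for any saved files
        df=df,
        dep_var="y",       # the column we want to predict
        valid_idx=valid_idx,
        cat_names=[],      # no categorical variables
        cont_names=["x"],  # a single continuous variable
        bs=bs,
    )


df = make_dataframe()
valid_idx = validation_indices(len(df))
```

With fastai v1 available, data = make_databunch(df, valid_idx) would then give the DataBunch used in the rest of the post.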

2. Creating our PyTorch module

Instead of using the default fastai TabularModel, we’re going to create our own (similar) model in PyTorch.

Here’s what it will look like:

Linear Input -> ReLU -> BatchNorm -> Linear Output

In code, that’s just:
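A minimal sketch of that architecture in PyTorch follows. The constructor argument names (n_in, n_hidden, n_out) are hypothetical, and the forward signature assumes fastai v1 calls the model with the categorical tensor first and the continuous tensor second.

```python
import torch
from torch import nn


class LinearRegressionModel(nn.Module):
    """Linear -> ReLU -> BatchNorm -> Linear, for a single continuous input."""

    def __init__(self, n_in=1, n_hidden=10, n_out=1):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_in, n_hidden),
            nn.ReLU(inplace=True),
            nn.BatchNorm1d(n_hidden),
            nn.Linear(n_hidden, n_out),
        )

    def forward(self, x_cat, x_cont):
        # x_cat is accepted to match the call signature but is not used,
        # because this data has no categorical variables.
        return self.layers(x_cont)


model = LinearRegressionModel()

# Quick shape check with a fake batch of 8 rows.
x_cat = torch.zeros(8, 1, dtype=torch.long)
x_cont = torch.randn(8, 1)
out = model(x_cat, x_cont)
print(out.shape)
```

The fake batch is only there to confirm that the model accepts two inputs and returns one prediction per row.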

If you’re not that familiar with PyTorch, this is a great by-example primer.

A few things to note:

  • fastai passes both x_cat and x_cont to the model, so our forward method needs to accept both as arguments. We’re just ignoring x_cat (it’s a dummy tensor of zeroes).
  • nn.Sequential: chains layers sequentially, which is a nice way to avoid nesting(lots(of(functions))) in forward()
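To illustrate the nn.Sequential point, here is a small sketch comparing a hand-nested call with the chained version. The layer sizes mirror the model described in this post; the variable names are just for illustration. The layers are put in eval mode so that BatchNorm uses its fixed running statistics and both calls are directly comparable.

```python
import torch
from torch import nn

torch.manual_seed(0)

linear_in = nn.Linear(1, 10)
relu = nn.ReLU()
bn = nn.BatchNorm1d(10)
linear_out = nn.Linear(10, 1)

# Chain the same layer objects; weights are shared with the variables above.
layers = nn.Sequential(linear_in, relu, bn, linear_out)
layers.eval()  # BatchNorm now uses running stats instead of batch stats

x = torch.randn(8, 1)

with torch.no_grad():
    nested = linear_out(bn(relu(linear_in(x))))  # the hard-to-read way
    chained = layers(x)                          # the nn.Sequential way

print(torch.allclose(nested, chained))
```

Both paths run the same layers in the same order, so the only difference is readability.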

Let’s create an instance of our model, and see what it looks like:

LinearRegressionModel(
(layers): Sequential(
(0): Linear(in_features=1, out_features=10, bias=True)
(1): ReLU(inplace)
(2): BatchNorm1d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Linear(in_features=10, out_features=1, bias=True)
)
)

Just for kicks, let’s compare this to what fastai would normally give us when we ask for a TabularModel:

TabularModel(
(embeds): ModuleList()
(emb_drop): Dropout(p=0.0)
(bn_cont): BatchNorm1d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(layers): Sequential(
(0): Linear(in_features=1, out_features=10, bias=True)
(1): ReLU(inplace)
(2): BatchNorm1d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Linear(in_features=10, out_features=1, bias=True)
)
)

Except for the embedding layers and dropout (which we left off as we don’t have any categorical variables) and an extra batchnorm layer (which we left off for simplicity), it looks identical. Nice!

3. Creating our Learner

Next, let’s train our model and get some results, to make sure everything is working properly.

The fastai Learner class ties together some data and a model. We can simply create a new Learner with our TabularDataBunch and our custom LinearRegressionModel, like so:

learn = Learner(data, model)

and finally, we can train our model as we normally would:

learn.fit_one_cycle(10, 1e-1)

After 10 epochs we get a validation loss of 0.25, which seems pretty decent. Let’s plot the predictions vs. targets in our validation set:

Wrapping up

Even though this was an ultra-simplified example, you should now be comfortable creating your own models in PyTorch and plugging them into your data manipulation and training pipeline in fastai.

So the next time you see a really interesting technique in a paper or a blog post, don’t wait for it to be added to fastai — try implementing it yourself!

-alonso