FAST.AI JOURNEY: PART 1. LESSON 5.

Documenting my fast.ai journey: CODE REVIEW. PYTORCH DEEP DIVE PROJECT. TORCH.NN.MODULES.LINEAR CLASS.

For the Lesson 5 Project, I decided to dive deeper into the implementation of the torch.nn.modules.linear class.

We will use the Official PyTorch Documentation as a guide, more than I would like to admit, and cover some concepts we have learned during class.

This is the code: https://gist.github.com/SOVIETIC-BOSS88/1a9fcf31f9d17d756b930e71fec1079b.

Courtesy of: https://pytorch.org/.

PyTorch Class: torch.nn.modules.linear.

Libraries.

Since we will do this in a Jupyter Notebook, first of all we will need to import the following libraries:

import math

import torch
from torch.nn.parameter import Parameter
#from .. import functional as F              # relative import, as in the torch source
#from .module import Module                  # relative import, as in the torch source
from torch.nn import functional as F         # explicit equivalent
from torch.nn.modules.module import Module   # explicit equivalent
import numpy as np

Note that if we copy the imports straight from the torch documentation, the 3rd and 4th statements (the relative imports, left commented out above) cause the following Python error:

ValueError: attempted relative import beyond top-level package

If we make the imports more explicit, as in the 5th and 6th statements, the error disappears.

Source Code.

Courtesy of: https://code.fb.com/ai-research/announcing-pytorch-1-0-for-both-research-and-production/.

Having said that let’s jump straight into the source code:

class Linear(Module):
    r"""Applies a linear transformation to the incoming data: :math:`y = xA^T + b`

    Args:
        in_features: size of each input sample
        out_features: size of each output sample
        bias: If set to False, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input: :math:`(N, *, in\_features)` where :math:`*` means any number of
          additional dimensions
        - Output: :math:`(N, *, out\_features)` where all but the last dimension
          are the same shape as the input.

    Attributes:
        weight: the learnable weights of the module of shape
            `(out_features x in_features)`
        bias: the learnable bias of the module of shape `(out_features)`

    Examples::

        >>> m = nn.Linear(20, 30)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
    """

    def __init__(self, in_features, out_features, bias=True):
        super(Linear, self).__init__()
        #super().__init__()

        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.Tensor(out_features, in_features))

        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None)

Here we have 4 function definitions.

  1. def __init__: The first is the constructor, which initializes our variables.
  2. def reset_parameters: Next, we reset our parameters, namely the weight and the bias.
  3. def forward: In the third function we apply the linear transformation to our data.
  4. def extra_repr: Finally, we output a representation of the inputs, the outputs, and whether we have added a bias parameter.
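To see all four pieces working together, here is a minimal sanity check (the sizes follow the docstring example above):

import torch
import torch.nn as nn

m = nn.Linear(20, 30)         # __init__ allocates the parameters,
                              # reset_parameters() initializes them
input = torch.randn(128, 20)
output = m(input)             # forward() applies y = xA^T + b
print(output.size())          # torch.Size([128, 30])
print(m)                      # uses extra_repr:
                              # Linear(in_features=20, out_features=30, bias=True)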

Now let’s dig a little deeper into the 1st and 3rd function definitions.

PyTorch Function: def __init__.

Let’s check our initialization function.

def __init__(self, in_features, out_features, bias=True):
    super(Linear, self).__init__()
    #super().__init__()

    self.in_features = in_features
    self.out_features = out_features
    self.weight = Parameter(torch.Tensor(out_features, in_features))

Observe that our weight is a Parameter built from the out_features and in_features sizes that we have specified. To be more specific, Parameter is a Tensor subclass, which gets automatically added to the Module’s parameters.

This is important, since the weight will now appear in the Module’s parameters() function, which returns an iterator that will be passed to an optimizer. For further information on iterators check out this great stackoverflow answer.
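As a quick sketch of what that means in practice (the learning rate here is an arbitrary placeholder):

import torch
import torch.nn as nn

m = nn.Linear(20, 30)

# parameters() yields the weight and the (optional) bias.
for p in m.parameters():
    print(type(p).__name__, tuple(p.shape))
# Parameter (30, 20)
# Parameter (30,)

# The same iterator is what we hand to an optimizer.
optimizer = torch.optim.SGD(m.parameters(), lr=0.01)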

PyTorch Function: def forward.

Now let’s take a look at our forward function.

def forward(self, input):
    return F.linear(input, self.weight, self.bias)

As we can see, it simply calls the F.linear function. If we check its source code, we can see that it applies a linear transformation to the data we pass to it.

def linear(input, weight, bias=None):
    r"""
    Applies a linear transformation to the incoming data: :math:`y = xA^T + b`.

    Shape:
        - Input: :math:`(N, *, in\_features)` where `*` means any number of
          additional dimensions
        - Weight: :math:`(out\_features, in\_features)`
        - Bias: :math:`(out\_features)`
        - Output: :math:`(N, *, out\_features)`
    """
    if input.dim() == 2 and bias is not None:
        # fused op is marginally faster
        return torch.addmm(bias, input, weight.t())

    output = input.matmul(weight.t())
    if bias is not None:
        output += bias
    return output

First, it checks the dimensions of the input tensor.

If the dimensions are equal to 2, i.e., if we are dealing with a matrix, we can apply our linear transformation directly, with the torch.addmm function,

torch.addmm(beta=1, mat, alpha=1, mat1, mat2, out=None) → Tensor

which multiplies the second (input) and third (weight.t()) matrices that we passed to it. The product we obtain is then added to the first matrix (bias).
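To make the fused path concrete, here is a small sketch with made-up toy sizes:

import torch

input = torch.randn(4, 3)    # a 2-D input: (N, in_features)
weight = torch.randn(2, 3)   # (out_features, in_features)
bias = torch.randn(2)        # (out_features,)

# One fused call: bias + input @ weight.t()
fused = torch.addmm(bias, input, weight.t())
print(fused.shape)           # torch.Size([4, 2])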

Otherwise, we can perform a simple matrix multiplication, with the torch.matmul function,

torch.matmul(tensor1, tensor2, out=None) → Tensor

which computes the product of our input tensor and the transposed weight tensor. And finally, we add the bias tensor to our output.
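To close, a quick check (with the same made-up toy sizes as before) that the two paths agree:

import torch

input = torch.randn(4, 3)
weight = torch.randn(2, 3)
bias = torch.randn(2)

# Non-fused path: a plain matmul followed by a broadcasted bias add.
output = input.matmul(weight.t())
output += bias

# For a 2-D input this matches the fused torch.addmm path.
fused = torch.addmm(bias, input, weight.t())
print(torch.allclose(fused, output))  # True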