Source: Deep Learning on Medium

## Documenting my fast.ai journey: CODE REVIEW. PYTORCH DEEP DIVE PROJECT. TORCH.NN.MODULES.LINEAR CLASS.

For the Lesson 5 Project, I decided to dive deeper into the implementation of the **torch.nn.modules.linear** class.

We will use the Official PyTorch Documentation as a guide, more than I would like to admit, and cover some concepts we have learned during class.

This is the code: https://gist.github.com/SOVIETIC-BOSS88/1a9fcf31f9d17d756b930e71fec1079b.

### PyTorch Class: torch.nn.modules.linear.

#### Libraries.

Since we will do this in a Jupyter Notebook, first of all we will need to import the following libraries:

```python
import math

import torch
from torch.nn.parameter import Parameter
#from .. import functional as F
#from .module import Module
from torch.nn import functional as F
from torch.nn.modules.module import Module

import numpy as np
```

Note that if we follow the torch documentation, the 3rd and 4th imports can cause the following **Python error**:

```
ValueError: attempted relative import beyond top-level package
```

If we make the imports more explicit, as in the 5th and 6th statements, the error disappears.

#### Source Code.

Having said that, let’s jump straight into the source code:

```python
class Linear(Module):
    r"""Applies a linear transformation to the incoming data: :math:`y = xA^T + b`

    Args:
        in_features: size of each input sample
        out_features: size of each output sample
        bias: If set to ``False``, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input: :math:`(N, *, in\_features)` where :math:`*` means any number of
          additional dimensions
        - Output: :math:`(N, *, out\_features)` where all but the last dimension
          are the same shape as the input.

    Attributes:
        weight: the learnable weights of the module of shape
            `(out_features x in_features)`
        bias: the learnable bias of the module of shape `(out_features)`

    Examples::

        >>> m = nn.Linear(20, 30)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
    """

    def __init__(self, in_features, out_features, bias=True):
        super(Linear, self).__init__()
        #super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None)
```
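Before going through the methods one by one, here is a minimal usage sketch of the class above (assuming the imports and the class definition were run in the same notebook). It mirrors the Examples section of the docstring and also peeks at the range that reset_parameters drew the weights from.

```python
# Minimal sketch: instantiate our reimplemented Linear and check the output shape.
m = Linear(20, 30)                  # 20 input features, 30 output features
input = torch.randn(128, 20)        # a batch of 128 samples
output = m(input)
print(output.size())                # torch.Size([128, 30])

# reset_parameters() drew the weights uniformly from [-stdv, stdv],
# where stdv = 1 / sqrt(in_features) = 1 / sqrt(20) ≈ 0.224
print(m.weight.min().item(), m.weight.max().item())
```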

Here we have 4 function definitions:

- **def __init__**: is just an **__init__**, initializing our variables.
- **def reset_parameters**: re-initializes the **weight** and **bias** tensors uniformly.
- **def forward**: applies the **linear transformation to our data**.
- **def extra_repr**: reports the **inputs**, **outputs**, and whether we have added a **bias** parameter (see the snippet below).
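As a quick illustration of the last point, printing the module is where **extra_repr** shows up: PyTorch's Module repr embeds that string inside the class name. A small sketch, re-creating the layer so it runs on its own:

```python
m = Linear(20, 30)
print(m)                 # Linear(in_features=20, out_features=30, bias=True)
print(m.extra_repr())    # in_features=20, out_features=30, bias=True
```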

Now let’s dig a little deeper into the **1st** and **3rd** function definitions.

#### PyTorch Function: def __init__.

Let’s check our initialization function.

```python
def __init__(self, in_features, out_features, bias=True):
    super(Linear, self).__init__()
    #super().__init__()
    self.in_features = in_features
    self.out_features = out_features
    self.weight = Parameter(torch.Tensor(out_features, in_features))
```

Observe that our **weight** is a tensor whose shape is given by the **output** and **input** sizes that we have specified. To be more specific, **Parameter** is a **Tensor** subclass, which gets added to the **Module**’s parameters.

This is important, since the weight will now appear in the Module’s parameters() function, which returns an iterator that can be passed to an optimizer. For further information on iterators, check out this great Stack Overflow answer.
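A small sketch of what that means in practice, re-using the Linear class defined above and torch.optim.SGD as an example optimizer:

```python
from torch import optim

m = Linear(20, 30)

# named_parameters() / parameters() iterate over the registered Parameters.
for name, p in m.named_parameters():
    print(name, tuple(p.shape))      # weight (30, 20), then bias (30,)

# That same iterator is exactly what an optimizer expects.
optimizer = optim.SGD(m.parameters(), lr=0.01)
```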

#### PyTorch Function: def forward.

Now let’s take a look at our forward function.

```python
def forward(self, input):
    return F.linear(input, self.weight, self.bias)
```

As we can see, it is calling the F.linear function. If we check its source code, we can see that it simply applies a linear transformation to the data we pass to it.

```python
def linear(input, weight, bias=None):
    r"""
    Applies a linear transformation to the incoming data: :math:`y = xA^T + b`.

    Shape:
        - Input: :math:`(N, *, in\_features)` where `*` means any number of
          additional dimensions
        - Weight: :math:`(out\_features, in\_features)`
        - Bias: :math:`(out\_features)`
        - Output: :math:`(N, *, out\_features)`
    """
    if input.dim() == 2 and bias is not None:
        # fused op is marginally faster
        return torch.addmm(bias, input, weight.t())

    output = input.matmul(weight.t())
    if bias is not None:
        output += bias
    return output
```

First, it checks the **dimensions** of the **input tensor**.

If the dimensions are **equal to 2**, i.e., if we are dealing with a **matrix**, and a **bias** is present, we can apply our linear transformation directly with the torch.addmm function,

```
torch.addmm(beta=1, mat, alpha=1, mat1, mat2, out=None) → Tensor
```

which grabs the **second (input)** and **third (weight.t())** matrices that we passed to it and performs a matrix multiply. The product we obtain is then added to the **first matrix (bias)**.
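A quick sketch to convince ourselves of that argument order, using small random tensors rather than a trained layer:

```python
import torch

input = torch.randn(128, 20)          # (N, in_features)
weight = torch.randn(30, 20)          # (out_features, in_features)
bias = torch.randn(30)                # (out_features,)

# torch.addmm(bias, input, weight.t()) is bias + input @ weight.t()
fused = torch.addmm(bias, input, weight.t())
manual = input.matmul(weight.t()) + bias
print(torch.allclose(fused, manual))  # True
```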

Otherwise, we perform a simple matrix multiplication with the torch.matmul function,

```
torch.matmul(tensor1, tensor2, out=None) → Tensor
```

which computes the product of our **input tensor** and the **transposed weight tensor**. Finally, we add the **bias tensor** to our output.
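Finally, a short sketch of this non-fused path with a 3-dimensional input (where the addmm shortcut does not apply), checking that it matches F.linear:

```python
import torch
from torch.nn import functional as F

input = torch.randn(4, 128, 20)       # (batch, seq, in_features): dim() == 3
weight = torch.randn(30, 20)
bias = torch.randn(30)

output = input.matmul(weight.t())     # shape (4, 128, 30)
output += bias                        # bias broadcasts over the last dimension
print(torch.allclose(output, F.linear(input, weight, bias)))  # True
```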