PyTorch — A General Introduction

Source: Deep Learning on Medium

PyTorch is an open-source deep learning framework developed by Facebook. It is interactive in the same way Python is, and it is becoming very popular in the deep learning community. It is easy to learn and easy to use.

It has strong GPU support, so models can run faster. The basic building blocks of PyTorch are tensors, which are very similar to NumPy arrays. Tensor multiplication becomes very fast on a GPU (Graphics Processing Unit).
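As a minimal sketch of what GPU support looks like in practice: the snippet below picks a GPU if PyTorch can see one and otherwise falls back to the CPU. The tensor names and the 512 by 512 size are illustrative choices, not something from a specific model.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create two random matrices directly on the chosen device.
a = torch.rand(512, 512, device=device)
b = torch.rand(512, 512, device=device)

# The multiplication runs on whichever device holds the operands.
c = a @ b
print(c.device, c.shape)
```

The same code runs unchanged on a machine without a GPU, which is why the device is usually chosen once at the top of a script.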

PyTorch already ships with many algorithms implemented, which makes it easy to apply them to any dataset.

Matrix Multiplication

Matrix multiplication is central to neural networks. We can say that it is the basis of a neural network, since at every step we need to multiply tensors, and a matrix is simply a two-dimensional tensor.
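To make this concrete, here is a small sketch showing that a fully connected layer is a matrix multiplication plus a bias. The sizes (4 samples, 3 inputs, 2 outputs) are assumed purely for illustration.

```python
import torch

torch.manual_seed(0)

batch = torch.rand(4, 3)        # 4 samples, 3 input features each
layer = torch.nn.Linear(3, 2)   # maps 3 features to 2 outputs

# The layer's output...
out_layer = layer(batch)
# ...matches an explicit matrix multiplication with the weights, plus the bias.
out_manual = torch.matmul(batch, layer.weight.T) + layer.bias

print(torch.allclose(out_layer, out_manual, atol=1e-6))
```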

PyTorch Comparison with NumPy

import torch
import numpy as np

print("Tensor :", "\n")
torch.tensor([[2, 3, 5], [1, 2, 9]])

Tensor :
tensor([[2, 3, 5],
        [1, 2, 9]])

print("Numpy array :", "\n")
np.array([[2, 3, 5], [1, 2, 9]])

Numpy array :
array([[2, 3, 5],
       [1, 2, 9]])

  • A tensor and a NumPy array look similar.
  • Let's generate some random arrays.
torch.rand(2, 2)

tensor([[0.7703, 0.4446],
        [0.0589, 0.2118]])

np.random.rand(2, 2)

array([[0.56424161, 0.08179224],
       [0.57521072, 0.74610146]])
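
The similarity goes beyond appearance: the two types can be converted into each other. The following is a small sketch with illustrative variable names, using `torch.from_numpy` and `Tensor.numpy()`.

```python
import numpy as np
import torch

array = np.array([[2, 3, 5], [1, 2, 9]])

# NumPy -> PyTorch: from_numpy shares memory with the source array.
tensor = torch.from_numpy(array)

# PyTorch -> NumPy: .numpy() works for tensors that live on the CPU.
back = tensor.numpy()

# Because memory is shared, an in-place change to the array shows up in the tensor.
array[0, 0] = 100
print(tensor[0, 0].item())  # 100
```

If an independent copy is wanted instead, `torch.tensor(array)` copies the data.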

Let’s look at some code in action.

Before that, a definition of a tensor: "In mathematics, a tensor is an algebraic object that describes a linear mapping from one set of algebraic objects to another. Objects that tensors may map between include, but are not limited to, vectors and scalars, and, recursively, even other tensors." (Source: Wikipedia)

Creating tensors in PyTorch

Random tensors are important in developing and training neural networks. The parameters of a neural network are mostly initialized with random weights, which are themselves (random) tensors.
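
As a rough sketch of such an initialization: the layer sizes and the 0.01 scaling factor below are assumptions made for illustration, not a recommended recipe.

```python
import torch

# Fix the seed so the "random" weights are reproducible between runs.
torch.manual_seed(42)

# Hypothetical layer sizes, chosen only for illustration.
n_inputs, n_outputs = 4, 3

# Weights drawn from a standard normal distribution, scaled down to stay small.
weights = torch.randn(n_inputs, n_outputs) * 0.01
# Biases are commonly started at zero.
bias = torch.zeros(n_outputs)

print(weights.shape, bias.shape)
```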

Let us start building tensors in PyTorch. Tensors are arrays with an arbitrary number of dimensions, corresponding to NumPy ndarrays. Let us create a random tensor of size 3 by 3, store it in the variable first_tensor, store its shape in the variable tensor_size, and print both values.


  • Import PyTorch main library.
  • Create the variable first_tensor and set it to a random torch tensor of size 3 by 3.
  • Calculate its shape (dimension sizes) and set it to variable tensor_size.
  • Print the values of first_tensor and tensor_size.
# Import torch
import torch

# Create random tensor of size 3 by 3
first_tensor = torch.rand(3, 3)

# Calculate the shape of the tensor
tensor_size = first_tensor.shape

# Print the values of the tensor and its shape
print(first_tensor)
print(tensor_size)

tensor([[0.1159, 0.4619, 0.7615],
        [0.4279, 0.0205, 0.2362],
        [0.0089, 0.1945, 0.1318]])
torch.Size([3, 3])

Matrix multiplication

Matrices are the building blocks of tensors. Let’s explore some important matrices:

  • matrices of ones, where each entry is set to 1, and
  • the identity matrix, where the diagonal is set to 1 while all other values are 0.

The identity matrix is very important in linear algebra: any matrix multiplied by the identity matrix yields the original matrix.

Let us experiment with these two types of matrices. First build a matrix of ones with shape 3 by 3 called tensor_of_ones and an identity matrix of the same shape, called identity_tensor.

Let's see what happens when we apply matrix multiplication, as well as element-wise multiplication, to these two matrices.


  • Create a matrix of ones with shape 3 by 3, store it in variable tensor_of_ones.
  • Create an identity matrix with shape 3 by 3, store it in variable identity_tensor.
  • Perform matrix multiplication of tensor_of_ones with identity_tensor and print its value.
  • Perform an element-wise multiplication of tensor_of_ones with identity_tensor and print its value.
# Create a matrix of ones with shape 3 by 3
tensor_of_ones = torch.ones(3, 3)
print(" Matrix of ones : \n", tensor_of_ones, "\n")

# Create an identity matrix with shape 3 by 3
identity_tensor = torch.eye(3)
print("Matrix of eye :\n", identity_tensor, "\n")

# Do a matrix multiplication of tensor_of_ones with identity_tensor
matrices_multiplied = torch.matmul(tensor_of_ones, identity_tensor)
print("Matrix Multiplication : " "\n", matrices_multiplied, "\n")

# Do an element-wise multiplication of tensor_of_ones with identity_tensor
element_multiplication = tensor_of_ones * identity_tensor
print("Element wise multiplication of matrix :\n", element_multiplication)

Matrix of ones :
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

Matrix of eye :
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

Matrix Multiplication :
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

Element wise multiplication of matrix :
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
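
With a matrix of ones and an identity matrix, the two results are easy to confuse with their inputs. A small sketch with two arbitrary 2 by 2 matrices (values chosen here only for illustration) makes the difference between the two kinds of multiplication easier to see:

```python
import torch

A = torch.tensor([[1., 2.], [3., 4.]])
B = torch.tensor([[5., 6.], [7., 8.]])

# Matrix multiplication: each entry is a row of A dotted with a column of B.
matmul_result = torch.matmul(A, B)
print(matmul_result)       # tensor([[19., 22.], [43., 50.]])

# Element-wise multiplication: entries at the same position are multiplied.
elementwise_result = A * B
print(elementwise_result)  # tensor([[ 5., 12.], [21., 32.]])
```

For example, the top-left entry of the matrix product is 1·5 + 2·7 = 19, while the top-left entry of the element-wise product is just 1·5 = 5.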

Let's talk about neural networks and their components

Forward pass

The forward pass is an important step in any neural network. It is also called forward propagation.

Let’s build something that more closely resembles a neural network. The computational graph is given below. You are going to initialize 3 large random tensors and then perform the operations given in the computational graph. The final operation is the mean of the tensor, given by torch.mean(your_tensor).


  • Initialize random tensors x, y and z, each having shape (1000, 1000).
  • Matrix-multiply x with y, putting the result in tensor q.
  • Do an element-wise multiplication of tensor z with tensor q, putting the result in f.
  • Calculate the mean of f, putting the result in mean_f.
# Initialize tensors x, y and z
x = torch.rand(1000, 1000)
y = torch.rand(1000, 1000)
z = torch.rand(1000, 1000)

# Matrix-multiply x with y
q = torch.matmul(x, y)

# Multiply elementwise z with q
f = q * z
mean_f = torch.mean(f)

# Check whether the order of the operands matters for elementwise multiplication
# Multiply elementwise again, with the operands swapped
f1 = z * q
mean_f1 = torch.mean(f1)
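
The code above computes f and f1 but never compares them. As a sketch of how that check could be done (with smaller 100 by 100 tensors, an arbitrary choice to keep it fast), the following also contrasts the result with matrix multiplication, where the order does matter:

```python
import torch

torch.manual_seed(0)
x = torch.rand(100, 100)
y = torch.rand(100, 100)
z = torch.rand(100, 100)

q = torch.matmul(x, y)

# Elementwise multiplication is commutative: q*z and z*q are identical.
same_elementwise = torch.equal(q * z, z * q)

# Matrix multiplication is not: x@y and y@x generally differ.
same_matmul = torch.allclose(torch.matmul(x, y), torch.matmul(y, x))

print(same_elementwise, same_matmul)
```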

Backward pass

The backward pass, also known as backward propagation (backpropagation), is another important step in a neural network. It computes the gradient of the loss with respect to each weight; an optimizer then uses these gradients to update the weights of the network.
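
To show how the gradients feed into optimization, here is a minimal sketch of plain gradient descent on a single weight. The loss (w - 3)^2, the learning rate and the number of steps are all assumptions made for illustration.

```python
import torch

# A single hypothetical weight; the loss (w - 3)^2 is minimal at w = 3.
w = torch.tensor(0.0, requires_grad=True)
learning_rate = 0.1

for step in range(50):
    loss = (w - 3.0) ** 2
    loss.backward()                  # backward pass: fills w.grad with d(loss)/dw
    with torch.no_grad():
        w -= learning_rate * w.grad  # optimization step: move against the gradient
    w.grad.zero_()                   # gradients accumulate, so reset them

print(w.item())  # close to 3.0
```

In real training code, the manual update is usually replaced by an optimizer from `torch.optim`, but the division of labour is the same: `backward()` computes gradients, the optimizer applies them.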

Given the computational graph above, we want to calculate the derivatives for the leaf nodes (x, y and z). To get you started, we have already calculated the results of the forward pass (in red), as well as the derivatives of f and q.

The rules for derivative computations have been given in the table below:
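
As a sketch, assuming the graph in question is the one from the PyTorch exercise that follows (q = x + y and f = q·z), the rules needed are the sum rule, the product rule and the chain rule. Applied to that graph they give:

```latex
\begin{aligned}
\frac{\partial f}{\partial q} &= z, &
\frac{\partial f}{\partial z} &= q, \\
\frac{\partial q}{\partial x} &= 1, &
\frac{\partial q}{\partial y} &= 1, \\
\frac{\partial f}{\partial x} &= \frac{\partial f}{\partial q}\,\frac{\partial q}{\partial x} = z, &
\frac{\partial f}{\partial y} &= \frac{\partial f}{\partial q}\,\frac{\partial q}{\partial y} = z.
\end{aligned}
```

With x = 4, y = -3 and z = 5, this gives q = 1 and f = 5, so the derivatives with respect to x and y are both 5 and the derivative with respect to z is 1.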

Backpropagation using PyTorch

Here, you are going to use PyTorch's automatic differentiation to compute the derivatives of f with respect to x, y and z from the previous exercise.


  • Initialize tensors x, y and z to values 4, -3 and 5.
  • Put the sum of tensors x and y in q, put the product of q and z in f.
  • Calculate the derivatives of the computational graph.
  • Print the gradients of the x, y and z tensors.
# Initialize x, y and z to values 4, -3 and 5
x = torch.tensor(4., requires_grad=True)
y = torch.tensor(-3., requires_grad=True)
z = torch.tensor(5., requires_grad=True)

# Set q to sum of x and y, set f to product of q with z
q = x + y
f = q * z

# Compute the derivatives
f.backward()

# Print the gradients
print("Gradient of x is: " + str(x.grad))
print("Gradient of y is: " + str(y.grad))
print("Gradient of z is: " + str(z.grad))

Gradient of x is: tensor(5.)
Gradient of y is: tensor(5.)
Gradient of z is: tensor(1.)
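
One way to gain confidence in these numbers is to compare them with a numerical estimate. The sketch below is an optional sanity check rather than part of the exercise: the helper f_value and the step size eps are illustrative choices, and the gradient for x is compared against a central finite difference.

```python
import torch

def f_value(x, y, z):
    # Same graph as above: q = x + y, f = q * z
    return (x + y) * z

x = torch.tensor(4.0, requires_grad=True)
y = torch.tensor(-3.0, requires_grad=True)
z = torch.tensor(5.0, requires_grad=True)

# Autograd gradients
f_value(x, y, z).backward()

# Central finite difference in x, done with plain Python floats
eps = 1e-3
numeric_dx = (f_value(4.0 + eps, -3.0, 5.0) - f_value(4.0 - eps, -3.0, 5.0)) / (2 * eps)

print(x.grad.item(), numeric_dx)
```

Both values should agree up to floating-point rounding.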