Learn PyTorch Basics

Source: Deep Learning on Medium

Which framework should you use for deep learning? According to my online research, TensorFlow, Keras, and PyTorch are the most popular libraries in the ML community. TensorFlow works better for large-scale implementations, while PyTorch works well for rapid prototyping in research. Both frameworks offer a great deal of flexibility for working directly with the underlying math. Keras, on the other hand, is the easiest to use but not as flexible as TensorFlow or PyTorch.

When I first started learning deep learning, I implemented my very first LSTM text classifier with Keras. Adding new layers to the model took only a few simple lines of code, but there was a steep learning curve in connecting my word embedding input to the model. This classifier is what the team used for detecting gender bias in our first prototype.

When we started our internship at Mila, one of our mentors strongly encouraged us to learn PyTorch, as it is the most popular framework used in the research field. He gave us tutorials on understanding deep learning models and on implementing them in PyTorch. To understand PyTorch better, we decided to create our own tutorials, as teaching is the best way to learn!

In this blog post, I present a brief introduction to the framework and the building blocks of PyTorch needed to build neural network models.

Roadmap for the post

  1. A brief introduction to PyTorch
  2. Understanding Tensors
  3. PyTorch & NumPy Bridge
  4. Basic Tensor Operations

What is PyTorch?

PyTorch is an open-source machine learning library for Python that offers flexibility and speed for the scientific computing behind deep learning. It can also serve as a replacement for NumPy that uses the power of GPUs.

Tensors in PyTorch

A tensor is an n-dimensional data container, similar to NumPy’s ndarray. For example, a 1-d tensor is a vector, a 2-d tensor is a matrix, a 3-d tensor is a cube, and a 4-d tensor is a vector of cubes.

Let’s take a look at some examples of how to create a tensor in PyTorch. To initialize a tensor, we can either assign values directly or set the size of the tensor. torch.Tensor(n, m) creates a tensor of size n x m whose values are uninitialized, so its contents are arbitrary until they are assigned.

import torch
# create a tensor
new_tensor = torch.Tensor([[1, 2], [3, 4]])
# create a 2 x 3 tensor without initializing its values (contents are arbitrary)
empty_tensor = torch.Tensor(2, 3)
# create a 2 x 3 tensor with random values between -1 and 1
uniform_tensor = torch.Tensor(2, 3).uniform_(-1, 1)
# create a 2 x 3 tensor with random values from a uniform distribution on the interval [0, 1)
rand_tensor = torch.rand(2, 3)
# create a 2 x 3 tensor of zeros
zero_tensor = torch.zeros(2, 3)
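
The snippet above relies on torch.Tensor, which always yields the default floating-point type. As a minimal sketch, assuming a reasonably recent PyTorch release, the lowercase torch.tensor factory infers the data type from the values instead, and torch.empty and torch.ones take a shape directly. The variable names here are my own, chosen only for illustration.

```python
import torch

# torch.tensor infers the dtype from the data it is given
int_tensor = torch.tensor([[1, 2], [3, 4]])
float_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# torch.Tensor converts the same integers to the default float type
legacy_tensor = torch.Tensor([[1, 2], [3, 4]])

print(int_tensor.dtype)     # torch.int64
print(float_tensor.dtype)   # torch.float32
print(legacy_tensor.dtype)  # torch.float32

# shape-based factories, as alternatives to torch.Tensor(2, 3)
ones_tensor = torch.ones(2, 3)
uninitialized = torch.empty(2, 3)  # contents are arbitrary until assigned
print(ones_tensor.shape)    # torch.Size([2, 3])
```

The dtype comments assume the default dtype has not been changed with torch.set_default_dtype.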

To access or replace elements in a tensor, use indexing. For example, new_tensor[0][0] returns a tensor object that contains the element at position 0, 0. The underlying Python scalar can be retrieved with .item(). Slicing can also be used to select whole rows and columns of a tensor.

new_tensor = torch.Tensor([[1, 2], [3, 4]])
# replace an element at position 0, 0
new_tensor[0][0] = 5
print(new_tensor) # tensor([[ 5., 2.],[ 3., 4.]])
# access an element at position 1, 0
print(new_tensor[1][0]) # tensor(3.)
print(new_tensor[1][0].item()) # 3.0
## slicing examples
slice_tensor = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# elements from every row, first column
print(slice_tensor[:, 0]) # tensor([ 1., 4., 7.])
# elements from every row, last column
print(slice_tensor[:, -1]) # tensor([ 3., 6., 9.])
# all elements on the second row
print(slice_tensor[1, :]) # tensor([ 4., 5., 6.])
# all elements from first two rows
print(slice_tensor[:2, :]) # tensor([[ 1., 2., 3.],
                           #         [ 4., 5., 6.]])
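
One detail worth knowing about the indexing and slicing described above: to my understanding, a basic slice is a view onto the original tensor rather than a copy, so writing through the slice also changes the original. The sketch below illustrates this with a hypothetical tensor named grid, and also shows the comma form grid[1, 0] as an equivalent to grid[1][0].

```python
import torch

grid = torch.tensor([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])

# grid[1, 0] addresses the same element as grid[1][0]
print(grid[1, 0].item())  # 4.0
print(grid[1][0].item())  # 4.0

# a basic slice is a view: writing through it changes the original tensor
first_column = grid[:, 0]
first_column[0] = 100.0
print(grid[0, 0].item())  # 100.0

# a sub-block: first two rows, last two columns
block = grid[:2, 1:]
print(block.shape)  # torch.Size([2, 2])
```

If an independent copy is needed, calling .clone() on the slice should give one.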

Now, how do we access tensor information? To check the type of a tensor, use .type(). For the shape of a tensor, either .shape or .size() can be used. .dim() returns the number of dimensions of a tensor.

new_tensor = torch.Tensor([[1, 2], [3, 4]])
# type of a tensor
print(new_tensor.type()) # 'torch.FloatTensor'
# shape of a tensor
print(new_tensor.shape) # torch.Size([2, 2])
print(new_tensor.size()) # torch.Size([2, 2])
# dimension of a tensor
print(new_tensor.dim()) # 2
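
As a small sketch of the same inspection calls on a higher-dimensional tensor (the name batch and its shape are assumptions of mine, not from the tutorial), here is how the shape can be queried one axis at a time. It also uses .numel() and .dtype, which are not covered above but sit alongside these attributes.

```python
import torch

# a hypothetical 3-d tensor: 4 blocks of 3 rows and 2 columns
batch = torch.zeros(4, 3, 2)

print(batch.dim())     # 3
print(batch.shape[0])  # 4
print(batch.size(1))   # 3
print(batch.numel())   # 24, the total number of elements
print(batch.dtype)     # torch.float32
```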

To reshape a tensor, use .view(n, m). This returns a tensor with the same data arranged in the shape n x m; the total number of elements must stay the same.

reshape_tensor = torch.Tensor([[1, 2], [3, 4]])
reshape_tensor.view(1,4)   # tensor([[ 1.,  2.,  3.,  4.]])
reshape_tensor.view(4,1) # tensor([[ 1.],[ 2.],[ 3.],[ 4.]])
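
Two properties of .view() are easy to miss, so here is a minimal sketch of them under my own variable names: passing -1 lets PyTorch infer one dimension from the others, and the result shares memory with the original tensor instead of copying it.

```python
import torch

original = torch.tensor([[1., 2.], [3., 4.]])

# -1 asks PyTorch to infer that dimension from the number of elements
flat = original.view(-1)
column = original.view(-1, 1)
print(flat.shape)    # torch.Size([4])
print(column.shape)  # torch.Size([4, 1])

# a view shares storage with the original, so edits are visible in both
flat[0] = 10.0
print(original[0, 0].item())  # 10.0
```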

PyTorch & NumPy Bridge

Sometimes, it is useful to convert a NumPy ndarray to a PyTorch tensor and vice versa. Use torch.from_numpy() when converting from a NumPy ndarray to a PyTorch tensor. Conversely, use .numpy() to convert back to a NumPy array.

import numpy
np_ndarray = numpy.random.randn(2, 2)
# NumPy ndarray to PyTorch tensor
to_tensor = torch.from_numpy(np_ndarray)
# PyTorch tensor to NumPy array
to_ndarray = to_tensor.numpy()
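
A point that follows from this bridge, as far as I understand it: torch.from_numpy() does not copy the data, so the tensor and the array share the same memory, while torch.tensor() makes an independent copy. The sketch below uses made-up names to show the difference.

```python
import numpy as np
import torch

np_array = np.ones((2, 2))

# from_numpy shares memory with the array and keeps its float64 dtype
shared_tensor = torch.from_numpy(np_array)
print(shared_tensor.dtype)  # torch.float64

# changing the array is visible through the tensor
np_array[0, 0] = 5.0
print(shared_tensor[0, 0].item())  # 5.0

# torch.tensor copies the data, so later changes do not propagate
copied_tensor = torch.tensor(np_array)
np_array[0, 0] = 7.0
print(copied_tensor[0, 0].item())  # 5.0
print(shared_tensor[0, 0].item())  # 7.0
```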

Basic Tensor Operations

Here are a few basic examples of tensor operations in PyTorch:

Transpose: .t() or .permute(-1, 0)

# regular transpose function
reshape_tensor.t()
# transpose via permute function
reshape_tensor.permute(-1, 0)
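
To make the two transpose calls concrete, here is a small sketch on a 2 x 3 matrix (the tensor names are hypothetical). It checks that .t() and .permute(1, 0) agree, and shows that .permute() also reorders tensors with more than two dimensions, which .t() does not handle.

```python
import torch

matrix = torch.tensor([[1., 2., 3.], [4., 5., 6.]])  # shape 2 x 3

via_t = matrix.t()
via_permute = matrix.permute(1, 0)  # same as permute(-1, 0) for a 2-d tensor

print(via_t.shape)                      # torch.Size([3, 2])
print(torch.equal(via_t, via_permute))  # True

# permute generalizes to more dimensions, e.g. moving axis 1 to the end
images = torch.zeros(8, 3, 32, 32)
channels_last = images.permute(0, 2, 3, 1)
print(channels_last.shape)  # torch.Size([8, 32, 32, 3])
```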

Cross Product: .cross()

# .cross() needs a dimension of size 3, so use 3 x 3 tensors
tensor_1 = torch.randn(3, 3)
tensor_2 = torch.randn(3, 3)
cross_prod = tensor_1.cross(tensor_2, dim=1)
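
Random inputs make the cross product hard to verify by eye, so here is a sketch with unit vectors whose result is known from geometry: x cross y gives z. Note that the cross product is only defined along a dimension of size 3. The vector names are my own.

```python
import torch

x_axis = torch.tensor([1., 0., 0.])
y_axis = torch.tensor([0., 1., 0.])

# the cross product of the x and y unit vectors is the z unit vector
z_axis = torch.cross(x_axis, y_axis, dim=0)
print(z_axis)  # tensor([0., 0., 1.])

# swapping the operands flips the sign of the result
flipped = torch.cross(y_axis, x_axis, dim=0)
print(torch.equal(flipped, -z_axis))  # True
```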

Matrix Product: .mm()

matrix_prod = tensor_1.mm(tensor_2)

Elementwise Multiplication: .mul()

element_mult = tensor_1.mul(tensor_2)
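
The matrix product and the elementwise product are easy to confuse, so the following sketch runs both on two small matrices with hand-checkable values. The names a and b are illustrative; the @ and * operators are, to my knowledge, shorthand for the same two operations on 2-d tensors.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[5., 6.], [7., 8.]])

# matrix product: rows of a combined with columns of b
matrix_product = a.mm(b)
print(matrix_product.tolist())  # [[19.0, 22.0], [43.0, 50.0]]

# elementwise product: each entry multiplied with its counterpart
elementwise_product = a.mul(b)
print(elementwise_product.tolist())  # [[5.0, 12.0], [21.0, 32.0]]

# operator shorthand for the same two operations
print(torch.equal(a @ b, matrix_product))       # True
print(torch.equal(a * b, elementwise_product))  # True
```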


Tensors can also be used on a GPU that supports CUDA to accelerate computing.

if torch.cuda.is_available():
    tensor_1 = tensor_1.cuda()
    tensor_2 = tensor_2.cuda()
    tensor_1 + tensor_2
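
A common variation on the check above is to pick a device once and reuse it, so the same code runs with or without a GPU. This is only a sketch of that pattern under my own variable names; it falls back to the CPU when CUDA is not available.

```python
import torch

# choose the GPU when CUDA is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# create a tensor directly on the device, or move an existing one with .to()
tensor_a = torch.randn(2, 2, device=device)
tensor_b = torch.randn(2, 2).to(device)

# both operands must live on the same device before they are combined
total = tensor_a + tensor_b
print(total.device.type)  # "cuda" or "cpu", depending on the machine

# move the result back to the CPU before converting it to NumPy
result_array = total.cpu().numpy()
print(result_array.shape)  # (2, 2)
```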

Today, we’ve learned about PyTorch basics through understanding basic PyTorch operations with tensors.

In the next tutorial, we will practice PyTorch with linear models to get more comfortable with the framework.

PyTorch Tutorial Schedule

  1. Learn PyTorch Basics
  2. Practice PyTorch with linear models (linear and logistic regression)
  3. Introduction to Neural Networks with PyTorch
  4. RNN/LSTM Text Classifier
  5. RNN/LSTM Language Model
  6. Exploring Text Classification with CNN
  7. Can Language Model work with CNN?