Getting started with deep learning with PyTorch

In this post, I will take you through basic model building, training, and testing with PyTorch.

PyTorch is a deep learning framework built by Facebook.

I will use the very popular MNIST dataset and classify images of handwritten digits (0–9). This is basically the “hello, world” of deep learning.

Rather than just dumping the full code, I will walk you through the important parts of the process, which are the fundamental building blocks of any neural network project:

  1. Data loading
  2. Defining model
  3. Set loss function and optimizer
  4. Training

The interesting thing is that, starting from the very first step, PyTorch provides APIs to easily load the data and define the model, plus out-of-the-box functions for optimization and loss calculation.

Let’s quickly start with Data Loading

PyTorch provides the torchvision package, which contains popular datasets and model architectures for computer vision. We will download the MNIST dataset from torchvision. It’s as easy as a couple of lines:

from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor

# transform=ToTensor() converts the PIL images to 1x28x28 tensors scaled to [0, 1]
trainset = MNIST(root='~/datasets/', train=True, download=True, transform=ToTensor())
testset = MNIST(root='~/datasets/', train=False, download=True, transform=ToTensor())

root: where to download and store the data
train: whether to load the training split or the test split
download: downloads the data if it is not already present locally
transform: applied to every image; ToTensor() is needed so the images come out as tensors rather than PIL images
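As a quick sanity check (a minimal sketch; the sample index is just for illustration), you can pull a single sample out of the dataset and look at its shape:

# inspect one training sample (illustrative)
img, label = trainset[0]    # with ToTensor(), img is a 1x28x28 tensor with values in [0, 1]
print(img.shape, label)     # torch.Size([1, 28, 28]) and an integer digit label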

Defining model

PyTorch’s core library, torch, provides the nn.Module class, which we can subclass to define the layers of our neural network, as below:

from torch import nn
import torch.nn.functional as F

class MNIST_NN(nn.Module):
    def __init__(self):
        super().__init__()
        self.pool = nn.MaxPool2d(3, stride=2)
        self.relu = nn.ReLU()
        self.conv1 = nn.Conv2d(1, 16, (3, 3), padding=1)
        self.conv2 = nn.Conv2d(16, 32, (3, 3), padding=1)
        self.fc1 = nn.Linear(32 * 6 * 6, 240)
        self.fc2 = nn.Linear(240, 120)
        self.fc3 = nn.Linear(120, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.pool(x)
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 32 * 6 * 6)   # flatten before the fully connected layers
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = MNIST_NN()  # initialize the neural network model

torch has mostly self-explanatory classes and methods:
nn.Conv2d is a 2D convolution layer; it takes (input channels, output channels, kernel size, padding) as arguments
nn.Linear is a fully connected layer; it takes (number of input features, number of output features) as arguments
view works like numpy’s reshape function
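If you are wondering where the 32 * 6 * 6 comes from, a quick way to check it (a small sketch, not part of the model code) is to push a dummy input through the convolution and pooling layers and inspect the shape:

import torch

# trace the spatial size with a dummy 1x1x28x28 input
dummy = torch.zeros(1, 1, 28, 28)
x = net.pool(net.relu(net.conv1(dummy)))   # 28x28 -> 13x13 (3x3 max-pool, stride 2)
x = net.pool(net.relu(net.conv2(x)))       # 13x13 -> 6x6
print(x.shape)                             # torch.Size([1, 32, 6, 6]) -> flattened to 32 * 6 * 6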

**Most important**
In PyTorch we just initialize the layers in the __init__ method and define the forward pass in the forward method. We don’t have to write any backpropagation or gradient-calculation code ourselves. Based on the forward pass, PyTorch (through autograd) internally builds the computation graph and calculates the gradients, which we will see shortly while training.
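To see autograd in isolation (a tiny standalone sketch, unrelated to the MNIST model), you can mark a tensor with requires_grad and call backward():

import torch

# y = x^2, so dy/dx = 2x; autograd computes this for us
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
y.backward()
print(x.grad)   # tensor(6.)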

Before we move on to training the model, we need a way to divide the whole dataset into mini-batches and feed them to the network. For this task, PyTorch provides the DataLoader class, which takes the full dataset and returns mini-batches based on the batch_size value.

from torch.utils.data import DataLoader
trainloader = DataLoader(trainset, batch_size=8, shuffle=True)
testloader = DataLoader(testset, batch_size=8, shuffle=True)

If you check either loader, it is an iterable; each item is a mini-batch of (data, labels)

# batch_index, (data, labels)
x_batch_idx, (x_data, x_labels) = next(enumerate(testloader))
x_batch_idx, x_data.shape, x_labels.shape
output:
(0, torch.Size([8, 1, 28, 28]), torch.Size([8]))

Next is defining the loss function & optimizer.
PyTorch provides the torch.optim package for defining the optimizer, and we can use nn.CrossEntropyLoss() as the loss criterion, since this is a multi-class classification problem with 10 class labels.
I chose stochastic gradient descent with a learning rate of 0.01 and momentum of 0.9 (a small learning rate keeps the updates stable).

import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)

Training is very intuitive as you can see below

epochs = 2
for epoch in range(epochs):
    for batch_idx, data in enumerate(trainloader):
        inputs, labels = data

        optimizer.zero_grad()              # reset gradients from the previous mini-batch
        outputs = net(inputs)              # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                    # backpropagate: compute gradients
        optimizer.step()                   # update the parameters

We just define the number of epochs to train and loop through the training process that many times.
Every time we iterate over a batch, these steps are followed in the same order for gradient calculation and parameter updates (see the short sketch after this list):

  1. optimizer.zero_grad() resets the gradients that were accumulated in the previous mini-batch run
  2. loss.backward() backpropagates the loss and calculates the gradient of the loss with respect to every parameter
  3. optimizer.step() uses those gradients to update the parameters
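To make these steps concrete (a small illustrative sketch reusing the objects defined above), you can run a single mini-batch by hand and watch the gradients appear after backward():

# a single mini-batch, step by step (illustrative)
inputs, labels = next(iter(trainloader))

optimizer.zero_grad()                 # clear any previously accumulated gradients
loss = criterion(net(inputs), labels)
loss.backward()                       # autograd fills in .grad for every parameter
print(net.conv1.weight.grad.shape)    # torch.Size([16, 1, 3, 3]) -- same shape as the weights
optimizer.step()                      # the optimizer uses those .grad values to update the weights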

Saving and loading the model

import torch

torch.save(net.state_dict(), './models/mnist.pth')     # save the weights (assumes the ./models directory exists)
net.load_state_dict(torch.load('./models/mnist.pth'))   # load the weights back into a model object
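Note that load_state_dict expects an already constructed model with the same architecture. As a small sketch (the path is the one used above; map_location is only needed when the checkpoint was saved on a different device), loading a fresh model for inference looks like this:

# rebuild the architecture, then load the saved weights into it
model = MNIST_NN()
state = torch.load('./models/mnist.pth', map_location='cpu')
model.load_state_dict(state)
model.eval()   # switch to evaluation mode before running inference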

Evaluating

correct = 0
total = 0
with torch.no_grad():   # no gradient tracking needed during evaluation
    for data in testloader:
        images, labels = data
        outputs = net(images)                      # prediction
        total += labels.size(0)
        _, predicted = torch.max(outputs, dim=1)   # index of the highest score = predicted class
        correct += (predicted == labels).sum().item()
print(f'Accuracy: {(correct / total):.5f}')

output:
Accuracy: 0.98650
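As a final sanity check (a quick sketch; the sample index and variable names are just illustrative), you can run a single test image through the trained network and compare its prediction with the true label:

# predict one test image (illustrative)
image, label = testset[0]                    # a 1x28x28 tensor and its true label
with torch.no_grad():
    logits = net(image.unsqueeze(0))         # add a batch dimension -> 1x1x28x28
    prediction = logits.argmax(dim=1).item()
print(f'predicted: {prediction}, actual: {label}')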

Thanks for reading the article.
If you are experienced, please point out any mistakes. If you are a beginner, I hope this article is helpful to you.