Numpy vs PyTorch: Linear Regression from scratch

Source: Deep Learning on Medium


In the last article we compared Numpy arrays with PyTorch tensors. Now let’s build a simple linear regression model using both Numpy and PyTorch.

#Create dummy dataset
import numpy as np

X = np.array([1,2,4,6,8,10,12,13,14,16,16,18,20,22,24])
Y = np.array([39,42,43,46,47,56,60,59,64,66,68,72,71,75,80])

As we can see, there is a linear relationship between X and Y. (We’ll discuss correlation in more detail in another post.) Here, we’ll use linear regression to build a prediction model.
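This claim can be sanity-checked numerically with the Pearson correlation coefficient; a minimal sketch using np.corrcoef (not part of the original code):

```python
import numpy as np

X = np.array([1, 2, 4, 6, 8, 10, 12, 13, 14, 16, 16, 18, 20, 22, 24])
Y = np.array([39, 42, 43, 46, 47, 56, 60, 59, 64, 66, 68, 72, 71, 75, 80])

# Pearson correlation coefficient: values close to 1 indicate a
# strong positive linear relationship between X and Y.
r = np.corrcoef(X, Y)[0, 1]
print(r)
```

For this dataset r comes out well above 0.9, which justifies fitting a straight line.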

Linear Regression Basics:

Y = a*X + b is the equation of the line, i.e. the linear regression model.

The goal here is to find the values of a and b.

There are multiple techniques to achieve this:

1. Matrix calculations: Put all the data into matrices and solve for the weights directly. Used for small datasets because of memory constraints.

2. Gradient Descent: Try to minimize the error/difference between actual and predicted values using derivatives.

3. Regularization: While minimizing the error, also try to reduce the impact of unnecessary features.

4. Simple linear regression: If there is a single input variable and a single output variable, use covariance and variance to find a and b.
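As a quick aside, technique 4 has a direct closed form; a minimal sketch (the slope and intercept formulas are the standard covariance/variance estimates, not code from this post):

```python
import numpy as np

X = np.array([1, 2, 4, 6, 8, 10, 12, 13, 14, 16, 16, 18, 20, 22, 24])
Y = np.array([39, 42, 43, 46, 47, 56, 60, 59, 64, 66, 68, 72, 71, 75, 80])

# Slope a = cov(X, Y) / var(X); intercept b = mean(Y) - a * mean(X).
a = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b = Y.mean() - a * X.mean()
print(a, b)
```

Gradient descent on this dataset should converge toward the same a and b.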

A more detailed explanation of the above techniques is out of scope here. We’ll implement method 2, Gradient Descent, more specifically Batch Gradient Descent.

Weights (a, b) are updated at the end of a complete batch (all rows) as follows:

new a = old a-(learning_rate*gradient_a)

new b = old b-(learning_rate*gradient_b)

Linear Regression using Numpy:

learning_rate = 0.001
epochs = 100  #epochs was not defined in the original snippet; 100 is an arbitrary choice
w = np.random.randn()
b = np.random.randn()
for i in range(epochs):
    print("-----------epoch:{}--------".format(i))
    #Prediction
    y_pred = w*X + b

    #Error/loss calculation using Mean Squared Error
    error = np.mean((Y - y_pred)**2)
    print('Total Error:{}'.format(error))

    #Gradient calculation
    gradient_w = np.mean(-2*X*(Y - y_pred))
    gradient_b = np.mean(-2*(Y - y_pred))

    #Update weights
    w -= learning_rate*gradient_w
    b -= learning_rate*gradient_b

The error decreases as the number of epochs increases. The number of epochs and the learning rate are hyper-parameters to tune.
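To back up that claim, here is a small self-contained sketch (the seed and epoch count are arbitrary choices) that records the error at every epoch and confirms it drops:

```python
import numpy as np

X = np.array([1, 2, 4, 6, 8, 10, 12, 13, 14, 16, 16, 18, 20, 22, 24], dtype=float)
Y = np.array([39, 42, 43, 46, 47, 56, 60, 59, 64, 66, 68, 72, 71, 75, 80], dtype=float)

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
w, b = rng.standard_normal(), rng.standard_normal()
learning_rate, epochs = 0.001, 100

errors = []
for _ in range(epochs):
    y_pred = w * X + b
    errors.append(np.mean((Y - y_pred) ** 2))
    # same batch-gradient updates as the loop above
    w -= learning_rate * np.mean(-2 * X * (Y - y_pred))
    b -= learning_rate * np.mean(-2 * (Y - y_pred))

print(errors[0], errors[-1])  # the final error is much smaller than the first
```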

Let’s not play around with it and jump to the PyTorch equivalent.

Linear Regression using PyTorch:

import torch

#Initialise data/features and target (convert to float tensors, since the Numpy arrays are integers)
X_tensor = torch.from_numpy(X).float()
Y_tensor = torch.from_numpy(Y).float()

#Initialise weights
'''Here, unlike numpy, we have to mention that these variables are trainable (i.e. need gradient calculation). This is done using requires_grad.'''
torch.manual_seed(2)  #set the seed before creating the random weights
w_tensor = torch.randn(1, requires_grad=True, dtype=torch.float)
b_tensor = torch.randn(1, requires_grad=True, dtype=torch.float)
learning_rate = 0.001

First, we’ll try to build a model without taking much help from PyTorch’s in-built methods.

#Model without PyTorch in-built methods
for i in range(epochs):
    print("-----------epoch:{}--------".format(i))
    #Prediction
    y_pred = w_tensor*X_tensor + b_tensor

    #Error/loss calculation using Mean Squared Error
    error = ((Y_tensor - y_pred)**2).mean()
    print('Total Error:{}'.format(error))

    '''No need to calculate gradients by hand: PyTorch does it once we call backward() on the quantity that needs gradients. The actual gradient values can then be seen via the grad attribute.'''
    error.backward()

    '''We cannot update the weights inside the computation graph, so use no_grad() to keep the update step out of it.'''
    with torch.no_grad():
        w_tensor -= learning_rate*w_tensor.grad
        b_tensor -= learning_rate*b_tensor.grad

    #After each step, reinitialize the gradients: PyTorch accumulates them, so we need to ask it to release them.
    w_tensor.grad.zero_()
    b_tensor.grad.zero_()
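As a standalone illustration of the grad attribute and why gradients must be reset (a toy example, unrelated to the regression weights above):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([0.5], requires_grad=True)

loss = ((w * x) ** 2).mean()
loss.backward()            # populates w.grad
print(w.grad is not None)  # True: the gradient is now available

w.grad.zero_()             # reset before the next backward() call,
print(w.grad.item())       # otherwise gradients would accumulate
```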

Now, let’s use the in-built PyTorch methods.

#Model with PyTorch in-built methods
optimizer = torch.optim.SGD([w_tensor, b_tensor], lr=learning_rate)
loss = torch.nn.MSELoss(reduction='mean')
for i in range(epochs):
    print("-----------epoch:{}--------".format(i))
    #Prediction
    y_pred = w_tensor*X_tensor + b_tensor

    #Error/loss calculation using Mean Squared Error
    error = loss(y_pred, Y_tensor)
    print('Total Error:{}'.format(error))
    error.backward()

    #Update weights using the optimizer
    optimizer.step()

    #After each step, reinitialize the gradients: PyTorch holds on to them, so reset them using the optimizer
    optimizer.zero_grad()

So far, we’ve explored loss calculation and optimizers. The complete notebook is available at the git-repo.

The only manual step remaining is the prediction step, which can also be done with the PyTorch Model class, but we’ll explore the Model class and neural networks in PyTorch in the next article.

If you liked the article or have any suggestions/comments, please share them below!