Top 5 Statistical Functions in PyTorch to Rule in Data Science

Original article was published on Deep Learning on Medium

PyTorch is an open source machine learning library based on the Torch library. It is used for applications such as computer vision and natural language processing, and is primarily developed by Facebook’s AI Research lab. Statistics has wide applications in deep learning, and PyTorch offers a large number of functions for it. This article covers only five of the most useful ones, all of which draw random samples from a probability distribution. You can check out my notebook here.

Let’s start the discussion by importing the library.

# Import torch library
import torch

Function 1: torch.bernoulli()

This function draws binary random numbers (0 or 1) from a Bernoulli distribution. The input should be a tensor of probabilities, each in the range [0, 1]. The output has the same shape as the input, and each output element is 1 with the probability given by the corresponding input element.
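A minimal sketch of this call, assuming a 4 × 4 tensor named a filled with uniform_(0, 1). The fixed seed and the name b are additions for illustration only, so that reruns give the same draws.

```python
import torch

torch.manual_seed(0)  # assumption: seed fixed only to make reruns repeatable

# 4 x 4 matrix of probabilities drawn uniformly between 0 and 1
a = torch.empty(4, 4).uniform_(0, 1)

# Each element of b is 1 with the probability stored at the same position in a
b = torch.bernoulli(a)

print(a)
print(b)
```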

In the example, the uniform_(0, 1) function is used to generate a square matrix a of order 4. Its elements are probabilities and hence belong to the range [0, 1]. Applying torch.bernoulli() to a gives a matrix of the same shape whose elements are binary random numbers.

Function 2: torch.multinomial()

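Like torch.bernoulli(), this function takes an input tensor of probabilities. It returns a tensor in which each row contains num_samples indices sampled from the multinomial probability distribution given by the corresponding row of input. The rows of input need not sum to one, because the values are treated as weights, but they must be non-negative and finite, with a non-zero sum.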
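A short sketch of how this might look with a two-row input. The weight values, the seed, and the names samples and with_repl are assumptions chosen for illustration. By default sampling is without replacement, so the indices within a row are distinct; passing replacement=True lets an index repeat.

```python
import torch

torch.manual_seed(0)  # assumption: seed fixed only to make reruns repeatable

# Hypothetical weights: each row is its own distribution over 4 categories
weights = torch.tensor([[0.1, 0.2, 0.3, 0.4],
                        [0.25, 0.25, 0.25, 0.25]])

# Two indices per row, drawn without replacement (the default)
samples = torch.multinomial(weights, num_samples=2)

# Six indices per row; replacement=True is required because 6 > 4 categories
with_repl = torch.multinomial(weights, num_samples=6, replacement=True)

print(samples)
print(with_repl)
```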

Here, we create a tensor named weights that holds the probabilities. Applying torch.multinomial() to it returns a tensor of indices sampled according to those probabilities.

Function 3: torch.poisson()

This function returns a tensor of the same size as input, with each element sampled from a Poisson distribution whose rate parameter is given by the corresponding element of input.
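A minimal sketch, assuming the rates are built with torch.rand() and stored in a tensor named rates1. The 4 × 4 shape, the scaling by 5 (which spreads the rates over [0, 5) so the counts are not almost all 0 or 1), the seed, and the name counts are assumptions for illustration.

```python
import torch

torch.manual_seed(0)  # assumption: seed fixed only to make reruns repeatable

# Rates drawn uniformly from [0, 1), then scaled to [0, 5)
rates1 = torch.rand(4, 4) * 5

# One Poisson draw per rate; the result is a float tensor of whole-number counts
counts = torch.poisson(rates1)

print(rates1)
print(counts)
```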

The torch.rand() function returns a tensor of random numbers drawn from the uniform distribution on the interval [0, 1). Applying torch.poisson() to the tensor rates1 returns one Poisson-distributed count for each rate, as expected.

Function 4: torch.normal()

This function returns a tensor of random numbers drawn from separate normal distributions whose means and standard deviations are given. Here the inputs, i.e. the mean and the standard deviation, are tensors too, so each output element is drawn from its own distribution.
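A small sketch of the tensor-input form. The particular mean and standard deviation values, the seed, and the name out are assumptions chosen for illustration.

```python
import torch

torch.manual_seed(0)  # assumption: seed fixed only to make reruns repeatable

# Hypothetical parameters: element i is drawn from Normal(mean[i], std[i])
mean = torch.tensor([0.0, 5.0, 10.0])
std = torch.tensor([1.0, 0.5, 0.1])

out = torch.normal(mean=mean, std=std)

print(out)
```

Because the last standard deviation is small, the third value should land close to 10, while the first can vary much more around 0.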

As expected, we get random numbers drawn from normal distributions with the specified means and standard deviations.

Function 5: torch.randn()

This is one of the most useful functions in PyTorch. Its input is the size, i.e. the shape of the tensor of random numbers we want to generate. It returns a tensor filled with random numbers from a standard normal distribution, i.e. a normal distribution with mean 0 and variance 1.
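A minimal sketch of the call. The names x, m, and big, the seed, and the extra shapes are assumptions for illustration; the large sample is included only to check that the mean and standard deviation come out near 0 and 1.

```python
import torch

torch.manual_seed(0)  # assumption: seed fixed only to make reruns repeatable

# Five draws from the standard normal distribution
x = torch.randn(5)

# The size can also describe a multi-dimensional tensor, here 2 x 3
m = torch.randn(2, 3)

# With many draws, the sample mean and standard deviation approach 0 and 1
big = torch.randn(100_000)

print(x)
print(m)
print(big.mean().item(), big.std().item())
```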

Here we see that the output has 5 random numbers, which follow a normal distribution with mean 0 and standard deviation 1.

Conclusion

These functions are extremely helpful in the field of data science, especially in simulation. PyTorch has far more functions than one article can cover, so this discussion is not exhaustive. For more details, please visit this website.

This is my first article on Medium. Your feedback will be highly appreciated. Click on the clap option if you find the blog helpful.