PyTorch 0.4.0 Release & 1.0 Preview

Weekly Reading List #3


Issue #3: 2018/04/30 to 2018/05/06

This is an experimental series in which I briefly introduce the interesting data science things I read, watched, or listened to during the week. Please give this post some claps if you’d like this series to continue.

I’ve been busy with other things this week, so this issue will only cover the new PyTorch 0.4.0 release and the roadmap to the production-ready 1.0 version.

PyTorch 0.4.0

Released in late April:

with a migration guide:

A (perhaps incomplete) list of important changes, with a brief summary of each:

  • Merging the Tensor and Variable classes. torch.autograd.Variable has been merged into torch.Tensor, so plain Tensors now support autograd. Old code using Variable will still work.
  • Don’t use type() to query the underlying type of a Tensor object. Use isinstance() or x.type().
  • Add an in-place method .requires_grad_() to set the requires_grad flag.
  • The .data attribute now returns a detached Tensor with requires_grad=False. However, changes to the returned Tensor are invisible to autograd, which can silently corrupt gradients.
  • Prefer the safer .detach() method: in-place changes to the detached Tensor will be reported by autograd if the original is needed in the backward pass.
  • 0-dimensional (scalar) Tensors. Fixes the inconsistency between tensor.sum() and variable.sum() before 0.4.0.
  • Use .item() to get the Python number from a scalar Tensor instead of .data[0].
  • Use torch.no_grad() or torch.set_grad_enabled(is_train) to exclude variables from autograd instead of setting volatile=True.
  • Use torch.tensor to create new Tensor objects. When calling the function, specify the dtype, device, and layout with the new torch.dtype, torch.device, and torch.layout classes.
  • The new torch.*_like and tensor.new_* shortcuts. The former takes a Tensor; the latter takes a shape.
  • Use the new .to(device) method to write device-agnostic code.
  • Add a new .device attribute to get the torch.device for all Tensors.
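The changes above can be sketched in a few lines (a minimal example using only APIs named in the release notes):

```python
import torch

# Tensor and Variable are merged: autograd now works on plain Tensors.
x = torch.ones(2, 2)
x.requires_grad_()            # in-place toggle of the requires_grad flag

y = (x * 3).sum()             # y is a 0-dimensional (scalar) Tensor
y.backward()
print(y.item())               # .item() replaces .data[0] for scalars
print(x.grad)                 # gradient of sum(3*x) is 3 everywhere

# Exclude computation from autograd with a context manager,
# replacing the removed volatile=True flag.
with torch.no_grad():
    z = x * 2
print(z.requires_grad)        # False

# Device-agnostic code: .to(device) and the .device attribute.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x_dev = x.to(device)
print(x_dev.device)
```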
Tensor Creation Functions (for quick reference, taken from the migration guide)

The code samples at the end of the migration guide are a good way to check if you’ve understood the above changes correctly.
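As a quick illustration of the creation functions (a sketch; torch.*_like copies attributes from a Tensor, while tensor.new_* takes a shape):

```python
import torch

# torch.tensor with explicit dtype; device and layout work the same way.
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)

# torch.*_like takes a Tensor and inherits its dtype/device by default.
b = torch.zeros_like(a)
print(b.dtype)       # float64, inherited from a

# tensor.new_* takes a shape and inherits dtype/device from the source.
c = a.new_zeros(3, 5)
print(c.shape, c.dtype)
```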

Similarly, a (possibly incomplete) list of new features:

  • Windows support.
  • torch.where(condition, tensor1, tensor2)
  • torch.expm1
  • Use torch.utils.checkpoint.checkpoint to trade compute for memory.
  • torch.utils.checkpoint.checkpoint_sequential for sequential models.
  • torch.utils.bottleneck to identify hotspots.
  • reduce=False support for all loss functions.
  • nn.LayerNorm
  • nn.GroupNorm
  • torch.nn.utils.clip_grad
  • Embedding.from_pretrained factory
  • 24 basic probability distributions
  • TransformedDistribution and Constraint
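A few of these additions in action (a quick sketch, not exhaustive):

```python
import torch
import torch.nn as nn

# torch.where: elementwise select between two tensors by a condition.
t = torch.tensor([-1.0, 0.5, -2.0, 3.0])
clipped = torch.where(t > 0, t, torch.zeros_like(t))
print(clipped)                       # negative entries replaced by 0

# torch.expm1: numerically stable exp(x) - 1 for small x.
print(torch.expm1(torch.tensor([0.0])))

# nn.LayerNorm normalizes over the trailing dimension(s) of the input.
ln = nn.LayerNorm(4)
out = ln(torch.randn(2, 4))
print(out.shape)
```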

PyTorch 1.0

Published on May 2:

Probably one of the most important takeaways:

In 1.0, your code continues to work as-is, we’re not making any big changes to the existing API.

Basically, Facebook is merging Caffe2 and PyTorch to provide a single framework that works for both research and production settings, as hinted earlier in April:

So the gist of the solution is a just-in-time (JIT) compiler, torch.jit, that exports your model to run on a Caffe2-based C++-only runtime. The compiler has two modes:

  1. Tracing Mode: traces the execution of native Python code. It will probably cause problems if your model contains if statements and loops (for example, an RNN over variable-length sequences).
  2. Script Mode: compiles code into an intermediate representation. It only supports a subset of the Python language, so you’ll usually have to isolate the code you want compiled.
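The two modes can be sketched as follows. Note this is a hedged example: the API was still in flux at announcement time, and the names below follow the torch.jit interface as it eventually shipped:

```python
import torch

class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x) + 1

# Tracing mode: run the model once on example inputs and record the ops.
# Any control flow is "baked in" for the traced input, which is why
# tracing misbehaves on data-dependent branches and loops.
traced = torch.jit.trace(TinyModel(), torch.randn(3))

# Script mode: compile (a subset of) Python directly, preserving
# control flow such as loops over variable lengths.
@torch.jit.script
def count_positive(x: torch.Tensor) -> int:
    n = 0
    for i in range(x.size(0)):
        if x[i] > 0:
            n += 1
    return n

print(traced(torch.tensor([-1.0, 2.0])))
print(count_positive(torch.tensor([-1.0, 2.0, 3.0])))
```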

The naming is still subject to change. The 1.0 version is expected to be released this summer.

Source: Deep Learning on Medium