Advanced Problems

Source: Deep Learning on Medium


By VLG IITR

People who are already familiar with the basics of Deep Learning can try out the following problems:

  1. Train an LSTM to solve the XOR Problem
  2. Learned Data Augmentation
  3. Template Matching
  4. Slitherin’

Note: Some of these problems are still open research questions. If you get substantial results, we strongly suggest writing them up as a blog post or an arXiv report.

1. Train an LSTM to solve the XOR Problem

Given a sequence of bits, determine its parity, that is, whether it contains an odd or an even number of 1s. The LSTM should consume the sequence one bit at a time and output the correct answer at the end of the sequence. Try out the two approaches below:

  • Generate a dataset of 100,000 random binary strings of length 50 and train the LSTM on it.
  • Generate a dataset of 100,000 random binary strings, where the length of each string is chosen independently and at random between 1 and 50, and train the LSTM on this dataset.
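The two datasets can be generated along the following lines. This is a minimal sketch using only the Python standard library. The function names (`parity`, `make_fixed_length_dataset`, `make_variable_length_dataset`) and the fixed seed are illustrative choices, not part of the original problem statement, and lengths are assumed to be drawn uniformly from 1 to 50. The label is 1 when the string contains an odd number of 1s, which equals the XOR of all its bits.

```python
import random


def parity(bits):
    """Return 1 if the number of ones is odd, else 0 (the XOR of all bits)."""
    result = 0
    for b in bits:
        result ^= b
    return result


def make_fixed_length_dataset(n_samples, length, rng):
    """Approach 1: every string has the same length."""
    data = []
    for _ in range(n_samples):
        bits = [rng.getrandbits(1) for _ in range(length)]
        data.append((bits, parity(bits)))
    return data


def make_variable_length_dataset(n_samples, max_length, rng):
    """Approach 2: each length is drawn independently from 1..max_length."""
    data = []
    for _ in range(n_samples):
        length = rng.randint(1, max_length)  # inclusive on both ends
        bits = [rng.getrandbits(1) for _ in range(length)]
        data.append((bits, parity(bits)))
    return data


rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility
fixed = make_fixed_length_dataset(100_000, 50, rng)
variable = make_variable_length_dataset(100_000, 50, rng)
```

Each example is a `(bits, label)` pair. For the variable-length dataset, the sequences will need padding or packing before they can be batched for an LSTM; how to do that depends on the framework you choose.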

Food for thought: Does the second approach succeed? What explains the difference?

2. Learned Data Augmentation

Use a trained VAE to perform learned data augmentation.

  • First, train a VAE on the input data (say, MNIST) and map each training point to a latent representation using the encoder. Apply a simple perturbation (such as Gaussian noise) in the latent space, and then decode back to the observed space.
  • Can such an approach be used to obtain improved generalization?
  • A potential benefit of such data augmentation is that it could include many nonlinear transformations like viewpoint changes and changes in scene lighting.
  • Can we approximate the set of transformations to which the label is invariant?
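The encode, perturb, decode step from the first bullet can be sketched as follows. This is only an illustration under stated assumptions: `augment_in_latent_space`, `toy_encode`, and `toy_decode` are hypothetical names, and the toy encoder and decoder are simple invertible linear maps standing in for a trained VAE. In a real experiment, `encode` would return the encoder's latent code for the input (for example, the posterior mean) and `decode` would be the trained decoder network.

```python
import random


def augment_in_latent_space(x, encode, decode, sigma, rng):
    """Encode x, add isotropic Gaussian noise to the latent code, decode back."""
    z = encode(x)
    z_perturbed = [zi + rng.gauss(0.0, sigma) for zi in z]
    return decode(z_perturbed)


# Toy stand-ins for a trained VAE (hypothetical, for illustration only).
def toy_encode(x):
    return [x[0] + x[1], x[0] - x[1]]


def toy_decode(z):
    return [(z[0] + z[1]) / 2.0, (z[0] - z[1]) / 2.0]


rng = random.Random(0)  # fixed seed, an arbitrary choice
x = [0.2, 0.8]
x_aug = augment_in_latent_space(x, toy_encode, toy_decode, sigma=0.1, rng=rng)
```

The augmented point `x_aug` keeps the label of `x`. The noise scale `sigma` controls how far the augmented points stray from the original, so it is a natural hyperparameter to tune when testing whether the augmentation improves generalization.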

You can check some of these works if you need a place to get started:

3. Template Matching

Implement the method described in this project report, then try to improve the model architecture or fix deficiencies in the approach. You may refer to the following paper for further information regarding the problem:

4. Slitherin’

Implement and solve a multiplayer clone of the classic Snake game (see slither.io for inspiration) as a Gym environment.

  • Environment: Have a reasonably large field with multiple snakes; snakes grow when eating randomly-appearing fruit; a snake dies when colliding with another snake, itself, or the wall; and the game ends when all snakes die. Start with two snakes, and scale from there.
  • Agent: Solve the environment using self-play with an RL algorithm of your choice [link][link]. You’ll need to experiment with various approaches to overcome self-play instability (which resembles the instability people see with GANs). For example, try training your current policy against a distribution of past policies. Which approach works best?
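One way to train against a distribution of past policies is to keep a pool of frozen snapshots and sample opponents from it. The sketch below is a minimal illustration, not a prescribed method: the `OpponentPool` class, the uniform sampling over snapshots, and the default `p_latest=0.5` mixing probability are all assumptions, and the dictionary used as a policy is a placeholder for real network parameters.

```python
import copy
import random


class OpponentPool:
    """Keeps snapshots of past policies and samples opponents from them."""

    def __init__(self, max_size, rng):
        self.max_size = max_size
        self.rng = rng
        self.snapshots = []

    def add(self, policy):
        # Store a frozen copy so later training updates do not change it.
        self.snapshots.append(copy.deepcopy(policy))
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)  # drop the oldest snapshot

    def sample(self, current_policy, p_latest=0.5):
        # With probability p_latest (or if the pool is empty), play against
        # the current policy; otherwise pick a past snapshot uniformly.
        if not self.snapshots or self.rng.random() < p_latest:
            return current_policy
        return self.rng.choice(self.snapshots)


rng = random.Random(0)  # fixed seed, an arbitrary choice
pool = OpponentPool(max_size=3, rng=rng)
policy = {"version": 0}  # placeholder for real network parameters
for step in range(5):
    opponent = pool.sample(policy)
    # ... play episodes of policy vs. opponent and update policy here ...
    policy["version"] += 1
    pool.add(policy)
```

Setting `p_latest=1.0` recovers pure self-play, while lower values expose the agent to older opponents more often. Comparing a few settings is one concrete way to approach the question of which approach works best.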

Food for thought: Does the agent learn to competently pursue food and avoid other snakes? Does the agent learn to attack, trap, or gang up against the competing snakes?

References

Almost all of these problems are taken from OpenAI’s Requests for Research 2.0.