This AI Can Play 57 Games at Superhuman Performance!

Original article was published on Artificial Intelligence on Medium



Overview of the DeepMind paper “Agent57: Outperforming the Atari Human Benchmark” by Badia et al.

When we train intelligent agents for a particular game, we often try to get the best possible performance from our agent on that game. To do so, we usually alter our reinforcement learning algorithm or our neural network model by incorporating some game-specific knowledge. While this approach gives better benchmark results on that game, the same method would likely give suboptimal performance on other games. This means what we are doing cannot really be called general intelligence.

If we can somehow create one single approach that can excel at multiple games, it would make it very easy to train bots for new games without having to re-engineer our models or the learning algorithm every single time.

Agent57

This is why today I want to cover the DeepMind paper “Agent57: Outperforming the Atari Human Benchmark”, which introduces a single learning algorithm that scores above the human benchmark on all 57 games in the Atari suite, across varying levels of difficulty. The beauty of this work is that it lays a solid foundation towards building artificial general intelligence.

Agent57 playing (left to right) Pitfall, Solaris, Skiing and Montezuma’s Revenge. [source]

As you can see here, Agent57 plays different classic arcade games from the Atari suite, and in every case it manages to beat the human benchmark score. This is ground-breaking because some of these games are extremely difficult for an agent: they require long-term planning to get positive results. That means taking the correct set of actions right now so that they yield game-winning results a few steps, or even minutes, down the line.

DeepMind’s progress from its early DQN days to Agent57. [source]

An additional challenge here is deciding when an agent should experiment with new strategies that might give even better results than its current strategy, which is already giving decent results. This exploration strategy is usually hand-coded into the algorithm depending on the game, but Agent57 handles it differently. It uses a meta-learning module (a meta-controller, in the paper's terms) that learns when to keep exploring new game strategies and when to stop, which is why this approach adapts better to different games and generalizes well. It removes the need to hand-tune the exploration parameters of the algorithm for each game, and so it yields better results across a wide variety of games.
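To make the idea of learning when to explore a bit more concrete, here is a minimal sketch in Python. It is not the method from the paper: it uses a plain UCB1 multi-armed bandit as a stand-in for the meta-controller, and the candidate exploration rates, the toy `simulated_episode_return` function, and all other names are assumptions made up for illustration. The point is only to show how a selector can learn, from episode returns alone, which exploration setting works best for a given game instead of having that setting hand-coded.

```python
import math
import random


class UCBBandit:
    """Chooses among a fixed set of arms with the UCB1 rule (illustrative stand-in)."""

    def __init__(self, num_arms, exploration_bonus=1.0):
        self.counts = [0] * num_arms          # how often each arm was chosen
        self.mean_rewards = [0.0] * num_arms  # running mean return per arm
        self.exploration_bonus = exploration_bonus

    def select_arm(self):
        # Try every arm once before trusting any estimate.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        # Score = estimated return + bonus that shrinks as an arm is tried more.
        scores = [
            mean + self.exploration_bonus * math.sqrt(2 * math.log(total) / count)
            for mean, count in zip(self.mean_rewards, self.counts)
        ]
        return scores.index(max(scores))

    def update(self, arm, reward):
        # Incremental update of the running mean for the chosen arm.
        self.counts[arm] += 1
        self.mean_rewards[arm] += (reward - self.mean_rewards[arm]) / self.counts[arm]


def simulated_episode_return(exploration_rate, rng):
    # Hypothetical toy "game": returns are highest at an exploration rate of 0.1.
    return 1.0 - 5.0 * abs(exploration_rate - 0.1) + rng.gauss(0, 0.05)


def run(num_episodes=500, seed=0):
    rng = random.Random(seed)
    # Hypothetical candidate settings; each arm is one exploration rate.
    exploration_rates = [0.0, 0.1, 0.5, 1.0]
    bandit = UCBBandit(len(exploration_rates))
    for _ in range(num_episodes):
        arm = bandit.select_arm()
        reward = simulated_episode_return(exploration_rates[arm], rng)
        bandit.update(arm, reward)
    return exploration_rates, bandit


if __name__ == "__main__":
    rates, bandit = run()
    for rate, count in zip(rates, bandit.counts):
        print(f"exploration rate {rate}: chosen {count} times")
```

In this toy setup the selector ends up spending most episodes on the setting that pays off best, without anyone telling it which one that is. A different game would simply correspond to a different return function, and the same selector would settle on a different setting.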

Results

The following video playlist from DeepMind’s YouTube channel shows Agent57 playing all 57 games in the Atari suite. Enjoy!

A few more advances in this line of work will mean we can reuse the same training method to develop game-playing bots across different games, without having to worry about adjusting the hyperparameters of our methods. It is a truly remarkable step towards artificial general intelligence.