Game Theory Is Not Deep Reinforcement Learning

Original article was published on Deep Learning on Medium

Continuing where we left off last time… and if you need a basic introduction to Game Theory please look at…

Games in Game Theory vs Deep Reinforcement Learning

Games in Game Theory are:

Two-player games

Multiplayer games

In Deep Reinforcement Learning, they are:

Single-player games played against the environment

Multiplayer games in which agents play against each other in an environment

The Objective in Game Theory vs Deep Reinforcement Learning

In Deep Reinforcement Learning, the objective in single-player games is ONLY to win.

In multiplayer games, the objective is ONLY to win and make the other agents lose.

To that end, the objective in Deep Reinforcement Learning is to learn the best policy, the one that enables the agent to win the game.

The ultimate objective of Game Theory is to illuminate the nature of conflict and cooperation. The ultimate objective of game theorists is peaceful coexistence.

And the objective in a Game Theory game is to find a solution that is stable for every player, i.e. a Nash Equilibrium: a situation in which no player can benefit by unilaterally changing their decision.
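To make the equilibrium condition concrete, here is a minimal sketch that checks every pure-strategy profile of a two-player game for profitable unilateral deviations. The Prisoner’s Dilemma payoff numbers and the function names are assumptions chosen for illustration; they do not come from the article.

```python
from itertools import product

# Hypothetical Prisoner's Dilemma payoffs (illustrative numbers only).
# Actions: 0 = cooperate, 1 = defect.
# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2)
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)


def is_nash(a1, a2, payoffs=PAYOFFS):
    """True if neither player gains by changing only their own action."""
    u1, u2 = payoffs[(a1, a2)]
    player1_content = all(payoffs[(d, a2)][0] <= u1 for d in ACTIONS)
    player2_content = all(payoffs[(a1, d)][1] <= u2 for d in ACTIONS)
    return player1_content and player2_content


def pure_nash_equilibria(payoffs=PAYOFFS):
    """List every pure-strategy profile that passes the deviation check."""
    return [p for p in product(ACTIONS, repeat=2) if is_nash(*p, payoffs)]


print(pure_nash_equilibria())  # [(1, 1)]
```

With these assumed payoffs, the only profile that passes the check is mutual defection, even though mutual cooperation would pay both players more. The repeated-game material further on is about exactly that gap.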

The Focus in Game Theory vs Deep Reinforcement Learning

The focus in Game Theory is on analyzing the game once an equilibrium is reached.

The focus in Deep Reinforcement Learning is only on learning how to play the game and winning it. Once you win a game, nothing else matters.

Games in Game Theory vs Deep Reinforcement Learning

In Deep Reinforcement Learning, the player may learn a game by playing it many times during training, but the ultimate objective is to play the game once and win it.

In Game Theory, every player’s payoffs are known, so playing the game once doesn’t make sense, as it is more or less trivial.

In Game Theory, the focus is on playing “Repeated Games” and on finding the best solution within those Repeated Games.

In Deep Reinforcement Learning, once a player is trained, it carries no memory or learning from one game into the next, however many games it plays.

In Game Theory Repeated Games, by contrast, every player learns from one game, carries that memory into the next game, and uses it to play the next game better.
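As a rough illustration of carrying memory between rounds, the sketch below plays a repeated Prisoner’s Dilemma in which each strategy sees the opponent’s full history. The payoff numbers, the two strategies (tit-for-tat and always-defect), and all names are assumptions picked for the example, not something the article specifies.

```python
# Hypothetical stage-game payoffs: "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy whatever the opponent did last round."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Ignore history entirely and defect every round."""
    return "D"


def play_repeated(strategy_a, strategy_b, rounds=5):
    """Play the stage game `rounds` times, feeding each side the other's history."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        total_a += payoff_a
        total_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b, history_a, history_b


print(play_repeated(tit_for_tat, always_defect)[:2])  # (4, 9)
print(play_repeated(tit_for_tat, tit_for_tat)[:2])    # (15, 15)
```

Tit-for-tat is exploited only in the first round; from the second round on it uses what it remembers and stops cooperating with the defector.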

In Deep Reinforcement Learning, ‘The Game’ is the deal, and it is a complex game. The player moves from one state to another by taking decisions/actions in an effort to win the game. Once a policy is learnt during training, it stays constant.
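The state-to-state movement and the frozen policy can be sketched with a toy example. Note the substitution: this uses tabular Q-learning on a made-up five-state corridor instead of a deep network, purely to keep the example short and runnable. The environment, hyperparameters, and names are all assumptions.

```python
import random

N_STATES = 5           # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (-1, +1)     # move left or move right
GAMMA, ALPHA = 0.9, 0.5


def step(state, action):
    """Environment transition: returns (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=2000, seed=0):
    """Off-policy Q-learning with uniformly random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _ in range(50):  # cap on episode length
            a = rng.randrange(2)
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Freeze the learnt values into a fixed state -> action mapping."""
    return [ACTIONS[max((0, 1), key=row.__getitem__)] for row in q[:-1]]


policy = greedy_policy(train())
print(policy)  # one action per non-goal state
```

After training, `policy` is just a lookup from state to action. Nothing in it changes between games, which is the sense in which the trained player carries nothing forward.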

In Game Theory “Repeated Games”, the player chooses ‘one’ policy that covers all possible actions, and the player can ‘update’ that policy in the next game.

‘State’ in Game Theory vs Deep Reinforcement Learning

In a repeated game all changes in the expected reward are due to changes in strategy by the players. There is no changing environment state or state transition function external to the agents. Therefore, repeated games are sometimes also referred to as stateless games.

Game Theory — Repeated Games

In game theory, a repeated game is an extensive form game that consists of a number of repetitions of some base game (called a stage game). The stage game is usually one of the well-studied 2-person games. Repeated games capture the idea that a player will have to take into account the impact of his or her current action on the future actions of other players; this impact is sometimes called his or her reputation. Single stage game or single shot game are names for non-repeated games.

  • Finite games are those in which both players know that the game is being played a specific (and finite) number of rounds, and that the game ends for certain after that many rounds have been played. In general, finite games can be solved by backwards induction.
  • Infinite games are those in which the game is being played an infinite number of times. A game with an infinite number of rounds is also equivalent (in terms of strategies to play) to a game in which the players do not know for how many rounds the game is being played. Infinite games (or games that are being repeated an unknown number of times) cannot be solved by backwards induction as there is no “last round” to start the backwards induction from.
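The backwards-induction argument for the finite case can be sketched as follows. This is a simplified illustration under stated assumptions: the stage game is a Prisoner’s Dilemma with made-up payoffs, it has a single pure Nash equilibrium, and payoffs are not discounted. Under those assumptions, what happens in later rounds does not depend on the current move, so the later payoffs are just a constant added to every cell of the current round.

```python
from itertools import product

# Hypothetical stage game: "C" = cooperate, "D" = defect.
STAGE = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")


def pure_nash(game):
    """Profiles from which neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(MOVES, repeat=2):
        u1, u2 = game[(a, b)]
        if all(game[(d, b)][0] <= u1 for d in MOVES) and all(
            game[(a, d)][1] <= u2 for d in MOVES
        ):
            equilibria.append((a, b))
    return equilibria


def solve_finitely_repeated(rounds):
    """Start at the last round and work back to the first."""
    continuation = (0, 0)  # nothing is earned after the final round
    plan = []
    for round_no in range(rounds, 0, -1):
        # Stage payoffs plus what both players expect to earn afterwards.
        game = {
            profile: (u1 + continuation[0], u2 + continuation[1])
            for profile, (u1, u2) in STAGE.items()
        }
        (equilibrium,) = pure_nash(game)  # relies on the equilibrium being unique
        plan.append((round_no, equilibrium))
        continuation = game[equilibrium]
    plan.reverse()
    return plan, continuation


print(solve_finitely_repeated(3))
```

With these numbers the plan is mutual defection in every round. With no last round to start from, the loop has no starting point, which is why the infinite case needs different tools.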

Solving Repeated Games

In general, repeated games are easily solved using strategies provided by folk theorems. Complex repeated games can be solved using various techniques, most of which rely heavily on linear algebra and the concepts expressed in fictitious play. It can be deduced that the equilibrium payoffs of infinitely repeated games can be characterized.
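Since fictitious play is mentioned above, here is a minimal sketch of the idea: in every round each player best-responds to the empirical frequencies of the opponent’s past actions. The choice of matching pennies as the stage game, the initial beliefs, and the function names are assumptions made for the example.

```python
# Hypothetical zero-sum stage game (matching pennies).
# ROW_PAYOFF[r][c] is the row player's payoff; the column player gets the negative.
ROW_PAYOFF = [[1, -1], [-1, 1]]


def best_response(own_payoffs, opponent_counts):
    """Action with the highest expected payoff against the opponent's observed mix."""
    total = sum(opponent_counts)
    expected = [
        sum(p * n for p, n in zip(row, opponent_counts)) / total
        for row in own_payoffs
    ]
    return max(range(len(expected)), key=expected.__getitem__)


def fictitious_play(rounds=10000):
    """Return the empirical action frequencies of both players."""
    # Column player's payoffs, indexed [column action][row action].
    col_payoff = [[-ROW_PAYOFF[r][c] for r in range(2)] for c in range(2)]
    row_counts = [1, 0]  # arbitrary initial belief about the row player
    col_counts = [0, 1]  # arbitrary initial belief about the column player
    for _ in range(rounds):
        r = best_response(ROW_PAYOFF, col_counts)
        c = best_response(col_payoff, row_counts)
        row_counts[r] += 1
        col_counts[c] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return (
        [n / row_total for n in row_counts],
        [n / col_total for n in col_counts],
    )


row_freq, col_freq = fictitious_play()
print(row_freq, col_freq)
```

In matching pennies the only equilibrium is for both players to mix 50/50, and the empirical frequencies drift toward that mix as the number of rounds grows, even though each individual move is a deterministic best response.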

The Bottom Line

The most widely studied repeated games are games that are repeated an infinite number of times. In iterated prisoner’s dilemma games, it is found that the preferred strategy is not to play a Nash strategy of the stage game, but to cooperate and play a socially optimum strategy.

An essential part of strategies in infinitely repeated games is punishing players who deviate from this cooperative strategy. The punishment may be playing a strategy which leads to reduced payoff to both players for the rest of the game (called a trigger strategy).

A player may normally choose to act selfishly to increase their own reward rather than play the socially optimum strategy. However, if it is known that the other player is following a trigger strategy, then the player expects to receive reduced payoffs in the future if they deviate at this stage.

An effective trigger strategy ensures that cooperating has more utility to the player than acting selfishly now and facing the other player’s punishment in the future.
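The comparison behind an effective trigger strategy can be worked through numerically. The sketch below assumes a Prisoner’s Dilemma with hypothetical payoffs (temptation 5, mutual-cooperation reward 3, mutual-defection punishment 1), a punishment that lasts forever after one deviation, and a discount factor `delta` that weights future rounds. The discount factor is not mentioned in the article; it is a standard modelling assumption added here.

```python
def cooperate_forever(reward, delta):
    """Discounted value of receiving `reward` in every round."""
    return reward / (1 - delta)


def defect_then_punished(temptation, punishment, delta):
    """Discounted value of one temptation payoff followed by punishment forever."""
    return temptation + delta * punishment / (1 - delta)


def critical_discount(temptation, reward, punishment):
    """Smallest delta at which cooperating is at least as good as deviating."""
    return (temptation - reward) / (temptation - punishment)


T, R, P = 5, 3, 1  # hypothetical payoffs
print(critical_discount(T, R, P))  # 0.5

for delta in (0.3, 0.9):
    stay = cooperate_forever(R, delta)
    deviate = defect_then_punished(T, P, delta)
    print(delta, "cooperate" if stay >= deviate else "deviate")
```

Under these assumptions a patient player (`delta` of 0.9) prefers to keep cooperating, while an impatient one (`delta` of 0.3) prefers to grab the one-off gain and accept the punishment.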

Many theorems deal with how to achieve and maintain a socially optimal equilibrium in repeated games. These results are collectively called “Folk Theorems”.


Game Theory is NOT Deep Reinforcement Learning, and has nothing to do with it. It’s a wrong analogy to make.

And by mastering Game Theory you will gain no benefit in training your Deep Reinforcement Learning models.
