The Magic and Mystique of Deep Reinforcement Learning

Source: Deep Learning on Medium

When I was younger I trained as a magician and at one point practiced magic professionally. While my career as “Magical Micheal” was thankfully short-lived, I did learn the power of magic, illusion and deception. Now, if you think I am about to suggest that Deep Reinforcement Learning (DRL) is a trick, you’re wrong. It isn’t. DRL very much works. In fact, that may be the problem: it works so well in the hands of experienced practitioners that it has become wrapped in a dark mystique. Of course, that may be a bigger problem with Deep Learning itself rather than something specific to Reinforcement Learning. Either way, do these technologies need to be wrapped in mystery, or should they be accessible to everyone?

Magicians once strictly followed the mantra of never telling a layperson how a trick was done. You could, of course, teach other magicians who proved themselves worthy. That was how magic was taught for hundreds of years, until two guys named Penn and Teller came along. Penn and Teller not only told people how tricks were done, they went a step further and blew your mind by doing another, bigger trick on top of the one they had just taught you. It was brilliant, and not only did they change magic, they likely changed the way we as an audience now challenge all our performers. A recent clip showing Penn and Teller in action is provided below:

Now, to continue borrowing the magic analogy, and in particular the reference to Penn and Teller: many now feel the big players in DRL are merely doing a Penn and Teller, telling us how something works but not explicitly showing us how it works. Take, for instance, the recent and impressive results from Google DeepMind with AlphaStar, the AI that beat human players at StarCraft II, shown below:

Of course, Google and others publish papers going into some detail on how these algorithms work but, as they say, the devil really is in the details. So, in some ways, is proving technologies like Reinforcement Learning on games anything more than a Penn and Teller style magic trick? After all, what is the end game here? Because teaching bots to beat games can’t be it.

Practical Applications

So how do Google, OpenAI, Unity and others break the mystique of DRL? Well, in my mind this is simple: it just comes down to demonstrating the tech in a practical manner. Now, I am not ditching games. I was a game developer, after all. What I am suggesting is something more real-world, at least from the bigger players like DeepMind. Otherwise, I fear DRL is going to end up in some dark closet, hidden again in mystique for the next 10 years. My hope is that my fear is unfounded and that Google and others quickly prove this technology in other practical applications. That, of course, remains to be seen, but perhaps a little bit of prodding wouldn’t hurt. :)