Reinforcement Learning and Artificial Intelligence

The name is Bond, James Bond, and I have come here on a mission to save the world. How will I do it? By beating the best player in the world at the game of Go. That’s right: I will not only beat the best player in the world, but I will do it with ease!

That was the story of AlphaGo, the computer Go player created by Google DeepMind. The recipe sounds easy: the computer ran 24/7 for a while, playing as many games as possible to get ready for its opponent, Lee Sedol, widely regarded as one of the greatest players in the world. Round after round, it was rewarded and punished for its moves as well as for its wins and losses.

I bring this up because, over the weekend, my good friend and mentor Julian Trajanson, who works in the R&D department at, was featured on the podcast :

Alongside him was Jason Nichols, the director of Walmart’s AI department. They talked about reinforcement learning, a topic I have researched extensively.

Here is the scenario!

Picture a Mario level: you are Mario, and you are trying to reach the flag drooping at the end of the level. Do you jump now, or run? If you jump now for no purpose, you lose a point (-1). If you run now, you gain a point (+1), and if you then jump later when it counts, you gain another (+1). If you get hit by the enemy mushroom, you lose five points (-5). The question then becomes how many times you need to play the game to learn not just to reach the flag, but to reach it in the quickest time. The algorithms used are neural networks: layers of interconnected units loosely modeled on the brain’s own networks of neurons.
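To make the scoring concrete, here is a minimal sketch of the Mario scenario in Python. It is not how any real game or production system is built. The level layout, the rule that a jump carries Mario two cells forward, and the names `LEVEL`, `step`, `train` and `greedy_run` are all my own assumptions for illustration. It also swaps the neural network for a plain lookup table (tabular Q-learning), which is enough for a level this small. Only the point values (-1, +1, -5) come from the scenario.

```python
import random

# Hypothetical level: '.' is ground, 'G' is an enemy mushroom, 'F' is the flag.
LEVEL = "..G..G..F"
ACTIONS = ("run", "jump")
FLAG = len(LEVEL) - 1


def step(pos, action):
    """Apply one action and return (new_pos, reward, done)."""
    if action == "run":
        new_pos = pos + 1
        if LEVEL[new_pos] == "G":
            return new_pos, -5, True           # ran into the enemy
        return new_pos, 1, new_pos == FLAG     # a safe step forward
    # Assumption: a jump skips the next cell and lands two cells ahead,
    # capped at the flag.
    new_pos = min(pos + 2, FLAG)
    if LEVEL[new_pos] == "G":
        return new_pos, -5, True               # landed on the enemy
    # A jump only pays off when it clears an enemy; otherwise it has no purpose.
    reward = 1 if LEVEL[pos + 1] == "G" else -1
    return new_pos, reward, new_pos == FLAG


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: play the level many times and update a score table."""
    rng = random.Random(seed)
    q = {(p, a): 0.0 for p in range(len(LEVEL)) for a in ACTIONS}
    for _ in range(episodes):
        pos, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)   # explore: try something random
            else:
                action = max(ACTIONS, key=lambda a: q[(pos, a)])  # exploit
            new_pos, reward, done = step(pos, action)
            future = 0.0 if done else max(q[(new_pos, a)] for a in ACTIONS)
            q[(pos, action)] += alpha * (reward + gamma * future - q[(pos, action)])
            pos = new_pos
    return q


def greedy_run(q):
    """Play one game, always picking the highest-scored action."""
    pos, done, moves, total = 0, False, [], 0
    while not done:
        action = max(ACTIONS, key=lambda a: q[(pos, a)])
        pos, reward, done = step(pos, action)
        moves.append(action)
        total += reward
    return moves, total


q_table = train()
moves, total = greedy_run(q_table)
print(moves, total)
```

With these settings, the learned route should be run, jump, run, jump, run, run, for a total of 6 points. The discount factor `gamma` is what nudges the player toward finishing quickly, since points earned sooner count for more than points earned later.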

A real-life example of deep learning at work is Facebook, which has had great success identifying faces in photographs. It’s not just a marginal improvement, but a game changer: “Asked whether two unfamiliar photos of faces show the same person, a human being will get it right 97.53 percent of the time. New software developed by researchers at Facebook can score 97.25 percent on the same challenge, regardless of variations in lighting or whether the person in the picture is directly facing the camera.” Strictly speaking, a face recognition system like this learns from labeled examples rather than from rewards and punishments, but it is built on the same kind of deep neural networks that power modern reinforcement learning.

Programmers analyze the results and decide which rewards and punishments to hand out. Facial recognition and object recognition will shape how technology, robots, and self-driving cars interact with our world. The project that Jason is working on is secret, but it will impact the future greatly.

Source: Deep Learning on Medium