Simulators: The Key Training Environment for Applied Deep Reinforcement Learning

Deep reinforcement learning (DRL) is one of the most exciting fields in AI right now. It’s still early days, but there are obvious and underserved markets to which this technology can be applied today: enterprises that want to automate or optimize the efficiency of industrial systems and processes (including manufacturing, energy, HVAC, robotics, and supply chain systems).

But applied DRL depends on one key element: simulation environments. In this post, we’ll tell you what simulators can do, why you need them, and how you can use the Bonsai Platform + simulators to solve real business problems.

What is a simulation?

Let’s start by defining the term simulation, since it’s quite an abstract concept. Simulations can range from flight simulators to simulations of electrical and mechanical components or models of entire cities. One common definition is:

“Simulation is the imitation of the operation of a real-world process or system over time.”

Essentially, there is some kind of system that has a number of inputs, applies some mathematical functions to these inputs, and delivers back an output in the form of data that can be visual (like a robotics simulator) or just pure data (like the energy simulator, EnergyPlus).
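To make the inputs, functions, and outputs idea concrete, here is a minimal sketch in Python of a toy room-temperature model. The RoomSimulator class, its parameters, and the heat-balance formula are our own illustrative assumptions; they are not taken from EnergyPlus or any real simulator.

```python
from dataclasses import dataclass


@dataclass
class RoomSimulator:
    """Toy thermal model of one room (hypothetical, for illustration only)."""
    temperature_c: float = 15.0   # current indoor temperature (the state)
    outside_c: float = 5.0        # outdoor temperature (assumed constant)
    leak_rate: float = 0.1        # share of the indoor/outdoor gap lost per step
    heater_gain_c: float = 2.0    # degrees added per step at full heater power

    def step(self, heater_power: float) -> float:
        """Advance one time step; heater_power is clamped to [0, 1]."""
        power = min(1.0, max(0.0, heater_power))
        heat_loss = self.leak_rate * (self.temperature_c - self.outside_c)
        self.temperature_c += self.heater_gain_c * power - heat_loss
        return self.temperature_c


# Input: heater power at each step. Output: a series of temperatures (pure data).
sim = RoomSimulator()
history = [sim.step(heater_power=1.0) for _ in range(50)]
```

With the heater at full power, this toy model settles toward the temperature where heating and heat loss balance out (25 °C with the assumed defaults).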

Simulations have been used by computer scientists for quite some time, going back to the late 1950s. During the last 20 years, increased computing power and vast amounts of data have allowed simulations to dramatically increase in fidelity and value. Many leading industrial simulations match physical realities or business processes almost identically.

A huge influence has been the evolution of the digital gaming industry. Gamers wanted a more immersive experience, which required high-fidelity graphics and more realistic behavior from objects within the virtual worlds. In response, gaming middleware companies have developed and delivered powerful 2D and 3D physics engines over the past 30 years.

Simulations in Industry

By using some of these software products and a variety of mathematical libraries, enterprises are able to simulate complex systems with a large number of components. This allows subject matter experts (SMEs) to test and evaluate systems before building them in the real world. Use cases include digital twins, robotics, tuning small and large industrial machines, electrical and physical systems of many kinds, and optimizing business processes like supply chains.

While there are many custom, highly specialized simulations built around a single model, there are also a number of simulator platforms that can run a practically unlimited number of models. Examples are MATLAB Simulink (engineering and manufacturing), ANSYS (engineering), AnyLogic (supply chain), Gazebo (robotics), TRNSYS (energy), and many others.

Simulations + Deep Reinforcement Learning

Reinforcement Learning (RL) is defined as:

“An area of machine learning concerned with how software agents ought to take actions in an environment to maximize a cumulative reward”.

In other words, RL trains an agent to learn a policy for how to act by trying a large number of actions in a given environment, optimizing for a defined reward function.
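As a rough illustration of that loop, here is a minimal sketch in Python. It assumes a made-up six-cell corridor environment and uses tabular Q-learning, one simple RL algorithm that we picked for brevity. The environment, reward values, and hyperparameters are all hypothetical.

```python
import random

N_STATES = 6          # corridor cells 0..5; reaching cell 5 ends the episode
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = min(N_STATES - 1, max(0, state + action))
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # small cost per move, bonus at the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error; returns q[state][action_index]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 200:          # cap the episode length
            if rng.random() < epsilon:           # explore: try a random action
                a = rng.randrange(len(ACTIONS))
            else:                                # exploit: best known action
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate toward reward + discounted future value
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            steps += 1
    return q


q_table = train()
# Greedy policy for every non-terminal cell: the action with the highest value.
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])]
          for row in q_table[:-1]]
```

Here the reward function is the small per-move cost plus the bonus at the goal, and the learned policy is simply "pick the highest-valued action in each cell".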

Deep reinforcement learning (DRL) follows the same method, using a deep neural network to represent the policy.
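To show what "a neural network represents the policy" can look like, here is a minimal sketch of an untrained policy network in plain Python. The layer sizes, the three-value observation, and the two actions are illustrative assumptions; a real DRL system would use a deep learning library and would train these weights from rewards.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer):
    """Weighted sum of the inputs plus bias, for each unit in the layer."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def policy(observation, layers):
    """Map an observation to one probability per action."""
    hidden = observation
    for layer in layers[:-1]:
        hidden = [math.tanh(v) for v in dense(hidden, layer)]
    logits = dense(hidden, layers[-1])
    peak = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]        # softmax over actions


rng = random.Random(0)
# Hypothetical shape: 3 observation values -> two hidden layers of 8 -> 2 actions
layers = [make_layer(3, 8, rng), make_layer(8, 8, rng), make_layer(8, 2, rng)]
action_probs = policy([0.2, -1.0, 0.5], layers)
```

Training would adjust the weights so that actions leading to higher cumulative reward get higher probability; this sketch only shows the forward pass.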

Reinforcement learning requires a very high volume of “trial and error” episodes (interactions with an environment) to learn a good policy. Running that many episodes on physical equipment is slow, expensive, and often risky, so simulators are needed to achieve results in a cost-effective and timely way.

Just imagine trying to teach a robot to walk by watching a real, physical robot try and fall 100,000 times before it could successfully and consistently walk. Or training an AI to play the board game Go by actually playing a human competitor for hundreds of thousands of games. Simulators allow these episodes to happen in a digital world, training an AI to reach its full potential while saving time and money.
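A quick back-of-the-envelope calculation shows the scale of the difference. Only the 100,000 attempts come from the robot example above; the time per attempt, the simulator speedup, and the number of parallel simulators are assumptions we made up for illustration.

```python
EPISODES = 100_000                # attempts, from the robot example above
SECONDS_PER_REAL_EPISODE = 30.0   # assumption: one physical attempt plus reset
SIM_SPEEDUP = 100.0               # assumption: simulator runs 100x faster than real time
PARALLEL_SIMS = 10                # assumption: simulator copies running at once

real_hours = EPISODES * SECONDS_PER_REAL_EPISODE / 3600
sim_hours = real_hours / (SIM_SPEEDUP * PARALLEL_SIMS)

print(f"Physical robot: {real_hours:.0f} hours (~{real_hours / 24:.0f} days)")
print(f"Simulation:     {sim_hours:.2f} hours")
```

Under these assumed numbers, the physical robot needs about 833 hours of nonstop attempts, while the simulated runs finish in under an hour. Your own figures will differ, but the gap is usually large.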

Some simulations model environments in which an agent can take continuous actions that impact the state of the environment; other simulations model settings where the agent chooses from a set of discrete inputs, each of which produces a different output. Both of these types of simulations can be used for reinforcement learning.
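The difference between the two kinds of actions is easy to see in code. This is a minimal sketch using a made-up heater model; the function names, the constants, and the three heater levels are illustrative assumptions.

```python
def continuous_step(temperature_c: float, heater_power: float) -> float:
    """Continuous action: any heater power between 0.0 and 1.0."""
    power = min(1.0, max(0.0, heater_power))
    # Toy heat balance: up to 2 degrees of heating, minus 10% of the gap to 5 C outside
    return temperature_c + 2.0 * power - 0.1 * (temperature_c - 5.0)


HEATER_LEVELS = (0.0, 0.5, 1.0)   # discrete choices: off, half, full


def discrete_step(temperature_c: float, level_index: int) -> float:
    """Discrete action: pick one of a fixed set of heater settings by index."""
    return continuous_step(temperature_c, HEATER_LEVELS[level_index])


next_temp_continuous = continuous_step(15.0, 0.37)   # any value in the range is valid
next_temp_discrete = discrete_step(15.0, 2)          # only indices 0, 1, 2 are valid
```

An agent with continuous actions outputs a number (or a vector of numbers), while an agent with discrete actions outputs a choice among a fixed list of options.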

Simulations + Deep Reinforcement Learning + Bonsai

Bonsai is an artificial intelligence platform that allows enterprises to program control into industrial systems, and the only commercially available product for programming control of industrial systems using deep reinforcement learning.

Using the Bonsai Platform, enterprises can build a BRAIN (an AI model), connect the simulator of their choice, and train the BRAIN in that environment to learn a desired behavior.

To learn more about building a simulation and applying DRL to your enterprise, head to our Getting Started page.

Source: Deep Learning on Medium