Source: Deep Learning on Medium
On January 24, Google’s DeepMind showcased its project AlphaStar, the successor of AlphaGo and AlphaZero. AlphaGo had previously defeated human world champions at the game of Go, and AlphaZero had beaten the strongest chess engines. AlphaStar’s challenge was considerably more difficult though: it had to beat humans at StarCraft 2.
AlphaStar achieved a decisive victory: 10 to 1. It defeated TLO and MaNa, both Grandmaster-level StarCraft 2 players. MaNa scored the only victory for humanity in the last game, livestreamed from DeepMind’s headquarters.
Many tech leaders, including the late Stephen Hawking, have warned us about AI. Elon Musk once claimed it could become an “immortal dictator” that could wipe out humanity. Can AI cause mass unemployment, or could it have military use? With AlphaStar’s recent victory, it’s a good time to look at the balance of power between us and them.
The significance of AlphaStar’s victory
What makes StarCraft problematic for AI is that, unlike in chess and Go, the computer cannot see the entire board. Thus, it is presented with an “imperfect” version of the problem and needs to somehow predict the actions that would ultimately lead to victory much later in the game. Months before, another AI system developed by OpenAI had lost to human opponents in another real-time game: Dota 2.
AlphaStar used a combination of “supervised learning” and “reinforcement learning.” It learned first by looking at recordings of human gameplay until it reached a point where it could beat StarCraft’s internal AI 95% of the time. Then it played against itself, over and over again, to reach mastery. In total, it amassed 200 years of gameplay experience in a matter of weeks.
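To get a feel for what “200 years in a matter of weeks” implies, here is a rough back-of-the-envelope calculation. The 14-day wall-clock figure is an assumption made for illustration (the text above only says “weeks”), and the function name is made up for this sketch.

```python
DAYS_PER_YEAR = 365.25


def realtime_speedup(game_years: float, wall_clock_days: float) -> float:
    """How many times faster than real time the experience was gathered.

    This is an aggregate figure: it does not say whether the speedup came
    from running games faster, running many games in parallel, or both.
    """
    return game_years * DAYS_PER_YEAR / wall_clock_days


# Assumption: "a matter of weeks" is taken to be 14 days.
factor = realtime_speedup(game_years=200, wall_clock_days=14)
print(f"Roughly {factor:,.0f}x real time")
```

Under that assumption the agent gathered experience several thousand times faster than a human playing around the clock could.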
An unfair game
AlphaStar’s game was praised as “superhuman” and, unfortunately, it was. The tactics it pulled off required surgical clicking precision among tens of units in the midst of the battlefield, something human players simply cannot pull off. AlphaStar was using an API to interface with the game: it did not have to observe the objects, read the health and shield bars, move the mouse to select a unit and then issue a command, which requires another mouse click.
Of course, DeepMind’s team deliberately limited the number of actions per minute, and AlphaStar’s average ended up even lower than that of the human grandmasters. But at one point it reached 1500 actions per minute. In this sense, AlphaStar was not relying on superior tactics or gameplay; it relied on its superhuman clicking speed.
Furthermore, AlphaStar was seeing the complete map at once (at least the parts where it had forces). In contrast, the human player had to scroll around the map to find their units.
One could argue that this is exactly what AI is about: it can do the things humans do a hundred times better and more efficiently. But there is a problem.
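A low average and a very high peak are not contradictory. The sketch below computes both from a list of action timestamps, using a sliding window for the peak. The timestamps are synthetic and the 5-second window is an arbitrary choice for illustration; none of this is DeepMind’s actual measurement method.

```python
from collections import deque


def apm_stats(action_times_ms, game_ms, window_ms=5_000):
    """Return (average_apm, peak_apm) for action timestamps in milliseconds.

    The peak is the busiest window of `window_ms`, scaled to one minute.
    """
    times = sorted(action_times_ms)
    average_apm = len(times) * 60_000 / game_ms

    recent = deque()  # actions inside the current window
    peak_count = 0
    for t in times:
        recent.append(t)
        # Drop actions that fell out of the window (t - window_ms, t].
        while recent[0] <= t - window_ms:
            recent.popleft()
        peak_count = max(peak_count, len(recent))

    peak_apm = peak_count * 60_000 / window_ms
    return average_apm, peak_apm


# Synthetic 10-minute game: a steady 4 actions per second...
baseline = list(range(0, 600_000, 250))
# ...plus one 5-second burst of 25 actions per second mid-game.
burst = list(range(300_000, 305_000, 40))

average_apm, peak_apm = apm_stats(baseline + burst, game_ms=600_000)
print(f"average: {average_apm:.1f} APM, peak: {peak_apm:.0f} APM")
```

A single short burst barely moves the average, yet inside that burst the player is acting several times faster than at any other moment, and a fight can be decided in exactly those seconds.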
The Finite Game problem
Defeating human grandmasters at a game like StarCraft is a significant achievement. However, we must also consider an important factor. Games are used to train AIs because they offer an excellent test bed for AI strategies. Moreover, by letting the AI play against itself (reinforcement learning) we can train the system at a remarkable speed. Eventually, the benefits will pour into real-world applications:
– AlphaGo’s algorithm was used to optimize power grids
– OpenAI’s Dota 2 AI was used for the development of a robotic arm that could hold objects
– AlphaStar’s AI can be used for weather prediction, climate modeling, language understanding and more
But games are finite; there is a specific set of rules that everyone follows. Unlike games, the real world is not finite. That’s why we humans always find innovative ways to do things. AI’s strength is not creativity; it is doing what humans have already done, but with far greater precision.
The significance of AlphaStar’s defeat
Perhaps more significant than AlphaStar’s 10 wins was the single game it lost.
The AlphaStar AI that went head to head with MaNa on the livestream had the camera hack removed. As such, it was forced into much more human-like gameplay. But more specifically, it lost to MaNa’s Warp Prism tactic, where a flying vessel was used to drop forces into AlphaStar’s base and then pull them back when the AI mobilized its troops for a response.
And here’s the thing: the AI did not learn! It seemed like the AI could not recognize the flying vessel for the threat it was: a transport bringing forces far stronger than itself into the base. It saw the vessel, gave it a very low threat score (Warp Prisms cannot do any harm on their own) and thus forgot about it. MaNa was able to exploit this weakness several times, which ultimately led to AlphaStar’s defeat.
AlphaStar’s agents were trained with over 200 years of gameplay data. It took MaNa only five games to figure out a way to defeat it. And honestly, even a novice human player would have had a better response to MaNa’s tactic. It seems AlphaStar was not trained for this one. Even if it were, it would need to go through another 200 years of training as soon as Blizzard releases a StarCraft patch. The AI we deal with is still Narrow AI.
Also, it can only use one of the three StarCraft races: Protoss. It would be more challenging if it were trained on Zerg: in that race the units evolve into more advanced units, making the early choice of strategy much more important.
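The “threat score” above is a figure of speech; AlphaStar is a neural network and, as far as is publicly known, has no explicit score like this. Still, the failure mode is easy to show with a toy model. Everything in the sketch below is hypothetical: the Unit class, the two scoring functions and the damage numbers are invented for illustration and are not real game statistics.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    name: str
    damage_per_second: float  # made-up numbers, not real game stats
    cargo: List["Unit"] = field(default_factory=list)


def naive_threat(unit: Unit) -> float:
    """Judge a unit only by the damage it can deal itself."""
    return unit.damage_per_second


def cargo_aware_threat(unit: Unit) -> float:
    """Also count whatever the unit is carrying, recursively."""
    return unit.damage_per_second + sum(
        cargo_aware_threat(passenger) for passenger in unit.cargo
    )


# A transport that cannot attack, carrying two heavy ground units.
prism = Unit("Warp Prism", 0.0, cargo=[
    Unit("Immortal", 20.0),
    Unit("Immortal", 20.0),
])

print(naive_threat(prism))        # 0.0
print(cargo_aware_threat(prism))  # 40.0
```

A scorer that only looks at the transport itself rates it as harmless, while one that accounts for its cargo sees the real danger. A human player makes that leap without being told.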
Not today, Skynet!
As a fan of StarCraft, I praise DeepMind for what they have done and for raising the bar. However, we must have a realistic understanding of the potential threats AI poses.
AI can replace human workers, but it cannot eliminate them. AI can probably let 10 people run a company that would otherwise require 100, but those 10 really need to be there. AI takes over the repetitive work, offers insight and helps people make better decisions. It will be good at specific tasks, but will need humans for broader interaction. The most effective AI right now is the one that augments people rather than replaces them.