In March 2016, Google’s DeepMind A.I. AlphaGo took on 18-time world champion Lee Sedol in a five-game match of Go, a complex abstract strategy board game. AlphaGo won four of the five games. DeepMind

The history of artificial intelligence is as much marked by what computers can’t do as what they can.

That’s not to say that the history of A.I. is a history of failure, but rather that, as a discipline, it has been driven forward in its quest for machine intelligence by a constant series of skeptical statements suggesting that “a computer will never [insert feat here].”

A computer will never learn. A computer will never play chess. A computer will never win at the game show Jeopardy! A computer will never be any good at translating languages. A computer will never drive a car. A computer will never win at Go, or StarCraft, or Texas Hold ’Em.


Time and again, our list of arbitrary tasks that a computer will never be able to do is proven wrong — usually by stubborn computer scientists doing it precisely because skeptics thought it couldn’t be done.

Jump forward to 2019, and tasks a computer will “never” be able to do look a bit thinner on the ground. We’ve got lawyer bots, able to dispense legal advice at a fraction of the cost of flesh-and-blood lawyers. We’ve got robots that can execute the kind of parkour moves that would impress any action movie star. Heck, machines are even painting pictures that sell for big bucks at auction.

Cue the fireworks

What’s left, then? The answer, at least according to researchers from the Alphabet-owned DeepMind Technologies and the University of Oxford, is “Hanabi.” If you’re confused, you’re not alone.

Hanabi, the Japanese word for fireworks, is a cooperative card game in which players work together to build up a series of cards in a specific order to set off a simulated fireworks show. The unique twist is that each player can see everyone’s cards but their own. The game, which has only been around for a decade, won the prestigious “Spiel des Jahres” prize for best board game in 2013.
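The hidden-hand twist is easy to picture in code. What follows is a minimal Python sketch, not anything taken from the DeepMind paper: the `visible_hands` helper, the player names, and the example cards are all hypothetical and exist only to illustrate the rule.

```python
from typing import Dict, List, Tuple

Card = Tuple[str, int]  # (color, rank), e.g. ("red", 1)


def visible_hands(hands: Dict[str, List[Card]], viewer: str) -> Dict[str, List[Card]]:
    """Return every hand except the viewer's own (hypothetical helper)."""
    return {player: cards for player, cards in hands.items() if player != viewer}


# Made-up hands for three players, purely for illustration.
hands = {
    "alice": [("red", 1), ("blue", 3)],
    "bob": [("green", 2), ("white", 5)],
    "carol": [("yellow", 4), ("red", 2)],
}

# Alice sees Bob's and Carol's cards, but not her own.
alice_view = visible_hands(hands, "alice")
```

Every player holds a different slice of the full picture, which is why the game cannot be played well without some way of passing information around the table.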

Jakob Foerster, a former intern at DeepMind.

So why exactly is Hanabi the next great benchmark for A.I. to reach?

“Most games are focused on competition between the different players,” Jakob Foerster, a PhD student at the University of Oxford, who was previously an intern at DeepMind, told Digital Trends. “You can think, for example, of chess, poker, and StarCraft. In these games, there typically isn’t a good reason for players to cooperate or communicate with each other. However, communication and cooperation are omnipresent and essential features of human life. Humans spend a vast amount of time communicating with each other in a variety of settings, be it on a personal level or through media.”

“As a researcher, I have been fascinated by how A.I. agents can learn to communicate and cooperate with each other and ultimately also humans. Hanabi presents a unique opportunity for a grand challenge in this area, since it requires the players to reason over the intent, beliefs, and point of view of other players, which are all essential features for cooperation and communication.”

In Hanabi, players must communicate with each other to find out which cards they should play and which they should discard. The inherent challenge is that communication is restricted to costly hint actions. These use up a limited quantity of hint tokens that are available in the game.
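To make the cost of a hint concrete, here is a rough Python sketch of the token mechanic. It is an illustration under stated assumptions, not the researchers' implementation: the `HintPool` class is hypothetical, and the starting count of eight tokens comes from the standard tabletop rules rather than from anything quoted in this article.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Card = Tuple[str, int]  # (color, rank)

# Assumption: eight tokens, as in the standard tabletop rules.
MAX_HINT_TOKENS = 8


@dataclass
class HintPool:
    """Hypothetical tracker for the shared supply of hint tokens."""

    tokens: int = MAX_HINT_TOKENS

    def give_hint(
        self,
        hand: List[Card],
        color: Optional[str] = None,
        rank: Optional[int] = None,
    ) -> List[int]:
        """Spend one token and return the positions in `hand` that match the hint."""
        if (color is None) == (rank is None):
            raise ValueError("a hint names exactly one color or one rank")
        if self.tokens == 0:
            raise RuntimeError("no hint tokens left")
        self.tokens -= 1
        return [i for i, (c, r) in enumerate(hand) if c == color or r == rank]


pool = HintPool()
bob_hand = [("green", 2), ("white", 5), ("green", 4)]

# Telling Bob which of his cards are green costs the team one token.
green_positions = pool.give_hint(bob_hand, color="green")
```

A single hint reveals only one color or one rank, so the team has to squeeze as much meaning as possible out of every token it spends.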


Successful players must convey extra information by agreeing on conventions, along with reasoning over intents, beliefs, and points of view of other players in the game.

“These aspects around communication, theory of mind, and cooperation make Hanabi unique compared to other benchmarks,” Foerster continued.

Lighting the fuse

In a recently published paper, DeepMind researchers propose two challenges for a Hanabi-playing A.I. The first is to learn to play the game successfully with copies of itself. This will require an impressive amount of innovation in machine learning methods. Even just calculating the number of available moves is tricky; with a 50-card deck and a massive number of possible hands, it’s extraordinarily challenging computationally.
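For a rough sense of scale, the short Python sketch below builds a 50-card deck and counts the five-card hands a single player could be dealt. The deck composition (five colors, with three 1s, two each of 2s, 3s, and 4s, and one 5 per color) and the five-card hand size follow the standard tabletop rules and are assumptions here, not figures from the paper.

```python
from math import comb

# Assumption: standard deck composition, not stated in the article.
COLORS = ["red", "yellow", "green", "blue", "white"]
COPIES_PER_RANK = {1: 3, 2: 2, 3: 2, 4: 2, 5: 1}

deck = [
    (color, rank)
    for color in COLORS
    for rank, copies in COPIES_PER_RANK.items()
    for _ in range(copies)
]

# Assumption: five cards per hand, as in a two- or three-player game.
HAND_SIZE = 5

# Treats every physical card as distinct, so look-alike hands are counted separately.
hands_for_one_player = comb(len(deck), HAND_SIZE)  # 2,118,760
```

That figure covers just one player's hand at the very start of a game. An agent also has to reason about what it believes it is holding, what its teammates believe, and how those beliefs shift with every hint, which is where the real difficulty lies.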


The tougher challenge is to get an A.I. to play with unknown teammates and humans. This will require capabilities such as understanding the intent and point of view of others, and adapting to their approach. Humans typically learn at an early age that not everyone thinks in exactly the same way, but it’s a philosophically difficult idea for a machine to grapple with.

“We can essentially think of the first part as being a ‘search over conventions,’ which is technically hard but at least in principle can be written down,” Nolan Bard, a DeepMind research scientist, told us. “In contrast, in the second part A.I. systems may have to understand the ‘conventions for search’ — [in other words] how teammates decide on which moves to make.”

“While we are starting out with settings in which A.I.s learn to communicate with a fixed set of teammates, the second part of the challenge includes settings where agents need to learn to adjust to new teammates. While all of these are vital aspects for A.I. agents in order to interact smoothly with humans and other agents, they are currently not represented in the A.I. benchmarks. Our hope is for the Hanabi Learning Environment to play a vital role in filling this gap.”

The researchers claim that A.I. is now at the point where this is an approachable challenge for machines to take on. It won’t be easy, though, and will require big advances in fields like reinforcement learning, game theory, and more. To help drive research forward, the team has created an open-source Hanabi environment for other researchers to use as the basis for their work.

“I really don’t like stating numbers, but if I was forced to make a guess I would expect at least another five years or so,” Foerster continued. “However, I look forward to someone having some good ideas and proving me wrong. Inspiring innovation is the main motivation for this benchmark, so in some sense it would be a great success to be proven wrong.”

Your move, Skynet!


