Boaty McBoatface may never have become the official name of Britain’s new polar research vessel, but the spirit of the name lives on. Meet Parsey McParseface, Google’s tool for understanding English.
While the name may be groan-worthy, what Google is offering here is actually pretty powerful. The company is open-sourcing SyntaxNet, its overarching framework for parsing sentences, along with Parsey McParseface, the English-language parser that plugs into the framework.
According to Google, the framework can correctly identify different parts of a sentence, including subjects, objects, verbs, and so on, with up to 94 percent accuracy.
Going forward, understanding sentences is likely to be an important part of how computers interact with humans, both in search and beyond. To do that, computers need to grasp the subtleties of language and work out which meaning a user most likely intends. Google gives the example “Alice drove down the street in her car.” A computer could read this as Alice driving down the street while in her car, or as Alice driving down a street that is located inside her car.
To figure out sentences like this, Google says that SyntaxNet uses neural networks together with a technique called “Beam Search,” which keeps several candidate readings of a sentence in play and ranks them by probability. It can then act on the reading it finds most likely to help the user with whatever they need.
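For the curious, the general idea of beam search can be sketched in a few lines of Python. This is a toy illustration, not SyntaxNet’s code: the probabilities, the `TOY_PROBS` and `STEPS` tables, and the `beam_search` function are all made up for the example, and the real system scores its decisions with a neural network rather than a lookup table.

```python
import math

# Hypothetical probabilities for two attachment decisions. These numbers are
# invented for illustration and do not come from SyntaxNet.
TOY_PROBS = {
    "down the street -> drove": 0.9,
    "down the street -> Alice": 0.1,
    "in her car -> drove": 0.8,
    "in her car -> street": 0.2,
}

# Each step lists the competing choices for one decision about the sentence.
STEPS = [
    ["down the street -> drove", "down the street -> Alice"],
    ["in her car -> drove", "in her car -> street"],
]


def beam_search(steps, probs, beam_width=2):
    """Return up to beam_width readings as (log_prob, decisions), best first."""
    beam = [(0.0, ())]  # start with a single empty reading
    for options in steps:
        # Extend every surviving reading with every option for this step.
        candidates = [
            (log_prob + math.log(probs[option]), decisions + (option,))
            for log_prob, decisions in beam
            for option in options
        ]
        # Keep only the most probable partial readings and drop the rest.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    return beam


if __name__ == "__main__":
    for log_prob, decisions in beam_search(STEPS, TOY_PROBS):
        print(f"{math.exp(log_prob):.2f}", decisions)
```

With these made-up numbers, the top-ranked reading attaches “in her car” to “drove,” which is the sensible interpretation. The point of the beam is that the parser never has to score every possible structure, only the handful it keeps at each step.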
“It is not uncommon for moderate-length sentences — say 20 or 30 words in length — to have hundreds, thousands, or even tens of thousands of possible syntactic structures. A natural language parser must somehow search through all of these alternatives, and find the most plausible structure given the context,” said the company in a blog post.
The fact that the code for these systems is now open source is actually a pretty big deal. Hopefully, others will be able to use Google’s research to put together new apps and assistants. At 94 percent accuracy, SyntaxNet isn’t perfect, but Google says it is accurate enough to be useful in a range of applications.