Artificial intelligence keeps getting more intelligent.
Two years ago, the Google Brain team began using machine learning techniques to teach a computer how to interpret and caption images. Sure, it won’t win any humor contests for being punny or particularly clever, but if you’re looking for a literal description of what you’re looking at, Google’s AI system has you covered.
On Thursday, the internet giant announced that it had made “the latest version of our image captioning system available as an open source model in TensorFlow.” The most recent iteration of its AI “contains significant improvements to the computer vision component of the captioning system, is much faster to train, and produces more detailed and accurate descriptions compared to the original system,” Google said.
Called “Show and Tell,” the algorithm can recognize objects in imagery with an impressive 93.9 percent accuracy rate. That’s quite the improvement from just two years ago, when the AI was still scoring in the B-range, identifying images correctly just 89.6 percent of the time. So what’s changed? In essence, Google’s tool now tries to describe objects rather than simply classifying them.
“For example, an image classification model will tell you that a dog, grass and a Frisbee are in the image,” Google noted, “but a natural description should also tell you the color of the grass and how the dog relates to the Frisbee.”
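To make that distinction concrete, the sketch below outlines the kind of encoder-decoder architecture such a captioning system uses: a pretrained CNN summarizes the image into a feature vector, and a recurrent language model generates a description word by word, conditioned on those features. This is an illustrative sketch in TensorFlow’s Keras API, not Google’s released Show and Tell (im2txt) code; the vocabulary size, layer widths, and maximum caption length are placeholder assumptions.

```python
# Illustrative encoder-decoder captioning sketch (not Google's im2txt release).
# Assumptions: vocabulary size, embedding width, and caption length are placeholders.
import tensorflow as tf

VOCAB_SIZE = 10000    # assumed vocabulary size
EMBED_DIM = 512       # assumed embedding / hidden state size
MAX_CAPTION_LEN = 20  # assumed maximum caption length

# Vision component: a pretrained CNN turns the image into a single feature vector.
cnn = tf.keras.applications.InceptionV3(include_top=False, pooling="avg")
cnn.trainable = False  # keep the pretrained vision weights frozen in this sketch

image_input = tf.keras.Input(shape=(299, 299, 3))
image_features = tf.keras.layers.Dense(EMBED_DIM, activation="relu")(cnn(image_input))

# Language component: an LSTM generates the caption one word at a time,
# conditioned on the image by way of its initial state.
caption_input = tf.keras.Input(shape=(MAX_CAPTION_LEN,))
word_embeddings = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(caption_input)
lstm_out = tf.keras.layers.LSTM(EMBED_DIM, return_sequences=True)(
    word_embeddings, initial_state=[image_features, image_features]
)
next_word_logits = tf.keras.layers.Dense(VOCAB_SIZE)(lstm_out)

model = tf.keras.Model([image_input, caption_input], next_word_logits)
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```

The point of the design is the one Google describes: instead of a classifier that outputs a fixed set of labels (“dog,” “grass,” “Frisbee”), the decoder produces a full sentence, so relationships between objects can make it into the output.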
While you may not need Google to tell you what you’re looking at on a daily basis, these machine learning capabilities could be used to help those with visual impairments, and further the work of other AI researchers. “We hope that sharing this model in TensorFlow will help push forward image captioning research and applications, and will also allow interested people to learn and have fun,” Google said.
For a full description of Google’s latest algorithm, check out “Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge,” published in IEEE Transactions on Pattern Analysis and Machine Intelligence.