Artificial intelligence can pick out objects, faces, and landmarks in a simple photograph, sometimes in more detail than human eyes can discern. Now, more of those features will be coming to mobile devices, thanks to Google’s release of MobileNets software.
Google released MobileNets as open source software on Wednesday, opening up its computer vision neural networks for other programmers to incorporate into their apps. The programming is designed specifically to run on the smaller hardware of mobile devices, overcoming some of the biggest obstacles in bringing computer vision to smartphones through a design that makes the most of mobile processors. The program does not create new capabilities but brings computer vision into a package small enough to run on a mobile device without sending data to the cloud, which means apps using the programming would not need an internet connection.
The programming gives smartphones and tablets the ability to recognize objects, people, and even popular landmarks. Google also lists fine-grained classification, like determining what breed a particular dog is, among the possible uses for the program.
For mobile users, the release means that third-party apps may soon be getting new or enhanced computational imaging features. By making the programming open source, Google is opening up the software for use in more than just Google-owned apps. The programming can be expanded for a number of different uses, from reverse image searches to augmented reality.
The ability to recognize objects and faces in a photograph using a neural network is not new, but Google’s MobileNets are more efficient, creating a smaller, faster program for using the features on mobile devices, even when an internet connection is not available.
“Deep learning has fueled tremendous progress in the field of computer vision in recent years, with neural networks repeatedly pushing the frontier of visual recognition technology,” wrote Andrew Howard and Menglong Zhu, both Google software engineers. “While many of those technologies such as object, landmark, logo and text recognition are provided for internet-connected devices through the Cloud Vision API, we believe that the ever-increasing computational power of mobile devices can enable the delivery of these technologies into the hands of our users, anytime, anywhere, regardless of internet connection.”