Facebook’s new A.I. takes image recognition to a whole new level

“Hit me,” says Morpheus. “If you can.” Neo adopts a martial arts fighting pose, then launches a furious flurry at his mentor, flailing at him with high-speed strikes. Morpheus blocks every attempted attack effortlessly. The scene is, of course, the training sequence from 1999’s The Matrix, a movie that blew minds at the time with its combination of artificial intelligence-focused storyline and cutting-edge computer graphics.

More than 20 years later, the scene is being used as part of a Facebook demo to show me some of the company’s groundbreaking A.I. image recognition technology. On the screen, the scene plays out as normal. Almost. While Morpheus and the background are exactly the same, the 2D footage of Keanu Reeves has been transformed into a 3D model. Although this particular “pose estimation” demo is pre-rendered, Andrea Vedaldi, one of Facebook’s A.I. experts when it comes to computer vision and machine learning, said that such transformations can be rendered in real time.

That means Facebook’s A.I. algorithms can watch a regular video and, as it is playing, figure out how to turn it into a fully 3D scene, frame by frame. The Matrix is a particularly impressive example of this because the exquisitely choreographed kung fu-style maneuvers are difficult for even a human to process, let alone for a machine to extrapolate. The system is not perfect at it, but it’s pretty darn good.

“This is a very, very challenging video because it shows you … acrobatic poses,” Vedaldi told Digital Trends, with almost a hint of apology creeping into his voice. “[It’s not] what you would typically see in a user application. This is done for fun just to demonstrate the capabilities of the system.”

Faced with a simpler task — say, turning a video of your kid’s soccer practice into wireframe models or doing the same thing with a still holiday snap — Facebook’s algorithms are considerably more adept. And they’re getting way better all the time.

Extracting data from images

This might seem a strange piece of research for Facebook to focus on. Better news feed algorithms? Sure. New ways of suggesting brands or content you could be interested in interacting with? Certainly. But turning 2D images into 3D ones? This doesn’t immediately seem like the kind of research you’d expect a social media giant to be investing in. But it is, even if there’s no immediate plan to turn this into a user-facing feature on Facebook.

For the past seven years, Facebook has been working to establish itself as a leading presence in the field of artificial intelligence. In 2013, Yann LeCun, one of the world’s foremost authorities on deep learning, took a job at Facebook to do A.I. on a scale that would be almost impossible in 99% of the world’s A.I. labs. Since then, Facebook has expanded its A.I. division — called FAIR (Facebook A.I. Research) — all over the world. Today, it dedicates 300 full-time engineers and scientists to the goal of coming up with the cool artificial intelligence tech of the future. It has FAIR offices in Seattle, Pittsburgh, Menlo Park, New York, Montreal, Boston, Paris, London, and Tel Aviv, Israel — all staffed by some of the top researchers in the field.

Figuring out how to better understand photos is a big focus for the company. Since 2017, Facebook has used artificial neural networks to auto-tag people in photos even when they are not manually labeled by users. Since then, the social media giant’s image recognition technology has gotten increasingly sophisticated.

Ironically, one of the most recent ways this was highlighted to users was when Facebook experienced problems. In July 2019, a temporary outage stopped many photos from showing up on Facebook. In their place were borked image frames accompanied by the machine learning-generated tags describing what the company’s A.I. thought was in the pictures. Such a tag might, for instance, read: “Image may contain: Tree, sky, outdoor, nature, cat, people standing.” To return to The Matrix, it’s reminiscent of the final scenes of the first movie in which Neo, having achieved digital enlightenment, is able to see the world not as images, but as endless trailing lines of code.

Facebook’s now going further than playing “Where’s Waldo” with 2D images. As a slide that accompanies Facebook’s The Matrix demo makes abundantly clear: “We wish to understand everything in 3D, on a single glance.” It’s not just people, either. “We would like to really get A.I. to be able to understand the world just as we do,” Vedaldi said.

That means showing it a picture of an airplane and having it be able to recognize it as a plane, understand its shape in 3D space, and predict how it will move. Same thing for a chair. Or a bird. Or a car. Or a person doing yoga.

Coming soon to an app near you?

The rationale for this totally makes sense. No, this demo’s not going to pop up in a Facebook feature next week, but training an A.I. to better understand the world through the images that it sees is clearly of interest to Facebook’s overall business model. In Facebook’s life span, more than 250 billion photos have been uploaded to the platform. This translates to approximately 350 million every single day. Facebook also owns Instagram, which has had approximately 40 billion photos and videos uploaded since its inception, and some 95 million added every single day.

Images are one of the main ways that people communicate on social media, so understanding what goes on in them is immensely valuable, in all sorts of ways, to Facebook’s mission. Being able to understand and interact with images in three dimensions will also allow Facebook to thrive in new technologies like augmented reality. Imagine an AR app that turns your classic 2D Facebook photos, dating back years, into 3D ones you can explore in augmented reality. Will Facebook create such a thing? It’s not saying, but the technology is certainly there to make it, and a whole lot more, possible.

“The direction of our research here is pretty consistent with the company priorities,” Natalia Neverova, research lead at FAIR in Paris, told Digital Trends. “We would expect that at least a large chunk of our research eventually would be used for products. But I cannot tell specific timelines or applications.”

Luke Dormehl
Former Digital Trends Contributor