Meta’s new AI model can turn text into 3D images in under a minute

an array of 3D generated images made by Meta 3D Gen
Meta

Meta’s latest foray into AI image generation is a quick one. On Tuesday, the company introduced its new “3D Gen” model, which it describes as a “state-of-the-art, fast pipeline” that transforms input text into high-fidelity 3D images in under a minute.

What’s more, the system is reportedly able to apply new textures and skins to both generated and artist-produced images using text prompts.

Per a recent study from the Meta Gen AI research team, 3D Gen will offer high-resolution textures and material maps, and will also support physically-based rendering (PBR) and generative re-texturing.

📣 New research from GenAI at Meta, introducing Meta 3D Gen: A new system for end-to-end generation of 3D assets from text in <1min.

Meta 3D Gen is a new combined AI system that can generate high-quality 3D assets, with both high-resolution textures and material maps end-to-end,… pic.twitter.com/rDD5GzNinY

— AI at Meta (@AIatMeta) July 2, 2024

The team estimates an average inference time of just 30 seconds in creating the initial 3D model using Meta’s 3D AssetGen model. Users can then go back and either refine the existing model texture or replace it with something new, both via text prompts, using Meta 3D TextureGen, a process the company figures should take no more than an additional 20 seconds of inference time.
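The timing figures above can be checked with a little arithmetic. The short Python sketch below simply adds the two reported estimates and compares the total against the one-minute claim. The `Stage` class, the function names, and the 60-second budget parameter are illustrative assumptions, not part of Meta's system or any published API; only the 30-second and 20-second figures come from the study as reported here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the two-stage process (hypothetical structure for illustration)."""
    name: str
    seconds: float  # inference-time estimate quoted in the article


# Figures as reported: ~30 s average for stage one, at most ~20 s more for stage two.
ASSET_GEN = Stage("Meta 3D AssetGen: text to initial 3D model", 30.0)
TEXTURE_GEN = Stage("Meta 3D TextureGen: refine or replace texture", 20.0)


def total_seconds(stages):
    """Sum the estimated inference time across all stages."""
    return sum(stage.seconds for stage in stages)


def fits_budget(stages, budget_seconds=60.0):
    """Return True if the combined estimate is under the given time budget."""
    return total_seconds(stages) < budget_seconds


if __name__ == "__main__":
    pipeline = [ASSET_GEN, TEXTURE_GEN]
    print(f"Estimated total: {total_seconds(pipeline):.0f} s")
    print(f"Under one minute: {fits_budget(pipeline)}")
```

Taken together, the two estimates come to roughly 50 seconds, which is consistent with the "under a minute" figure for the full text-to-textured-asset process. Because the first number is an average and the second an upper bound, individual runs could differ.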

“By combining their strengths,” the team wrote in its study abstract, “3DGen represents 3D objects simultaneously in three ways: in view space, in volumetric space, and in UV (or texture) space.” The Meta team set its 3D Gen model against a number of industry baselines and compared them on a variety of factors, including text prompt fidelity, visual quality, texture details, and artifacts. Images generated by the integrated two-stage process, which combines the functions of both models, were picked by annotators over their single-stage counterparts 68% of the time.

Granted, the system discussed in this paper is still under development and not yet ready for public use, but the technical advances that this study illustrates could prove transformational across a number of creative disciplines, from game and film effects to VR applications.

Giving users the ability to not only create but edit 3D-generated content, both quickly and intuitively, could drastically lower the barrier to entry for such pursuits. It’s not hard to imagine the effect this could have on game development, for example.

Andrew Tarantola
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…
Stability AI’s music tool now lets you generate tracks up to 3 minutes long
Soundwaves.

Fears are already growing over generative AI’s challenge to human talent in the creative industries, and an update from Stability AI on Wednesday will only serve to heighten those concerns.

The London-based startup has just released Stable Audio 2.0, the latest version of its music-generation platform.

Nvidia turns simple text prompts into game-ready 3D models
A colorful collage of images generated by Nvidia's LATTE3D.

Nvidia just unveiled its new generative AI model, dubbed Latte3D, during GTC 2024. Latte3D appears to be ChatGPT on extreme steroids. It's a text-to-3D model that accepts simple, short text prompts and turns them into 3D objects and animals within a second. Much faster than its older counterparts, Latte3D works like a virtual 3D printer that could come in handy for creators across many industries.

Latte3D was made to simplify the creation of 3D models for many types of creators, such as those working on video games, design projects, marketing, or even machine learning and training for robotics. In Nvidia's demo of the model, it appears super simple to use. Following a quick text prompt, the AI generates a 3D model and shortly after finishes it off with much more detail. While the end result is nowhere near as lifelike as OpenAI's Sora, it's not meant to be -- this is a way to speed up creating assets instead of having to build them from the ground up.

New ‘poisoning’ tool spells trouble for AI text-to-image tech
Profile of head on computer chip artificial intelligence.

Professional artists and photographers annoyed at generative AI firms using their work to train their technology may soon have an effective way to respond that doesn't involve going to the courts.

Generative AI burst onto the scene with the launch of OpenAI’s ChatGPT chatbot almost a year ago. The tool is extremely adept at conversing in a very natural, human-like way, but to gain that ability it had to be trained on masses of data scraped from the web.
