Google strikes back with an answer to OpenAI’s Sora launch

Veo 2 on VideoFX
Google DeepMind

Google’s DeepMind division unveiled its second-generation Veo video generation model on Monday. Veo 2 can create clips up to two minutes long at resolutions up to 4K — six times the length and four times the resolution of the 20-second, 1080p clips Sora can generate.

Of course, those are Veo 2’s theoretical upper limits. The model is currently only available on VideoFX, Google’s experimental video generation platform, and its clips are capped at eight seconds and 720p resolution. VideoFX is also waitlisted, so not just anyone can log on to try Veo 2, though the company announced that it will be expanding access in the coming weeks. A Google spokesperson also noted that Veo 2 will be made available on the Vertex AI platform once the company can sufficiently scale the model’s capabilities.

“Over the coming months, we’ll continue to iterate based on feedback from users,” Google DeepMind product VP Eli Collins told TechCrunch, “and [we’ll] look to integrate Veo 2’s updated capabilities into compelling use cases across the Google ecosystem … We expect to share more updates next year.”

Today, we’re announcing Veo 2: our state-of-the-art video generation model which produces realistic, high-quality clips from text or image prompts. 🎥

We’re also releasing an improved version of our text-to-image model, Imagen 3 – available to use in ImageFX through… pic.twitter.com/h6ejHaMUM4

— Google DeepMind (@GoogleDeepMind) December 16, 2024

Veo 2 reportedly holds a number of advantages over its predecessor, including a better understanding of physics (think improved fluid dynamics and more accurate illumination and shadowing) as well as the capacity to generate “clearer” video clips, meaning textures and images are sharper and less prone to blurring during motion. The new model also offers improved camera controls, letting users position the virtual camera lens with greater precision than before.

As TechCrunch notes, Veo 2 has not yet perfected the video generation process, though it does appear to hallucinate far less than rivals like Sora, Kling, Movie Gen, or Gen 3 Alpha. “Coherence and consistency are areas for growth,” Collins said. “Veo can consistently adhere to a prompt for a couple minutes, but [it can’t] adhere to complex prompts over long horizons. Similarly, character consistency can be a challenge. There’s also room to improve in generating intricate details, fast and complex motions, and continuing to push the boundaries of realism.”

Google also announced improvements to Imagen 3 on Monday, enabling the commercial image generation model to create “brighter, better-composed” outputs. The model, available on ImageFX, will also offer additional descriptive suggestions based on keywords in the user’s prompt, with each keyword spawning a drop-down menu of related terms.

Andrew Tarantola