Forget text-to-image; this AI makes videos from your prompts

You’ve likely heard about the impressive results from text-to-image AI tools such as Dall-E, Stable Diffusion, and Midjourney. As you might have expected, the revolution is marching onward, and text-to-video is the next target.

QuickVid generated this video about a DJI Drone and astronauts on Mars.

Google and Meta have teased their text-to-video capabilities in research reports from their AI labs, but this advanced technology hasn’t been available to the public. If you’ve been eager to try creating entire videos from a simple AI prompt, now’s your chance, thanks to QuickVid.

Before your expectations climb too high, it’s important to realize that QuickVid isn’t generating thousands of Stable Diffusion stills and assembling them into a video, nor does it give you access to the most advanced AI systems in the world for true video generation. It’s a very early entry in the race for a text-to-video solution.

The first step of the process for the AI is to generate a script based on your prompt. I tested the system by creating a YouTube Short from these words: “A video of a DJI drone flying over an astronaut on Mars, ending with a reaction shot of the surprised astronaut.”

The AI wrote a complete, 79-word narrative from my prompt, then synthesized the speech with a choice of a male or female voice. TechCrunch pointed out that the background footage in the generated video is taken from a stock library, and there was apparently plenty of footage of “astronauts on Mars.”

As a questionable finishing touch, QuickVid overlays the script as titles and adds thumbnail images generated by the Dall-E API. The resulting YouTube Short seen above is … interesting. Perhaps it would handle more earthly videos better.
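For the curious, the steps described above can be laid out as a rough pipeline. This is only a sketch of the order of operations as reported, not QuickVid's actual code: every name here (ShortVideo, write_script, pick_stock_clip, build_short) is hypothetical, and each stage is a stand-in that returns placeholder text instead of calling a real text, speech, or image service.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ShortVideo:
    """Hypothetical container for the pieces the article describes."""
    prompt: str
    script: str = ""
    voice: str = ""
    background_clip: str = ""
    titles: List[str] = field(default_factory=list)
    thumbnail: str = ""


def write_script(prompt: str) -> str:
    # Stand-in for the script-writing step; a real tool would call a text model.
    return f"Narration for: {prompt}"


def pick_stock_clip(prompt: str, library: Dict[str, str]) -> str:
    # Stand-in for the stock-footage lookup: return the first clip whose
    # keyword appears in the prompt, or a generic fallback clip.
    lowered = prompt.lower()
    for keyword, clip in library.items():
        if keyword in lowered:
            return clip
    return "generic.mp4"


def build_short(prompt: str, voice: str, library: Dict[str, str]) -> ShortVideo:
    # The article mentions a choice between a male and a female voice.
    if voice not in ("male", "female"):
        raise ValueError("voice must be 'male' or 'female'")
    video = ShortVideo(prompt=prompt)
    video.script = write_script(prompt)                       # 1. script from prompt
    video.voice = voice                                       # 2. synthesized narration
    video.background_clip = pick_stock_clip(prompt, library)  # 3. stock footage
    video.titles = video.script.split(". ")                   # 4. script overlaid as titles
    video.thumbnail = f"thumbnail for: {prompt}"              # 5. generated thumbnail
    return video
```

Calling build_short with the prompt from my test, a voice, and a small dictionary mapping keywords such as "astronaut" to clip names would return one ShortVideo holding all five pieces. The point is simply that the footage is looked up rather than generated, which is why the result feels more like an automated edit than true text-to-video.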

In a TechCrunch interview, the developer of QuickVid said improvements are coming, with more personalization options arriving in January. Eventually, QuickVid will also include captions and support avatars.

Next year could see many more text-to-video solutions arrive, along with other visual wonders such as AR glasses and more advanced VR headsets. It should be exciting.

Alan Truly
Alan Truly is a Writer at Digital Trends, covering computers, laptops, hardware, software, and accessories that stand out as…