AI image generation just took a huge step forward

We’ve been living with AI-generated images for a while now, but this week, several of the major players took big steps forward. In particular, I’m talking about significant updates to Midjourney and Grok, plus Google’s new model.

Each company is advancing the technology at its own pace and in its own direction. It’s still very much an open playing field, and together these releases show just how far the technology has come.

Midjourney hits the web

Professor Dumbledore by a pool in Wes Anderson's Harry Potter.
An AI image generated in Midjourney. Channel/Midjourney

Let’s start with Midjourney, which quietly rolled out a new web editor late Thursday that assembles a number of useful image manipulation tools into a single user interface.

Previously, functions like reframing, repainting (adding AI-generated assets to or modifying an existing image), panning, canvas extension (expanding the boundaries of the image and generating content to fill), and zooming each required its own tool. Those tools were scattered across multiple menus, forcing creators to constantly switch back and forth. The new UI offers a more coherent and streamlined editing process, a marked departure from the program’s start on Discord.

Midjourney just released their web editor!!

It's actually one of the coolest features they've dropped in a while

essentially lets you do inpainting, panning, zooming and more all in a single step

really really powerful pic.twitter.com/Wgyi9ElE5N

— Nick St. Pierre (@nickfloats) August 16, 2024

The new web editor is designed to make editing AI-generated images easier and more seamless, according to a recent Discord post from Midjourney CEO David Holz. “We think this makes editing your MJ images way more seamless than before and is a huge step forward,” he wrote.

Though Midjourney continues to migrate away from Discord toward being a web-based application, the company also announced that it will mirror messages from popular channels like “daily-theme,” “prompt-craft,” and “general-1” between its web rooms and Discord channels so that people can follow those threads from whichever platform they prefer. The company also introduced a new selection tool that works like a digital brush and replaces both the square selection and lasso tools.

The new editor is available to all Midjourney users who have already generated more than 10 images on the platform. Initial reactions from the creator community have been largely positive.

The editor comes two weeks after the release of Midjourney 6.1, which improved image quality and coherence (such as rendering the correct number of fingers), along with significantly faster processing times and more accurate handling of text in image prompts.

Grok-2 unleashes the monster

The Midjourney update also comes just two days after Elon Musk’s xAI startup released Grok-2, the second big development this week.

Grok’s image generation capabilities are powered by the Flux.1 model from Black Forest Labs, which has been quickly growing in popularity due to its impressive image quality and free use.

24 hours since the launch of Grok 2.0 and its image creation capabilities!

I've prepared these 9 examples for you to make the most of it + access to an enless prompt library!

(Bookmark this for later) pic.twitter.com/7EDYSogfV2

— TechHalla (@techhalla) August 15, 2024

The biggest controversy with Grok-2 is not its quality, which is quite good, but its seemingly undefined guidelines. Unlike many of the other AI image generators, Grok-2 appears to have very few guardrails around intellectual property, violence, and other explicit content. This isn’t the first time an AI image generator has seen this type of blunder, but with Grok, it feels intentional, with Musk calling it “the most fun AI in the world.”

People have already tested its limits and created all kinds of awful and bizarre imagery, evoking the early days of AI image generation. But if you believe Musk’s rhetoric, Grok-2’s lack of guidelines seems purposeful and could end up shaping how this technology evolves in the future.

Google gets competitive with Imagen 3

An AI image generated by Google’s Imagen 3 model. Google

Lastly, Google announced its new Imagen 3 AI model, which was released to all U.S. users on Thursday. Google calls it its “highest quality text-to-image model,” now able to produce “better detail, richer lighting and fewer distracting artifacts than our previous models.” Google also says that Imagen 3 is better at rendering text and now comes in different versions built for the task at hand, such as something light like a quick sketch or something much more detailed and high-resolution.

For now, Imagen 3 is only available through Google’s AI Test Kitchen, as part of ImageFX. This currently is in closed beta, meaning you’ll have to join the waitlist if you aren’t already a participant.

Andrew Tarantola
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…