
The ‘most powerful AI training system in the world’ just went online

Elon Musk talks to the press as he arrives to have a look at the construction site of the new Tesla Gigafactory near Berlin.
Maja Hitij / Getty Images

The race for AI supremacy is once again accelerating as xAI CEO Elon Musk announced via Twitter that his company successfully brought its Colossus AI training cluster, which Musk bills as the world’s “most powerful,” online over the weekend.

“This weekend, the @xAI team brought our Colossus 100k H100 training cluster online. From start to finish, it was done in 122 days. Colossus is the most powerful AI training system in the world. Moreover, it will double in size to 200k (50k H200s) in a few months. Excellent work by the team, Nvidia and our many partners/suppliers,” Musk wrote in a post on X.

Musk’s “most powerful” claim is based on the number of GPUs employed by the system. With 100,000 Nvidia H100s driving it, Colossus is estimated to be larger than any other AI system developed to date.

Musk began purchasing tens of thousands of GPUs in April 2023 to accelerate his company’s AI efforts, shortly after penning an open letter calling for an industrywide, six-month “pause” on AI development. In March of that year, Musk claimed that the company would leverage AI to “detect & highlight manipulation of public opinion” on Twitter, though the GPU supercomputer will likely also be leveraged to train its large language model (LLM), Grok.

Grok was introduced by xAI in 2023 in response to the success of rivals like ChatGPT, Gemini, Llama 3.1, and Claude. The company released the updated Grok-2 as a beta in August. “We have introduced Grok-2, positioning us at the forefront of AI development,” xAI wrote in a recent blog post. “Our focus is on advancing core reasoning capabilities with our new compute cluster. We will have many more developments to share in the coming months.”

Musk claims that he can also develop Tesla into “a leader in AI & robotics.” However, a recent report from CNBC suggests that Musk has been diverting shipments of Nvidia’s highly sought-after GPUs from the electric automaker to xAI and Twitter. Doing so could delay Tesla’s efforts to install the compute resources needed to develop its autonomous vehicle technology and the Optimus humanoid robot.

“Elon prioritizing X H100 GPU cluster deployment at X versus Tesla by redirecting 12k of shipped H100 GPUs originally slated for Tesla to X instead,” an Nvidia memo from December obtained by CNBC reads. “In exchange, original X orders of 12k H100 slated for [January] and June to be redirected to Tesla.”

Andrew Tarantola
Andrew has spent more than a decade reporting on emerging technologies ranging from robotics and machine learning to space…
OpenAI just took the shackles off the free version of ChatGPT

OpenAI announced the release of its newest snack-sized generative model, dubbed GPT-4o mini, which is both less resource-intensive and cheaper to operate than its standard GPT-4o model, allowing developers to integrate the AI technology into a far wider range of products.

It's a big upgrade for developers and apps, but it also expands the capabilities of, and reduces the limitations on, the free version of ChatGPT. GPT-4o mini is available starting today to users and developers on the Free, Plus, and Team tiers through the ChatGPT web and mobile apps, while ChatGPT Enterprise subscribers will gain access next week. GPT-4o mini replaces the company's existing small model, GPT-3.5 Turbo, for end users.

This new free tool lets you easily train AI models on your own

Gigabyte has announced the launch of AI TOP, its in-house software utility designed to bring advanced AI model training capabilities to home users. Making its first appearance at this year’s Computex, AI TOP allows users to locally train and fine-tune AI models with a capacity of up to 236 billion parameters when used with recommended hardware.

AI TOP is essentially a comprehensive solution for local AI model fine-tuning, enhancing privacy and security for sensitive data while providing maximum flexibility and real-time adjustments. According to Gigabyte, the utility comes with a user-friendly interface and has been designed to help beginners and experienced users easily navigate and understand the information and settings. Additionally, the utility includes AI TOP Tutor, which offers various AI TOP solutions, setup guidance, and technical support for all types of AI model operators.

Apple is tackling one of the most frustrating aspects of AI today

As companies like Google, Anthropic, and OpenAI update and upgrade their AI models, the way that those LLMs interact with users is sure to change as well. However, getting used to the new system can become a hassle for users who then have to adjust how they pose their queries in order to get the results they've come to expect. An Apple research team has developed a new method to streamline that upgrade transition while reducing inconsistencies between the two versions by as much as 40%.

As part of their study, "MUSCLE: A Model Update Strategy for Compatible LLM Evolution," published July 15, the researchers argue that when upgrading their models, developers tend to focus more on upping the overall performance, rather than making sure that the transition between models is seamless for the user. That includes making sure that negative flips, wherein the new model predicts the incorrect output for a test sample that was correctly predicted by the older model, are kept to a minimum.
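The "negative flip" idea the researchers describe can be made concrete with a small sketch. The function and sample data below are purely illustrative (they are not from the MUSCLE paper's code): it counts the samples a hypothetical old model classified correctly that a hypothetical new model now gets wrong, and reports that count as a fraction of all samples.

```python
# Illustrative sketch of a negative-flip-rate metric. All names and
# data here are hypothetical; the MUSCLE paper's own implementation
# and evaluation setup may differ.

def negative_flip_rate(labels, old_preds, new_preds):
    """Fraction of samples the old model got right but the new model gets wrong."""
    flips = sum(
        1
        for y, old, new in zip(labels, old_preds, new_preds)
        if old == y and new != y
    )
    return flips / len(labels)

# Toy example: both models score 3/4 overall, yet the upgrade still
# "flips" one previously correct answer ("b") to an incorrect one.
labels    = ["a", "b", "c", "d"]
old_preds = ["a", "b", "c", "x"]
new_preds = ["a", "x", "c", "d"]
print(negative_flip_rate(labels, old_preds, new_preds))  # → 0.25
```

The toy example also shows why aggregate accuracy alone can hide the problem: both models are equally accurate overall, but a user who relied on the old model's answer for the flipped sample would experience the upgrade as a regression.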
