Apple is tackling one of the most frustrating aspects of AI today

By Andrew Tarantola July 15, 2024


As companies like Google, Anthropic, and OpenAI update and upgrade their AI models, the way that those LLMs interact with users is sure to change as well. However, getting used to the new system can become a hassle for users who then have to adjust how they pose their queries in order to get the results they’ve come to expect. An Apple research team has developed a new method to streamline that upgrade transition while reducing inconsistencies between the two versions by as much as 40%.

As part of their study, “MUSCLE: A Model Update Strategy for Compatible LLM Evolution,” published July 15, the researchers argue that when upgrading their models, developers tend to focus more on upping the overall performance, rather than making sure that the transition between models is seamless for the user. That includes making sure that negative flips, wherein the new model predicts the incorrect output for a test sample that was correctly predicted by the older model, are kept to a minimum.
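The negative-flip idea described above can be illustrated with a few lines of Python. This is a minimal sketch and not the paper's code: the function name `negative_flip_rate`, the toy answers, and the choice to divide by the total number of samples are all assumptions made for illustration.

```python
def negative_flip_rate(labels, old_preds, new_preds):
    """Fraction of samples the old model got right that the new model gets wrong.

    Hypothetical helper: normalizing by the total sample count is an assumption.
    """
    if not (len(labels) == len(old_preds) == len(new_preds)):
        raise ValueError("all inputs must have the same length")
    if not labels:
        return 0.0
    flips = sum(
        1
        for y, old, new in zip(labels, old_preds, new_preds)
        if old == y and new != y  # correct before the update, wrong after it
    )
    return flips / len(labels)


# Made-up answers to five math questions (illustrative data only).
labels = [4, 9, 12, 7, 30]
old_preds = [4, 9, 11, 7, 29]  # old model: 3 of 5 correct
new_preds = [4, 8, 12, 7, 30]  # new model: 4 of 5 correct

rate = negative_flip_rate(labels, old_preds, new_preds)
print(rate)
```

In this toy data the new model is more accurate overall, yet it still gets one question wrong that the old model answered correctly. That is the kind of regression a user would notice even when headline accuracy improves.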


This is because, the study authors argue, each user has their own quirks, quibbles, and personalized ways of interacting with chatbots. Having to continually adjust and adapt the manner in which they interact with a model can become an exhausting affair — one that is antithetical to Apple’s desired user experience.

The research team even argues that the AI’s incorrect predictions should stay consistent between versions. “There is value in being consistent when both models are incorrect,” they wrote. “A user may have developed coping strategies on how to interact with a model when it is incorrect.”

Apple presents MUSCLE

A Model Update Strategy for Compatible LLM Evolution

Large Language Models (LLMs) are frequently updated due to data or architecture changes to improve their performance. When updating models, developers often focus on increasing overall performance… pic.twitter.com/ATm2zM4Poc

— AK (@_akhaliq) July 15, 2024

To address this, the researchers first developed metrics to measure the degree of regression between models, then developed a strategy to minimize such regressions. The result is MUSCLE, a strategy that doesn’t require developers to retrain the entire base model and instead relies on training adapters. Adapters are small AI modules that can be integrated at different points along the overall LLM.

Developers can then fine-tune these specific modules instead of the entire model. This enables the model as a whole to perform distinct tasks at a fraction of the training cost and with only a small increase in the number of parameters. They’re essentially plug-ins for large language models that allow us to fine-tune specific sections of the overall AI instead of the whole thing.
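To give a rough sense of why adapters add only a small number of parameters, here is a back-of-the-envelope sketch. The article does not say which kind of adapter the researchers used, so this assumes a low-rank (LoRA-style) adapter attached to a single weight matrix. The layer size, rank, and function names are hypothetical.

```python
def full_layer_params(d_in, d_out):
    """Parameters in one dense weight matrix of shape (d_in, d_out)."""
    return d_in * d_out


def low_rank_adapter_params(d_in, d_out, rank):
    """Parameters in a low-rank adapter: two thin matrices,
    (d_in x rank) and (rank x d_out). Assumed adapter design."""
    return rank * (d_in + d_out)


# Hypothetical sizes chosen only for illustration.
d_in, d_out, rank = 4096, 4096, 8

base = full_layer_params(d_in, d_out)
extra = low_rank_adapter_params(d_in, d_out, rank)

print(f"base layer: {base:,} parameters")
print(f"adapter:    {extra:,} parameters ({extra / base:.2%} of the layer)")
```

Under these assumed sizes the adapter trains tens of thousands of parameters next to a layer with millions, which is why fine-tuning only the adapter is so much cheaper than retraining the whole model.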

The research team upgraded LLMs including Meta’s Llama and Microsoft’s Phi as part of their study, using specific math queries as samples, and found that negative flips occurred as much as 60% of the time. By incorporating the MUSCLE strategy, the team wasn’t able to fully eliminate negative flips, but they did manage to reduce their occurrence by as much as 40% compared to the control.

