“A new era of mobile AI.” That’s how Samsung is hyping up its upcoming slate of smartphones in the Galaxy S24 series. For its Unpacked event happening later this month, the company is promising “an all-new mobile experience powered by AI.”
Samsung won’t be the first name to dip its toes into the AI hype stirred up by the likes of ChatGPT and Midjourney. The two smartphone silicon heavyweights – Qualcomm and MediaTek – recently made a huge show about the on-device generative AI capabilities of their latest flagship and mid-range processors.
The shift is evident. AI is going to be the marketing buzzword for phones. But what exactly are we in for? How these so-called “AI advances” will add any meaningful value remains unclear. Or maybe it’s just an existing trick waiting to be repackaged under a different name or native app.
Samsung can deliver the goods or falter spectacularly
Let’s start with the Galaxy S24 series phones. They are going to ship with Qualcomm’s Snapdragon 8 Gen 3 chip. The chipmaker is making some bold claims about the generative AI capabilities of its new top-tier chipset. For example, it is said to produce an image from text prompts within a second using the Stable Diffusion AI tech.
Qualcomm has offered a fascinatingly technical explanation of how it optimized Stable Diffusion tech for on-device operation. Rival MediaTek also claims that the on-device AI chops of its flagship Dimensity 9300 will allow text-to-image generation in less than a second courtesy of Stable Diffusion.
Right now, we already have a phone powered by Qualcomm’s latest flagship. The device in question is the iQoo 12, but it’s interesting to see that the phone’s marketing materials don’t mention any generative AI tricks, especially the kind being hyped by Qualcomm and MediaTek.
What do I do with these AI-generated ninja cat pictures?
Let’s assume Samsung will be the one destined to offer that text-to-image generation facility. What is it going to accomplish at the end of the day? Right now, we don’t know if the text-to-image trick will be bundled inside a third-party app or if Samsung will integrate it within one of its apps.
The real question is how much value it is going to add to our day-to-day smartphone usage. What is an average Galaxy S24 buyer going to do with images generated from a single-line text prompt? Perhaps they will use those AI-generated images to add some zest to chats or make some buzz on social media.
But there is still some friction here. You will have to generate these images from a line of text, save them locally (or directly copy them to the clipboard), and then paste those AI images into the chat app of your choice. The optimal solution would be for Samsung to somehow integrate the text-to-image generation trick right into the keyboard.
Once again, why go through all the trouble when emojis, GIFs, and stickers can do the job? Also, the output of 512 x 512 pixels is not enough resolution to put these AI-generated images in a college assignment or work presentation.
Furthermore, the system is likely not going to be free. MediaTek’s demonstration video mentions a Premium option being on the table. Galaxy S24 shoppers might just end up running into a limit for text-to-image tokens when they hit a certain number of outputs, after which they are either downgraded to a slower image generation speed tier or asked to pay a subscription fee.
But if that’s the reality, then the whole point is moot because we already have solutions like OpenAI’s Dall-E out there. You can get it to generate images for free or pay for ChatGPT Plus to enjoy the perks of faster and more detailed image generation with the latest Dall-E 3 model. That’s just one of the many text-to-image generators out there.
Qualcomm says the Snapdragon 8 Gen 3 is the “first to support multimodal Gen AI models.” That means the chatbot (based on Meta’s Llama model) running natively on a phone will accept inputs in the form of text and images, as well as voice. Once again, that’s not unique, as ChatGPT with GPT-4 already offers that capability, albeit behind a subscription wall.
Do we really need it?
One of the most promising features that Qualcomm advertises is AI image expansion. Essentially, you can expand the canvas of an image in any direction you want, and the on-device generative AI will intelligently generate pixels based on your text prompt and add more objects to the original frame.
It’s amazing to witness this trick: an image expands with the addition of more objects, and the horizon broadens as if a master painter were retouching his work. But you don’t want to do it to pictures you take on a phone to preserve as memories. Using generative AI expansion on them is like polluting your memories with fake visuals and scenarios you never actually experienced.
Summarization is another big bet for on-device generative AI deployment on phones. It’s great for reading the news and keeping abreast of the latest developments across different domains. However, this trick would stand out only if there is minimal friction, for example, if users can summarize a news article on the same browser page instead of opening another app.
If the latter is the case, why not just shift to an app that already does it? For example, Artifact is a stunningly designed app from Instagram co-founder Kevin Systrom that uses AI to summarize articles for you.
There are already apps and websites that serve news in the form of summarized nuggets, such as Inshorts. For your inbox, Shortwave is an excellent app that can do more than just summarize email chains for you at no extra cost.
On-device generative AI is also promising tricks like voice-based photo editing. It sounds amazingly handy, but it’s hard to imagine just how much convenience it will add to our lives when one-tap filters and granular sliders offer an equally swift and more rewarding flow to edit media on phones.
Next, let’s move to the bread-and-butter situation around using AI just for getting some generic chats going or obtaining answers that would otherwise require internet-fueled research. Once again, we are going to run into quality problems.
The generative AI models running natively on phones, like Meta’s Llama, are not the most advanced of their kind, owing to the limited computing resources available on a phone. Look no further than Google. The Pixel 8 Pro runs only Gemini Nano, the smallest of Google’s Gemini large language models. Why not jump to something like ChatGPT or Pi via their dedicated mobile apps instead of settling for a less capable language model?
Where generative AI really needs to be
Right now, where I see generative AI doing its best trick is in decoupling smartphone tasks from the cloud (and the requirement to be online all the time) and offering an extra dash of safety. But to do that, these on-device AI tricks need to double as an assistant, somewhat like the Google Assistant, Alexa, or Siri.
Or better yet, they need to become a part of the assistant. Tell your generative AI assistant to pick up all cat images from your library, weave them into a collage, and send it to your dad. Or, ask it to plan the best itinerary for a day trip to Disneyland, find you the cheapest ticket for the next weekend, and neatly arrange all those details on Google Calendar.
Moreover, if an on-device generative AI tool no longer pushes your data to cloud servers and keeps every operation local to your smartphone, there is little to worry about when it comes to data privacy. At least theoretically, that is. For now, I am not sure about the Galaxy AI vision that Samsung is selling, but it will be interesting to see whether Samsung can truly offer meaningful generative AI experiences or just a bunch of barely practical, gimmicky tricks.