Researchers just solved AI’s biggest conundrum

The large language models that power today’s chatbots like ChatGPT, Gemini, and Claude are immensely powerful generative AI systems, and immensely power-hungry ones to boot.

They apparently don’t need to be: recent research out of the University of California, Santa Cruz has shown that modern LLMs running billions of parameters can operate on just 13 watts of power without a loss in performance. That’s roughly the draw of an LED bulb bright enough to replace a 100W incandescent, and about a 50x improvement over the 700W that an Nvidia H100 GPU consumes.

“We got the same performance at way less cost — all we had to do was fundamentally change how neural networks work,” said Jason Eshraghian, the paper’s lead author. “Then we took it a step further and built custom hardware.” They did so by doing away with matrix multiplication in the neural network.

Matrix multiplication is a cornerstone of the algorithms that power today’s LLMs. Words are represented as numbers and then organized into matrices where they are weighted and multiplied against one another to produce language outputs depending on the importance of certain words and their relationship to other words in the sentence or paragraph.
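To make that concrete, here is a toy sketch of the operation in question — hypothetical numbers in plain Python, not the model’s actual code. Each word is represented as a small embedding vector, and a learned weight matrix combines them through repeated multiply-and-accumulate steps:

```python
# Toy illustration of matrix multiplication (hypothetical values):
# word embeddings mixed by a weight matrix via multiply-and-sum.
def matmul(a, b):
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                out[i][j] += a[i][k] * b[k][j]  # the costly multiply
    return out

tokens = [[0.2, -0.1, 0.5], [0.9, 0.3, -0.2]]    # two word embeddings
weights = [[0.1, 0.4, -0.3], [0.7, 0.0, 0.2],
           [-0.5, 0.6, 0.1]]                      # learned weights
print(matmul(tokens, weights))
```

A full LLM performs this same inner loop billions of times per query, which is why the multiplications dominate the power bill.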

These matrices are stored on hundreds of physically separate GPUs and fetched with each new query or operation. The process of shuttling data that needs to be multiplied among the multitude of matrices costs a significant amount of electrical power, and therefore money.

To get around that issue, the UC Santa Cruz team forced the numbers within the matrices into a ternary state: every number carried a value of negative one, zero, or positive one. This allows the processors to simply sum the numbers instead of multiplying them, a change that leaves the model’s outputs intact while drastically cutting hardware cost. To maintain performance despite the reduction in operations, the team introduced time-based computation to the system, effectively giving the network a “memory” that increased the speed at which it could process the simplified operations.
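The trick is easy to see in miniature. In this sketch (my illustration, not the paper’s code), weights restricted to -1, 0, or +1 turn every multiplication into an addition, a subtraction, or a skip — no multiplier circuit required:

```python
# Ternary matrix-vector product (illustrative sketch): with weights
# limited to -1, 0, or +1, each multiply collapses into add/sub/skip.
def ternary_matvec(weights, x):
    out = []
    for row in weights:
        acc = 0.0
        for w, v in zip(row, x):
            if w == 1:
                acc += v      # +1: just add
            elif w == -1:
                acc -= v      # -1: just subtract
            # 0: skip the value entirely
        out.append(acc)
    return out

W = [[1, 0, -1], [-1, 1, 1]]   # ternary weight matrix
x = [0.5, -2.0, 3.0]           # input activations
print(ternary_matvec(W, x))
```

On custom silicon, adders are far smaller and cheaper than multipliers, which is where the power savings come from.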

“From a circuit designer standpoint, you don’t need the overhead of multiplication, which carries a whole heap of cost,” Eshraghian said. And while the team did implement its new network on custom FPGA hardware, it remains confident that many of the efficiency improvements can be retrofitted to existing models using open-source software and minor hardware tweaks. Even on standard GPUs, the team saw a 10x reduction in memory consumption while improving operational speed by 25%.

With chip manufacturers like Nvidia and AMD continually pushing the boundaries of GPU processor performance, electrical demands (and their associated financial costs) for the data centers housing these systems have soared in recent years. With the increase in computing power comes a commensurate increase in the amount of waste heat the chips produce — waste heat that now requires resource-intensive liquid cooling systems to fully dissipate.

Arm CEO Rene Haas warned The Register in April that AI data centers could consume as much as 20-25% of the entire U.S. electrical output by the end of the decade if corrective measures are not taken, and quickly.

Andrew Tarantola
OpenAI’s recent acquisition could change PCs forever

OpenAI, the creator of ChatGPT, has announced that it has acquired a startup called Multi, indicating that some powerful new capabilities could be coming to its AI systems. Multi is an advanced screensharing and collaboration tool made specifically for software engineering teams, offering features such as shared cursors and simultaneous screensharing with up to 10 people.

The startup is being shut down as part of its acquisition by OpenAI, however, having posted this statement on its blog:
"What if desktop computers were inherently multiplayer? What if the operating system placed people on equal footing to apps? Those were the questions we explored in building Multi, and before that, Remotion. Recently, we’ve been increasingly asking ourselves how we should work with computers. Not on or using computers, but truly with computers. With AI. We believe it’s one of the most important product questions of our time."
That's definitely some provocative positioning, especially in regard to making PCs "inherently multiplayer."

Google is bringing AI to the classroom — in a big way

Google is already incorporating its Gemini AI assistant into the rest of its product ecosystem to help individuals and businesses streamline their existing workflows. Now, the Silicon Valley titan is looking to bring AI into the classroom.
While we've already seen the damage that teens can do when given access to generative AI, Google argues that it is taking steps to ensure the technology is employed responsibly by students and academic faculty alike.
Following last year's initial rollout of a teen-safe version of Gemini for personal use, the company initially decided not to enable the AI's use with school-issued accounts. That will change in the coming months as Google makes the AI available free of charge to students in over 100 countries through its Google Workspace for Education accounts and school-issued Chromebooks.
Teens who meet Google's minimum age requirements -- 13 or older in the U.S.; 18 or older in the European Economic Area (EEA), Switzerland, Canada, and the U.K. -- will be able to converse with Gemini as they would on their personal accounts. That includes access to features like Help me write, Help me read, generative AI backgrounds, and AI-powered noise cancellation. The company was quick to point out that no personal data from this program will be used to train AI models, and that school administrators will be granted admin access to implement or remove features as needed.
What's more, teens will be able to organize and track their homework assignments through Google Task and Calendar integrations as well as collaborate with their peers using Meet and Assignments.
Google Classroom will also integrate with the school's Student Information System (SIS), allowing educators to set up classes and import pertinent data such as student lists and grading settings. They'll also have access to an expanded Google for Education App Hub with 16 new app integrations including Kami, Quizizz, and Screencastify available at launch.
Students will also have access to the Read Along in Classroom feature, which provides them with real-time, AI-based reading help. Conversely, educators will receive feedback from the AI on the student's reading accuracy, speed, and comprehension.
In the coming months, Google also hopes to introduce the ability for teachers to generate personalized stories tailored to each student's specific education needs. The feature is currently available in English, with more than 800 books for teachers to choose from, though it will soon offer support for other languages, starting with Spanish.
Additionally, Google is piloting a suite of Gemini in Classroom tools that will enable teachers to "define groups of students in Classroom to assign different content based on each group’s needs." The recently announced Google Vids, which helps users quickly and easily cut together engaging video clips, will be coming to the classroom as well. A non-AI version of Vids arrives on Google Workspace for Education Plus later this year, while the AI-enhanced version will only be available as a Workspace add-on.
That said, Google has apparently not forgotten just how emotionally vicious teenagers can be. As such, the company is incorporating a number of safety and privacy tools into the new AI system. For example, school administrators will be empowered to prevent students from initiating direct messages or creating spaces, in an effort to hinder bullying.
Admins will also have the option to block access to Classroom from compromised Android and iOS devices, and can require multiparty approval (i.e., at least two school officials) before security-sensitive changes, such as turning off two-step verification, can be implemented.
Google is introducing a slew of accessibility features as well. Chromebooks will get a new Read Aloud feature in the Chrome browser, for example. Extract Text from PDF will leverage OCR technology to make PDFs accessible to screen readers through the Chrome browser, while the Files app will soon offer augmented image labels to assist screen readers with relaying the contents of images in Chrome.
Later this year, Google also plans to release a feature that will allow users to control their Chromebooks using only their facial expressions and head movements.
These features all sound impressive and should help bring AI into the classroom in a safe and responsible manner -- in theory, at least. Though given how quickly today's teens can exploit security loopholes to bypass their school's web filters, Google's good intentions could ultimately prove insufficient.

A dangerous new jailbreak for AI chatbots was just discovered

Microsoft has released more details about a troubling new generative AI jailbreak technique it has discovered, called "Skeleton Key." Using this prompt injection method, malicious users can effectively bypass a chatbot's safety guardrails, the security features that keep ChatGPT from going full Tay.

Skeleton Key is an example of a prompt injection or prompt engineering attack. It's a multi-turn strategy designed to essentially convince an AI model to ignore its ingrained safety guardrails, "[causing] the system to violate its operators’ policies, make decisions unduly influenced by a user, or execute malicious instructions," Mark Russinovich, CTO of Microsoft Azure, wrote in the announcement.
